upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

Diffstat (limited to 'docs')
-rw-r--r-- docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | 67
-rw-r--r-- docs/explanation/grasp-02-proactive-sync.md | 57
-rw-r--r-- docs/explanation/purgatory-design.md | 574
3 files changed, 471 insertions, 227 deletions
diff --git a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
index 31c3e46..8fb5798 100644
--- a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
+++ b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
@@ -12,7 +12,13 @@
12 12
13## Overview 13## Overview
14 14
15When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers assoicated with the repo until it finds what it needs. 15When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers associated with the repo until it finds what it needs.
16
17This applies to three types of purgatory entries:
18
19- **Announcement purgatory** — kind 30617 announcements waiting for a git push to prove the repo has content
20- **State event purgatory** — kind 30618 state events waiting for their referenced git objects
21- **PR event purgatory** — kind 1617/1618 PR events waiting for their referenced commits
16 22
17### How It Works 23### How It Works
18 24
@@ -42,6 +48,7 @@ We respect remote server capacity with:
42✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations 48✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations
43✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events 49✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events
44✅ **30min expiry** - Auto-cleanup of events when data never arrives 50✅ **30min expiry** - Auto-cleanup of events when data never arrives
51✅ **Soft expiry for announcements** - Bare repo deleted at 30min, event retained 24h to allow revival
45✅ **Fully testable** - Mock-based architecture for reliable unit tests 52✅ **Fully testable** - Mock-based architecture for reliable unit tests
46 53
47--- 54---
@@ -73,6 +80,16 @@ Timeline D: Data never arrives
73 t=60s: Retry → all servers checked, no data 80 t=60s: Retry → all servers checked, no data
74 ... 81 ...
75 t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️ 82 t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️
83
84Timeline E: Announcement purgatory (no git data within 30 min)
85 t=0s: Announcement received → bare repo created, enters announcement purgatory
86 t=0.5s: Start hunting git servers for any content
87 ...
88 t=1800s: 30 minutes expired → bare repo deleted, event retained (soft_expired=true)
89 t=3600s: State event arrives (slow sync) → bare repo recreated, expiry reset ✅
90 t=5400s: Git push arrives → announcement promoted to DB, served to clients ✅
91 OR
92 t=86400s: 24 hours elapsed, no revival → event added to expired_events, removed 🗑️
76``` 93```
77 94
78**Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push). 95**Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push).
@@ -330,11 +347,11 @@ Both methods check `has_capacity()` and trigger `try_process_next()` if true.
330 347
331--- 348---
332 349
333## 30-Minute Purgatory Expiry 350## Purgatory Expiry
334 351
335Purgatory entries **automatically expire** after 30 minutes to prevent unbounded memory growth. 352### State and PR Events: 30-Minute Hard Expiry
336 353
337### Why 30 Minutes? 354State and PR purgatory entries **automatically expire** after 30 minutes.
338 355
339From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory): 356From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory):
340 357
@@ -346,25 +363,40 @@ This balances:
346- 🧹 **Short enough** to prevent memory leaks from abandoned events 363- 🧹 **Short enough** to prevent memory leaks from abandoned events
347- 🔄 **Recoverable** events are still on other relays and can be re-submitted 364- 🔄 **Recoverable** events are still on other relays and can be re-submitted
348 365
349### Implementation 366Each entry tracks `expires_at: Instant` (30 min from creation). The sync loop checks expiry before processing via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue.
350 367
351Each purgatory entry tracks: 368To prevent infinite re-sync loops, expired event IDs are added to an `expired_events` set. If a sync delivers an event that previously expired, it is rejected with `"previously expired from purgatory without git data"`.
352 369
353- `created_at: Instant` - When added to purgatory 370**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs)
354- `expires_at: Instant` - When to discard (created_at + 30min)
355 371
356The main sync loop checks expiry before processing: 372### Announcement Purgatory: Two-Phase Soft Expiry
357 373
358```rust 374Announcements use a different expiry strategy because they have an additional concern: the bare git repo created on arrival must be cleaned up, but we also need to avoid re-syncing the announcement event on every sync cycle.
359if !self.has_pending_events(&identifier) {
360 // No events remain (expired or released) → remove from sync queue
361 self.sync_queue.remove(&identifier);
362}
363```
364 375
365**Note**: Expiry is checked implicitly via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue. 376**Phase 1 — Initial 30-minute expiry:**
366 377
367**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs) 378- Delete the bare git repo (frees disk space, respects the protocol's 30-minute expiry)
379- Set `soft_expired = true` on the entry
380- Extend `expires_at` by **24 hours** (`SOFT_EXPIRY_EXTENDED`)
381- Continue syncing state events for this repo (same as active purgatory)
382
383**Phase 2 — 24-hour soft expiry:**
384
385- Add event ID to `expired_events` (prevents re-sync loops)
386- Remove entry completely from `announcement_purgatory`
387
388**Why not just hard-expire at 30 minutes?**
389
390The protocol's 30-minute expiry creates a dilemma for announcements:
391
392- **Option A: Add to `failed_events` at 30 min** → Permanently rejects future state events, losing potential revival when state events arrive late (e.g. from a slow sync)
393- **Option B: Remove entirely at 30 min** → The announcement gets re-fetched on every subsequent sync cycle, wasting bandwidth indefinitely
394
395Soft expiry is the solution: the bare repo is deleted at 30 minutes (respecting the protocol), but the event is retained for 24 hours. During this window, a late-arriving state event can **revive** the announcement—`extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer. After 24 hours with no revival, the event is added to `expired_events` and fully removed.
396
397**Why 24 hours specifically?** This covers the worst-case sync delay. A relay that was offline for up to 24 hours will re-sync state events when it reconnects. The 24-hour window ensures announcements remain revivable throughout that period without permanently occupying disk space.
398
399**Implementation**: [`src/purgatory/mod.rs:SOFT_EXPIRY_EXTENDED`](../../src/purgatory/mod.rs)
368 400
369--- 401---
370 402
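The two-phase expiry described in this hunk can be sketched as a single cleanup step per entry. This is a minimal illustration, not the code in `src/purgatory/mod.rs`: `Entry`, `Sweep`, `sweep` and `revive` are hypothetical names, the bare repo is reduced to a boolean, and only the two durations and the `soft_expired` / `expired_events` behaviour come from the text. Whether the 24-hour extension is measured from the old deadline (as here) or from the sweep time is an assumption.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Durations stated in the document: 30-minute expiry, 24-hour soft-expiry window.
const DEFAULT_EXPIRY: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical, simplified stand-in for an announcement purgatory entry.
struct Entry {
    event_id: String,
    expires_at: Instant,
    soft_expired: bool,
    /// Stands in for the bare git repo on disk.
    repo_exists: bool,
}

/// Outcome of one cleanup pass over a single entry (hypothetical type).
#[derive(Debug, PartialEq)]
enum Sweep {
    Keep,
    SoftExpired,
    Removed,
}

/// Apply the two-phase expiry rules to one entry at time `now`.
fn sweep(entry: &mut Entry, now: Instant, expired_events: &mut HashSet<String>) -> Sweep {
    if now < entry.expires_at {
        return Sweep::Keep;
    }
    if !entry.soft_expired {
        // Phase 1: drop the repo, keep the event, extend the deadline by 24 hours.
        entry.repo_exists = false;
        entry.soft_expired = true;
        entry.expires_at += SOFT_EXPIRY_EXTENDED;
        Sweep::SoftExpired
    } else {
        // Phase 2: remember the event ID so a later sync does not re-admit it;
        // the caller would then remove the entry from the map.
        expired_events.insert(entry.event_id.clone());
        Sweep::Removed
    }
}

/// Revival by a late state event: recreate the repo and restart the 30-minute timer.
fn revive(entry: &mut Entry, now: Instant) {
    entry.repo_exists = true;
    entry.soft_expired = false;
    entry.expires_at = now + DEFAULT_EXPIRY;
}
```

In this sketch a revived entry goes through phase 1 again if its new 30-minute timer lapses, which matches the reset described for `extend_announcement_expiry()`.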
@@ -670,6 +702,7 @@ The purgatory sync system is a sophisticated, production-ready implementation th
670✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness 702✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness
671✅ **Times strategically** - 3min for user events, 500ms for synced events 703✅ **Times strategically** - 3min for user events, 500ms for synced events
672✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks 704✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks
705✅ **Soft-expires announcements** - Bare repo deleted at 30min, event retained 24h for revival
673✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests 706✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests
674 707
675This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability. 708This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability.
diff --git a/docs/explanation/grasp-02-proactive-sync.md b/docs/explanation/grasp-02-proactive-sync.md
index ed8fdbf..6696e27 100644
--- a/docs/explanation/grasp-02-proactive-sync.md
+++ b/docs/explanation/grasp-02-proactive-sync.md
@@ -47,20 +47,37 @@ This state starts afresh when the binary loads.
47### RepoSyncIndex (Source of Truth) 47### RepoSyncIndex (Source of Truth)
48 48
49```rust 49```rust
50/// What we WANT to sync - derived from events received via self-subscription. 50/// What we WANT to sync - derived from events received via self-subscription
51/// Updated immediately when self-subscriber batch fires. 51/// and from purgatory announcements.
52/// Updated immediately when self-subscriber batch fires or purgatory sync timer runs.
52/// Key: repo addressable ref - 30617:pubkey:identifier 53/// Key: repo addressable ref - 30617:pubkey:identifier
53pub type RepoSyncIndex = Arc<RwLock<HashMap<String, RepoSyncNeeds>>>; 54pub type RepoSyncIndex = Arc<RwLock<HashMap<String, RepoSyncNeeds>>>;
54 55
56/// Controls which sync filters are built for a repo
57#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
58pub enum SyncLevel {
59 #[default]
60 Full, // Full L2 + L3 sync (promoted repos with git data)
61 StateOnly, // Only state events (kind 30618) — for purgatory announcements
62}
63
55#[derive(Debug, Clone, Default)] 64#[derive(Debug, Clone, Default)]
56pub struct RepoSyncNeeds { 65pub struct RepoSyncNeeds {
57 /// Relay URLs listed in this repo's 30617 announcement 66 /// Relay URLs listed in this repo's 30617 announcement
58 pub relays: HashSet<String>, 67 pub relays: HashSet<String>,
59 /// Root event IDs - 1617/1618/1621 - that reference this repo 68 /// Root event IDs - 1617/1618/1621 - that reference this repo
60 pub root_events: HashSet<EventId>, 69 pub root_events: HashSet<EventId>,
70 /// Controls which filters are built: Full (L2+L3) or StateOnly (kind 30618 only)
71 pub sync_level: SyncLevel,
61} 72}
62``` 73```
63 74
75**Two sources populate `RepoSyncIndex`:**
76
771. **`SelfSubscriber`**: monitors the relay's own event stream for accepted events (kinds 30617, 1617, 1618, 1621). Adds entries with `SyncLevel::Full`. When an announcement is promoted from purgatory to the database, the SelfSubscriber sees it and upgrades the entry to `Full`.
78
792. **Purgatory announcement sync timer** (`run_purgatory_announcement_sync`, every 5 seconds): iterates `purgatory.announcements_for_sync()` and ensures each purgatory announcement has a `SyncLevel::StateOnly` entry in `RepoSyncIndex`. This is the only registration path for purgatory announcements because they are not saved to the database and therefore never seen by the SelfSubscriber.
80
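The interaction between the two registration paths can be sketched as follows. This is a simplified illustration rather than the real code: the index is a plain `HashMap` without the `Arc<RwLock<…>>` wrapper, `root_events` is omitted, and the two function names are hypothetical. The point it shows is that the timer only inserts missing entries, so it never downgrades a repo that the SelfSubscriber has already marked `Full`.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum SyncLevel {
    #[default]
    Full,
    StateOnly,
}

/// Trimmed-down version of `RepoSyncNeeds` (no `root_events`).
#[derive(Debug, Clone, Default)]
struct RepoSyncNeeds {
    relays: HashSet<String>,
    sync_level: SyncLevel,
}

/// Keyed by repo addressable ref, e.g. "30617:pubkey:identifier".
type Index = HashMap<String, RepoSyncNeeds>;

/// Purgatory timer path (hypothetical helper): insert only when absent.
fn register_purgatory_announcement(index: &mut Index, repo_ref: &str, relays: &[&str]) {
    index
        .entry(repo_ref.to_string())
        .or_insert_with(|| RepoSyncNeeds {
            relays: relays.iter().map(|r| r.to_string()).collect(),
            sync_level: SyncLevel::StateOnly,
        });
}

/// SelfSubscriber path (hypothetical helper): insert, or upgrade an existing entry to Full.
fn register_accepted_announcement(index: &mut Index, repo_ref: &str, relays: &[&str]) {
    let needs = index.entry(repo_ref.to_string()).or_default();
    needs.relays.extend(relays.iter().map(|r| r.to_string()));
    needs.sync_level = SyncLevel::Full;
}
```

Calling the timer helper again after promotion leaves the entry at `Full`, so the 5-second tick is safe to run unconditionally.
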
64### RelaySyncIndex (Confirmed State + Connection) 81### RelaySyncIndex (Confirmed State + Connection)
65 82
66```rust 83```rust
@@ -336,7 +353,23 @@ The sync system uses three background tasks that run continuously:
336 353
3371. Queue events to `PendingUpdates` 3541. Queue events to `PendingUpdates`
3382. Timer fires (interval, does not reset on events) 3552. Timer fires (interval, does not reset on events)
3393. Process batch: update RepoSyncIndex → derive targets → send AddFilters to SyncManager 3563. Process batch: update RepoSyncIndex with `SyncLevel::Full` → derive targets → send AddFilters to SyncManager
357
358**Note**: The SelfSubscriber only sees announcements that have been accepted to the database (promoted from purgatory). Purgatory announcements are registered separately by the purgatory sync timer (see below).
359
360### 4. Purgatory Announcement Sync Timer (`run_purgatory_announcement_sync`)
361
362**Purpose**: Register purgatory announcements in `RepoSyncIndex` so state events are synced for them
363
364**Interval**: Every 5 seconds (200ms in test mode)
365
366**Flow**:
367
3681. Iterate `purgatory.announcements_for_sync()`
3692. For each announcement not already in `RepoSyncIndex`: insert with `SyncLevel::StateOnly`
3703. When an announcement is promoted (git data arrives), the SelfSubscriber sees the newly accepted event and upgrades the entry to `SyncLevel::Full`
371
372**Why a separate timer?** Purgatory announcements are never saved to the database, so the SelfSubscriber never sees them. The timer bridges this gap, ensuring state events are synced for repos that may still receive git data.
340 373
341--- 374---
342 375
@@ -602,9 +635,10 @@ flowchart TB
602 635
603- Self-subscriber monitors own relay for 30617, 1617, 1618, 1621 (NOT 1619 or 30618) 636- Self-subscriber monitors own relay for 30617, 1617, 1618, 1621 (NOT 1619 or 30618)
604- Batches events in `PendingUpdates` (5 second window via interval timer) 637- Batches events in `PendingUpdates` (5 second window via interval timer)
605- `process_batch()` updates RepoSyncIndex, then builds AddFilters **directly** (no compute_actions) 638- `process_batch()` updates RepoSyncIndex with `SyncLevel::Full`, then builds AddFilters **directly** (no compute_actions)
606- AddFilters sent via channel to SyncManager, which calls `handle_new_sync_filters()` 639- AddFilters sent via channel to SyncManager, which calls `handle_new_sync_filters()`
607- This path does NOT use compute_actions because it's building fresh filters from the updated index 640- This path does NOT use compute_actions because it's building fresh filters from the updated index
641- Purgatory announcements (not in DB) are registered separately by the purgatory sync timer with `SyncLevel::StateOnly`
608 642
609--- 643---
610 644
@@ -687,16 +721,23 @@ fn compute_actions(
687- **Tags**: lowercase `a`, uppercase `A`, and `q` tags for comprehensive coverage 721- **Tags**: lowercase `a`, uppercase `A`, and `q` tags for comprehensive coverage
688- **Batching**: Per 100 repo refs 722- **Batching**: Per 100 repo refs
689- **Function**: `build_repo_tag_filters(repos, since)` 723- **Function**: `build_repo_tag_filters(repos, since)`
724- **Only for `SyncLevel::Full` repos** — purgatory announcements (`StateOnly`) skip this layer
690 725
691### Layer 3: Events Tagging Our Root Events 726### Layer 3: Events Tagging Our Root Events
692 727
693- **Tags**: lowercase `e`, uppercase `E`, and `q` tags for comprehensive coverage 728- **Tags**: lowercase `e`, uppercase `E`, and `q` tags for comprehensive coverage
694- **Batching**: Per 100 event IDs 729- **Batching**: Per 100 event IDs
695- **Function**: `build_root_event_tag_filters(root_events, since)` 730- **Function**: `build_root_event_tag_filters(root_events, since)`
731- **Only for `SyncLevel::Full` repos** — purgatory announcements (`StateOnly`) skip this layer
732
733### Combined Layer 2+3 (SyncLevel-Aware)
734
735The `build_sync_level_aware_filters()` function combines both layers, partitioning repos by `SyncLevel`:
696 736
697### Combined Layer 2+3 737- **`Full` repos**: state event filters + repo-tag filters + root-event-tag filters
738- **`StateOnly` repos**: state event filters only (kind 30618 with `#d` tags)
698 739
699The `build_layer2_and_layer3_filters()` function combines both layers. Used by: 740Used by:
700 741
701- `recompute_new_sync_filters_for_relay` for new item subscriptions 742- `recompute_new_sync_filters_for_relay` for new item subscriptions
702- `reconstruct_filters` for rebuilding from confirmed state 743- `reconstruct_filters` for rebuilding from confirmed state
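
The partitioning by `SyncLevel` can be sketched like this. It is an illustration under stated assumptions, not the body of `build_sync_level_aware_filters()`: `Plan` and `plan_filters` are hypothetical names, repos are represented by their addressable-ref strings, root-event filters are left out, and no real Nostr filter type is constructed. The batch size of 100 for repo-tag filters is taken from the Layer 2 description.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncLevel {
    Full,
    StateOnly,
}

/// Layer 2 batch size stated in the document ("Per 100 repo refs").
const BATCH_SIZE: usize = 100;

/// Hypothetical summary of which repos feed which filters.
struct Plan<'a> {
    /// Every repo gets state event filters (kind 30618), regardless of level.
    state_event_repos: Vec<&'a str>,
    /// Only `Full` repos get repo-tag filters, in batches of `BATCH_SIZE`.
    repo_tag_batches: Vec<Vec<&'a str>>,
}

fn plan_filters<'a>(repos: &[(&'a str, SyncLevel)]) -> Plan<'a> {
    let all: Vec<&str> = repos.iter().map(|(r, _)| *r).collect();
    let full: Vec<&str> = repos
        .iter()
        .filter(|(_, level)| *level == SyncLevel::Full)
        .map(|(r, _)| *r)
        .collect();
    Plan {
        state_event_repos: all,
        repo_tag_batches: full.chunks(BATCH_SIZE).map(|c| c.to_vec()).collect(),
    }
}
```

A `StateOnly` repo therefore appears in `state_event_repos` but in none of the repo-tag batches.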
@@ -871,9 +912,9 @@ flowchart TB
871 912
872``` 913```
873src/sync/ 914src/sync/
874├── mod.rs # SyncManager, main loop, data structures 915├── mod.rs # SyncManager, main loop, data structures, SyncLevel, run_purgatory_announcement_sync
875├── algorithms.rs # derive_relay_targets(), compute_actions() 916├── algorithms.rs # derive_relay_targets(), compute_actions()
876├── filters.rs # build_announcement_filter(), build_layer2_and_layer3_filters() 917├── filters.rs # build_announcement_filter(), build_sync_level_aware_filters()
877├── health.rs # RelayHealthTracker with exponential backoff 918├── health.rs # RelayHealthTracker with exponential backoff
878├── relay_connection.rs # RelayConnection, RelayEvent handling 919├── relay_connection.rs # RelayConnection, RelayEvent handling
879├── self_subscriber.rs # SelfSubscriber with batching 920├── self_subscriber.rs # SelfSubscriber with batching
diff --git a/docs/explanation/purgatory-design.md b/docs/explanation/purgatory-design.md
index b984745..8e7d75c 100644
--- a/docs/explanation/purgatory-design.md
+++ b/docs/explanation/purgatory-design.md
@@ -8,7 +8,11 @@
8 8
9## Overview 9## Overview
10 10
11Purgatory is an in-memory holding area that solves the **"which arrives first?"** problem in GRASP. Either nostr events or git pushes can arrive in any order: 11Purgatory is an in-memory holding area that solves two related problems in GRASP:
12
13### Problem 1: "Which arrives first?" (State and PR events)
14
15Either nostr events or git pushes can arrive in any order:
12 16
13- **Event first**: Event waits in purgatory until git data arrives 17- **Event first**: Event waits in purgatory until git data arrives
14- **Git first**: Placeholder waits in purgatory until event arrives 18- **Git first**: Placeholder waits in purgatory until event arrives
@@ -19,28 +23,61 @@ When both halves arrive, they are processed together and saved to the database.
19 23
20> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes. 24> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes.
21 25
26### Problem 2: Misleading empty repository announcements
27
28When a repository announcement arrives, we must create the bare git repo immediately so pushes can succeed. But if no git data ever arrives, we would serve an empty repo and its announcement indefinitely—clients see the announcement, try to clone, and get nothing.
29
30**Solution**: New announcements go to **announcement purgatory** instead of being immediately accepted:
31
321. **Announcement arrives** → Create bare repo immediately, add announcement to purgatory
332. **Git data arrives** → Promote announcement from purgatory to active (now served to clients)
343. **No git data before expiry** → Delete bare repo, discard announcement (never served)
35
36This ensures we only serve announcements for repos that actually have content.
37
22--- 38---
23 39
24## Key Design Principles 40## Key Design Principles
25 41
26### 1. In-Memory Only 42### 1. Graceful-Shutdown Persistence
43
44Purgatory state is **saved to disk on graceful shutdown** and **restored on startup**. This preserves in-flight work across planned restarts (deployments, reboots).
45
46On `SIGINT` / Ctrl-C, `main.rs` calls `purgatory.save_to_disk()` before exiting. On startup, if the state file exists, `purgatory.restore_from_disk()` is called before the server begins accepting connections.
47
48**What is persisted:**
49
50| Store | Persisted? | Notes |
51|-------|-----------|-------|
52| `announcement_purgatory` | ✅ Yes | Non-soft-expired entries only (bare repo must exist) |
53| `state_events` | ✅ Yes | All active entries |
54| `pr_events` | ✅ Yes | Both events and placeholders |
55| `expired_events` | ✅ Yes | Prevents re-sync loops after restart |
56| `sync_queue` | ❌ No | Rebuilt automatically after restore |
27 57
28Purgatory data is **not persisted** to disk. On restart, all purgatory entries are lost. This is acceptable because: 58**What is NOT persisted (unclean shutdown):**
59
60On a crash or `SIGKILL`, the state file is not written. In that case:
29 61
30- Events are still on other relays (can be re-submitted) 62- Events are still on other relays (can be re-submitted)
31- Git data can be re-pushed 63- Git data can be re-pushed
32- 30-minute expiry means data is transient anyway 64- 30-minute expiry means data is transient anyway
33 65
34### 2. Separate Storage for State vs PR Events 66**State file location:** `<git_data_path>/purgatory-state.json`
67
68**Downtime accounting:** Expiry deadlines are stored as duration offsets from the save timestamp. On restore, elapsed downtime is subtracted from each deadline. Entries that expired during downtime are immediately swept by the next cleanup tick.
69
70**Soft-expired announcements are excluded:** Their bare repos have already been deleted, so they cannot be meaningfully restored. They will be re-fetched via background sync if needed.
35 71
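The downtime accounting above amounts to a small piece of arithmetic: remaining TTL = stored offset minus time elapsed since the save. A minimal sketch, assuming a hypothetical helper `remaining_ttl` and assuming that a clock that moved backwards is treated as zero downtime:

```rust
use std::time::{Duration, SystemTime};

/// Remaining time-to-live for an entry after restore.
///
/// `offset_secs` is the deadline stored as seconds after `saved_at`.
/// Returns `None` when the entry expired while the process was down.
fn remaining_ttl(offset_secs: u64, saved_at: SystemTime, now: SystemTime) -> Option<Duration> {
    // Assumption: if the wall clock went backwards, count no downtime at all.
    let downtime = now.duration_since(saved_at).unwrap_or(Duration::ZERO);
    Duration::from_secs(offset_secs)
        .checked_sub(downtime)
        .filter(|left| !left.is_zero())
}
```

For example, an entry saved with 1800 seconds left and restored 600 seconds later has 1200 seconds remaining. An entry for which this returns `None` could be restored with an already-passed deadline so that the next cleanup tick sweeps it, as the text describes.
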
36State events (kind 30618) and PR events (kind 1617/1618) have fundamentally different matching patterns: 72### 2. Separate Storage for Each Event Type
37 73
38| Event Type | Index | Matching Strategy | 74| Store | Index | Purpose |
39|------------|-------|-------------------| 75|-------|-------|---------|
40| **State Events** | `identifier` (d tag) | Compare refs at push time | 76| `announcement_purgatory` | `(PublicKey, String)` — `(owner, identifier)` | Announcements awaiting git data |
41| **PR Events** | `event_id` (hex string) | Direct match via `refs/nostr/<event-id>` | 77| `state_events` | `identifier` (d tag) | State events awaiting git data |
78| `pr_events` | `event_id` (hex string) | PR events awaiting git data |
42 79
43They use **separate DashMap stores** for efficient concurrent access. 80Announcement purgatory uses `(pubkey, identifier)` because identifier alone is not unique across different owners.
44 81
45### 3. Late Binding for State Events 82### 3. Late Binding for State Events
46 83
@@ -78,7 +115,23 @@ With purgatory checking during authorization:
782. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅ 1152. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅
793. After push succeeds → Save event to database → Remove from purgatory 1163. After push succeeds → Save event to database → Remove from purgatory
80 117
81See [`src/git/authorization.rs:51-162`](../../src/git/authorization.rs) for implementation. 118See [`src/git/authorization.rs`](../../src/git/authorization.rs) for implementation.
119
120### 6. Announcement Purgatory: Bare Repo Created Immediately
121
122**Decision:** Create the bare git repo when announcement enters purgatory.
123
124**Why:** Git pushes may arrive at any time. Without a repo, pushes fail.
125
126**Consequence:** We allocate disk space for repos that may expire unused. Must delete repos on expiry.
127
128### 7. Replacement Announcements Skip Purgatory
129
130**Decision:** Announcements replacing an existing active (database) announcement are accepted immediately.
131
132**Why:** The repository is already proven active with content.
133
134**How:** Check if active announcement exists for `(pubkey, identifier)` before routing to purgatory.
82 135
83--- 136---
84 137
@@ -103,22 +156,54 @@ pub struct RefUpdate {
103} 156}
104``` 157```
105 158
159### Announcement Purgatory Entry
160
161```rust
162pub struct AnnouncementPurgatoryEntry {
163 /// The kind 30617 announcement event
164 pub event: Event,
165
166 /// Repository identifier from 'd' tag
167 pub identifier: String,
168
169 /// Event author pubkey
170 pub owner: PublicKey,
171
172 /// Path to the bare git repo on disk (created immediately on entry)
173 pub repo_path: PathBuf,
174
175 /// Relay URLs from 'relays'/'clone' tags — for sync registration
176 pub relays: HashSet<String>,
177
178 /// When added to purgatory
179 pub created_at: Instant,
180
181 /// Expiry deadline (30 min from creation, may be extended)
182 pub expires_at: Instant,
183
184 /// Whether the bare repo has been deleted (soft expiry phase)
185 pub soft_expired: bool,
186}
187```
188
189**Indexed by `(pubkey, identifier)`** because identifier is not unique across different owners.
190
106### State Purgatory Entry 191### State Purgatory Entry
107 192
108```rust 193```rust
109pub struct StatePurgatoryEntry { 194pub struct StatePurgatoryEntry {
110 /// The nostr state event (kind 30618) awaiting git data 195 /// The nostr state event (kind 30618) awaiting git data
111 pub event: Event, 196 pub event: Event,
112 197
113 /// Repository identifier from 'd' tag 198 /// Repository identifier from 'd' tag
114 pub identifier: String, 199 pub identifier: String,
115 200
116 /// Event author pubkey 201 /// Event author pubkey
117 pub author: PublicKey, 202 pub author: PublicKey,
118 203
119 /// When added to purgatory 204 /// When added to purgatory
120 pub created_at: Instant, 205 pub created_at: Instant,
121 206
122 /// Expiry deadline (30 min from creation, may be extended) 207 /// Expiry deadline (30 min from creation, may be extended)
123 pub expires_at: Instant, 208 pub expires_at: Instant,
124} 209}
@@ -132,14 +217,14 @@ pub struct StatePurgatoryEntry {
132pub struct PrPurgatoryEntry { 217pub struct PrPurgatoryEntry {
133 /// The nostr PR event, if received (None = git data arrived first) 218 /// The nostr PR event, if received (None = git data arrived first)
134 pub event: Option<Event>, 219 pub event: Option<Event>,
135 220
136 /// Expected commit SHA from 'c' tag (if event exists) 221 /// Expected commit SHA from 'c' tag (if event exists)
137 /// or actual commit pushed (if git arrived first) 222 /// or actual commit pushed (if git arrived first)
138 pub commit: String, 223 pub commit: String,
139 224
140 /// When added to purgatory 225 /// When added to purgatory
141 pub created_at: Instant, 226 pub created_at: Instant,
142 227
143 /// Expiry deadline (30 min from creation) 228 /// Expiry deadline (30 min from creation)
144 pub expires_at: Instant, 229 pub expires_at: Instant,
145} 230}
@@ -151,24 +236,180 @@ pub struct PrPurgatoryEntry {
151 236
152```rust 237```rust
153pub struct Purgatory { 238pub struct Purgatory {
239 /// Announcement events indexed by (owner, identifier)
240 announcement_purgatory: DashMap<(PublicKey, String), AnnouncementPurgatoryEntry>,
241
154 /// State events indexed by identifier (d tag) 242 /// State events indexed by identifier (d tag)
155 /// Multiple state events per identifier allowed (different authors) 243 /// Multiple state events per identifier allowed (different authors)
156 state_events: Arc<DashMap<String, Vec<StatePurgatoryEntry>>>, 244 state_events: DashMap<String, Vec<StatePurgatoryEntry>>,
157 245
158 /// PR events indexed by event_id (hex string) 246 /// PR events indexed by event_id (hex string)
159 /// Single entry per event ID 247 /// Single entry per event ID
160 pr_events: Arc<DashMap<String, PrPurgatoryEntry>>, 248 pr_events: DashMap<String, PrPurgatoryEntry>,
161 249
162 /// Sync queue for background git data fetching 250 /// Sync queue for background git data fetching
163 sync_queue: Arc<DashMap<String, SyncQueueEntry>>, 251 sync_queue: DashMap<String, SyncQueueEntry>,
164 252
165 _git_data_path: PathBuf, 253 /// Events that previously expired without git data (prevents re-sync loops)
254 expired_events: DashMap<EventId, Instant>,
255}
256```
257
258### Persistence State (Disk Format)
259
260`Instant` fields cannot be serialized directly. Each entry type has a corresponding `Serializable*` wrapper that stores time fields as `u64` second offsets from a `saved_at: SystemTime` reference point. On restore, elapsed downtime is subtracted to produce the correct remaining TTL.
261
262```rust
263struct PurgatoryState {
264 version: u32, // currently 1
265 saved_at: SystemTime, // reference for offset math
266
267 /// Non-soft-expired announcements indexed by "owner_hex:identifier"
268 announcement_purgatory: HashMap<String, SerializableAnnouncementPurgatoryEntry>,
269
270 /// State events indexed by repository identifier
271 state_events: HashMap<String, Vec<SerializableStatePurgatoryEntry>>,
272
273 /// PR events (and placeholders) indexed by event ID hex
274 pr_events: HashMap<String, SerializablePrPurgatoryEntry>,
275
276 /// Expired event IDs → approximate expiry SystemTime
277 expired_events: HashMap<String, SystemTime>,
166} 278}
167``` 279```
168 280
281The `announcement_purgatory` field uses `#[serde(default)]` so that state files written before announcement persistence was added (version 1 without the field) still deserialize correctly.
282
283---
284
285## Announcement Purgatory Flows
286
287### New Announcement Flow
288
289```
290Announcement arrives
291 |
292 v
293Is there an active announcement for (pubkey, identifier) in DB?
294 |
295 +-- YES --> Accept immediately (replacement, repo already proven)
296 |
297 +-- NO --> Is there a purgatory entry for (pubkey, identifier)?
298 |
299 +-- YES --> Replace purgatory entry, extend expiry 30 min
300 | Return OK to client (but don't serve)
301 |
302 +-- NO --> Create bare repo
303 Add to purgatory
304 Return OK to client (but don't serve)
305```
306
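The decision tree above reduces to two lookups keyed by `(pubkey, identifier)`. The following sketch is illustrative only: `Route` and `route_announcement` are hypothetical names, the database and purgatory are modelled as plain sets, and keys are strings rather than `PublicKey` values.

```rust
use std::collections::HashSet;

/// (owner pubkey as hex, repository identifier)
type Key = (String, String);

/// Hypothetical routing outcome for an incoming kind 30617 announcement.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// Replacement of an active announcement: repo already proven, accept now.
    AcceptImmediately,
    /// Already waiting in purgatory: replace the entry and extend its expiry.
    ReplacePurgatoryEntry,
    /// First sighting: create the bare repo and add to purgatory.
    CreateRepoAndEnterPurgatory,
}

fn route_announcement(key: &Key, active_in_db: &HashSet<Key>, in_purgatory: &HashSet<Key>) -> Route {
    if active_in_db.contains(key) {
        Route::AcceptImmediately
    } else if in_purgatory.contains(key) {
        Route::ReplacePurgatoryEntry
    } else {
        Route::CreateRepoAndEnterPurgatory
    }
}
```

The database check comes first so that a replacement for an already-served repo never re-enters purgatory.
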
307### Git Data Arrival → Promotion
308
309```
310Git push/fetch completes with data
311 |
312 v
313process_purgatory_announcements() called
314 |
315 v
316Is there a purgatory announcement for (owner, identifier)?
317 |
318 +-- YES --> promote_announcement() removes from purgatory
319 | Save event to database
320 | Notify WebSocket clients
321 | (Sync upgrades to Full automatically via SelfSubscriber)
322 |
323 +-- NO --> Normal processing
324```
325
326### State Event Arrival for Purgatory Announcement
327
328```
329State event arrives
330 |
331 v
332fetch_repository_data_with_purgatory() checks DB + purgatory
333 |
334 +-- Announcement found in purgatory -->
335 | Validate authorization against purgatory announcement
336 | Extend purgatory announcement expiry (reset 30-min timer)
337 | If soft-expired: recreate bare repo, clear soft_expired flag
338 | Route state event to state purgatory
339 |
340 +-- No announcement anywhere --> Reject
341```
342
343### Announcement Expiry (Two-Phase Soft Expiry)
344
345The protocol specifies 30-minute expiry for announcements. We implement a two-phase soft expiry:
346
347**Phase 1 — Initial 30-minute expiry (`soft_expired == false`):**
348- Delete the bare git repo (frees disk space, respects protocol expiry)
349- Set `soft_expired = true`
350- Extend `expires_at` by 24 hours (`SOFT_EXPIRY_EXTENDED`)
351- Continue syncing state events (same as active purgatory)
352
353**Phase 2 — 24-hour soft expiry (`soft_expired == true`):**
354- Add event ID to `expired_events` (prevents re-sync loops)
355- Remove entry completely from `announcement_purgatory`
356
357**Why soft expiry?** Without it, we'd face a dilemma:
358
359- Add expired announcements to `failed_events` → permanently reject future state events, losing potential revival when state events arrive late
360- Re-fetch the announcement event on every sync cycle → wasting bandwidth and creating unnecessary sync traffic
361
362Soft expiry retains the event for 24 hours so that late-arriving state events (e.g. from a slow sync) can revive the announcement without forcing a full re-announcement flow.
363
364**Revival:** If a state event arrives for a soft-expired announcement, `extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer.
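
The two phases and the revival path can be modelled as a small state machine. The sketch below is an assumption-laden illustration rather than the real code: `Entry`, `CleanupOutcome`, and the method names are hypothetical, the event payload and `expired_events` bookkeeping are left out, and `bare_repo_exists` stands in for actually deleting or recreating the repo on disk.

```rust
use std::time::{Duration, Instant};

/// Initial protocol expiry (30 minutes) and the extended soft-expiry window (24 hours).
pub const INITIAL_EXPIRY: Duration = Duration::from_secs(30 * 60);
pub const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical, trimmed-down purgatory entry.
#[derive(Debug)]
pub struct Entry {
    pub expires_at: Instant,
    pub soft_expired: bool,
    pub bare_repo_exists: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CleanupOutcome {
    /// Deadline not reached yet.
    Keep,
    /// Phase 1: bare repo dropped, event retained for another 24 hours.
    SoftExpired,
    /// Phase 2: caller should record the event as expired and drop the entry.
    Remove,
}

impl Entry {
    pub fn new(now: Instant) -> Self {
        Entry {
            expires_at: now + INITIAL_EXPIRY,
            soft_expired: false,
            bare_repo_exists: true,
        }
    }

    /// One cleanup tick for this entry.
    pub fn cleanup(&mut self, now: Instant) -> CleanupOutcome {
        if now < self.expires_at {
            return CleanupOutcome::Keep;
        }
        if !self.soft_expired {
            self.bare_repo_exists = false;
            self.soft_expired = true;
            self.expires_at += SOFT_EXPIRY_EXTENDED;
            CleanupOutcome::SoftExpired
        } else {
            CleanupOutcome::Remove
        }
    }

    /// Reset the timer; revive the entry if it was soft-expired.
    pub fn extend(&mut self, now: Instant, duration: Duration) {
        if self.soft_expired {
            self.bare_repo_exists = true;
            self.soft_expired = false;
        }
        self.expires_at = now + duration;
    }
}
```

Passing `now` explicitly keeps the transitions deterministic, so the 30-minute and 24-hour boundaries can be exercised without sleeping.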
365
366### Expiry Extension Triggers
367
368The 30-minute purgatory timer is reset (extended) in three scenarios:
369
370| Trigger | Location | Why |
371|---------|----------|-----|
372| State event arrives | `StatePolicy::process_state_event()` | Repo is actively receiving metadata |
373| Git push authorized against purgatory state | `get_state_authorization_for_specific_owner_repo()` | Repo is actively receiving git data |
374| Replacement announcement arrives | `AnnouncementPolicy::validate()` | Announcement updated |
375
376All three call `purgatory.extend_announcement_expiry(owner, identifier, Duration::from_secs(1800))`.
377
378### Purgatory Lifecycle
379
380```
381 ┌─────────────────────────────────────┐
382 │ │
383 v │
384Announcement ──> ACTIVE ──────────────────────────────────┤
385 arrives (bare repo exists) │
386 │ │
387 ├── Git data ──> PROMOTED (exit) │
388 │ │
389 ├── Deletion ──> REMOVED (exit) │
390 │ │
391 v │
392 SOFT_EXPIRED ──────────────────────────────┘
393 (bare repo deleted, ^
394 event retained) │
395 │ │
396 ├── State event arrives (revival)
397
398 └── Extended expiry ──> REMOVED (exit)
399```
400
401| Transition | Trigger | Action |
402|------|---------|--------|
403| **Promotion** | Git data arrives | Move to database, sync upgrades to Full |
404| **Soft expiry** | Initial 30-min timeout | Delete bare repo, retain event, continue sync |
405| **Full expiry** | 24-hour soft expiry | Add to expired_events, remove from purgatory |
406| **Deletion** | Kind 5 event | Delete bare repo, remove from purgatory |
407| **Replacement** | Newer announcement (same pubkey, identifier) | Replace entry, extend expiry |
408| **Service change** | Newer announcement removes our service | Remove from purgatory |
409
169--- 410---
170 411
171## Event Flows 412## State and PR Event Flows
172 413
173### State Event Arrival (Kind 30618) 414### State Event Arrival (Kind 30618)
174 415
@@ -377,11 +618,12 @@ Purgatory includes a background sync system that fetches git data from remote se
377 618
378┌─────────────────────────────────────────────────────┐ 619┌─────────────────────────────────────────────────────┐
379│ process_newly_available_git_data(repo, oids) │ 620│ process_newly_available_git_data(repo, oids) │
380│ 1. Find satisfiable state events in purgatory │ 621│ 1. Find satisfiable announcement in purgatory │
381│ 2. Find satisfiable PR events in purgatory │ 622│ 2. Find satisfiable state events in purgatory │
382│ 3. Save events to database │ 623│ 3. Find satisfiable PR events in purgatory │
383│ 4. Sync git data to other owner repos │ 624│ 4. Save events to database │
384│ 5. Remove from purgatory │ 625│ 5. Sync git data to other owner repos │
626│ 6. Remove from purgatory │
385└─────────────────────────────────────────────────────┘ 627└─────────────────────────────────────────────────────┘
386``` 628```
387 629
@@ -402,8 +644,8 @@ pub struct SyncQueueEntry {
402 644
403**Backoff strategy:** 645**Backoff strategy:**
404- First attempt: 20 seconds 646- First attempt: 20 seconds
405- Second attempt: 2 minutes 647- Second attempt: 40 seconds
406- Subsequent attempts: 2 minutes 648- Subsequent attempts: capped at 2 minutes
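
One schedule consistent with the updated values (20 seconds, 40 seconds, then capped at 2 minutes) is a doubling delay with a ceiling. The doubling rule is an assumption, since the list only gives the first two values and the cap; `backoff_delay` and the constants are hypothetical names.

```rust
use std::time::Duration;

/// Delay before the first retry, and the ceiling for all later retries.
const BASE_DELAY_SECS: u64 = 20;
const MAX_DELAY_SECS: u64 = 120;

/// Delay for a zero-based attempt index: 20s, 40s, 80s, then 120s from there on.
pub fn backoff_delay(attempt: u32) -> Duration {
    // Clamp the shift so very large attempt counts cannot overflow.
    let factor = 1u64 << attempt.min(16);
    Duration::from_secs((BASE_DELAY_SECS * factor).min(MAX_DELAY_SECS))
}
```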
407 649
408### Sync Delays 650### Sync Delays
409 651
@@ -428,7 +670,7 @@ pub struct ThrottleManager {
428``` 670```
429 671
430**Rate limiting:** 672**Rate limiting:**
431- Default: 5 requests per domain per 30 seconds 673- Default: 5 concurrent requests per domain, 30 requests per minute
432- Tracks request timestamps in a sliding window 674- Tracks request timestamps in a sliding window
433- Queues identifiers when domain is throttled 675- Queues identifiers when domain is throttled
434- Processes queue when capacity frees up 676- Processes queue when capacity frees up
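
A per-domain limiter combining the concurrency cap with a sliding-window rate limit might look like the following. This is a sketch under assumptions, not the `ThrottleManager` itself: `DomainLimiter` and its methods are hypothetical, and the queueing of throttled identifiers is left out.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical limiter for a single domain.
pub struct DomainLimiter {
    max_concurrent: usize,
    max_per_window: usize,
    window: Duration,
    in_flight: usize,
    /// Start times of recent requests, oldest first.
    recent: VecDeque<Instant>,
}

impl DomainLimiter {
    pub fn new(max_concurrent: usize, max_per_window: usize, window: Duration) -> Self {
        DomainLimiter {
            max_concurrent,
            max_per_window,
            window,
            in_flight: 0,
            recent: VecDeque::new(),
        }
    }

    /// Try to start a request at `now`; returns false if the domain is throttled.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Slide the window: forget requests that started at least `window` ago.
        while let Some(&oldest) = self.recent.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                self.recent.pop_front();
            } else {
                break;
            }
        }
        if self.in_flight >= self.max_concurrent || self.recent.len() >= self.max_per_window {
            return false;
        }
        self.in_flight += 1;
        self.recent.push_back(now);
        true
    }

    /// Mark one in-flight request as finished.
    pub fn release(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }
}
```

With the defaults described above, the limiter would be built as `DomainLimiter::new(5, 30, Duration::from_secs(60))`; a caller that gets `false` would queue the identifier and retry once capacity frees up.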
@@ -439,7 +681,47 @@ See [`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs) for
439 681
440## Purgatory API 682## Purgatory API
441 683
442### Adding Entries 684### Announcement Purgatory
685
686```rust
687impl Purgatory {
688 /// Add an announcement to purgatory (bare repo already created by caller)
689 pub fn add_announcement(
690 &self,
691 event: Event,
692 identifier: String,
693 owner: PublicKey,
694 repo_path: PathBuf,
695 relays: HashSet<String>,
696 );
697
698 /// Promote announcement: remove from purgatory, return event for DB save
699 pub fn promote_announcement(
700 &self,
701 owner: &PublicKey,
702 identifier: &str,
703 ) -> Option<Event>;
704
705 /// Get announcements by identifier (for authorization checks)
706 pub fn get_announcements_by_identifier(
707 &self,
708 identifier: &str,
709 ) -> Vec<AnnouncementPurgatoryEntry>;
710
711 /// Extend expiry (and revive soft-expired entries, recreating bare repo)
712 pub fn extend_announcement_expiry(
713 &self,
714 owner: &PublicKey,
715 identifier: &str,
716 duration: Duration,
717 );
718
719 /// Get all announcements for sync registration
720 pub fn announcements_for_sync(&self) -> Vec<AnnouncementPurgatoryEntry>;
721}
722```
723
724### State and PR Purgatory
443 725
444```rust 726```rust
445impl Purgatory { 727impl Purgatory {
@@ -453,13 +735,7 @@ impl Purgatory {
453 735
454 /// Add a PR placeholder (git-data-first scenario) 736 /// Add a PR placeholder (git-data-first scenario)
455 pub fn add_pr_placeholder(&self, event_id: String, commit: String); 737 pub fn add_pr_placeholder(&self, event_id: String, commit: String);
456}
457```
458
459### Finding Entries
460 738
461```rust
462impl Purgatory {
463 /// Find state events waiting for an identifier 739 /// Find state events waiting for an identifier
464 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>; 740 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>;
465 741
@@ -476,13 +752,7 @@ impl Purgatory {
476 752
477 /// Find a PR placeholder specifically (git-data-first) 753 /// Find a PR placeholder specifically (git-data-first)
478 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>; 754 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>;
479}
480```
481
482### Removing Entries
483 755
484```rust
485impl Purgatory {
486 /// Remove all state events for an identifier 756 /// Remove all state events for an identifier
487 pub fn remove_state(&self, identifier: &str); 757 pub fn remove_state(&self, identifier: &str);
488 758
@@ -499,36 +769,14 @@ impl Purgatory {
499```rust 769```rust
500impl Purgatory { 770impl Purgatory {
501 /// Remove expired entries (called every 60 seconds) 771 /// Remove expired entries (called every 60 seconds)
502 /// Returns (state_removed, pr_removed) 772 /// Handles two-phase soft expiry for announcements
503 pub fn cleanup(&self) -> (usize, usize); 773 pub fn cleanup(&self);
504 774
505 /// Extend expiry for entries about to be processed 775 /// Extend expiry for state/PR entries about to be processed
506 /// Ensures at least `duration` remaining
507 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration); 776 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration);
508 777
509 /// Get current counts for metrics 778 /// Check if an event previously expired (prevents re-sync loops)
510 pub fn count(&self) -> (usize, usize); 779 pub fn is_expired(&self, event_id: &EventId) -> bool;
511}
512```
513
514### Sync Queue Management
515
516```rust
517impl Purgatory {
518 /// Enqueue identifier for sync with custom delay
519 pub fn enqueue_sync(&self, identifier: &str, delay: Duration);
520
521 /// Enqueue with default delay (3 minutes)
522 pub fn enqueue_sync_default(&self, identifier: &str);
523
524 /// Enqueue with immediate delay (500ms)
525 pub fn enqueue_sync_immediate(&self, identifier: &str);
526
527 /// Check if identifier has pending events
528 pub fn has_pending_events(&self, identifier: &str) -> bool;
529
530 /// Remove identifier from sync queue
531 pub fn remove_from_sync_queue(&self, identifier: &str);
532} 780}
533``` 781```
534 782
@@ -558,12 +806,6 @@ pub fn can_apply_state(
558 event: &Event, 806 event: &Event,
559 repo_path: &Path, 807 repo_path: &Path,
560) -> Result<bool>; 808) -> Result<bool>;
561
562/// Get refs from state that aren't being pushed
563pub fn get_unpushed_refs(
564 state_refs: &[RefPair],
565 pushed_refs: &[RefPair],
566) -> Vec<RefPair>;
567``` 809```
568 810
569See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation. 811See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation.
@@ -572,123 +814,37 @@ See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementat
572 814
573## Integration Points 815## Integration Points
574 816
575### 1. Event Policy (Nip34WritePolicy) 817### 1. Announcement Policy (`src/nostr/policy/announcement.rs`)
576 818
577State and PR events are added to purgatory when git data doesn't exist: 819Routes new announcements to purgatory or accepts replacements:
578 820
579```rust 821- If active DB announcement exists for `(pubkey, identifier)` → `Accept` immediately
580// From src/nostr/policy/state.rs 822- If purgatory entry exists → replace it, extend expiry, return `Accept`
581async fn handle_state(&self, event: &Event) -> WritePolicyResult { 823- Otherwise → return `AcceptPurgatory`, caller calls `add_to_purgatory()` which creates bare repo and adds to purgatory
582 let identifier = extract_identifier(event)?;
583
584 // Check if we have matching git data
585 if self.has_matching_git_data(&identifier, event).await? {
586 return WritePolicyResult::Accept;
587 }
588
589 // Add to purgatory
590 self.purgatory.add_state(
591 event.clone(),
592 identifier.clone(),
593 event.pubkey,
594 );
595
596 WritePolicyResult::Reject {
597 status: true, // Client sees OK
598 message: "purgatory: awaiting git data".into()
599 }
600}
601```
602 824
603### 2. Git Push Authorization 825### 2. State Event Policy (`src/nostr/policy/state.rs`)
604 826
605Authorization checks both database and purgatory: 827Checks purgatory announcements for authorization and extends their expiry:
606 828
607```rust 829```rust
608// From src/git/authorization.rs 830// Fetch announcements from both DB and purgatory
609pub async fn authorize_push( 831let repo_data = fetch_repository_data_with_purgatory(db, purgatory, identifier).await?;
610 database: &SharedDatabase, 832
611 identifier: &str, 833// For each authorized owner with a purgatory announcement, extend expiry
612 owner_pubkey: &str, 834purgatory.extend_announcement_expiry(&owner_pk, &identifier, Duration::from_secs(1800));
613 request_body: &Bytes,
614 purgatory: &Arc<Purgatory>, // Critical!
615 repo_path: &std::path::Path,
616) -> anyhow::Result<AuthorizationResult> {
617 // Parse pushed refs
618 let pushed_refs = parse_pushed_refs(request_body);
619
620 // Check database for state events
621 let db_result = get_authorization_from_db(database, identifier).await?;
622
623 if !db_result.authorized {
624 // No state in database - check purgatory
625 let purgatory_result = get_state_authorization_for_specific_owner_repo(
626 database,
627 identifier,
628 owner_pubkey,
629 purgatory,
630 &pushed_refs,
631 repo_path,
632 ).await?;
633
634 return purgatory_result;
635 }
636
637 db_result
638}
639``` 835```
640 836
641### 3. Post-Push Processing 837### 3. Git Push Authorization (`src/git/authorization.rs`)
642 838
643After successful push, events from purgatory are saved to database: 839`fetch_repository_data_with_purgatory()` merges DB announcements with purgatory announcements for authorization. When a push is authorized via purgatory state events, the authorization path also extends the announcement's expiry.
644 840
645```rust 841### 4. Git Data Processing (`src/git/sync.rs`)
646// From src/git/handlers.rs
647if from_purgatory {
648 if let (Some(db), Some(purg)) = (&database, &purgatory) {
649 // Save state event to database
650 db.save_event(&state.event).await?;
651
652 // Remove from purgatory
653 purg.remove_state_event(identifier, &state.event.id);
654 }
655}
656```
657 842
658### 4. Background Sync Loop 843`process_purgatory_announcements()` is called after any git push or background sync fetch. It promotes announcements from purgatory to the database and notifies WebSocket clients.
659 844
660Started during application initialization: 845### 5. Sync Registration (`src/sync/`)
661 846
662```rust 847A background timer (`run_purgatory_announcement_sync`, every 5 seconds) ensures purgatory announcements are registered in `RepoSyncIndex` with `SyncLevel::StateOnly`. When an announcement is promoted, the `SelfSubscriber` upgrades it to `SyncLevel::Full`.
663// From src/main.rs
664let purgatory = Arc::new(Purgatory::new(git_data_path));
665let ctx = Arc::new(RealSyncContext::new(
666 database.clone(),
667 purgatory.clone(),
668 config.domain.clone(),
669 git_data_path.clone(),
670));
671let throttle_manager = Arc::new(ThrottleManager::new(5, 30));
672throttle_manager.set_context(ctx.clone());
673
674// Start sync loop
675let sync_handle = purgatory.clone().start_sync_loop(ctx, throttle_manager);
676
677// Start cleanup task
678let cleanup_handle = tokio::spawn(async move {
679 let mut interval = tokio::time::interval(Duration::from_secs(60));
680 loop {
681 interval.tick().await;
682 let (state_removed, pr_removed) = purgatory.cleanup();
683 if state_removed + pr_removed > 0 {
684 tracing::debug!(
685 "Purgatory cleanup removed {} state, {} PR entries",
686 state_removed, pr_removed
687 );
688 }
689 }
690});
691```
692 848
693--- 849---
694 850
@@ -697,8 +853,9 @@ let cleanup_handle = tokio::spawn(async move {
697``` 853```
698src/ 854src/
699├── purgatory/ 855├── purgatory/
700│ ├── mod.rs # Main Purgatory struct and API 856│ ├── mod.rs # Main Purgatory struct, API, save_to_disk, restore_from_disk
701│ ├── types.rs # RefPair, StatePurgatoryEntry, PrPurgatoryEntry 857│ ├── types.rs # RefPair, AnnouncementPurgatoryEntry, StatePurgatoryEntry, PrPurgatoryEntry
858│ ├── persistence.rs # instant_to_offset / offset_to_instant time conversion utilities
702│ ├── helpers.rs # Ref extraction and matching functions 859│ ├── helpers.rs # Ref extraction and matching functions
703│ └── sync/ 860│ └── sync/
704│ ├── mod.rs # Sync module exports 861│ ├── mod.rs # Sync module exports
@@ -710,9 +867,10 @@ src/
710├── git/ 867├── git/
711│ ├── authorization.rs # authorize_push with purgatory checking 868│ ├── authorization.rs # authorize_push with purgatory checking
712│ ├── handlers.rs # handle_receive_pack with post-push processing 869│ ├── handlers.rs # handle_receive_pack with post-push processing
713│ └── sync.rs # process_newly_available_git_data 870│ └── sync.rs # process_newly_available_git_data, process_purgatory_announcements
714└── nostr/ 871└── nostr/
715 └── policy/ 872 └── policy/
873 ├── announcement.rs # Route announcements to purgatory
716 ├── state.rs # State event policy with purgatory 874 ├── state.rs # State event policy with purgatory
717 └── pr_event.rs # PR event policy with purgatory 875 └── pr_event.rs # PR event policy with purgatory
718``` 876```
@@ -725,7 +883,8 @@ src/
725 883
726Located in each module: 884Located in each module:
727 885
728- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations 886- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations including announcement purgatory; persistence round-trip tests for all entry types (state, PR, announcement, expired events, downtime calculation, soft-expired exclusion, missing-repo skip)
887- **[`src/purgatory/persistence.rs`](../../src/purgatory/persistence.rs)** - `instant_to_offset` / `offset_to_instant` round-trip tests
729- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic 888- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic
730- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext 889- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext
731- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager 890- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager
@@ -734,17 +893,33 @@ Located in each module:
734 893
735Located in [`tests/`](../../tests/): 894Located in [`tests/`](../../tests/):
736 895
896- **Announcement purgatory flow** - Announcement enters purgatory, git data promotes it
897- **Announcement soft expiry** - Bare repo deleted after 30 min, event retained 24h
898- **Announcement revival** - State event revives soft-expired announcement
737- **State event purgatory flow** - Event arrives, git push releases it 899- **State event purgatory flow** - Event arrives, git push releases it
738- **PR event purgatory flow** - Event arrives, git push releases it 900- **PR event purgatory flow** - Event arrives, git push releases it
739- **Git-data-first flow** - Git push creates placeholder, event completes it 901- **Git-data-first flow** - Git push creates placeholder, event completes it
740- **Authorization with purgatory** - Push authorized by purgatory state 902- **Authorization with purgatory** - Push authorized by purgatory state
741- **Background sync** - Sync fetches git data and releases events 903- **Background sync** - Sync fetches git data and releases events
904- **Persistence across restart** - Save/restore cycle preserves all entry types including announcements
742 905
743--- 906---
744 907
745## Key Learnings 908## Key Learnings
746 909
747### 1. Purgatory Authorization is Critical 910### 1. Announcement Purgatory Prevents Misleading Empty Repos
911
912Without announcement purgatory, we'd serve announcements for repos with no content. Clients would see the announcement, try to clone, and get nothing.
913
914**Solution:** Announcements wait in purgatory until git data proves content exists.
915
916### 2. Soft Expiry Avoids Sync Loops
917
918The protocol's 30-minute expiry creates a problem: without soft expiry, we'd either permanently block repositories or constantly re-sync expired announcement events.
919
920**Solution:** Soft expiry retains the event for 24 hours after deleting the bare repo, allowing revival without re-fetching.
921
922### 3. Purgatory Authorization is Critical
748 923
749Without checking purgatory during authorization, we have a deadlock: 924Without checking purgatory during authorization, we have a deadlock:
750- State event goes to purgatory (no git data) 925- State event goes to purgatory (no git data)
@@ -753,7 +928,7 @@ Without checking purgatory during authorization, we have a deadlock:
753 928
754**Solution:** `authorize_push()` checks both database and purgatory. 929**Solution:** `authorize_push()` checks both database and purgatory.
755 930
756### 2. Late Binding for State Events 931### 4. Late Binding for State Events
757 932
758Extracting refs at event arrival time doesn't work when: 933Extracting refs at event arrival time doesn't work when:
759- Multiple state events arrive for same identifier 934- Multiple state events arrive for same identifier
@@ -761,7 +936,7 @@ Extracting refs at event arrival time doesn't work when:
761 936
762**Solution:** Extract and match refs at push time via `find_matching_states()`. 937**Solution:** Extract and match refs at push time via `find_matching_states()`.
763 938
764### 3. Bidirectional Waiting for PR Events 939### 5. Bidirectional Waiting for PR Events
765 940
766PR events can arrive before or after git data: 941PR events can arrive before or after git data:
767- Event first → Wait for git push 942- Event first → Wait for git push
@@ -769,26 +944,21 @@ PR events can arrive before or after git data:
769 944
770**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder. 945**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder.
771 946
772### 4. Sync Queue Debouncing 947### 6. Persistence Requires Instant → Duration Conversion
773
774When events arrive in bursts (e.g., negentropy sync), we don't want to spawn a sync task for each event.
775
776**Solution:** `enqueue_sync()` resets `attempt_count` and updates `next_attempt` if already queued.
777 948
778### 5. Domain Throttling with Queues 949`std::time::Instant` is not serializable and is not meaningful across process boundaries. Expiry deadlines must be converted to a portable form.
779 950
780When a domain is throttled, we still want to eventually sync from it. 951**Solution:** Store each deadline as a `u64` second offset from a `saved_at: SystemTime` reference. On restore, subtract elapsed downtime from each offset to compute the new `Instant`. Entries whose deadline already passed during downtime get `expires_at = now` and are swept by the next cleanup tick.
781 952
782**Solution:** `ThrottleManager` maintains per-domain queues and processes them when capacity frees. 953**Soft-expired announcements are excluded from persistence** because their bare repos have been deleted. Restoring them would leave purgatory entries pointing at non-existent repos. They are simply dropped; background sync will re-fetch the announcement event if needed.
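
The offset conversion can be sketched as follows. The function names and signatures here are hypothetical; the real utilities are `instant_to_offset` / `offset_to_instant` in `persistence.rs`, whose signatures are not shown in this document.

```rust
use std::time::{Duration, Instant, SystemTime};

/// On save: seconds remaining until `deadline`, measured from `now`.
/// A deadline that has already passed is stored as 0.
pub fn deadline_to_offset_secs(deadline: Instant, now: Instant) -> u64 {
    deadline.saturating_duration_since(now).as_secs()
}

/// Downtime between the saved wall-clock reference and the current wall clock.
/// If the clock went backwards, treat the downtime as zero.
pub fn downtime_since(saved_at: SystemTime, now: SystemTime) -> Duration {
    now.duration_since(saved_at).unwrap_or(Duration::ZERO)
}

/// On restore: rebuild the deadline relative to the new process's `now`,
/// subtracting the time spent offline. Deadlines that passed during downtime
/// collapse to `now`, so the next cleanup tick sweeps them.
pub fn offset_secs_to_deadline(offset_secs: u64, downtime: Duration, now: Instant) -> Instant {
    let remaining = Duration::from_secs(offset_secs).saturating_sub(downtime);
    now + remaining
}
```

Storing whole seconds loses sub-second precision, which is harmless for deadlines measured in minutes or hours.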
783 954
784--- 955---
785 956
786## Related Documentation 957## Related Documentation
787 958
788- [Inline Authorization](inline-authorization.md) - Why purgatory checking during authorization is essential
789- [Architecture Overview](architecture.md) - Full system design 959- [Architecture Overview](architecture.md) - Full system design
790- [Background Sync](../how-to/purgatory-sync.md) - How to configure and monitor sync 960- [GRASP-02 Proactive Sync](grasp-02-proactive-sync.md) - Relay-to-relay event sync with SyncLevel
791- [Test Strategy](../reference/test-strategy.md) - How we test purgatory 961- [GRASP-02 Purgatory Git Data Fetching](grasp-02-proactive-sync-purgatory-git-data.md) - Background git data hunting
792 962
793--- 963---
794 964