upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/docs/explanation/purgatory-design.md
Diffstat (limited to 'docs/explanation/purgatory-design.md')
-rw-r--r-- docs/explanation/purgatory-design.md 574
1 files changed, 372 insertions, 202 deletions
diff --git a/docs/explanation/purgatory-design.md b/docs/explanation/purgatory-design.md
index b984745..8e7d75c 100644
--- a/docs/explanation/purgatory-design.md
+++ b/docs/explanation/purgatory-design.md
@@ -8,7 +8,11 @@
8 8
9## Overview 9## Overview
10 10
11Purgatory is an in-memory holding area that solves the **"which arrives first?"** problem in GRASP. Either nostr events or git pushes can arrive in any order: 11Purgatory is an in-memory holding area that solves two related problems in GRASP:
12
13### Problem 1: "Which arrives first?" (State and PR events)
14
15Either nostr events or git pushes can arrive in any order:
12 16
13- **Event first**: Event waits in purgatory until git data arrives 17- **Event first**: Event waits in purgatory until git data arrives
14- **Git first**: Placeholder waits in purgatory until event arrives 18- **Git first**: Placeholder waits in purgatory until event arrives
@@ -19,28 +23,61 @@ When both halves arrive, they are processed together and saved to the database.
19 23
20> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes. 24> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes.
21 25
26### Problem 2: Misleading empty repository announcements
27
28When a repository announcement arrives, we must create the bare git repo immediately so pushes can succeed. But if no git data ever arrives, we would serve an empty repo and its announcement indefinitely—clients see the announcement, try to clone, and get nothing.
29
30**Solution**: New announcements go to **announcement purgatory** instead of being immediately accepted:
31
321. **Announcement arrives** → Create bare repo immediately, add announcement to purgatory
332. **Git data arrives** → Promote announcement from purgatory to active (now served to clients)
343. **No git data before expiry** → Delete bare repo, discard announcement (never served)
35
36This ensures we only serve announcements for repos that actually have content.
37
22--- 38---
23 39
24## Key Design Principles 40## Key Design Principles
25 41
26### 1. In-Memory Only 42### 1. Graceful-Shutdown Persistence
43
44Purgatory state is **saved to disk on graceful shutdown** and **restored on startup**. This preserves in-flight work across planned restarts (deployments, reboots).
45
46On `SIGINT` / Ctrl-C, `main.rs` calls `purgatory.save_to_disk()` before exiting. On startup, if the state file exists, `purgatory.restore_from_disk()` is called before the server begins accepting connections.
47
48**What is persisted:**
49
50| Store | Persisted? | Notes |
51|-------|-----------|-------|
52| `announcement_purgatory` | ✅ Yes | Non-soft-expired entries only (bare repo must exist) |
53| `state_events` | ✅ Yes | All active entries |
54| `pr_events` | ✅ Yes | Both events and placeholders |
55| `expired_events` | ✅ Yes | Prevents re-sync loops after restart |
56| `sync_queue` | ❌ No | Rebuilt automatically after restore |
27 57
28Purgatory data is **not persisted** to disk. On restart, all purgatory entries are lost. This is acceptable because: 58**What is NOT persisted (unclean shutdown):**
59
60On a crash or `SIGKILL`, the state file is not written. In that case:
29 61
30- Events are still on other relays (can be re-submitted) 62- Events are still on other relays (can be re-submitted)
31- Git data can be re-pushed 63- Git data can be re-pushed
32- 30-minute expiry means data is transient anyway 64- 30-minute expiry means data is transient anyway
33 65
34### 2. Separate Storage for State vs PR Events 66**State file location:** `<git_data_path>/purgatory-state.json`
67
68**Downtime accounting:** Expiry deadlines are stored as duration offsets from the save timestamp. On restore, elapsed downtime is subtracted from each deadline. Entries that expired during downtime are immediately swept by the next cleanup tick.
69
70**Soft-expired announcements are excluded:** Their bare repos have already been deleted, so they cannot be meaningfully restored. They will be re-fetched via background sync if needed.
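The downtime accounting described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `persistence.rs`: the function names `remaining_secs`, `remaining_after_downtime` and `restore_deadline` are hypothetical, and the choice to treat a backwards clock jump as zero downtime is an assumption.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Seconds left until `deadline`, measured at save time (0 if already past).
/// Hypothetical stand-in for the save-side conversion; `Instant` itself
/// cannot be written to disk, so only this offset is stored.
pub fn remaining_secs(deadline: Instant, now: Instant) -> u64 {
    deadline.saturating_duration_since(now).as_secs()
}

/// Remaining TTL after a restart: the stored offset minus elapsed downtime.
/// Assumption: if the wall clock moved backwards, downtime counts as zero.
pub fn remaining_after_downtime(
    offset_secs: u64,
    saved_at: SystemTime,
    restored_at: SystemTime,
) -> Duration {
    let downtime = restored_at
        .duration_since(saved_at)
        .unwrap_or(Duration::ZERO);
    Duration::from_secs(offset_secs).saturating_sub(downtime)
}

/// Rebuild an in-memory deadline. An entry whose TTL ran out during downtime
/// gets a deadline equal to `now`, so the next cleanup tick sweeps it.
pub fn restore_deadline(
    offset_secs: u64,
    saved_at: SystemTime,
    restored_at: SystemTime,
    now: Instant,
) -> Instant {
    now + remaining_after_downtime(offset_secs, saved_at, restored_at)
}
```

For example, an entry saved with 1800 seconds remaining and restored after 600 seconds of downtime comes back with 1200 seconds left, while one saved with 300 seconds remaining comes back already expired.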
35 71
36State events (kind 30618) and PR events (kind 1617/1618) have fundamentally different matching patterns: 72### 2. Separate Storage for Each Event Type
37 73
38| Event Type | Index | Matching Strategy | 74| Store | Index | Purpose |
39|------------|-------|-------------------| 75|-------|-------|---------|
40| **State Events** | `identifier` (d tag) | Compare refs at push time | 76| `announcement_purgatory` | `(PublicKey, String)` — `(owner, identifier)` | Announcements awaiting git data |
41| **PR Events** | `event_id` (hex string) | Direct match via `refs/nostr/<event-id>` | 77| `state_events` | `identifier` (d tag) | State events awaiting git data |
78| `pr_events` | `event_id` (hex string) | PR events awaiting git data |
42 79
43They use **separate DashMap stores** for efficient concurrent access. 80Announcement purgatory uses `(pubkey, identifier)` because identifier alone is not unique across different owners.
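To illustrate why the announcement store is keyed by the owner and identifier pair, here is a small sketch. It uses `std::collections::HashMap` with hex strings as a stand-in for the real `DashMap<(PublicKey, String), _>`; `sample_store` and `owners_for_identifier` are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

/// Stand-in key: (owner pubkey as hex, repository identifier).
pub type AnnouncementKey = (String, String);

/// Build a store where two owners announce the same identifier.
pub fn sample_store() -> HashMap<AnnouncementKey, String> {
    let mut store = HashMap::new();
    store.insert(
        ("alice".to_string(), "my-repo".to_string()),
        "alice's announcement".to_string(),
    );
    store.insert(
        ("bob".to_string(), "my-repo".to_string()),
        "bob's announcement".to_string(),
    );
    store.insert(
        ("alice".to_string(), "other".to_string()),
        "alice's second announcement".to_string(),
    );
    store
}

/// Collect every owner with an entry for `identifier`, sorted for stable output.
/// A lookup by identifier alone has to scan, because the key includes the owner.
pub fn owners_for_identifier(
    store: &HashMap<AnnouncementKey, String>,
    identifier: &str,
) -> Vec<String> {
    let mut owners: Vec<String> = store
        .keys()
        .filter(|(_, id)| id.as_str() == identifier)
        .map(|(owner, _)| owner.clone())
        .collect();
    owners.sort();
    owners
}
```

With an identifier-only key, the second insert for `my-repo` would overwrite the first; with the tuple key both entries coexist.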
44 81
45### 3. Late Binding for State Events 82### 3. Late Binding for State Events
46 83
@@ -78,7 +115,23 @@ With purgatory checking during authorization:
782. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅ 1152. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅
793. After push succeeds → Save event to database → Remove from purgatory 1163. After push succeeds → Save event to database → Remove from purgatory
80 117
81See [`src/git/authorization.rs:51-162`](../../src/git/authorization.rs) for implementation. 118See [`src/git/authorization.rs`](../../src/git/authorization.rs) for implementation.
119
120### 6. Announcement Purgatory: Bare Repo Created Immediately
121
122**Decision:** Create the bare git repo when announcement enters purgatory.
123
124**Why:** Git pushes may arrive at any time. Without a repo, pushes fail.
125
126**Consequence:** We allocate disk space for repos that may expire unused. Must delete repos on expiry.
127
128### 7. Replacement Announcements Skip Purgatory
129
130**Decision:** Announcements replacing an existing active (database) announcement are accepted immediately.
131
132**Why:** The repository is already proven active with content.
133
134**How:** Check if active announcement exists for `(pubkey, identifier)` before routing to purgatory.
82 135
83--- 136---
84 137
@@ -103,22 +156,54 @@ pub struct RefUpdate {
103} 156}
104``` 157```
105 158
159### Announcement Purgatory Entry
160
161```rust
162pub struct AnnouncementPurgatoryEntry {
163 /// The kind 30617 announcement event
164 pub event: Event,
165
166 /// Repository identifier from 'd' tag
167 pub identifier: String,
168
169 /// Event author pubkey
170 pub owner: PublicKey,
171
172 /// Path to the bare git repo on disk (created immediately on entry)
173 pub repo_path: PathBuf,
174
175 /// Relay URLs from 'relays'/'clone' tags — for sync registration
176 pub relays: HashSet<String>,
177
178 /// When added to purgatory
179 pub created_at: Instant,
180
181 /// Expiry deadline (30 min from creation, may be extended)
182 pub expires_at: Instant,
183
184 /// Whether the bare repo has been deleted (soft expiry phase)
185 pub soft_expired: bool,
186}
187```
188
189**Indexed by `(pubkey, identifier)`** because identifier is not unique across different owners.
190
106### State Purgatory Entry 191### State Purgatory Entry
107 192
108```rust 193```rust
109pub struct StatePurgatoryEntry { 194pub struct StatePurgatoryEntry {
110 /// The nostr state event (kind 30618) awaiting git data 195 /// The nostr state event (kind 30618) awaiting git data
111 pub event: Event, 196 pub event: Event,
112 197
113 /// Repository identifier from 'd' tag 198 /// Repository identifier from 'd' tag
114 pub identifier: String, 199 pub identifier: String,
115 200
116 /// Event author pubkey 201 /// Event author pubkey
117 pub author: PublicKey, 202 pub author: PublicKey,
118 203
119 /// When added to purgatory 204 /// When added to purgatory
120 pub created_at: Instant, 205 pub created_at: Instant,
121 206
122 /// Expiry deadline (30 min from creation, may be extended) 207 /// Expiry deadline (30 min from creation, may be extended)
123 pub expires_at: Instant, 208 pub expires_at: Instant,
124} 209}
@@ -132,14 +217,14 @@ pub struct StatePurgatoryEntry {
132pub struct PrPurgatoryEntry { 217pub struct PrPurgatoryEntry {
133 /// The nostr PR event, if received (None = git data arrived first) 218 /// The nostr PR event, if received (None = git data arrived first)
134 pub event: Option<Event>, 219 pub event: Option<Event>,
135 220
136 /// Expected commit SHA from 'c' tag (if event exists) 221 /// Expected commit SHA from 'c' tag (if event exists)
137 /// or actual commit pushed (if git arrived first) 222 /// or actual commit pushed (if git arrived first)
138 pub commit: String, 223 pub commit: String,
139 224
140 /// When added to purgatory 225 /// When added to purgatory
141 pub created_at: Instant, 226 pub created_at: Instant,
142 227
143 /// Expiry deadline (30 min from creation) 228 /// Expiry deadline (30 min from creation)
144 pub expires_at: Instant, 229 pub expires_at: Instant,
145} 230}
@@ -151,24 +236,180 @@ pub struct PrPurgatoryEntry {
151 236
152```rust 237```rust
153pub struct Purgatory { 238pub struct Purgatory {
239 /// Announcement events indexed by (owner, identifier)
240 announcement_purgatory: DashMap<(PublicKey, String), AnnouncementPurgatoryEntry>,
241
154 /// State events indexed by identifier (d tag) 242 /// State events indexed by identifier (d tag)
155 /// Multiple state events per identifier allowed (different authors) 243 /// Multiple state events per identifier allowed (different authors)
156 state_events: Arc<DashMap<String, Vec<StatePurgatoryEntry>>>, 244 state_events: DashMap<String, Vec<StatePurgatoryEntry>>,
157 245
158 /// PR events indexed by event_id (hex string) 246 /// PR events indexed by event_id (hex string)
159 /// Single entry per event ID 247 /// Single entry per event ID
160 pr_events: Arc<DashMap<String, PrPurgatoryEntry>>, 248 pr_events: DashMap<String, PrPurgatoryEntry>,
161 249
162 /// Sync queue for background git data fetching 250 /// Sync queue for background git data fetching
163 sync_queue: Arc<DashMap<String, SyncQueueEntry>>, 251 sync_queue: DashMap<String, SyncQueueEntry>,
164 252
165 _git_data_path: PathBuf, 253 /// Events that previously expired without git data (prevents re-sync loops)
254 expired_events: DashMap<EventId, Instant>,
255}
256```
257
258### Persistence State (Disk Format)
259
260`Instant` fields cannot be serialized directly. Each entry type has a corresponding `Serializable*` wrapper that stores time fields as `u64` second offsets from a `saved_at: SystemTime` reference point. On restore, elapsed downtime is subtracted to produce the correct remaining TTL.
261
262```rust
263struct PurgatoryState {
264 version: u32, // currently 1
265 saved_at: SystemTime, // reference for offset math
266
267 /// Non-soft-expired announcements indexed by "owner_hex:identifier"
268 announcement_purgatory: HashMap<String, SerializableAnnouncementPurgatoryEntry>,
269
270 /// State events indexed by repository identifier
271 state_events: HashMap<String, Vec<SerializableStatePurgatoryEntry>>,
272
273 /// PR events (and placeholders) indexed by event ID hex
274 pr_events: HashMap<String, SerializablePrPurgatoryEntry>,
275
276 /// Expired event IDs → approximate expiry SystemTime
277 expired_events: HashMap<String, SystemTime>,
166} 278}
167``` 279```
168 280
281The `announcement_purgatory` field uses `#[serde(default)]` so that state files written before announcement persistence was added (version 1 without the field) still deserialize correctly.
282
283---
284
285## Announcement Purgatory Flows
286
287### New Announcement Flow
288
289```
290Announcement arrives
291 |
292 v
293Is there an active announcement for (pubkey, identifier) in DB?
294 |
295 +-- YES --> Accept immediately (replacement, repo already proven)
296 |
297 +-- NO --> Is there a purgatory entry for (pubkey, identifier)?
298 |
299 +-- YES --> Replace purgatory entry, extend expiry 30 min
300 | Return OK to client (but don't serve)
301 |
302 +-- NO --> Create bare repo
303 Add to purgatory
304 Return OK to client (but don't serve)
305```
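The decision tree above reduces to two lookups. The sketch below models only the branching; the `Route` enum and `route_announcement` function are hypothetical and do not correspond to names in `AnnouncementPolicy`.

```rust
/// Outcome of routing an incoming announcement (illustrative names).
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// An active announcement exists in the database: accept as a replacement.
    AcceptReplacement,
    /// A purgatory entry exists: replace it and extend its expiry.
    ReplaceInPurgatory,
    /// Nothing known yet: create the bare repo and hold the event in purgatory.
    CreateRepoAndHold,
}

/// The database check comes first, so a proven repo never re-enters purgatory.
pub fn route_announcement(active_in_db: bool, in_purgatory: bool) -> Route {
    if active_in_db {
        Route::AcceptReplacement
    } else if in_purgatory {
        Route::ReplaceInPurgatory
    } else {
        Route::CreateRepoAndHold
    }
}
```

In the last two cases the client still receives OK, but the event is not served until git data arrives.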
306
307### Git Data Arrival → Promotion
308
309```
310Git push/fetch completes with data
311 |
312 v
313process_purgatory_announcements() called
314 |
315 v
316Is there a purgatory announcement for (owner, identifier)?
317 |
318 +-- YES --> promote_announcement() removes from purgatory
319 | Save event to database
320 | Notify WebSocket clients
321 | (Sync upgrades to Full automatically via SelfSubscriber)
322 |
323 +-- NO --> Normal processing
324```
325
326### State Event Arrival for Purgatory Announcement
327
328```
329State event arrives
330 |
331 v
332fetch_repository_data_with_purgatory() checks DB + purgatory
333 |
334 +-- Announcement found in purgatory -->
335 | Validate authorization against purgatory announcement
336 | Extend purgatory announcement expiry (reset 30-min timer)
337 | If soft-expired: recreate bare repo, clear soft_expired flag
338 | Route state event to state purgatory
339 |
340 +-- No announcement anywhere --> Reject
341```
342
343### Announcement Expiry (Two-Phase Soft Expiry)
344
345The protocol specifies 30-minute expiry for announcements. We implement a two-phase soft expiry:
346
347**Phase 1 — Initial 30-minute expiry (`soft_expired == false`):**
348- Delete the bare git repo (frees disk space, respects protocol expiry)
349- Set `soft_expired = true`
350- Extend `expires_at` by 24 hours (`SOFT_EXPIRY_EXTENDED`)
351- Continue syncing state events (same as active purgatory)
352
353**Phase 2 — 24-hour soft expiry (`soft_expired == true`):**
354- Add event ID to `expired_events` (prevents re-sync loops)
355- Remove entry completely from `announcement_purgatory`
356
357**Why soft expiry?** Without it, we'd face a dilemma:
358
359- Add expired announcements to `failed_events` → permanently reject future state events, losing potential revival when state events arrive late
360- Re-fetch the announcement event on every sync cycle → wasting bandwidth and creating unnecessary sync traffic
361
362Soft expiry retains the event for 24 hours so that late-arriving state events (e.g. from a slow sync) can revive the announcement without forcing a full re-announcement flow.
363
364**Revival:** If a state event arrives for a soft-expired announcement, `extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer.
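The two phases and the revival path can be modelled as a small state machine. This is a sketch under stated assumptions, not the actual `cleanup()` logic: the `Entry` and `Sweep` types are hypothetical, `repo_exists` stands in for real filesystem operations, phase 1 adds 24 hours to the existing deadline, and revival resets the timer to `now + duration`.

```rust
use std::time::{Duration, Instant};

pub const INITIAL_TTL: Duration = Duration::from_secs(30 * 60);
pub const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Simplified purgatory entry; `repo_exists` stands in for the bare repo on disk.
#[derive(Debug)]
pub struct Entry {
    pub expires_at: Instant,
    pub soft_expired: bool,
    pub repo_exists: bool,
}

/// What a cleanup tick decided for one entry.
#[derive(Debug, PartialEq, Eq)]
pub enum Sweep {
    Keep,
    SoftExpired,
    Remove,
}

impl Entry {
    pub fn new(now: Instant) -> Self {
        Entry { expires_at: now + INITIAL_TTL, soft_expired: false, repo_exists: true }
    }

    /// One cleanup tick: phase 1 deletes the repo and extends the deadline,
    /// phase 2 tells the caller to drop the entry.
    pub fn sweep(&mut self, now: Instant) -> Sweep {
        if now < self.expires_at {
            return Sweep::Keep;
        }
        if !self.soft_expired {
            self.repo_exists = false;
            self.soft_expired = true;
            self.expires_at += SOFT_EXPIRY_EXTENDED;
            Sweep::SoftExpired
        } else {
            Sweep::Remove
        }
    }

    /// Extend the deadline; a soft-expired entry is revived first.
    pub fn extend(&mut self, now: Instant, duration: Duration) {
        if self.soft_expired {
            self.repo_exists = true; // the real code would recreate the bare repo here
            self.soft_expired = false;
        }
        self.expires_at = now + duration;
    }
}
```

Passing `now` explicitly keeps the transitions deterministic, which makes the 30-minute and 24-hour boundaries easy to exercise in tests without sleeping.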
365
366### Expiry Extension Triggers
367
368The 30-minute purgatory timer is reset (extended) in three scenarios:
369
370| Trigger | Location | Why |
371|---------|----------|-----|
372| State event arrives | `StatePolicy::process_state_event()` | Repo is actively receiving metadata |
373| Git push authorized against purgatory state | `get_state_authorization_for_specific_owner_repo()` | Repo is actively receiving git data |
374| Replacement announcement arrives | `AnnouncementPolicy::validate()` | Announcement updated |
375
376All three call `purgatory.extend_announcement_expiry(owner, identifier, 1800s)`.
377
378### Purgatory Lifecycle
379
380```
381 ┌─────────────────────────────────────┐
382 │ │
383 v │
384Announcement ──> ACTIVE ──────────────────────────────────┤
385 arrives (bare repo exists) │
386 │ │
387 ├── Git data ──> PROMOTED (exit) │
388 │ │
389 ├── Deletion ──> REMOVED (exit) │
390 │ │
391 v │
392 SOFT_EXPIRED ──────────────────────────────┘
393 (bare repo deleted, ^
394 event retained) │
395 │ │
396 ├── State event arrives (revival)
397
398 └── Extended expiry ──> REMOVED (exit)
399```
400
401| Exit | Trigger | Action |
402|------|---------|--------|
403| **Promotion** | Git data arrives | Move to database, sync upgrades to Full |
404| **Soft expiry** | Initial 30-min timeout | Delete bare repo, retain event, continue sync |
405| **Full expiry** | 24-hour soft expiry | Add to expired_events, remove from purgatory |
406| **Deletion** | Kind 5 event | Delete bare repo, remove from purgatory |
407| **Replacement** | Newer announcement (same pubkey, identifier) | Replace entry, extend expiry |
408| **Service change** | Newer announcement removes our service | Remove from purgatory |
409
169--- 410---
170 411
171## Event Flows 412## State and PR Event Flows
172 413
173### State Event Arrival (Kind 30618) 414### State Event Arrival (Kind 30618)
174 415
@@ -377,11 +618,12 @@ Purgatory includes a background sync system that fetches git data from remote se
377 618
378┌─────────────────────────────────────────────────────┐ 619┌─────────────────────────────────────────────────────┐
379│ process_newly_available_git_data(repo, oids) │ 620│ process_newly_available_git_data(repo, oids) │
380│ 1. Find satisfiable state events in purgatory │ 621│ 1. Find satisfiable announcement in purgatory │
381│ 2. Find satisfiable PR events in purgatory │ 622│ 2. Find satisfiable state events in purgatory │
382│ 3. Save events to database │ 623│ 3. Find satisfiable PR events in purgatory │
383│ 4. Sync git data to other owner repos │ 624│ 4. Save events to database │
384│ 5. Remove from purgatory │ 625│ 5. Sync git data to other owner repos │
626│ 6. Remove from purgatory │
385└─────────────────────────────────────────────────────┘ 627└─────────────────────────────────────────────────────┘
386``` 628```
387 629
@@ -402,8 +644,8 @@ pub struct SyncQueueEntry {
402 644
403**Backoff strategy:** 645**Backoff strategy:**
404- First attempt: 20 seconds 646- First attempt: 20 seconds
405- Second attempt: 2 minutes 647- Second attempt: 40 seconds
406- Subsequent attempts: 2 minutes 648- Subsequent attempts: capped at 2 minutes
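A capped exponential schedule matching the revised values (20 seconds, 40 seconds, then at most 2 minutes) might look like the sketch below. The doubling rule between 40 seconds and the cap is an assumption, since only the first two delays and the cap are stated here; `backoff_delay` and the constants are hypothetical names.

```rust
use std::time::Duration;

pub const BASE_DELAY_SECS: u64 = 20;
pub const MAX_DELAY_SECS: u64 = 120;

/// Delay before retry number `attempt` (zero-based: 0 is the first attempt).
/// Doubles each time and saturates at the 2-minute cap.
pub fn backoff_delay(attempt: u32) -> Duration {
    // checked_shl returns None once the shift would overflow 64 bits;
    // any such attempt count is far past the cap anyway.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let secs = BASE_DELAY_SECS.saturating_mul(factor).min(MAX_DELAY_SECS);
    Duration::from_secs(secs)
}
```

Under this assumption the sequence is 20 s, 40 s, 80 s, 120 s, and 120 s for every later attempt.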
407 649
408### Sync Delays 650### Sync Delays
409 651
@@ -428,7 +670,7 @@ pub struct ThrottleManager {
428``` 670```
429 671
430**Rate limiting:** 672**Rate limiting:**
431- Default: 5 requests per domain per 30 seconds 673- Default: 5 concurrent requests per domain, 30 requests per minute
432- Tracks request timestamps in a sliding window 674- Tracks request timestamps in a sliding window
433- Queues identifiers when domain is throttled 675- Queues identifiers when domain is throttled
434- Processes queue when capacity frees up 676- Processes queue when capacity frees up
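The sliding-window part of the rate limit can be sketched for a single domain as follows. This is not the `ThrottleManager` implementation: `SlidingWindow` and `try_acquire` are hypothetical names, the concurrency limit and the queue of waiting identifiers are left out, and a real manager would presumably keep one window per domain.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window limiter for one domain: at most `max_requests`
/// request timestamps may fall inside any `window`-long interval.
pub struct SlidingWindow {
    max_requests: usize,
    window: Duration,
    stamps: VecDeque<Instant>,
}

impl SlidingWindow {
    pub fn new(max_requests: usize, window: Duration) -> Self {
        SlidingWindow { max_requests, window, stamps: VecDeque::new() }
    }

    /// Record a request at `now` if capacity allows; return whether it was admitted.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = self.stamps.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                self.stamps.pop_front();
            } else {
                break;
            }
        }
        if self.stamps.len() < self.max_requests {
            self.stamps.push_back(now);
            true
        } else {
            false
        }
    }
}
```

A caller that gets `false` would queue the identifier and retry once the oldest timestamp leaves the window, which corresponds to the "processes queue when capacity frees up" step.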
@@ -439,7 +681,47 @@ See [`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs) for
439 681
440## Purgatory API 682## Purgatory API
441 683
442### Adding Entries 684### Announcement Purgatory
685
686```rust
687impl Purgatory {
688 /// Add an announcement to purgatory (bare repo already created by caller)
689 pub fn add_announcement(
690 &self,
691 event: Event,
692 identifier: String,
693 owner: PublicKey,
694 repo_path: PathBuf,
695 relays: HashSet<String>,
696 );
697
698 /// Promote announcement: remove from purgatory, return event for DB save
699 pub fn promote_announcement(
700 &self,
701 owner: &PublicKey,
702 identifier: &str,
703 ) -> Option<Event>;
704
705 /// Get announcements by identifier (for authorization checks)
706 pub fn get_announcements_by_identifier(
707 &self,
708 identifier: &str,
709 ) -> Vec<AnnouncementPurgatoryEntry>;
710
711 /// Extend expiry (and revive soft-expired entries, recreating bare repo)
712 pub fn extend_announcement_expiry(
713 &self,
714 owner: &PublicKey,
715 identifier: &str,
716 duration: Duration,
717 );
718
719 /// Get all announcements for sync registration
720 pub fn announcements_for_sync(&self) -> Vec<AnnouncementPurgatoryEntry>;
721}
722```
723
724### State and PR Purgatory
443 725
444```rust 726```rust
445impl Purgatory { 727impl Purgatory {
@@ -453,13 +735,7 @@ impl Purgatory {
453 735
454 /// Add a PR placeholder (git-data-first scenario) 736 /// Add a PR placeholder (git-data-first scenario)
455 pub fn add_pr_placeholder(&self, event_id: String, commit: String); 737 pub fn add_pr_placeholder(&self, event_id: String, commit: String);
456}
457```
458
459### Finding Entries
460 738
461```rust
462impl Purgatory {
463 /// Find state events waiting for an identifier 739 /// Find state events waiting for an identifier
464 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>; 740 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>;
465 741
@@ -476,13 +752,7 @@ impl Purgatory {
476 752
477 /// Find a PR placeholder specifically (git-data-first) 753 /// Find a PR placeholder specifically (git-data-first)
478 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>; 754 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>;
479}
480```
481
482### Removing Entries
483 755
484```rust
485impl Purgatory {
486 /// Remove all state events for an identifier 756 /// Remove all state events for an identifier
487 pub fn remove_state(&self, identifier: &str); 757 pub fn remove_state(&self, identifier: &str);
488 758
@@ -499,36 +769,14 @@ impl Purgatory {
499```rust 769```rust
500impl Purgatory { 770impl Purgatory {
501 /// Remove expired entries (called every 60 seconds) 771 /// Remove expired entries (called every 60 seconds)
502 /// Returns (state_removed, pr_removed) 772 /// Handles two-phase soft expiry for announcements
503 pub fn cleanup(&self) -> (usize, usize); 773 pub fn cleanup(&self);
504 774
505 /// Extend expiry for entries about to be processed 775 /// Extend expiry for state/PR entries about to be processed
506 /// Ensures at least `duration` remaining
507 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration); 776 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration);
508 777
509 /// Get current counts for metrics 778 /// Check if an event previously expired (prevents re-sync loops)
510 pub fn count(&self) -> (usize, usize); 779 pub fn is_expired(&self, event_id: &EventId) -> bool;
511}
512```
513
514### Sync Queue Management
515
516```rust
517impl Purgatory {
518 /// Enqueue identifier for sync with custom delay
519 pub fn enqueue_sync(&self, identifier: &str, delay: Duration);
520
521 /// Enqueue with default delay (3 minutes)
522 pub fn enqueue_sync_default(&self, identifier: &str);
523
524 /// Enqueue with immediate delay (500ms)
525 pub fn enqueue_sync_immediate(&self, identifier: &str);
526
527 /// Check if identifier has pending events
528 pub fn has_pending_events(&self, identifier: &str) -> bool;
529
530 /// Remove identifier from sync queue
531 pub fn remove_from_sync_queue(&self, identifier: &str);
532} 780}
533``` 781```
534 782
@@ -558,12 +806,6 @@ pub fn can_apply_state(
558 event: &Event, 806 event: &Event,
559 repo_path: &Path, 807 repo_path: &Path,
560) -> Result<bool>; 808) -> Result<bool>;
561
562/// Get refs from state that aren't being pushed
563pub fn get_unpushed_refs(
564 state_refs: &[RefPair],
565 pushed_refs: &[RefPair],
566) -> Vec<RefPair>;
567``` 809```
568 810
569See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation. 811See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation.
@@ -572,123 +814,37 @@ See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementat
572 814
573## Integration Points 815## Integration Points
574 816
575### 1. Event Policy (Nip34WritePolicy) 817### 1. Announcement Policy (`src/nostr/policy/announcement.rs`)
576 818
577State and PR events are added to purgatory when git data doesn't exist: 819Routes new announcements to purgatory or accepts replacements:
578 820
579```rust 821- If active DB announcement exists for `(pubkey, identifier)` → `Accept` immediately
580// From src/nostr/policy/state.rs 822- If purgatory entry exists → replace it, extend expiry, return `Accept`
581async fn handle_state(&self, event: &Event) -> WritePolicyResult { 823- Otherwise → return `AcceptPurgatory`, caller calls `add_to_purgatory()` which creates bare repo and adds to purgatory
582 let identifier = extract_identifier(event)?;
583
584 // Check if we have matching git data
585 if self.has_matching_git_data(&identifier, event).await? {
586 return WritePolicyResult::Accept;
587 }
588
589 // Add to purgatory
590 self.purgatory.add_state(
591 event.clone(),
592 identifier.clone(),
593 event.pubkey,
594 );
595
596 WritePolicyResult::Reject {
597 status: true, // Client sees OK
598 message: "purgatory: awaiting git data".into()
599 }
600}
601```
602 824
603### 2. Git Push Authorization 825### 2. State Event Policy (`src/nostr/policy/state.rs`)
604 826
605Authorization checks both database and purgatory: 827Checks purgatory announcements for authorization and extends their expiry:
606 828
607```rust 829```rust
608// From src/git/authorization.rs 830// Fetch announcements from both DB and purgatory
609pub async fn authorize_push( 831let repo_data = fetch_repository_data_with_purgatory(db, purgatory, identifier).await?;
610 database: &SharedDatabase, 832
611 identifier: &str, 833// For each authorized owner with a purgatory announcement, extend expiry
612 owner_pubkey: &str, 834purgatory.extend_announcement_expiry(&owner_pk, &identifier, Duration::from_secs(1800));
613 request_body: &Bytes,
614 purgatory: &Arc<Purgatory>, // Critical!
615 repo_path: &std::path::Path,
616) -> anyhow::Result<AuthorizationResult> {
617 // Parse pushed refs
618 let pushed_refs = parse_pushed_refs(request_body);
619
620 // Check database for state events
621 let db_result = get_authorization_from_db(database, identifier).await?;
622
623 if !db_result.authorized {
624 // No state in database - check purgatory
625 let purgatory_result = get_state_authorization_for_specific_owner_repo(
626 database,
627 identifier,
628 owner_pubkey,
629 purgatory,
630 &pushed_refs,
631 repo_path,
632 ).await?;
633
634 return purgatory_result;
635 }
636
637 db_result
638}
639``` 835```
640 836
641### 3. Post-Push Processing 837### 3. Git Push Authorization (`src/git/authorization.rs`)
642 838
643After successful push, events from purgatory are saved to database: 839`fetch_repository_data_with_purgatory()` merges DB announcements with purgatory announcements for authorization. On successful authorization via purgatory state events, also extends announcement expiry.
644 840
645```rust 841### 4. Git Data Processing (`src/git/sync.rs`)
646// From src/git/handlers.rs
647if from_purgatory {
648 if let (Some(db), Some(purg)) = (&database, &purgatory) {
649 // Save state event to database
650 db.save_event(&state.event).await?;
651
652 // Remove from purgatory
653 purg.remove_state_event(identifier, &state.event.id);
654 }
655}
656```
657 842
658### 4. Background Sync Loop 843`process_purgatory_announcements()` is called after any git push or background sync fetch. It promotes announcements from purgatory to the database and notifies WebSocket clients.
659 844
660Started during application initialization: 845### 5. Sync Registration (`src/sync/`)
661 846
662```rust 847A background timer (`run_purgatory_announcement_sync`, every 5 seconds) ensures purgatory announcements are registered in `RepoSyncIndex` with `SyncLevel::StateOnly`. When an announcement is promoted, the `SelfSubscriber` upgrades it to `SyncLevel::Full`.
663// From src/main.rs
664let purgatory = Arc::new(Purgatory::new(git_data_path));
665let ctx = Arc::new(RealSyncContext::new(
666 database.clone(),
667 purgatory.clone(),
668 config.domain.clone(),
669 git_data_path.clone(),
670));
671let throttle_manager = Arc::new(ThrottleManager::new(5, 30));
672throttle_manager.set_context(ctx.clone());
673
674// Start sync loop
675let sync_handle = purgatory.clone().start_sync_loop(ctx, throttle_manager);
676
677// Start cleanup task
678let cleanup_handle = tokio::spawn(async move {
679 let mut interval = tokio::time::interval(Duration::from_secs(60));
680 loop {
681 interval.tick().await;
682 let (state_removed, pr_removed) = purgatory.cleanup();
683 if state_removed + pr_removed > 0 {
684 tracing::debug!(
685 "Purgatory cleanup removed {} state, {} PR entries",
686 state_removed, pr_removed
687 );
688 }
689 }
690});
691```
692 848
693--- 849---
694 850
@@ -697,8 +853,9 @@ let cleanup_handle = tokio::spawn(async move {
697``` 853```
698src/ 854src/
699├── purgatory/ 855├── purgatory/
700│ ├── mod.rs # Main Purgatory struct and API 856│ ├── mod.rs # Main Purgatory struct, API, save_to_disk, restore_from_disk
701│ ├── types.rs # RefPair, StatePurgatoryEntry, PrPurgatoryEntry 857│ ├── types.rs # RefPair, AnnouncementPurgatoryEntry, StatePurgatoryEntry, PrPurgatoryEntry
858│ ├── persistence.rs # instant_to_offset / offset_to_instant time conversion utilities
702│ ├── helpers.rs # Ref extraction and matching functions 859│ ├── helpers.rs # Ref extraction and matching functions
703│ └── sync/ 860│ └── sync/
704│ ├── mod.rs # Sync module exports 861│ ├── mod.rs # Sync module exports
@@ -710,9 +867,10 @@ src/
710├── git/ 867├── git/
711│ ├── authorization.rs # authorize_push with purgatory checking 868│ ├── authorization.rs # authorize_push with purgatory checking
712│ ├── handlers.rs # handle_receive_pack with post-push processing 869│ ├── handlers.rs # handle_receive_pack with post-push processing
713│ └── sync.rs # process_newly_available_git_data 870│ └── sync.rs # process_newly_available_git_data, process_purgatory_announcements
714└── nostr/ 871└── nostr/
715 └── policy/ 872 └── policy/
873 ├── announcement.rs # Route announcements to purgatory
716 ├── state.rs # State event policy with purgatory 874 ├── state.rs # State event policy with purgatory
717 └── pr_event.rs # PR event policy with purgatory 875 └── pr_event.rs # PR event policy with purgatory
718``` 876```
@@ -725,7 +883,8 @@ src/
725 883
726Located in each module: 884Located in each module:
727 885
728- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations 886- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations including announcement purgatory; persistence round-trip tests for all entry types (state, PR, announcement, expired events, downtime calculation, soft-expired exclusion, missing-repo skip)
887- **[`src/purgatory/persistence.rs`](../../src/purgatory/persistence.rs)** - `instant_to_offset` / `offset_to_instant` round-trip tests
729- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic 888- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic
730- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext 889- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext
731- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager 890- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager
@@ -734,17 +893,33 @@ Located in each module:
734 893
735Located in [`tests/`](../../tests/): 894Located in [`tests/`](../../tests/):
736 895
896- **Announcement purgatory flow** - Announcement enters purgatory, git data promotes it
897- **Announcement soft expiry** - Bare repo deleted after 30 min, event retained 24h
898- **Announcement revival** - State event revives soft-expired announcement
737- **State event purgatory flow** - Event arrives, git push releases it 899- **State event purgatory flow** - Event arrives, git push releases it
738- **PR event purgatory flow** - Event arrives, git push releases it 900- **PR event purgatory flow** - Event arrives, git push releases it
739- **Git-data-first flow** - Git push creates placeholder, event completes it 901- **Git-data-first flow** - Git push creates placeholder, event completes it
740- **Authorization with purgatory** - Push authorized by purgatory state 902- **Authorization with purgatory** - Push authorized by purgatory state
741- **Background sync** - Sync fetches git data and releases events 903- **Background sync** - Sync fetches git data and releases events
904- **Persistence across restart** - Save/restore cycle preserves all entry types including announcements
742 905
743--- 906---
744 907
745## Key Learnings 908## Key Learnings
746 909
747### 1. Purgatory Authorization is Critical 910### 1. Announcement Purgatory Prevents Misleading Empty Repos
911
912Without announcement purgatory, we'd serve announcements for repos with no content. Clients would see the announcement, try to clone, and get nothing.
913
914**Solution:** Announcements wait in purgatory until git data proves content exists.
915
916### 2. Soft Expiry Avoids Sync Loops
917
918The protocol's 30-minute expiry creates a problem: without soft expiry, we'd either permanently block repositories or constantly re-sync expired announcement events.
919
920**Solution:** Soft expiry retains the event for 24 hours after deleting the bare repo, allowing revival without re-fetching.
921
922### 3. Purgatory Authorization is Critical
748 923
749Without checking purgatory during authorization, we have a deadlock: 924Without checking purgatory during authorization, we have a deadlock:
750- State event goes to purgatory (no git data) 925- State event goes to purgatory (no git data)
@@ -753,7 +928,7 @@ Without checking purgatory during authorization, we have a deadlock:
753 928
754**Solution:** `authorize_push()` checks both database and purgatory. 929**Solution:** `authorize_push()` checks both database and purgatory.
755 930
756### 2. Late Binding for State Events 931### 4. Late Binding for State Events
757 932
758Extracting refs at event arrival time doesn't work when: 933Extracting refs at event arrival time doesn't work when:
759- Multiple state events arrive for same identifier 934- Multiple state events arrive for same identifier
@@ -761,7 +936,7 @@ Extracting refs at event arrival time doesn't work when:
761 936
762**Solution:** Extract and match refs at push time via `find_matching_states()`. 937**Solution:** Extract and match refs at push time via `find_matching_states()`.
763 938
764### 3. Bidirectional Waiting for PR Events 939### 5. Bidirectional Waiting for PR Events
765 940
766PR events can arrive before or after git data: 941PR events can arrive before or after git data:
767- Event first → Wait for git push 942- Event first → Wait for git push
@@ -769,26 +944,21 @@ PR events can arrive before or after git data:
769 944
770**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder. 945**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder.
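
The bidirectional wait can be sketched roughly as below. This is an illustration only, not the project's code: the `Event` stand-in, the `PrPurgatory` map keyed by commit id, and the `on_event` / `on_git_push` method names are all assumptions. Only the `event: Option<Event>` field with `None` as the placeholder comes from the text above.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a nostr event; the real type lives elsewhere.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    id: String,
}

// `None` means "git data arrived first, still waiting for the event".
#[derive(Debug)]
struct PrPurgatoryEntry {
    event: Option<Event>,
}

// Assumed container: one entry per commit id.
#[derive(Default)]
struct PrPurgatory {
    entries: HashMap<String, PrPurgatoryEntry>,
}

impl PrPurgatory {
    /// A PR event arrives. Returns the event if it can be released now.
    fn on_event(&mut self, commit: &str, event: Event) -> Option<Event> {
        let placeholder_waiting =
            matches!(self.entries.get(commit), Some(e) if e.event.is_none());
        if placeholder_waiting {
            // Git data was already here: both halves are present.
            self.entries.remove(commit);
            Some(event)
        } else {
            // Event first: hold it until the git push arrives.
            self.entries
                .insert(commit.to_string(), PrPurgatoryEntry { event: Some(event) });
            None
        }
    }

    /// Git data arrives. Returns the waiting event, if there was one.
    fn on_git_push(&mut self, commit: &str) -> Option<Event> {
        match self.entries.remove(commit) {
            Some(PrPurgatoryEntry { event: Some(event) }) => Some(event),
            _ => {
                // Git first: leave a placeholder for the event to complete.
                self.entries
                    .insert(commit.to_string(), PrPurgatoryEntry { event: None });
                None
            }
        }
    }
}
```

Either arrival order ends with the entry removed and the event returned for processing, which is the property the `Option<Event>` design is meant to provide.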
771 946
772### 4. Sync Queue Debouncing 947### 6. Persistence Requires Instant → Duration Conversion
773
774When events arrive in bursts (e.g., negentropy sync), we don't want to spawn a sync task for each event.
775
776**Solution:** `enqueue_sync()` resets `attempt_count` and updates `next_attempt` if already queued.
777 948
778### 5. Domain Throttling with Queues 949`std::time::Instant` is not serializable and is not meaningful across process boundaries. Expiry deadlines must be converted to a portable form.
779 950
780When a domain is throttled, we still want to eventually sync from it. 951**Solution:** Store each deadline as a `u64` second offset from a `saved_at: SystemTime` reference. On restore, subtract elapsed downtime from each offset to compute the new `Instant`. Entries whose deadline already passed during downtime get `expires_at = now` and are swept by the next cleanup tick.
781 952
782**Solution:** `ThrottleManager` maintains per-domain queues and processes them when capacity frees. 953**Soft-expired announcements are excluded from persistence** because their bare repos have been deleted. Restoring them would leave purgatory entries pointing at non-existent repos. They are simply dropped; background sync will re-fetch the announcement event if needed.
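
A minimal sketch of the offset conversion described above, using only the standard library. The function names and signatures here are illustrative assumptions; the project's own `instant_to_offset` / `offset_to_instant` in `src/purgatory/persistence.rs` may be shaped differently.

```rust
use std::time::{Duration, Instant, SystemTime};

/// On save: seconds remaining until `deadline`, measured from `now`.
/// A deadline that has already passed is stored as 0.
fn deadline_to_offset(deadline: Instant, now: Instant) -> u64 {
    deadline.saturating_duration_since(now).as_secs()
}

/// Wall-clock time the process was down. If the clock went backwards,
/// treat the downtime as zero rather than failing.
fn downtime_between(saved_at: SystemTime, restored_at: SystemTime) -> Duration {
    restored_at
        .duration_since(saved_at)
        .unwrap_or(Duration::ZERO)
}

/// On restore: subtract the downtime from the stored offset. If the
/// deadline passed while the process was down, the result is `now`,
/// so the next cleanup tick sweeps the entry.
fn offset_to_deadline(offset_secs: u64, downtime: Duration, now: Instant) -> Instant {
    let remaining = Duration::from_secs(offset_secs).saturating_sub(downtime);
    now + remaining
}
```

For example, an entry saved with 1800 seconds remaining and restored after 600 seconds of downtime gets a deadline 1200 seconds after the restore instant. The same entry restored after an hour expires immediately.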
783 954
784--- 955---
785 956
786## Related Documentation 957## Related Documentation
787 958
788- [Inline Authorization](inline-authorization.md) - Why purgatory checking during authorization is essential
789- [Architecture Overview](architecture.md) - Full system design 959- [Architecture Overview](architecture.md) - Full system design
790- [Background Sync](../how-to/purgatory-sync.md) - How to configure and monitor sync 960- [GRASP-02 Proactive Sync](grasp-02-proactive-sync.md) - Relay-to-relay event sync with SyncLevel
791- [Test Strategy](../reference/test-strategy.md) - How we test purgatory 961- [GRASP-02 Purgatory Git Data Fetching](grasp-02-proactive-sync-purgatory-git-data.md) - Background git data hunting
792 962
793--- 963---
794 964