upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/docs/explanation/purgatory-design.md
Diffstat (limited to 'docs/explanation/purgatory-design.md')
-rw-r--r-- docs/explanation/purgatory-design.md 520
1 files changed, 316 insertions, 204 deletions
diff --git a/docs/explanation/purgatory-design.md b/docs/explanation/purgatory-design.md
index b984745..bd792d4 100644
--- a/docs/explanation/purgatory-design.md
+++ b/docs/explanation/purgatory-design.md
@@ -8,7 +8,11 @@
8 8
9## Overview 9## Overview
10 10
11Purgatory is an in-memory holding area that solves the **"which arrives first?"** problem in GRASP. Either nostr events or git pushes can arrive in any order: 11Purgatory is an in-memory holding area that solves two related problems in GRASP:
12
13### Problem 1: "Which arrives first?" (State and PR events)
14
15Nostr events and git pushes can arrive in either order:
12 16
13- **Event first**: Event waits in purgatory until git data arrives 17- **Event first**: Event waits in purgatory until git data arrives
14- **Git first**: Placeholder waits in purgatory until event arrives 18- **Git first**: Placeholder waits in purgatory until event arrives
@@ -19,6 +23,18 @@ When both halves arrive, they are processed together and saved to the database.
19 23
20> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes. 24> Accepted repo state announcements, PRs and PR Updates SHOULD be accepted with message "purgatory: won't be served until git data arrives" and kept in purgatory (not served) until the related git data arrives and otherwise discarded after 30 minutes.
21 25
26### Problem 2: Misleading empty repository announcements
27
28When a repository announcement arrives, we must create the bare git repo immediately so pushes can succeed. But if no git data ever arrives, we would serve an empty repo and its announcement indefinitely—clients see the announcement, try to clone, and get nothing.
29
30**Solution**: New announcements go to **announcement purgatory** instead of being immediately accepted:
31
321. **Announcement arrives** → Create bare repo immediately, add announcement to purgatory
332. **Git data arrives** → Promote announcement from purgatory to active (now served to clients)
343. **No git data before expiry** → Delete bare repo, discard announcement (never served)
35
36This ensures we only serve announcements for repos that actually have content.
37
22--- 38---
23 39
24## Key Design Principles 40## Key Design Principles
@@ -31,16 +47,15 @@ Purgatory data is **not persisted** to disk. On restart, all purgatory entries a
31- Git data can be re-pushed 47- Git data can be re-pushed
32- 30-minute expiry means data is transient anyway 48- 30-minute expiry means data is transient anyway
33 49
34### 2. Separate Storage for State vs PR Events 50### 2. Separate Storage for Each Event Type
35
36State events (kind 30618) and PR events (kind 1617/1618) have fundamentally different matching patterns:
37 51
38| Event Type | Index | Matching Strategy | 52| Store | Index | Purpose |
39|------------|-------|-------------------| 53|-------|-------|---------|
40| **State Events** | `identifier` (d tag) | Compare refs at push time | 54| `announcement_purgatory` | `(PublicKey, String)` — `(owner, identifier)` | Announcements awaiting git data |
41| **PR Events** | `event_id` (hex string) | Direct match via `refs/nostr/<event-id>` | 55| `state_events` | `identifier` (d tag) | State events awaiting git data |
56| `pr_events` | `event_id` (hex string) | PR events awaiting git data |
42 57
43They use **separate DashMap stores** for efficient concurrent access. 58Announcement purgatory uses `(pubkey, identifier)` because identifier alone is not unique across different owners.
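The need for a composite key can be illustrated with a minimal sketch. It assumes a plain `HashMap` in place of `DashMap` and a `String` in place of the nostr `PublicKey`; `AnnouncementIndex` and its methods are hypothetical names, not part of the codebase.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a nostr `PublicKey` (a plain string here).
type OwnerKey = String;

/// Minimal map keyed by (owner, identifier), mirroring the shape of
/// `announcement_purgatory`. The value is just an event id for brevity.
#[derive(Default)]
pub struct AnnouncementIndex {
    entries: HashMap<(OwnerKey, String), String>,
}

impl AnnouncementIndex {
    /// Insert or replace the announcement for this exact (owner, identifier) pair.
    pub fn insert(&mut self, owner: &str, identifier: &str, event_id: &str) {
        self.entries.insert(
            (owner.to_string(), identifier.to_string()),
            event_id.to_string(),
        );
    }

    /// Look up a single owner's announcement.
    pub fn get(&self, owner: &str, identifier: &str) -> Option<&String> {
        self.entries
            .get(&(owner.to_string(), identifier.to_string()))
    }

    /// Collect every owner's announcement that shares an identifier.
    pub fn by_identifier(&self, identifier: &str) -> Vec<&String> {
        let mut found: Vec<&String> = self
            .entries
            .iter()
            .filter(|((_, id), _)| id.as_str() == identifier)
            .map(|(_, event_id)| event_id)
            .collect();
        found.sort();
        found
    }
}
```

Two owners announcing the same identifier occupy two distinct entries, which a map keyed by identifier alone could not represent without a `Vec` per key.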
44 59
45### 3. Late Binding for State Events 60### 3. Late Binding for State Events
46 61
@@ -78,7 +93,23 @@ With purgatory checking during authorization:
782. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅ 932. Git push arrives → Checks **database + purgatory** → State found → **AUTHORIZED** ✅
793. After push succeeds → Save event to database → Remove from purgatory 943. After push succeeds → Save event to database → Remove from purgatory
80 95
81See [`src/git/authorization.rs:51-162`](../../src/git/authorization.rs) for implementation. 96See [`src/git/authorization.rs`](../../src/git/authorization.rs) for implementation.
97
98### 6. Announcement Purgatory: Bare Repo Created Immediately
99
100**Decision:** Create the bare git repo when announcement enters purgatory.
101
102**Why:** Git pushes may arrive at any time. Without a repo, pushes fail.
103
104**Consequence:** We allocate disk space for repos that may expire unused. Must delete repos on expiry.
105
106### 7. Replacement Announcements Skip Purgatory
107
108**Decision:** Announcements replacing an existing active (database) announcement are accepted immediately.
109
110**Why:** The repository is already proven active with content.
111
112**How:** Check if active announcement exists for `(pubkey, identifier)` before routing to purgatory.
82 113
83--- 114---
84 115
@@ -103,22 +134,54 @@ pub struct RefUpdate {
103} 134}
104``` 135```
105 136
137### Announcement Purgatory Entry
138
139```rust
140pub struct AnnouncementPurgatoryEntry {
141 /// The kind 30617 announcement event
142 pub event: Event,
143
144 /// Repository identifier from 'd' tag
145 pub identifier: String,
146
147 /// Event author pubkey
148 pub owner: PublicKey,
149
150 /// Path to the bare git repo on disk (created immediately on entry)
151 pub repo_path: PathBuf,
152
153 /// Relay URLs from 'relays'/'clone' tags — for sync registration
154 pub relays: HashSet<String>,
155
156 /// When added to purgatory
157 pub created_at: Instant,
158
159 /// Expiry deadline (30 min from creation, may be extended)
160 pub expires_at: Instant,
161
162 /// Whether the bare repo has been deleted (soft expiry phase)
163 pub soft_expired: bool,
164}
165```
166
167**Indexed by `(pubkey, identifier)`** because identifier is not unique across different owners.
168
106### State Purgatory Entry 169### State Purgatory Entry
107 170
108```rust 171```rust
109pub struct StatePurgatoryEntry { 172pub struct StatePurgatoryEntry {
110 /// The nostr state event (kind 30618) awaiting git data 173 /// The nostr state event (kind 30618) awaiting git data
111 pub event: Event, 174 pub event: Event,
112 175
113 /// Repository identifier from 'd' tag 176 /// Repository identifier from 'd' tag
114 pub identifier: String, 177 pub identifier: String,
115 178
116 /// Event author pubkey 179 /// Event author pubkey
117 pub author: PublicKey, 180 pub author: PublicKey,
118 181
119 /// When added to purgatory 182 /// When added to purgatory
120 pub created_at: Instant, 183 pub created_at: Instant,
121 184
122 /// Expiry deadline (30 min from creation, may be extended) 185 /// Expiry deadline (30 min from creation, may be extended)
123 pub expires_at: Instant, 186 pub expires_at: Instant,
124} 187}
@@ -132,14 +195,14 @@ pub struct StatePurgatoryEntry {
132pub struct PrPurgatoryEntry { 195pub struct PrPurgatoryEntry {
133 /// The nostr PR event, if received (None = git data arrived first) 196 /// The nostr PR event, if received (None = git data arrived first)
134 pub event: Option<Event>, 197 pub event: Option<Event>,
135 198
136 /// Expected commit SHA from 'c' tag (if event exists) 199 /// Expected commit SHA from 'c' tag (if event exists)
137 /// or actual commit pushed (if git arrived first) 200 /// or actual commit pushed (if git arrived first)
138 pub commit: String, 201 pub commit: String,
139 202
140 /// When added to purgatory 203 /// When added to purgatory
141 pub created_at: Instant, 204 pub created_at: Instant,
142 205
143 /// Expiry deadline (30 min from creation) 206 /// Expiry deadline (30 min from creation)
144 pub expires_at: Instant, 207 pub expires_at: Instant,
145} 208}
@@ -151,24 +214,155 @@ pub struct PrPurgatoryEntry {
151 214
152```rust 215```rust
153pub struct Purgatory { 216pub struct Purgatory {
217 /// Announcement events indexed by (owner, identifier)
218 announcement_purgatory: DashMap<(PublicKey, String), AnnouncementPurgatoryEntry>,
219
154 /// State events indexed by identifier (d tag) 220 /// State events indexed by identifier (d tag)
155 /// Multiple state events per identifier allowed (different authors) 221 /// Multiple state events per identifier allowed (different authors)
156 state_events: Arc<DashMap<String, Vec<StatePurgatoryEntry>>>, 222 state_events: DashMap<String, Vec<StatePurgatoryEntry>>,
157 223
158 /// PR events indexed by event_id (hex string) 224 /// PR events indexed by event_id (hex string)
159 /// Single entry per event ID 225 /// Single entry per event ID
160 pr_events: Arc<DashMap<String, PrPurgatoryEntry>>, 226 pr_events: DashMap<String, PrPurgatoryEntry>,
161 227
162 /// Sync queue for background git data fetching 228 /// Sync queue for background git data fetching
163 sync_queue: Arc<DashMap<String, SyncQueueEntry>>, 229 sync_queue: DashMap<String, SyncQueueEntry>,
164 230
165 _git_data_path: PathBuf, 231 /// Events that previously expired without git data (prevents re-sync loops)
232 expired_events: DashMap<EventId, Instant>,
166} 233}
167``` 234```
168 235
169--- 236---
170 237
171## Event Flows 238## Announcement Purgatory Flows
239
240### New Announcement Flow
241
242```
243Announcement arrives
244 |
245 v
246Is there an active announcement for (pubkey, identifier) in DB?
247 |
248 +-- YES --> Accept immediately (replacement, repo already proven)
249 |
250 +-- NO --> Is there a purgatory entry for (pubkey, identifier)?
251 |
252 +-- YES --> Replace purgatory entry, extend expiry 30 min
253 | Return OK to client (but don't serve)
254 |
255 +-- NO --> Create bare repo
256 Add to purgatory
257 Return OK to client (but don't serve)
258```
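The routing decision in the flow above can be sketched as a pure function. This is an illustration only: the `Route` enum and `route_announcement` are hypothetical names, and the two booleans stand in for the database and purgatory lookups.

```rust
/// Outcome of routing an incoming announcement (illustrative names).
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// An active announcement already exists in the database: accept as a replacement.
    AcceptReplacement,
    /// A purgatory entry exists: replace it and extend its expiry.
    ReplacePurgatoryEntry,
    /// Nothing known yet: create the bare repo and add a new purgatory entry.
    CreateRepoAndAddToPurgatory,
}

/// Decide where an announcement for a given (pubkey, identifier) goes.
/// The database check comes first, matching the order in the diagram.
pub fn route_announcement(active_in_db: bool, in_purgatory: bool) -> Route {
    if active_in_db {
        Route::AcceptReplacement
    } else if in_purgatory {
        Route::ReplacePurgatoryEntry
    } else {
        Route::CreateRepoAndAddToPurgatory
    }
}
```

In both purgatory branches the client still receives OK, but the announcement is not served until git data arrives.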
259
260### Git Data Arrival → Promotion
261
262```
263Git push/fetch completes with data
264 |
265 v
266process_purgatory_announcements() called
267 |
268 v
269Is there a purgatory announcement for (owner, identifier)?
270 |
271 +-- YES --> promote_announcement() removes from purgatory
272 | Save event to database
273 | Notify WebSocket clients
274 | (Sync upgrades to Full automatically via SelfSubscriber)
275 |
276 +-- NO --> Normal processing
277```
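A minimal sketch of the promotion step, assuming in-memory stand-ins for both stores: the `Relay` struct, its fields, and `on_git_data` are hypothetical, and events and keys are plain strings. Only the "remove from purgatory, then save" ordering is taken from the flow above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins: the purgatory maps (owner, identifier) to an event,
/// and the "database" is just a list of saved events.
pub struct Relay {
    pub purgatory: HashMap<(String, String), String>,
    pub database: Vec<String>,
}

impl Relay {
    /// Remove the announcement from purgatory and hand it back for saving.
    pub fn promote_announcement(&mut self, owner: &str, identifier: &str) -> Option<String> {
        self.purgatory
            .remove(&(owner.to_string(), identifier.to_string()))
    }

    /// Called once git data is available. Returns true if an announcement was promoted.
    pub fn on_git_data(&mut self, owner: &str, identifier: &str) -> bool {
        match self.promote_announcement(owner, identifier) {
            Some(event) => {
                // Once saved, the announcement becomes visible to clients.
                self.database.push(event);
                true
            }
            // No purgatory announcement: normal processing continues elsewhere.
            None => false,
        }
    }
}
```

Because promotion removes the entry, a second arrival of git data for the same repository is a no-op here.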
278
279### State Event Arrival for Purgatory Announcement
280
281```
282State event arrives
283 |
284 v
285fetch_repository_data_with_purgatory() checks DB + purgatory
286 |
287 +-- Announcement found in purgatory -->
288 | Validate authorization against purgatory announcement
289 | Extend purgatory announcement expiry (reset 30-min timer)
290 | If soft-expired: recreate bare repo, clear soft_expired flag
291 | Route state event to state purgatory
292 |
293 +-- No announcement anywhere --> Reject
294```
295
296### Announcement Expiry (Two-Phase Soft Expiry)
297
298The protocol specifies 30-minute expiry for announcements. We implement a two-phase soft expiry:
299
300**Phase 1 — Initial 30-minute expiry (`soft_expired == false`):**
301- Delete the bare git repo (frees disk space, respects protocol expiry)
302- Set `soft_expired = true`
303- Extend `expires_at` by 24 hours (`SOFT_EXPIRY_EXTENDED`)
304- Continue syncing state events (same as active purgatory)
305
306**Phase 2 — 24-hour soft expiry (`soft_expired == true`):**
307- Add event ID to `expired_events` (prevents re-sync loops)
308- Remove entry completely from `announcement_purgatory`
309
310**Why soft expiry?** Without it, we'd face a dilemma:
311
312- Add expired announcements to `failed_events` → permanently reject future state events, losing potential revival when state events arrive late
313- Re-fetch the announcement event on every sync cycle → wasting bandwidth and creating unnecessary sync traffic
314
315Soft expiry retains the event for 24 hours so that late-arriving state events (e.g. from a slow sync) can revive the announcement without forcing a full re-announcement flow.
316
317**Revival:** If a state event arrives for a soft-expired announcement, `extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer.
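The two phases and the revival path can be sketched as a small state machine. This is a simplified model under stated assumptions: `Entry`, `CleanupAction`, and `revive` are hypothetical names, the 24-hour extension is added to the previous deadline, and deleting or recreating the bare repo is left to the caller.

```rust
use std::time::{Duration, Instant};

const INITIAL_EXPIRY: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// What the cleanup pass should do with an entry (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum CleanupAction {
    /// Deadline not reached yet.
    Keep,
    /// Phase 1: caller deletes the bare repo; the event is retained.
    SoftExpire,
    /// Phase 2: caller records the event id as expired and drops the entry.
    Remove,
}

pub struct Entry {
    pub expires_at: Instant,
    pub soft_expired: bool,
}

impl Entry {
    pub fn new(now: Instant) -> Self {
        Entry { expires_at: now + INITIAL_EXPIRY, soft_expired: false }
    }

    /// One cleanup pass over this entry.
    pub fn cleanup(&mut self, now: Instant) -> CleanupAction {
        if now < self.expires_at {
            return CleanupAction::Keep;
        }
        if self.soft_expired {
            return CleanupAction::Remove;
        }
        self.soft_expired = true;
        self.expires_at += SOFT_EXPIRY_EXTENDED;
        CleanupAction::SoftExpire
    }

    /// Revival: a state event arrived, so reset the 30-minute timer.
    /// The caller is assumed to recreate the bare repo if it was deleted.
    pub fn revive(&mut self, now: Instant) {
        self.soft_expired = false;
        self.expires_at = now + INITIAL_EXPIRY;
    }
}
```

Passing `now` explicitly keeps the logic testable without sleeping; the real cleanup presumably reads the clock itself.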
318
319### Expiry Extension Triggers
320
321The 30-minute purgatory timer is reset (extended) in three scenarios:
322
323| Trigger | Location | Why |
324|---------|----------|-----|
325| State event arrives | `StatePolicy::process_state_event()` | Repo is actively receiving metadata |
326| Git push authorized against purgatory state | `get_state_authorization_for_specific_owner_repo()` | Repo is actively receiving git data |
327| Replacement announcement arrives | `AnnouncementPolicy::validate()` | Announcement updated |
328
329All three call `purgatory.extend_announcement_expiry(owner, identifier, Duration::from_secs(1800))`.
330
331### Purgatory Lifecycle
332
333```
334 ┌─────────────────────────────────────┐
335 │ │
336 v │
337Announcement ──> ACTIVE ──────────────────────────────────┤
338 arrives (bare repo exists) │
339 │ │
340 ├── Git data ──> PROMOTED (exit) │
341 │ │
342 ├── Deletion ──> REMOVED (exit) │
343 │ │
344 v │
345 SOFT_EXPIRED ──────────────────────────────┘
346 (bare repo deleted, ^
347 event retained) │
348 │ │
349 ├── State event arrives (revival)
350
351 └── Extended expiry ──> REMOVED (exit)
352```
353
354| Exit | Trigger | Action |
355|------|---------|--------|
356| **Promotion** | Git data arrives | Move to database, sync upgrades to Full |
357| **Soft expiry** | Initial 30-min timeout | Delete bare repo, retain event, continue sync |
358| **Full expiry** | 24-hour soft expiry | Add to expired_events, remove from purgatory |
359| **Deletion** | Kind 5 event | Delete bare repo, remove from purgatory |
360| **Replacement** | Newer announcement (same pubkey, identifier) | Replace entry, extend expiry |
361| **Service change** | Newer announcement removes our service | Remove from purgatory |
362
363---
364
365## State and PR Event Flows
172 366
173### State Event Arrival (Kind 30618) 367### State Event Arrival (Kind 30618)
174 368
@@ -377,11 +571,12 @@ Purgatory includes a background sync system that fetches git data from remote se
377 571
378┌─────────────────────────────────────────────────────┐ 572┌─────────────────────────────────────────────────────┐
379│ process_newly_available_git_data(repo, oids) │ 573│ process_newly_available_git_data(repo, oids) │
380│ 1. Find satisfiable state events in purgatory │ 574│ 1. Find satisfiable announcement in purgatory │
381│ 2. Find satisfiable PR events in purgatory │ 575│ 2. Find satisfiable state events in purgatory │
382│ 3. Save events to database │ 576│ 3. Find satisfiable PR events in purgatory │
383│ 4. Sync git data to other owner repos │ 577│ 4. Save events to database │
384│ 5. Remove from purgatory │ 578│ 5. Sync git data to other owner repos │
579│ 6. Remove from purgatory │
385└─────────────────────────────────────────────────────┘ 580└─────────────────────────────────────────────────────┘
386``` 581```
387 582
@@ -402,8 +597,8 @@ pub struct SyncQueueEntry {
402 597
403**Backoff strategy:** 598**Backoff strategy:**
404- First attempt: 20 seconds 599- First attempt: 20 seconds
405- Second attempt: 2 minutes 600- Second attempt: 40 seconds
406- Subsequent attempts: 2 minutes 601- Subsequent attempts: capped at 2 minutes
407 602
408### Sync Delays 603### Sync Delays
409 604
@@ -428,7 +623,7 @@ pub struct ThrottleManager {
428``` 623```
429 624
430**Rate limiting:** 625**Rate limiting:**
431- Default: 5 requests per domain per 30 seconds 626- Default: 5 concurrent requests per domain, 30 requests per minute
432- Tracks request timestamps in a sliding window 627- Tracks request timestamps in a sliding window
433- Queues identifiers when domain is throttled 628- Queues identifiers when domain is throttled
434- Processes queue when capacity frees up 629- Processes queue when capacity frees up
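A sliding-window limiter with a concurrency cap and a waiting queue could look roughly like the sketch below. It is not the actual `ThrottleManager`: `DomainThrottle`, its fields, and its methods are hypothetical, it models a single domain, and the caller supplies the clock.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Per-domain throttle state (illustrative).
pub struct DomainThrottle {
    max_concurrent: usize,
    max_per_window: usize,
    window: Duration,
    in_flight: usize,
    recent: VecDeque<Instant>,
    queued: VecDeque<String>,
}

impl DomainThrottle {
    pub fn new(max_concurrent: usize, max_per_window: usize, window: Duration) -> Self {
        DomainThrottle {
            max_concurrent,
            max_per_window,
            window,
            in_flight: 0,
            recent: VecDeque::new(),
            queued: VecDeque::new(),
        }
    }

    /// Try to start a request now. On refusal the identifier is queued.
    pub fn try_acquire(&mut self, identifier: &str, now: Instant) -> bool {
        // Slide the window: drop timestamps that are no longer inside it.
        while let Some(&oldest) = self.recent.front() {
            if now.duration_since(oldest) >= self.window {
                self.recent.pop_front();
            } else {
                break;
            }
        }
        if self.in_flight < self.max_concurrent && self.recent.len() < self.max_per_window {
            self.in_flight += 1;
            self.recent.push_back(now);
            true
        } else {
            self.queued.push_back(identifier.to_string());
            false
        }
    }

    /// Mark one request finished and return the next queued identifier, if any,
    /// so the caller can retry it.
    pub fn release(&mut self) -> Option<String> {
        self.in_flight = self.in_flight.saturating_sub(1);
        self.queued.pop_front()
    }
}
```

With the documented defaults this would be constructed as `DomainThrottle::new(5, 30, Duration::from_secs(60))`.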
@@ -439,7 +634,47 @@ See [`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs) for
439 634
440## Purgatory API 635## Purgatory API
441 636
442### Adding Entries 637### Announcement Purgatory
638
639```rust
640impl Purgatory {
641 /// Add an announcement to purgatory (bare repo already created by caller)
642 pub fn add_announcement(
643 &self,
644 event: Event,
645 identifier: String,
646 owner: PublicKey,
647 repo_path: PathBuf,
648 relays: HashSet<String>,
649 );
650
651 /// Promote announcement: remove from purgatory, return event for DB save
652 pub fn promote_announcement(
653 &self,
654 owner: &PublicKey,
655 identifier: &str,
656 ) -> Option<Event>;
657
658 /// Get announcements by identifier (for authorization checks)
659 pub fn get_announcements_by_identifier(
660 &self,
661 identifier: &str,
662 ) -> Vec<AnnouncementPurgatoryEntry>;
663
664 /// Extend expiry (and revive soft-expired entries, recreating bare repo)
665 pub fn extend_announcement_expiry(
666 &self,
667 owner: &PublicKey,
668 identifier: &str,
669 duration: Duration,
670 );
671
672 /// Get all announcements for sync registration
673 pub fn announcements_for_sync(&self) -> Vec<AnnouncementPurgatoryEntry>;
674}
675```
676
677### State and PR Purgatory
443 678
444```rust 679```rust
445impl Purgatory { 680impl Purgatory {
@@ -453,13 +688,7 @@ impl Purgatory {
453 688
454 /// Add a PR placeholder (git-data-first scenario) 689 /// Add a PR placeholder (git-data-first scenario)
455 pub fn add_pr_placeholder(&self, event_id: String, commit: String); 690 pub fn add_pr_placeholder(&self, event_id: String, commit: String);
456}
457```
458 691
459### Finding Entries
460
461```rust
462impl Purgatory {
463 /// Find state events waiting for an identifier 692 /// Find state events waiting for an identifier
464 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>; 693 pub fn find_state(&self, identifier: &str) -> Vec<StatePurgatoryEntry>;
465 694
@@ -476,13 +705,7 @@ impl Purgatory {
476 705
477 /// Find a PR placeholder specifically (git-data-first) 706 /// Find a PR placeholder specifically (git-data-first)
478 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>; 707 pub fn find_pr_placeholder(&self, event_id: &str) -> Option<String>;
479}
480```
481 708
482### Removing Entries
483
484```rust
485impl Purgatory {
486 /// Remove all state events for an identifier 709 /// Remove all state events for an identifier
487 pub fn remove_state(&self, identifier: &str); 710 pub fn remove_state(&self, identifier: &str);
488 711
@@ -499,36 +722,14 @@ impl Purgatory {
499```rust 722```rust
500impl Purgatory { 723impl Purgatory {
501 /// Remove expired entries (called every 60 seconds) 724 /// Remove expired entries (called every 60 seconds)
502 /// Returns (state_removed, pr_removed) 725 /// Handles two-phase soft expiry for announcements
503 pub fn cleanup(&self) -> (usize, usize); 726 pub fn cleanup(&self);
504 727
505 /// Extend expiry for entries about to be processed 728 /// Extend expiry for state/PR entries about to be processed
506 /// Ensures at least `duration` remaining
507 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration); 729 pub fn extend_expiry(&self, identifier: &str, event_ids: &[EventId], duration: Duration);
508 730
509 /// Get current counts for metrics 731 /// Check if an event previously expired (prevents re-sync loops)
510 pub fn count(&self) -> (usize, usize); 732 pub fn is_expired(&self, event_id: &EventId) -> bool;
511}
512```
513
514### Sync Queue Management
515
516```rust
517impl Purgatory {
518 /// Enqueue identifier for sync with custom delay
519 pub fn enqueue_sync(&self, identifier: &str, delay: Duration);
520
521 /// Enqueue with default delay (3 minutes)
522 pub fn enqueue_sync_default(&self, identifier: &str);
523
524 /// Enqueue with immediate delay (500ms)
525 pub fn enqueue_sync_immediate(&self, identifier: &str);
526
527 /// Check if identifier has pending events
528 pub fn has_pending_events(&self, identifier: &str) -> bool;
529
530 /// Remove identifier from sync queue
531 pub fn remove_from_sync_queue(&self, identifier: &str);
532} 733}
533``` 734```
534 735
@@ -558,12 +759,6 @@ pub fn can_apply_state(
558 event: &Event, 759 event: &Event,
559 repo_path: &Path, 760 repo_path: &Path,
560) -> Result<bool>; 761) -> Result<bool>;
561
562/// Get refs from state that aren't being pushed
563pub fn get_unpushed_refs(
564 state_refs: &[RefPair],
565 pushed_refs: &[RefPair],
566) -> Vec<RefPair>;
567``` 762```
568 763
569See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation. 764See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementation.
@@ -572,123 +767,37 @@ See [`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs) for implementat
572 767
573## Integration Points 768## Integration Points
574 769
575### 1. Event Policy (Nip34WritePolicy) 770### 1. Announcement Policy (`src/nostr/policy/announcement.rs`)
576 771
577State and PR events are added to purgatory when git data doesn't exist: 772Routes new announcements to purgatory or accepts replacements:
578 773
579```rust 774- If active DB announcement exists for `(pubkey, identifier)` → `Accept` immediately
580// From src/nostr/policy/state.rs 775- If purgatory entry exists → replace it, extend expiry, return `Accept`
581async fn handle_state(&self, event: &Event) -> WritePolicyResult { 776- Otherwise → return `AcceptPurgatory`, caller calls `add_to_purgatory()` which creates bare repo and adds to purgatory
582 let identifier = extract_identifier(event)?;
583
584 // Check if we have matching git data
585 if self.has_matching_git_data(&identifier, event).await? {
586 return WritePolicyResult::Accept;
587 }
588
589 // Add to purgatory
590 self.purgatory.add_state(
591 event.clone(),
592 identifier.clone(),
593 event.pubkey,
594 );
595
596 WritePolicyResult::Reject {
597 status: true, // Client sees OK
598 message: "purgatory: awaiting git data".into()
599 }
600}
601```
602 777
603### 2. Git Push Authorization 778### 2. State Event Policy (`src/nostr/policy/state.rs`)
604 779
605Authorization checks both database and purgatory: 780Checks purgatory announcements for authorization and extends their expiry:
606 781
607```rust 782```rust
608// From src/git/authorization.rs 783// Fetch announcements from both DB and purgatory
609pub async fn authorize_push( 784let repo_data = fetch_repository_data_with_purgatory(db, purgatory, identifier).await?;
610 database: &SharedDatabase, 785
611 identifier: &str, 786// For each authorized owner with a purgatory announcement, extend expiry
612 owner_pubkey: &str, 787purgatory.extend_announcement_expiry(&owner_pk, &identifier, Duration::from_secs(1800));
613 request_body: &Bytes,
614 purgatory: &Arc<Purgatory>, // Critical!
615 repo_path: &std::path::Path,
616) -> anyhow::Result<AuthorizationResult> {
617 // Parse pushed refs
618 let pushed_refs = parse_pushed_refs(request_body);
619
620 // Check database for state events
621 let db_result = get_authorization_from_db(database, identifier).await?;
622
623 if !db_result.authorized {
624 // No state in database - check purgatory
625 let purgatory_result = get_state_authorization_for_specific_owner_repo(
626 database,
627 identifier,
628 owner_pubkey,
629 purgatory,
630 &pushed_refs,
631 repo_path,
632 ).await?;
633
634 return purgatory_result;
635 }
636
637 db_result
638}
639``` 788```
640 789
641### 3. Post-Push Processing 790### 3. Git Push Authorization (`src/git/authorization.rs`)
642 791
643After successful push, events from purgatory are saved to database: 792`fetch_repository_data_with_purgatory()` merges DB announcements with purgatory announcements for authorization. When a push is authorized via purgatory state events, it also extends the announcement expiry.
644 793
645```rust 794### 4. Git Data Processing (`src/git/sync.rs`)
646// From src/git/handlers.rs
647if from_purgatory {
648 if let (Some(db), Some(purg)) = (&database, &purgatory) {
649 // Save state event to database
650 db.save_event(&state.event).await?;
651
652 // Remove from purgatory
653 purg.remove_state_event(identifier, &state.event.id);
654 }
655}
656```
657 795
658### 4. Background Sync Loop 796`process_purgatory_announcements()` is called after any git push or background sync fetch. It promotes announcements from purgatory to the database and notifies WebSocket clients.
659 797
660Started during application initialization: 798### 5. Sync Registration (`src/sync/`)
661 799
662```rust 800A background timer (`run_purgatory_announcement_sync`, every 5 seconds) ensures purgatory announcements are registered in `RepoSyncIndex` with `SyncLevel::StateOnly`. When an announcement is promoted, the `SelfSubscriber` upgrades it to `SyncLevel::Full`.
663// From src/main.rs
664let purgatory = Arc::new(Purgatory::new(git_data_path));
665let ctx = Arc::new(RealSyncContext::new(
666 database.clone(),
667 purgatory.clone(),
668 config.domain.clone(),
669 git_data_path.clone(),
670));
671let throttle_manager = Arc::new(ThrottleManager::new(5, 30));
672throttle_manager.set_context(ctx.clone());
673
674// Start sync loop
675let sync_handle = purgatory.clone().start_sync_loop(ctx, throttle_manager);
676
677// Start cleanup task
678let cleanup_handle = tokio::spawn(async move {
679 let mut interval = tokio::time::interval(Duration::from_secs(60));
680 loop {
681 interval.tick().await;
682 let (state_removed, pr_removed) = purgatory.cleanup();
683 if state_removed + pr_removed > 0 {
684 tracing::debug!(
685 "Purgatory cleanup removed {} state, {} PR entries",
686 state_removed, pr_removed
687 );
688 }
689 }
690});
691```
692 801
693--- 802---
694 803
@@ -698,7 +807,7 @@ let cleanup_handle = tokio::spawn(async move {
698src/ 807src/
699├── purgatory/ 808├── purgatory/
700│ ├── mod.rs # Main Purgatory struct and API 809│ ├── mod.rs # Main Purgatory struct and API
701│ ├── types.rs # RefPair, StatePurgatoryEntry, PrPurgatoryEntry 810│ ├── types.rs # RefPair, AnnouncementPurgatoryEntry, StatePurgatoryEntry, PrPurgatoryEntry
702│ ├── helpers.rs # Ref extraction and matching functions 811│ ├── helpers.rs # Ref extraction and matching functions
703│ └── sync/ 812│ └── sync/
704│ ├── mod.rs # Sync module exports 813│ ├── mod.rs # Sync module exports
@@ -710,9 +819,10 @@ src/
710├── git/ 819├── git/
711│ ├── authorization.rs # authorize_push with purgatory checking 820│ ├── authorization.rs # authorize_push with purgatory checking
712│ ├── handlers.rs # handle_receive_pack with post-push processing 821│ ├── handlers.rs # handle_receive_pack with post-push processing
713│ └── sync.rs # process_newly_available_git_data 822│ └── sync.rs # process_newly_available_git_data, process_purgatory_announcements
714└── nostr/ 823└── nostr/
715 └── policy/ 824 └── policy/
825 ├── announcement.rs # Route announcements to purgatory
716 ├── state.rs # State event policy with purgatory 826 ├── state.rs # State event policy with purgatory
717 └── pr_event.rs # PR event policy with purgatory 827 └── pr_event.rs # PR event policy with purgatory
718``` 828```
@@ -725,7 +835,7 @@ src/
725 835
726Located in each module: 836Located in each module:
727 837
728- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations 838- **[`src/purgatory/mod.rs`](../../src/purgatory/mod.rs)** - Core purgatory operations including announcement purgatory
729- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic 839- **[`src/purgatory/helpers.rs`](../../src/purgatory/helpers.rs)** - Ref matching logic
730- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext 840- **[`src/purgatory/sync/functions.rs`](../../src/purgatory/sync/functions.rs)** - Sync functions with MockSyncContext
731- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager 841- **[`src/purgatory/sync/throttle.rs`](../../src/purgatory/sync/throttle.rs)** - Throttle manager
@@ -734,6 +844,9 @@ Located in each module:
734 844
735Located in [`tests/`](../../tests/): 845Located in [`tests/`](../../tests/):
736 846
847- **Announcement purgatory flow** - Announcement enters purgatory, git data promotes it
848- **Announcement soft expiry** - Bare repo deleted after 30 min, event retained 24h
849- **Announcement revival** - State event revives soft-expired announcement
737- **State event purgatory flow** - Event arrives, git push releases it 850- **State event purgatory flow** - Event arrives, git push releases it
738- **PR event purgatory flow** - Event arrives, git push releases it 851- **PR event purgatory flow** - Event arrives, git push releases it
739- **Git-data-first flow** - Git push creates placeholder, event completes it 852- **Git-data-first flow** - Git push creates placeholder, event completes it
@@ -744,7 +857,19 @@ Located in [`tests/`](../../tests/):
744 857
745## Key Learnings 858## Key Learnings
746 859
747### 1. Purgatory Authorization is Critical 860### 1. Announcement Purgatory Prevents Misleading Empty Repos
861
862Without announcement purgatory, we'd serve announcements for repos with no content. Clients would see the announcement, try to clone, and get nothing.
863
864**Solution:** Announcements wait in purgatory until git data proves content exists.
865
866### 2. Soft Expiry Avoids Sync Loops
867
868The protocol's 30-minute expiry creates a problem: without soft expiry, we'd either permanently block repositories or constantly re-sync expired announcement events.
869
870**Solution:** Soft expiry retains the event for 24 hours after deleting the bare repo, allowing revival without re-fetching.
871
872### 3. Purgatory Authorization is Critical
748 873
749Without checking purgatory during authorization, we have a deadlock: 874Without checking purgatory during authorization, we have a deadlock:
750- State event goes to purgatory (no git data) 875- State event goes to purgatory (no git data)
@@ -753,7 +878,7 @@ Without checking purgatory during authorization, we have a deadlock:
753 878
754**Solution:** `authorize_push()` checks both database and purgatory. 879**Solution:** `authorize_push()` checks both database and purgatory.
755 880
756### 2. Late Binding for State Events 881### 4. Late Binding for State Events
757 882
758Extracting refs at event arrival time doesn't work when: 883Extracting refs at event arrival time doesn't work when:
759- Multiple state events arrive for same identifier 884- Multiple state events arrive for same identifier
@@ -761,7 +886,7 @@ Extracting refs at event arrival time doesn't work when:
761 886
762**Solution:** Extract and match refs at push time via `find_matching_states()`. 887**Solution:** Extract and match refs at push time via `find_matching_states()`.
763 888
764### 3. Bidirectional Waiting for PR Events 889### 5. Bidirectional Waiting for PR Events
765 890
766PR events can arrive before or after git data: 891PR events can arrive before or after git data:
767- Event first → Wait for git push 892- Event first → Wait for git push
@@ -769,26 +894,13 @@ PR events can arrive before or after git data:
769 894
770**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder. 895**Solution:** `PrPurgatoryEntry.event: Option<Event>` with `None` = placeholder.
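The bidirectional waiting can be sketched with a simplified entry in which `event: None` marks a git-first placeholder. Everything here is illustrative: `PrEntry`, `PrPurgatory`, and `Outcome` are hypothetical, events are plain strings, and the handling of a commit mismatch is an assumption rather than documented behaviour.

```rust
use std::collections::HashMap;

/// Simplified entry: `event` is `None` when git data arrived first.
pub struct PrEntry {
    pub event: Option<String>,
    pub commit: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// First half stored; waiting for the other half.
    Waiting,
    /// Both halves present and the commits agree.
    Complete,
    /// Both halves present but the commits differ (handling is an assumption).
    CommitMismatch,
}

#[derive(Default)]
pub struct PrPurgatory {
    entries: HashMap<String, PrEntry>,
}

impl PrPurgatory {
    /// The nostr PR event arrives with the expected commit from its `c` tag.
    pub fn on_event(&mut self, event_id: &str, expected_commit: &str) -> Outcome {
        self.arrive(event_id, expected_commit, Some(event_id.to_string()))
    }

    /// Git data arrives for `refs/nostr/<event-id>` with the pushed commit.
    pub fn on_git_push(&mut self, event_id: &str, pushed_commit: &str) -> Outcome {
        self.arrive(event_id, pushed_commit, None)
    }

    fn arrive(&mut self, event_id: &str, commit: &str, event: Option<String>) -> Outcome {
        // The other half is present when the stored entry is of the opposite kind.
        let other_half = self
            .entries
            .get(event_id)
            .filter(|e| e.event.is_some() != event.is_some())
            .map(|e| e.commit.clone());
        match other_half {
            Some(stored) if stored == commit => {
                self.entries.remove(event_id);
                Outcome::Complete
            }
            Some(_) => Outcome::CommitMismatch,
            None => {
                let entry = PrEntry { event, commit: commit.to_string() };
                self.entries.insert(event_id.to_string(), entry);
                Outcome::Waiting
            }
        }
    }
}
```

Either arrival order ends in `Complete`, which is the point at which the real system would save the event to the database.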
771 896
772### 4. Sync Queue Debouncing
773
774When events arrive in bursts (e.g., negentropy sync), we don't want to spawn a sync task for each event.
775
776**Solution:** `enqueue_sync()` resets `attempt_count` and updates `next_attempt` if already queued.
777
778### 5. Domain Throttling with Queues
779
780When a domain is throttled, we still want to eventually sync from it.
781
782**Solution:** `ThrottleManager` maintains per-domain queues and processes them when capacity frees.
783
784--- 897---
785 898
786## Related Documentation 899## Related Documentation
787 900
788- [Inline Authorization](inline-authorization.md) - Why purgatory checking during authorization is essential
789- [Architecture Overview](architecture.md) - Full system design 901- [Architecture Overview](architecture.md) - Full system design
790- [Background Sync](../how-to/purgatory-sync.md) - How to configure and monitor sync 902- [GRASP-02 Proactive Sync](grasp-02-proactive-sync.md) - Relay-to-relay event sync with SyncLevel
791- [Test Strategy](../reference/test-strategy.md) - How we test purgatory 903- [GRASP-02 Purgatory Git Data Fetching](grasp-02-proactive-sync-purgatory-git-data.md) - Background git data hunting
792 904
793--- 905---
794 906