upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

author: DanConwayDev <DanConwayDev@protonmail.com> 2026-02-23 15:41:32 +0000
committer: DanConwayDev <DanConwayDev@protonmail.com> 2026-02-23 15:41:32 +0000
commit: c54ce061d6d278cce8362d5af085808ca60c239b
tree: ec967d6195d9f7ec4f061449596611afe3a0950f /docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
parent: e0ad39a489b3398f8208713bf728db0cb11475b0
parent: 113928aa84894ea8f65c247d9987527e792b32a9

feat: announcement purgatory

Extends purgatory to hold repository announcements until git data arrives, preventing empty repositories from being served to clients. When an announcement is received, a bare repo is created immediately and the announcement is held in purgatory. It is only promoted and served once a git push confirms real content exists. If no push arrives before expiry, the bare repo is deleted and the announcement is silently discarded.

Key behaviours:
- Soft expiry: announcements are hidden from clients but kept alive while git pushes are in progress, reviving on successful push
- Expiry is extended when a matching state event or git push is observed
- NIP-09 deletion events remove announcements from purgatory
- Purgatory state (announcements, state events, PR events, expired set) is persisted to disk on graceful shutdown and restored on startup, with elapsed downtime subtracted from expiry deadlines
- Purgatory announcements drive StateOnly sync in the sync system so state events are fetched from listed relays before promotion
- SyncLevel added to RepoSyncIndex to distinguish purgatory repos (StateOnly) from promoted repos (Full L2+L3 sync)

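
To illustrate the "elapsed downtime subtracted from expiry deadlines" behaviour described in the commit message, here is a minimal sketch. It is not the code from this commit: `PersistedEntry`, `Snapshot`, and `restore` are hypothetical names, and the sketch assumes the time remaining is stored at shutdown because a monotonic `Instant` cannot be written to disk. Entries whose deadline passed while the process was down come back with zero time left, so the normal expiry pass can handle them.

```rust
use std::time::Duration;

/// Hypothetical on-disk form of one purgatory entry (not the real struct).
/// Stores the time that was left at shutdown rather than an `Instant`.
#[derive(Debug, Clone, PartialEq)]
struct PersistedEntry {
    event_id: String,
    remaining_secs: u64,
}

/// Hypothetical snapshot written on graceful shutdown.
struct Snapshot {
    /// Wall-clock time of the shutdown, in seconds since the Unix epoch.
    saved_at_unix_secs: u64,
    entries: Vec<PersistedEntry>,
}

/// Returns each event ID with the time it has left after subtracting downtime.
fn restore(snapshot: &Snapshot, now_unix_secs: u64) -> Vec<(String, Duration)> {
    // saturating_sub guards against the wall clock having moved backwards.
    let downtime = now_unix_secs.saturating_sub(snapshot.saved_at_unix_secs);
    snapshot
        .entries
        .iter()
        .map(|entry| {
            // An entry that expired during downtime is clamped to zero.
            let left = entry.remaining_secs.saturating_sub(downtime);
            (entry.event_id.clone(), Duration::from_secs(left))
        })
        .collect()
}
```

On startup, each returned duration would be added to the current `Instant` to rebuild an in-memory `expires_at`.
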
Diffstat (limited to 'docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md')
-rw-r--r-- docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | 67
1 files changed, 50 insertions, 17 deletions
diff --git a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
index 31c3e46..8fb5798 100644
--- a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
+++ b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
@@ -12,7 +12,13 @@
 
 ## Overview
 
-When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers assoicated with the repo until it finds what it needs.
+When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers associated with the repo until it finds what it needs.
+
+This applies to three types of purgatory entries:
+
+- **Announcement purgatory** — kind 30617 announcements waiting for a git push to prove the repo has content
+- **State event purgatory** — kind 30618 state events waiting for their referenced git objects
+- **PR event purgatory** — kind 1617/1618 PR events waiting for their referenced commits
 
 ### How It Works
 
@@ -42,6 +48,7 @@ We respect remote server capacity with:
 ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations
 ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events
 ✅ **30min expiry** - Auto-cleanup of events when data never arrives
+✅ **Soft expiry for announcements** - Bare repo deleted at 30min, event retained 24h to allow revival
 ✅ **Fully testable** - Mock-based architecture for reliable unit tests
 
 ---
@@ -73,6 +80,16 @@ Timeline D: Data never arrives
   t=60s: Retry → all servers checked, no data
   ...
   t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️
+
+Timeline E: Announcement purgatory (no git data within 30 min)
+  t=0s: Announcement received → bare repo created, enters announcement purgatory
+  t=0.5s: Start hunting git servers for any content
+  ...
+  t=1800s: 30 minutes expired → bare repo deleted, event retained (soft_expired=true)
+  t=3600s: State event arrives (slow sync) → bare repo recreated, expiry reset ✅
+  t=5400s: Git push arrives → announcement promoted to DB, served to clients ✅
+  OR
+  t=86400s: 24 hours elapsed, no revival → event added to expired_events, removed 🗑️
 ```
 
 **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push).
@@ -330,11 +347,11 @@ Both methods check `has_capacity()` and trigger `try_process_next()` if true.
 
 ---
 
-## 30-Minute Purgatory Expiry
+## Purgatory Expiry
 
-Purgatory entries **automatically expire** after 30 minutes to prevent unbounded memory growth.
+### State and PR Events: 30-Minute Hard Expiry
 
-### Why 30 Minutes?
+State and PR purgatory entries **automatically expire** after 30 minutes.
 
 From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory):
 
@@ -346,25 +363,40 @@ This balances:
 - 🧹 **Short enough** to prevent memory leaks from abandoned events
 - 🔄 **Recoverable** events are still on other relays and can be re-submitted
 
-### Implementation
+Each entry tracks `expires_at: Instant` (30 min from creation). The sync loop checks expiry before processing via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue.
 
-Each purgatory entry tracks:
+To prevent infinite re-sync loops, expired event IDs are added to an `expired_events` set. If a sync delivers an event that previously expired, it is rejected with `"previously expired from purgatory without git data"`.
 
-- `created_at: Instant` - When added to purgatory
-- `expires_at: Instant` - When to discard (created_at + 30min)
+**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs)
 
-The main sync loop checks expiry before processing:
+### Announcement Purgatory: Two-Phase Soft Expiry
 
-```rust
-if !self.has_pending_events(&identifier) {
-    // No events remain (expired or released) → remove from sync queue
-    self.sync_queue.remove(&identifier);
-}
-```
+Announcements use a different expiry strategy because they have an additional concern: the bare git repo created on arrival must be cleaned up, but we also need to avoid re-syncing the announcement event on every sync cycle.
 
-**Note**: Expiry is checked implicitly via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue.
+**Phase 1 — Initial 30-minute expiry:**
 
-**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs)
+- Delete the bare git repo (frees disk space, respects the protocol's 30-minute expiry)
+- Set `soft_expired = true` on the entry
+- Extend `expires_at` by **24 hours** (`SOFT_EXPIRY_EXTENDED`)
+- Continue syncing state events for this repo (same as active purgatory)
+
+**Phase 2 — 24-hour soft expiry:**
+
+- Add event ID to `expired_events` (prevents re-sync loops)
+- Remove entry completely from `announcement_purgatory`
+
+**Why not just hard-expire at 30 minutes?**
+
+The protocol's 30-minute expiry creates a dilemma for announcements:
+
+- **Option A: Add to `failed_events` at 30 min** → Permanently rejects future state events, losing potential revival when state events arrive late (e.g. from a slow sync)
+- **Option B: Remove entirely at 30 min** → The announcement gets re-fetched on every subsequent sync cycle, wasting bandwidth indefinitely
+
+Soft expiry is the solution: the bare repo is deleted at 30 minutes (respecting the protocol), but the event is retained for 24 hours. During this window, a late-arriving state event can **revive** the announcement—`extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer. After 24 hours with no revival, the event is added to `expired_events` and fully removed.
+
+**Why 24 hours specifically?** This covers the worst-case sync delay. A relay that was offline for up to 24 hours will re-sync state events when it reconnects. The 24-hour window ensures announcements remain revivable throughout that period without permanently occupying disk space.
+
+**Implementation**: [`src/purgatory/mod.rs:SOFT_EXPIRY_EXTENDED`](../../src/purgatory/mod.rs)
 
 ---
 
@@ -670,6 +702,7 @@ The purgatory sync system is a sophisticated, production-ready implementation th
 ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness
 ✅ **Times strategically** - 3min for user events, 500ms for synced events
 ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks
+✅ **Soft-expires announcements** - Bare repo deleted at 30min, event retained 24h for revival
 ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests
 
 This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability.
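
The two-phase announcement expiry documented in this diff can be sketched roughly as follows. This is an illustrative model, not the code in `src/purgatory/mod.rs`: the names `DEFAULT_EXPIRY`, `SOFT_EXPIRY_EXTENDED`, `soft_expired`, `expired_events`, and `extend_announcement_expiry` come from the document, while `AnnouncementEntry`, `AnnouncementPurgatory`, `add`, `sweep`, and the `bare_repo_exists` flag (a stand-in for the on-disk bare repo) are hypothetical. The sketch also assumes "extend by 24 hours" means adding 24 hours to the existing deadline.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

const DEFAULT_EXPIRY: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical entry shape; `bare_repo_exists` stands in for the repo on disk.
struct AnnouncementEntry {
    expires_at: Instant,
    soft_expired: bool,
    bare_repo_exists: bool,
}

#[derive(Default)]
struct AnnouncementPurgatory {
    entries: HashMap<String, AnnouncementEntry>,
    expired_events: HashSet<String>,
}

impl AnnouncementPurgatory {
    /// Admits an announcement unless it previously expired without git data.
    fn add(&mut self, event_id: &str, now: Instant) -> bool {
        if self.expired_events.contains(event_id) {
            return false; // rejecting here prevents re-sync loops
        }
        let entry = AnnouncementEntry {
            expires_at: now + DEFAULT_EXPIRY,
            soft_expired: false,
            bare_repo_exists: true,
        };
        self.entries.insert(event_id.to_string(), entry);
        true
    }

    /// Applies phase 1 or phase 2 to every entry whose deadline has passed.
    fn sweep(&mut self, now: Instant) {
        let mut fully_expired = Vec::new();
        for (id, entry) in self.entries.iter_mut() {
            if now < entry.expires_at {
                continue;
            }
            if entry.soft_expired {
                // Phase 2: the 24-hour window ended with no revival.
                fully_expired.push(id.clone());
            } else {
                // Phase 1: drop the bare repo but keep the event.
                entry.bare_repo_exists = false;
                entry.soft_expired = true;
                entry.expires_at += SOFT_EXPIRY_EXTENDED;
            }
        }
        for id in fully_expired {
            self.entries.remove(&id);
            self.expired_events.insert(id);
        }
    }

    /// Revives an entry when a matching state event or git push is observed.
    fn extend_announcement_expiry(&mut self, event_id: &str, now: Instant) -> bool {
        match self.entries.get_mut(event_id) {
            Some(entry) => {
                entry.bare_repo_exists = true;
                entry.soft_expired = false;
                entry.expires_at = now + DEFAULT_EXPIRY;
                true
            }
            None => false,
        }
    }
}
```

Passing `now` explicitly keeps the sketch deterministic to test, in the spirit of the mock-based testing the document mentions.
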