| author | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-23 15:41:32 +0000 |
|---|---|---|
| committer | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-23 15:41:32 +0000 |
| commit | c54ce061d6d278cce8362d5af085808ca60c239b (patch) | |
| tree | ec967d6195d9f7ec4f061449596611afe3a0950f /docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | |
| parent | e0ad39a489b3398f8208713bf728db0cb11475b0 (diff) | |
| parent | 113928aa84894ea8f65c247d9987527e792b32a9 (diff) | |
feat: announcement purgatory

Extends purgatory to hold repository announcements until git data arrives,
preventing empty repositories from being served to clients.

When an announcement is received, a bare repo is created immediately and the
announcement is held in purgatory. It is only promoted and served once a git
push confirms real content exists. If no push arrives before expiry, the bare
repo is deleted and the announcement is silently discarded.

Key behaviours:

- Soft expiry: announcements are hidden from clients but kept alive while git
  pushes are in progress, reviving on successful push
- Expiry is extended when a matching state event or git push is observed
- NIP-09 deletion events remove announcements from purgatory
- Purgatory state (announcements, state events, PR events, expired set) is
  persisted to disk on graceful shutdown and restored on startup, with elapsed
  downtime subtracted from expiry deadlines
- Purgatory announcements drive StateOnly sync in the sync system so state
  events are fetched from listed relays before promotion
- SyncLevel added to RepoSyncIndex to distinguish purgatory repos (StateOnly)
  from promoted repos (Full L2+L3 sync)

Diffstat (limited to 'docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md')
| -rw-r--r-- | docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | 67 |
1 files changed, 50 insertions, 17 deletions
diff --git a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md index 31c3e46..8fb5798 100644 --- a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md +++ b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | |||
| @@ -12,7 +12,13 @@ | |||
| 12 | 12 | ||
| 13 | ## Overview | 13 | ## Overview |
| 14 | 14 | ||
| 15 | When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers assoicated with the repo until it finds what it needs. | 15 | When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers associated with the repo until it finds what it needs. |
| 16 | |||
| 17 | This applies to three types of purgatory entries: | ||
| 18 | |||
| 19 | - **Announcement purgatory** — kind 30617 announcements waiting for a git push to prove the repo has content | ||
| 20 | - **State event purgatory** — kind 30618 state events waiting for their referenced git objects | ||
| 21 | - **PR event purgatory** — kind 1617/1618 PR events waiting for their referenced commits | ||
| 16 | 22 | ||
| 17 | ### How It Works | 23 | ### How It Works |
| 18 | 24 | ||
| @@ -42,6 +48,7 @@ We respect remote server capacity with: | |||
| 42 | ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations | 48 | ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations |
| 43 | ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events | 49 | ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events |
| 44 | ✅ **30min expiry** - Auto-cleanup of events when data never arrives | 50 | ✅ **30min expiry** - Auto-cleanup of events when data never arrives |
| 51 | ✅ **Soft expiry for announcements** - Bare repo deleted at 30min, event retained 24h to allow revival | ||
| 45 | ✅ **Fully testable** - Mock-based architecture for reliable unit tests | 52 | ✅ **Fully testable** - Mock-based architecture for reliable unit tests |
| 46 | 53 | ||
| 47 | --- | 54 | --- |
| @@ -73,6 +80,16 @@ Timeline D: Data never arrives | |||
| 73 | t=60s: Retry → all servers checked, no data | 80 | t=60s: Retry → all servers checked, no data |
| 74 | ... | 81 | ... |
| 75 | t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️ | 82 | t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️ |
| 83 | |||
| 84 | Timeline E: Announcement purgatory (no git data within 30 min) | ||
| 85 | t=0s: Announcement received → bare repo created, enters announcement purgatory | ||
| 86 | t=0.5s: Start hunting git servers for any content | ||
| 87 | ... | ||
| 88 | t=1800s: 30 minutes expired → bare repo deleted, event retained (soft_expired=true) | ||
| 89 | t=3600s: State event arrives (slow sync) → bare repo recreated, expiry reset ✅ | ||
| 90 | t=5400s: Git push arrives → announcement promoted to DB, served to clients ✅ | ||
| 91 | OR | ||
| 92 | t=86400s: 24 hours elapsed, no revival → event added to expired_events, removed 🗑️ | ||
| 76 | ``` | 93 | ``` |
| 77 | 94 | ||
| 78 | **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push). | 95 | **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push). |
| @@ -330,11 +347,11 @@ Both methods check `has_capacity()` and trigger `try_process_next()` if true. | |||
| 330 | 347 | ||
| 331 | --- | 348 | --- |
| 332 | 349 | ||
| 333 | ## 30-Minute Purgatory Expiry | 350 | ## Purgatory Expiry |
| 334 | 351 | ||
| 335 | Purgatory entries **automatically expire** after 30 minutes to prevent unbounded memory growth. | 352 | ### State and PR Events: 30-Minute Hard Expiry |
| 336 | 353 | ||
| 337 | ### Why 30 Minutes? | 354 | State and PR purgatory entries **automatically expire** after 30 minutes. |
| 338 | 355 | ||
| 339 | From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory): | 356 | From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory): |
| 340 | 357 | ||
| @@ -346,25 +363,40 @@ This balances: | |||
| 346 | - 🧹 **Short enough** to prevent memory leaks from abandoned events | 363 | - 🧹 **Short enough** to prevent memory leaks from abandoned events |
| 347 | - 🔄 **Recoverable** events are still on other relays and can be re-submitted | 364 | - 🔄 **Recoverable** events are still on other relays and can be re-submitted |
| 348 | 365 | ||
| 349 | ### Implementation | 366 | Each entry tracks `expires_at: Instant` (30 min from creation). The sync loop checks expiry before processing via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue. |
| 350 | 367 | ||
| 351 | Each purgatory entry tracks: | 368 | To prevent infinite re-sync loops, expired event IDs are added to an `expired_events` set. If a sync delivers an event that previously expired, it is rejected with `"previously expired from purgatory without git data"`. |
| 352 | 369 | ||
| 353 | - `created_at: Instant` - When added to purgatory | 370 | **Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs) |
| 354 | - `expires_at: Instant` - When to discard (created_at + 30min) | ||
| 355 | 371 | ||
| 356 | The main sync loop checks expiry before processing: | 372 | ### Announcement Purgatory: Two-Phase Soft Expiry |
| 357 | 373 | ||
| 358 | ```rust | 374 | Announcements use a different expiry strategy because they have an additional concern: the bare git repo created on arrival must be cleaned up, but we also need to avoid re-syncing the announcement event on every sync cycle. |
| 359 | if !self.has_pending_events(&identifier) { | ||
| 360 | // No events remain (expired or released) → remove from sync queue | ||
| 361 | self.sync_queue.remove(&identifier); | ||
| 362 | } | ||
| 363 | ``` | ||
| 364 | 375 | ||
| 365 | **Note**: Expiry is checked implicitly via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue. | 376 | **Phase 1 — Initial 30-minute expiry:** |
| 366 | 377 | ||
| 367 | **Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs) | 378 | - Delete the bare git repo (frees disk space, respects the protocol's 30-minute expiry) |
| 379 | - Set `soft_expired = true` on the entry | ||
| 380 | - Extend `expires_at` by **24 hours** (`SOFT_EXPIRY_EXTENDED`) | ||
| 381 | - Continue syncing state events for this repo (same as active purgatory) | ||
| 382 | |||
| 383 | **Phase 2 — 24-hour soft expiry:** | ||
| 384 | |||
| 385 | - Add event ID to `expired_events` (prevents re-sync loops) | ||
| 386 | - Remove entry completely from `announcement_purgatory` | ||
| 387 | |||
| 388 | **Why not just hard-expire at 30 minutes?** | ||
| 389 | |||
| 390 | The protocol's 30-minute expiry creates a dilemma for announcements: | ||
| 391 | |||
| 392 | - **Option A: Add to `failed_events` at 30 min** → Permanently rejects future state events, losing potential revival when state events arrive late (e.g. from a slow sync) | ||
| 393 | - **Option B: Remove entirely at 30 min** → The announcement gets re-fetched on every subsequent sync cycle, wasting bandwidth indefinitely | ||
| 394 | |||
| 395 | Soft expiry is the solution: the bare repo is deleted at 30 minutes (respecting the protocol), but the event is retained for 24 hours. During this window, a late-arriving state event can **revive** the announcement—`extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer. After 24 hours with no revival, the event is added to `expired_events` and fully removed. | ||
| 396 | |||
| 397 | **Why 24 hours specifically?** This covers the worst-case sync delay. A relay that was offline for up to 24 hours will re-sync state events when it reconnects. The 24-hour window ensures announcements remain revivable throughout that period without permanently occupying disk space. | ||
| 398 | |||
| 399 | **Implementation**: [`src/purgatory/mod.rs:SOFT_EXPIRY_EXTENDED`](../../src/purgatory/mod.rs) | ||
| 368 | 400 | ||
| 369 | --- | 401 | --- |
| 370 | 402 | ||
| @@ -670,6 +702,7 @@ The purgatory sync system is a sophisticated, production-ready implementation th | |||
| 670 | ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness | 702 | ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness |
| 671 | ✅ **Times strategically** - 3min for user events, 500ms for synced events | 703 | ✅ **Times strategically** - 3min for user events, 500ms for synced events |
| 672 | ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks | 704 | ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks |
| 705 | ✅ **Soft-expires announcements** - Bare repo deleted at 30min, event retained 24h for revival | ||
| 673 | ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests | 706 | ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests |
| 674 | 707 | ||
| 675 | This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability. | 708 | This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability. |
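
The two-phase soft expiry added by this change can be sketched as a small in-memory state machine. This is a minimal illustration under stated assumptions, not the code in `src/purgatory/mod.rs`: the names `DEFAULT_EXPIRY`, `SOFT_EXPIRY_EXTENDED`, `soft_expired`, `expires_at`, `expired_events` and `extend_announcement_expiry` come from the documentation above, while `AnnouncementPurgatory`, `AnnouncementEntry`, `insert`, `sweep`, the `has_bare_repo` flag (a stand-in for the on-disk bare repo) and all signatures are hypothetical. The sketch also assumes the 24-hour extension is added to the existing deadline, which the documentation does not state precisely.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

/// Initial window before the bare repo is deleted (30 minutes per the docs).
const DEFAULT_EXPIRY: Duration = Duration::from_secs(30 * 60);
/// Extra retention after soft expiry (24 hours per the docs).
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical entry shape; the real struct may carry more fields.
struct AnnouncementEntry {
    expires_at: Instant,
    soft_expired: bool,
    /// Stand-in for the bare git repo on disk.
    has_bare_repo: bool,
}

#[derive(Default)]
struct AnnouncementPurgatory {
    entries: HashMap<String, AnnouncementEntry>,
    /// IDs that fully expired; re-delivered copies are rejected.
    expired_events: HashSet<String>,
}

impl AnnouncementPurgatory {
    /// Accept an announcement unless it previously expired without git data.
    fn insert(&mut self, event_id: &str, now: Instant) -> bool {
        if self.expired_events.contains(event_id) {
            return false;
        }
        self.entries.insert(
            event_id.to_string(),
            AnnouncementEntry {
                expires_at: now + DEFAULT_EXPIRY,
                soft_expired: false,
                has_bare_repo: true,
            },
        );
        true
    }

    /// Apply phase 1 or phase 2 to every entry whose deadline has passed.
    fn sweep(&mut self, now: Instant) {
        let mut fully_expired = Vec::new();
        for (id, entry) in self.entries.iter_mut() {
            if now < entry.expires_at {
                continue;
            }
            if entry.soft_expired {
                // Phase 2: the retention window also elapsed.
                fully_expired.push(id.clone());
            } else {
                // Phase 1: drop the repo, keep the event for possible revival.
                entry.has_bare_repo = false;
                entry.soft_expired = true;
                entry.expires_at += SOFT_EXPIRY_EXTENDED; // assumption: extend the old deadline
            }
        }
        for id in fully_expired {
            self.entries.remove(&id);
            self.expired_events.insert(id);
        }
    }

    /// Revival: a matching state event or push recreates the repo and
    /// restarts the 30-minute timer. Returns false if the entry is gone.
    fn extend_announcement_expiry(&mut self, event_id: &str, now: Instant) -> bool {
        match self.entries.get_mut(event_id) {
            Some(entry) => {
                entry.has_bare_repo = true;
                entry.soft_expired = false;
                entry.expires_at = now + DEFAULT_EXPIRY;
                true
            }
            None => false,
        }
    }
}
```

Passing `now` as a parameter instead of calling `Instant::now()` inside the methods keeps the sketch deterministic under test, in the spirit of the mock-based architecture described above. Replaying Timeline E against it (sweep at 1800s, revive at 3600s, then let both phases elapse) ends with the event ID in `expired_events` and later re-insertion refused.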