| author | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-23 14:49:30 +0000 |
|---|---|---|
| committer | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-23 14:49:30 +0000 |
| commit | 4848c4029fc58f6f310a2babeae1ee82a7e41656 (patch) | |
| tree | ccdfdaae41dd2907794a47bbeff562824dd3915b /docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | |
| parent | f19b424e01fc5a682778c5e2bb194d242efd6987 (diff) | |
docs: update purgatory docs to reflect announcements purgatory implementation
Remove the pre-implementation planning docs (announcements-purgatory-design.md
and announcements-purgatory-implementation.md) now that the feature is built.
Update the three living docs to reflect what was actually implemented:
- purgatory-design.md: expanded to cover all three purgatory stores
(announcement, state, PR), including AnnouncementPurgatoryEntry structure,
two-phase soft expiry lifecycle, expiry extension triggers, promotion flow,
and updated integration points and file structure
- grasp-02-proactive-sync.md: added SyncLevel enum (Full/StateOnly) to
RepoSyncNeeds, documented the purgatory announcement sync timer as the
registration path for purgatory announcements, updated filter building
to describe build_sync_level_aware_filters() and StateOnly behaviour
- grasp-02-proactive-sync-purgatory-git-data.md: expanded to cover
announcement purgatory as a third entry type, added Timeline E showing
soft-expiry and revival, replaced the single expiry section with separate
hard-expiry (state/PR) and two-phase soft-expiry (announcements) sections
with full justification for the 24-hour extended retention window
Diffstat (limited to 'docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md')
| -rw-r--r-- | docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | 67 |
1 files changed, 50 insertions, 17 deletions
diff --git a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md index 31c3e46..8fb5798 100644 --- a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md +++ b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md | |||
| @@ -12,7 +12,13 @@ | |||
| 12 | 12 | ||
| 13 | ## Overview | 13 | ## Overview |
| 14 | 14 | ||
| 15 | When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers assoicated with the repo until it finds what it needs. | 15 | When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers associated with the repo until it finds what it needs. |
| 16 | |||
| 17 | This applies to three types of purgatory entries: | ||
| 18 | |||
| 19 | - **Announcement purgatory** — kind 30617 announcements waiting for a git push to prove the repo has content | ||
| 20 | - **State event purgatory** — kind 30618 state events waiting for their referenced git objects | ||
| 21 | - **PR event purgatory** — kind 1617/1618 PR events waiting for their referenced commits | ||
| 16 | 22 | ||
| 17 | ### How It Works | 23 | ### How It Works |
| 18 | 24 | ||
| @@ -42,6 +48,7 @@ We respect remote server capacity with: | |||
| 42 | ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations | 48 | ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations |
| 43 | ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events | 49 | ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events |
| 44 | ✅ **30min expiry** - Auto-cleanup of events when data never arrives | 50 | ✅ **30min expiry** - Auto-cleanup of events when data never arrives |
| 51 | ✅ **Soft expiry for announcements** - Bare repo deleted at 30min, event retained 24h to allow revival | ||
| 45 | ✅ **Fully testable** - Mock-based architecture for reliable unit tests | 52 | ✅ **Fully testable** - Mock-based architecture for reliable unit tests |
| 46 | 53 | ||
| 47 | --- | 54 | --- |
| @@ -73,6 +80,16 @@ Timeline D: Data never arrives | |||
| 73 | t=60s: Retry → all servers checked, no data | 80 | t=60s: Retry → all servers checked, no data |
| 74 | ... | 81 | ... |
| 75 | t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️ | 82 | t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️ |
| 83 | |||
| 84 | Timeline E: Announcement purgatory (no git data within 30 min) | ||
| 85 | t=0s: Announcement received → bare repo created, enters announcement purgatory | ||
| 86 | t=0.5s: Start hunting git servers for any content | ||
| 87 | ... | ||
| 88 | t=1800s: 30 minutes expired → bare repo deleted, event retained (soft_expired=true) | ||
| 89 | t=3600s: State event arrives (slow sync) → bare repo recreated, expiry reset ✅ | ||
| 90 | t=5400s: Git push arrives → announcement promoted to DB, served to clients ✅ | ||
| 91 | OR | ||
| 92 | t=86400s: 24 hours elapsed, no revival → event added to expired_events, removed 🗑️ | ||
| 76 | ``` | 93 | ``` |
| 77 | 94 | ||
| 78 | **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push). | 95 | **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push). |
| @@ -330,11 +347,11 @@ Both methods check `has_capacity()` and trigger `try_process_next()` if true. | |||
| 330 | 347 | ||
| 331 | --- | 348 | --- |
| 332 | 349 | ||
| 333 | ## 30-Minute Purgatory Expiry | 350 | ## Purgatory Expiry |
| 334 | 351 | ||
| 335 | Purgatory entries **automatically expire** after 30 minutes to prevent unbounded memory growth. | 352 | ### State and PR Events: 30-Minute Hard Expiry |
| 336 | 353 | ||
| 337 | ### Why 30 Minutes? | 354 | State and PR purgatory entries **automatically expire** after 30 minutes. |
| 338 | 355 | ||
| 339 | From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory): | 356 | From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory): |
| 340 | 357 | ||
| @@ -346,25 +363,40 @@ This balances: | |||
| 346 | - 🧹 **Short enough** to prevent memory leaks from abandoned events | 363 | - 🧹 **Short enough** to prevent memory leaks from abandoned events |
| 347 | - 🔄 **Recoverable** events are still on other relays and can be re-submitted | 364 | - 🔄 **Recoverable** events are still on other relays and can be re-submitted |
| 348 | 365 | ||
| 349 | ### Implementation | 366 | Each entry tracks `expires_at: Instant` (30 min from creation). The sync loop checks expiry before processing via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue. |
| 350 | 367 | ||
| 351 | Each purgatory entry tracks: | 368 | To prevent infinite re-sync loops, expired event IDs are added to an `expired_events` set. If a sync delivers an event that previously expired, it is rejected with `"previously expired from purgatory without git data"`. |
| 352 | 369 | ||
| 353 | - `created_at: Instant` - When added to purgatory | 370 | **Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs) |
| 354 | - `expires_at: Instant` - When to discard (created_at + 30min) | ||
| 355 | 371 | ||
| 356 | The main sync loop checks expiry before processing: | 372 | ### Announcement Purgatory: Two-Phase Soft Expiry |
| 357 | 373 | ||
| 358 | ```rust | 374 | Announcements use a different expiry strategy because they have an additional concern: the bare git repo created on arrival must be cleaned up, but we also need to avoid re-syncing the announcement event on every sync cycle. |
| 359 | if !self.has_pending_events(&identifier) { | ||
| 360 | // No events remain (expired or released) → remove from sync queue | ||
| 361 | self.sync_queue.remove(&identifier); | ||
| 362 | } | ||
| 363 | ``` | ||
| 364 | 375 | ||
| 365 | **Note**: Expiry is checked implicitly via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue. | 376 | **Phase 1 — Initial 30-minute expiry:** |
| 366 | 377 | ||
| 367 | **Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs) | 378 | - Delete the bare git repo (frees disk space, respects the protocol's 30-minute expiry) |
| 379 | - Set `soft_expired = true` on the entry | ||
| 380 | - Extend `expires_at` by **24 hours** (`SOFT_EXPIRY_EXTENDED`) | ||
| 381 | - Continue syncing state events for this repo (same as active purgatory) | ||
| 382 | |||
| 383 | **Phase 2 — 24-hour soft expiry:** | ||
| 384 | |||
| 385 | - Add event ID to `expired_events` (prevents re-sync loops) | ||
| 386 | - Remove entry completely from `announcement_purgatory` | ||
| 387 | |||
| 388 | **Why not just hard-expire at 30 minutes?** | ||
| 389 | |||
| 390 | The protocol's 30-minute expiry creates a dilemma for announcements: | ||
| 391 | |||
| 392 | - **Option A: Add to `failed_events` at 30 min** → Permanently rejects future state events, losing potential revival when state events arrive late (e.g. from a slow sync) | ||
| 393 | - **Option B: Remove entirely at 30 min** → The announcement gets re-fetched on every subsequent sync cycle, wasting bandwidth indefinitely | ||
| 394 | |||
| 395 | Soft expiry is the solution: the bare repo is deleted at 30 minutes (respecting the protocol), but the event is retained for 24 hours. During this window, a late-arriving state event can **revive** the announcement—`extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer. After 24 hours with no revival, the event is added to `expired_events` and fully removed. | ||
| 396 | |||
| 397 | **Why 24 hours specifically?** This covers the worst-case sync delay. A relay that was offline for up to 24 hours will re-sync state events when it reconnects. The 24-hour window ensures announcements remain revivable throughout that period without permanently occupying disk space. | ||
| 398 | |||
| 399 | **Implementation**: [`src/purgatory/mod.rs:SOFT_EXPIRY_EXTENDED`](../../src/purgatory/mod.rs) | ||
| 368 | 400 | ||
| 369 | --- | 401 | --- |
| 370 | 402 | ||
| @@ -670,6 +702,7 @@ The purgatory sync system is a sophisticated, production-ready implementation th | |||
| 670 | ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness | 702 | ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness |
| 671 | ✅ **Times strategically** - 3min for user events, 500ms for synced events | 703 | ✅ **Times strategically** - 3min for user events, 500ms for synced events |
| 672 | ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks | 704 | ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks |
| 705 | ✅ **Soft-expires announcements** - Bare repo deleted at 30min, event retained 24h for revival | ||
| 673 | ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests | 706 | ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests |
| 674 | 707 | ||
| 675 | This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability. | 708 | This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability. |
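
The two-phase soft expiry described in the new "Announcement Purgatory" section can be sketched as a small in-memory model. This is an illustrative sketch only, not the code in `src/purgatory/mod.rs`. The names `expires_at`, `soft_expired`, `expired_events`, `DEFAULT_EXPIRY`, `SOFT_EXPIRY_EXTENDED` and `extend_announcement_expiry()` come from the diff. The struct names, the `insert` and `sweep` methods, and the `has_bare_repo` flag (standing in for the bare repo on disk) are assumptions made for the example.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

const DEFAULT_EXPIRY: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical, simplified entry; the real AnnouncementPurgatoryEntry has more fields.
#[derive(Debug)]
struct AnnouncementEntry {
    expires_at: Instant,
    soft_expired: bool,
    /// Assumption: stands in for the bare git repo on disk.
    has_bare_repo: bool,
}

#[derive(Default)]
struct AnnouncementPurgatory {
    entries: HashMap<String, AnnouncementEntry>,
    expired_events: HashSet<String>,
}

impl AnnouncementPurgatory {
    /// Accept an announcement unless it already fully expired once.
    fn insert(&mut self, event_id: &str, now: Instant) -> bool {
        if self.expired_events.contains(event_id) {
            return false; // previously expired: reject to avoid re-sync loops
        }
        self.entries.insert(
            event_id.to_string(),
            AnnouncementEntry {
                expires_at: now + DEFAULT_EXPIRY,
                soft_expired: false,
                has_bare_repo: true,
            },
        );
        true
    }

    /// Apply phase 1 or phase 2 to every entry whose deadline has passed.
    fn sweep(&mut self, now: Instant) {
        let mut fully_expired = Vec::new();
        for (id, entry) in self.entries.iter_mut() {
            if now < entry.expires_at {
                continue;
            }
            if entry.soft_expired {
                // Phase 2: the extended window has also elapsed.
                fully_expired.push(id.clone());
            } else {
                // Phase 1: drop the repo, keep the event, extend the deadline.
                // Assumption: the extension is added to the old deadline.
                entry.has_bare_repo = false;
                entry.soft_expired = true;
                entry.expires_at += SOFT_EXPIRY_EXTENDED;
            }
        }
        for id in fully_expired {
            self.entries.remove(&id);
            self.expired_events.insert(id);
        }
    }

    /// Revival: recreate the repo, clear the flag, reset the 30-minute timer.
    fn extend_announcement_expiry(&mut self, event_id: &str, now: Instant) -> bool {
        match self.entries.get_mut(event_id) {
            Some(entry) => {
                entry.has_bare_repo = true;
                entry.soft_expired = false;
                entry.expires_at = now + DEFAULT_EXPIRY;
                true
            }
            None => false,
        }
    }
}
```

Passing `now` as a parameter rather than calling `Instant::now()` inside each method keeps the model deterministic, so Timeline E can be replayed in a unit test by sweeping at `t0 + DEFAULT_EXPIRY`, reviving at a later instant, and sweeping again.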