From 4848c4029fc58f6f310a2babeae1ee82a7e41656 Mon Sep 17 00:00:00 2001
From: DanConwayDev
Date: Mon, 23 Feb 2026 14:49:30 +0000
Subject: docs: update purgatory docs to reflect announcements purgatory implementation

Remove the pre-implementation planning docs (announcements-purgatory-design.md and announcements-purgatory-implementation.md) now that the feature is built.

Update the three living docs to reflect what was actually implemented:

- purgatory-design.md: expanded to cover all three purgatory stores (announcement, state, PR), including AnnouncementPurgatoryEntry structure, two-phase soft expiry lifecycle, expiry extension triggers, promotion flow, and updated integration points and file structure
- grasp-02-proactive-sync.md: added SyncLevel enum (Full/StateOnly) to RepoSyncNeeds, documented the purgatory announcement sync timer as the registration path for purgatory announcements, updated filter building to describe build_sync_level_aware_filters() and StateOnly behaviour
- grasp-02-proactive-sync-purgatory-git-data.md: expanded to cover announcement purgatory as a third entry type, added Timeline E showing soft-expiry and revival, replaced the single expiry section with separate hard-expiry (state/PR) and two-phase soft-expiry (announcements) sections with full justification for the 24-hour extended retention window
---
 .../grasp-02-proactive-sync-purgatory-git-data.md | 67 ++++++++++++++++------
 1 file changed, 50 insertions(+), 17 deletions(-)

(limited to 'docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md')

diff --git a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
index 31c3e46..8fb5798 100644
--- a/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
+++ b/docs/explanation/grasp-02-proactive-sync-purgatory-git-data.md
@@ -12,7 +12,13 @@
 
 ## Overview
 
-When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers assoicated with the repo until it finds what it needs.
+When Nostr events arrive before their git data, they enter **purgatory** waiting to be served. But they don't wait passively—ngit-grasp **actively hunts** for the missing git data across all git servers associated with the repo until it finds what it needs.
+
+This applies to three types of purgatory entries:
+
+- **Announcement purgatory** — kind 30617 announcements waiting for a git push to prove the repo has content
+- **State event purgatory** — kind 30618 state events waiting for their referenced git objects
+- **PR event purgatory** — kind 1617/1618 PR events waiting for their referenced commits
 
 ### How It Works
 
@@ -42,6 +48,7 @@ We respect remote server capacity with:
 ✅ **Respectful throttling** - 5 concurrent + 30/min per domain, plays nice with other implementations
 ✅ **Smart timing** - 3min delay for user pushes, 500ms for synced events
 ✅ **30min expiry** - Auto-cleanup of events when data never arrives
+✅ **Soft expiry for announcements** - Bare repo deleted at 30min, event retained 24h to allow revival
 ✅ **Fully testable** - Mock-based architecture for reliable unit tests
 
 ---
@@ -73,6 +80,16 @@ Timeline D: Data never arrives
   t=60s: Retry → all servers checked, no data
   ...
   t=1800s: 30 minutes expired → event discarded, purgatory cleaned up 🗑️
+
+Timeline E: Announcement purgatory (no git data within 30 min)
+  t=0s: Announcement received → bare repo created, enters announcement purgatory
+  t=0.5s: Start hunting git servers for any content
+  ...
+  t=1800s: 30 minutes expired → bare repo deleted, event retained (soft_expired=true)
+  t=3600s: State event arrives (slow sync) → bare repo recreated, expiry reset ✅
+  t=5400s: Git push arrives → announcement promoted to DB, served to clients ✅
+  OR
+  t=86400s: 24 hours elapsed, no revival → event added to expired_events, removed 🗑️
 ```
 
 **Without proactive sync**: Events in Timeline C would wait indefinitely (or until manual git push).
@@ -330,11 +347,11 @@ Both methods check `has_capacity()` and trigger `try_process_next()` if true.
 
 ---
 
-## 30-Minute Purgatory Expiry
+## Purgatory Expiry
 
-Purgatory entries **automatically expire** after 30 minutes to prevent unbounded memory growth.
+### State and PR Events: 30-Minute Hard Expiry
 
-### Why 30 Minutes?
+State and PR purgatory entries **automatically expire** after 30 minutes.
 
 From the [GRASP-01 spec](https://github.com/DanConwayDev/grasp/blob/main/01.md#purgatory):
 
@@ -346,25 +363,40 @@ This balances:
 - 🧹 **Short enough** to prevent memory leaks from abandoned events
 - 🔄 **Recoverable** events are still on other relays and can be re-submitted
 
-### Implementation
+Each entry tracks `expires_at: Instant` (30 min from creation). The sync loop checks expiry before processing via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue.
 
-Each purgatory entry tracks:
+To prevent infinite re-sync loops, expired event IDs are added to an `expired_events` set. If a sync delivers an event that previously expired, it is rejected with `"previously expired from purgatory without git data"`.
 
-- `created_at: Instant` - When added to purgatory
-- `expires_at: Instant` - When to discard (created_at + 30min)
+**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs)
 
-The main sync loop checks expiry before processing:
+### Announcement Purgatory: Two-Phase Soft Expiry
 
-```rust
-if !self.has_pending_events(&identifier) {
-    // No events remain (expired or released) → remove from sync queue
-    self.sync_queue.remove(&identifier);
-}
-```
+Announcements use a different expiry strategy because they have an additional concern: the bare git repo created on arrival must be cleaned up, but we also need to avoid re-syncing the announcement event on every sync cycle.
 
-**Note**: Expiry is checked implicitly via `has_pending_events()`. If all events for an identifier have expired, the identifier is removed from the sync queue.
+**Phase 1 — Initial 30-minute expiry:**
 
-**Implementation**: [`src/purgatory/mod.rs:DEFAULT_EXPIRY`](../../src/purgatory/mod.rs)
+- Delete the bare git repo (frees disk space, respects the protocol's 30-minute expiry)
+- Set `soft_expired = true` on the entry
+- Extend `expires_at` by **24 hours** (`SOFT_EXPIRY_EXTENDED`)
+- Continue syncing state events for this repo (same as active purgatory)
+
+**Phase 2 — 24-hour soft expiry:**
+
+- Add event ID to `expired_events` (prevents re-sync loops)
+- Remove entry completely from `announcement_purgatory`
+
+**Why not just hard-expire at 30 minutes?**
+
+The protocol's 30-minute expiry creates a dilemma for announcements:
+
+- **Option A: Add to `failed_events` at 30 min** → Permanently rejects future state events, losing potential revival when state events arrive late (e.g. from a slow sync)
+- **Option B: Remove entirely at 30 min** → The announcement gets re-fetched on every subsequent sync cycle, wasting bandwidth indefinitely
+
+Soft expiry is the solution: the bare repo is deleted at 30 minutes (respecting the protocol), but the event is retained for 24 hours. During this window, a late-arriving state event can **revive** the announcement—`extend_announcement_expiry()` recreates the bare repo, clears `soft_expired`, and resets the 30-minute timer. After 24 hours with no revival, the event is added to `expired_events` and fully removed.
+
+**Why 24 hours specifically?** This covers the worst-case sync delay. A relay that was offline for up to 24 hours will re-sync state events when it reconnects. The 24-hour window ensures announcements remain revivable throughout that period without permanently occupying disk space.
+
+**Implementation**: [`src/purgatory/mod.rs:SOFT_EXPIRY_EXTENDED`](../../src/purgatory/mod.rs)
 
 ---
 
@@ -670,6 +702,7 @@ The purgatory sync system is a sophisticated, production-ready implementation th
 ✅ **Throttles respectfully** - 5 concurrent + 30/min per domain, round-robin fairness
 ✅ **Times strategically** - 3min for user events, 500ms for synced events
 ✅ **Expires responsibly** - 30min auto-cleanup prevents memory leaks
+✅ **Soft-expires announcements** - Bare repo deleted at 30min, event retained 24h for revival
 ✅ **Tests thoroughly** - Mock-based architecture enables comprehensive unit tests
 
 This design ensures ngit-grasp can serve repositories reliably even when git data and Nostr events arrive out-of-order or from different sources, while respecting remote server capacity and providing excellent observability.
-- 
cgit v1.2.3
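
Reviewer's sketch (not part of the patch): the two-phase soft expiry that the new "Announcement Purgatory" section describes can be modelled roughly as below. This is a minimal illustration, not the code in `src/purgatory/mod.rs`. The names `DEFAULT_EXPIRY`, `SOFT_EXPIRY_EXTENDED`, `soft_expired`, `expires_at` and `expired_events` are taken from the patch text; the `Entry` and `AnnouncementPurgatory` types, the `has_bare_repo` flag standing in for the on-disk bare repo, and the `add`/`sweep`/`revive` methods are hypothetical. Whether the real code extends `expires_at` from the old deadline or from the current time is also an assumption here.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

// Durations as stated in the patch text (30 minutes and 24 hours).
const DEFAULT_EXPIRY: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENDED: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical, simplified stand-in for an announcement purgatory entry.
struct Entry {
    expires_at: Instant,
    soft_expired: bool,
    /// Stands in for the bare git repo on disk (assumption for illustration).
    has_bare_repo: bool,
}

#[derive(Default)]
struct AnnouncementPurgatory {
    entries: HashMap<String, Entry>,
    expired_events: HashSet<String>,
}

impl AnnouncementPurgatory {
    /// Announcement arrives: bare repo created, 30-minute timer starts.
    fn add(&mut self, event_id: &str, now: Instant) {
        self.entries.insert(
            event_id.to_string(),
            Entry { expires_at: now + DEFAULT_EXPIRY, soft_expired: false, has_bare_repo: true },
        );
    }

    /// Periodic expiry pass implementing both phases.
    fn sweep(&mut self, now: Instant) {
        let mut fully_expired = Vec::new();
        for (id, entry) in self.entries.iter_mut() {
            if now < entry.expires_at {
                continue; // not due yet
            }
            if entry.soft_expired {
                // Phase 2: the extended window has also elapsed.
                fully_expired.push(id.clone());
            } else {
                // Phase 1: drop the repo, keep the event, extend the deadline.
                entry.has_bare_repo = false;
                entry.soft_expired = true;
                entry.expires_at += SOFT_EXPIRY_EXTENDED;
            }
        }
        for id in fully_expired {
            self.entries.remove(&id);
            self.expired_events.insert(id); // prevents re-sync loops
        }
    }

    /// Late state event: recreate the repo and reset the 30-minute timer.
    /// Returns false if the entry was already fully removed.
    fn revive(&mut self, event_id: &str, now: Instant) -> bool {
        match self.entries.get_mut(event_id) {
            Some(entry) => {
                entry.has_bare_repo = true;
                entry.soft_expired = false;
                entry.expires_at = now + DEFAULT_EXPIRY;
                true
            }
            None => false,
        }
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the methods, keeps the lifecycle deterministic under test, which fits the mock-based testing approach the document mentions.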