From 70d0197e85ae4ef85202781f6d2dc9e76bd508b3 Mon Sep 17 00:00:00 2001 From: DanConwayDev Date: Wed, 24 Dec 2025 08:02:12 +0000 Subject: feat(purgatory): add broken purgatory implementation --- docs/explanation/architecture.md | 71 +++++++++++++++++++++- docs/explanation/decisions.md | 29 +++++++++ docs/explanation/purgatory-design.md | 112 +++++++++++++++++++++++++++++++++++ 3 files changed, 211 insertions(+), 1 deletion(-) (limited to 'docs/explanation') diff --git a/docs/explanation/architecture.md b/docs/explanation/architecture.md index efa3423..6da2295 100644 --- a/docs/explanation/architecture.md +++ b/docs/explanation/architecture.md @@ -239,7 +239,76 @@ pub struct RepositoryAnnouncement { ... } pub struct RepositoryState { ... } ``` -### 5. Configuration ([`src/config.rs`](src/config.rs)) +### 5. Purgatory System ([`src/purgatory/`](../../src/purgatory/)) + +The purgatory system solves the "which arrives first?" problem where either nostr events or git pushes can arrive in any order. It provides an in-memory holding area for events and git data awaiting their counterparts. + +**Design Document**: See [`purgatory-design.md`](purgatory-design.md) for complete design specifications. + +#### Architecture + +```rust +/// Main purgatory structure with two separate stores +pub struct Purgatory { + /// State events (kind 30618) indexed by repository identifier + state_events: Arc>>, + + /// PR events (kind 1617/1618) or placeholders indexed by event ID + pr_events: Arc>, +} +``` + +**Key Design Principles:** + +1. **Separate Storage**: State events and PR events use different indexing strategies + - State events: Indexed by `identifier` (multiple events can wait for same repo) + - PR events: Indexed by `event_id` (one-to-one mapping) + +2. **Late Binding**: State event refs are extracted at git push time, not event arrival + - Enables flexible matching when pushes arrive out-of-order + - Helper functions in [`helpers.rs`](../../src/purgatory/helpers.rs) handle ref extraction + +3. **Bidirectional Waiting**: Either side can arrive first + - **Event-first**: Event waits for git push + - **Git-first**: Placeholder created, waits for event + +4. **Automatic Expiry**: 30-minute default expiry, extensible during processing + - Background cleanup task runs every 60 seconds + - Removes expired entries from both stores + +#### Data Types + +See [`types.rs`](../../src/purgatory/types.rs) for complete definitions: + +- **[`RefPair`](../../src/purgatory/types.rs:16)**: Ref name + object SHA pair +- **[`StatePurgatoryEntry`](../../src/purgatory/types.rs:29)**: State event with metadata +- **[`PrPurgatoryEntry`](../../src/purgatory/types.rs:52)**: PR event or placeholder with metadata + +#### Integration Points + +**Write Policy** ([`src/nostr/policy/`](../../src/nostr/policy/)): +- State policy checks git data existence before adding to purgatory +- PR policy checks for placeholders before adding to purgatory +- Events return "purgatory: will not be served until git data arrives" message + +**Git Handlers** ([`src/git/handlers.rs`](../../src/git/handlers.rs)): +- On git push: Check purgatory for matching state events +- On refs/nostr/* push: Check purgatory for PR events or create placeholders +- Release events from purgatory when git data arrives +- Save released events to database + +**Main.rs** ([`src/main.rs`](../../src/main.rs)): +- Creates `Arc` at startup +- Passes to both write policy and git handlers +- Spawns background cleanup task (60-second interval) + +#### Thread Safety + +- Uses `Arc` for lock-free concurrent access +- Safe to share between HTTP handlers, WebSocket handlers, and background tasks +- No blocking locks in hot paths + +### 6. Configuration ([`src/config.rs`](src/config.rs)) ```rust pub struct Config { diff --git a/docs/explanation/decisions.md b/docs/explanation/decisions.md index e9b7422..c72112d 100644 --- a/docs/explanation/decisions.md +++ b/docs/explanation/decisions.md @@ -172,3 +172,32 @@ The additional complexity of parsing the Git protocol is minimal compared to the 6. ⏭️ Implement push validation logic 7. ⏭️ Integration tests 8. ⏭️ GRASP-01 compliance testing + +## Purgatory Implementation (2025-12-23) + +Implemented according to design specification in [`purgatory-design.md`](purgatory-design.md). No significant deviations from original design. + +**Implementation approach:** +- Phases 1-7 completed sequentially as planned +- All data structures match design specifications +- Integration points implemented as designed + +**Key technical choices:** + +1. **Optional Purgatory in Git Handlers**: Used `Option>` in git handler signatures for backward compatibility. This allows git handlers to function even when no purgatory is provided (e.g., in minimal test setups). + +2. **Cleanup Interval**: Background cleanup task runs every 60 seconds as designed, removing expired entries from both state and PR stores. + +3. **Thread-Safe Storage**: Used `Arc` for lock-free concurrent access, enabling safe sharing between HTTP handlers, WebSocket handlers, and background tasks. + +4. **Late Binding Implementation**: Ref extraction logic in [`helpers.rs`](../../src/purgatory/helpers.rs) extracts refs at git push time, not event arrival time, as specified in the design. + +**Test integration:** +- Existing test code in `grasp-audit` was uncommented (Phases 7) +- No new integration tests added (as instructed) +- Test verification enabled in [`grasp-audit/src/client.rs`](../../grasp-audit/src/client.rs) and [`grasp-audit/src/specs/grasp01/push_authorization.rs`](../../grasp-audit/src/specs/grasp01/push_authorization.rs) + +**Related Documentation:** +- Design: [`purgatory-design.md`](purgatory-design.md) +- Architecture: [`architecture.md`](architecture.md#5-purgatory-system-srcpurgatory) +- Implementation Plan: [`../purgatory-implementation-plan.md`](../purgatory-implementation-plan.md) diff --git a/docs/explanation/purgatory-design.md b/docs/explanation/purgatory-design.md index bf867ad..5ee4e06 100644 --- a/docs/explanation/purgatory-design.md +++ b/docs/explanation/purgatory-design.md @@ -1,5 +1,10 @@ # Purgatory Implementation Design +**Status**: ✅ Implemented (2025-12-23) +**Implementation**: Phases 1-7 complete +**Source Code**: [`src/purgatory/`](../../src/purgatory/) +**Related**: [`docs/explanation/architecture.md`](architecture.md) - System architecture overview + ## Overview Purgatory is an in-memory holding area for nostr events that depend on git data that hasn't arrived yet, **and** for git data that arrived before its corresponding nostr event. Events/placeholders are held until the other half arrives, at which point they are processed and saved to the database. @@ -899,3 +904,110 @@ WritePolicyResult::Accept 2. **Same event submitted twice** - Deduplicated by event ID 3. **Push timeout during processing** - Entry expiry extended to 15 min minimum 4. **Race between event and git push** - Whichever completes the pair triggers release + + +## Purgatory Authorization Fix (2025-12-24) + +**Critical Implementation Note**: The original purgatory design placed purgatory checking AFTER git push execution. This created a deadlock where pushes were rejected because the authorizing state event was in purgatory, not the database. + +### The Deadlock Problem + +**Original broken flow:** +1. State event arrives → No git data exists → Event stored in PURGATORY (not database) +2. Git push arrives → Authorization checks DATABASE only → No state found → **PUSH REJECTED** ❌ +3. Purgatory check runs → But push already failed, so this never helps + +### The Fix: Authorization-Time Purgatory Check + +**Correct flow (implemented):** +1. State event arrives → No git data exists → Event stored in purgatory +2. Git push arrives → Authorization checks **DATABASE + PURGATORY** → State found in purgatory → **PUSH AUTHORIZED** ✅ +3. After successful push → Save purgatory event to database → Remove from purgatory + +### Implementation Details + +#### 1. Modified [`AuthorizationResult`](../../src/git/authorization.rs:397) + +Added `from_purgatory: bool` field to track whether the authorizing state came from purgatory: + +```rust +pub struct AuthorizationResult { + pub authorized: bool, + pub reason: String, + pub state: Option, + pub maintainers: Vec, + pub from_purgatory: bool, // NEW: Track event source +} +``` + +#### 2. Enhanced [`get_authorization_for_owner()`](../../src/git/authorization.rs:342) + +Added purgatory checking when no state found in database: + +```rust +pub async fn get_authorization_for_owner( + database: &SharedDatabase, + identifier: &str, + owner_pubkey: &str, + purgatory: Option<&Arc>, + pushed_refs: &[(String, String, String)], + repo_path: &Path, +) -> Result +``` + +**Logic**: +1. Check database for state events (existing behavior) +2. If no state in database AND purgatory available: + - Parse pushed refs to RefPairs + - Get local refs from repository + - Call [`find_matching_states()`](../../src/purgatory/mod.rs:203) + - Filter to latest event from authorized authors + - Return authorization with `from_purgatory: true` + +#### 3. Post-Push Purgatory Event Save + +In [`handle_receive_pack()`](../../src/git/handlers.rs:187), after successful push: + +```rust +if from_purgatory { + if let (Some(db), Some(purg)) = (&database, &purgatory) { + // Save state event to database + db.save_event(&state.event).await?; + + // Remove from purgatory + purg.remove_state_event(identifier, &state.event.id); + } +} +``` + +### Files Modified + +1. **[`src/git/authorization.rs`](../../src/git/authorization.rs)** + - Added `from_purgatory` field to `AuthorizationResult` + - Modified `get_authorization_for_owner()` signature and logic + - Added purgatory checking when database has no state + +2. **[`src/git/handlers.rs`](../../src/git/handlers.rs)** + - Modified `authorize_push()` to accept purgatory and repo_path parameters + - Added tracking of `from_purgatory` flag + - Added post-push database save for purgatory events + +### Why This Order Matters + +Checking purgatory DURING authorization (before push execution) is critical: + +- **Prevents deadlock**: Push is authorized by purgatory state before execution +- **Maintains atomicity**: Only saves to database after successful push +- **Race condition safe**: First successful push claims the purgatory event + +The alternative (checking purgatory after push) creates an insurmountable deadlock where valid pushes are rejected because their authorizing state is in purgatory instead of the database. + +### Testing + +The fix enables the `test_push_authorized_by_owner_state` integration test scenario where: +1. State event is sent to relay (goes to purgatory - no git data yet) +2. Git push is sent (uses purgatory state for authorization) +3. State event is released from purgatory to database + +--- + -- cgit v1.2.3