upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/src/purgatory
Age | Commit message | Author
2026-02-26 | chore: apply cargo fmt and fix clippy warnings | DanConwayDev
Fix pre-existing clippy lints:
- &PathBuf -> &Path in audit_cleanup.rs
- too_many_arguments on process_newly_available_git_data, process_purgatory_announcements, and HttpService::new
- clone_on_copy for PublicKey (Copy type) in purgatory cleanup loop
2026-02-26 | fix: ignore peeled tag entries (^{}) in state event ref parsing | DanConwayDev
State events (kind 30618) can include refs/tags/<name>^{} entries which are git's notation for the dereferenced commit behind an annotated tag. These are not real git refs and are never sent as part of a push. extract_refs_from_state and RepositoryState::from_event were treating them as real refs, causing can_satisfy_state to reject valid annotated tag pushes: the would-be state after the push lacked the spurious ^{} entry, so the exact-equality check always failed.
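A minimal sketch of the filtering idea, assuming refs are handled as plain (name, oid) string pairs. The helper names `is_peeled_tag_entry` and `real_refs` are hypothetical and are not the relay's `extract_refs_from_state` or `RepositoryState::from_event`:

```rust
/// Returns true for git's peeled-tag notation, e.g. `refs/tags/v1.0^{}`.
/// Hypothetical helper, not taken from the repository.
fn is_peeled_tag_entry(ref_name: &str) -> bool {
    ref_name.ends_with("^{}")
}

/// Keeps only real refs from the (name, oid) pairs declared in a state event,
/// dropping the dereferenced `^{}` entries that are never part of a push.
fn real_refs<'a>(declared: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    declared
        .iter()
        .copied()
        .filter(|(name, _)| !is_peeled_tag_entry(name))
        .collect()
}
```

With the peeled entries dropped before comparison, the would-be state after an annotated tag push can match the declared state exactly.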
2026-02-24 | rename: fetch_repository_data -> fetch_repository_data_{excluding,with}_purgatory | DanConwayDev
The old name was ambiguous - it wasn't clear whether purgatory was included or not. The two variants are now explicitly named:
- fetch_repository_data_excluding_purgatory: DB only
- fetch_repository_data_with_purgatory: DB + purgatory overlay
The SyncContext trait method is also renamed to fetch_repository_data_with_purgatory to match the free function it delegates to.
2026-02-23 | Merge master into 3ca0-announcements-purgatory | DanConwayDev
2026-02-23 | persist and restore announcement events across graceful restarts | DanConwayDev
Extends purgatory persistence to include announcement purgatory entries. On graceful shutdown, non-soft-expired announcements are serialised to purgatory-state.json alongside state/PR/expired events; on startup they are restored, skipping any entry whose bare repo path no longer exists. Updates purgatory-design.md to reflect that purgatory persists through graceful shutdown and documents the new PurgatoryState disk format. Adds create_announcement_event helper to purgatory_helpers and three new integration tests in purgatory_persistence covering the full save/restore cycle, missing-repo skip, and the combined roundtrip with all entry types.
2026-02-23 | fix: only soft-expire announcement when bare repo deletion succeeds | DanConwayDev
If remove_dir_all fails, leave the entry untouched so the next cleanup cycle retries the deletion automatically. Previously a failed deletion would still set soft_expired=true and extend the expiry, meaning the bare repo would never be retried.
2026-02-23 | feat: implement soft expiry and revival for purgatory announcements | DanConwayDev
Two-phase expiry for announcement purgatory entries:
- Phase 1 (initial 30min timeout): delete bare repo, set soft_expired=true, extend expiry by 24h so the event is retained for potential revival
- Phase 2 (24h extended timeout): fully remove from purgatory
Revival: extend_announcement_expiry() now recreates the bare git repo when called on a soft-expired entry (triggered by state event or git auth), clearing soft_expired and resetting the expiry window.
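The two phases can be sketched as a small state machine. This is an illustration under assumptions, not the relay's code: the struct, enum, and function names are hypothetical, the 24h extension is assumed to be measured from the cleanup time, and revival is assumed to restore the 30-minute window. It also reflects the later fix that a failed repo deletion leaves the entry untouched for retry.

```rust
use std::time::{Duration, Instant};

const INITIAL_TIMEOUT: Duration = Duration::from_secs(30 * 60);
const SOFT_EXPIRY_EXTENSION: Duration = Duration::from_secs(24 * 60 * 60);

// Hypothetical entry type; only `soft_expired` is a name taken from the log.
struct AnnouncementEntry {
    expires_at: Instant,
    soft_expired: bool,
}

#[derive(Debug, PartialEq)]
enum CleanupAction {
    Keep,        // not yet due
    SoftExpired, // phase 1: repo deleted, event retained for 24h
    Remove,      // phase 2: caller drops the entry from purgatory
    RetryLater,  // repo deletion failed, entry left untouched
}

/// One cleanup pass over a single entry. `delete_repo` stands in for the
/// bare-repo removal and returns whether it succeeded.
fn cleanup_step(
    entry: &mut AnnouncementEntry,
    now: Instant,
    delete_repo: impl FnOnce() -> bool,
) -> CleanupAction {
    if now < entry.expires_at {
        return CleanupAction::Keep;
    }
    if entry.soft_expired {
        return CleanupAction::Remove;
    }
    if delete_repo() {
        entry.soft_expired = true;
        entry.expires_at = now + SOFT_EXPIRY_EXTENSION;
        CleanupAction::SoftExpired
    } else {
        CleanupAction::RetryLater
    }
}

/// Revival: clear the flag and reset the window (recreating the bare repo is
/// left to the caller in this sketch).
fn revive(entry: &mut AnnouncementEntry, now: Instant) {
    entry.soft_expired = false;
    entry.expires_at = now + INITIAL_TIMEOUT;
}
```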
2026-02-23 | fix: re-process hot-cache maintainer announcements after git push promotion | DanConwayDev
When an owner announcement is promoted from purgatory via a git push, any maintainer announcements sitting in the rejected_events_index hot cache were never re-processed. The invalidate_and_get call only existed in SyncManager::process_event_static (the nostr sync path); the git push promotion path (http -> handlers -> git::sync) had no access to the rejected_events_index at all. Thread rejected_events_index and write_policy through the git push path: - process_purgatory_announcements: after saving the promoted announcement, parse its maintainers tag and call invalidate_and_get() for each, then re-process any returned hot-cache events via admit_event + save - process_newly_available_git_data: accept optional write_policy and rejected_events_index, pass them through to process_purgatory_announcements - handle_receive_pack: accept Arc<Nip34WritePolicy> and Arc<RejectedEventsIndex>, pass them to process_newly_available_git_data - HttpService / run_server: carry the two new fields, clone into each handle_receive_pack call - main.rs: obtain rejected_events_index from sync_manager before moving it into its task; wrap write_policy in Arc for the HTTP server - RealSyncContext::process_newly_available_git_data: pass None for both new params (purgatory sync path already handles this via SyncManager::process_event_static) Also rewrite the maintainer_reprocessing integration tests to correctly exercise the hot-cache path now that announcements require git data before being released from purgatory: - Start relay_b with relay_a as bootstrap so its SyncManager syncs maintainer announcements via negentropy before the owner git push - Use push_unique_git_data_to_relay (new helper) to give each maintainer a distinct commit hash, preventing git from skipping pack transfer - Make wait_for_event_on_relay poll in a retry loop so transient timing gaps between DB write and query do not cause false negatives
2026-02-18 | fix: replace repo_sync_index wiring with purgatory announcement sync timer | DanConwayDev
Instead of threading repo_sync_index through PolicyContext/builder.rs/main.rs to handle user-submitted purgatory announcements, add a simple background timer (run_purgatory_announcement_sync, every 5s) that scans the purgatory for announcement entries and registers them in repo_sync_index as StateOnly. This is simpler and covers both flows: - Sync-path announcements: inline registration still happens during event processing (sync/mod.rs:1839+), timer provides a safety net - User-submitted announcements: SelfSubscriber never sees them (rejected from DB), timer is the primary registration path The timer calls sync_purgatory_announcements_to_index() which: 1. Snapshots purgatory via new announcements_for_sync() public method 2. Or_inserts StateOnly entries (never downgrades Full entries) 3. Detects newly added relay URLs and calls handle_new_sync_filters to connect and subscribe - fixing the failing test that expected relay discovery from a user-submitted purgatory announcement Removes: repo_sync_index field from PolicyContext, set/get_repo_sync_index methods, set_repo_sync_index on Nip34WritePolicy, wiring in main.rs, and the inline AcceptPurgatory registration block in builder.rs.
2026-02-18 | Revert "feat: upgrade repo to Full sync and trigger PR event subscription after announcement promotion" | DanConwayDev
This reverts commit d76003b629a4a03dba23a8a1c41da6e4ac4c30cf.
2026-02-18 | feat: upgrade repo to Full sync and trigger PR event subscription after announcement promotion | DanConwayDev
When git data arrives for a purgatory announcement and promotes it to the database, the relay now:
1. Upgrades the announcement's sync level in RepoSyncIndex from StateOnly to Full (git/sync.rs: process_purgatory_announcements)
2. Sends AddFilters actions to SyncManager for all connected relays, using Full sync filters (Layer 2 #a/#A/#q) to subscribe to PR events (purgatory/sync/context.rs: RealSyncContext.process_newly_available_git_data)
3. For user-submitted purgatory announcements, registers the repo in RepoSyncIndex with StateOnly level and sends AddFilters to SyncManager so it discovers and connects to relays listed in the announcement tags (nostr/builder.rs: handle_announcement AcceptPurgatory path)
The RealSyncContext now accepts optional repo_sync_index and sync_action_tx parameters. main.rs wires these up from SyncManager. PolicyContext gains repo_sync_index and sync_action_tx fields for the write policy path.
2026-02-18 | fix: break circular deadlock in sync loop by including purgatory in URL lookup | DanConwayDev
The sync loop calls fetch_repository_data() to get clone URLs so it knows where to fetch git data from. Previously this only queried the database, which means an announcement still in purgatory (no git data yet) would return no clone URLs, so the sync loop could never fetch the git data needed to promote the announcement - a circular deadlock. Fix by switching to fetch_repository_data_with_purgatory() which combines database announcements with purgatory announcements. Update the trait method's doc comment to document this behaviour. The mock implementation in tests is unaffected since it returns pre-configured data rather than delegating to either function.
2026-02-13 | fix: revert wrong sync approach for purgatory announcements | DanConwayDev
The partial fix treating ProcessResult::Purgatory as confirmed in pending_sync_index would trigger full L2/L3 sync for purgatory announcements. Per design (decision #6), purgatory announcements should only sync state events via SyncLevel::StateOnly (not yet implemented). Ignore test_archive_read_only_creates_bare_repo until SyncLevel is implemented in Phase 3.
2026-02-13 | feat: implement announcement purgatory core (breaks archive sync test) | DanConwayDev
Route new announcements to purgatory instead of accepting immediately. Announcements are promoted to the database when git data arrives, ensuring we only serve announcements for repos with actual content. Implemented: - AnnouncementPurgatoryEntry type and DashMap store - Route new announcements to purgatory (replacement announcements skip) - Promote announcements on git data arrival (process_purgatory_announcements) - Authorization checks purgatory announcements (fetch_repository_data_with_purgatory) - State policy uses purgatory announcements for maintainer validation - Cleanup task handles announcement expiry - Updated count()/cleanup() to 3-tuples Known broken: - test_archive_read_only_creates_bare_repo fails: sync module does not treat purgatory announcements as confirmed repos, so per-repo sync (state events, PRs) is never triggered for purgatory announcements - Announcement persistence (save/restore) not implemented - SyncLevel (StateOnly vs Full) not implemented - Soft expiry two-phase not implemented - Expiry extension on state event / git auth not wired up
2026-02-03 | feat: add diagnostic logging for partial state event matches | DanConwayDev
Improves observability when pushes are rejected due to state events that only partially match the pushed refs. Previously, logs only showed 'No state event found' even when state events existed but didn't match. Changes: - Add diagnose_state_mismatch() to explain why state events don't match - Log specific reasons: missing refs, wrong SHAs, or extra refs - Update rejection message to 'No matching state event found' (more accurate) - Add 4 unit tests for diagnostic function Example diagnostic output: WARN State event abc123 from authorized author doesn't match push: refs/heads/main missing (state declares 9cc3d93b) This addresses the issue where a push with only refs/heads/test was rejected because the state event also declared refs/heads/main, but logs didn't explain why the match failed.
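A rough sketch of what such a diagnostic could look like, assuming both sides are available as maps from ref name to commit SHA. The signature and the message wording for wrong SHAs and extra refs are assumptions; only the function name and the "missing" message format come from the commit message above.

```rust
use std::collections::BTreeMap;

/// Explains why the refs a state event declares do not match the refs the
/// repository would have after the push. Illustrative only.
fn diagnose_state_mismatch(
    declared: &BTreeMap<String, String>,
    after_push: &BTreeMap<String, String>,
) -> Vec<String> {
    let mut reasons = Vec::new();
    for (name, want) in declared {
        match after_push.get(name) {
            // Declared by the state event but absent after the push.
            None => reasons.push(format!("{name} missing (state declares {want})")),
            // Present, but pointing at a different commit.
            Some(have) if have != want => {
                reasons.push(format!("{name} has {have} (state declares {want})"))
            }
            Some(_) => {}
        }
    }
    // Refs that exist after the push but are not declared at all.
    for name in after_push.keys() {
        if !declared.contains_key(name) {
            reasons.push(format!("{name} not declared by state event"));
        }
    }
    reasons
}
```

For the case described above (a push containing only refs/heads/test while the state event also declares refs/heads/main), this sketch reports the single reason that refs/heads/main is missing.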
2026-02-03 | Merge relay.ngit.dev migration: bug fixes and migration tooling | DanConwayDev
This merge includes critical bug fixes and comprehensive migration tooling developed during the relay.ngit.dev migration effort. Bug Fixes: - Fix git protocol error handling to return HTTP 200 with ERR pkt-line - Fix naughty list false positives and DNS failure identification - Fix database query filters in load_existing_events (remove .since()) - Fix OID fetch tracking to distinguish 0 OIDs from successful fetches - Fix purgatory event source tracking for filtered expiry logging - Implement OID retry logic for 'not our ref' errors Migration Tools & Documentation: - Complete 5-phase migration analysis pipeline with orchestration script - Phase 1: Event fetching from source relay - Phase 2: Git sync verification - Phase 3: Categorization and relay comparison - Phase 4: Log extraction (parse failures, purgatory expiry) - Phase 5: Action classification for migration decisions - Comprehensive migration guide with lessons learned - Troubleshooting guide for permission and corruption issues Configuration: - Add NGIT_LOG_LEVEL configuration option - Update git throttle limits to 60/minute - Improve logging throughout for better observability
2026-01-28 | feat(purgatory): track event source for filtered expiry logging | DanConwayDev
Add EventSource enum (Direct/Sync) to purgatory entries to distinguish between user-submitted events and sync-fetched events. This enables:
- WARN-level logging for direct submissions that expire (user should know)
- DEBUG-level logging for sync-fetched expirations (expected behavior)
- Source upgrade from Sync→Direct if user submits after sync
- Expiry timer reset on source upgrade (fresh 30-min window for user)
The source is included in [PURGATORY_EXPIRED] logs as source=direct or source=sync for easy filtering.
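The upgrade rule might look roughly like this. `EventSource` and its variants are named in the commit message; the entry struct, the field names, and both functions are hypothetical:

```rust
use std::time::{Duration, Instant};

const EXPIRY_WINDOW: Duration = Duration::from_secs(30 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EventSource {
    Direct, // submitted by a user
    Sync,   // fetched by relay sync
}

// Hypothetical purgatory entry carrying only what this sketch needs.
struct PurgatoryEntry {
    source: EventSource,
    expires_at: Instant,
}

/// Upgrades Sync -> Direct when a user submits an event that sync already
/// placed in purgatory, and gives the user a fresh expiry window.
/// Returns true if an upgrade happened; never downgrades.
fn upgrade_source(entry: &mut PurgatoryEntry, incoming: EventSource, now: Instant) -> bool {
    if entry.source == EventSource::Sync && incoming == EventSource::Direct {
        entry.source = EventSource::Direct;
        entry.expires_at = now + EXPIRY_WINDOW;
        true
    } else {
        false
    }
}

/// Log level used when an entry expires, per the rules listed above.
fn expiry_log_level(source: EventSource) -> &'static str {
    match source {
        EventSource::Direct => "WARN",
        EventSource::Sync => "DEBUG",
    }
}
```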
2026-01-27 | fix: pass actually fetched OIDs to process_newly_available_git_data | DanConwayDev
Previously, sync_identifier_from_url passed all needed OIDs to process_newly_available_git_data, not just the OIDs that were successfully fetched. This caused incorrect logging (new_oids_count would show all needed OIDs, not just fetched ones). While this didn't break functionality (the actual processing uses can_apply_state which checks the repository on disk), it made debugging confusing. Changes: - Rename oids_fetched to fetched_oids and change type from usize to Vec<String> - Return Vec<String> from match arms instead of counts - Pass fetched_oids (not needed_oids) to process_newly_available_git_data - Return fetched_oids.len() at the end This ensures logging accurately reflects which OIDs were actually fetched from the remote.
2026-01-27 | improve logging | DanConwayDev
2026-01-27 | fix: distinguish 0 OIDs fetched from successful fetch in logging | DanConwayDev
When fetch_oids returns Ok(vec![]) (all requested OIDs missing from remote), the log message now says 'Fetch returned no OIDs (not available on remote)' instead of the misleading 'Fetch succeeded' with oids_fetched=0.
2026-01-27 | feat: implement OID retry logic for 'not our ref' errors | DanConwayDev
Add retry loop in fetch_oids that handles git's behavior of stopping at the first missing OID. When a 'not our ref' error occurs: - Parse the missing OID from stderr - Remove it from the fetch list and track it as missing - Retry with remaining OIDs until success or all OIDs exhausted This ensures we fetch all available OIDs even when some are missing from the remote, rather than failing the entire batch. Also improves error reporting: - Include URL in all error messages for easier debugging - Log stderr even when domain is already on naughty list
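The retry loop could be sketched as below. This is not the relay's `fetch_oids`: the fetch itself is abstracted as a closure that returns git's stderr on failure, and `parse_missing_oid` and `fetch_with_retry` are hypothetical names. The sketch assumes the error text contains `not our ref <oid>`.

```rust
/// Extracts the OID from an error line such as
/// `fatal: remote error: upload-pack: not our ref <oid>`.
fn parse_missing_oid(stderr: &str) -> Option<String> {
    let marker = "not our ref ";
    let start = stderr.find(marker)? + marker.len();
    let oid: String = stderr[start..]
        .chars()
        .take_while(|c| c.is_ascii_hexdigit())
        .collect();
    if oid.is_empty() { None } else { Some(oid) }
}

/// Retries the fetch, dropping one missing OID per failed attempt, until the
/// fetch succeeds or nothing is left. Returns (fetched, missing).
fn fetch_with_retry<F>(
    mut wanted: Vec<String>,
    mut fetch: F,
) -> Result<(Vec<String>, Vec<String>), String>
where
    F: FnMut(&[String]) -> Result<(), String>,
{
    let mut missing = Vec::new();
    while !wanted.is_empty() {
        match fetch(&wanted) {
            Ok(()) => return Ok((wanted, missing)),
            Err(stderr) => match parse_missing_oid(&stderr) {
                // Known-missing OID: remove it and try the rest again.
                Some(oid) if wanted.contains(&oid) => {
                    wanted.retain(|w| *w != oid);
                    missing.push(oid);
                }
                // Any other failure is propagated unchanged.
                _ => return Err(stderr),
            },
        }
    }
    Ok((Vec::new(), missing))
}
```

Because each failed attempt removes exactly one OID, the loop is bounded by the number of requested OIDs.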
2026-01-27 | Add structured logging for migration analysis | DanConwayDev
- Add [PARSE_FAIL] logging when event parsing fails - Add [PURGATORY_EXPIRED] logging when repos expire from purgatory - Logs include: kind, event_id, repo, npub, reason - Supports Phase 4 migration scripts (30-extract-*.sh) - All 382 tests pass
2026-01-23 | fix: improve 'not our ref' error messages and warn about multi-OID fetch bug | DanConwayDev
When git fetch fails with 'upload-pack: not our ref', git stops at the first missing OID and doesn't attempt to fetch remaining OIDs. This means if we request 5 OIDs and the first is missing, we never try the other 4 (which may exist on the remote). Changes: - Parse missing OID from stderr for clearer error messages - Single OID case: 'remote missing only oid requested: <oid>' - Multi OID case: Log WARNING and indicate other OIDs weren't attempted - Identifies the bug that needs retry logic to fetch OIDs individually
2026-01-14 | feat(purgatory): add persistence to survive relay restarts | DanConwayDev
Implement save/restore functionality for purgatory state to prevent event loss during relay restarts. Events in purgatory (state events, PR events, and expired events) are now saved to disk on graceful shutdown and restored on startup. Key features: - Serialize purgatory state to JSON (purgatory-state.json) - Time conversion helpers for Instant <-> Duration serialization - Restore with downtime adjustment (preserves remaining TTL) - Graceful degradation (missing/corrupted files don't crash) - File cleanup after successful restore - get_all_identifiers() for re-queueing after restore Files: - src/purgatory/persistence.rs: Time conversion helpers - src/purgatory/types.rs: Serialization derives - src/purgatory/mod.rs: save_to_disk/restore_from_disk methods Tests: 15 unit tests covering serialization, downtime, edge cases
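Since `Instant` values cannot be written to disk, one way to do the time conversion is to store the remaining TTL at save time and subtract the downtime on restore. The following is a sketch of that idea only; the function names are hypothetical, and dropping entries whose TTL ran out during downtime is an assumption rather than documented behaviour.

```rust
use std::time::{Duration, Instant};

/// At save time: how long the entry still has to live (zero if already due).
fn remaining_ttl(expires_at: Instant, now: Instant) -> Duration {
    expires_at.saturating_duration_since(now)
}

/// At restore time: rebuild the expiry from the saved remaining TTL minus the
/// time the relay was down. Returns None when nothing is left (assumed to
/// mean the entry is not restored).
fn restore_expiry(remaining: Duration, downtime: Duration, now: Instant) -> Option<Instant> {
    let left = remaining.checked_sub(downtime)?;
    if left.is_zero() {
        None
    } else {
        Some(now + left)
    }
}
```

The downtime itself would come from wall-clock timestamps recorded at shutdown and startup, which this sketch leaves out.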
2026-01-12 | fix: fetch full git history instead of shallow clones | DanConwayDev
Previously, purgatory sync was using '--depth=1' when fetching OIDs from remote servers. This created shallow clones with only 1-2 commits instead of the complete git history. The fix removes the '--depth=1' flag, allowing git to fetch the complete commit history chain when fetching specific commit OIDs. This is the correct behavior for GRASP - users cloning from our relay should get the full repository history. Changes: - Remove '--depth=1' from git fetch command in RealSyncContext::fetch_oids - Update comment to clarify that full history is fetched Impact: - Production repositories will now contain full git history - Users cloning from the relay will get complete commit chains - No more 'shallow' files in git repositories - May be slightly slower due to fetching more data, but correctness is prioritized Testing: - All 564 tests pass (276 unit + 288 integration) - No regressions in existing functionality Fixes issue documented in work/active-issues/shallow-git-fetch.md
2026-01-10 | Add naughty list for git remotes with persistent SSL/DNS errors | DanConwayDev
Implement domain-level naughty list tracking for git remotes, reusing the existing NaughtyListTracker from relay sync. This prevents repeated attempts to fetch from git domains with persistent infrastructure issues (SSL/TLS certificate errors, DNS failures). Changes: - Updated NaughtyListTracker to track both relay URLs and git domains - Added git_naughty_list field to RealSyncContext for error classification - Modified fetch_oids() to classify git fetch errors and record naughty domains - Updated sync_identifier_next_url() to filter out naughty domains during URL selection - Added git_naughty_list parameter to ThrottleManager for domain queue processing - Threaded naughty list through start_sync_loop and all sync functions - Updated all tests to pass naughty list parameter The naughty list uses 12-hour expiration (configurable) to allow domains to recover from infrastructure issues. First occurrence logs WARN, repeats log DEBUG.
2026-01-10 | fix: propagate git fetch errors instead of logging misleading success | DanConwayDev
2026-01-09 | Fix sync tests after Syncing status introduction | DanConwayDev
- Fix relay_connected() helper to check v >= 2 (Syncing/Connected states) - Fix unit test to use status value 3 (Connected) instead of 1 (Connecting) - Fix clippy warning: use .to_vec() instead of .iter().cloned().collect() All 61 sync integration tests now passing. All 238 unit tests passing. Clippy clean.
2026-01-09 | fix: MockSyncContext creates single clone tag with multiple values | DanConwayDev
The mock was creating multiple clone tags (one per URL), which violated NIP-34 format and triggered validation errors added in commit 92bfbd3. NIP-34 specifies: single clone tag with multiple values ["clone", "https://url1.com", "https://url2.com", ...] NOT multiple clone tags: ["clone", "https://url1.com"] ["clone", "https://url2.com"] This regression caused 7 purgatory::sync::functions tests to fail because RepositoryAnnouncement::from_event() now correctly rejects announcements with multiple clone tags. Fixes: - next_url_skips_throttled_domains - next_url_skips_tried_urls - next_url_filters_our_domain - next_url_with_specific_domain - get_throttled_domains_returns_only_throttled_with_untried - sync_identifier_enqueues_throttled_domains_when_incomplete - sync_identifier_tries_multiple_urls_until_complete All 232 unit tests now pass.
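To make the tag shape concrete, here is a sketch of a check that accepts a single clone tag with several values and rejects repeated clone tags. Tags are modelled as plain string vectors and `clone_urls` is a hypothetical helper; the actual validation in `RepositoryAnnouncement::from_event()` may differ.

```rust
/// Collects clone URLs from event tags, where each tag is a list of strings
/// whose first element is the tag name. Errors if more than one `clone` tag
/// is present.
fn clone_urls(tags: &[Vec<String>]) -> Result<Vec<String>, String> {
    let mut clone_tags = tags
        .iter()
        .filter(|t| t.first().map(String::as_str) == Some("clone"));
    let Some(first) = clone_tags.next() else {
        return Ok(Vec::new()); // no clone tag at all
    };
    if clone_tags.next().is_some() {
        return Err("multiple clone tags".to_string());
    }
    // Everything after the tag name is a clone URL.
    Ok(first[1..].to_vec())
}
```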
2026-01-08 | chore: upgrade nostr-* packages to rev 4767ad13 | DanConwayDev
- Update nostr-relay-builder, nostr-sdk, nostr-lmdb to latest revision - Update grasp-audit nostr-sdk dependency - Fix clippy warnings: - Replace .clone() with std::slice::from_ref() in src/git/sync.rs - Change &PathBuf to &Path in tests/common/git_server.rs - Replace vec![] with array literal in src/purgatory/sync/functions.rs - Update PR_TEST_COMMIT_HASH in grasp-audit due to event generation changes All 249 tests passing, no breaking changes required.
2026-01-08 | chore: cargo fmt | DanConwayDev
2026-01-08 | test: disable GPG signing in all test helpers | DanConwayDev
Prevent GPG signing prompts (including Yubikey activation) during test runs by explicitly disabling commit.gpgsign and tag.gpgsign in all test repository creation helpers. Modified: - tests/common/purgatory_helpers.rs: create_test_repo_with_commit() - src/git/mod.rs: create_test_repo_with_commit() - src/purgatory/helpers.rs: create_test_repo_with_commit() All test repositories now have GPG signing disabled regardless of global git configuration.
2026-01-08 | feat(purgatory): track expired events to prevent infinite re-sync loops | DanConwayDev
Adds expired event tracking to prevent proactive sync from repeatedly fetching and re-adding events that expired from purgatory without finding git data. Key features: - Track expired events for 7 days to prevent re-sync loops - Distinguish synced vs user-submitted events (via socket address) - Allow users to retry expired events (git data might now be available) - Reject synced expired events (prevents infinite loop) - Daily cleanup of expired event records older than 7 days Implementation: - Added expired_events: DashMap<EventId, Instant> to Purgatory - Updated event_ids() to include both purgatory + expired events - Added is_expired(), mark_expired(), cleanup_expired_events() - Updated cleanup() to mark expired events automatically - Added is_synced detection in WritePolicy (localhost:0 = synced) - Policy layer checks is_synced && is_expired() before rejecting Behavior: - Negentropy: Filters expired events before fetching (optimal) - REQ+EOSE: Rejects synced expired events at policy layer - User submissions: Always allowed to retry (skip expired check) Testing: - Added 5 new tests for expired event tracking - All 222 tests passing Fixes the infinite re-sync loop where events without git data would expire, get synced again, expire again, repeat forever.
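The policy check can be sketched with a plain `HashMap` standing in for the `DashMap<EventId, Instant>` mentioned above, and string IDs standing in for `EventId`. The method names follow the commit message; the struct, the `should_reject` helper, and the signatures are assumptions.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const EXPIRED_RETENTION: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Simplified stand-in for the expired-event record kept by Purgatory.
#[derive(Default)]
struct ExpiredEvents {
    expired: HashMap<String, Instant>,
}

impl ExpiredEvents {
    fn mark_expired(&mut self, event_id: &str, now: Instant) {
        self.expired.insert(event_id.to_string(), now);
    }

    fn is_expired(&self, event_id: &str) -> bool {
        self.expired.contains_key(event_id)
    }

    /// Drops records older than the retention period; returns how many were removed.
    fn cleanup_expired_events(&mut self, now: Instant) -> usize {
        let before = self.expired.len();
        self.expired
            .retain(|_, at| now.saturating_duration_since(*at) <= EXPIRED_RETENTION);
        before - self.expired.len()
    }
}

/// Synced events that already expired are rejected; user submissions skip
/// the check so they can always retry.
fn should_reject(expired: &ExpiredEvents, event_id: &str, is_synced: bool) -> bool {
    is_synced && expired.is_expired(event_id)
}
```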
2026-01-07 | fix: resolve clippy warnings | DanConwayDev
- Prefix unused variable auth_result with underscore - Prefix unused field git_data_path with underscore in Purgatory struct - Add #[allow(clippy::too_many_arguments)] to handle_receive_pack - Replace len() >= 1 with !is_empty() - Replace .last() with .next_back() on DoubleEndedIterator - Fix doc list item overindentation - Replace map_or(true, ...) with is_none_or(...) - Replace map_or(false, ...) with is_some_and(...)
2026-01-07 | feat(sync): extract clone URLs from PR events in purgatory | DanConwayDev
Add support for extracting clone URLs from PR/PR-Update events (kind 1618/1619) during purgatory sync, per NIP-34 specification. This enables fetching PR commits from URLs specified in the PR event itself, not just from repository announcement clone URLs. Changes: - Add collect_pr_clone_urls() to SyncContext trait - Implement in RealSyncContext: extract clone tags from PR events in purgatory - Implement in MockSyncContext: configurable PR clone URLs for testing - Update sync_identifier_next_url to merge PR clone URLs with announcement URLs - Update get_throttled_domains_with_untried_urls with same merge logic - Add unit tests for PR clone URL extraction and filtering
2026-01-07 | test: add test_state_event_syncs_from_remote integration test | DanConwayDev
Implements Phase 3 of the purgatory sync integration test plan. Key changes: - Add immediate sync triggering for sync-received events that go to purgatory (instead of default 3-minute delay for user-submitted events) - TestRelay now respects RUST_LOG environment variable for debugging - New test verifies end-to-end flow: state event syncs from source relay, enters purgatory, git data is fetched from source's clone URL, and event is released and served
2026-01-07 | Wire up new purgatory sync loop, remove legacy sync_state_git_data | DanConwayDev
Phase 13 of purgatory-sync-redesign: - Add sync loop startup in main.rs (RealSyncContext + ThrottleManager + start_sync_loop) - Update add_state() and add_pr() to automatically enqueue for background sync - Remove start_state_sync() call from state.rs (now handled by sync loop) - Remove orphaned legacy functions: sync_state_git_data, fetch_missing_oids_from_server, get_most_complete_local_repo, identify_missing_oids, get_date_of_most_recent_commit_on_default_branch - Clean up unused imports in purgatory/mod.rs
2026-01-07 | Add RealSyncContext implementation for production purgatory sync | DanConwayDev
Implement the production SyncContext that connects to real systems: - RealSyncContext struct holding purgatory, database, git_data_path, our_domain, and local_relay references - fetch_repository_data: delegates to git::authorization module - collect_needed_oids: collects commit hashes from state events (branches/tags) and PR events (c-tag) in purgatory - oid_exists: delegates to git::oid_exists function - fetch_oids: uses git fetch --depth=1 to retrieve specific OIDs from remote servers, running in spawn_blocking for async safety - process_newly_available_git_data: delegates to the unified function in git::sync module for consistent post-git-data processing - has_pending_events: delegates to purgatory method - find_target_repo: finds first existing owner repository on disk - our_domain: returns configured domain for clone URL filtering This enables the purgatory sync loop to use real database queries, git operations, and event processing instead of mocks.
2026-01-07 | purgatory: improve process_newly_available_git_data state event sync | DanConwayDev
2026-01-07 | Refactor handle_receive_pack to use unified process_newly_available_git_data | DanConwayDev
Replace ~100 lines of duplicated post-push processing in handle_receive_pack with a single call to the unified process_newly_available_git_data function. The unified function handles all post-git-data-available processing: - Discovering satisfiable events from purgatory (state and PR events) - Syncing OIDs to authorized owner repos - Aligning refs (+ setting HEAD) in all owner repos - Saving events to database - Notifying WebSocket subscribers - Removing from purgatory This ensures consistent behavior regardless of how git data arrives (git push vs purgatory sync fetching from remote servers). Also mark test-only internal methods with #[cfg(test)] to silence dead code warnings.
2026-01-07 | Add unified process_newly_available_git_data function | DanConwayDev
Implement the unified function that handles all post-git-data-available processing, regardless of how data arrived (git push or purgatory sync). This function: - Discovers satisfiable events from purgatory (state and PR events) - Syncs OIDs to authorized owner repos - Aligns refs and sets HEAD - Saves events to database - Notifies WebSocket subscribers - Removes from purgatory New additions: - ProcessResult struct for tracking processing outcomes - process_newly_available_git_data async function in src/git/sync.rs - Helper functions: extract_identifier_from_repo_path, extract_identifier_from_pr_event - Purgatory::find_prs_for_identifier method for PR event discovery - Unit tests for all helper functions Also fixes: - Simplified extract_domain to avoid url crate dependency - Removed unused imports in sync/loop.rs
2026-01-07 | Add background sync loop for purgatory identifier processing | DanConwayDev
Implement the main sync loop that runs in the background and processes identifiers that are ready for git data synchronization: - Runs every 1 second (hardcoded interval, not configurable) - Finds all ready identifiers where !in_progress && next_attempt <= now - Spawns parallel tasks for each ready identifier - Each task calls sync_identifier to try fetching git data from remotes - Applies backoff when sync completes but events remain in purgatory - Removes identifiers from queue when sync completes or no events remain The loop integrates with the existing sync infrastructure: - Uses SyncContext trait for testability - Uses ThrottleManager for domain-based rate limiting - Uses sync_identifier for the actual fetch orchestration This enables automatic background fetching of git data for events in purgatory, complementing the existing push-triggered sync path.
2026-01-07 | Add sync queue to Purgatory with enqueue_sync and has_pending_events | DanConwayDev
- Add sync_queue field to Purgatory struct for tracking identifiers that need background git data fetching - Implement enqueue_sync() with debouncing - resets attempt_count and updates next_attempt when new events arrive for an identifier already in queue - Add enqueue_sync_default() for user-submitted events (3 minute delay to wait for git push) - Add enqueue_sync_immediate() for sync-triggered events (500ms delay for batching burst arrivals) - Implement has_pending_events() to check if an identifier has state events or PR events in purgatory - Add helper methods: sync_queue(), remove_from_sync_queue(), sync_queue_size() - Add unit tests for debouncing behavior and pending event detection
2026-01-07 | Add sync_identifier orchestration and ThrottleManager queue processing | DanConwayDev
Implement the main sync orchestration function and trigger-based queue processing for throttled domains: sync_identifier function: - Orchestrates syncing git data for a single identifier - Tries all non-throttled URLs in sequence - Checks completion after each fetch (no pending events or all OIDs fetched) - Enqueues with throttled domains when non-throttled URLs are exhausted - Returns true if complete, false if events remain (for backoff) ThrottleManager enhancements: - Add set_context() to provide SyncContext for queue processing - Add try_process_next() to spawn tasks when capacity frees - Add process_queued_identifier() to handle queued work - Update complete_request() to trigger processing on completion - Update enqueue_identifier() to trigger processing when capacity available - Add internal methods for non-Arc testing compatibility Generic function updates: - Add ?Sized bound to sync_identifier_next_url, sync_identifier_from_url, sync_identifier, and get_throttled_domains_with_untried_urls for dynamic dispatch support (Arc<dyn SyncContext>) Tests: - sync_identifier_tries_multiple_urls_until_complete: verifies sequential URL fetching until all OIDs are available - sync_identifier_enqueues_throttled_domains_when_incomplete: verifies throttled domains get the identifier enqueued for later processing - has_queued_work_reflects_queue_state: verifies queue state tracking
2026-01-07 | Add core sync functions for identifier-based purgatory synchronization | DanConwayDev
Implement sync_identifier_next_url and sync_identifier_from_url functions that provide the core URL selection and fetch logic for purgatory sync. sync_identifier_next_url: - Pure URL selection logic with no side effects - Filters out our own domain and already-tried URLs - Respects domain throttling when domain parameter is None - Can target a specific domain when domain parameter is Some sync_identifier_from_url: - Fetches OIDs from a specific URL via the SyncContext - Tracks request start/completion with ThrottleManager for rate limiting - Calls process_newly_available_git_data on successful fetch Also adds get_throttled_domains_with_untried_urls helper for the main sync loop to know which DomainThrottle queues to enqueue identifiers to. These functions are designed to be called by both: - Main sync loop (tries non-throttled URLs immediately) - DomainThrottle queue processing (when capacity frees up) Includes 10 unit tests covering: - Throttled domain skipping - Tried URL skipping - Our domain filtering - Specific domain targeting - Fetch success/failure handling - Throttle request tracking
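The URL selection rules described above can be sketched as a pure function. This is an approximation under assumptions: the real `sync_identifier_next_url` works through SyncContext and ThrottleManager, whereas here the tried URLs and throttled domains are passed in as sets, and `next_url` and this simplified `extract_domain` are illustrative names and logic.

```rust
use std::collections::HashSet;

/// Very small host extractor: strips the scheme, then cuts at the first
/// `/` or `:`. Good enough for the sketch, not a full URL parser.
fn extract_domain(url: &str) -> Option<&str> {
    let rest = url.split_once("://").map(|(_, r)| r).unwrap_or(url);
    let host = rest.split(|c| c == '/' || c == ':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Picks the next clone URL to try, with no side effects:
/// - skips our own domain and URLs already tried
/// - with `domain = None`, skips throttled domains
/// - with `domain = Some(d)`, only considers URLs on `d`
fn next_url<'a>(
    clone_urls: &'a [String],
    our_domain: &str,
    tried: &HashSet<String>,
    throttled: &HashSet<String>,
    domain: Option<&str>,
) -> Option<&'a str> {
    clone_urls.iter().map(String::as_str).find(|url| {
        let Some(d) = extract_domain(url) else {
            return false;
        };
        if d == our_domain || tried.contains(*url) {
            return false;
        }
        match domain {
            Some(target) => d == target,
            None => !throttled.contains(d),
        }
    })
}
```

Keeping the selection free of side effects is what lets both the main sync loop and the per-domain queue processing call the same logic.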
2026-01-07 | Add SyncContext trait and MockSyncContext for purgatory sync | DanConwayDev
Implement the abstraction layer for purgatory sync operations: - SyncContext trait: defines interface for repository data fetching, OID existence checks, git fetch operations, and event processing - ProcessResult: captures outcomes when releasing events from purgatory - MockSyncContext: test mock with builder pattern for configuring: - Clone URLs and which OIDs each URL provides - Needed OIDs (simulates purgatory state) - URL failure simulation - Fetch logging for assertions The trait uses async_trait for async method support and requires Send + Sync for use in concurrent sync operations. This abstraction enables unit testing of sync logic without I/O, while the real implementation (to be added later) will connect to actual database, git, and relay systems.
2026-01-07 | Add ThrottleManager for cross-domain rate limiting | DanConwayDev
Implements ThrottleManager which manages all per-domain DomainThrottle instances and provides: - Throttle status checking via is_throttled() for sync URL selection - Request tracking via start_request()/complete_request() - Identifier queue management via enqueue_identifier() - Automatic domain throttle creation on first access - Thread-safe access via DashMap with Mutex-wrapped throttles The manager uses the configured max_concurrent and max_per_minute limits for all domains. Trigger-based queue processing (set_context, process_queued_identifier) will be added after SyncContext is available. Tests verify: - is_throttled reflects domain capacity correctly - enqueue_identifier creates domain throttle if needed - start_request creates domain throttle if needed
2026-01-07 | Add DomainThrottle for per-domain rate limiting | DanConwayDev
Implement per-domain throttling for purgatory sync operations: - Concurrent request limit (max in-flight requests per domain) - Rate limit (max requests per minute via sliding window) - Fair round-robin queue processing across identifiers - In-progress tracking to prevent duplicate fetches - Tried URL tracking per identifier Add indexmap dependency for ordered iteration in round-robin queue. Includes 6 unit tests covering: - Concurrent limit enforcement - Rate limit enforcement (sliding window) - Round-robin fair processing - In-progress identifier skipping - Round-robin index adjustment on removal - Tried URL merging on re-enqueue
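The two limits (concurrent requests and a sliding one-minute window) can be combined as in the sketch below. It covers only those two limits, not the round-robin queue or tried-URL tracking, and the field and method shapes are assumptions rather than the actual DomainThrottle API.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

const WINDOW: Duration = Duration::from_secs(60);

/// Simplified per-domain throttle: an in-flight counter plus the start times
/// of requests made within the last minute.
struct DomainThrottle {
    max_concurrent: usize,
    max_per_minute: usize,
    in_flight: usize,
    recent_starts: VecDeque<Instant>,
}

impl DomainThrottle {
    fn new(max_concurrent: usize, max_per_minute: usize) -> Self {
        Self { max_concurrent, max_per_minute, in_flight: 0, recent_starts: VecDeque::new() }
    }

    /// Forgets request starts that have left the sliding window.
    fn prune(&mut self, now: Instant) {
        while let Some(&t) = self.recent_starts.front() {
            if now.saturating_duration_since(t) >= WINDOW {
                self.recent_starts.pop_front();
            } else {
                break;
            }
        }
    }

    fn is_throttled(&mut self, now: Instant) -> bool {
        self.prune(now);
        self.in_flight >= self.max_concurrent || self.recent_starts.len() >= self.max_per_minute
    }

    /// Records a request start if capacity allows; returns whether it was allowed.
    fn start_request(&mut self, now: Instant) -> bool {
        if self.is_throttled(now) {
            return false;
        }
        self.in_flight += 1;
        self.recent_starts.push_back(now);
        true
    }

    fn complete_request(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }
}
```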
2026-01-07 | Add SyncQueueEntry with exponential backoff for purgatory sync | DanConwayDev
Implement the sync queue entry struct that tracks sync state per identifier:
- next_attempt: when the next sync should be attempted
- attempt_count: for backoff calculation (resets on new events)
- in_progress: prevents concurrent syncs for same identifier
Backoff schedule: 20s → 40s → 80s → 120s (capped at 2 minutes)
This is the foundation for the identifier-based purgatory sync system that will replace the current per-event syncing approach.
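The schedule corresponds to doubling a 20-second base and capping at 120 seconds. A minimal sketch, assuming attempt_count starts at 0 for the first retry (`backoff_delay` is a hypothetical name):

```rust
use std::time::Duration;

/// 20s, 40s, 80s, then capped at 120s for every later attempt.
fn backoff_delay(attempt_count: u32) -> Duration {
    // Clamp the shift so large attempt counts cannot overflow.
    let secs = 20u64.saturating_mul(1u64 << attempt_count.min(10));
    Duration::from_secs(secs.min(120))
}
```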
2026-01-05 | sync all repos when authorised state data push received | DanConwayDev
2026-01-05 | purgatory: git data sync applies state and saves event | DanConwayDev
2026-01-05 | purgatory: state git data sync use single command to fetch oids | DanConwayDev
2026-01-05 | purgatory: add state git data sync | DanConwayDev
2026-01-02 | sync: use purgatory | DanConwayDev
don't save new events destined for purgatory directly to db or serve them on websockets
don't download events already in purgatory via negentropy sync
2025-12-24 | feat(purgatory): add broken purgatory implementation | DanConwayDev