| Age | Commit message (Collapse) | Author |
|
Implement save/restore functionality for purgatory state to prevent
event loss during relay restarts. Events in purgatory (state events,
PR events, and expired events) are now saved to disk on graceful
shutdown and restored on startup.
Key features:
- Serialize purgatory state to JSON (purgatory-state.json)
- Time conversion helpers for Instant <-> Duration serialization
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- get_all_identifiers() for re-queueing after restore
Files:
- src/purgatory/persistence.rs: Time conversion helpers
- src/purgatory/types.rs: Serialization derives
- src/purgatory/mod.rs: save_to_disk/restore_from_disk methods
Tests: 15 unit tests covering serialization, downtime, edge cases
|
|
Previously, purgatory sync was using '--depth=1' when fetching OIDs from
remote servers. This created shallow clones with only 1-2 commits instead
of the complete git history.
The fix removes the '--depth=1' flag, allowing git to fetch the complete
commit history chain when fetching specific commit OIDs. This is the
correct behavior for GRASP - users cloning from our relay should get the
full repository history.
Changes:
- Remove '--depth=1' from git fetch command in RealSyncContext::fetch_oids
- Update comment to clarify that full history is fetched
Impact:
- Production repositories will now contain full git history
- Users cloning from the relay will get complete commit chains
- No more 'shallow' files in git repositories
- May be slightly slower due to fetching more data, but correctness is prioritized
Testing:
- All 564 tests pass (276 unit + 288 integration)
- No regressions in existing functionality
Fixes issue documented in work/active-issues/shallow-git-fetch.md
|
|
Implement domain-level naughty list tracking for git remotes, reusing the
existing NaughtyListTracker from relay sync. This prevents repeated attempts
to fetch from git domains with persistent infrastructure issues (SSL/TLS
certificate errors, DNS failures).
Changes:
- Updated NaughtyListTracker to track both relay URLs and git domains
- Added git_naughty_list field to RealSyncContext for error classification
- Modified fetch_oids() to classify git fetch errors and record naughty domains
- Updated sync_identifier_next_url() to filter out naughty domains during URL selection
- Added git_naughty_list parameter to ThrottleManager for domain queue processing
- Threaded naughty list through start_sync_loop and all sync functions
- Updated all tests to pass naughty list parameter
The naughty list uses 12-hour expiration (configurable) to allow domains to
recover from infrastructure issues. First occurrence logs WARN, repeats log DEBUG.
|
|
|
|
- Fix relay_connected() helper to check v >= 2 (Syncing/Connected states)
- Fix unit test to use status value 3 (Connected) instead of 1 (Connecting)
- Fix clippy warning: use .to_vec() instead of .iter().cloned().collect()
All 61 sync integration tests now passing.
All 238 unit tests passing.
Clippy clean.
|
|
The mock was creating multiple clone tags (one per URL), which violated
NIP-34 format and triggered validation errors added in commit 92bfbd3.
NIP-34 specifies: single clone tag with multiple values
["clone", "https://url1.com", "https://url2.com", ...]
NOT multiple clone tags:
["clone", "https://url1.com"]
["clone", "https://url2.com"]
This regression caused 7 purgatory::sync::functions tests to fail because
RepositoryAnnouncement::from_event() now correctly rejects announcements
with multiple clone tags.
Fixes:
- next_url_skips_throttled_domains
- next_url_skips_tried_urls
- next_url_filters_our_domain
- next_url_with_specific_domain
- get_throttled_domains_returns_only_throttled_with_untried
- sync_identifier_enqueues_throttled_domains_when_incomplete
- sync_identifier_tries_multiple_urls_until_complete
All 232 unit tests now pass.
|
|
- Update nostr-relay-builder, nostr-sdk, nostr-lmdb to latest revision
- Update grasp-audit nostr-sdk dependency
- Fix clippy warnings:
- Replace .clone() with std::slice::from_ref() in src/git/sync.rs
- Change &PathBuf to &Path in tests/common/git_server.rs
- Replace vec![] with array literal in src/purgatory/sync/functions.rs
- Update PR_TEST_COMMIT_HASH in grasp-audit due to event generation changes
All 249 tests passing, no breaking changes required.
|
|
|
|
Prevent GPG signing prompts (including Yubikey activation) during test runs
by explicitly disabling commit.gpgsign and tag.gpgsign in all test repository
creation helpers.
Modified:
- tests/common/purgatory_helpers.rs: create_test_repo_with_commit()
- src/git/mod.rs: create_test_repo_with_commit()
- src/purgatory/helpers.rs: create_test_repo_with_commit()
All test repositories now have GPG signing disabled regardless of global
git configuration.
|
|
Adds expired event tracking to prevent proactive sync from repeatedly
fetching and re-adding events that expired from purgatory without
finding git data.
Key features:
- Track expired events for 7 days to prevent re-sync loops
- Distinguish synced vs user-submitted events (via socket address)
- Allow users to retry expired events (git data might now be available)
- Reject synced expired events (prevents infinite loop)
- Daily cleanup of expired event records older than 7 days
Implementation:
- Added expired_events: DashMap<EventId, Instant> to Purgatory
- Updated event_ids() to include both purgatory + expired events
- Added is_expired(), mark_expired(), cleanup_expired_events()
- Updated cleanup() to mark expired events automatically
- Added is_synced detection in WritePolicy (localhost:0 = synced)
- Policy layer checks is_synced && is_expired() before rejecting
Behavior:
- Negentropy: Filters expired events before fetching (optimal)
- REQ+EOSE: Rejects synced expired events at policy layer
- User submissions: Always allowed to retry (skip expired check)
Testing:
- Added 5 new tests for expired event tracking
- All 222 tests passing
Fixes the infinite re-sync loop where events without git data would
expire, get synced again, expire again, repeat forever.
|
|
- Prefix unused variable auth_result with underscore
- Prefix unused field git_data_path with underscore in Purgatory struct
- Add #[allow(clippy::too_many_arguments)] to handle_receive_pack
- Replace len() >= 1 with !is_empty()
- Replace .last() with .next_back() on DoubleEndedIterator
- Fix doc list item overindentation
- Replace map_or(true, ...) with is_none_or(...)
- Replace map_or(false, ...) with is_some_and(...)
|
|
Add support for extracting clone URLs from PR/PR-Update events (kind 1618/1619)
during purgatory sync, per NIP-34 specification. This enables fetching PR commits
from URLs specified in the PR event itself, not just from repository announcement
clone URLs.
Changes:
- Add collect_pr_clone_urls() to SyncContext trait
- Implement in RealSyncContext: extract clone tags from PR events in purgatory
- Implement in MockSyncContext: configurable PR clone URLs for testing
- Update sync_identifier_next_url to merge PR clone URLs with announcement URLs
- Update get_throttled_domains_with_untried_urls with same merge logic
- Add unit tests for PR clone URL extraction and filtering
|
|
Implements Phase 3 of the purgatory sync integration test plan.
Key changes:
- Add immediate sync triggering for sync-received events that go to
purgatory (instead of default 3-minute delay for user-submitted events)
- TestRelay now respects RUST_LOG environment variable for debugging
- New test verifies end-to-end flow: state event syncs from source relay,
enters purgatory, git data is fetched from source's clone URL, and
event is released and served
|
|
Phase 13 of purgatory-sync-redesign:
- Add sync loop startup in main.rs (RealSyncContext + ThrottleManager + start_sync_loop)
- Update add_state() and add_pr() to automatically enqueue for background sync
- Remove start_state_sync() call from state.rs (now handled by sync loop)
- Remove orphaned legacy functions: sync_state_git_data, fetch_missing_oids_from_server,
get_most_complete_local_repo, identify_missing_oids, get_date_of_most_recent_commit_on_default_branch
- Clean up unused imports in purgatory/mod.rs
|
|
Implement the production SyncContext that connects to real systems:
- RealSyncContext struct holding purgatory, database, git_data_path,
our_domain, and local_relay references
- fetch_repository_data: delegates to git::authorization module
- collect_needed_oids: collects commit hashes from state events
(branches/tags) and PR events (c-tag) in purgatory
- oid_exists: delegates to git::oid_exists function
- fetch_oids: uses git fetch --depth=1 to retrieve specific OIDs
from remote servers, running in spawn_blocking for async safety
- process_newly_available_git_data: delegates to the unified function
in git::sync module for consistent post-git-data processing
- has_pending_events: delegates to purgatory method
- find_target_repo: finds first existing owner repository on disk
- our_domain: returns configured domain for clone URL filtering
This enables the purgatory sync loop to use real database queries,
git operations, and event processing instead of mocks.
|
|
|
|
Replace ~100 lines of duplicated post-push processing in handle_receive_pack
with a single call to the unified process_newly_available_git_data function.
The unified function handles all post-git-data-available processing:
- Discovering satisfiable events from purgatory (state and PR events)
- Syncing OIDs to authorized owner repos
- Aligning refs (+ setting HEAD) in all owner repos
- Saving events to database
- Notifying WebSocket subscribers
- Removing from purgatory
This ensures consistent behavior regardless of how git data arrives
(git push vs purgatory sync fetching from remote servers).
Also mark test-only internal methods with #[cfg(test)] to silence
dead code warnings.
|
|
Implement the unified function that handles all post-git-data-available
processing, regardless of how data arrived (git push or purgatory sync).
This function:
- Discovers satisfiable events from purgatory (state and PR events)
- Syncs OIDs to authorized owner repos
- Aligns refs and sets HEAD
- Saves events to database
- Notifies WebSocket subscribers
- Removes from purgatory
New additions:
- ProcessResult struct for tracking processing outcomes
- process_newly_available_git_data async function in src/git/sync.rs
- Helper functions: extract_identifier_from_repo_path, extract_identifier_from_pr_event
- Purgatory::find_prs_for_identifier method for PR event discovery
- Unit tests for all helper functions
Also fixes:
- Simplified extract_domain to avoid url crate dependency
- Removed unused imports in sync/loop.rs
|
|
Implement the main sync loop that runs in the background and processes
identifiers that are ready for git data synchronization:
- Runs every 1 second (hardcoded interval, not configurable)
- Finds all ready identifiers where !in_progress && next_attempt <= now
- Spawns parallel tasks for each ready identifier
- Each task calls sync_identifier to try fetching git data from remotes
- Applies backoff when sync completes but events remain in purgatory
- Removes identifiers from queue when sync completes or no events remain
The loop integrates with the existing sync infrastructure:
- Uses SyncContext trait for testability
- Uses ThrottleManager for domain-based rate limiting
- Uses sync_identifier for the actual fetch orchestration
This enables automatic background fetching of git data for events in
purgatory, complementing the existing push-triggered sync path.
|
|
- Add sync_queue field to Purgatory struct for tracking identifiers that need
background git data fetching
- Implement enqueue_sync() with debouncing - resets attempt_count and updates
next_attempt when new events arrive for an identifier already in queue
- Add enqueue_sync_default() for user-submitted events (3 minute delay to wait
for git push)
- Add enqueue_sync_immediate() for sync-triggered events (500ms delay for
batching burst arrivals)
- Implement has_pending_events() to check if an identifier has state events
or PR events in purgatory
- Add helper methods: sync_queue(), remove_from_sync_queue(), sync_queue_size()
- Add unit tests for debouncing behavior and pending event detection
|
|
Implement the main sync orchestration function and trigger-based queue
processing for throttled domains:
sync_identifier function:
- Orchestrates syncing git data for a single identifier
- Tries all non-throttled URLs in sequence
- Checks completion after each fetch (no pending events or all OIDs fetched)
- Enqueues with throttled domains when non-throttled URLs are exhausted
- Returns true if complete, false if events remain (for backoff)
ThrottleManager enhancements:
- Add set_context() to provide SyncContext for queue processing
- Add try_process_next() to spawn tasks when capacity frees
- Add process_queued_identifier() to handle queued work
- Update complete_request() to trigger processing on completion
- Update enqueue_identifier() to trigger processing when capacity available
- Add internal methods for non-Arc testing compatibility
Generic function updates:
- Add ?Sized bound to sync_identifier_next_url, sync_identifier_from_url,
sync_identifier, and get_throttled_domains_with_untried_urls for
dynamic dispatch support (Arc<dyn SyncContext>)
Tests:
- sync_identifier_tries_multiple_urls_until_complete: verifies sequential
URL fetching until all OIDs are available
- sync_identifier_enqueues_throttled_domains_when_incomplete: verifies
throttled domains get the identifier enqueued for later processing
- has_queued_work_reflects_queue_state: verifies queue state tracking
|
|
Implement sync_identifier_next_url and sync_identifier_from_url functions
that provide the core URL selection and fetch logic for purgatory sync.
sync_identifier_next_url:
- Pure URL selection logic with no side effects
- Filters out our own domain and already-tried URLs
- Respects domain throttling when domain parameter is None
- Can target a specific domain when domain parameter is Some
sync_identifier_from_url:
- Fetches OIDs from a specific URL via the SyncContext
- Tracks request start/completion with ThrottleManager for rate limiting
- Calls process_newly_available_git_data on successful fetch
Also adds get_throttled_domains_with_untried_urls helper for the main
sync loop to know which DomainThrottle queues to enqueue identifiers to.
These functions are designed to be called by both:
- Main sync loop (tries non-throttled URLs immediately)
- DomainThrottle queue processing (when capacity frees up)
Includes 10 unit tests covering:
- Throttled domain skipping
- Tried URL skipping
- Our domain filtering
- Specific domain targeting
- Fetch success/failure handling
- Throttle request tracking
|
|
Implement the abstraction layer for purgatory sync operations:
- SyncContext trait: defines interface for repository data fetching,
OID existence checks, git fetch operations, and event processing
- ProcessResult: captures outcomes when releasing events from purgatory
- MockSyncContext: test mock with builder pattern for configuring:
- Clone URLs and which OIDs each URL provides
- Needed OIDs (simulates purgatory state)
- URL failure simulation
- Fetch logging for assertions
The trait uses async_trait for async method support and requires
Send + Sync for use in concurrent sync operations.
This abstraction enables unit testing of sync logic without I/O,
while the real implementation (to be added later) will connect
to actual database, git, and relay systems.
|
|
Implements ThrottleManager which manages all per-domain DomainThrottle
instances and provides:
- Throttle status checking via is_throttled() for sync URL selection
- Request tracking via start_request()/complete_request()
- Identifier queue management via enqueue_identifier()
- Automatic domain throttle creation on first access
- Thread-safe access via DashMap with Mutex-wrapped throttles
The manager uses the configured max_concurrent and max_per_minute limits
for all domains. Trigger-based queue processing (set_context,
process_queued_identifier) will be added after SyncContext is available.
Tests verify:
- is_throttled reflects domain capacity correctly
- enqueue_identifier creates domain throttle if needed
- start_request creates domain throttle if needed
|
|
Implement per-domain throttling for purgatory sync operations:
- Concurrent request limit (max in-flight requests per domain)
- Rate limit (max requests per minute via sliding window)
- Fair round-robin queue processing across identifiers
- In-progress tracking to prevent duplicate fetches
- Tried URL tracking per identifier
Add indexmap dependency for ordered iteration in round-robin queue.
Includes 6 unit tests covering:
- Concurrent limit enforcement
- Rate limit enforcement (sliding window)
- Round-robin fair processing
- In-progress identifier skipping
- Round-robin index adjustment on removal
- Tried URL merging on re-enqueue
|
|
Implement the sync queue entry struct that tracks sync state per identifier:
- next_attempt: when the next sync should be attempted
- attempt_count: for backoff calculation (resets on new events)
- in_progress: prevents concurrent syncs for same identifier
Backoff schedule: 20s → 40s → 80s → 120s (capped at 2 minutes)
This is the foundation for the identifier-based purgatory sync system
that will replace the current per-event syncing approach.
|
|
|
|
|
|
|
|
|
|
don't save new events destined for purgatory events directly to db
or serve on websockets
don't download events already in purgatory via negentropy sync
|
|
|