| Age | Commit message (Collapse) | Author |
|
shutdown/startup
Implement save/restore functionality for rejected events cache and
integrate persistence with relay shutdown/startup lifecycle. Both
purgatory and rejected cache now survive relay restarts.
Key features:
- Serialize rejected events cache to JSON (rejected-events-cache.json)
- Save both hot cache (2min, full events) and cold index (7day, metadata)
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- Automatic restoration in SyncManager::new()
Integration:
- Shutdown hook saves both purgatory and rejected cache
- Startup hook restores both and re-queues repositories
- Non-fatal errors (logs warnings, continues on failure)
Files:
- src/sync/rejected_index.rs: save_to_disk/restore_from_disk methods
- src/sync/mod.rs: SyncManager integration and auto-restore
- src/main.rs: Shutdown/startup hooks for both caches
- tests/purgatory_persistence.rs: 17 integration tests
Tests: 13 unit tests + 17 integration tests covering full lifecycle
|
|
config methods
Refactors configuration validation to fail fast on fatal errors at startup
while gracefully handling recoverable issues (e.g., malformed whitelist entries).
Changes:
- Add Config::validate() for eager validation called immediately after load
- Remove Result<> from archive_config() and repository_config() methods
- WhitelistEntry::parse_whitelist() skips invalid entries with warnings
- Validate relay_owner_nsec format in Config::validate()
- Update all call sites to remove Result handling from config getters
Benefits:
- Fatal config errors (incompatible settings) fail at startup, not runtime
- Recoverable errors (bad whitelist entries) logged as warnings and skipped
- No Result handling scattered throughout runtime code after validation
- Config methods safe to call without error handling after validate()
Testing:
- Add 7 new tests for validation edge cases and error handling
- Total config tests: 40 (up from 33)
- All 320 library tests passing
Breaking change: Config users must call config.validate() after Config::load()
to ensure configuration is valid. This is enforced in main.rs.
|
|
Implements ngit_repositories_total metric by counting *.git directories
on disk every time /metrics is requested (~15s interval by Prometheus).
This approach is simpler than increment-on-create because:
- No need to pass metrics through the relay builder chain
- Always accurate and self-correcting
- Negligible performance impact (~100-200 dir entries)
Changes:
- Add count_repositories_on_disk() static method to Metrics
- Update Metrics::render() to count repos before encoding metrics
- Pass git_data_path to Metrics::new() in main.rs
- Consolidate metrics tests to avoid global Prometheus registry conflicts
Fixes repository count metric issue from Phase 8 deployment plan.
|
|
Implement domain-level naughty list tracking for git remotes, reusing the
existing NaughtyListTracker from relay sync. This prevents repeated attempts
to fetch from git domains with persistent infrastructure issues (SSL/TLS
certificate errors, DNS failures).
Changes:
- Updated NaughtyListTracker to track both relay URLs and git domains
- Added git_naughty_list field to RealSyncContext for error classification
- Modified fetch_oids() to classify git fetch errors and record naughty domains
- Updated sync_identifier_next_url() to filter out naughty domains during URL selection
- Added git_naughty_list parameter to ThrottleManager for domain queue processing
- Threaded naughty list through start_sync_loop and all sync functions
- Updated all tests to pass naughty list parameter
The naughty list uses 12-hour expiration (configurable) to allow domains to
recover from infrastructure issues. First occurrence logs WARN, repeats log DEBUG.
|
|
|
|
Adds expired event tracking to prevent proactive sync from repeatedly
fetching and re-adding events that expired from purgatory without
finding git data.
Key features:
- Track expired events for 7 days to prevent re-sync loops
- Distinguish synced vs user-submitted events (via socket address)
- Allow users to retry expired events (git data might now be available)
- Reject synced expired events (prevents infinite loop)
- Daily cleanup of expired event records older than 7 days
Implementation:
- Added expired_events: DashMap<EventId, Instant> to Purgatory
- Updated event_ids() to include both purgatory + expired events
- Added is_expired(), mark_expired(), cleanup_expired_events()
- Updated cleanup() to mark expired events automatically
- Added is_synced detection in WritePolicy (localhost:0 = synced)
- Policy layer checks is_synced && is_expired() before rejecting
Behavior:
- Negentropy: Filters expired events before fetching (optimal)
- REQ+EOSE: Rejects synced expired events at policy layer
- User submissions: Always allowed to retry (skip expired check)
Testing:
- Added 5 new tests for expired event tracking
- All 222 tests passing
Fixes the infinite re-sync loop where events without git data would
expire, get synced again, expire again, repeat forever.
|
|
Phase 13 of purgatory-sync-redesign:
- Add sync loop startup in main.rs (RealSyncContext + ThrottleManager + start_sync_loop)
- Update add_state() and add_pr() to automatically enqueue for background sync
- Remove start_state_sync() call from state.rs (now handled by sync loop)
- Remove orphaned legacy functions: sync_state_git_data, fetch_missing_oids_from_server,
get_most_complete_local_repo, identify_missing_oids, get_date_of_most_recent_commit_on_default_branch
- Clean up unused imports in purgatory/mod.rs
|
|
|
|
|
|
|
|
|
|
so we can more easily support grasp purgatory feature
|
|
Main lib (src/):
- Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
- Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
- Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
- Replace manual div_ceil implementations with .div_ceil(100)
Test code (tests/):
- Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
- Remove needless borrows in fetch_metrics() calls
- Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module
grasp-audit:
- Apply cargo fmt to fix formatting
|
|
Root cause: Both Metrics::new() and SyncManager::new() were trying to register
SyncMetrics with the same Prometheus registry. The second registration failed
silently, leaving SyncManager.metrics = None, so record_connection_attempt()
calls were no-ops.
Changes:
- SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
- main.rs passes already-registered sync metrics from Metrics to SyncManager
- Simplified test_connection_failure_increments_counter assertion
- Marked 3 tests as #[ignore] pending relay tracking metrics wiring
Tests fixed:
- test_connection_failure_increments_counter (now counts failures)
- test_health_state_degrades_on_failure (now tracks health state)
- test_live_sync_layer3_events (already working, confirmed)
Tests ignored (future work):
- test_live_sync_event_count
- test_multi_source_aggregate_counts
- test_relay_connected_status
|
|
|
|
|
|
Enable recursive relay discovery by broadcasting synced events to
WebSocket subscribers via LocalRelay.notify_event(). This allows the
SelfSubscriber to receive 30617 announcements synced from external
relays and discover additional relay URLs to connect to.
Changes:
- Pass LocalRelay to SyncManager::new() from main.rs
- Add local_relay field to SyncManager struct
- Call notify_event() after saving synced events to database
- Enable test_recursive_relay_discovery_syncs_announcement test
The test verifies that when relay_a syncs announcement_x from bootstrap
relay_b (which lists relay_c), relay_a discovers and connects to
relay_c to sync announcement_y.
Fixes recursive relay discovery from bootstrap sync.
|
|
|
|
- Add RelayHealthTracker with DashMap
- Implement exponential backoff (5s -> 1h max)
- Handle dead relays (24h failures -> daily retry)
- Add startup jitter to prevent thundering herd
- Add NGIT_SYNC_MAX_BACKOFF_SECS config
|
|
- Add relay discovery from stored announcements
- Implement FilterService with three-layer strategy
- Support multiple simultaneous relay connections
- Filter batching for large tag sets
|
|
- Add src/sync/ module with SyncManager
- Add NGIT_SYNC_RELAY_URL config option
- Subscribe to kind 30617 on configured relay
- Validate synced events through Nip34WritePolicy
- Integration test with two TestRelay instances
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- WebSocket-based relay using tokio-tungstenite
- Full NIP-01 protocol support (EVENT, REQ, CLOSE)
- Event validation (signature and ID)
- In-memory event storage
- Filter support (IDs, authors, kinds, since/until)
- Configuration via environment variables
- Nix flake for reproducible builds
- Test automation script
All 6 NIP-01 smoke tests passing (100%)
|