| Age | Commit message (Collapse) | Author |
|
- Add periodic health check in RelayConnection::run_event_loop that polls
nostr-sdk's relay.is_connected() every second to detect dead connections
- When event channel closes without explicit Closed/Shutdown, send
DisconnectNotification to SyncManager (fixes case where TCP drops silently)
- Enable test_relay_connected_status test which validates the
ngit_sync_relay_connected metric correctly reflects connection state
The issue was that when a remote relay stops abruptly, nostr-sdk's
notification receiver blocks indefinitely waiting for data. TCP disconnect
detection without keepalive can take minutes. The health check polls
nostr-sdk's internal relay status which detects disconnection promptly.
|
|
Root cause: Both Metrics::new() and SyncManager::new() were trying to register
SyncMetrics with the same Prometheus registry. The second registration failed
silently, leaving SyncManager.metrics = None, so record_connection_attempt()
calls were no-ops.
Changes:
- SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
- main.rs passes already-registered sync metrics from Metrics to SyncManager
- Simplified test_connection_failure_increments_counter assertion
- Marked 3 tests as #[ignore] pending relay tracking metrics wiring
Tests fixed:
- test_connection_failure_increments_counter (now counts failures)
- test_health_state_degrades_on_failure (now tracks health state)
- test_live_sync_layer3_events (already working, confirmed)
Tests ignored (future work):
- test_live_sync_event_count
- test_multi_source_aggregate_counts
- test_relay_connected_status
|
|
|
|
Changes:
- Fix connection attempt metrics: record success/failure based on actual
connection result instead of pre-emptively recording failure
- Add health tracker integration on connection failure: call
record_failure() and record_health_state() in error path
- Add connection verification in relay_connection.rs: wait 500ms after
connect() then verify is_connected() to detect silent failures
- Add configurable disconnect check interval via
NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
- Update TestRelay with fast test settings: startup_delay=0, jitter=0,
disconnect_check_interval=1s
- Add debug output to metrics tests for investigation
Note: Tests may still fail due to 5-second base backoff in health tracker.
A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter
to allow faster test cycles.
Related: metrics-wiring-plan.md Tasks 1 & 2
|
|
|
|
|
|
|
|
Enable recursive relay discovery by broadcasting synced events to
WebSocket subscribers via LocalRelay.notify_event(). This allows the
SelfSubscriber to receive 30617 announcements synced from external
relays and discover additional relay URLs to connect to.
Changes:
- Pass LocalRelay to SyncManager::new() from main.rs
- Add local_relay field to SyncManager struct
- Call notify_event() after saving synced events to database
- Enable test_recursive_relay_discovery_syncs_announcement test
The test verifies that when relay_a syncs announcement_x from bootstrap
relay_b (which lists relay_c), relay_a discovers and connects to
relay_c to sync announcement_y.
Fixes recursive relay discovery from bootstrap sync.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- Add SyncMetrics with full Prometheus integration
- Track sync gaps via catchup events
- Update Grafana dashboard with sync panels
- Document all sync configuration options
- Update design doc with implementation notes
|
|
- Add NegentropyService for set reconciliation
- Implement startup catchup with warm-up delay
- Implement reconnect catchup (last 3 days)
- Add daily catchup schedule with stagger
|
|
- Add SubscriptionManager for per-connection tracking
- Trigger subscription updates on new repo/PR events
- Implement consolidation when filter count > 150
|
|
- Add RelayHealthTracker with DashMap
- Implement exponential backoff (5s -> 1h max)
- Handle dead relays (24h failures -> daily retry)
- Add startup jitter to prevent thundering herd
- Add NGIT_SYNC_MAX_BACKOFF_SECS config
|
|
- Add relay discovery from stored announcements
- Implement FilterService with three-layer strategy
- Support multiple simultaneous relay connections
- Filter batching for large tag sets
|
|
- Add src/sync/ module with SyncManager
- Add NGIT_SYNC_RELAY_URL config option
- Subscribe to kind 30617 on configured relay
- Validate synced events through Nip34WritePolicy
- Integration test with two TestRelay instances
|