upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/src/sync
Age | Commit message | Author
2025-12-11 | refactor: use Relay::notifications() for event-driven disconnect detection | DanConwayDev
  Replace the 1-second polling loop with nostr-sdk's relay-level notification system that provides immediate disconnect detection via RelayNotification::RelayStatus.

  Key changes:
  - Use relay.notifications() instead of client.notifications()
  - Handle RelayNotification::RelayStatus { Disconnected | Terminated } to detect connection loss immediately without polling
  - Remove tokio::select! with interval timer - now uses simple match loop
  - Handle additional notification types (Authenticated, AuthenticationFailed)

  Why this is better:
  - Event-driven vs polling: no wasted CPU cycles checking every second
  - Immediate detection: disconnect triggers notification instantly
  - Uses nostr-sdk's built-in mechanism that was previously inaccessible at pool level (RelayStatus notifications are filtered out in RelayPoolNotification)

  Technical note: RelayNotification::RelayStatus is only available via Relay::notifications(), not Client::notifications(), because the pool-level broadcast filters out status change events.

  Future refactoring opportunity: Consider restructuring RelayConnection to hold a Relay directly instead of wrapping a Client, since we only manage one relay per connection anyway.
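The difference between polling and an event-driven loop can be illustrated without nostr-sdk. The following is a minimal sketch using only `std::sync::mpsc`; the `Notification`, `Status`, and `LoopExit` types and the `run_event_loop` signature are hypothetical stand-ins for illustration, not the nostr-sdk or ngit API.

```rust
use std::sync::mpsc::Receiver;

// Hypothetical stand-in for a relay status value.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Connected,
    Disconnected,
    Terminated,
}

// Hypothetical stand-in for a relay-level notification.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Notification {
    Event(String),
    Status(Status),
    Shutdown,
}

// Why the loop stopped.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum LoopExit {
    Disconnected,
    Shutdown,
    ChannelClosed,
}

// Blocks on the channel instead of waking on a timer. A status change
// ends the loop as soon as it is received; no interval check is needed.
#[allow(dead_code)]
fn run_event_loop(rx: Receiver<Notification>, handled: &mut Vec<String>) -> LoopExit {
    loop {
        match rx.recv() {
            Ok(Notification::Event(id)) => handled.push(id),
            Ok(Notification::Status(Status::Disconnected | Status::Terminated)) => {
                return LoopExit::Disconnected;
            }
            // Other status changes are not a reason to stop.
            Ok(Notification::Status(_)) => {}
            Ok(Notification::Shutdown) => return LoopExit::Shutdown,
            // Sender dropped without an explicit shutdown: treat as a lost connection.
            Err(_) => return LoopExit::ChannelClosed,
        }
    }
}
```

In this sketch a closed channel is reported separately from an explicit shutdown, so the caller can still notify a manager when the connection disappears silently.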
2025-12-11 | fix: wire up relay disconnection detection for metrics | DanConwayDev
  - Add periodic health check in RelayConnection::run_event_loop that polls nostr-sdk's relay.is_connected() every second to detect dead connections
  - When event channel closes without explicit Closed/Shutdown, send DisconnectNotification to SyncManager (fixes case where TCP drops silently)
  - Enable test_relay_connected_status test which validates the ngit_sync_relay_connected metric correctly reflects connection state

  The issue was that when a remote relay stops abruptly, nostr-sdk's notification receiver blocks indefinitely waiting for data. TCP disconnect detection without keepalive can take minutes. The health check polls nostr-sdk's internal relay status which detects disconnection promptly.
2025-12-11 | fix: resolve duplicate SyncMetrics registration preventing metrics recording | DanConwayDev
  Root cause: Both Metrics::new() and SyncManager::new() were trying to register SyncMetrics with the same Prometheus registry. The second registration failed silently, leaving SyncManager.metrics = None, so record_connection_attempt() calls were no-ops.

  Changes:
  - SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
  - main.rs passes already-registered sync metrics from Metrics to SyncManager
  - Simplified test_connection_failure_increments_counter assertion
  - Marked 3 tests as #[ignore] pending relay tracking metrics wiring

  Tests fixed:
  - test_connection_failure_increments_counter (now counts failures)
  - test_health_state_degrades_on_failure (now tracks health state)
  - test_live_sync_layer3_events (already working, confirmed)

  Tests ignored (future work):
  - test_live_sync_event_count
  - test_multi_source_aggregate_counts
  - test_relay_connected_status
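The failure mode described here (a second registration that fails and is silently turned into `None`) can be reproduced in a few lines. This is a hedged sketch with a toy `Registry` that rejects duplicate names; the `Registry` type, the collector name `"sync_metrics"`, and the single `attempts` counter are assumptions for illustration and do not reflect the real Prometheus client or the actual `SyncMetrics` fields.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Toy registry (hypothetical): a collector name may be registered only once.
#[derive(Default)]
struct Registry {
    names: HashSet<String>,
}

impl Registry {
    #[allow(dead_code)]
    fn register(&mut self, name: &str) -> Result<(), String> {
        if self.names.insert(name.to_string()) {
            Ok(())
        } else {
            Err(format!("duplicate collector: {name}"))
        }
    }
}

// Cloning shares the same underlying counter.
#[derive(Clone, Default)]
struct SyncMetrics {
    attempts: Arc<AtomicU64>,
}

impl SyncMetrics {
    #[allow(dead_code)]
    fn register(registry: &mut Registry) -> Result<Self, String> {
        registry.register("sync_metrics")?;
        Ok(Self::default())
    }
}

struct SyncManager {
    metrics: Option<SyncMetrics>,
}

impl SyncManager {
    // Takes already-registered metrics instead of a registry reference.
    #[allow(dead_code)]
    fn new(metrics: Option<SyncMetrics>) -> Self {
        Self { metrics }
    }

    // A no-op when metrics is None, which is how the bug stayed hidden.
    #[allow(dead_code)]
    fn record_connection_attempt(&self) {
        if let Some(m) = &self.metrics {
            m.attempts.fetch_add(1, Ordering::Relaxed);
        }
    }
}
```

Registering once and handing a clone to the manager means both sides update the same counter, and a missing handle becomes an explicit choice by the caller rather than a swallowed error.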
2025-12-11 | sync: add sync_base_backoff_secs config for better testing | DanConwayDev
2025-12-11 | sync: improve connection timeout handling | DanConwayDev
2025-12-11 | fix(sync): improve metrics recording and connection failure detection | DanConwayDev
  Changes:
  - Fix connection attempt metrics: record success/failure based on actual connection result instead of pre-emptively recording failure
  - Add health tracker integration on connection failure: call record_failure() and record_health_state() in error path
  - Add connection verification in relay_connection.rs: wait 500ms after connect() then verify is_connected() to detect silent failures
  - Add configurable disconnect check interval via NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
  - Update TestRelay with fast test settings: startup_delay=0, jitter=0, disconnect_check_interval=1s
  - Add debug output to metrics tests for investigation

  Note: Tests may still fail due to 5-second base backoff in health tracker. A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter to allow faster test cycles.

  Related: metrics-wiring-plan.md Tasks 1 & 2
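An interval setting such as NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS is typically read once at startup with a fallback. The sketch below shows one way to do that with the standard library only; the helper names, the fallback-on-invalid-input behavior, and the rejection of zero are assumptions, not the project's actual config code.

```rust
use std::env;
use std::time::Duration;

// Parses a whole number of seconds. Missing, non-numeric, or zero values
// fall back to `default` (assumed policy, chosen for this sketch).
#[allow(dead_code)]
fn parse_interval(raw: Option<&str>, default: Duration) -> Duration {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|secs| *secs > 0)
        .map(Duration::from_secs)
        .unwrap_or(default)
}

// Reads the env var named in the commit message; the default is supplied
// by the caller because the real default is not stated in the log.
#[allow(dead_code)]
fn disconnect_check_interval(default: Duration) -> Duration {
    let raw = env::var("NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS").ok();
    parse_interval(raw.as_deref(), default)
}
```

Keeping the parsing in a pure function makes it testable without touching process environment, which fits the fast test settings mentioned in the same commit.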
2025-12-11 | feat: add event metrics tracking throughout sync (Phase 5) | DanConwayDev
2025-12-10 | feat: add metrics field to SyncManager (Phase 2) | DanConwayDev
2025-12-10 | feat: create sync metrics module (Phase 1) | DanConwayDev
2025-12-10 | fix: enable Layer 3 sync by adding root events to pending queue | DanConwayDev
  When root events (issues/patches) are received via self-subscription, handle_root_event() was only updating the repo_sync_index directly. This caused process_batch() to early-return when pending.is_empty(), so Layer 3 filters for comments/replies were never created.

  The fix adds root events to both:
  1. repo_sync_index (for immediate availability)
  2. pending queue (to trigger Layer 3 filter creation in next batch)

  Critical: The pending entry must include relays from repo_sync_index so derive_relay_targets() knows where to send Layer 3 subscriptions.

  The Layer 3 test now verifies that events sent BEFORE the subscription is established are still synced - proving subscriptions without 'since' correctly fetch historical events.

  Enabled 4 previously ignored Layer 3 tests:
  - test_live_sync_layer3_events
  - test_layer3_sync_with_lowercase_e_tag
  - test_layer3_sync_with_uppercase_e_tag
  - test_layer3_sync_with_q_tag
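The interaction between the index, the pending queue, and the early return can be modelled in a small sketch. Everything here is a simplified assumption: the `SyncState` layout, the `PendingRoot` type, and the return value of `process_batch` are hypothetical and only mirror the behavior the commit message describes.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical pending entry: a root event plus the relays to query for replies.
struct PendingRoot {
    event_id: String,
    relays: BTreeSet<String>,
}

#[derive(Default)]
struct SyncState {
    // Simplified index: repo id -> relays known for that repo.
    repo_relays: HashMap<String, BTreeSet<String>>,
    // Root events available immediately.
    root_events: BTreeSet<String>,
    // Work that the next batch should turn into Layer 3 filters.
    pending: Vec<PendingRoot>,
}

impl SyncState {
    // Records the root event in the index AND queues it, copying the
    // relays so the batch step knows where to subscribe.
    #[allow(dead_code)]
    fn handle_root_event(&mut self, repo: &str, event_id: &str) {
        self.root_events.insert(event_id.to_string());
        let relays = self.repo_relays.get(repo).cloned().unwrap_or_default();
        self.pending.push(PendingRoot {
            event_id: event_id.to_string(),
            relays,
        });
    }

    // Returns (relay, root event id) pairs that need a Layer 3 filter.
    #[allow(dead_code)]
    fn process_batch(&mut self) -> Vec<(String, String)> {
        if self.pending.is_empty() {
            return Vec::new(); // the early return that hid the bug
        }
        let mut targets = Vec::new();
        for entry in self.pending.drain(..) {
            for relay in entry.relays {
                targets.push((relay, entry.event_id.clone()));
            }
        }
        targets
    }
}
```

If `handle_root_event` only touched `root_events`, `process_batch` would always take the early return, which matches the symptom described in the commit.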
2025-12-10 | feat(sync): broadcast synced events to WebSocket subscribers | DanConwayDev
  Enable recursive relay discovery by broadcasting synced events to WebSocket subscribers via LocalRelay.notify_event(). This allows the SelfSubscriber to receive 30617 announcements synced from external relays and discover additional relay URLs to connect to.

  Changes:
  - Pass LocalRelay to SyncManager::new() from main.rs
  - Add local_relay field to SyncManager struct
  - Call notify_event() after saving synced events to database
  - Enable test_recursive_relay_discovery_syncs_announcement test

  The test verifies that when relay_a syncs announcement_x from bootstrap relay_b (which lists relay_c), relay_a discovers and connects to relay_c to sync announcement_y.

  Fixes recursive relay discovery from bootstrap sync.
2025-12-10 | sync: fix connection registration issue | DanConwayDev
2025-12-10 | improve: count all active subscriptions in get_filter_count (IMPROVE-1) | DanConwayDev
2025-12-10 | refactor: remove insert-remove pattern in spawn_relay_connection (SIMPLIFY-3) | DanConwayDev
2025-12-10 | refactor: deduplicate SelfSubscriber select branches (SIMPLIFY-2) | DanConwayDev
2025-12-10 | refactor: remove redundant RelayAction enum (SIMPLIFY-1) | DanConwayDev
2025-12-10 | feat: add automatic reconnection with exponential backoff (IMPROVE-2) | DanConwayDev
2025-12-10 | fix: don't add 30617 announcement IDs to root_events (BUG-2) | DanConwayDev
2025-12-10 | fix: add Layer 1 re-subscription on quick reconnect (BUG-1) | DanConwayDev
2025-12-10 | sync: implement graceful shutdown for all tasks and connections | DanConwayDev
2025-12-10 | sync: enhance SelfSubscriber with reconnect and root event tracking | DanConwayDev
2025-12-10 | sync: implement relay removal for empty non-bootstrap relays | DanConwayDev
2025-12-10 | sync: implement daily timer for periodic fresh sync | DanConwayDev
2025-12-10 | sync: implement filter consolidation system | DanConwayDev
2025-12-10 | sync: complete AddFilters handler with auto-spawning | DanConwayDev
2025-12-10 | sync: implement unified connect/reconnect with since filters | DanConwayDev
2025-12-10 | sync: implement PendingBatch EOSE confirmation flow | DanConwayDev
2025-12-10 | sync: implement disconnect handler with state cleanup | DanConwayDev
2025-12-10 | sync: integrate health tracking and connection storage | DanConwayDev
2025-12-10 | sync v4 mvp | DanConwayDev
2025-12-10 | stub of sync v4 | DanConwayDev
2025-12-10 | improve sync design | DanConwayDev
2025-12-09 | sync initialize from db | DanConwayDev
2025-12-09 | basic sync stub | DanConwayDev
2025-12-08 | proposed sync change to use self subscribe to trigger everything | DanConwayDev
2025-12-05 | remove stupid tests and methods | DanConwayDev
2025-12-05 | rename sunc_bootstrap_relay_url | DanConwayDev
2025-12-05 | fix basic sync tests | DanConwayDev
2025-12-05 | sync fixes | DanConwayDev
2025-12-04 | feat(sync): Phase 6 - observability and production readiness | DanConwayDev
  - Add SyncMetrics with full Prometheus integration
  - Track sync gaps via catchup events
  - Update Grafana dashboard with sync panels
  - Document all sync configuration options
  - Update design doc with implementation notes
2025-12-04 | feat(sync): Phase 5 - negentropy catchup (NIP-77) | DanConwayDev
  - Add NegentropyService for set reconciliation
  - Implement startup catchup with warm-up delay
  - Implement reconnect catchup (last 3 days)
  - Add daily catchup schedule with stagger
2025-12-04 | feat(sync): Phase 4 - dynamic subscriptions | DanConwayDev
  - Add SubscriptionManager for per-connection tracking
  - Trigger subscription updates on new repo/PR events
  - Implement consolidation when filter count > 150
2025-12-04 | feat(sync): Phase 3 - resilience and health tracking | DanConwayDev
  - Add RelayHealthTracker with DashMap
  - Implement exponential backoff (5s -> 1h max)
  - Handle dead relays (24h failures -> daily retry)
  - Add startup jitter to prevent thundering herd
  - Add NGIT_SYNC_MAX_BACKOFF_SECS config
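A capped exponential backoff of the kind listed above (5s base, 1h maximum) can be computed with a small pure function. This is a sketch under assumptions: the commit does not state the growth factor, so doubling per consecutive failure is assumed, and `backoff_delay` is a hypothetical helper rather than the RelayHealthTracker API.

```rust
use std::time::Duration;

// Delay before the next attempt after `consecutive_failures` failures.
// Assumes the delay doubles each time: base, 2*base, 4*base, ... capped at `max`.
#[allow(dead_code)]
fn backoff_delay(base: Duration, max: Duration, consecutive_failures: u32) -> Duration {
    if consecutive_failures == 0 {
        return Duration::ZERO; // nothing has failed yet, so no wait
    }
    // checked_pow avoids overflow for large failure counts.
    let factor = 2u32
        .checked_pow(consecutive_failures - 1)
        .unwrap_or(u32::MAX);
    // checked_mul returns None on overflow; either way the result is capped.
    base.checked_mul(factor).unwrap_or(max).min(max)
}
```

With a 5-second base and a 3600-second cap, this sequence reaches the cap at the eleventh consecutive failure (5 * 1024 = 5120 seconds, clamped to 3600). Taking `base` and `max` as parameters keeps the function compatible with configurable values such as NGIT_SYNC_MAX_BACKOFF_SECS.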
2025-12-04 | feat(sync): Phase 2 - multi-relay and complete filters | DanConwayDev
  - Add relay discovery from stored announcements
  - Implement FilterService with three-layer strategy
  - Support multiple simultaneous relay connections
  - Filter batching for large tag sets
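Filter batching for large tag sets usually means splitting one long list of tag values across several filters so that no single filter grows too large. The sketch below shows only the splitting step; the function name and the per-filter limit are assumptions, since the log does not state the batch size FilterService uses.

```rust
// Splits tag values into groups of at most `max_per_filter`, preserving order.
// A limit of 0 is treated as 1 so the call never panics.
#[allow(dead_code)]
fn batch_tag_values(values: &[String], max_per_filter: usize) -> Vec<Vec<String>> {
    values
        .chunks(max_per_filter.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

Each resulting group would then become the tag list of one filter, so five values with a limit of two produce three filters.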
2025-12-04 | feat(sync): Phase 1 MVP - single relay proactive sync | DanConwayDev
  - Add src/sync/ module with SyncManager
  - Add NGIT_SYNC_RELAY_URL config option
  - Subscribe to kind 30617 on configured relay
  - Validate synced events through Nip34WritePolicy
  - Integration test with two TestRelay instances