upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/src/sync
Age  Commit message  Author
2026-01-05  purgatory: add state git data sync  DanConwayDev
2026-01-02  sync: use purgatory  DanConwayDev
    don't save new events destined for purgatory directly to db or serve on websockets
    don't download events already in purgatory via negentropy sync
2025-12-22  chore: cargo fmt and clippy  DanConwayDev
2025-12-22  chore: bump rust-nostr to latest master  DanConwayDev
    so we can more easily support grasp purgatory feature
2025-12-22  accept all UserGraspList for better discovery  DanConwayDev
2025-12-22  docs: proactive sync hand written overview rewrite and AI update of rest  DanConwayDev
2025-12-22  fix: sync consolidate subscription count  DanConwayDev
2025-12-22  sync: add req rate-limit detection and cooldown  DanConwayDev
2025-12-19  feat(sync): implement pagination for historic_sync REQ+EOSE flow  DanConwayDev
    Add automatic pagination support for non-Negentropy historic sync to handle large result sets efficiently. When a subscription receives >= 75 events, the system automatically fetches the next page using the 'until' parameter.

    Changes:
    - Add PaginationState struct to track event counts and min timestamps
    - Add pagination_state HashMap to PendingBatch for per-subscription tracking
    - Add PAGINATION_THRESHOLD constant (75 events)
    - Pass pending_sync_index to event processor for state updates
    - Track events and timestamps as they arrive
    - Check threshold on EOSE and launch follow-up subscriptions
    - Initialize pagination state when creating historic sync subscriptions
    - Update test fixtures in algorithms.rs

    The pagination continues recursively until a page returns fewer than 75 events, ensuring complete historic data retrieval without overwhelming relay limits.
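The pagination rule in this commit can be sketched as follows. This is a minimal illustration, not the repository's code: the field names, the `record_event` and `next_until` helpers, and the `PaginationMap` alias are assumptions; only `PaginationState`, `PAGINATION_THRESHOLD` (75) and the use of `until` come from the commit message. Whether the real code passes the minimum timestamp itself or one second less is not stated.

```rust
use std::collections::HashMap;

/// Page size at or above which another page is requested (75 per the commit message).
const PAGINATION_THRESHOLD: usize = 75;

/// Per-subscription counters. Field names are illustrative assumptions.
#[derive(Debug, Default)]
struct PaginationState {
    event_count: usize,
    min_created_at: Option<u64>,
}

impl PaginationState {
    /// Record one event that arrived on the subscription.
    fn record_event(&mut self, created_at: u64) {
        self.event_count += 1;
        self.min_created_at =
            Some(self.min_created_at.map_or(created_at, |m| m.min(created_at)));
    }

    /// On EOSE: the `until` value for a follow-up page, or `None` when the
    /// page was short enough that the history is treated as complete.
    fn next_until(&self) -> Option<u64> {
        if self.event_count >= PAGINATION_THRESHOLD {
            self.min_created_at
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for the per-subscription map kept on `PendingBatch`.
type PaginationMap = HashMap<String, PaginationState>;
```

A caller would look up the state by subscription id on EOSE and, if `next_until()` returns a timestamp, open a follow-up subscription with a fresh `PaginationState`.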
2025-12-19  Simplify sync metrics to track only newly saved events  DanConwayDev
    Replace broken event counting that occurred before duplicate/policy checks with accurate tracking of events that are new, accepted, and saved.

    Changes:
    - Added ProcessResult enum to track event processing outcomes
    - Modified process_event_static() to return ProcessResult
    - Replaced events_total (with source labels) with events_synced_total
    - Removed gap_events_total and event_source module
    - Removed eose_received flag (EOSE is per-subscription, not suitable)
    - Updated all tests to use new simplified API

    The new ngit_sync_events_synced_total metric only counts events that:
    1. Are new (not duplicates)
    2. Pass write policy validation
    3. Are successfully saved to database

    All 165 tests pass (124 lib + 41 integration)
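As a rough sketch of the counting rule above: the variant names, `classify`, and `SyncedCounter` are hypothetical; only the `ProcessResult` name and the three conditions for counting come from the commit message.

```rust
/// Outcome of processing one synced event. Variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProcessResult {
    Saved,
    Duplicate,
    Rejected,
    SaveFailed,
}

/// Hypothetical classification mirroring the three conditions in order:
/// new, accepted by the write policy, and saved.
fn classify(is_duplicate: bool, passes_policy: bool, saved_ok: bool) -> ProcessResult {
    if is_duplicate {
        ProcessResult::Duplicate
    } else if !passes_policy {
        ProcessResult::Rejected
    } else if !saved_ok {
        ProcessResult::SaveFailed
    } else {
        ProcessResult::Saved
    }
}

/// Stand-in for the `ngit_sync_events_synced_total` counter.
#[derive(Debug, Default)]
struct SyncedCounter {
    events_synced_total: u64,
}

impl SyncedCounter {
    /// Only events that were actually saved increment the counter.
    fn record(&mut self, result: ProcessResult) {
        if result == ProcessResult::Saved {
            self.events_synced_total += 1;
        }
    }
}
```

Counting after classification, rather than on receipt, is what keeps duplicates and rejected events out of the total.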
2025-12-19  sync: fix autoclose on EOSE for historic filters  DanConwayDev
2025-12-19  refactor: rename connect_and_subscribe to connect  DanConwayDev
    Separated connection from subscription logic. The RelayConnection.connect() method now only handles WebSocket connection establishment. Subscriptions are managed separately via handle_connect_or_reconnect.

    Changes:
    - Renamed RelayConnection::connect_and_subscribe() to connect()
    - Removed subscription logic from connect method
    - Updated call site in try_connect_relay()
    - Removed unused build_announcement_filter import
2025-12-19  Fix: Capture old_last_connected before updating state  DanConwayDev
    Bug: handle_connect_or_reconnect() was incorrectly calling quick_reconnect() on first connections instead of fresh_start().

    Root cause: The code updated last_connected = Some(now) at line 808, then immediately read it back at line 932 to make the reconnection decision. This meant first connections saw elapsed = now - now = 0 seconds, which triggered quick_reconnect() instead of fresh_start().

    Fix: Capture old_last_connected BEFORE updating the state, then use that value for the reconnection decision. Now first connections correctly see None and call fresh_start().

    Impact:
    - First connections now properly use fresh_start() with full historic sync
    - Short disconnections (< 15 min) use quick_reconnect() with since filter
    - Long disconnections (> 15 min) use fresh_start() with full resync

    All 41 sync tests passing.
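The fix described here amounts to reading the previous timestamp before overwriting it. A minimal sketch, assuming a simplified `RelayState` and a hypothetical `ReconnectPlan` enum in place of the real `fresh_start()` / `quick_reconnect()` calls; the 15-minute window is from the commit message:

```rust
use std::time::{Duration, Instant};

/// Disconnections shorter than this use the quick path (15 min per the commit message).
const QUICK_RECONNECT_WINDOW: Duration = Duration::from_secs(15 * 60);

/// Hypothetical stand-in for choosing between fresh_start() and quick_reconnect().
#[derive(Debug, PartialEq, Eq)]
enum ReconnectPlan {
    FreshStart,
    QuickReconnect,
}

/// Simplified relay state; the real struct holds more than this.
#[derive(Debug, Default)]
struct RelayState {
    last_connected: Option<Instant>,
}

fn handle_connect_or_reconnect(state: &mut RelayState, now: Instant) -> ReconnectPlan {
    // `Option::replace` stores `now` and hands back the previous value, so the
    // decision below never sees the timestamp that was just written.
    let old_last_connected = state.last_connected.replace(now);
    match old_last_connected {
        None => ReconnectPlan::FreshStart,
        Some(prev) if now.duration_since(prev) < QUICK_RECONNECT_WINDOW => {
            ReconnectPlan::QuickReconnect
        }
        Some(_) => ReconnectPlan::FreshStart,
    }
}
```

Passing `now` in as a parameter is a choice made for this sketch so the decision can be tested without sleeping.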
2025-12-19  fix: prevent CLOSED messages from terminating relay connections  DanConwayDev
    The system was incorrectly treating subscription-specific CLOSED messages as connection-wide disconnects, causing live subscriptions to be terminated immediately after historic_sync completed.

    Two bugs fixed:
    1. relay_connection.rs: Removed break on RelayMessage::Closed - it's subscription-specific, not connection-wide
    2. mod.rs: Removed disconnect handling for RelayEvent::Closed - only log at DEBUG level and continue

    All 41 sync tests now pass including previously failing live sync tests.
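To illustrate the distinction, here is a simplified event loop. The `RelayMessage` enum below is a stand-in defined for this sketch, not the nostr-sdk type, and `run_event_loop` is reduced to draining a list of messages:

```rust
/// Simplified stand-in for relay messages; not the nostr-sdk types.
#[derive(Debug)]
enum RelayMessage {
    Event { subscription_id: String },
    Eose { subscription_id: String },
    Closed { subscription_id: String },
    Disconnected,
}

/// Drain messages until the connection itself ends.
/// Returns the number of events handled and the ids of closed subscriptions.
fn run_event_loop(messages: Vec<RelayMessage>) -> (usize, Vec<String>) {
    let mut events = 0;
    let mut closed = Vec::new();
    for message in messages {
        match message {
            RelayMessage::Event { .. } => events += 1,
            RelayMessage::Eose { .. } => {}
            // CLOSED ends one subscription only: note it and keep looping.
            RelayMessage::Closed { subscription_id } => closed.push(subscription_id),
            // Only a connection-level signal stops the loop.
            RelayMessage::Disconnected => break,
        }
    }
    (events, closed)
}
```

With a `break` in the `Closed` arm, a live event arriving after the historic subscription closed would never be counted, which is the failure the commit describes.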
2025-12-19  sync: negentropy fixes  DanConwayDev
2025-12-18  sync: turn off negentropy and fix some tests  DanConwayDev
2025-12-18  sync: fix sync connection  DanConwayDev
2025-12-18  sync: new connection logic  DanConwayDev
2025-12-18  sync removing dead code  DanConwayDev
2025-12-16  proactive sync prep - some helper functions written but not enabled  DanConwayDev
2025-12-12  fix: remove misleading fallback claim from negentropy sync error log  DanConwayDev
    The log message claimed 'will fall back to REQ+EOSE' but no such fallback was implemented - the function simply returns 0 and exits.
2025-12-12  fix: unify sync state tracking for negentropy and REQ+EOSE paths  DanConwayDev
    When negentropy (NIP-77) sync was enabled, the RelaySyncIndex was never updated to reflect historical sync completion. This caused the three-way diff algorithm in compute_actions() to malfunction, leading to:
    - Repeated sync attempts for the same items
    - Incorrect filter counting for consolidation
    - Potential premature relay disconnection

    This fix unifies both sync paths (REQ+EOSE and Negentropy) through a consistent PendingBatch flow:
    1. Added SyncMethod enum to distinguish between sync types
    2. Updated PendingBatch struct to include sync_method field
    3. Extracted confirm_batch() method for unified batch confirmation
    4. Modified negentropy_sync_and_process() to:
       - Create a PendingBatch before sync
       - Add batch to pending_sync_index
       - On success: Remove batch and call confirm_batch()
       - On failure: Remove batch without confirming

    The confirm_batch() method moves repos and root_events from the batch to the RelayState.repos and RelayState.root_events, ensuring the three-way diff works correctly regardless of sync method.

    Closes: negentropy-sync-state-tracking.md
2025-12-11  sync: remove reply kind from sync filters for root events  DanConwayDev
    they are legacy and not root events
2025-12-11  fix: resolve all fmt and clippy warnings  DanConwayDev
    Main lib (src/):
    - Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
    - Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
    - Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
    - Replace manual div_ceil implementations with .div_ceil(100)

    Test code (tests/):
    - Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
    - Remove needless borrows in fetch_metrics() calls
    - Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module

    grasp-audit:
    - Apply cargo fmt to fix formatting
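The three clippy-driven replacements listed above look roughly like this in isolation. The `RelaySyncNeeds` field and the helper functions `add_repo`, `batch_count`, and `parse_port` are hypothetical, chosen only to give each idiom a place to live:

```rust
use std::collections::HashMap;

/// Illustrative stand-in; the real RelaySyncNeeds has its own fields.
#[derive(Debug, Default)]
struct RelaySyncNeeds {
    repos: Vec<String>,
}

fn add_repo(needs: &mut HashMap<String, RelaySyncNeeds>, relay: &str, repo: &str) {
    // `.or_default()` replaces `.or_insert_with(RelaySyncNeeds::default)`.
    needs
        .entry(relay.to_string())
        .or_default()
        .repos
        .push(repo.to_string());
}

fn batch_count(items: usize) -> usize {
    // `.div_ceil(100)` replaces the manual `(items + 99) / 100`.
    items.div_ceil(100)
}

fn parse_port(raw: &str) -> u16 {
    // `unwrap_or_else` builds the panic message only on failure, whereas
    // `.expect(&format!(...))` allocates the string on every call.
    raw.parse::<u16>()
        .unwrap_or_else(|_| panic!("invalid port: {raw}"))
}
```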
2025-12-11  sync: test sync works without negentropy and add disable option in sync  DanConwayDev
2025-12-11  feat: implement NIP-77 negentropy sync for historical data  DanConwayDev
    Replace EOSE-based sync completion with negentropy reconciliation for:
    - Initial connect (fresh sync)
    - Daily sync (Layer 1 announcements)
    - Stale reconnect (>15 min)

    Key changes:
    - Add NegentropySyncResult struct with remote_only, local_only, received fields
    - Add supports_negentropy() using try-and-fallback approach
    - Add negentropy_sync_filter() using nostr-sdk client.sync() API
    - Modify handle_connect_or_reconnect() to use negentropy for fresh/stale sync
    - Modify daily_sync() to use negentropy for Layer 1
    - Single-warning logging per relay when negentropy fails

    Quick reconnects (<15 min) unchanged - still use REQ with since filter. If negentropy unsupported, gracefully falls back to REQ+EOSE flow.
2025-12-11  docs: simplify grasp-02 doc  DanConwayDev
2025-12-11  fix docs  DanConwayDev
2025-12-11  fix(sync): add Layer 1 re-subscription to daily_sync()  DanConwayDev
    - Add Layer 1 (announcements) re-subscription in daily_sync() after unsubscribe_all() to ensure kinds 30617+30618 are re-established
    - Clarify comments in handle_connect_or_reconnect() explaining that Layer 1 subscription is established during connect_and_subscribe()

    Addresses implementation gaps from design vs implementation report:
    - Gap 1: Comments clarified (Layer 1 handled by connect_and_subscribe)
    - Gap 2: daily_sync() now re-subscribes to Layer 1 without since filter
    - Gap 3: consolidate() already had Layer 1 re-subscription (no change)

    All 125 unit tests and integration tests pass.
2025-12-11  fix: sync metrics aggregate relay counts  DanConwayDev
2025-12-11  fix: classify sync events as startup/live based on EOSE, not relay type  DanConwayDev
    Previously, events were classified as 'startup' or 'live' based on whether they came from a bootstrap relay (is_bootstrap flag). This meant ALL events from bootstrap relays were counted as 'startup', even events received after the initial sync completed.

    Now events are classified based on whether EOSE (End Of Stored Events) has been received for that connection:
    - Events BEFORE EOSE → 'startup' (historical events during initial sync)
    - Events AFTER EOSE → 'live' (new events via real-time subscription)

    This enables the test_live_sync_event_count test which validates that events received after sync connection is established are counted as live events.

    Also removed the #[ignore] attribute from test_live_sync_event_count since the metrics are now properly wired up.
2025-12-11  docs(sync): document why RelayConnection uses Client instead of Relay directly  DanConwayDev
    nostr-sdk 0.44's Relay::new() is pub(crate), making it impossible to construct a Relay directly from outside the crate. Relays can only be created through Client::add_relay() or RelayPool::add_relay().

    This commit:
    - Adds 'Why Client instead of Relay directly?' section to struct docs
    - Updates run_event_loop() docs to explain the API constraint
    - Removes outdated 'Future Refactoring' suggestion (not feasible)
2025-12-11  refactor: use Relay::notifications() for event-driven disconnect detection  DanConwayDev
    Replace the 1-second polling loop with nostr-sdk's relay-level notification system that provides immediate disconnect detection via RelayNotification::RelayStatus.

    Key changes:
    - Use relay.notifications() instead of client.notifications()
    - Handle RelayNotification::RelayStatus { Disconnected | Terminated } to detect connection loss immediately without polling
    - Remove tokio::select! with interval timer - now uses simple match loop
    - Handle additional notification types (Authenticated, AuthenticationFailed)

    Why this is better:
    - Event-driven vs polling: no wasted CPU cycles checking every second
    - Immediate detection: disconnect triggers notification instantly
    - Uses nostr-sdk's built-in mechanism that was previously inaccessible at pool level (RelayStatus notifications are filtered out in RelayPoolNotification)

    Technical note: RelayNotification::RelayStatus is only available via Relay::notifications(), not Client::notifications(), because the pool-level broadcast filters out status change events.

    Future refactoring opportunity: Consider restructuring RelayConnection to hold a Relay directly instead of wrapping a Client, since we only manage one relay per connection anyway.
2025-12-11  fix: wire up relay disconnection detection for metrics  DanConwayDev
    - Add periodic health check in RelayConnection::run_event_loop that polls nostr-sdk's relay.is_connected() every second to detect dead connections
    - When event channel closes without explicit Closed/Shutdown, send DisconnectNotification to SyncManager (fixes case where TCP drops silently)
    - Enable test_relay_connected_status test which validates the ngit_sync_relay_connected metric correctly reflects connection state

    The issue was that when a remote relay stops abruptly, nostr-sdk's notification receiver blocks indefinitely waiting for data. TCP disconnect detection without keepalive can take minutes. The health check polls nostr-sdk's internal relay status which detects disconnection promptly.
2025-12-11  fix: resolve duplicate SyncMetrics registration preventing metrics recording  DanConwayDev
    Root cause: Both Metrics::new() and SyncManager::new() were trying to register SyncMetrics with the same Prometheus registry. The second registration failed silently, leaving SyncManager.metrics = None, so record_connection_attempt() calls were no-ops.

    Changes:
    - SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
    - main.rs passes already-registered sync metrics from Metrics to SyncManager
    - Simplified test_connection_failure_increments_counter assertion
    - Marked 3 tests as #[ignore] pending relay tracking metrics wiring

    Tests fixed:
    - test_connection_failure_increments_counter (now counts failures)
    - test_health_state_degrades_on_failure (now tracks health state)
    - test_live_sync_layer3_events (already working, confirmed)

    Tests ignored (future work):
    - test_live_sync_event_count
    - test_multi_source_aggregate_counts
    - test_relay_connected_status
2025-12-11  sync: add sync_base_backoff_secs config for better testing  DanConwayDev
2025-12-11  sync: improve connection timeout handling  DanConwayDev
2025-12-11  fix(sync): improve metrics recording and connection failure detection  DanConwayDev
    Changes:
    - Fix connection attempt metrics: record success/failure based on actual connection result instead of pre-emptively recording failure
    - Add health tracker integration on connection failure: call record_failure() and record_health_state() in error path
    - Add connection verification in relay_connection.rs: wait 500ms after connect() then verify is_connected() to detect silent failures
    - Add configurable disconnect check interval via NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
    - Update TestRelay with fast test settings: startup_delay=0, jitter=0, disconnect_check_interval=1s
    - Add debug output to metrics tests for investigation

    Note: Tests may still fail due to 5-second base backoff in health tracker. A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter to allow faster test cycles.

    Related: metrics-wiring-plan.md Tasks 1 & 2
2025-12-11  feat: add event metrics tracking throughout sync (Phase 5)  DanConwayDev
2025-12-10  feat: add metrics field to SyncManager (Phase 2)  DanConwayDev
2025-12-10  feat: create sync metrics module (Phase 1)  DanConwayDev
2025-12-10  fix: enable Layer 3 sync by adding root events to pending queue  DanConwayDev
    When root events (issues/patches) are received via self-subscription, handle_root_event() was only updating the repo_sync_index directly. This caused process_batch() to early-return when pending.is_empty(), so Layer 3 filters for comments/replies were never created.

    The fix adds root events to both:
    1. repo_sync_index (for immediate availability)
    2. pending queue (to trigger Layer 3 filter creation in next batch)

    Critical: The pending entry must include relays from repo_sync_index so derive_relay_targets() knows where to send Layer 3 subscriptions.

    The Layer 3 test now verifies that events sent BEFORE the subscription is established are still synced - proving subscriptions without 'since' correctly fetch historical events.

    Enabled 4 previously ignored Layer 3 tests:
    - test_live_sync_layer3_events
    - test_layer3_sync_with_lowercase_e_tag
    - test_layer3_sync_with_uppercase_e_tag
    - test_layer3_sync_with_q_tag
2025-12-10  feat(sync): broadcast synced events to WebSocket subscribers  DanConwayDev
    Enable recursive relay discovery by broadcasting synced events to WebSocket subscribers via LocalRelay.notify_event(). This allows the SelfSubscriber to receive 30617 announcements synced from external relays and discover additional relay URLs to connect to.

    Changes:
    - Pass LocalRelay to SyncManager::new() from main.rs
    - Add local_relay field to SyncManager struct
    - Call notify_event() after saving synced events to database
    - Enable test_recursive_relay_discovery_syncs_announcement test

    The test verifies that when relay_a syncs announcement_x from bootstrap relay_b (which lists relay_c), relay_a discovers and connects to relay_c to sync announcement_y.

    Fixes recursive relay discovery from bootstrap sync.
2025-12-10  sync: fix connection registration issue  DanConwayDev
2025-12-10  improve: count all active subscriptions in get_filter_count (IMPROVE-1)  DanConwayDev
2025-12-10  refactor: remove insert-remove pattern in spawn_relay_connection (SIMPLIFY-3)  DanConwayDev
2025-12-10  refactor: deduplicate SelfSubscriber select branches (SIMPLIFY-2)  DanConwayDev
2025-12-10  refactor: remove redundant RelayAction enum (SIMPLIFY-1)  DanConwayDev
2025-12-10  feat: add automatic reconnection with exponential backoff (IMPROVE-2)  DanConwayDev
2025-12-10  fix: don't add 30617 announcement IDs to root_events (BUG-2)  DanConwayDev
2025-12-10  fix: add Layer 1 re-subscription on quick reconnect (BUG-1)  DanConwayDev
2025-12-10  sync: implement graceful shutdown for all tasks and connections  DanConwayDev
2025-12-10  sync: enhance SelfSubscriber with reconnect and root event tracking  DanConwayDev
2025-12-10  sync: implement relay removal for empty non-bootstrap relays  DanConwayDev
2025-12-10  sync: implement daily timer for periodic fresh sync  DanConwayDev
2025-12-10  sync: implement filter consolidation system  DanConwayDev
2025-12-10  sync: complete AddFilters handler with auto-spawning  DanConwayDev
2025-12-10  sync: implement unified connect/reconnect with since filters  DanConwayDev
2025-12-10  sync: implement PendingBatch EOSE confirmation flow  DanConwayDev
2025-12-10  sync: implement disconnect handler with state cleanup  DanConwayDev
2025-12-10  sync: integrate health tracking and connection storage  DanConwayDev
2025-12-10  sync v4 mvp  DanConwayDev
2025-12-10  stub of sync v4  DanConwayDev
2025-12-10  improve sync design  DanConwayDev
2025-12-09  sync initialize from db  DanConwayDev
2025-12-09  basic sync stub  DanConwayDev
2025-12-08  proposed sync change to use self subscribe to trigger everything  DanConwayDev
2025-12-05  remove stupid tests and methods  DanConwayDev
2025-12-05  rename sunc_bootstrap_relay_url  DanConwayDev
2025-12-05  fix basic sync tests  DanConwayDev
2025-12-05  sync fixes  DanConwayDev
2025-12-04  feat(sync): Phase 6 - observability and production readiness  DanConwayDev
    - Add SyncMetrics with full Prometheus integration
    - Track sync gaps via catchup events
    - Update Grafana dashboard with sync panels
    - Document all sync configuration options
    - Update design doc with implementation notes
2025-12-04  feat(sync): Phase 5 - negentropy catchup (NIP-77)  DanConwayDev
    - Add NegentropyService for set reconciliation
    - Implement startup catchup with warm-up delay
    - Implement reconnect catchup (last 3 days)
    - Add daily catchup schedule with stagger
2025-12-04  feat(sync): Phase 4 - dynamic subscriptions  DanConwayDev
    - Add SubscriptionManager for per-connection tracking
    - Trigger subscription updates on new repo/PR events
    - Implement consolidation when filter count > 150
2025-12-04  feat(sync): Phase 3 - resilience and health tracking  DanConwayDev
    - Add RelayHealthTracker with DashMap
    - Implement exponential backoff (5s -> 1h max)
    - Handle dead relays (24h failures -> daily retry)
    - Add startup jitter to prevent thundering herd
    - Add NGIT_SYNC_MAX_BACKOFF_SECS config
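The backoff schedule (5s -> 1h max) could be computed along these lines. The doubling factor, the constant names, and the `backoff_delay` function are assumptions made for this sketch; the commit only states the base and the cap, and that the cap is configurable via NGIT_SYNC_MAX_BACKOFF_SECS.

```rust
use std::time::Duration;

/// Base delay from the commit message (5s).
const BASE_BACKOFF_SECS: u64 = 5;
/// Cap from the commit message (1h); the real value is configurable.
const MAX_BACKOFF_SECS: u64 = 60 * 60;

/// Delay before the next attempt after `consecutive_failures` failures.
/// Doubling per failure is an assumption, not taken from the source.
fn backoff_delay(consecutive_failures: u32) -> Duration {
    // Clamp the exponent so the shift below cannot overflow.
    let exponent = consecutive_failures.saturating_sub(1).min(32);
    let factor = 1u64 << exponent;
    let secs = BASE_BACKOFF_SECS
        .saturating_mul(factor)
        .min(MAX_BACKOFF_SECS);
    Duration::from_secs(secs)
}
```

Under these assumptions the delay reaches the one-hour cap at the eleventh consecutive failure (5 * 1024 s would exceed 3600 s).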
2025-12-04  feat(sync): Phase 2 - multi-relay and complete filters  DanConwayDev
    - Add relay discovery from stored announcements
    - Implement FilterService with three-layer strategy
    - Support multiple simultaneous relay connections
    - Filter batching for large tag sets
2025-12-04  feat(sync): Phase 1 MVP - single relay proactive sync  DanConwayDev
    - Add src/sync/ module with SyncManager
    - Add NGIT_SYNC_RELAY_URL config option
    - Subscribe to kind 30617 on configured relay
    - Validate synced events through Nip34WritePolicy
    - Integration test with two TestRelay instances