| Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
Add automatic pagination support for non-Negentropy historic sync to handle
large result sets efficiently. When a subscription receives >= 75 events,
the system automatically fetches the next page using the 'until' parameter.
Changes:
- Add PaginationState struct to track event counts and min timestamps
- Add pagination_state HashMap to PendingBatch for per-subscription tracking
- Add PAGINATION_THRESHOLD constant (75 events)
- Pass pending_sync_index to event processor for state updates
- Track events and timestamps as they arrive
- Check threshold on EOSE and launch follow-up subscriptions
- Initialize pagination state when creating historic sync subscriptions
- Update test fixtures in algorithms.rs
The pagination continues recursively until a page returns fewer than 75 events,
ensuring complete historic data retrieval without overwhelming relay limits.
|
|
|
|
Separated connection from subscription logic. The RelayConnection.connect()
method now only handles WebSocket connection establishment. Subscriptions
are managed separately via handle_connect_or_reconnect.
Changes:
- Renamed RelayConnection::connect_and_subscribe() to connect()
- Removed subscription logic from connect method
- Updated call site in try_connect_relay()
- Removed unused build_announcement_filter import
|
|
The system was incorrectly treating subscription-specific CLOSED messages
as connection-wide disconnects, causing live subscriptions to be terminated
immediately after historic_sync completed.
Two bugs fixed:
1. relay_connection.rs: Removed break on RelayMessage::Closed - it's
subscription-specific, not connection-wide
2. mod.rs: Removed disconnect handling for RelayEvent::Closed - only log
at DEBUG level and continue
All 41 sync tests now pass including previously failing live sync tests.
|
|
|
|
Main lib (src/):
- Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
- Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
- Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
- Replace manual div_ceil implementations with .div_ceil(100)
Test code (tests/):
- Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
- Remove needless borrows in fetch_metrics() calls
- Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module
grasp-audit:
- Apply cargo fmt to fix formatting
|
|
Replace EOSE-based sync completion with negentropy reconciliation for:
- Initial connect (fresh sync)
- Daily sync (Layer 1 announcements)
- Stale reconnect (>15 min)
Key changes:
- Add NegentropySyncResult struct with remote_only, local_only, received fields
- Add supports_negentropy() using try-and-fallback approach
- Add negentropy_sync_filter() using nostr-sdk client.sync() API
- Modify handle_connect_or_reconnect() to use negentropy for fresh/stale sync
- Modify daily_sync() to use negentropy for Layer 1
- Single-warning logging per relay when negentropy fails
Quick reconnects (<15 min) unchanged - still use REQ with since filter.
If negentropy unsupported, gracefully falls back to REQ+EOSE flow.
|
|
nostr-sdk 0.44's Relay::new() is pub(crate), making it impossible to
construct a Relay directly from outside the crate. Relays can only be
created through Client::add_relay() or RelayPool::add_relay().
This commit:
- Adds 'Why Client instead of Relay directly?' section to struct docs
- Updates run_event_loop() docs to explain the API constraint
- Removes outdated 'Future Refactoring' suggestion (not feasible)
|
|
Replace the 1-second polling loop with nostr-sdk's relay-level notification
system that provides immediate disconnect detection via RelayNotification::RelayStatus.
Key changes:
- Use relay.notifications() instead of client.notifications()
- Handle RelayNotification::RelayStatus { Disconnected | Terminated } to detect
connection loss immediately without polling
- Remove tokio::select! with interval timer - now uses simple match loop
- Handle additional notification types (Authenticated, AuthenticationFailed)
Why this is better:
- Event-driven vs polling: no wasted CPU cycles checking every second
- Immediate detection: disconnect triggers notification instantly
- Uses nostr-sdk's built-in mechanism that was previously inaccessible at pool level
(RelayStatus notifications are filtered out in RelayPoolNotification)
Technical note: RelayNotification::RelayStatus is only available via
Relay::notifications(), not Client::notifications(), because the pool-level
broadcast filters out status change events.
Future refactoring opportunity: Consider restructuring RelayConnection to hold
a Relay directly instead of wrapping a Client, since we only manage one relay
per connection anyway.
|
|
- Add periodic health check in RelayConnection::run_event_loop that polls
nostr-sdk's relay.is_connected() every second to detect dead connections
- When event channel closes without explicit Closed/Shutdown, send
DisconnectNotification to SyncManager (fixes case where TCP drops silently)
- Enable test_relay_connected_status test which validates the
ngit_sync_relay_connected metric correctly reflects connection state
The issue was that when a remote relay stops abruptly, nostr-sdk's
notification receiver blocks indefinitely waiting for data. TCP disconnect
detection without keepalive can take minutes. The health check polls
nostr-sdk's internal relay status which detects disconnection promptly.
|
|
|
|
|
|
Changes:
- Fix connection attempt metrics: record success/failure based on actual
connection result instead of pre-emptively recording failure
- Add health tracker integration on connection failure: call
record_failure() and record_health_state() in error path
- Add connection verification in relay_connection.rs: wait 500ms after
connect() then verify is_connected() to detect silent failures
- Add configurable disconnect check interval via
NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
- Update TestRelay with fast test settings: startup_delay=0, jitter=0,
disconnect_check_interval=1s
- Add debug output to metrics tests for investigation
Note: Tests may still fail due to 5-second base backoff in health tracker.
A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter
to allow faster test cycles.
Related: metrics-wiring-plan.md Tasks 1 & 2
|
|
|
|
|
|
|
|
|
|
|