| Age | Commit message | Author |
|
|
|
setting lower thresholds
|
|
Replace broken event counting that occurred before duplicate/policy checks
with accurate tracking of events that are new, accepted, and saved.
Changes:
- Added ProcessResult enum to track event processing outcomes
- Modified process_event_static() to return ProcessResult
- Replaced events_total (with source labels) with events_synced_total
- Removed gap_events_total and event_source module
- Removed eose_received flag (EOSE is per-subscription, so a single flag is not a suitable signal)
- Updated all tests to use new simplified API
The new ngit_sync_events_synced_total metric only counts events that:
1. Are new (not duplicates)
2. Pass write policy validation
3. Are successfully saved to database
All 165 tests pass (124 lib + 41 integration)
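A minimal sketch of the counting rule described above, not the actual implementation. Only the ProcessResult name and the ngit_sync_events_synced_total metric come from this commit; the variant names, the SyncedCounter type and count_synced() are assumptions made for illustration.

```rust
/// Outcome of processing one incoming event.
/// Variant names are illustrative; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessResult {
    /// New event, accepted by the write policy, saved to the database.
    Saved,
    /// Event was already known.
    Duplicate,
    /// Event failed write policy validation.
    Rejected,
    /// Event was valid but the database write failed.
    SaveFailed,
}

/// Hypothetical stand-in for the ngit_sync_events_synced_total counter.
#[derive(Debug, Default)]
pub struct SyncedCounter {
    pub events_synced_total: u64,
}

impl SyncedCounter {
    /// Increment only when the event was new, accepted, and saved.
    pub fn record(&mut self, result: ProcessResult) {
        if result == ProcessResult::Saved {
            self.events_synced_total += 1;
        }
    }
}

/// Feed a batch of outcomes through a fresh counter and return the total.
pub fn count_synced(results: &[ProcessResult]) -> u64 {
    let mut counter = SyncedCounter::default();
    for &result in results {
        counter.record(result);
    }
    counter.events_synced_total
}
```

Because the counter is driven by the return value rather than incremented on arrival, duplicates and rejected events cannot inflate it.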
|
- Add comprehensive test pattern guidance to tests/sync/mod.rs
- Explain when to use run_sync_test() vs manual setup
- Document helper scope and architectural limitations
Key findings:
- historic_sync.rs: 4 tests refactored, 143 lines removed (50% reduction)
- live_sync/discovery/tag_variations: Manual setup required due to
architectural incompatibilities (timing, multi-relay, assertions)
- Helper works for batch historic verification, not real-time scenarios
Detailed summary available in work/sync-test-refactor-summary.md
|
|
- Refactored all 4 tests in historic_sync.rs to use run_sync_test()
- Tests maintain same logic and assertions, only setup simplified
- Moved run_sync_test() and SyncTestResult outside #[cfg(test)] module
- Updated validation to allow empty event slices (for announcement-only tests)
- All 4 historic_sync tests passing (test_bootstrap_syncs_existing_layer2_events, test_relay_replays_events_after_restart, test_announcement_not_listing_relay_is_not_synced, test_history_sync_without_negentropy)
- Result: 39/40 tests passing (1 more than Phase 1 baseline of 38/40)
|
Main lib (src/):
- Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
- Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
- Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
- Replace manual div_ceil implementations with .div_ceil(100)
Test code (tests/):
- Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
- Remove needless borrows in fetch_metrics() calls
- Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module
grasp-audit:
- Apply cargo fmt to fix formatting
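For reference, the three lint-driven rewrites listed above look roughly like this. The RelaySyncNeeds field, the function names and the port example are hypothetical; only the before/after call shapes are taken from this commit.

```rust
use std::collections::HashMap;

/// Hypothetical shape; the real RelaySyncNeeds has other fields.
#[derive(Debug, Default, PartialEq)]
pub struct RelaySyncNeeds {
    pub pending_filters: usize,
}

pub fn add_pending(map: &mut HashMap<String, RelaySyncNeeds>, relay: &str) {
    // Before: map.entry(..).or_insert_with(RelaySyncNeeds::default)
    map.entry(relay.to_string()).or_default().pending_filters += 1;
}

pub fn batches_needed(items: usize) -> usize {
    // Before: (items + 99) / 100
    items.div_ceil(100)
}

pub fn parse_port(raw: &str) -> u16 {
    // Before: raw.parse::<u16>().expect(&format!("bad port: {raw}"))
    // The closure builds the message only on the error path.
    raw.parse::<u16>()
        .unwrap_or_else(|_| panic!("bad port: {raw}"))
}
```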
|
|
|
|
The catchup sync mechanism (reconnection with since filter) is implemented
in src/sync/mod.rs handle_connect_or_reconnect(), but cannot be reliably
integration tested with current infrastructure:
- TestRelay uses in-memory database (events lost on stop)
- No way to force WebSocket disconnection without stopping relay
- Stopping syncing relay creates new instance (fresh sync, not catchup)
Convert the skeleton test file to comprehensive documentation explaining:
- How catchup sync works (since filter on reconnect)
- The 15-minute quick reconnect window logic
- Why integration testing is not feasible
- Alternative approaches that could enable testing
- Related tests that cover adjacent functionality
|
|
|
|
Previously, events were classified as 'startup' or 'live' based on whether
they came from a bootstrap relay (is_bootstrap flag). This meant ALL events
from bootstrap relays were counted as 'startup', even events received after
the initial sync completed.
Now events are classified based on whether EOSE (End Of Stored Events) has
been received for that connection:
- Events BEFORE EOSE → 'startup' (historical events during initial sync)
- Events AFTER EOSE → 'live' (new events via real-time subscription)
This enables the test_live_sync_event_count test which validates that events
received after sync connection is established are counted as live events.
Also removed the #[ignore] attribute from test_live_sync_event_count since
the metrics are now properly wired up.
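The classification rule can be sketched as below. This is an illustration under assumptions, not the code in this commit: the EventSource and ConnectionState types and their methods are hypothetical, and only the 'startup'/'live' labels and the EOSE rule come from the message above.

```rust
/// Label attached to a synced event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventSource {
    Startup,
    Live,
}

impl EventSource {
    /// Metric label value, as named in the commit message.
    pub fn label(self) -> &'static str {
        match self {
            EventSource::Startup => "startup",
            EventSource::Live => "live",
        }
    }
}

/// Hypothetical per-connection state.
#[derive(Debug, Default)]
pub struct ConnectionState {
    eose_received: bool,
}

impl ConnectionState {
    /// Call when the relay sends EOSE for this connection.
    pub fn on_eose(&mut self) {
        self.eose_received = true;
    }

    /// Classify an event arriving now: before EOSE it is historical,
    /// after EOSE it is a real-time event.
    pub fn classify(&self) -> EventSource {
        if self.eose_received {
            EventSource::Live
        } else {
            EventSource::Startup
        }
    }
}
```

The bootstrap flag no longer matters here: a bootstrap relay that keeps streaming after EOSE produces 'live' events like any other relay.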
|
|
- Add periodic health check in RelayConnection::run_event_loop that polls
nostr-sdk's relay.is_connected() every second to detect dead connections
- When event channel closes without explicit Closed/Shutdown, send
DisconnectNotification to SyncManager (fixes case where TCP drops silently)
- Enable test_relay_connected_status test which validates the
ngit_sync_relay_connected metric correctly reflects connection state
The issue was that when a remote relay stops abruptly, nostr-sdk's
notification receiver blocks indefinitely waiting for data. TCP disconnect
detection without keepalive can take minutes. The health check polls
nostr-sdk's internal relay status which detects disconnection promptly.
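A simplified, synchronous sketch of the polling idea, assuming a caller-supplied is_connected closure in place of nostr-sdk's relay.is_connected(). The real loop is async and lives in RelayConnection::run_event_loop; watch_connection(), LoopExit and the max_polls bound are hypothetical and exist only to keep the example self-contained.

```rust
use std::thread;
use std::time::Duration;

/// Why the watch loop stopped.
#[derive(Debug, PartialEq, Eq)]
pub enum LoopExit {
    /// The probe reported a dead connection on the given poll.
    Disconnected { polls: u32 },
    /// The poll budget ran out while the connection was still up.
    StillConnected,
}

/// Poll `is_connected` once per `interval`, up to `max_polls` times.
/// In the real code a Disconnected result would trigger a
/// DisconnectNotification to SyncManager.
pub fn watch_connection<F>(mut is_connected: F, interval: Duration, max_polls: u32) -> LoopExit
where
    F: FnMut() -> bool,
{
    for polls in 1..=max_polls {
        thread::sleep(interval);
        if !is_connected() {
            return LoopExit::Disconnected { polls };
        }
    }
    LoopExit::StillConnected
}
```

Polling bounds detection latency by the interval (one second in this commit) instead of by TCP timeouts.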
|
|
Root cause: Both Metrics::new() and SyncManager::new() were trying to register
SyncMetrics with the same Prometheus registry. The second registration failed
silently, leaving SyncManager.metrics = None, so record_connection_attempt()
calls were no-ops.
Changes:
- SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
- main.rs passes already-registered sync metrics from Metrics to SyncManager
- Simplified test_connection_failure_increments_counter assertion
- Marked 3 tests as #[ignore] pending relay tracking metrics wiring
Tests fixed:
- test_connection_failure_increments_counter (now counts failures)
- test_health_state_degrades_on_failure (now tracks health state)
- test_live_sync_layer3_events (already working, confirmed)
Tests ignored (future work):
- test_live_sync_event_count
- test_multi_source_aggregate_counts
- test_relay_connected_status
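The root cause can be reproduced with a toy registry. This is a sketch under assumptions, not the Prometheus API: Registry here is a hand-rolled stand-in that rejects duplicate names, the metric name is made up, and new_registering() is a hypothetical name for the old constructor shape.

```rust
use std::collections::HashSet;

/// Toy registry that, like Prometheus, rejects duplicate registration.
#[derive(Debug, Default)]
pub struct Registry {
    names: HashSet<String>,
}

impl Registry {
    pub fn register(&mut self, name: &str) -> Result<(), String> {
        if self.names.insert(name.to_string()) {
            Ok(())
        } else {
            Err(format!("{name} already registered"))
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct SyncMetrics;

impl SyncMetrics {
    pub fn register(registry: &mut Registry) -> Result<Self, String> {
        // Hypothetical metric name, for illustration only.
        registry.register("ngit_sync_connection_attempts_total")?;
        Ok(SyncMetrics)
    }
}

pub struct SyncManager {
    pub metrics: Option<SyncMetrics>,
}

impl SyncManager {
    /// Old shape: registers on its own and swallows the error with .ok().
    pub fn new_registering(registry: Option<&mut Registry>) -> Self {
        let metrics = registry.and_then(|r| SyncMetrics::register(r).ok());
        Self { metrics }
    }

    /// New shape: takes metrics that were already registered elsewhere.
    pub fn new(metrics: Option<SyncMetrics>) -> Self {
        Self { metrics }
    }
}
```

With the old shape, a registry that already holds the sync metrics yields metrics = None, so every record call becomes a no-op without any error being surfaced.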
|
|
Changes:
- Fix connection attempt metrics: record success/failure based on actual
connection result instead of pre-emptively recording failure
- Add health tracker integration on connection failure: call
record_failure() and record_health_state() in error path
- Add connection verification in relay_connection.rs: wait 500ms after
connect() then verify is_connected() to detect silent failures
- Add configurable disconnect check interval via
NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
- Update TestRelay with fast test settings: startup_delay=0, jitter=0,
disconnect_check_interval=1s
- Add debug output to metrics tests for investigation
Note: Tests may still fail due to 5-second base backoff in health tracker.
A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter
to allow faster test cycles.
Related: metrics-wiring-plan.md Tasks 1 & 2
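A rough sketch of the first and third changes, assuming closures in place of the real connect() and is_connected() calls. AttemptMetrics and connect_and_verify() are hypothetical names; the settle delay corresponds to the 500ms wait mentioned above.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical success/failure counters for connection attempts.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct AttemptMetrics {
    pub success: u64,
    pub failure: u64,
}

impl AttemptMetrics {
    /// Record the outcome after the attempt has finished,
    /// rather than recording a failure up front.
    pub fn record(&mut self, result: &Result<(), String>) {
        match result {
            Ok(()) => self.success += 1,
            Err(_) => self.failure += 1,
        }
    }
}

/// Connect, wait for the connection to settle, then verify it.
/// A connect() that returns Ok but leaves the relay disconnected
/// is treated as a failure.
pub fn connect_and_verify<C, V>(connect: C, is_connected: V, settle: Duration) -> Result<(), String>
where
    C: FnOnce() -> Result<(), String>,
    V: FnOnce() -> bool,
{
    connect()?;
    thread::sleep(settle);
    if is_connected() {
        Ok(())
    } else {
        Err("connect() returned Ok but the relay is not connected".to_string())
    }
}
```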
|
|
|
|
Deleted 12 existence-only tests that provided zero confidence:
- test_sync_metrics_exposed
- test_sync_metric_names_present
- test_connection_metrics_on_success
- test_event_sync_metrics
- test_health_state_metrics
- test_gap_event_tracking
- test_connection_failure_metrics
- test_failure_counter_increments
- test_relay_count_metrics
- test_event_source_labels_in_metrics
- test_multi_relay_load
- test_gap_events_tracked_separately
Kept 5 valuable tests:
- test_prometheus_format_valid
- test_concurrent_metrics_requests
- test_metric_values_are_numeric
- test_startup_sync_event_count
- test_metrics_availability_during_sync
Added 3 real value-checking tests (currently ignored):
- test_connection_failure_increments_counter
- test_live_sync_event_count
- test_relay_connected_status
Test results: 6 passed, 0 failed, 3 ignored
|
|
|
|
|
|
|
|
When root events (issues/patches) are received via self-subscription,
handle_root_event() was only updating the repo_sync_index directly.
This caused process_batch() to early-return when pending.is_empty(),
so Layer 3 filters for comments/replies were never created.
The fix adds root events to both:
1. repo_sync_index (for immediate availability)
2. pending queue (to trigger Layer 3 filter creation in next batch)
Critical: The pending entry must include relays from repo_sync_index
so derive_relay_targets() knows where to send Layer 3 subscriptions.
The Layer 3 test now verifies that events sent BEFORE the subscription
is established are still synced - proving subscriptions without 'since'
correctly fetch historical events.
Enabled 4 previously ignored Layer 3 tests:
- test_live_sync_layer3_events
- test_layer3_sync_with_lowercase_e_tag
- test_layer3_sync_with_uppercase_e_tag
- test_layer3_sync_with_q_tag
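The fix can be illustrated with a simplified model. The data layout below is an assumption: only the names repo_sync_index, pending, handle_root_event() and process_batch() come from this commit, and the real types and signatures almost certainly differ.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical per-repo entry: relays to use and known root events.
#[derive(Debug, Default, Clone)]
pub struct RepoEntry {
    pub relays: BTreeSet<String>,
    pub root_events: BTreeSet<String>,
}

#[derive(Debug, Default)]
pub struct SyncState {
    pub repo_sync_index: HashMap<String, RepoEntry>,
    pub pending: HashMap<String, RepoEntry>,
}

impl SyncState {
    /// Record a root event in the index and also queue it, copying the
    /// relays from the index so the batch knows where to subscribe.
    pub fn handle_root_event(&mut self, repo: &str, event_id: &str) {
        let entry = self.repo_sync_index.entry(repo.to_string()).or_default();
        entry.root_events.insert(event_id.to_string());
        let relays = entry.relays.clone();

        let queued = self.pending.entry(repo.to_string()).or_default();
        queued.root_events.insert(event_id.to_string());
        queued.relays.extend(relays);
    }

    /// Drain the queue into (relay, root event id) pairs, one per
    /// Layer 3 filter. Returns early when nothing is pending.
    pub fn process_batch(&mut self) -> Vec<(String, String)> {
        if self.pending.is_empty() {
            return Vec::new();
        }
        let mut filters = Vec::new();
        for (_repo, entry) in self.pending.drain() {
            for relay in &entry.relays {
                for id in &entry.root_events {
                    filters.push((relay.clone(), id.clone()));
                }
            }
        }
        filters.sort();
        filters
    }
}
```

Before the fix, handle_root_event() performed only the first half, so process_batch() always hit the early return for self-subscribed root events.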
|
|
Enable recursive relay discovery by broadcasting synced events to
WebSocket subscribers via LocalRelay.notify_event(). This allows the
SelfSubscriber to receive 30617 announcements synced from external
relays and discover additional relay URLs to connect to.
Changes:
- Pass LocalRelay to SyncManager::new() from main.rs
- Add local_relay field to SyncManager struct
- Call notify_event() after saving synced events to database
- Enable test_recursive_relay_discovery_syncs_announcement test
The test verifies that when relay_a syncs announcement_x from bootstrap
relay_b (which lists relay_c), relay_a discovers and connects to
relay_c to sync announcement_y.
Fixes recursive relay discovery from bootstrap sync.
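The save-then-notify flow might be sketched like this, using std channels in place of WebSocket subscriptions. Everything except the notify_event() name and the 30617 kind is hypothetical, including the Event shape, the subscribe() method and the in-memory database.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Minimal event: kind plus any relay URLs it lists.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub kind: u32,
    pub relay_urls: Vec<String>,
}

/// Stand-in for LocalRelay: fans events out to subscribers.
#[derive(Default)]
pub struct LocalRelay {
    subscribers: Vec<Sender<Event>>,
}

impl LocalRelay {
    pub fn subscribe(&mut self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Broadcast to all subscribers, dropping any that have gone away.
    pub fn notify_event(&mut self, event: &Event) {
        self.subscribers.retain(|tx| tx.send(event.clone()).is_ok());
    }
}

/// Save a synced event, then broadcast it so subscribers see it.
pub fn save_and_notify(db: &mut Vec<Event>, relay: &mut LocalRelay, event: Event) {
    db.push(event.clone());
    relay.notify_event(&event);
}

/// What a self-subscriber would do: collect relay URLs from
/// announcement events (kind 30617) received so far.
pub fn discover_relays(rx: &Receiver<Event>) -> Vec<String> {
    rx.try_iter()
        .filter(|event| event.kind == 30617)
        .flat_map(|event| event.relay_urls)
        .collect()
}
```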
|
|
Add comprehensive tests for different Layer 2 and Layer 3 tag variations:
Layer 2 tests (Tests 8a-c) - all pass:
- test_layer2_sync_with_lowercase_a_tag (standard NIP-01)
- test_layer2_sync_with_uppercase_a_tag (NIP-22)
- test_layer2_sync_with_q_tag (NIP-18 quotes)
Layer 3 tests (Tests 9a-c) - marked #[ignore]:
- test_layer3_sync_with_lowercase_e_tag (NIP-01)
- test_layer3_sync_with_uppercase_e_tag (NIP-22)
- test_layer3_sync_with_q_tag (NIP-18)
Layer 3 tests have full implementation but are ignored until
Layer 3 sync is enabled in the relay.
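The tag variants under test could be built along these lines. This is a sketch with hypothetical helper names and signatures (the real builders live in tests/common/sync_helpers.rs and may look different); it assumes the usual kind:pubkey:identifier coordinate form for the 30617 repo announcement.

```rust
/// Build a repo coordinate for a 30617 announcement.
/// The signature is an assumption; the real repo_coord() may differ.
pub fn repo_coord(pubkey: &str, identifier: &str) -> String {
    format!("30617:{pubkey}:{identifier}")
}

/// Ways a Layer 2 event can reference the repo.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer2Tag {
    LowerA,
    UpperA,
    Quote,
}

/// Ways a Layer 3 event can reference a root event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer3Tag {
    LowerE,
    UpperE,
    Quote,
}

/// Tag pointing at the repo coordinate, e.g. ["a", "30617:..."].
pub fn layer2_tag(variant: Layer2Tag, coord: &str) -> Vec<String> {
    let name = match variant {
        Layer2Tag::LowerA => "a",
        Layer2Tag::UpperA => "A",
        Layer2Tag::Quote => "q",
    };
    vec![name.to_string(), coord.to_string()]
}

/// Tag pointing at a root event id, e.g. ["e", "<event id>"].
pub fn layer3_tag(variant: Layer3Tag, root_event_id: &str) -> Vec<String> {
    let name = match variant {
        Layer3Tag::LowerE => "e",
        Layer3Tag::UpperE => "E",
        Layer3Tag::Quote => "q",
    };
    vec![name.to_string(), root_event_id.to_string()]
}
```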
|
|
|
|
Create organized test structure for proactive sync:
tests/common/sync_helpers.rs (from Phase 4):
- TestClient with retry logic for connect/send
- Event builders: build_layer2_issue_event, build_layer3_comment_event
- Tag variants (a/A/q for Layer 2, e/E/q for Layer 3)
- wait_for_event_on_relay() assertion helper
- repo_coord() utility function
- Unit tests for all builders
tests/sync/mod.rs:
- Module organization for sync tests
- Documentation of test categories
tests/sync.rs:
- Main test harness including common and sync modules
tests/sync/bootstrap.rs:
- test_bootstrap_syncs_existing_layer2_events (Test 1)
- test_relay_replays_events_after_restart (Test 4)
tests/sync/discovery.rs:
- test_discovers_layer3_via_layer2 (Test 2)
- test_layer2_discovery_with_chain (Test 3 - simplified)
All 14 tests pass: cargo test --test sync
|