| Age | Commit message (Collapse) | Author |
|
Remove the pre-implementation planning docs (announcements-purgatory-design.md
and announcements-purgatory-implementation.md) now that the feature is built.
Update the three living docs to reflect what was actually implemented:
- purgatory-design.md: expanded to cover all three purgatory stores
(announcement, state, PR), including AnnouncementPurgatoryEntry structure,
two-phase soft expiry lifecycle, expiry extension triggers, promotion flow,
and updated integration points and file structure
- grasp-02-proactive-sync.md: added SyncLevel enum (Full/StateOnly) to
RepoSyncNeeds, documented the purgatory announcement sync timer as the
registration path for purgatory announcements, updated filter building
to describe build_sync_level_aware_filters() and StateOnly behaviour
- grasp-02-proactive-sync-purgatory-git-data.md: expanded to cover
announcement purgatory as a third entry type, added Timeline E showing
soft-expiry and revival, replaced the single expiry section with separate
hard-expiry (state/PR) and two-phase soft-expiry (announcements) sections
with full justification for the 24-hour extended retention window
|
|
- Add rejected events index to architecture.md with two-tier system explanation
- Document NGIT_REJECTED_HOT_CACHE_DURATION_SECS and NGIT_REJECTED_COLD_INDEX_EXPIRY_SECS in configuration.md
- Add comprehensive rejected events metrics section to monitoring.md with Grafana queries and alerts
- Explain negentropy integration with rejected index in grasp-02-proactive-sync.md
- Document state event authorization defense-in-depth and rejection tracking in inline-authorization.md
This integrates information from work/rejected-events-index-summary.md into the main documentation,
ensuring architecture docs accurately reflect the implemented rejected events index system.
|
|
Resolves naming conflict with RelayHealthState::Degraded by using a more
explicit name that clearly indicates the connection status relates to
historic sync failures, not connection health degradation.
Changes:
- ConnectionStatus::ConnectedDegraded → ConnectedHistoricSyncFailures
- Updated all documentation and comments
- Updated Prometheus metric descriptions
- Metric value remains 4 for backward compatibility
This makes it clear that:
- ConnectedHistoricSyncFailures = connection lifecycle (missing historic data)
- RelayHealthState::Degraded = connection health (reliability issues)
These are orthogonal concerns - a relay can be ConnectedHistoricSyncFailures
but Healthy, or Connected but Degraded.
|
|
- Add ConnectionStatus::ConnectedDegraded (status=4 in metrics)
- Track batch failures via PendingBatch.failed field
- Track relay-level failures via RelayState.historic_sync_had_failures
- Transition to ConnectedDegraded when any batch fails during historic sync
- Add is_live_sync_active() helper for cleaner match patterns
- Update state machine diagram with ConnectedDegraded transitions
- Update metrics docs with status=4 and example queries
Fixes issue where relays with failed negentropy retries would
incorrectly transition to Connected status despite missing data.
Now operators can distinguish 'fully synced' vs 'degraded (partial data)'.
|
|
- Add ConnectionStatus::Syncing state between Connecting and Connected
- Track historic_sync_completed and historic_sync_completed_at in RelayState
- Auto-detect sync completion via check_and_complete_historic_sync()
- Update metrics: ngit_sync_relay_connected now shows 0-3 (disconnected/connecting/syncing/connected)
- Update Prometheus metric documentation with new status values
- Add state machine diagram showing Syncing transition
- Operators can now distinguish 'connected but catching up' vs 'fully synced'
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Replace EOSE-based sync completion with negentropy reconciliation for:
- Initial connect (fresh sync)
- Daily sync (Layer 1 announcements)
- Stale reconnect (>15 min)
Key changes:
- Add NegentropySyncResult struct with remote_only, local_only, received fields
- Add supports_negentropy() using try-and-fallback approach
- Add negentropy_sync_filter() using nostr-sdk client.sync() API
- Modify handle_connect_or_reconnect() to use negentropy for fresh/stale sync
- Modify daily_sync() to use negentropy for Layer 1
- Single-warning logging per relay when negentropy fails
Quick reconnects (<15 min) unchanged - still use REQ with since filter.
If negentropy unsupported, gracefully falls back to REQ+EOSE flow.
|
|
|
|
|
|
|
|
|
|
- Add SyncMetrics with full Prometheus integration
- Track sync gaps via catchup events
- Update Grafana dashboard with sync panels
- Document all sync configuration options
- Update design doc with implementation notes
|
|
|
|
|