| Age | Commit message (Collapse) | Author |
|
git repos
Adds a maintenance subcommand that scans the LMDB database for kind 30617
(repository announcement) events whose bare git repo on disk is empty or
missing, then removes both the 30617 and any matching 30618 (state) events.
A relay should not serve announcement or state events for a repository with
no git data. This was needed to clean up repos leaked by the bug fixed in
2161e3c, and is useful as an ongoing maintenance tool.
Usage (dry-run by default, stop relay before --execute):
ngit-grasp cleanup-empty-repos [--relay-data-path <path>] [--git-data-path <path>] [--execute]
The relay itself is now invoked as an implicit 'serve' subcommand, preserving
full backward compatibility with existing deployments and env-var configuration.
|
|
Spawns a tokio task that runs every 30 minutes and removes all events
tagged 'grasp-audit-test-event' older than 2 hours from the LMDB
database, along with their associated bare git repositories on disk.
|
|
|
|
When an owner announcement is promoted from purgatory via a git push,
any maintainer announcements sitting in the rejected_events_index hot
cache were never re-processed. The invalidate_and_get call only existed
in SyncManager::process_event_static (the nostr sync path); the git push
promotion path (http -> handlers -> git::sync) had no access to the
rejected_events_index at all.
Thread rejected_events_index and write_policy through the git push path:
- process_purgatory_announcements: after saving the promoted announcement,
parse its maintainers tag and call invalidate_and_get() for each, then
re-process any returned hot-cache events via admit_event + save
- process_newly_available_git_data: accept optional write_policy and
rejected_events_index, pass them through to process_purgatory_announcements
- handle_receive_pack: accept Arc<Nip34WritePolicy> and
Arc<RejectedEventsIndex>, pass them to process_newly_available_git_data
- HttpService / run_server: carry the two new fields, clone into each
handle_receive_pack call
- main.rs: obtain rejected_events_index from sync_manager before moving
it into its task; wrap write_policy in Arc for the HTTP server
- RealSyncContext::process_newly_available_git_data: pass None for both
new params (purgatory sync path already handles this via
SyncManager::process_event_static)
Also rewrite the maintainer_reprocessing integration tests to correctly
exercise the hot-cache path now that announcements require git data
before being released from purgatory:
- Start relay_b with relay_a as bootstrap so its SyncManager syncs
maintainer announcements via negentropy before the owner git push
- Use push_unique_git_data_to_relay (new helper) to give each maintainer
a distinct commit hash, preventing git from skipping pack transfer
- Make wait_for_event_on_relay poll in a retry loop so transient timing
gaps between DB write and query do not cause false negatives
|
|
Instead of threading repo_sync_index through PolicyContext/builder.rs/main.rs
to handle user-submitted purgatory announcements, add a simple background
timer (run_purgatory_announcement_sync, every 5s) that scans the purgatory
for announcement entries and registers them in repo_sync_index as StateOnly.
This is simpler and covers both flows:
- Sync-path announcements: inline registration still happens during event
processing (sync/mod.rs:1839+), timer provides a safety net
- User-submitted announcements: SelfSubscriber never sees them (rejected
from DB), timer is the primary registration path
The timer calls sync_purgatory_announcements_to_index() which:
1. Snapshots purgatory via new announcements_for_sync() public method
2. Or_inserts StateOnly entries (never downgrades Full entries)
3. Detects newly added relay URLs and calls handle_new_sync_filters to
connect and subscribe - fixing the failing test that expected relay
discovery from a user-submitted purgatory announcement
Removes: repo_sync_index field from PolicyContext, set/get_repo_sync_index
methods, set_repo_sync_index on Nip34WritePolicy, wiring in main.rs, and
the inline AcceptPurgatory registration block in builder.rs.
|
|
negentropy fallback
Three targeted fixes for purgatory announcement sync:
1. SelfSubscriber sync_level upgrade: After or_insert_with in process_batch,
always set entry.sync_level = SyncLevel::Full so that when a promoted
announcement is broadcast via notify_event and SelfSubscriber receives it,
an existing StateOnly entry gets upgraded to Full and PR event subscriptions
are triggered immediately (not delayed up to 24h).
2. Negentropy fallback filter split: In handle_eose, when falling back from
negentropy to REQ+EOSE, split batch_repos by SyncLevel and call
build_sync_level_aware_filters instead of build_layer2_and_layer3_filters.
Prevents StateOnly (purgatory) repos from getting Layer 2 #a/#A/#q filters
prematurely, which caused nostr-sdk client deduplication to permanently
drop PR events after orphan rejection.
3. Recompute sync filters after announcement batch EOSE: Add
recompute_new_sync_filters_for_relay calls at all three batch-completion
paths in handle_eose for generic filter (announcement) batches. This
triggers state-only subscriptions for any purgatory repos registered during
that batch, fixing the 24h delay before state event sync starts.
4. User-submitted purgatory announcements: Add repo_sync_index field to
PolicyContext with setter/getter, wire in main.rs after SyncManager
creation, and register in AcceptPurgatory handler so user-submitted
announcements get StateOnly sync started immediately.
5. Update archive tests: test_archive_without_state_events_does_not_sync_git
updated to reflect that StateOnly subscription now proactively fetches
state events from source relays. test_archive_read_only_creates_bare_repo
un-ignored as it now works end-to-end.
|
|
after announcement promotion"
This reverts commit d76003b629a4a03dba23a8a1c41da6e4ac4c30cf.
|
|
announcement promotion
When git data arrives for a purgatory announcement and promotes it to the
database, the relay now:
1. Upgrades the announcement's sync level in RepoSyncIndex from StateOnly
to Full (git/sync.rs: process_purgatory_announcements)
2. Sends AddFilters actions to SyncManager for all connected relays, using
Full sync filters (Layer 2 #a/#A/#q) to subscribe to PR events
(purgatory/sync/context.rs: RealSyncContext.process_newly_available_git_data)
3. For user-submitted purgatory announcements, registers the repo in
RepoSyncIndex with StateOnly level and sends AddFilters to SyncManager
so it discovers and connects to relays listed in the announcement tags
(nostr/builder.rs: handle_announcement AcceptPurgatory path)
The RealSyncContext now accepts optional repo_sync_index and sync_action_tx
parameters. main.rs wires these up from SyncManager. PolicyContext gains
repo_sync_index and sync_action_tx fields for the write policy path.
|
|
Route new announcements to purgatory instead of accepting immediately.
Announcements are promoted to the database when git data arrives,
ensuring we only serve announcements for repos with actual content.
Implemented:
- AnnouncementPurgatoryEntry type and DashMap store
- Route new announcements to purgatory (replacement announcements skip)
- Promote announcements on git data arrival (process_purgatory_announcements)
- Authorization checks purgatory announcements (fetch_repository_data_with_purgatory)
- State policy uses purgatory announcements for maintainer validation
- Cleanup task handles announcement expiry
- Updated count()/cleanup() to 3-tuples
Known broken:
- test_archive_read_only_creates_bare_repo fails: sync module does not
treat purgatory announcements as confirmed repos, so per-repo sync
(state events, PRs) is never triggered for purgatory announcements
- Announcement persistence (save/restore) not implemented
- SyncLevel (StateOnly vs Full) not implemented
- Soft expiry two-phase not implemented
- Expiry extension on state event / git auth not wired up
|
|
Listen for both SIGINT (Ctrl+C) and SIGTERM (systemd) signals to ensure
graceful shutdown cleanup runs when stopping the service via systemd.
Previously, only SIGINT was handled, causing purgatory state and rejected
events cache to be lost on every systemd restart. Now both signals trigger
the cleanup code that saves state files and removes placeholder refs.
Fixes issue 0f73
|
|
|
|
Add proper log level configuration following standard approach:
- CLI flag: --log-level <level>
- Environment variable: NGIT_LOG_LEVEL
- Default: info
- Supports simple levels (error, warn, info, debug, trace)
- Supports filter expressions (e.g., ngit_grasp=debug,actix_web=info)
Configuration is now consistent across all four sources:
1. src/config.rs - Config struct with log_level field
2. docs/reference/configuration.md - Full documentation
3. nix/module.nix - NixOS module with logLevel option
4. .env.example - Example configuration file
This replaces the previous RUST_LOG approach with proper integration
into the ngit-grasp configuration system, enabling trace logging from
CLI, environment variables, or NixOS configuration.
|
|
shutdown/startup
Implement save/restore functionality for rejected events cache and
integrate persistence with relay shutdown/startup lifecycle. Both
purgatory and rejected cache now survive relay restarts.
Key features:
- Serialize rejected events cache to JSON (rejected-events-cache.json)
- Save both hot cache (2min, full events) and cold index (7day, metadata)
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- Automatic restoration in SyncManager::new()
Integration:
- Shutdown hook saves both purgatory and rejected cache
- Startup hook restores both and re-queues repositories
- Non-fatal errors (logs warnings, continues on failure)
Files:
- src/sync/rejected_index.rs: save_to_disk/restore_from_disk methods
- src/sync/mod.rs: SyncManager integration and auto-restore
- src/main.rs: Shutdown/startup hooks for both caches
- tests/purgatory_persistence.rs: 17 integration tests
Tests: 13 unit tests + 17 integration tests covering full lifecycle
|
|
config methods
Refactors configuration validation to fail fast on fatal errors at startup
while gracefully handling recoverable issues (e.g., malformed whitelist entries).
Changes:
- Add Config::validate() for eager validation called immediately after load
- Remove Result<> from archive_config() and repository_config() methods
- WhitelistEntry::parse_whitelist() skips invalid entries with warnings
- Validate relay_owner_nsec format in Config::validate()
- Update all call sites to remove Result handling from config getters
Benefits:
- Fatal config errors (incompatible settings) fail at startup, not runtime
- Recoverable errors (bad whitelist entries) logged as warnings and skipped
- No Result handling scattered throughout runtime code after validation
- Config methods safe to call without error handling after validate()
Testing:
- Add 7 new tests for validation edge cases and error handling
- Total config tests: 40 (up from 33)
- All 320 library tests passing
Breaking change: Config users must call config.validate() after Config::load()
to ensure configuration is valid. This is enforced in main.rs.
|
|
Implements ngit_repositories_total metric by counting *.git directories
on disk every time /metrics is requested (~15s interval by Prometheus).
This approach is simpler than increment-on-create because:
- No need to pass metrics through the relay builder chain
- Always accurate and self-correcting
- Negligible performance impact (~100-200 dir entries)
Changes:
- Add count_repositories_on_disk() static method to Metrics
- Update Metrics::render() to count repos before encoding metrics
- Pass git_data_path to Metrics::new() in main.rs
- Consolidate metrics tests to avoid global Prometheus registry conflicts
Fixes repository count metric issue from Phase 8 deployment plan.
|
|
Implement domain-level naughty list tracking for git remotes, reusing the
existing NaughtyListTracker from relay sync. This prevents repeated attempts
to fetch from git domains with persistent infrastructure issues (SSL/TLS
certificate errors, DNS failures).
Changes:
- Updated NaughtyListTracker to track both relay URLs and git domains
- Added git_naughty_list field to RealSyncContext for error classification
- Modified fetch_oids() to classify git fetch errors and record naughty domains
- Updated sync_identifier_next_url() to filter out naughty domains during URL selection
- Added git_naughty_list parameter to ThrottleManager for domain queue processing
- Threaded naughty list through start_sync_loop and all sync functions
- Updated all tests to pass naughty list parameter
The naughty list uses 12-hour expiration (configurable) to allow domains to
recover from infrastructure issues. First occurrence logs WARN, repeats log DEBUG.
|
|
|
|
Adds expired event tracking to prevent proactive sync from repeatedly
fetching and re-adding events that expired from purgatory without
finding git data.
Key features:
- Track expired events for 7 days to prevent re-sync loops
- Distinguish synced vs user-submitted events (via socket address)
- Allow users to retry expired events (git data might now be available)
- Reject synced expired events (prevents infinite loop)
- Daily cleanup of expired event records older than 7 days
Implementation:
- Added expired_events: DashMap<EventId, Instant> to Purgatory
- Updated event_ids() to include both purgatory + expired events
- Added is_expired(), mark_expired(), cleanup_expired_events()
- Updated cleanup() to mark expired events automatically
- Added is_synced detection in WritePolicy (localhost:0 = synced)
- Policy layer checks is_synced && is_expired() before rejecting
Behavior:
- Negentropy: Filters expired events before fetching (optimal)
- REQ+EOSE: Rejects synced expired events at policy layer
- User submissions: Always allowed to retry (skip expired check)
Testing:
- Added 5 new tests for expired event tracking
- All 222 tests passing
Fixes the infinite re-sync loop where events without git data would
expire, get synced again, expire again, repeat forever.
|
|
Phase 13 of purgatory-sync-redesign:
- Add sync loop startup in main.rs (RealSyncContext + ThrottleManager + start_sync_loop)
- Update add_state() and add_pr() to automatically enqueue for background sync
- Remove start_state_sync() call from state.rs (now handled by sync loop)
- Remove orphaned legacy functions: sync_state_git_data, fetch_missing_oids_from_server,
get_most_complete_local_repo, identify_missing_oids, get_date_of_most_recent_commit_on_default_branch
- Clean up unused imports in purgatory/mod.rs
|
|
|
|
|
|
|
|
|
|
so we can more easily support grasp purgatory feature
|
|
Main lib (src/):
- Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
- Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
- Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
- Replace manual div_ceil implementations with .div_ceil(100)
Test code (tests/):
- Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
- Remove needless borrows in fetch_metrics() calls
- Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module
grasp-audit:
- Apply cargo fmt to fix formatting
|
|
Root cause: Both Metrics::new() and SyncManager::new() were trying to register
SyncMetrics with the same Prometheus registry. The second registration failed
silently, leaving SyncManager.metrics = None, so record_connection_attempt()
calls were no-ops.
Changes:
- SyncManager::new() now accepts Option<SyncMetrics> instead of Option<&Registry>
- main.rs passes already-registered sync metrics from Metrics to SyncManager
- Simplified test_connection_failure_increments_counter assertion
- Marked 3 tests as #[ignore] pending relay tracking metrics wiring
Tests fixed:
- test_connection_failure_increments_counter (now counts failures)
- test_health_state_degrades_on_failure (now tracks health state)
- test_live_sync_layer3_events (already working, confirmed)
Tests ignored (future work):
- test_live_sync_event_count
- test_multi_source_aggregate_counts
- test_relay_connected_status
|
|
|
|
|
|
Enable recursive relay discovery by broadcasting synced events to
WebSocket subscribers via LocalRelay.notify_event(). This allows the
SelfSubscriber to receive 30617 announcements synced from external
relays and discover additional relay URLs to connect to.
Changes:
- Pass LocalRelay to SyncManager::new() from main.rs
- Add local_relay field to SyncManager struct
- Call notify_event() after saving synced events to database
- Enable test_recursive_relay_discovery_syncs_announcement test
The test verifies that when relay_a syncs announcement_x from bootstrap
relay_b (which lists relay_c), relay_a discovers and connects to
relay_c to sync announcement_y.
Fixes recursive relay discovery from bootstrap sync.
|
|
|
|
- Add RelayHealthTracker with DashMap
- Implement exponential backoff (5s -> 1h max)
- Handle dead relays (24h failures -> daily retry)
- Add startup jitter to prevent thundering herd
- Add NGIT_SYNC_MAX_BACKOFF_SECS config
|
|
- Add relay discovery from stored announcements
- Implement FilterService with three-layer strategy
- Support multiple simultaneous relay connections
- Filter batching for large tag sets
|
|
- Add src/sync/ module with SyncManager
- Add NGIT_SYNC_RELAY_URL config option
- Subscribe to kind 30617 on configured relay
- Validate synced events through Nip34WritePolicy
- Integration test with two TestRelay instances
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- WebSocket-based relay using tokio-tungstenite
- Full NIP-01 protocol support (EVENT, REQ, CLOSE)
- Event validation (signature and ID)
- In-memory event storage
- Filter support (IDs, authors, kinds, since/until)
- Configuration via environment variables
- Nix flake for reproducible builds
- Test automation script
All 6 NIP-01 smoke tests passing (100%)
|