| Age | Commit message (Collapse) | Author |
|
git repos
Adds a maintenance subcommand that scans the LMDB database for kind 30617
(repository announcement) events whose bare git repo on disk is empty or
missing, then removes both the 30617 and any matching 30618 (state) events.
A relay should not serve announcement or state events for a repository with
no git data. This was needed to clean up repos leaked by the bug fixed in
2161e3c, and is useful as an ongoing maintenance tool.
Usage (dry-run by default, stop relay before --execute):
ngit-grasp cleanup-empty-repos [--relay-data-path <path>] [--git-data-path <path>] [--execute]
The relay itself is now invoked as an implicit 'serve' subcommand, preserving
full backward compatibility with existing deployments and env-var configuration.
|
|
When NGIT_MAX_CONNECTIONS is unset the relay imposes no connection cap,
deferring to OS fd limits and infrastructure controls. The option remains
available for operators who want an explicit ceiling.
|
|
|
|
This merge includes critical bug fixes and comprehensive migration tooling
developed during the relay.ngit.dev migration effort.
Bug Fixes:
- Fix git protocol error handling to return HTTP 200 with ERR pkt-line
- Fix naughty list false positives and DNS failure identification
- Fix database query filters in load_existing_events (remove .since())
- Fix OID fetch tracking to distinguish 0 OIDs from successful fetches
- Fix purgatory event source tracking for filtered expiry logging
- Implement OID retry logic for 'not our ref' errors
Migration Tools & Documentation:
- Complete 5-phase migration analysis pipeline with orchestration script
- Phase 1: Event fetching from source relay
- Phase 2: Git sync verification
- Phase 3: Categorization and relay comparison
- Phase 4: Log extraction (parse failures, purgatory expiry)
- Phase 5: Action classification for migration decisions
- Comprehensive migration guide with lessons learned
- Troubleshooting guide for permission and corruption issues
Configuration:
- Add NGIT_LOG_LEVEL configuration option
- Update git throttle limits to 60/minute
- Improve logging throughout for better observability
|
|
Add proper log level configuration following standard approach:
- CLI flag: --log-level <level>
- Environment variable: NGIT_LOG_LEVEL
- Default: info
- Supports simple levels (error, warn, info, debug, trace)
- Supports filter expressions (e.g., ngit_grasp=debug,actix_web=info)
Configuration is now consistent across all four sources:
1. src/config.rs - Config struct with log_level field
2. docs/reference/configuration.md - Full documentation
3. nix/module.nix - NixOS module with logLevel option
4. .env.example - Example configuration file
This replaces the previous RUST_LOG approach with proper integration
into the ngit-grasp configuration system, enabling trace logging from
CLI, environment variables, or NixOS configuration.
|
|
The NIP-11 specification requires the pubkey field to be a 64-character
hex string, but we were incorrectly using npub (bech32) format.
Changes:
- Add Config::relay_owner_pubkey_hex() method to get hex format
- Update NIP-11 document to use hex format instead of npub
- Update test to verify 64-char hex string instead of npub format
Fixes nak relay command error:
'must be a hex string of 64 characters'
|
|
Enables relay operators to backup/archive specific GRASP servers by domain.
Includes configuration, validation, documentation, and integration tests.
|
|
Combined Accept and AcceptArchive match arms in builder.rs to ensure
bare repositories are created for both cases. Previously AcceptArchive
had duplicate code that didn't call ensure_bare_repository().
Also includes:
- Config fix: effective_git_data_path() respects explicit paths with memory backend
- TestRelay: Added git_data_path() and archive config support for testing
- Integration tests for archive_read_only behavior
|
|
Increases connection limit across all configuration sources:
- src/config.rs: default_value_t = 4096
- docs/reference/configuration.md: updated default and examples
- nix/module.nix: maxConnections default = 4096
- .env.example: updated default and comment
This allows the relay to handle more concurrent connections and reduces
the likelihood of connection exhaustion under normal load. The previous
limit of 2000 was too conservative for production deployments.
|
|
- Make RateLimit explicit in relay builder (500 subs, 60 events/min)
- Add NGIT_MAX_CONNECTIONS config option (default: 500)
- Update all 4 config locations (src, nix, docs, .env.example)
- Fix documentation error: filter limit 5000→500
- Document Phase 2 deferral decision (per-IP enforcement)
Addresses primary DoS vector (connection exhaustion) with minimal code.
Per-IP rate limiting deferred until abuse detected in production.
Related: issue ff38 (git endpoint throttling - separate concern)
|
|
- Update default bind address in src/config.rs to 127.0.0.1:7334
- Update all four critical config sources per AGENTS.md:
- src/config.rs (code default and tests)
- .env.example (development template)
- docs/reference/configuration.md (user documentation)
- nix/module.nix (NixOS deployment)
- Update all documentation examples and references:
- README.md (with note about phone keypad mnemonic)
- docs/how-to/*.md (deploy, prometheus-setup, test-compliance)
- docs/explanation/*.md (architecture, comparison)
- docs/learnings/grasp-audit.md
Port 7334 spells NGIT on a phone keypad, making it memorable and
project-specific.
All tests pass (336 lib tests + 51 integration tests).
|
|
Adds NGIT_EVENT_BLACKLIST option for blocking all events from specific npubs,
taking precedence over all other validation to enable comprehensive moderation
without affecting curation policy.
Key features:
- Simple npub-only format: <npub>,<npub>,...
- Checked FIRST before any other validation (including repository blacklist)
- Blocks ALL event types (announcements, state events, PRs, comments, etc.)
- Events never reach relay storage or purgatory
- Specific rejection reason for operator debugging
Implementation:
- Add EventBlacklistConfig struct with check() method
- Add NGIT_EVENT_BLACKLIST config option and event_blacklist_config() method
- Add config field to PolicyContext for policy access
- Add check_event_blacklist() to Nip34WritePolicy
- Check event blacklist first in admit_event() method (before any other validation)
- 4 new unit tests covering all blacklist behavior
Configuration synced across all four sources:
- src/config.rs: Core implementation with EventBlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping
README updates:
- Add comprehensive "Curation & Moderation" section
- Document repository whitelists (GRASP-01 and GRASP-05 modes)
- Document repository and event blacklists with precedence order
- Add configuration table for all curation/moderation settings
- Provide real-world examples for different relay configurations
Testing:
- 4 new tests for event blacklist functionality
- All 336 library tests passing
- All 64 integration tests passing
- All 38 filter support tests passing
Verification:
- Repository blacklist confirmed to apply to sync (uses same admit_event flow)
- Sync events validated through process_event_static -> write_policy.admit_event
Use cases:
- Block spam/abusive users completely
- Prevent malicious actors from submitting any events
- Temporary blocks for investigation
- Moderation without affecting whitelist curation policy
|
|
Adds NGIT_REPOSITORY_BLACKLIST option for blocking repositories, taking precedence
over all whitelists (archive and repository) to enable moderation without affecting
curation policy.
Key features:
- Three blacklist formats: <npub>, <npub>/<identifier>, <identifier>
- Blacklist checked first before any other validation
- Overrides archive whitelist and repository whitelist
- Specific rejection reasons based on match type (npub/identifier/both)
- Not flagged in NIP-11 curation (operational, not policy)
Implementation:
- Add BlacklistConfig struct with check() method returning detailed reasons
- Add NGIT_REPOSITORY_BLACKLIST config option and blacklist_config() method
- Update validate_announcement() to check blacklist first with specific reasons
- 12 new unit tests covering all blacklist behavior and precedence
Configuration synced across all four sources:
- src/config.rs: Core implementation with BlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping
Testing:
- 12 new tests for blacklist functionality (config + validation)
- All 332 library tests passing
- All 38 integration tests passing
Use cases:
- Block spam/malware repos by identifier
- Block abusive users by npub
- Block specific problematic repos by npub/identifier
- Temporary blocks for investigation
|
|
config methods
Refactors configuration validation to fail fast on fatal errors at startup
while gracefully handling recoverable issues (e.g., malformed whitelist entries).
Changes:
- Add Config::validate() for eager validation called immediately after load
- Remove Result<> from archive_config() and repository_config() methods
- WhitelistEntry::parse_whitelist() skips invalid entries with warnings
- Validate relay_owner_nsec format in Config::validate()
- Update all call sites to remove Result handling from config getters
Benefits:
- Fatal config errors (incompatible settings) fail at startup, not runtime
- Recoverable errors (bad whitelist entries) logged as warnings and skipped
- No Result handling scattered throughout runtime code after validation
- Config methods safe to call without error handling after validate()
Testing:
- Add 7 new tests for validation edge cases and error handling
- Total config tests: 40 (up from 33)
- All 320 library tests passing
Breaking change: Config users must call config.validate() after Config::load()
to ensure configuration is valid. This is enforced in main.rs.
|
|
Adds NGIT_REPOSITORY_WHITELIST option for curated relay operation that
accepts only whitelisted repositories while maintaining GRASP-01 compliance
(announcements must list the service). This differs from archive whitelist
which enables GRASP-05 mode and doesn't require service listing.
Key features:
- Supports three whitelist formats: npub, npub/identifier, identifier
- Enforces mutual exclusivity with archive read-only mode
- Updates NIP-11 curation field when whitelist is enabled
- Maintains GRASP-01 compliance (doesn't add GRASP-05 support)
Configuration synced across all four sources: src/config.rs, docs/reference/configuration.md,
nix/module.nix, and .env.example as required by AGENTS.md.
|
|
Implements NGIT_ARCHIVE_READ_ONLY configuration option that defaults to true
when archive mode is enabled, allowing relays to operate as read-only syncs
of archived repositories.
Key changes:
- Add NGIT_ARCHIVE_READ_ONLY config option (defaults to true if archive enabled)
- NIP-11 advertises GRASP-05 support and includes curation field when read-only
- Validation logic rejects non-whitelisted repos in read-only mode
- Comprehensive tests for read-only behavior and defaults
- Full documentation in config reference, .env.example, and NixOS module
Read-only mode enables passive mirroring without being listed in announcements,
useful for backup/archive operations while preventing accidental write acceptance.
|
|
Implements GRASP-05 specification for accepting repository announcements
that don't list this relay, enabling archive, mirror, and backup use cases.
Core Features:
- Three whitelist formats: <npub>, <npub>/<identifier>, <identifier>
- Archive-all mode for complete ecosystem mirrors
- Fail-fast npub validation at startup
- Read-only enforcement (archived repos reject pushes)
- Full GRASP-02 sync (git data + Nostr events)
- Dynamic archive status (no flags/metadata)
Implementation:
- Add ArchiveWhitelistEntry enum with Pubkey/Repository/Identifier variants
- Add ArchiveConfig with validation and matching logic
- Update AnnouncementResult to include AcceptArchive variant
- Refactor validate_announcement() to return AnnouncementResult with archive check
- Update AnnouncementPolicy with catch-all pattern for cleaner code
- Wire archive config through builder and policy layers
Configuration:
- NGIT_ARCHIVE_ALL: Accept all announcements (⚠️ storage risk)
- NGIT_ARCHIVE_WHITELIST: Comma-separated whitelist entries
- Updated docs, .env.example, and nix/module.nix
Testing:
- 28 unit tests for config parsing and whitelist matching
- 7 integration tests for archive mode validation
- All 296 tests passing
Validation Priority:
1. Lists our service → Accept (GRASP-01, read/write)
2. Is maintainer → AcceptMaintainer (multi-maintainer, read/write)
3. Matches archive config → AcceptArchive (GRASP-05, read-only)
4. None of above → Reject
Security Considerations:
- Archive-all mode has storage/bandwidth DoS risk
- Identifier-only format matches any pubkey (use npub/identifier for high-value)
- Invalid npubs cause startup failure (fail-fast)
Documentation:
- Concise explanation focused on rationale
- Reference docs updated with all config options
- README updated to reflect completed feature
- Removed from roadmap, added to compliance section
See docs/explanation/grasp-05-archive.md for details.
|
|
When relay_owner_nsec is provided via CLI argument or environment
variable (e.g., read from a file by the NixOS module), trim any
leading/trailing whitespace including newlines. This matches the
behavior when reading from the .relay-owner.nsec file directly.
Fixes issue where NixOS module reads nsec file with 'cat', which
includes the trailing newline, making the nsec invalid when passed
as a CLI argument.
Also reverted the tr workaround in nix/module.nix since ngit-grasp
now handles this correctly.
|
|
Add naughty list tracking for relays with persistent infrastructure issues
(DNS failures, TLS certificate errors, protocol violations) to reduce log
noise and provide better visibility via metrics.
Key features:
- Classify errors into naughty (persistent) vs transient (temporary)
- Track naughty relays with category, reason, and occurrence count
- Log WARN on first naughty occurrence, DEBUG on repeats
- Automatic expiration after 12 hours (configurable)
- Prometheus metrics for monitoring naughty relays by category
- Periodic cleanup task integrated with health checker
Components added:
- src/sync/naughty_list.rs: Core naughty list tracker with error classification
- NaughtyListTracker integration in RelayHealthTracker
- Connection error handling updates in sync manager
- Naughty list metrics (total by category, detailed info per relay)
- Config option for naughty_list_expiration_hours (default: 12)
Closes DNS lookup failures and TLS certificate errors tracking issues.
|
|
|
|
Replaces the simple HashSet<EventId> with the sophisticated two-tier
RejectedEventsIndex from PR1, enabling future immediate re-processing
when maintainer dependencies resolve.
## Changes
### Config (src/config.rs)
- Add `rejected_hot_cache_duration_secs` (default: 120 = 2 minutes)
- Add `rejected_cold_index_expiry_secs` (default: 604800 = 7 days)
- Both configurable via CLI flags or environment variables
### SyncManager (src/sync/mod.rs)
**Type Change:**
- Before: `Arc<RwLock<HashSet<EventId>>>` (simple event ID set)
- After: `Arc<RejectedEventsIndex>` (two-tier storage)
**Initialization:**
- Pass config durations to RejectedEventsIndex::new()
- Creates hot cache (2 min) + cold index (7 days)
**Event Processing (process_event_static):**
- Extract identifier from 'd' tag
- Determine rejection reason from error message
- Call `add_announcement()` with full event + metadata
- Stores in both hot cache and cold index
**Negentropy Sync (derive_relay_targets):**
- Call `get_all_event_ids()` to get rejected IDs
- Returns union of hot cache + cold index event IDs
- Excludes from negentropy reconciliation
**Event Loop (relay_connection):**
- Use `contains()` method instead of direct HashSet access
- Simpler API, same skip-rejected behavior
### RejectedEventsIndex (src/sync/rejected_index.rs)
**New Method:**
- `get_all_event_ids()`: Returns HashSet<EventId> from both tiers
- Used for negentropy exclusion (replaces direct HashSet access)
### Tests Updated
**test_rejected_events_index_tracks_announcements:**
- Create RejectedEventsIndex with config durations
- Add 'd' tag to test announcement
- Use `add_announcement()` with full event
- Verify both hot cache and cold index populated
- Check lengths with `hot_cache_len()` and `cold_index_len()`
**test_rejected_events_excluded_from_negentropy:**
- Create RejectedEventsIndex instead of HashSet
- Build full event with 'd' tag
- Add to index with `add_announcement()`
- Get IDs with `get_all_event_ids()`
- Verify excluded from reconciliation
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ SyncManager │
│ │
│ rejected_events_index: Arc<RejectedEventsIndex> │
│ ├─ Hot Cache (2 min): Full events for re-processing │
│ └─ Cold Index (7 days): Metadata for dedup │
└─────────────────────────────────────────────────────────────┘
│
│ On rejection
▼
┌─────────────────────────────────────────────────────────────┐
│ add_announcement(event, pubkey, identifier, reason) │
│ ├─ Store full event in hot cache │
│ └─ Store metadata in cold index │
└─────────────────────────────────────────────────────────────┘
│
│ On negentropy sync
▼
┌─────────────────────────────────────────────────────────────┐
│ get_all_event_ids() → HashSet<EventId> │
│ ├─ Union of hot cache IDs │
│ └─ Union of cold index IDs │
└─────────────────────────────────────────────────────────────┘
```
## Benefits
### Immediate
- **Better tracking**: Store rejection reason + metadata
- **Configurable**: Tune cache/index durations per deployment
- **Observable**: Separate hot/cold metrics (future PR4)
### Future (PR3)
- **Immediate re-processing**: Get events from hot cache when valid
- **No 24h delay**: Maintainer announcements accepted in <1 second
- **Automatic recovery**: Hot cache for immediate, cold index for later
## Backward Compatibility
**No breaking changes:**
- Same rejection behavior (skip events in index)
- Same negentropy exclusion (union with purgatory IDs)
- Default config values match previous implicit behavior
**Migration:**
- Existing deployments continue working with defaults
- Optional: Tune durations via new config flags
## Testing
All tests passing:
- ✅ 9 rejected_index tests (hot cache, cold index, two-tier)
- ✅ 139 sync module tests (including updated integration tests)
- ✅ 247 total library tests
## Next Steps
**PR3: Add invalidation + immediate re-processing**
- Invalidate cold index when owner announcement accepted
- Get events from hot cache for re-processing
- Recursive call to process_event_static
- Integration tests for <1s maintainer acceptance
**PR4: Add cleanup + metrics**
- Hot cache cleanup task (every 60s)
- Cold index cleanup task (daily)
- Prometheus metrics for both tiers
- Monitor hot cache hits vs misses
## Configuration Examples
```bash
# Default (2 min hot cache, 7 day cold index)
ngit-grasp
# Longer hot cache for slow relays
ngit-grasp --rejected-hot-cache-duration-secs 300
# Shorter cold index for memory-constrained systems
ngit-grasp --rejected-cold-index-expiry-secs 86400
# Environment variables
export NGIT_REJECTED_HOT_CACHE_DURATION_SECS=180
export NGIT_REJECTED_COLD_INDEX_EXPIRY_SECS=259200
ngit-grasp
```
Part of: Maintainer chain discovery fix
See: work/SOLUTION-SUMMARY-V2.md for full design
Previous: PR1 (rejected_index.rs implementation)
Next: PR3 (invalidation + re-processing)
|
|
Replace the owner-npub configuration option with relay-owner-nsec to provide
a persistent cryptographic identity for the relay operator. This addresses
NIP-42 authentication requirements discovered during sync debugging.
Motivation:
- Some relays (e.g., relay.damus.io) require NIP-42 authentication for
advanced features like NIP-77 negentropy sync
- Previously used random ephemeral keys per connection, providing no
persistent identity
- Other relays can now recognize us by pubkey for reputation-based rate
limiting
- Ensures consistency between NIP-11 pubkey and authentication key
Changes:
- Config: relay_owner_nsec with auto-load/generate from .relay-owner.nsec
- NIP-11: Pubkey derived from nsec instead of separate npub field
- Sync: RelayConnection now uses operator keys for NIP-42 auth
- Docs: Updated README, .env.example, and added .relay-owner.nsec to gitignore
Key Features:
- Auto-generates key on first run and saves to .relay-owner.nsec
- Loads existing key from file on subsequent runs
- Can override via CLI flag or environment variable
- Enables reputation building across relay network
- Future-ready for event signing and WoT calculations
Testing:
- 225/232 tests passing (7 pre-existing purgatory failures unrelated)
- Verified key generation, loading, and NIP-11 derivation
- Release build successful
Related: work/sync-debug-analysis.md, work/relay-owner-nsec-implementation.md
|
|
|
|
Main lib (src/):
- Add #[allow(dead_code)] for build_info field (stored to prevent Prometheus unregistration)
- Add #[allow(dead_code)] for first_seen field (reserved for future rate limiting)
- Replace .or_insert_with(RelaySyncNeeds::default) with .or_default()
- Replace manual div_ceil implementations with .div_ceil(100)
Test code (tests/):
- Replace .expect(&format!(...)) with .unwrap_or_else(|_| panic!(...))
- Remove needless borrows in fetch_metrics() calls
- Add #[allow(dead_code)] and #[allow(unused_imports)] to test helpers module
grasp-audit:
- Apply cargo fmt to fix formatting
|
|
|
|
Remove 4 config fields that were defined but never used:
- sync_startup_delay_secs
- sync_reconnect_delay_secs
- sync_reconnect_lookback_days
- sync_startup_jitter_ms
These fields were added during GRASP-02 planning but the implementation
took a different approach (using hardcoded constants for quick reconnect
windows and batch window via env var).
|
|
|
|
Changes:
- Fix connection attempt metrics: record success/failure based on actual
connection result instead of pre-emptively recording failure
- Add health tracker integration on connection failure: call
record_failure() and record_health_state() in error path
- Add connection verification in relay_connection.rs: wait 500ms after
connect() then verify is_connected() to detect silent failures
- Add configurable disconnect check interval via
NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS env var
- Update TestRelay with fast test settings: startup_delay=0, jitter=0,
disconnect_check_interval=1s
- Add debug output to metrics tests for investigation
Note: Tests may still fail due to 5-second base backoff in health tracker.
A follow-up task will add NGIT_SYNC_BASE_BACKOFF_SECS config parameter
to allow faster test cycles.
Related: metrics-wiring-plan.md Tasks 1 & 2
|
|
|
|
|
|
- Add NegentropyService for set reconciliation
- Implement startup catchup with warm-up delay
- Implement reconnect catchup (last 3 days)
- Add daily catchup schedule with stagger
|
|
- Add RelayHealthTracker with DashMap
- Implement exponential backoff (5s -> 1h max)
- Handle dead relays (24h failures -> daily retry)
- Add startup jitter to prevent thundering herd
- Add NGIT_SYNC_MAX_BACKOFF_SECS config
|
|
- Add src/sync/ module with SyncManager
- Add NGIT_SYNC_RELAY_URL config option
- Subscribe to kind 30617 on configured relay
- Validate synced events through Nip34WritePolicy
- Integration test with two TestRelay instances
|
|
|
|
|
|
|
|
Add environment variable configuration for database backend selection:
- Added DatabaseBackend enum (memory, nostrdb, lmdb) in src/config.rs
- Updated relay builder to use configured backend in src/nostr/builder.rs
- Added NGIT_DATABASE_BACKEND to .env.example with documentation
- Updated docs/reference/configuration.md with backend comparison table
NostrDB and LMDB backends prepared for future implementation when
nostr-relay-builder adds support. Currently defaults to in-memory
database with warning logs when persistent backends are selected.
|
|
|
|
- WebSocket-based relay using tokio-tungstenite
- Full NIP-01 protocol support (EVENT, REQ, CLOSE)
- Event validation (signature and ID)
- In-memory event storage
- Filter support (IDs, authors, kinds, since/until)
- Configuration via environment variables
- Nix flake for reproducible builds
- Test automation script
All 6 NIP-01 smoke tests passing (100%)
|