Add git ancestry comparison (22-compare-git-data.sh) to determine
commit relationships between prod and archive repos. Repos where
archive is ahead are now correctly classified as ready-for-migration
since ngit-grasp only accepts git data authorized by state events.
Previously, repos with different git data were flagged as needs-resync
even when archive had newer/better data than prod.
|
|
Previously, sync_identifier_from_url passed all needed OIDs to
process_newly_available_git_data, not just the OIDs that were
successfully fetched. This caused incorrect logging (new_oids_count
would show all needed OIDs, not just fetched ones).
While this didn't break functionality (the actual processing uses
can_apply_state which checks the repository on disk), it made
debugging confusing.
Changes:
- Rename oids_fetched to fetched_oids and change type from usize to Vec<String>
- Return Vec<String> from match arms instead of counts
- Pass fetched_oids (not needed_oids) to process_newly_available_git_data
- Return fetched_oids.len() at the end
This ensures logging accurately reflects which OIDs were actually
fetched from the remote.
|
|
|
|
When fetch_oids returns Ok(vec![]) (all requested OIDs missing from
remote), the log message now says 'Fetch returned no OIDs (not available
on remote)' instead of the misleading 'Fetch succeeded' with oids_fetched=0.
|
|
Add retry loop in fetch_oids that handles git's behavior of stopping
at the first missing OID. When a 'not our ref' error occurs:
- Parse the missing OID from stderr
- Remove it from the fetch list and track it as missing
- Retry with remaining OIDs until success or all OIDs exhausted
This ensures we fetch all available OIDs even when some are missing
from the remote, rather than failing the entire batch.
Also improves error reporting:
- Include URL in all error messages for easier debugging
- Log stderr even when domain is already on naughty list
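
The retry loop described above can be sketched as follows. This is a minimal illustration, not the actual fetch_oids code: `try_fetch` is a hypothetical stand-in for running git fetch on a batch, and `FetchOutcome`, `parse_missing_oid` and `fetch_with_retry` are names assumed for the example.

```rust
/// Result of a batch fetch: which OIDs arrived and which the remote lacks.
#[derive(Debug, PartialEq)]
struct FetchOutcome {
    fetched: Vec<String>,
    missing: Vec<String>,
}

/// Extract the OID from stderr such as
/// "fatal: remote error: upload-pack: not our ref <oid>".
fn parse_missing_oid(stderr: &str) -> Option<String> {
    let marker = "not our ref ";
    let start = stderr.find(marker)? + marker.len();
    let oid: String = stderr[start..]
        .chars()
        .take_while(|c| c.is_ascii_hexdigit())
        .collect();
    if oid.is_empty() {
        None
    } else {
        Some(oid)
    }
}

/// Retry until the fetch succeeds or no OIDs are left to request.
/// `try_fetch` (hypothetical) returns Err(stderr) when git fails.
fn fetch_with_retry<F>(mut wanted: Vec<String>, mut try_fetch: F) -> Result<FetchOutcome, String>
where
    F: FnMut(&[String]) -> Result<(), String>,
{
    let mut missing = Vec::new();
    while !wanted.is_empty() {
        match try_fetch(&wanted) {
            Ok(()) => break,
            Err(stderr) => {
                // Only "not our ref" is retried; any other error is returned as-is.
                let oid = parse_missing_oid(&stderr).ok_or(stderr)?;
                let before = wanted.len();
                wanted.retain(|w| *w != oid);
                if wanted.len() == before {
                    // The remote named an OID we never requested; stop rather than loop forever.
                    return Err(format!("unexpected missing oid {}", oid));
                }
                missing.push(oid);
            }
        }
    }
    Ok(FetchOutcome { fetched: wanted, missing })
}
```

Each failed attempt removes exactly one OID, so the loop runs at most once per requested OID plus one final successful attempt.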
|
|
Previously, all git upload-pack/receive-pack failures returned HTTP 500,
but the git smart HTTP protocol requires protocol-level errors (like
"not our ref") to be returned as HTTP 200 OK with an ERR pkt-line in
the response body.
Changes:
- Add build_git_protocol_error_response() to create HTTP 200 responses
with properly formatted ERR pkt-line ("ERR <message>\n")
- Add is_git_protocol_error() to detect protocol errors (exit code 128
with stderr content) vs transport errors
- Update handle_upload_pack() and handle_receive_pack() to return
protocol errors as HTTP 200 with ERR pkt-line
- Keep HTTP 500 for actual transport errors (spawn failures, I/O errors,
signals)
This allows git clients to properly parse and display protocol error
messages instead of seeing generic HTTP 500 errors.
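
For illustration, the ERR pkt-line framing looks roughly like this. The helper names and signatures below are assumptions for the sketch, not the signatures of build_git_protocol_error_response() or is_git_protocol_error().

```rust
/// Encode one pkt-line: a 4-digit lowercase hex length, which counts the
/// 4 length bytes themselves, followed by the payload.
fn pkt_line(payload: &str) -> String {
    format!("{:04x}{}", payload.len() + 4, payload)
}

/// Body of an HTTP 200 response that carries a protocol-level error.
fn git_protocol_error_body(message: &str) -> String {
    pkt_line(&format!("ERR {}\n", message.trim_end()))
}

/// Assumed detection rule from the description above:
/// exit code 128 plus non-empty stderr means a protocol error.
fn looks_like_protocol_error(exit_code: Option<i32>, stderr: &str) -> bool {
    exit_code == Some(128) && !stderr.trim().is_empty()
}
```

The sketch does not enforce the maximum pkt-line length, so very long stderr output would need truncating first.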
|
|
|
|
Change protocol error detection to only match WebSocket-specific errors
(websocket, invalid frame) instead of generic 'protocol' keyword which
was incorrectly catching transient git protocol errors.
Git protocol errors like 'fatal: protocol error: bad line length' are
transient network issues that should use backoff/retry, not permanent
naughty list blocking. Only WebSocket/Nostr protocol violations indicate
persistent infrastructure problems.
Fixes production false positive:
- relay.ngit.dev: git protocol error + remote warning misclassified
Add production test cases for git protocol errors and warning combinations.
|
|
|
|
|
|
Remove --analysis-root flag and external data file dependencies. The script
now extracts repo/npub information directly from 'Added rejected announcement'
log entries (which include pubkey and identifier fields) and uses
`nak encode npub <hex-pubkey>` to convert hex pubkeys to npub format.
This simplification was enabled by the recent logging improvement that added
pubkey to the 'Added rejected announcement' log entries.
|
|
Strip URLs (http://, https://, git://, ws://, wss://) from error messages
before classification to prevent false positives from repository names,
paths, or identifiers containing keywords like 'ssl', 'certificate', etc.
- Add strip_urls() function to remove URLs before pattern matching
- Add WebSocket protocol support (ws://, wss://) for relay errors
- Filter remote warnings that don't indicate infrastructure problems
- Use more specific SSL/TLS patterns to avoid npub substring matches
- Reduce test suite from 40 to 13 tests, keeping only edge cases
Fixes false positives seen in production:
- git.shakespeare.diy: 'repository not found' with npub containing 'ssl'
- relay.ngit.dev: HTTP 500 error with npub containing 'ssl'
- gitnostr.com: remote permission warning misclassified as protocol error
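
A minimal sketch of the strip-then-classify idea, assuming a URL ends at the next whitespace or quote character. The real strip_urls() and its pattern list may differ; `mentions_keyword` is a hypothetical helper, and its plain substring match is less specific than the SSL/TLS patterns mentioned above.

```rust
const SCHEMES: [&str; 5] = ["https://", "http://", "git://", "wss://", "ws://"];

/// Remove every URL so that keywords inside hostnames, paths or npubs
/// cannot influence classification.
fn strip_urls(message: &str) -> String {
    let mut out = String::with_capacity(message.len());
    let mut rest = message;
    // Repeatedly cut out the earliest URL until none remain.
    while let Some(start) = SCHEMES.iter().filter_map(|s| rest.find(*s)).min() {
        out.push_str(&rest[..start]);
        let tail = &rest[start..];
        let end = tail
            .find(|c: char| c.is_whitespace() || c == '\'' || c == '"')
            .unwrap_or(tail.len());
        rest = &tail[end..];
    }
    out.push_str(rest);
    out
}

/// Case-insensitive keyword check that ignores URL contents.
fn mentions_keyword(message: &str, keyword: &str) -> bool {
    strip_urls(message)
        .to_lowercase()
        .contains(&keyword.to_lowercase())
}
```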
|
|
failures
|
|
load_existing_events()
Root cause: `last_connected` was set to Timestamp::now() BEFORE
load_existing_events() was called (line 425), causing the database
query to filter out all existing events with .since(current_time).
The query became: SELECT * FROM events WHERE created_at >= <now>
Result: 0 events returned (nothing has created_at in the future)
Solution: Remove .since() filter from database queries entirely.
The `last_connected` field is now only used for WebSocket subscription
filters to avoid re-fetching events from remote relays on reconnect.
Rationale for this approach over reordering operations:
- Database queries are fast (indexed by kind and created_at)
- Loading all events on startup ensures consistency
- Eliminates subtle ordering dependency that could break in refactoring
- Cleaner mental model: database = full load, WebSocket = incremental
This fixes the issue where ~190 state events weren't being fetched
after deploying the database query fix (commit 4162c90).
Evidence: Production logs showed "Loaded announcements from database
count=0" when there should have been hundreds of announcements.
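
The effect of the ordering bug can be reproduced with a toy filter. This is an illustration only: `load_since` is a hypothetical stand-in for the database query, and timestamps are plain integers.

```rust
/// Return the creation times that a `.since(since)` filter would keep.
/// `None` models the fixed behaviour: no time filter on database loads.
fn load_since(created_at: &[u64], since: Option<u64>) -> Vec<u64> {
    created_at
        .iter()
        .copied()
        .filter(|t| since.map_or(true, |s| *t >= s))
        .collect()
}
```

With `since` set to the current time, every stored event is older than the cutoff and the result is empty, which matches the count=0 seen in production.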
|
|
|
|
Previously, SelfSubscriber only saw events returned by the WebSocket
subscription to the local relay, which has limits on the number of
events returned. This caused repos with announcements in the database
to never get Layer 2/3 filters created, resulting in missing state events.
Now, on startup, we query the database directly with two separate queries:
1. Query announcements (30617) to populate repo_sync_index
2. Query root events (1617/1618/1621) to create Layer 3 filters
Both queries use .since(last_connected) if available for incremental
loading on reconnect.
Filters are created inline and made mutable to support the .since()
clause, rather than using a shared create_event_filter() method.
Fixes the issue where state events were missing for repos like cashbird
and creative-space that had announcements in the database but weren't
returned by the WebSocket subscription.
|
|
Add proper log level configuration following the standard approach:
- CLI flag: --log-level <level>
- Environment variable: NGIT_LOG_LEVEL
- Default: info
- Supports simple levels (error, warn, info, debug, trace)
- Supports filter expressions (e.g., ngit_grasp=debug,actix_web=info)
Configuration is now consistent across all four sources:
1. src/config.rs - Config struct with log_level field
2. docs/reference/configuration.md - Full documentation
3. nix/module.nix - NixOS module with logLevel option
4. .env.example - Example configuration file
This replaces the previous RUST_LOG approach with proper integration
into the ngit-grasp configuration system, enabling trace logging from
CLI, environment variables, or NixOS configuration.
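
As a rough sketch of how the sources combine, assuming the usual precedence of CLI flag over environment variable over default. The commit does not spell out the precedence, and `resolve_log_level` is a hypothetical helper rather than the Config implementation.

```rust
/// Pick the effective log level: --log-level first, then NGIT_LOG_LEVEL,
/// then the default "info". Blank values are treated as unset.
fn resolve_log_level(cli: Option<&str>, env: Option<&str>) -> String {
    cli.into_iter()
        .chain(env)
        .map(str::trim)
        .find(|s| !s.is_empty())
        .unwrap_or("info")
        .to_string()
}
```

The returned string may be a simple level or a filter expression; validating it is left to the logging setup.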
|
|
|
|
|
|
The new script implements the redesigned classification system with:
- Tier 1: No Action Required (complete in both, deleted, empty, archive-only)
- Tier 2: Action Required (complete in prod but missing/incomplete in archive)
- Tier 3: Manual Investigation (partial/no-match in prod, archive-only anomalies)
Produces cleaner output format with actionable categories and reasons.
|
|
Caught a production bug where an npub in a URL string contained "dns",
triggering a false positive.
|
|
Implements the redesigned migration analysis classification system:
Tier 1 - Ready for Migration (no action required):
- Complete in both prod and archive
- Deleted by user (kind 5 event)
- Empty in prod (cat2) - always no action, regardless of archive
- Archive-only (not in prod)
- Purgatory-only (not in prod)
Tier 2 - Needs Re-sync (action required):
- Complete in prod, missing/incomplete in archive
- Includes purgatory context (expired vs never-tried)
Tier 3 - Manual Review (investigation needed):
- Partial in prod (cat3)
- No-match in prod (cat4)
- Parse failures with complete prod
Key fixes:
- Use safe arithmetic ($((x + 1))) instead of ((x++)) with set -e
- Batch nak hex-to-npub conversions for deletion processing
- Handle NDJSON format for deletion files
Output: 352 ready, 295 resync, 46 review (693 total)
|
|
Phase 4 (30-extract-parse-failures.sh) now enriches parse failures with
repo name and npub by looking up event_id in announcements.json. This is
critical because 'Invalid announcement' rejections only log event_id and
kind, not the repo name or npub.
Phase 5 (40-classify-actions.sh) was also fixed to extract columns 4 and 5
(repo|npub) instead of columns 1 and 2 (event_id|kind) from parse-failures.txt.
Without this fix, action-required.txt showed unusable output like:
000014b2... | 30617 | parse failure logged | fix event format...
Now it correctly shows:
scripts | npub1hs5244... | parse failure logged | fix event format...
The enrichment uses jq to build a lookup table from announcements.json and
optionally uses 'nak' to convert hex pubkeys to npub format.
|
|
Filter parse failures to only those for announcements that are in
production but missing from the archive. This eliminates noise from
rejections of events from other relays that don't affect migration.
Before: 223 parse failures (all rejections from all relays)
After: 18 parse failures (only for missing announcements)
The filter works by:
1. Reading missing announcements from comparison data
2. Extracting event IDs from production announcements JSON
3. Filtering parse failures to only matching event IDs
|
|
The script was counting the same invalid announcement twice because:
- Write policy logs use hex event IDs
- Builder logs use note1 (bech32) event IDs
- Deduplication only worked within each format
Fix: Only extract from write policy logs (hex IDs) to avoid the
format mismatch. Builder logs contain the same events, so we don't
lose any data.
Result: 446 entries → 223 unique invalid announcements (correct count)
|
|
Update parse failures script to also extract 'Invalid announcement'
rejections from logs. These are announcement events that failed
validation (e.g., multiple clone tags instead of single tag with
multiple values).
Changes:
- Search for 'Event rejected by write policy' pattern with 'Invalid announcement'
- Search for 'Rejected repository announcement' pattern from builder
- Extract event_id, kind, and reason from rejection logs
- Combine with [PARSE_FAIL] entries in output
- Deduplicate entries by event_id
- Update header to clarify both patterns are captured
- Update migration guide to document this
- Fix SIGPIPE handling in purgatory script (minor)
This captures the ~446 unique announcements rejected for NIP-34 format
violations (multiple clone tags), which were previously unexplained
in the migration analysis.
|
|
Make scripts fully automatic with no manual intervention needed.
Changes:
- Add --no-pager to journalctl commands in validate-service.sh
- Add service existence validation with helpful error messages
- Capture and report journalctl stderr for better error visibility
- Improve error handling without failing on empty logs
The main issue was missing --no-pager in validate-service.sh which
could cause scripts to hang when run non-interactively (e.g., via SSH).
Tested locally - scripts run without hanging and produce correct output.
|
|
Add validation to ensure Phase 4 scripts use ngit-grasp service
(with structured logging) instead of ngit-relay service.
Changes:
- Add validate-service.sh helper for reusable service validation
- Add validation to run-migration-analysis.sh before Phase 4
- Add validation to 30-extract-parse-failures.sh
- Add validation to 31-extract-purgatory-expiry.sh
- Update migration guide with clear warnings about service selection
- Expand troubleshooting for 'Phase 4 finds no logs' issue
- Emphasize lesson learned in relay.ngit.dev notes
This prevents the issue where Phase 4 was run against ngit-relay.service
and found no parse failures because structured logging only exists in
ngit-grasp services.
|
|
- Add Gotchas section with common issues: git installation, localhost-only
archive relays, non-standard git paths, service name variations, and
permission requirements
- Add relay.ngit.dev-specific migration notes with actual paths, service
names, and analysis results (315 repos need re-sync, 382 purgatory expired)
- Enhance Running the Analysis section with path discovery guidance
- Expand Troubleshooting section with solutions for git not found, archive
connection failures, and wrong git paths
- Add git --version to prerequisite checks
- Update examples to use realistic localhost archive URLs
|
|
- 10-check-git-sync.sh: Check for git before running
- run-migration-analysis.sh: Include git in prerequisite checks
- Fixes script failures when git is not installed
|
|
- Rename guide: migrate-ngit-relay-to-ngit-grasp.md → migrate-to-ngit-grasp.md
- Remove ngit-relay and relay.ngit.dev specific references
- Use generic terminology: source/target relay, current implementation
- Add Compatibility section explaining requirements
- Update examples to be implementation-agnostic
- Update script comments to reference GRASP relay (not ngit-relay)
- Update README.md to link to the new guide
Scripts already work with any GRASP implementation via parameters.
|
|
Transforms the guide from a technical reference into a practical
step-by-step guide with:
- Quick Start section at the top with copy-paste commands
- Prerequisites section with verification steps
- Migration Overview explaining the 3-stage process
- Running the Analysis section with all options documented
- Understanding Results section explaining output files
- Troubleshooting section for common issues
- Architecture section (moved from top) for those wanting details
- Next Steps section for post-analysis workflow
The guide now follows a practical flow: get started fast, understand
results, then dive into architecture details if needed.
|
|
Adds run-migration-analysis.sh that orchestrates all 5 phases of the
migration analysis with:
- Parameterized inputs for relay URLs, git paths, and service name
- Phase control (skip, only, from-phase options)
- Dry-run mode to preview execution
- Progress indicators and timing information
- Error handling with continue-on-error option
- Auto-detection of available features (git paths, journalctl)
- Summary display with results overview
|
|
- Combines all data sources from Phases 1-4
- Produces three actionable outputs: no-action, action-required, manual-investigation
- Generates comprehensive summary with recommendations
- Handles missing Phase 4 logs gracefully
- Classification logic for migration decision-making
|
|
- Add [PARSE_FAIL] logging when event parsing fails
- Add [PURGATORY_EXPIRED] logging when repos expire from purgatory
- Logs include: kind, event_id, repo, npub, reason
- Supports Phase 4 migration scripts (30-extract-*.sh)
- All 382 tests pass
|
|
- 30-extract-parse-failures.sh: Extracts parse failure events from logs
- 31-extract-purgatory-expiry.sh: Extracts purgatory expiry events from logs
- Both support time range filtering (--since, --until)
- Includes dry-run mode for testing
- Gracefully handles missing logs with dependency notes
- TSV output format for Phase 5 consumption
- Ready for when structured logging is implemented in ngit-grasp
|
|
- Compares state event refs to actual git data on disk
- Uses git show-ref to handle both loose and packed refs
- Outputs TSV format compatible with Phase 3 categorization
- Optional --categorize flag for inline categorization
- Includes progress indicators and ETA (~20 min runtime on VPS)
- Improved error handling and validation over original script
|
|
- 20-categorize.sh: Categorizes git sync status into 4 categories
- 21-compare-relays.sh: Compares prod vs archive to find gaps
- Updated how-to doc with detailed Phase 3 outputs and directory structure
- Tested with Jan 22 data: 231 complete in both, 276 complete in prod but missing from archive
|
|
- Fetches kind 30618 (state), 30617 (announcement), 5 (deletion) events
- Uses nak req --paginate for complete event retrieval
- Outputs JSONL format for downstream processing
- Includes error handling and timing information
|
|
|
|
|
|
Addresses the problem of empty bare repos misleading clients and sync
downloading refs to deleted repos. Key design points:
- Bare repo created immediately so git pushes can succeed
- Git data arrival triggers promotion to active status
- Expiry extended in two places: state event arrival and git auth
- Indexed by (pubkey, identifier) for correct uniqueness
- Handles replacement announcements and service changes
|
|
When git fetch fails with 'upload-pack: not our ref', git stops at the first
missing OID and doesn't attempt to fetch remaining OIDs. This means if we
request 5 OIDs and the first is missing, we never try the other 4 (which may
exist on the remote).
Changes:
- Parse missing OID from stderr for clearer error messages
- Single OID case: 'remote missing only oid requested: <oid>'
- Multi OID case: Log WARNING and indicate other OIDs weren't attempted
- Identifies the bug that needs retry logic to fetch OIDs individually
|
|
The NIP-11 specification requires the pubkey field to be a 64-character
hex string, but we were incorrectly using npub (bech32) format.
Changes:
- Add Config::relay_owner_pubkey_hex() method to get hex format
- Update NIP-11 document to use hex format instead of npub
- Update test to verify 64-char hex string instead of npub format
Fixes nak relay command error:
'must be a hex string of 64 characters'
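
The format check behind that error can be sketched like this. `to_hex` and `is_hex_pubkey` are illustrative helpers, not the body of Config::relay_owner_pubkey_hex().

```rust
/// Render a 32-byte public key as 64 lowercase hex characters.
fn to_hex(bytes: &[u8; 32]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// True when `s` has the shape NIP-11 expects for `pubkey`:
/// exactly 64 hex digits, so an npub (bech32) string is rejected.
fn is_hex_pubkey(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}
```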
|
|
Modern git clients send Content-Encoding: gzip on POST requests to
/git-upload-pack for efficiency. Without decompression, the compressed
binary data was passed directly to git upload-pack, which expected
pkt-line format, causing:
fatal: protocol error: bad line length character: ??
error: RPC failed; HTTP 500
This was discovered in production when git clone requests consistently
failed with HTTP 500 errors. The fix extracts the Content-Encoding
header and uses flate2::GzDecoder to decompress gzip bodies before
passing them to the git subprocess.
|
|
The main service uses ReadWritePaths for security hardening, but systemd
requires these paths to exist BEFORE setting up the mount namespace.
ExecStartPre runs AFTER namespace setup, so it cannot create the directories.
This fix adds a separate oneshot setup service (ngit-grasp-{name}-setup)
that:
- Runs before the main service without namespace restrictions
- Creates dataDir and subdirectories (git/, relay/) with mkdir -p
- Sets proper ownership (user:group) and permissions (750)
- Uses RemainAfterExit so it only runs once per boot
The main service now depends on the setup service via requires/after.
Fixes: 'Failed to set up mount namespacing: /path: No such file or directory'
|
|
The tmpfiles.rules now explicitly creates the parent directory of dataDir
with root:root ownership and 0755 permissions before creating the
service-owned directories. This ensures the directory hierarchy exists
even if parent directories are missing.
While systemd-tmpfiles should create parent directories automatically,
this makes the behavior explicit and ensures proper permissions on the
immediate parent directory.
|
|
|
|
Refactor internal code to use the mark_negentropy_unsupported() method
instead of direct field access for improved readability.
|
|
When negentropy retry makes no progress (relay returns zero events),
this indicates the relay's negentropy implementation is broken. Instead
of marking the batch as failed, we now:
1. Mark the relay as not supporting NIP-77 so future batches skip
negentropy and use REQ+EOSE directly
2. Fall back to REQ+EOSE using semantic filters (kind/author/tags)
for the current batch, which may succeed where ID-based queries fail
This addresses the issue where some relays (e.g., azzamo.net, snort.social)
return event IDs during negentropy diff but fail to serve those events
when requested by ID.
|
|
|
|
Enables relay operators to backup/archive specific GRASP servers by domain.
Includes configuration, validation, documentation, and integration tests.
|
|
NIP-34 specifies single clone/relays tags with multiple values, not multiple
tags with single values. Update test helper to match spec.
|
|
This reverts commit 70673cf84aad8dfc3413181ffc4ce28809f6f6eb.
|
|
When users import the ngit-grasp flake into their NixOS configurations,
crane emits warnings about not being able to find the package name in
the workspace Cargo.toml. This adds the recommended workspace.metadata
section to explicitly specify the crane package name for the workspace.
Fixes the warning:
evaluation warning: crane will use a placeholder value since name
cannot be found in /nix/store/.../Cargo.toml
This is a cosmetic fix that doesn't affect functionality but improves
the user experience when importing the flake.
|
|
Add ExecStartPre directives to ensure data directories exist before
service starts. This fixes service failures when using custom dataDir
paths that don't exist yet.
The tmpfiles.rules weren't automatically executed during nixos-rebuild
switch, causing 'status=226/NAMESPACE' errors. ExecStartPre runs as
root (+ prefix) to create directories with proper ownership/permissions.
|
|
Changes RED from standard red (\x1b[31m) to bold bright red (\x1b[1;91m)
and GREEN from standard green (\x1b[32m) to bold bright green (\x1b[1;92m).
The bright variants (SGR codes 90-97) are a widely supported aixterm
extension to the ECMA-48 color set, and this matches the practice of
Rust/Cargo and other modern CLI tools. Bold bright colors provide
significantly better readability on dark terminal backgrounds while
remaining compatible with most terminal emulators.
Addresses user feedback that red color was too hard to read.
|
|
Combined Accept and AcceptArchive match arms in builder.rs to ensure
bare repositories are created for both cases. Previously AcceptArchive
had duplicate code that didn't call ensure_bare_repository().
Also includes:
- Config fix: effective_git_data_path() respects explicit paths with memory backend
- TestRelay: Added git_data_path() and archive config support for testing
- Integration tests for archive_read_only behavior
|
|
Increases connection limit across all configuration sources:
- src/config.rs: default_value_t = 4096
- docs/reference/configuration.md: updated default and examples
- nix/module.nix: maxConnections default = 4096
- .env.example: updated default and comment
This allows the relay to handle more concurrent connections and reduces
the likelihood of connection exhaustion under normal load. The previous
limit of 2000 was too conservative for production deployments.
|
|
Implement defensive measures to protect against DoS attacks:
- Add explicit rate limits (500 subscriptions, 60 events/min per connection)
- Add total connection limit (default: 500, configurable via NGIT_MAX_CONNECTIONS)
- Update configuration across all 4 locations (src, nix, docs, .env.example)
Per-IP rate limiting deferred until abuse is detected in production or
implemented in rust-nostr relay-builder to benefit the entire Nostr ecosystem.
Documentation added explaining the defensive features and rationale.
Detailed analysis of other relay implementations preserved in commit history.
|
|
Add comprehensive documentation explaining the defensive features
implemented in ngit-grasp. The detailed analysis of other relay
implementations is now preserved in commit history (e3792b9).
|
|
- Make RateLimit explicit in relay builder (500 subs, 60 events/min)
- Add NGIT_MAX_CONNECTIONS config option (default: 500)
- Update all 4 config locations (src, nix, docs, .env.example)
- Fix documentation error: filter limit 5000→500
- Document Phase 2 deferral decision (per-IP enforcement)
Addresses primary DoS vector (connection exhaustion) with minimal code.
Per-IP rate limiting deferred until abuse detected in production.
Related: issue ff38 (git endpoint throttling - separate concern)
|
|
Comprehensive research on rate limiting and defensive features across major Nostr relay implementations. Documents:
- Current state of ngit-grasp defensive features
- Detailed analysis of strfry, nostr-rs-relay, and khatru
- Concrete defaults and configuration options from each
- Rust rate limiting ecosystem (governor crate)
- Recommendations for ngit-grasp implementation
- Proposed default values and implementation phases
|
|
Implement save/restore functionality for both purgatory state and
rejected events cache. Events are now saved to disk on graceful
shutdown and restored on startup, preventing data loss during
relay restarts.
Key features:
- Purgatory state persisted to JSON (state events, PR events, expired events)
- Rejected events cache persisted (hot cache + cold index)
- Downtime adjustment preserves remaining TTL
- Graceful degradation on missing/corrupted files
- Automatic re-queueing of restored repositories
- Comprehensive test coverage (45 tests)
|
|
shutdown/startup
Implement save/restore functionality for rejected events cache and
integrate persistence with relay shutdown/startup lifecycle. Both
purgatory and rejected cache now survive relay restarts.
Key features:
- Serialize rejected events cache to JSON (rejected-events-cache.json)
- Save both hot cache (2min, full events) and cold index (7day, metadata)
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- Automatic restoration in SyncManager::new()
Integration:
- Shutdown hook saves both purgatory and rejected cache
- Startup hook restores both and re-queues repositories
- Non-fatal errors (logs warnings, continues on failure)
Files:
- src/sync/rejected_index.rs: save_to_disk/restore_from_disk methods
- src/sync/mod.rs: SyncManager integration and auto-restore
- src/main.rs: Shutdown/startup hooks for both caches
- tests/purgatory_persistence.rs: 17 integration tests
Tests: 13 unit tests + 17 integration tests covering full lifecycle
|
|
Implement save/restore functionality for purgatory state to prevent
event loss during relay restarts. Events in purgatory (state events,
PR events, and expired events) are now saved to disk on graceful
shutdown and restored on startup.
Key features:
- Serialize purgatory state to JSON (purgatory-state.json)
- Time conversion helpers for Instant <-> Duration serialization
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- get_all_identifiers() for re-queueing after restore
Files:
- src/purgatory/persistence.rs: Time conversion helpers
- src/purgatory/types.rs: Serialization derives
- src/purgatory/mod.rs: save_to_disk/restore_from_disk methods
Tests: 15 unit tests covering serialization, downtime, edge cases
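
One reading of "preserves remaining TTL" is that the time left at shutdown is stored and re-applied at startup, since an Instant cannot be serialized directly. The sketch below follows that assumption; both helper names are hypothetical and the real persistence.rs helpers may compute the adjustment differently.

```rust
use std::time::{Duration, Instant};

/// Value written to disk: how long the entry still has to live.
/// Already-expired entries saturate to zero instead of panicking.
fn remaining_ttl(expires_at: Instant, now: Instant) -> Duration {
    expires_at.saturating_duration_since(now)
}

/// Value rebuilt on startup: a fresh deadline relative to the new process
/// clock, so time spent offline does not count against the entry.
fn restored_deadline(remaining: Duration, now: Instant) -> Instant {
    now + remaining
}
```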
|
|
The bug: SelfSubscriber filtered announcements with lists_our_relay() check,
preventing archive_all mode from discovering relays in announcements that
don't list our relay domain.
The insight: SelfSubscriber only receives events that ALREADY passed
write policy validation (archive_all, archive_whitelist, blacklist, etc.)
via admit_event() before being saved to the database. The event flow:
External relay → process_event_static() → write_policy.admit_event()
→ (validation happens here) → save to DB → notify_event()
→ SelfSubscriber receives via WebSocket
So the lists_our_relay() check was redundant double-validation that
broke archive_all mode by filtering events that had already been
accepted by the write policy.
The fix: Simply remove the lists_our_relay() filtering. Events reaching
SelfSubscriber are pre-validated and should all be processed for relay
discovery according to the configured archive policy.
Changes:
- Removed lists_our_relay() check from process_notification() (4 lines)
- Removed unused lists_our_relay() helper function (9 lines)
- Added comment explaining events are pre-validated (3 lines)
- Total: 13 lines removed, 3 lines added
Fixes #194d
|
|
The archiveAll and archiveReadOnly options were using toString, which converts
true to "1" and false to an empty string, but the CLI expects "true"/"false" strings.
This caused startup errors like:
error: invalid value '1' for '--archive-all'
[possible values: true, false]
Changed both to use explicit if/then/else conversion to match CLI expectations.
|
|
- Update default bind address in src/config.rs to 127.0.0.1:7334
- Update all four critical config sources per AGENTS.md:
- src/config.rs (code default and tests)
- .env.example (development template)
- docs/reference/configuration.md (user documentation)
- nix/module.nix (NixOS deployment)
- Update all documentation examples and references:
- README.md (with note about phone keypad mnemonic)
- docs/how-to/*.md (deploy, prometheus-setup, test-compliance)
- docs/explanation/*.md (architecture, comparison)
- docs/learnings/grasp-audit.md
Port 7334 spells NGIT on a phone keypad, making it memorable and
project-specific.
All tests pass (336 lib tests + 51 integration tests).
|
|
|
|
Adds NGIT_EVENT_BLACKLIST option for blocking all events from specific npubs,
taking precedence over all other validation to enable comprehensive moderation
without affecting curation policy.
Key features:
- Simple npub-only format: <npub>,<npub>,...
- Checked FIRST before any other validation (including repository blacklist)
- Blocks ALL event types (announcements, state events, PRs, comments, etc.)
- Events never reach relay storage or purgatory
- Specific rejection reason for operator debugging
Implementation:
- Add EventBlacklistConfig struct with check() method
- Add NGIT_EVENT_BLACKLIST config option and event_blacklist_config() method
- Add config field to PolicyContext for policy access
- Add check_event_blacklist() to Nip34WritePolicy
- Check event blacklist first in admit_event() method (before any other validation)
- 4 new unit tests covering all blacklist behavior
Configuration synced across all four sources:
- src/config.rs: Core implementation with EventBlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping
README updates:
- Add comprehensive "Curation & Moderation" section
- Document repository whitelists (GRASP-01 and GRASP-05 modes)
- Document repository and event blacklists with precedence order
- Add configuration table for all curation/moderation settings
- Provide real-world examples for different relay configurations
Testing:
- 4 new tests for event blacklist functionality
- All 336 library tests passing
- All 64 integration tests passing
- All 38 filter support tests passing
Verification:
- Repository blacklist confirmed to apply to sync (uses same admit_event flow)
- Sync events validated through process_event_static -> write_policy.admit_event
Use cases:
- Block spam/abusive users completely
- Prevent malicious actors from submitting any events
- Temporary blocks for investigation
- Moderation without affecting whitelist curation policy
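
A simplified sketch of the blacklist-first ordering. The struct fields, `parse`, and the `admit_event` signature are assumptions for illustration; only a prefix check is done on npubs here, whereas real parsing would decode the bech32 key.

```rust
use std::collections::HashSet;

/// Parsed form of NGIT_EVENT_BLACKLIST ("<npub>,<npub>,...").
struct EventBlacklistConfig {
    npubs: HashSet<String>,
}

impl EventBlacklistConfig {
    /// Split on commas, trim, and keep entries that look like npubs.
    fn parse(raw: &str) -> Self {
        let npubs = raw
            .split(',')
            .map(str::trim)
            .filter(|e| e.starts_with("npub1"))
            .map(String::from)
            .collect();
        EventBlacklistConfig { npubs }
    }

    /// Err carries a specific rejection reason for operator debugging.
    fn check(&self, author_npub: &str) -> Result<(), String> {
        if self.npubs.contains(author_npub) {
            Err(format!("author {} is on the event blacklist", author_npub))
        } else {
            Ok(())
        }
    }
}

/// The blacklist runs before every other check, whatever the event kind.
fn admit_event(
    blacklist: &EventBlacklistConfig,
    author_npub: &str,
    other_checks: impl Fn() -> Result<(), String>,
) -> Result<(), String> {
    blacklist.check(author_npub)?;
    other_checks()
}
```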
|
|
Adds NGIT_REPOSITORY_BLACKLIST option for blocking repositories, taking precedence
over all whitelists (archive and repository) to enable moderation without affecting
curation policy.
Key features:
- Three blacklist formats: <npub>, <npub>/<identifier>, <identifier>
- Blacklist checked first before any other validation
- Overrides archive whitelist and repository whitelist
- Specific rejection reasons based on match type (npub/identifier/both)
- Not flagged in NIP-11 curation (operational, not policy)
Implementation:
- Add BlacklistConfig struct with check() method returning detailed reasons
- Add NGIT_REPOSITORY_BLACKLIST config option and blacklist_config() method
- Update validate_announcement() to check blacklist first with specific reasons
- 12 new unit tests covering all blacklist behavior and precedence
Configuration synced across all four sources:
- src/config.rs: Core implementation with BlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping
Testing:
- 12 new tests for blacklist functionality (config + validation)
- All 332 library tests passing
- All 38 integration tests passing
Use cases:
- Block spam/malware repos by identifier
- Block abusive users by npub
- Block specific problematic repos by npub/identifier
- Temporary blocks for investigation
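The three blacklist formats and the match-specific rejection reasons can be sketched as below. This is a minimal illustration, not the actual BlacklistConfig: the enum, the function names, and the rule that an entry starting with "npub1" is a pubkey are all assumptions made for the example.

```rust
/// Hypothetical model of one blacklist entry (not the real BlacklistConfig).
#[derive(Debug, PartialEq)]
enum BlacklistEntry {
    /// `<npub>`: blocks every repository announced by this pubkey.
    Pubkey(String),
    /// `<npub>/<identifier>`: blocks one repository of one pubkey.
    Repository { npub: String, identifier: String },
    /// `<identifier>`: blocks this identifier for any pubkey.
    Identifier(String),
}

/// Parse a single entry. Assumption: anything starting with "npub1" is a pubkey.
fn parse_entry(raw: &str) -> Option<BlacklistEntry> {
    let raw = raw.trim();
    if raw.is_empty() {
        return None;
    }
    match raw.split_once('/') {
        Some((npub, id)) if npub.starts_with("npub1") && !id.is_empty() => {
            Some(BlacklistEntry::Repository {
                npub: npub.to_string(),
                identifier: id.to_string(),
            })
        }
        // A slash with a missing or non-npub owner is treated as malformed.
        Some(_) => None,
        None if raw.starts_with("npub1") => Some(BlacklistEntry::Pubkey(raw.to_string())),
        None => Some(BlacklistEntry::Identifier(raw.to_string())),
    }
}

/// Return a rejection reason for the first matching entry, or None if allowed.
fn check(entries: &[BlacklistEntry], npub: &str, identifier: &str) -> Option<String> {
    entries.iter().find_map(|entry| match entry {
        BlacklistEntry::Pubkey(p) if p == npub => {
            Some(format!("blacklisted npub: {p}"))
        }
        BlacklistEntry::Repository { npub: p, identifier: i }
            if p == npub && i == identifier =>
        {
            Some(format!("blacklisted repository: {p}/{i}"))
        }
        BlacklistEntry::Identifier(i) if i == identifier => {
            Some(format!("blacklisted identifier: {i}"))
        }
        _ => None,
    })
}
```

Calling check() before any whitelist logic is what gives the blacklist precedence: a Some(reason) result short-circuits validation regardless of what the whitelists say.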
|
|
config methods
Refactors configuration validation to fail fast on fatal errors at startup
while gracefully handling recoverable issues (e.g., malformed whitelist entries).
Changes:
- Add Config::validate() for eager validation called immediately after load
- Remove Result<> from archive_config() and repository_config() methods
- WhitelistEntry::parse_whitelist() skips invalid entries with warnings
- Validate relay_owner_nsec format in Config::validate()
- Update all call sites to remove Result handling from config getters
Benefits:
- Fatal config errors (incompatible settings) fail at startup, not runtime
- Recoverable errors (bad whitelist entries) logged as warnings and skipped
- No Result handling scattered throughout runtime code after validation
- Config methods safe to call without error handling after validate()
Testing:
- Add 7 new tests for validation edge cases and error handling
- Total config tests: 40 (up from 33)
- All 320 library tests passing
Breaking change: Config users must call config.validate() after Config::load()
to ensure configuration is valid. This is enforced in main.rs.
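The split between fatal and recoverable errors can be sketched as follows. This is an illustration under stated assumptions, not the real Config: the field names, the entry validity rule, and the use of the whitelist/read-only conflict as the example fatal error are all hypothetical.

```rust
/// Hypothetical, trimmed-down config used only for this sketch.
struct Config {
    repository_whitelist: Option<String>,
    archive_read_only: bool,
}

impl Config {
    /// Eager validation: fatal, incompatible settings fail here at startup.
    fn validate(&self) -> Result<(), String> {
        if self.repository_whitelist.is_some() && self.archive_read_only {
            return Err(
                "repository whitelist cannot be combined with archive read-only mode"
                    .to_string(),
            );
        }
        Ok(())
    }

    /// Infallible getter: malformed entries are logged and skipped,
    /// so callers need no Result handling after validate() succeeded.
    fn repository_entries(&self) -> Vec<String> {
        let raw = self.repository_whitelist.as_deref().unwrap_or("");
        raw.split(',')
            .map(str::trim)
            .filter(|entry| !entry.is_empty())
            .filter(|entry| {
                let ok = is_valid_entry(entry);
                if !ok {
                    eprintln!("warning: skipping invalid whitelist entry: {entry}");
                }
                ok
            })
            .map(str::to_string)
            .collect()
    }
}

/// Assumed validity rule for the sketch: no whitespace, and both sides
/// of an `owner/identifier` pair must be non-empty.
fn is_valid_entry(entry: &str) -> bool {
    match entry.split_once('/') {
        Some((owner, id)) => !owner.is_empty() && !id.is_empty(),
        None => !entry.contains(char::is_whitespace),
    }
}
```

The design choice is that validate() is the only place that returns an error, which keeps runtime code free of configuration error paths.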
|
|
Adds NGIT_REPOSITORY_WHITELIST option for curated relay operation that
accepts only whitelisted repositories while maintaining GRASP-01 compliance
(announcements must list the service). This differs from archive whitelist
which enables GRASP-05 mode and doesn't require service listing.
Key features:
- Supports three whitelist formats: npub, npub/identifier, identifier
- Enforces mutual exclusivity with archive read-only mode
- Updates NIP-11 curation field when whitelist is enabled
- Maintains GRASP-01 compliance (doesn't add GRASP-05 support)
Configuration synced across all four sources: src/config.rs, docs/reference/configuration.md,
nix/module.nix, and .env.example as required by AGENTS.md.
|
|
Implements NGIT_ARCHIVE_READ_ONLY configuration option that defaults to true
when archive mode is enabled, allowing relays to operate as read-only syncs
of archived repositories.
Key changes:
- Add NGIT_ARCHIVE_READ_ONLY config option (defaults to true if archive enabled)
- NIP-11 advertises GRASP-05 support and includes curation field when read-only
- Validation logic rejects non-whitelisted repos in read-only mode
- Comprehensive tests for read-only behavior and defaults
- Full documentation in config reference, .env.example, and NixOS module
Read-only mode enables passive mirroring without being listed in announcements,
useful for backup/archive operations while preventing accidental write acceptance.
|
|
Implements GRASP-05 specification for accepting repository announcements
that don't list this relay, enabling archive, mirror, and backup use cases.
Core Features:
- Three whitelist formats: <npub>, <npub>/<identifier>, <identifier>
- Archive-all mode for complete ecosystem mirrors
- Fail-fast npub validation at startup
- Read-only enforcement (archived repos reject pushes)
- Full GRASP-02 sync (git data + Nostr events)
- Dynamic archive status (no flags/metadata)
Implementation:
- Add ArchiveWhitelistEntry enum with Pubkey/Repository/Identifier variants
- Add ArchiveConfig with validation and matching logic
- Update AnnouncementResult to include AcceptArchive variant
- Refactor validate_announcement() to return AnnouncementResult with archive check
- Update AnnouncementPolicy with catch-all pattern for cleaner code
- Wire archive config through builder and policy layers
Configuration:
- NGIT_ARCHIVE_ALL: Accept all announcements (⚠️ storage risk)
- NGIT_ARCHIVE_WHITELIST: Comma-separated whitelist entries
- Updated docs, .env.example, and nix/module.nix
Testing:
- 28 unit tests for config parsing and whitelist matching
- 7 integration tests for archive mode validation
- All 296 tests passing
Validation Priority:
1. Lists our service → Accept (GRASP-01, read/write)
2. Is maintainer → AcceptMaintainer (multi-maintainer, read/write)
3. Matches archive config → AcceptArchive (GRASP-05, read-only)
4. None of above → Reject
Security Considerations:
- Archive-all mode has storage/bandwidth DoS risk
- Identifier-only format matches any pubkey (use npub/identifier for high-value repos)
- Invalid npubs cause startup failure (fail-fast)
Documentation:
- Concise explanation focused on rationale
- Reference docs updated with all config options
- README updated to reflect completed feature
- Removed from roadmap, added to compliance section
See docs/explanation/grasp-05-archive.md for details.
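The validation priority listed above can be sketched as a simple ordered check. The variant names follow this commit message; the AnnouncementFacts input struct and the accepts_pushes helper are hypothetical stand-ins for whatever the real validate_announcement() inspects.

```rust
/// Outcome of validating a repository announcement.
#[derive(Debug, PartialEq)]
enum AnnouncementResult {
    Accept,           // GRASP-01, read/write
    AcceptMaintainer, // multi-maintainer, read/write
    AcceptArchive,    // GRASP-05, read-only
    Reject,
}

/// Hypothetical pre-computed facts about an announcement.
struct AnnouncementFacts {
    lists_our_service: bool,
    author_is_maintainer: bool,
    matches_archive_config: bool,
}

/// Apply the checks in priority order; the first match wins.
fn validate_announcement(facts: &AnnouncementFacts) -> AnnouncementResult {
    if facts.lists_our_service {
        AnnouncementResult::Accept
    } else if facts.author_is_maintainer {
        AnnouncementResult::AcceptMaintainer
    } else if facts.matches_archive_config {
        AnnouncementResult::AcceptArchive
    } else {
        AnnouncementResult::Reject
    }
}

/// Archived repositories are read-only, so only the first two outcomes allow pushes.
fn accepts_pushes(result: &AnnouncementResult) -> bool {
    matches!(
        result,
        AnnouncementResult::Accept | AnnouncementResult::AcceptMaintainer
    )
}
```

Because the archive check comes last, a repository that lists this service stays read/write even if it also matches the archive whitelist.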
|
|
|
|
Add GRASP-02 to supported_grasps array in NIP-11 relay information
document to advertise proactive sync capability to clients and tools.
|
|
Add comprehensive GRASP-01 compliance tests for uploadpack.allowFilter
capability to the grasp-audit test suite. These tests can be run against
ANY GRASP implementation (ngit-relay, ngit-grasp, or others) to verify
filter support.
New test module: grasp-audit/src/specs/grasp01/git_filter.rs
Tests added:
- test_filter_capability_advertised: Verifies filter appears in info/refs
- test_filtered_clone_succeeds: Tests git clone --filter=blob:none
- test_filtered_fetch_succeeds: Tests git fetch --filter=tree:0
Usage:
cd grasp-audit && nix develop -c bash test-ngit-relay.sh --mode test
cd grasp-audit && nix develop -c cargo run -- audit -r ws://localhost:8080 -s git-filter
|
|
Add mandatory uploadpack.allowFilter capability to support partial clones
and fetches as required by GRASP-01 specification. This enables efficient
git operations for bandwidth-constrained clients (e.g., browser-based git
clients like git-natural-api).
Changes:
- Add uploadpack.allowFilter=true to git subprocess configuration
- Update SmartGitServer test helper with filter support
- Add integration tests for filter capability advertisement and functionality
- Update documentation to reflect filter as required capability
Tests verify:
- Filter capability is advertised in info/refs
- Filtered clones with blob:none work correctly
- Filtered fetches with tree:0 work correctly
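One way to enable the filter capability for a git subprocess is a per-invocation `-c` override, sketched below. The helper name and the exact argument list are assumptions for illustration; the real subprocess setup in ngit-grasp may pass the setting differently.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a `git upload-pack` command with partial-clone
/// filters enabled for this invocation only.
fn upload_pack_command(repo_dir: &Path) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg("-c")
        .arg("uploadpack.allowFilter=true") // lets clients use --filter=blob:none etc.
        .arg("upload-pack")
        .arg("--stateless-rpc") // smart HTTP mode
        .arg(repo_dir);
    cmd
}
```

Passing the option with `-c` avoids writing to each repository's config file, so existing repositories pick up the capability without migration.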
|
|
Previously, purgatory sync was using '--depth=1' when fetching OIDs from
remote servers. This created shallow clones with only 1-2 commits instead
of the complete git history.
The fix removes the '--depth=1' flag, allowing git to fetch the complete
commit history chain when fetching specific commit OIDs. This is the
correct behavior for GRASP - users cloning from our relay should get the
full repository history.
Changes:
- Remove '--depth=1' from git fetch command in RealSyncContext::fetch_oids
- Update comment to clarify that full history is fetched
Impact:
- Production repositories will now contain full git history
- Users cloning from the relay will get complete commit chains
- No more 'shallow' files in git repositories
- May be slightly slower due to fetching more data, but correctness is prioritized
Testing:
- All 564 tests pass (276 unit + 288 integration)
- No regressions in existing functionality
Fixes issue documented in work/active-issues/shallow-git-fetch.md
|
|
Implements ngit_repositories_total metric by counting *.git directories
on disk every time /metrics is requested (about every 15s when scraped by Prometheus).
This approach is simpler than increment-on-create because:
- No need to pass metrics through the relay builder chain
- Always accurate and self-correcting
- Negligible performance impact (~100-200 dir entries)
Changes:
- Add count_repositories_on_disk() static method to Metrics
- Update Metrics::render() to count repos before encoding metrics
- Pass git_data_path to Metrics::new() in main.rs
- Consolidate metrics tests to avoid global Prometheus registry conflicts
Fixes repository count metric issue from Phase 8 deployment plan.
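A count-on-request approach like this can be sketched with the standard library alone. This is not the actual Metrics method: the free-function form and the recursive walk (which assumes repositories may sit in nested owner directories) are assumptions for the example.

```rust
use std::fs;
use std::path::Path;

/// Count directories whose name ends in `.git` under `root`.
/// Directories ending in `.git` are counted and not descended into;
/// unreadable or missing directories contribute zero.
fn count_repositories_on_disk(root: &Path) -> usize {
    let Ok(entries) = fs::read_dir(root) else {
        return 0;
    };
    let mut count = 0;
    for entry in entries.flatten() {
        let path = entry.path();
        if !path.is_dir() {
            continue; // plain files never count, even if named *.git
        }
        if path.extension().map_or(false, |ext| ext == "git") {
            count += 1;
        } else {
            count += count_repositories_on_disk(&path);
        }
    }
    count
}
```

Since the value is recomputed from the filesystem on each request, it cannot drift after a crash or a manual deletion, which is the self-correcting property the commit message refers to.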
|
|
reading
- Add coreutils to systemd service PATH so cat command is available
- Use absolute path for cat in ExecStart for reliability
- Fixes startup panic: relay_owner_keys should be available: Invalid relay_owner_nsec
- Fixes: cat: command not found error in systemd logs
This ensures the nsec file can be read properly during service startup,
allowing the sync manager to initialize correctly with relay owner authentication.
|
|
When relay_owner_nsec is provided via CLI argument or environment
variable (e.g., read from a file by the NixOS module), trim any
leading/trailing whitespace including newlines. This matches the
behavior when reading from the .relay-owner.nsec file directly.
Fixes issue where NixOS module reads nsec file with 'cat', which
includes the trailing newline, making the nsec invalid when passed
as a CLI argument.
Also reverted the tr workaround in nix/module.nix since ngit-grasp
now handles this correctly.
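The normalization amounts to trimming the raw value before it is parsed as a key. A minimal sketch, with a hypothetical helper name and no real nsec decoding:

```rust
/// Strip leading/trailing whitespace (including the newline that `cat`
/// leaves behind) and treat an empty result as "not provided".
fn normalize_nsec(raw: &str) -> Option<String> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}
```

Applying the same function to the CLI argument, the environment variable, and the file contents keeps all three input paths consistent.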
|
|
When reading the nsec from a file, strip any trailing newline
characters that would invalidate the nsec string. Use tr -d to
remove all newline characters from the file content before passing
to ngit-grasp.
|
|
ngit-grasp requires git and ssh binaries in PATH to clone repositories
during purgatory sync operations. Without these in the systemd service
environment, all git fetch operations fail with 'No target repo found'.
This fix adds git and openssh to the service PATH via systemd's
Environment directive, allowing purgatory to successfully clone
repositories from remote URLs.
|
|
systemd's ExecStart doesn't execute shell commands by default, so the
command substitution was being passed literally to ngit-grasp
instead of being evaluated. This caused a panic at startup when using
relayOwnerNsecFile option.
Wrap the command in bash -c to properly execute the file read.
|
|
- Add new how-to guide covering hash updates for git dependencies
- Applies to any git dependency (e.g., nostr-sdk fork)
- Add critical note in AGENTS.md linking to this guide
- Emphasize that hash updates in both flake.nix and nix/module.nix are MANDATORY
|
|
The preStart script was trying to chown directories but running as an
unprivileged user, causing permission errors. Instead, use systemd
tmpfiles.rules which run as root during system activation.
This ensures data directories are created with correct ownership before
the service starts.
|
|
Simplified approach: disable tests entirely during Nix package build.
Many tests require git in PATH which isn't available in the Nix sandbox:
- Unit tests that spawn git subprocesses (src/git/)
- Integration tests that create git repos (tests/*)
- Grasp-audit spec tests (grasp-audit/src/specs/)
All tests run successfully in environments with git:
- Local dev: nix develop (includes git)
- CI/CD: git installed in runners
- Manual: cargo test (uses system git)
This is a pragmatic solution for deployment - the binary itself
doesn't need git (it's only for testing git interaction).
|
|
Changed from selectively skipping test modules to running only --lib
tests (unit tests). This is cleaner and more maintainable.
Integration tests (tests/*.rs) require:
- git binary in PATH
- Ability to spawn subprocesses
- Network access for some tests
- TestRelay fixture (spawns ngit-grasp)
These requirements don't work in the Nix sandbox, so we run only
unit tests (--lib) during package build. Full integration test suite
runs in environments where git is available:
- Local dev (nix develop includes git)
- CI/CD (git installed)
- Manual testing (cargo test runs all tests)
|
|
Extended test skipping to include integration tests in tests/common/
that create git repos and spawn git processes:
- common::git_server:: - Tests that create git repos and run git daemon
- common::purgatory_helpers:: - Helper tests that init git repos
These are integration tests that verify git interaction; they
run successfully in:
- Local development (git available in devShell)
- CI/CD pipelines (git installed)
- Docker builds (git installed in image)
The Nix sandbox intentionally isolates builds and doesn't provide git
during the package build phase. We skip these tests to allow clean
builds while maintaining test coverage in appropriate environments.
|
|
Tests that spawn git subprocesses fail in the Nix sandbox because
git is not available in PATH during the build phase. These tests
are integration tests that verify git subprocess interaction, not
unit tests of core functionality.
Skipping test modules:
- git::subprocess::tests - Tests git upload-pack/receive-pack spawning
- git::tests - Tests that create git repos and manipulate refs
- purgatory::helpers::tests - Tests that init git repos
The skipped tests still run in:
- Local development (git is in devShell)
- CI/CD pipelines (git is installed)
- Integration test suite (uses TestRelay fixture)
This fix allows the package to build cleanly in Nix while maintaining
test coverage in appropriate environments.
|
|
The hash for the nostr-0.44.1 dependency was in Nix base32 format
(sha256-02cawkx...) but needs to be in SRI base64 format
(sha256-DwcWmwxNUQRR...) for compatibility with modern Nix.
This was causing nixos-rebuild to fail with:
error: invalid SRI hash '02cawkx6bxfi3bn1sb5ws8cn9wzcwsk8cdv1vx8h8lad1jdic1qg'
|
|
- Complete guide for deploying ngit-grasp to NixOS servers
- Step-by-step deployment instructions
- Configuration options reference
- Troubleshooting section
- Security hardening recommendations
- Multiple instance examples
- References nix/example-configuration.nix which has clear examples
|
|
Removed outdated options that don't exist in code:
- NGIT_SYNC_STARTUP_DELAY_SECS
- NGIT_SYNC_RECONNECT_DELAY_SECS
- NGIT_SYNC_RECONNECT_LOOKBACK_DAYS
- NGIT_SYNC_STARTUP_JITTER_MS
- NGIT_ARCHIVE_MODE (future/planned)
Added missing options that exist in code:
- NGIT_SYNC_DISCONNECT_CHECK_INTERVAL_SECS
- NGIT_SYNC_BASE_BACKOFF_SECS
- NGIT_SYNC_DISABLE_NEGENTROPY
- NGIT_REJECTED_HOT_CACHE_DURATION_SECS
- NGIT_REJECTED_COLD_INDEX_EXPIRY_SECS
- NGIT_NAUGHTY_LIST_EXPIRATION_HOURS
All environment variables now match exactly between src/config.rs
and .env.example, with consistent defaults and descriptions.
|
|
- Add Configuration Management section documenting 4-way sync
- Config must be consistent across: src/config.rs, docs/reference/configuration.md, nix/module.nix, and .env.example
- Include complete example showing all four formats
- Add to Critical Gotchas list (#8)
- Ensures .env.example stays accurate for development and Docker deployments
|
|
- Convert module from single service to attrsOf instances
- Each instance gets separate systemd service: ngit-grasp-<name>
- Each instance gets separate user: ngit-grasp-<name> (customizable)
- Default dataDir per instance: /var/lib/ngit-grasp-<name>
- Update example to show single and multiple instance configs
- Add notes on systemd service management per instance
|
|
- Create nix/module.nix with comprehensive systemd service
- Support both relayOwnerNsecFile and relayOwnerNsec options
- Auto-generate nsec if neither specified
- Add security hardening (NoNewPrivileges, ProtectSystem, etc.)
- Expose as nixosModules.default and nixosModules.ngit-grasp
- Include example configuration in nix/example-configuration.nix
- Add outputHashes for nostr git dependency
|