upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/src
Age | Commit message | Author

2026-01-21 | feat: add archive-grasp-services configuration option | DanConwayDev
Enables relay operators to back up or archive specific GRASP servers by domain. Includes configuration, validation, documentation, and integration tests.

2026-01-21 | fix: create_announcement_event test helper uses correct NIP-34 tag format | DanConwayDev
NIP-34 specifies single clone/relays tags with multiple values, not multiple tags with single values. Update test helper to match spec.
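The two tag shapes can be sketched as follows. This is a minimal illustration using plain `Vec<String>` tags rather than any Nostr library's types; the function names and URLs are hypothetical and are not taken from the test helper itself.

```rust
/// Build one tag holding every value, e.g. ["clone", url1, url2].
/// This is the shape the commit message describes for `clone` and `relays`.
fn multi_value_tag(name: &str, values: &[&str]) -> Vec<String> {
    let mut tag = Vec::with_capacity(values.len() + 1);
    tag.push(name.to_string());
    tag.extend(values.iter().map(|v| v.to_string()));
    tag
}

/// The shape the commit message says is wrong: one tag per value,
/// e.g. ["clone", url1] and ["clone", url2].
fn one_tag_per_value(name: &str, values: &[&str]) -> Vec<Vec<String>> {
    values
        .iter()
        .map(|v| vec![name.to_string(), v.to_string()])
        .collect()
}
```

With two clone URLs, `multi_value_tag` yields a single three-element tag, while `one_tag_per_value` yields two two-element tags.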

2026-01-19 | fix: archive_read_only creates bare repos for archived announcements | DanConwayDev
Combined Accept and AcceptArchive match arms in builder.rs to ensure bare repositories are created for both cases. Previously AcceptArchive had duplicate code that didn't call ensure_bare_repository().

Also includes:
- Config fix: effective_git_data_path() respects explicit paths with memory backend
- TestRelay: Added git_data_path() and archive config support for testing
- Integration tests for archive_read_only behavior

2026-01-19 | config: increase max_connections default from 2000 to 4096 | DanConwayDev
Increases connection limit across all configuration sources:
- src/config.rs: default_value_t = 4096
- docs/reference/configuration.md: updated default and examples
- nix/module.nix: maxConnections default = 4096
- .env.example: updated default and comment

This allows the relay to handle more concurrent connections and reduces the likelihood of connection exhaustion under normal load. The previous limit of 2000 was too conservative for production deployments.

2026-01-14 | Add explicit rate limits and total connection limit | DanConwayDev
- Make RateLimit explicit in relay builder (500 subs, 60 events/min)
- Add NGIT_MAX_CONNECTIONS config option (default: 500)
- Update all 4 config locations (src, nix, docs, .env.example)
- Fix documentation error: filter limit 5000→500
- Document Phase 2 deferral decision (per-IP enforcement)

Addresses primary DoS vector (connection exhaustion) with minimal code. Per-IP rate limiting deferred until abuse detected in production.

Related: issue ff38 (git endpoint throttling - separate concern)

2026-01-14 | feat(sync): add rejected events cache persistence and integrate with shutdown/startup | DanConwayDev
Implement save/restore functionality for rejected events cache and integrate persistence with relay shutdown/startup lifecycle. Both purgatory and rejected cache now survive relay restarts.

Key features:
- Serialize rejected events cache to JSON (rejected-events-cache.json)
- Save both hot cache (2min, full events) and cold index (7day, metadata)
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- Automatic restoration in SyncManager::new()

Integration:
- Shutdown hook saves both purgatory and rejected cache
- Startup hook restores both and re-queues repositories
- Non-fatal errors (logs warnings, continues on failure)

Files:
- src/sync/rejected_index.rs: save_to_disk/restore_from_disk methods
- src/sync/mod.rs: SyncManager integration and auto-restore
- src/main.rs: Shutdown/startup hooks for both caches
- tests/purgatory_persistence.rs: 17 integration tests

Tests: 13 unit tests + 17 integration tests covering full lifecycle

2026-01-14 | feat(purgatory): add persistence to survive relay restarts | DanConwayDev
Implement save/restore functionality for purgatory state to prevent event loss during relay restarts. Events in purgatory (state events, PR events, and expired events) are now saved to disk on graceful shutdown and restored on startup.

Key features:
- Serialize purgatory state to JSON (purgatory-state.json)
- Time conversion helpers for Instant <-> Duration serialization
- Restore with downtime adjustment (preserves remaining TTL)
- Graceful degradation (missing/corrupted files don't crash)
- File cleanup after successful restore
- get_all_identifiers() for re-queueing after restore

Files:
- src/purgatory/persistence.rs: Time conversion helpers
- src/purgatory/types.rs: Serialization derives
- src/purgatory/mod.rs: save_to_disk/restore_from_disk methods

Tests: 15 unit tests covering serialization, downtime, edge cases
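The "downtime adjustment" idea can be sketched with the standard library alone. `Instant` has no stable on-disk representation, so one plausible approach is to store the remaining time plus the wall-clock time of the save, then subtract the downtime on restore. The struct and function names below are hypothetical and are not the helpers in src/purgatory/persistence.rs; whole-second precision is an assumption made to keep the example short.

```rust
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// Serializable stand-in for an `Instant` deadline (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct SavedDeadline {
    remaining_secs: u64,
    saved_at_unix_secs: u64,
}

/// Seconds since the Unix epoch, or 0 if the clock is before the epoch.
fn unix_secs(t: SystemTime) -> u64 {
    t.duration_since(UNIX_EPOCH).unwrap_or(Duration::ZERO).as_secs()
}

/// Convert an in-memory deadline into something that can be written to disk.
fn save_deadline(expires_at: Instant, now: Instant, wall_clock: SystemTime) -> SavedDeadline {
    SavedDeadline {
        remaining_secs: expires_at.saturating_duration_since(now).as_secs(),
        saved_at_unix_secs: unix_secs(wall_clock),
    }
}

/// Rebuild the deadline after a restart, shortening it by the time spent down.
/// Returns `None` if the entry expired while the process was not running.
fn restore_deadline(saved: &SavedDeadline, now: Instant, wall_clock: SystemTime) -> Option<Instant> {
    let downtime = unix_secs(wall_clock).saturating_sub(saved.saved_at_unix_secs);
    let remaining = saved.remaining_secs.checked_sub(downtime)?;
    if remaining == 0 {
        return None;
    }
    Some(now + Duration::from_secs(remaining))
}
```

An entry saved with 120 seconds left and restored 30 seconds later comes back with 90 seconds left; an entry restored after its TTL has elapsed is dropped.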

2026-01-13 | fix: Enable sync relay discovery in archive_all mode | DanConwayDev
The bug: SelfSubscriber filtered announcements with lists_our_relay() check, preventing archive_all mode from discovering relays in announcements that don't list our relay domain.

The insight: SelfSubscriber only receives events that ALREADY passed write policy validation (archive_all, archive_whitelist, blacklist, etc.) via admit_event() before being saved to the database.

The event flow:
External relay → process_event_static() → write_policy.admit_event() → (validation happens here) → save to DB → notify_event() → SelfSubscriber receives via WebSocket

So the lists_our_relay() check was redundant double-validation that broke archive_all mode by filtering events that had already been accepted by the write policy.

The fix: Simply remove the lists_our_relay() filtering. Events reaching SelfSubscriber are pre-validated and should all be processed for relay discovery according to the configured archive policy.

Changes:
- Removed lists_our_relay() check from process_notification() (4 lines)
- Removed unused lists_our_relay() helper function (9 lines)
- Added comment explaining events are pre-validated (3 lines)
- Total: 13 lines removed, 3 lines added

Fixes #194d

2026-01-12 | Change default port from 8080 to 7334 (NGIT on phone keypad) | DanConwayDev
- Update default bind address in src/config.rs to 127.0.0.1:7334
- Update all four critical config sources per AGENTS.md:
  - src/config.rs (code default and tests)
  - .env.example (development template)
  - docs/reference/configuration.md (user documentation)
  - nix/module.nix (NixOS deployment)
- Update all documentation examples and references:
  - README.md (with note about phone keypad mnemonic)
  - docs/how-to/*.md (deploy, prometheus-setup, test-compliance)
  - docs/explanation/*.md (architecture, comparison)
  - docs/learnings/grasp-audit.md

Port 7334 spells NGIT on a phone keypad, making it memorable and project-specific.

All tests pass (336 lib tests + 51 integration tests).

2026-01-12 | feat(config): add event blacklist to block all events from specific authors | DanConwayDev
Adds NGIT_EVENT_BLACKLIST option for blocking all events from specific npubs, taking precedence over all other validation to enable comprehensive moderation without affecting curation policy.

Key features:
- Simple npub-only format: <npub>,<npub>,...
- Checked FIRST before any other validation (including repository blacklist)
- Blocks ALL event types (announcements, state events, PRs, comments, etc.)
- Events never reach relay storage or purgatory
- Specific rejection reason for operator debugging

Implementation:
- Add EventBlacklistConfig struct with check() method
- Add NGIT_EVENT_BLACKLIST config option and event_blacklist_config() method
- Add config field to PolicyContext for policy access
- Add check_event_blacklist() to Nip34WritePolicy
- Check event blacklist first in admit_event() method (before any other validation)
- 4 new unit tests covering all blacklist behavior

Configuration synced across all four sources:
- src/config.rs: Core implementation with EventBlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping

README updates:
- Add comprehensive "Curation & Moderation" section
- Document repository whitelists (GRASP-01 and GRASP-05 modes)
- Document repository and event blacklists with precedence order
- Add configuration table for all curation/moderation settings
- Provide real-world examples for different relay configurations

Testing:
- 4 new tests for event blacklist functionality
- All 336 library tests passing
- All 64 integration tests passing
- All 38 filter support tests passing

Verification:
- Repository blacklist confirmed to apply to sync (uses same admit_event flow)
- Sync events validated through process_event_static -> write_policy.admit_event

Use cases:
- Block spam/abusive users completely
- Prevent malicious actors from submitting any events
- Temporary blocks for investigation
- Moderation without affecting whitelist curation policy

2026-01-12 | feat(config): add repository blacklist to block specific repos/npubs/identifiers | DanConwayDev
Adds NGIT_REPOSITORY_BLACKLIST option for blocking repositories, taking precedence over all whitelists (archive and repository) to enable moderation without affecting curation policy.

Key features:
- Three blacklist formats: <npub>, <npub>/<identifier>, <identifier>
- Blacklist checked first before any other validation
- Overrides archive whitelist and repository whitelist
- Specific rejection reasons based on match type (npub/identifier/both)
- Not flagged in NIP-11 curation (operational, not policy)

Implementation:
- Add BlacklistConfig struct with check() method returning detailed reasons
- Add NGIT_REPOSITORY_BLACKLIST config option and blacklist_config() method
- Update validate_announcement() to check blacklist first with specific reasons
- 12 new unit tests covering all blacklist behavior and precedence

Configuration synced across all four sources:
- src/config.rs: Core implementation with BlacklistConfig
- .env.example: Comprehensive documentation with examples
- docs/reference/configuration.md: Complete reference documentation
- nix/module.nix: NixOS module option with environment mapping

Testing:
- 12 new tests for blacklist functionality (config + validation)
- All 332 library tests passing
- All 38 integration tests passing

Use cases:
- Block spam/malware repos by identifier
- Block abusive users by npub
- Block specific problematic repos by npub/identifier
- Temporary blocks for investigation
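The three entry formats and the match-specific rejection reasons could look roughly like this. It is a sketch, not the BlacklistConfig in src/config.rs: the enum, the function names, and the reason strings are hypothetical, and an npub is recognised only by its `npub1` prefix here, whereas a real implementation would decode and validate the bech32 key.

```rust
/// One parsed blacklist entry (hypothetical type and variant names).
#[derive(Debug, PartialEq)]
enum BlacklistEntry {
    Pubkey(String),
    Repository { npub: String, identifier: String },
    Identifier(String),
}

/// Parse `<npub>`, `<npub>/<identifier>` or `<identifier>`.
/// Assumption: anything starting with "npub1" is treated as a public key.
fn parse_entry(raw: &str) -> Option<BlacklistEntry> {
    let raw = raw.trim();
    if raw.is_empty() {
        return None;
    }
    match raw.split_once('/') {
        Some((npub, id)) if npub.starts_with("npub1") && !id.is_empty() => {
            Some(BlacklistEntry::Repository {
                npub: npub.to_string(),
                identifier: id.to_string(),
            })
        }
        // A slash without a valid npub prefix or identifier is malformed.
        Some(_) => None,
        None if raw.starts_with("npub1") => Some(BlacklistEntry::Pubkey(raw.to_string())),
        None => Some(BlacklistEntry::Identifier(raw.to_string())),
    }
}

/// Return a rejection reason for the first matching entry, if any.
fn check(entries: &[BlacklistEntry], npub: &str, identifier: &str) -> Option<&'static str> {
    entries.iter().find_map(|entry| match entry {
        BlacklistEntry::Pubkey(p) if p == npub => Some("npub is blacklisted"),
        BlacklistEntry::Repository { npub: p, identifier: i }
            if p == npub && i == identifier =>
        {
            Some("npub/identifier is blacklisted")
        }
        BlacklistEntry::Identifier(i) if i == identifier => Some("identifier is blacklisted"),
        _ => None,
    })
}
```

Because an identifier-only entry matches any author, it is the broadest of the three formats; the `npub/identifier` form is the narrowest.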

2026-01-12 | refactor(config): validate eagerly at startup and remove Result from runtime config methods | DanConwayDev
Refactors configuration validation to fail fast on fatal errors at startup while gracefully handling recoverable issues (e.g., malformed whitelist entries).

Changes:
- Add Config::validate() for eager validation called immediately after load
- Remove Result<> from archive_config() and repository_config() methods
- WhitelistEntry::parse_whitelist() skips invalid entries with warnings
- Validate relay_owner_nsec format in Config::validate()
- Update all call sites to remove Result handling from config getters

Benefits:
- Fatal config errors (incompatible settings) fail at startup, not runtime
- Recoverable errors (bad whitelist entries) logged as warnings and skipped
- No Result handling scattered throughout runtime code after validation
- Config methods safe to call without error handling after validate()

Testing:
- Add 7 new tests for validation edge cases and error handling
- Total config tests: 40 (up from 33)
- All 320 library tests passing

Breaking change: Config users must call config.validate() after Config::load() to ensure configuration is valid. This is enforced in main.rs.

2026-01-12 | feat(config): add repository whitelist for curated GRASP-01 acceptance | DanConwayDev
Adds NGIT_REPOSITORY_WHITELIST option for curated relay operation that accepts only whitelisted repositories while maintaining GRASP-01 compliance (announcements must list the service). This differs from archive whitelist which enables GRASP-05 mode and doesn't require service listing.

Key features:
- Supports three whitelist formats: npub, npub/identifier, identifier
- Enforces mutual exclusivity with archive read-only mode
- Updates NIP-11 curation field when whitelist is enabled
- Maintains GRASP-01 compliance (doesn't add GRASP-05 support)

Configuration synced across all four sources: src/config.rs, docs/reference/configuration.md, nix/module.nix, and .env.example as required by AGENTS.md.

2026-01-12 | feat(grasp-05): add read-only mode with auto-enable for archive configs | DanConwayDev
Implements NGIT_ARCHIVE_READ_ONLY configuration option that defaults to true when archive mode is enabled, allowing relays to operate as read-only syncs of archived repositories.

Key changes:
- Add NGIT_ARCHIVE_READ_ONLY config option (defaults to true if archive enabled)
- NIP-11 advertises GRASP-05 support and includes curation field when read-only
- Validation logic rejects non-whitelisted repos in read-only mode
- Comprehensive tests for read-only behavior and defaults
- Full documentation in config reference, .env.example, and NixOS module

Read-only mode enables passive mirroring without being listed in announcements, useful for backup/archive operations while preventing accidental write acceptance.
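A default that depends on another setting is commonly modelled with an `Option`. The sketch below assumes that "archive enabled" means archive-all is on or the archive whitelist is non-empty; the struct and method names are hypothetical and do not mirror src/config.rs.

```rust
/// Hypothetical subset of the relay configuration.
struct ArchiveSettings {
    archive_all: bool,
    archive_whitelist: Vec<String>,
    /// `None` means the operator did not set NGIT_ARCHIVE_READ_ONLY.
    archive_read_only: Option<bool>,
}

impl ArchiveSettings {
    /// Assumption: either archive option switches archive mode on.
    fn archive_enabled(&self) -> bool {
        self.archive_all || !self.archive_whitelist.is_empty()
    }

    /// An explicit setting wins; otherwise read-only follows archive mode.
    fn effective_read_only(&self) -> bool {
        self.archive_read_only
            .unwrap_or_else(|| self.archive_enabled())
    }
}
```

Keeping the field as `Option<bool>` preserves the difference between "explicitly false" and "not set", which a plain `bool` with a fixed default cannot express.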

2026-01-12 | feat(grasp-05): implement archive mode for backup/mirror operation | DanConwayDev
Implements GRASP-05 specification for accepting repository announcements that don't list this relay, enabling archive, mirror, and backup use cases.

Core Features:
- Three whitelist formats: <npub>, <npub>/<identifier>, <identifier>
- Archive-all mode for complete ecosystem mirrors
- Fail-fast npub validation at startup
- Read-only enforcement (archived repos reject pushes)
- Full GRASP-02 sync (git data + Nostr events)
- Dynamic archive status (no flags/metadata)

Implementation:
- Add ArchiveWhitelistEntry enum with Pubkey/Repository/Identifier variants
- Add ArchiveConfig with validation and matching logic
- Update AnnouncementResult to include AcceptArchive variant
- Refactor validate_announcement() to return AnnouncementResult with archive check
- Update AnnouncementPolicy with catch-all pattern for cleaner code
- Wire archive config through builder and policy layers

Configuration:
- NGIT_ARCHIVE_ALL: Accept all announcements (⚠️ storage risk)
- NGIT_ARCHIVE_WHITELIST: Comma-separated whitelist entries
- Updated docs, .env.example, and nix/module.nix

Testing:
- 28 unit tests for config parsing and whitelist matching
- 7 integration tests for archive mode validation
- All 296 tests passing

Validation Priority:
1. Lists our service → Accept (GRASP-01, read/write)
2. Is maintainer → AcceptMaintainer (multi-maintainer, read/write)
3. Matches archive config → AcceptArchive (GRASP-05, read-only)
4. None of above → Reject

Security Considerations:
- Archive-all mode has storage/bandwidth DoS risk
- Identifier-only format matches any pubkey (use npub/identifier for high-value)
- Invalid npubs cause startup failure (fail-fast)

Documentation:
- Concise explanation focused on rationale
- Reference docs updated with all config options
- README updated to reflect completed feature
- Removed from roadmap, added to compliance section

See docs/explanation/grasp-05-archive.md for details.
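The validation priority listed in this commit reduces to an ordered chain of checks. The variant names below come from the commit message; the `Checks` struct and the `classify` function are hypothetical stand-ins for whatever validate_announcement() actually inspects.

```rust
/// Outcome of announcement validation (variant names from the commit message).
#[derive(Debug, PartialEq)]
enum AnnouncementResult {
    Accept,           // GRASP-01, read/write
    AcceptMaintainer, // multi-maintainer, read/write
    AcceptArchive,    // GRASP-05, read-only
    Reject,
}

/// Pre-computed facts about one announcement (hypothetical helper type).
struct Checks {
    lists_our_service: bool,
    is_maintainer: bool,
    matches_archive_config: bool,
}

/// Apply the checks in priority order; the first match decides the outcome.
fn classify(c: &Checks) -> AnnouncementResult {
    if c.lists_our_service {
        AnnouncementResult::Accept
    } else if c.is_maintainer {
        AnnouncementResult::AcceptMaintainer
    } else if c.matches_archive_config {
        AnnouncementResult::AcceptArchive
    } else {
        AnnouncementResult::Reject
    }
}
```

The ordering matters: an announcement that both lists the service and matches the archive whitelist is accepted read/write, not as a read-only archive.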

2026-01-12 | feat(nip11): advertise GRASP-02 support in relay info | DanConwayDev
Add GRASP-02 to supported_grasps array in NIP-11 relay information document to advertise proactive sync capability to clients and tools.

2026-01-12 | feat: add uploadpack.allowFilter support for GRASP-01 compliance | DanConwayDev
Add mandatory uploadpack.allowFilter capability to support partial clones and fetches as required by GRASP-01 specification. This enables efficient git operations for bandwidth-constrained clients (e.g., browser-based git clients like git-natural-api).

Changes:
- Add uploadpack.allowFilter=true to git subprocess configuration
- Update SmartGitServer test helper with filter support
- Add integration tests for filter capability advertisement and functionality
- Update documentation to reflect filter as required capability

Tests verify:
- Filter capability is advertised in info/refs
- Filtered clones with blob:none work correctly
- Filtered fetches with tree:0 work correctly

2026-01-12 | fix: fetch full git history instead of shallow clones | DanConwayDev
Previously, purgatory sync was using '--depth=1' when fetching OIDs from remote servers. This created shallow clones with only 1-2 commits instead of the complete git history.

The fix removes the '--depth=1' flag, allowing git to fetch the complete commit history chain when fetching specific commit OIDs. This is the correct behavior for GRASP - users cloning from our relay should get the full repository history.

Changes:
- Remove '--depth=1' from git fetch command in RealSyncContext::fetch_oids
- Update comment to clarify that full history is fetched

Impact:
- Production repositories will now contain full git history
- Users cloning from the relay will get complete commit chains
- No more 'shallow' files in git repositories
- May be slightly slower due to fetching more data, but correctness is prioritized

Testing:
- All 564 tests pass (276 unit + 288 integration)
- No regressions in existing functionality

Fixes issue documented in work/active-issues/shallow-git-fetch.md

2026-01-12 | fix(metrics): count repositories on disk on each metrics request | DanConwayDev
Implements ngit_repositories_total metric by counting *.git directories on disk every time /metrics is requested (~15s interval by Prometheus).

This approach is simpler than increment-on-create because:
- No need to pass metrics through the relay builder chain
- Always accurate and self-correcting
- Negligible performance impact (~100-200 dir entries)

Changes:
- Add count_repositories_on_disk() static method to Metrics
- Update Metrics::render() to count repos before encoding metrics
- Pass git_data_path to Metrics::new() in main.rs
- Consolidate metrics tests to avoid global Prometheus registry conflicts

Fixes repository count metric issue from Phase 8 deployment plan.
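Counting on demand can be as small as one directory scan. The sketch below assumes that repositories sit directly under the data path as `<name>.git` directories; the real layout may nest them (for example per author), in which case the scan would need another level. The function name is hypothetical, not the count_repositories_on_disk() in the source.

```rust
use std::fs;
use std::path::Path;

/// Count immediate subdirectories of `root` whose names end in ".git".
/// Returns 0 if `root` is missing or unreadable, so a metrics request never fails.
fn count_git_dirs(root: &Path) -> usize {
    let Ok(entries) = fs::read_dir(root) else {
        return 0;
    };
    entries
        .filter_map(Result::ok)
        .filter(|entry| entry.path().is_dir())
        .filter(|entry| entry.file_name().to_string_lossy().ends_with(".git"))
        .count()
}
```

Because the value is recomputed from disk on every scrape, it self-corrects after manual deletions or restores, which an increment-on-create counter would miss.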

2026-01-11 | fix(config): trim whitespace from relay-owner-nsec CLI/env input | DanConwayDev
When relay_owner_nsec is provided via CLI argument or environment variable (e.g., read from a file by the NixOS module), trim any leading/trailing whitespace including newlines. This matches the behavior when reading from the .relay-owner.nsec file directly. Fixes issue where NixOS module reads nsec file with 'cat', which includes the trailing newline, making the nsec invalid when passed as a CLI argument. Also reverted the tr workaround in nix/module.nix since ngit-grasp now handles this correctly.

2026-01-10 | fix: document relay behavior in negentropy retry zero-event scenario | DanConwayDev
Add comprehensive comment explaining why some relays (azzamo.net, snort.social) return zero events during negentropy retry even when they have the events. Documents infinite loop prevention logic and suggests future REQ+EOSE fallback strategy.

2026-01-10 | fix: normalize URLs with trailing slashes in announcement validation | DanConwayDev
Announcements were being rejected when clone URLs or relay URLs had trailing slashes that didn't match. Added URL normalization to strip trailing slashes before comparison, allowing announcements to be accepted regardless of trailing slash presence.

- Add normalize_url_for_comparison() helper
- Update has_clone_url() and has_relay() to normalize before matching
- Add comprehensive tests for trailing slash scenarios

Fixes issue in work/active-issues/clone-relays-mismatch-validation.md
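The normalization step can be sketched in a few lines. The helper name is taken from the commit message, but its body here is an assumption (trim surrounding whitespace, then strip trailing slashes), and `has_url` is a hypothetical stand-in for has_clone_url() and has_relay().

```rust
/// Strip trailing slashes so "wss://relay.example/" and "wss://relay.example"
/// compare equal. Assumed behaviour; the real helper may normalize more.
fn normalize_url_for_comparison(url: &str) -> &str {
    url.trim().trim_end_matches('/')
}

/// True if any candidate URL equals `ours` after normalization.
fn has_url(candidates: &[&str], ours: &str) -> bool {
    let ours = normalize_url_for_comparison(ours);
    candidates
        .iter()
        .any(|candidate| normalize_url_for_comparison(candidate) == ours)
}
```

Normalizing both sides of the comparison means it no longer matters whether the relay's own configured URL or the announcement carries the trailing slash.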

2026-01-10 | Add naughty list for git remotes with persistent SSL/DNS errors | DanConwayDev
Implement domain-level naughty list tracking for git remotes, reusing the existing NaughtyListTracker from relay sync. This prevents repeated attempts to fetch from git domains with persistent infrastructure issues (SSL/TLS certificate errors, DNS failures).

Changes:
- Updated NaughtyListTracker to track both relay URLs and git domains
- Added git_naughty_list field to RealSyncContext for error classification
- Modified fetch_oids() to classify git fetch errors and record naughty domains
- Updated sync_identifier_next_url() to filter out naughty domains during URL selection
- Added git_naughty_list parameter to ThrottleManager for domain queue processing
- Threaded naughty list through start_sync_loop and all sync functions
- Updated all tests to pass naughty list parameter

The naughty list uses 12-hour expiration (configurable) to allow domains to recover from infrastructure issues. First occurrence logs WARN, repeats log DEBUG.

2026-01-10 | fix: propagate git fetch errors instead of logging misleading success | DanConwayDev

2026-01-10 | fix: implement negentropy fallback to REQ+EOSE when negentropy fails | DanConwayDev
When negentropy sync fails (one or more filters fail during diff), the code previously left a pending batch and returned early, preventing any sync from happening. This caused the "No sync targets found" issue.

Changes:
- Track negentropy success with a boolean flag
- On negentropy failure: clean up pending batch and fall through to REQ+EOSE
- Log the fallback at info level for visibility
- Restructure control flow so REQ+EOSE path executes after negentropy failure

This ensures sync always completes using traditional REQ+EOSE when NIP-77 negentropy is unavailable or fails.
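The restructured control flow can be sketched with closures standing in for the real network operations. Everything here is illustrative: the function, its parameters, and the `SyncPath` enum are hypothetical, and the real code is asynchronous and works on relay connections rather than closures.

```rust
/// Which strategy ended up completing the sync (hypothetical type).
#[derive(Debug, PartialEq)]
enum SyncPath {
    Negentropy,
    ReqEose,
}

/// Try negentropy first; on failure, clean up and fall through to REQ+EOSE
/// instead of returning early with a pending batch left behind.
fn historic_sync(
    negentropy: impl FnOnce() -> Result<(), String>,
    clear_pending_batch: impl FnOnce(),
    req_eose: impl FnOnce() -> Result<(), String>,
) -> Result<SyncPath, String> {
    // Track success with a flag, as the commit message describes.
    let negentropy_ok = negentropy().is_ok();
    if negentropy_ok {
        return Ok(SyncPath::Negentropy);
    }
    clear_pending_batch();
    req_eose()?;
    Ok(SyncPath::ReqEose)
}
```

The key property is that a negentropy failure is never the final outcome on its own: the function returns an error only if the REQ+EOSE path also fails.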

2026-01-10 | Implement relay naughty list feature | DanConwayDev
Add naughty list tracking for relays with persistent infrastructure issues (DNS failures, TLS certificate errors, protocol violations) to reduce log noise and provide better visibility via metrics.

Key features:
- Classify errors into naughty (persistent) vs transient (temporary)
- Track naughty relays with category, reason, and occurrence count
- Log WARN on first naughty occurrence, DEBUG on repeats
- Automatic expiration after 12 hours (configurable)
- Prometheus metrics for monitoring naughty relays by category
- Periodic cleanup task integrated with health checker

Components added:
- src/sync/naughty_list.rs: Core naughty list tracker with error classification
- NaughtyListTracker integration in RelayHealthTracker
- Connection error handling updates in sync manager
- Naughty list metrics (total by category, detailed info per relay)
- Config option for naughty_list_expiration_hours (default: 12)

Closes DNS lookup failures and TLS certificate errors tracking issues.
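A tracker with these properties can be sketched with a `HashMap` keyed by relay URL or domain. This is not the NaughtyListTracker in src/sync/naughty_list.rs: the types are hypothetical, error categories are omitted, and measuring expiry from the first occurrence (rather than the most recent) is an assumption.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Level the caller should log at (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum LogLevel {
    Warn,
    Debug,
}

struct Entry {
    first_seen: Instant,
    occurrences: u32,
}

struct NaughtyList {
    ttl: Duration,
    entries: HashMap<String, Entry>,
}

impl NaughtyList {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn expired(&self, entry: &Entry, now: Instant) -> bool {
        now.saturating_duration_since(entry.first_seen) >= self.ttl
    }

    /// Record a persistent error: WARN the first time, DEBUG on repeats.
    fn record(&mut self, key: &str, now: Instant) -> LogLevel {
        // An expired entry counts as a fresh first occurrence.
        if self.entries.get(key).map_or(false, |e| self.expired(e, now)) {
            self.entries.remove(key);
        }
        let entry = self
            .entries
            .entry(key.to_string())
            .or_insert(Entry { first_seen: now, occurrences: 0 });
        entry.occurrences += 1;
        if entry.occurrences == 1 { LogLevel::Warn } else { LogLevel::Debug }
    }

    /// True while the key is listed and its entry has not expired.
    fn is_naughty(&self, key: &str, now: Instant) -> bool {
        self.entries.get(key).map_or(false, |e| !self.expired(e, now))
    }

    /// Periodic cleanup: drop every expired entry.
    fn cleanup(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.entries
            .retain(|_, e| now.saturating_duration_since(e.first_seen) < ttl);
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the tracker, keeps expiry behaviour testable without sleeping.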

2026-01-10 | fix: downgrade EOSE unknown subscription warning to trace | DanConwayDev
Live subscriptions (limit:0, no auto-close) are not tracked in PendingBatch because they stay open indefinitely for new events. When they receive EOSE (immediately, since no historic events), handle_eose can't find them in outstanding_subs.

This is expected behavior, not an error. Changed log level from warn to trace to reduce noise.

Observed in production logs: sync_live() subscriptions with limit:0 complete immediately and trigger this path.

Issue: work/active-issues/eose-unknown-subscription.md

2026-01-10 | fix: move state events from Layer 1 to identifier-based filters | DanConwayDev
Removes kind 30618 (state events) from Layer 1 announcement filter and adds targeted subscriptions using #d (identifier) tags in Layer 2.

Problem: Layer 1 was receiving ALL state events from all relays, causing 1000+ rejections for repositories we don't host.

Solution:
- Remove Kind::RepoState from build_announcement_filter (Layer 1)
- Add state_event_filters_for_our_repos() function that creates filters with kind 30618 and #d tags for only our hosted repo identifiers
- Integrate state filters into build_layer2_and_layer3_filters
- Extract unique identifiers from repo refs and batch by 100 per filter

Benefits:
- Dramatically reduces bandwidth and rejection noise (1000+ → ~0)
- More efficient: one filter with multiple identifiers vs broadcast
- Only receive state events for repositories we actually care about

Resolves: work/active-issues/layer1-state-event-oversubscription.md
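The "extract unique identifiers and batch by 100 per filter" step can be sketched with a plain struct in place of a real Nostr filter type. The `StateFilter` struct, the function name, and the constants are hypothetical; only kind 30618 and the batch size of 100 come from the commit message.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a Nostr filter: one kind plus `#d` tag values.
#[derive(Debug, PartialEq)]
struct StateFilter {
    kind: u16,
    d_tags: Vec<String>,
}

const REPO_STATE_KIND: u16 = 30618;
const IDENTIFIERS_PER_FILTER: usize = 100;

/// Deduplicate identifiers, then emit one filter per chunk of up to 100.
fn state_filters(identifiers: &[&str]) -> Vec<StateFilter> {
    // BTreeSet removes duplicates and gives a deterministic order.
    let unique: Vec<String> = identifiers
        .iter()
        .map(|id| id.to_string())
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect();

    unique
        .chunks(IDENTIFIERS_PER_FILTER)
        .map(|chunk| StateFilter {
            kind: REPO_STATE_KIND,
            d_tags: chunk.to_vec(),
        })
        .collect()
}
```

For 250 distinct identifiers this yields three filters holding 100, 100, and 50 identifiers; an empty input yields no filters, so nothing is subscribed.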

2026-01-10 | fix: reduce log noise for expected state event rejections during sync | DanConwayDev
State events from remote relays for repos we don't host are expected rejections during proactive sync. Changed to only WARN for user-submitted events (potential misconfiguration/attack) while using DEBUG for synced events (normal operation). This reduces log noise from ~1967 warnings to <10 warnings in a 30-second production sync test, making real issues visible again.

2026-01-10 | fix: detect NIP-77 NOTICE immediately during negentropy sync | DanConwayDev
Previously, when a relay didn't support NIP-77, the negentropy_sync_diff function would wait for the full client.sync() timeout even after receiving a NOTICE message that marked the relay as not supporting NIP-77.

This change uses tokio::select! to race the sync operation against a polling task that checks the nip77_supported flag every 10ms. When a NOTICE is received (detected in the message handler), the poll task detects the status change and immediately returns an error, allowing quick fallback to REQ+EOSE without waiting for timeouts.

Benefits:
- Fast failure (within 10ms) when relay sends NIP-77 NOTICE
- No artificial timeout reduction that could hurt legitimate operations
- Maintains full timeout for relays that actually support NIP-77

2026-01-10 | fix: return error when negentropy has failures to enable REQ fallback | DanConwayDev
When negentropy sync times out or has other failures, it now properly returns Err() instead of Ok() with empty reconciliation. This ensures historic_sync increments failed_count and triggers fallback to REQ+EOSE instead of treating it as a successful sync with 0 events. Resolves issue where bootstrap relay timeouts were marked as complete instead of falling back to traditional sync.

2026-01-09 | fix: reduce duplicate NOTICE logging | DanConwayDev
Change relay NOTICE logging from DEBUG to TRACE level to avoid duplicate logs (nostr-sdk already logs all NOTICEs at DEBUG level). Negentropy-specific NOTICEs remain at INFO level as they indicate important NIP-77 support information.

2026-01-09 | fix: downgrade duplicate EOSE log to trace level | DanConwayDev
EOSE messages can arrive after batch completion due to:
1. Late/duplicate EOSE from relay (e.g., live_sync REQ subscriptions)
2. Race condition between batch confirmation and EOSE arrival
3. EOSE during intentional disconnect cleanup

Since this is expected behavior, downgrade from debug to trace level to reduce log noise. Added detailed code comment explaining the scenarios and suggesting how to investigate if needed (tracking recently-completed subscription IDs).

Resolves issue where duplicate EOSE from live_sync subscriptions appeared as confusing 'unknown relay' debug messages.

2026-01-09 | fix: eliminate disconnect race condition by adding Disconnecting state | DanConwayDev
Previously, disconnect_relay() would immediately remove RelayState and pending batches before the event loop finished draining messages. This caused confusing 'unknown relay' debug messages for EOSE and other events that arrived after state removal but were expected during normal shutdown.

Changes:
- Add ConnectionStatus::Disconnecting to track intentional disconnects
- disconnect_relay() now marks relay as Disconnecting (keeps state)
- Event loop drains messages while state exists
- handle_disconnect() detects intentional vs unexpected disconnects:
  - Intentional: Completes cleanup by removing state/connections
  - Unexpected: Updates to Disconnected, keeps connection for retry
- handle_eose() suppresses logs for Disconnecting relays (TRACE level)
- check_disconnects() skips relays already in Disconnecting state

This ensures proper sequencing: mark->drain->cleanup instead of remove->drain->confusion. Fixes the root cause instead of just hiding log messages.
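The mark, drain, cleanup sequencing can be sketched as a small state machine. `ConnectionStatus::Disconnecting` is named in the commit message; the other variants, the `Cleanup` enum, and the method names are hypothetical simplifications of disconnect_relay() and handle_disconnect().

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConnectionStatus {
    Connected,
    Disconnecting, // intentional disconnect requested, state kept for draining
    Disconnected,  // connection lost unexpectedly, eligible for retry
}

/// What the caller should do once the socket has actually closed.
#[derive(Debug, PartialEq)]
enum Cleanup {
    RemoveState,
    KeepForRetry,
}

struct RelayState {
    status: ConnectionStatus,
}

impl RelayState {
    /// Step 1 (mark): record the intent but keep the state, so late messages
    /// such as EOSE still find a known relay while the event loop drains.
    fn request_disconnect(&mut self) {
        self.status = ConnectionStatus::Disconnecting;
    }

    /// Step 3 (cleanup): decide based on whether the close was intentional.
    fn on_socket_closed(&mut self) -> Cleanup {
        if self.status == ConnectionStatus::Disconnecting {
            Cleanup::RemoveState
        } else {
            self.status = ConnectionStatus::Disconnected;
            Cleanup::KeepForRetry
        }
    }
}
```

Step 2 (drain) happens between the two calls: the event loop keeps processing messages for the relay because its state still exists.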

2026-01-09 | improve: detect and skip negentropy for unsupported relays | DanConwayDev
- Upgrade NOTICE log level to INFO when relay rejects negentropy (envelope/NEG- errors)
- Track NIP-77 support status per relay connection to avoid repeated failed attempts
- Mark relay as unsupported when NOTICE rejection or timeout occurs
- Skip negentropy on subsequent syncs during same connection session
- Reset support status on reconnect to allow retry after relay upgrades

This reduces log noise and eliminates 10-second timeout delays on each historic sync attempt for relays that don't support NIP-77 negentropy.

Fixes negentropy-timeout-10-seconds issue by learning from relay behavior.

2026-01-09 | fix: mark bootstrap relay with is_bootstrap flag to prevent disconnection | DanConwayDev
The bootstrap relay was being registered with is_bootstrap=false, causing it to be disconnected when empty. This change adds an is_bootstrap parameter to register_relay() and passes true when registering the bootstrap relay. The existing check_disconnects() logic already skips bootstrap relays, but the flag was never being set correctly.

2026-01-09 | feat: add helpful feedback after bootstrap relay sync completes | DanConwayDev
When bootstrap sync completes with zero announcements, users may not know if this is expected or indicates a configuration problem (wrong domain or wrong bootstrap relay).

Changes:
- Add INFO-level message after bootstrap announcement sync completes
- If zero announcements: suggest verifying domain/relay configuration
- If announcements found: report count for user awareness
- Only applies to bootstrap relay (is_bootstrap flag)

This helps users quickly diagnose configuration issues during initial setup and testing.

Discovered via production sync testing against wss://git.shakespeare.diy

2026-01-09 | fix: downgrade negentropy timeout warning to debug level | DanConwayDev
Negentropy diff timeouts are expected when relays don't support NIP-77. The relay responds with NOTICE 'unknown envelope label' and the timeout is hit before we recognize this is unsupported rather than a failure.

Changes:
- Downgrade from warn! to debug! in negentropy_sync_filter() (src/sync/relay_connection.rs:493)
- Add comment explaining timeouts are common for non-NIP-77 relays
- Update message to clarify timeout typically means no NIP-77 support

The existing fallback mechanism (lines 505-509) properly handles this case and logs a one-time warning about falling back to REQ+EOSE.

Discovered via production sync testing against wss://git.shakespeare.diy

2026-01-09 | fix: downgrade EOSE race condition warning to debug level | DanConwayDev
During relay disconnect, EOSE messages may arrive after the relay has been removed from pending_sync_index. This creates a benign race condition that was logged as a warning.

Changes:
- Downgrade from warn! to debug! in handle_eose() (src/sync/mod.rs:632)
- Add clarifying comment explaining this occurs during disconnect
- Update message to indicate this is expected behavior

Discovered via production sync testing against wss://git.shakespeare.diy

2026-01-09 | refactor(sync): consolidate to single rejected index with helper extraction | DanConwayDev
Remove rejected_states_index and use single rejected_events_index for both announcement and state events. Extract duplicate re-processing logic into a consolidated helper function.

Changes:
- Eliminate duplicate RepositoryAnnouncement::from_event() call
- Remove rejected_states_index field from SyncManager
- Update cleanup loop to process both event types via single index
- Add ReprocessingStats struct to track re-processing outcomes
- Add reprocess_events_from_hot_cache() helper that handles:
  - Logging re-processing attempts with context
  - Calling process_event_static recursively
  - Tracking saved/duplicate/purgatory/rejected counts
- Replace three nearly-identical re-processing loops with helper calls

Consolidates phases 1, 5, and 6 of rejected events index refactoring.

2026-01-09 | refactor(sync): parameterize rejected index metrics by event type | DanConwayDev
Replace duplicate metrics methods (announcements vs states) with unified methods using IntGaugeVec/IntCounterVec with an event_type label:
- update_rejected_hot_cache_size(event_type, size)
- record_rejected_hot_cache_hit(event_type)
- record_rejected_hot_cache_miss(event_type)
- record_rejected_hot_cache_expired(event_type, count)
- update_rejected_cold_index_size(event_type, size)
- record_rejected_cold_index_expired(event_type, count)
- record_rejected_invalidation(event_type, count)

Prometheus labels remain separate (event_type="announcement" vs event_type="state") but implementation is now unified.

Phase 4 of rejected events index refactoring.
2026-01-09refactor(sync): add EventType enum and unify rejected index methodsDanConwayDev
Add EventType enum (Announcement, State) to distinguish event types within RejectedEventsIndex. This consolidates the two-tier index design into a single unified interface. Changes: - Add EventType enum with Announcement and State variants - Add event_type field to HotCacheEntry and ColdIndexEntry - Create unified invalidate_and_get() with optional event_type filter - Update cleanup_expired_for_type() to handle both types - Remove deprecated wrapper methods (invalidate_and_get_events, invalidate_and_get_state_events, cleanup_expired, cleanup_states_expired) Consolidates phases 2, 3, and 7 of rejected events index refactoring.
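The unified lookup described above can be pictured with a small sketch. This is not the project's code: `HotCacheEntry` here carries a string payload instead of a Nostr event, `add` is a hypothetical helper, and the signature of `invalidate_and_get` is an assumption based only on the phrase "optional event_type filter".

```rust
use std::collections::HashMap;

/// Event types tracked by the index (variant names taken from the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventType {
    Announcement,
    State,
}

/// Hypothetical hot-cache entry; the real struct holds a full Nostr event.
#[derive(Debug, Clone, PartialEq)]
pub struct HotCacheEntry {
    pub event_type: EventType,
    pub identifier: String,
    pub payload: String, // stand-in for the full event
}

#[derive(Default)]
pub struct RejectedEventsIndex {
    hot_cache: HashMap<String, HotCacheEntry>, // keyed by event id
}

impl RejectedEventsIndex {
    /// Hypothetical helper used only to populate the sketch.
    pub fn add(&mut self, event_id: &str, entry: HotCacheEntry) {
        self.hot_cache.insert(event_id.to_string(), entry);
    }

    /// Remove and return the entries for `identifier`.
    /// `None` matches both event types; `Some(t)` restricts to one type.
    pub fn invalidate_and_get(
        &mut self,
        identifier: &str,
        event_type: Option<EventType>,
    ) -> Vec<HotCacheEntry> {
        let ids: Vec<String> = self
            .hot_cache
            .iter()
            .filter(|(_, e)| e.identifier == identifier)
            .filter(|(_, e)| event_type.is_none() || event_type == Some(e.event_type))
            .map(|(id, _)| id.clone())
            .collect();
        ids.iter()
            .filter_map(|id| self.hot_cache.remove(id))
            .collect()
    }
}
```

One method with an `Option` filter replaces the per-type wrappers: callers that only care about state events pass `Some(EventType::State)`, and cleanup-style callers pass `None`.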
2026-01-09chore: cargo fmtDanConwayDev
2026-01-09refactor(sync): remove PR references from commentsDanConwayDev
Replace PR-specific references (PR3, PR4.1, PR4.2) with problem-focused documentation that explains what the code does and why. Changes: - Maintainer re-processing: Explain race condition handling - State event re-processing (announcement): Clarify timing issue - State event re-processing (state): Describe multi-event scenario Why: PR numbers are ephemeral and meaningless to future readers. Comments should explain the problem being solved, not when code was added. All tests pass: 248 library tests passing
2026-01-09feat(sync): fix race condition with announcement-before-state event orderingDanConwayDev
**Problem:** Integration test `test_concurrent_state_and_pr_sync` was timing out because of a race condition: when syncing from remote relays, state events can arrive BEFORE their announcements (no ordering guarantee). The system was rejecting these state events with "no announcement exists" but NOT tracking them for re-processing when the announcement later arrived.

**Solution:** Implemented announcement → state event re-processing (GRASP-02 PR4.1) to handle the race condition, mirroring the existing maintainer announcement re-processing logic (GRASP-02 PR3).

**What Changed:**

1. **Announcement → State Event Re-processing (GRASP-02 PR4.1)**: When a repository announcement is accepted, the system now invalidates and re-processes state events that were rejected with "no announcement exists". This ensures state events arriving before their announcements are eventually processed correctly.
2. **State Event → State Event Re-processing (GRASP-02 PR4.2)**: When a state event is accepted (git data arrives), the system invalidates and re-processes other rejected state events for the same repository from the hot cache. (Renamed from PR4 for clarity - this was already implemented in previous commit)
3. **Proper Rejection Tracking**: Extended rejection reason detection to include "no announcement exists" and "not authorized" messages, ensuring these state events are properly tracked in the rejected events index for re-processing.
4. **Proper State Event Metrics**: State events now use `add_state()` instead of `add_announcement()` when rejected, ensuring correct metrics tracking.
5. **Removed Redundant Field**: Removed `event_id` field from `ColdIndexEntry` since it's already stored as the HashMap key. This eliminates dead code while preserving the cold index's core purpose: preventing re-fetch of rejected events during negentropy sync via `get_all_event_ids()`.
6. **Fixed Doc Test**: Changed doc test from `no_run` to `ignore` since it uses undefined variables for illustration purposes.
7. **Fixed Clippy Warnings**:
   - Added `#[allow(dead_code)]` for `reason` fields (reserved for future metrics)
   - Fixed unused variable warning
   - Collapsed nested if statement

**Why:** The two-tier rejected events index was handling two scenarios:
- GRASP-02 PR3: Maintainer announcement arrives → re-process announcements
- GRASP-02 PR4.2: State event with git data arrives → re-process state events

But it was missing:
- GRASP-02 PR4.1: Repository announcement arrives → re-process state events

This created a race condition where state events arriving before their announcements would be rejected and never re-processed.

**Implementation Details:** The fix follows the same pattern as maintainer re-processing:
1. When announcement accepted, parse it to get pubkey + identifier
2. Call `invalidate_and_get_state_events()` to get rejected state events
3. Re-process each state event from hot cache using `process_event_static()`
4. Log results (Saved, Purgatory, Duplicate, or still rejected)

**Test Results:**

✅ All tests pass (578 total):
- 248 unit tests pass
- 330 integration tests pass (including the previously failing test)
- All clippy warnings fixed
- Doc tests pass

✅ Target test now passes consistently:
- `test_concurrent_state_and_pr_sync` completes in ~2.7s (was timing out at 30s)

**Impact:**
- Fixes race condition in sync ordering (state before announcement)
- No breaking changes - only adds re-processing capability
- Follows existing patterns - mirrors GRASP-02 PR3 maintainer re-processing
- Minimal code changes - ~86 lines added to handle new re-processing path

**Files Changed:**
```
 src/sync/mod.rs            | 86 +++++++++++++++++++++++++++++++++++++++++++++
 src/sync/rejected_index.rs |  6 ++--
 2 files changed, 87 insertions(+), 5 deletions(-)
```

Co-authored-by: Assistant <assistant@anthropic.com>
2026-01-09feat: implement state event authorization per GRASP-01 specDanConwayDev
Add comprehensive authorization checks to ensure state events are only accepted from maintainers of accepted repository announcements. This implements the core GRASP-01 requirement that pushes must match the latest state announcement "respecting the maintainer set."

Changes:

1. StatePolicy authorization (src/nostr/policy/state.rs):
   - Check authorization BEFORE git data validation (fail-fast)
   - Reject if no announcement exists for repository
   - Reject if author not in maintainer set
   - Use existing helpers: fetch_repository_data() and pubkey_authorised_for_repo_owners()
   - Structured logging for all rejections
2. Purgatory invalidation (src/nostr/builder.rs):
   - New method: check_purgatory_state_events_for_identifier()
   - Called when announcements accepted (Accept and AcceptMaintainer)
   - Re-evaluates state events in purgatory for the identifier
   - Processes newly-authorized events (releases from purgatory)
   - Keeps unauthorized events for natural expiry (30 min)
   - Enables retroactive authorization when announcements arrive late
3. Purgatory sync authorization (src/git/sync.rs):
   - Check authorization BEFORE processing git data
   - Remove unauthorized events from purgatory (permanent rejection)
   - Prevents processing even if git data arrives first
   - Structured logging for monitoring
4. Rejected events tracking (src/sync/rejected_index.rs):
   - Add support for tracking rejected state events
   - New methods: add_state(), contains_state()
   - Separate metrics for state rejections
   - Enables sync to avoid re-fetching rejected states
5. Sync metrics (src/sync/metrics.rs, src/sync/mod.rs):
   - Add state-specific metrics (hot cache, cold index)
   - Track rejected states separately from announcements
   - Support monitoring of authorization rejections
6. Comprehensive tests (tests/state_authorization.rs):
   - test_reject_state_without_announcement
   - test_reject_state_from_unauthorized_author
   - test_accept_state_from_announcement_author
   - test_accept_state_from_maintainer

Security Impact:
- Before: State events could be published by anyone
- After: Only maintainers can publish state events
- Defense-in-depth: Authorization checked at 3 points:
  1. On arrival (StatePolicy)
  2. On announcement acceptance (purgatory re-evaluation)
  3. On git data arrival (purgatory sync)

All tests pass:
- 248 unit tests
- 51 NIP-34 announcement tests
- 4 new state authorization tests
- 9 rejected index tests

Closes: State authorization requirement from GRASP-01 spec
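As a rough illustration of the two rejection rules in this commit (no announcement, author not in the maintainer set), here is a minimal sketch. All names are hypothetical and pubkeys are plain strings; the real check goes through `fetch_repository_data()` and `pubkey_authorised_for_repo_owners()`, whose signatures are not shown in the log and are not reproduced here.

```rust
use std::collections::{HashMap, HashSet};

/// Outcome of the authorization step (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum StateDecision {
    Accept,
    RejectNoAnnouncement,
    RejectNotAuthorized,
}

/// Hypothetical view of an accepted repository announcement.
pub struct Announcement {
    pub owner: String,
    pub maintainers: HashSet<String>,
}

/// Decide whether `author` may publish a state event for `identifier`.
/// This runs before any git data validation, so unauthorized events fail fast.
pub fn authorize_state_event(
    announcements: &HashMap<String, Announcement>, // keyed by repo identifier
    identifier: &str,
    author: &str,
) -> StateDecision {
    match announcements.get(identifier) {
        None => StateDecision::RejectNoAnnouncement,
        Some(ann) => {
            if ann.owner == author || ann.maintainers.contains(author) {
                StateDecision::Accept
            } else {
                StateDecision::RejectNotAuthorized
            }
        }
    }
}
```

The sketch treats the announcement author and the listed maintainers as the whole maintainer set; any chained maintainer resolution the real helper performs is out of scope here.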
2026-01-09feat(sync): add cleanup loops and metrics for rejected events indexDanConwayDev
Add automatic cleanup and Prometheus metrics for the two-tier rejected events index that caches rejected announcements for re-processing. Cleanup loops: - Hot cache: Every 60 seconds (events expire after 2 minutes) - Cold index: Every 24 hours (metadata expires after 7 days) - Background task with graceful shutdown support New Prometheus metrics (7): - Gauges: hot_cache_current, cold_index_current - Counters: hits, misses, hot_expired, cold_expired, invalidated This completes the maintainer announcement re-processing feature, reducing wait time from 24 hours to <1 second when a maintainer's announcement arrives before the repository owner's announcement. Memory is bounded through automatic cleanup, and comprehensive metrics enable monitoring of hit rates, memory usage, and cleanup effectiveness. Changes: - src/sync/metrics.rs: Added 7 metrics with recording methods - src/sync/rejected_index.rs: Added optional metrics support - src/sync/mod.rs: Added cleanup background task Tests: 248 library tests passing, 3 integration tests passing
2026-01-09feat(sync): invalidation + immediate re-processing of maintainer announcementsDanConwayDev
- Add two-tier rejected events index (hot cache + cold index) - Hot cache: 2-minute in-memory storage of full rejected events - Cold index: 7-day metadata storage for deduplication - Immediate re-processing when owner announcements list maintainers - Fix rejection reason detection to match actual error messages - Rewrite integration tests to use two-relay sync pattern - All tests passing (3 passed, 1 ignored slow test)
2026-01-09feat: Switch SyncManager to use two-tier RejectedEventsIndexDanConwayDev
Replaces the simple HashSet<EventId> with the sophisticated two-tier RejectedEventsIndex from PR1, enabling future immediate re-processing when maintainer dependencies resolve. ## Changes ### Config (src/config.rs) - Add `rejected_hot_cache_duration_secs` (default: 120 = 2 minutes) - Add `rejected_cold_index_expiry_secs` (default: 604800 = 7 days) - Both configurable via CLI flags or environment variables ### SyncManager (src/sync/mod.rs) **Type Change:** - Before: `Arc<RwLock<HashSet<EventId>>>` (simple event ID set) - After: `Arc<RejectedEventsIndex>` (two-tier storage) **Initialization:** - Pass config durations to RejectedEventsIndex::new() - Creates hot cache (2 min) + cold index (7 days) **Event Processing (process_event_static):** - Extract identifier from 'd' tag - Determine rejection reason from error message - Call `add_announcement()` with full event + metadata - Stores in both hot cache and cold index **Negentropy Sync (derive_relay_targets):** - Call `get_all_event_ids()` to get rejected IDs - Returns union of hot cache + cold index event IDs - Excludes from negentropy reconciliation **Event Loop (relay_connection):** - Use `contains()` method instead of direct HashSet access - Simpler API, same skip-rejected behavior ### RejectedEventsIndex (src/sync/rejected_index.rs) **New Method:** - `get_all_event_ids()`: Returns HashSet<EventId> from both tiers - Used for negentropy exclusion (replaces direct HashSet access) ### Tests Updated **test_rejected_events_index_tracks_announcements:** - Create RejectedEventsIndex with config durations - Add 'd' tag to test announcement - Use `add_announcement()` with full event - Verify both hot cache and cold index populated - Check lengths with `hot_cache_len()` and `cold_index_len()` **test_rejected_events_excluded_from_negentropy:** - Create RejectedEventsIndex instead of HashSet - Build full event with 'd' tag - Add to index with `add_announcement()` - Get IDs with `get_all_event_ids()` - Verify excluded from 
reconciliation ## Architecture ``` ┌─────────────────────────────────────────────────────────────┐ │ SyncManager │ │ │ │ rejected_events_index: Arc<RejectedEventsIndex> │ │ ├─ Hot Cache (2 min): Full events for re-processing │ │ └─ Cold Index (7 days): Metadata for dedup │ └─────────────────────────────────────────────────────────────┘ │ │ On rejection ▼ ┌─────────────────────────────────────────────────────────────┐ │ add_announcement(event, pubkey, identifier, reason) │ │ ├─ Store full event in hot cache │ │ └─ Store metadata in cold index │ └─────────────────────────────────────────────────────────────┘ │ │ On negentropy sync ▼ ┌─────────────────────────────────────────────────────────────┐ │ get_all_event_ids() → HashSet<EventId> │ │ ├─ Union of hot cache IDs │ │ └─ Union of cold index IDs │ └─────────────────────────────────────────────────────────────┘ ``` ## Benefits ### Immediate - **Better tracking**: Store rejection reason + metadata - **Configurable**: Tune cache/index durations per deployment - **Observable**: Separate hot/cold metrics (future PR4) ### Future (PR3) - **Immediate re-processing**: Get events from hot cache when valid - **No 24h delay**: Maintainer announcements accepted in <1 second - **Automatic recovery**: Hot cache for immediate, cold index for later ## Backward Compatibility **No breaking changes:** - Same rejection behavior (skip events in index) - Same negentropy exclusion (union with purgatory IDs) - Default config values match previous implicit behavior **Migration:** - Existing deployments continue working with defaults - Optional: Tune durations via new config flags ## Testing All tests passing: - ✅ 9 rejected_index tests (hot cache, cold index, two-tier) - ✅ 139 sync module tests (including updated integration tests) - ✅ 247 total library tests ## Next Steps **PR3: Add invalidation + immediate re-processing** - Invalidate cold index when owner announcement accepted - Get events from hot cache for re-processing - Recursive call 
to process_event_static - Integration tests for <1s maintainer acceptance **PR4: Add cleanup + metrics** - Hot cache cleanup task (every 60s) - Cold index cleanup task (daily) - Prometheus metrics for both tiers - Monitor hot cache hits vs misses ## Configuration Examples ```bash # Default (2 min hot cache, 7 day cold index) ngit-grasp # Longer hot cache for slow relays ngit-grasp --rejected-hot-cache-duration-secs 300 # Shorter cold index for memory-constrained systems ngit-grasp --rejected-cold-index-expiry-secs 86400 # Environment variables export NGIT_REJECTED_HOT_CACHE_DURATION_SECS=180 export NGIT_REJECTED_COLD_INDEX_EXPIRY_SECS=259200 ngit-grasp ``` Part of: Maintainer chain discovery fix See: work/SOLUTION-SUMMARY-V2.md for full design Previous: PR1 (rejected_index.rs implementation) Next: PR3 (invalidation + re-processing)
2026-01-09feat: Add two-tier rejected events indexDanConwayDev
Implements a sophisticated two-tier storage system for rejected repository announcements to enable immediate re-processing when dependencies resolve. ## Architecture **Tier 1: Hot Cache (2 minutes)** - Stores full event objects for immediate re-processing - Enables <1 second re-processing vs 24 hour wait - Auto-expires to prevent memory growth - Memory: ~200 KB typical, ~20 MB worst case **Tier 2: Cold Index (7 days)** - Stores metadata only (event_id, pubkey, identifier) - Prevents repeated downloads of rejected events - Enables invalidation when circumstances change - Memory: ~1 MB typical ## Problem Solved Without this system, maintainer announcements face a timing gap: 00:00 - Maintainer announcement rejected → Event discarded 00:02 - Owner announcement accepted (lists maintainer) → Want to re-process 00:02 - ❌ Maintainer announcement GONE → Must wait 24h for next sync With two-tier system: 00:00 - Maintainer announcement rejected → Stored in both tiers 00:02 - Owner announcement accepted → Invalidate + get from hot cache 00:02 - ✅ Re-process immediately → Accepted in <1 second ## Implementation New module: src/sync/rejected_index.rs - RejectedEventsIndex: Public API combining both tiers - HotCache: Internal struct for full event storage - ColdIndex: Internal struct for metadata storage - RejectionReason: Enum for tracking why events were rejected Key methods: - add_announcement(): Add to both tiers - contains(): Check if event is rejected - invalidate_and_get_events(): Remove from cold index, get from hot cache - cleanup_expired(): Remove expired entries from both tiers ## Testing 9 comprehensive unit tests covering: - Hot cache storage and retrieval - Hot cache expiration - Cold index metadata tracking - Cold index invalidation - Two-tier integration - Cleanup of expired entries - Hot cache misses after expiry - Multiple maintainer repositories All tests passing. 
## Next Steps PR2: Switch SyncManager to use new RejectedEventsIndex PR3: Add invalidation + immediate re-processing logic PR4: Add cleanup task + Prometheus metrics Part of: Maintainer chain discovery fix See: work/SOLUTION-SUMMARY-V2.md for full design
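A compact way to picture the two tiers is a pair of maps with different lifetimes. The sketch below is an assumption-heavy simplification: event ids and events are strings, lookups are by event id rather than by pubkey and identifier, and the clock is passed in explicitly so expiry is easy to exercise. Method names follow the commit messages, but their real signatures may differ.

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

/// Simplified two-tier index: hot cache keeps full events, cold index keeps ids only.
pub struct TwoTierIndex {
    hot_ttl: Duration,
    cold_ttl: Duration,
    hot: HashMap<String, (String, Instant)>, // event id -> (full event, stored at)
    cold: HashMap<String, Instant>,          // event id -> stored at
}

impl TwoTierIndex {
    pub fn new(hot_ttl: Duration, cold_ttl: Duration) -> Self {
        TwoTierIndex { hot_ttl, cold_ttl, hot: HashMap::new(), cold: HashMap::new() }
    }

    /// On rejection, record the event in both tiers.
    pub fn add_announcement(&mut self, id: &str, event: &str, now: Instant) {
        self.hot.insert(id.to_string(), (event.to_string(), now));
        self.cold.insert(id.to_string(), now);
    }

    pub fn contains(&self, id: &str) -> bool {
        self.hot.contains_key(id) || self.cold.contains_key(id)
    }

    /// Drop the cold-index record and return the full event if it is still hot.
    pub fn invalidate_and_get(&mut self, id: &str) -> Option<String> {
        self.cold.remove(id);
        self.hot.remove(id).map(|(event, _)| event)
    }

    /// Remove entries older than each tier's time-to-live.
    pub fn cleanup_expired(&mut self, now: Instant) {
        let (hot_ttl, cold_ttl) = (self.hot_ttl, self.cold_ttl);
        self.hot.retain(|_, entry| now.duration_since(entry.1) < hot_ttl);
        self.cold.retain(|_, at| now.duration_since(*at) < cold_ttl);
    }

    /// Union of ids from both tiers, e.g. for excluding them from a sync request.
    pub fn get_all_event_ids(&self) -> HashSet<String> {
        self.hot.keys().chain(self.cold.keys()).cloned().collect()
    }

    pub fn hot_cache_len(&self) -> usize {
        self.hot.len()
    }

    pub fn cold_index_len(&self) -> usize {
        self.cold.len()
    }
}
```

With the defaults from the log (120 seconds hot, 604800 seconds cold), an event rejected at 00:00 can still be re-processed from the hot cache at 00:02, while its id stays excluded from re-fetching for a week.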
2026-01-09Fix sync tests after Syncing status introductionDanConwayDev
- Fix relay_connected() helper to check v >= 2 (Syncing/Connected states) - Fix unit test to use status value 3 (Connected) instead of 1 (Connecting) - Fix clippy warning: use .to_vec() instead of .iter().cloned().collect() All 61 sync integration tests now passing. All 238 unit tests passing. Clippy clean.
2026-01-09refactor(sync): rename ConnectedDegraded to ConnectedHistoricSyncFailuresDanConwayDev
Resolves naming conflict with RelayHealthState::Degraded by using a more explicit name that clearly indicates the connection status relates to historic sync failures, not connection health degradation. Changes: - ConnectionStatus::ConnectedDegraded → ConnectedHistoricSyncFailures - Updated all documentation and comments - Updated Prometheus metric descriptions - Metric value remains 4 for backward compatibility This makes it clear that: - ConnectedHistoricSyncFailures = connection lifecycle (missing historic data) - RelayHealthState::Degraded = connection health (reliability issues) These are orthogonal concerns - a relay can be ConnectedHistoricSyncFailures but Healthy, or Connected but Degraded.
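The numeric status values mentioned across these sync commits (0 to 3, plus 4 for the historic-sync-failure state) could be modelled as below. The `Disconnected` and `Connecting` variant names and the `metric_value` method are assumptions for illustration; only the numbers and the other variant names come from the log.

```rust
/// Connection lifecycle as exposed through the relay status metric.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionStatus {
    Disconnected,
    Connecting,
    Syncing,
    Connected,
    ConnectedHistoricSyncFailures,
}

impl ConnectionStatus {
    /// Hypothetical accessor mapping each status to its gauge value.
    pub fn metric_value(self) -> i64 {
        match self {
            ConnectionStatus::Disconnected => 0,
            ConnectionStatus::Connecting => 1,
            ConnectionStatus::Syncing => 2,
            ConnectionStatus::Connected => 3,
            // Kept at 4 so existing dashboards and alerts keep working after the rename.
            ConnectionStatus::ConnectedHistoricSyncFailures => 4,
        }
    }
}

/// Sketch of a helper that treats any gauge value of 2 or more as "connected".
pub fn relay_connected(v: i64) -> bool {
    v >= 2
}
```

Keeping the value stable while renaming the variant is what lets the rename ship without breaking Prometheus queries that compare against 4.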
2026-01-09feat(sync): add ConnectedDegraded status for failed historic syncDanConwayDev
- Add ConnectionStatus::ConnectedDegraded (status=4 in metrics) - Track batch failures via PendingBatch.failed field - Track relay-level failures via RelayState.historic_sync_had_failures - Transition to ConnectedDegraded when any batch fails during historic sync - Add is_live_sync_active() helper for cleaner match patterns - Update state machine diagram with ConnectedDegraded transitions - Update metrics docs with status=4 and example queries Fixes issue where relays with failed negentropy retries would incorrectly transition to Connected status despite missing data. Now operators can distinguish 'fully synced' vs 'degraded (partial data)'.
2026-01-09feat(sync): add Syncing connection status to track historic sync progressDanConwayDev
- Add ConnectionStatus::Syncing state between Connecting and Connected - Track historic_sync_completed and historic_sync_completed_at in RelayState - Auto-detect sync completion via check_and_complete_historic_sync() - Update metrics: ngit_sync_relay_connected now shows 0-3 (disconnected/connecting/syncing/connected) - Update Prometheus metric documentation with new status values - Add state machine diagram showing Syncing transition - Operators can now distinguish 'connected but catching up' vs 'fully synced'
2026-01-09feat(sync): prevent infinite retry loop in negentropy validationDanConwayDev
Add retry protection to negentropy event validation: - Track retry_count in PendingBatch (incremented on each retry attempt) - Detect when retry makes zero progress (relay returns no requested events) - Abort retry and complete batch with partial results when stuck - Log error with full details when retry protection triggers This prevents infinite loops when: - Relay has bugs and returns wrong events for ID queries - Relay is malicious and returns unrelated events - Relay has eventual consistency issues - Network corruption causes incorrect responses The protection triggers when received_count == 0 on a retry (relay returned nothing we asked for), indicating the relay will never provide the missing events. Future work: Track failed batches in Prometheus metrics (sync_failed_batches_total) for monitoring and alerting.
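The zero-progress rule can be sketched as a small decision function. `PendingBatch` and `retry_count` are named in the commit; the `missing` field, the `NextStep` enum, and `on_batch_eose` are hypothetical names for this sketch, and event ids are plain strings.

```rust
use std::collections::HashSet;

/// Minimal stand-in for the batch being tracked.
pub struct PendingBatch {
    pub missing: Vec<String>, // ids still expected from the relay
    pub retry_count: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Every requested event arrived.
    Complete,
    /// Ask again for the ids that are still missing.
    Retry(Vec<String>),
    /// A retry returned nothing we asked for: stop and keep partial results.
    AbortPartial,
}

/// Called when a round of ID-based requests finishes.
pub fn on_batch_eose(batch: &mut PendingBatch, received: &HashSet<String>) -> NextStep {
    // How many of the ids we asked for actually came back this round.
    let received_count = batch
        .missing
        .iter()
        .filter(|id| received.contains(*id))
        .count();
    batch.missing.retain(|id| !received.contains(id));

    if batch.missing.is_empty() {
        return NextStep::Complete;
    }
    if batch.retry_count > 0 && received_count == 0 {
        return NextStep::AbortPartial;
    }
    batch.retry_count += 1;
    NextStep::Retry(batch.missing.clone())
}
```

The first round is always allowed to retry; only a retry that makes no progress at all aborts, so a relay that returns events in small slices still converges.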
2026-01-09feat(sync): validate negentropy event receipt and retry missing eventsDanConwayDev
Add validation that all events requested by ID during negentropy sync are actually received from the relay. When events are missing: - Log detailed information (requested/received/missing counts and IDs) - Create retry subscriptions for missing events (chunked by 300) - Update batch to track only missing events in next round - Only complete batch after all events received or retry fails This handles relays that have limits on ID-based queries (e.g., max 150 events per query) by automatically retrying in smaller chunks. Also excludes purgatory and rejected announcement events from negentropy requests to avoid re-requesting events we know we can't/won't store. Note: Current implementation lacks retry limit - infinite loop protection needed (tracked as future work).
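A minimal sketch of the validation step follows, assuming string event ids. The function names and the `RETRY_CHUNK_SIZE` constant are hypothetical; only the chunk size of 300 comes from the commit message.

```rust
use std::collections::HashSet;

/// Chunk size for retry subscriptions, as stated in the commit message.
pub const RETRY_CHUNK_SIZE: usize = 300;

/// Ids that were requested but never received, in request order.
pub fn missing_ids(requested: &[String], received: &HashSet<String>) -> Vec<String> {
    requested
        .iter()
        .filter(|id| !received.contains(*id))
        .cloned()
        .collect()
}

/// Split the missing ids into groups, one retry subscription per group.
pub fn retry_chunks(missing: &[String]) -> Vec<Vec<String>> {
    missing
        .chunks(RETRY_CHUNK_SIZE)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

For example, 650 missing ids would produce three retry subscriptions of 300, 300, and 50 ids.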
2026-01-09feat(sync): track and exclude rejected announcement eventsDanConwayDev
Implement RejectedEventsIndex to prevent repeatedly fetching and processing announcement events (kinds 30617/30618) that have been rejected by the write policy. Changes: - Add RejectedEventsIndex to track rejected announcement EventIds - Record rejections in process_event_static when announcements fail write policy validation - Exclude rejected events from negentropy sync (along with purgatory) - Skip rejected events early in REQ+EOSE processing - Add 2 tests verifying tracking and exclusion logic Benefits: - Reduced network traffic (no re-fetching of known-bad events) - Lower CPU usage (no repeated validation) - Faster sync (smaller negentropy diffs) - Better observability (trace logging when skipping) Scope limited to announcements as they are the primary source of repeated rejection cycles during Layer 1 sync. Closes: Reduces wasted bandwidth from continually fetching rejected events
2026-01-09fix: MockSyncContext creates single clone tag with multiple valuesDanConwayDev
The mock was creating multiple clone tags (one per URL), which violated NIP-34 format and triggered validation errors added in commit 92bfbd3. NIP-34 specifies: single clone tag with multiple values ["clone", "https://url1.com", "https://url2.com", ...] NOT multiple clone tags: ["clone", "https://url1.com"] ["clone", "https://url2.com"] This regression caused 7 purgatory::sync::functions tests to fail because RepositoryAnnouncement::from_event() now correctly rejects announcements with multiple clone tags. Fixes: - next_url_skips_throttled_domains - next_url_skips_tried_urls - next_url_filters_our_domain - next_url_with_specific_domain - get_throttled_domains_returns_only_throttled_with_untried - sync_identifier_enqueues_throttled_domains_when_incomplete - sync_identifier_tries_multiple_urls_until_complete All 232 unit tests now pass.
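The tag shape at issue can be shown with plain string vectors. Both functions below are hypothetical helpers for illustration, not the project's `RepositoryAnnouncement::from_event()`; the check only covers the "more than one clone tag" case described above.

```rust
/// Build one `clone` tag carrying every URL as an additional value.
pub fn clone_tag(urls: &[&str]) -> Vec<String> {
    let mut tag = vec!["clone".to_string()];
    tag.extend(urls.iter().map(|u| u.to_string()));
    tag
}

/// True when the tag list contains no more than one `clone` tag.
pub fn has_at_most_one_clone_tag(tags: &[Vec<String>]) -> bool {
    tags.iter()
        .filter(|tag| tag.first().map(|name| name.as_str()) == Some("clone"))
        .count()
        <= 1
}
```

A mock that emits one tag per URL fails this check, which matches the regression the commit describes.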
2026-01-09feat: replace owner-npub with relay-owner-nsec for persistent operator identityDanConwayDev
Replace the owner-npub configuration option with relay-owner-nsec to provide a persistent cryptographic identity for the relay operator. This addresses NIP-42 authentication requirements discovered during sync debugging. Motivation: - Some relays (e.g., relay.damus.io) require NIP-42 authentication for advanced features like NIP-77 negentropy sync - Previously used random ephemeral keys per connection, providing no persistent identity - Other relays can now recognize us by pubkey for reputation-based rate limiting - Ensures consistency between NIP-11 pubkey and authentication key Changes: - Config: relay_owner_nsec with auto-load/generate from .relay-owner.nsec - NIP-11: Pubkey derived from nsec instead of separate npub field - Sync: RelayConnection now uses operator keys for NIP-42 auth - Docs: Updated README, .env.example, and added .relay-owner.nsec to gitignore Key Features: - Auto-generates key on first run and saves to .relay-owner.nsec - Loads existing key from file on subsequent runs - Can override via CLI flag or environment variable - Enables reputation building across relay network - Future-ready for event signing and WoT calculations Testing: - 225/232 tests passing (7 pre-existing purgatory failures unrelated) - Verified key generation, loading, and NIP-11 derivation - Release build successful Related: work/sync-debug-analysis.md, work/relay-owner-nsec-implementation.md
2026-01-08fix: filter out malformed announcements generated by gittrDanConwayDev
2026-01-08fix: remove debug logging entry triggering every 2sDanConwayDev
2026-01-08fix: sync-bootstrap-relay-url scheme optionalDanConwayDev
2026-01-08fix: sync uses bind_address rather than service-domain for self subscriberDanConwayDev
2026-01-08refactor: replace hardcoded Kind constants with rust-nostr variantsDanConwayDev
- Replace KIND_REPOSITORY_ANNOUNCEMENT with Kind::GitRepoAnnouncement - Replace KIND_REPOSITORY_STATE with Kind::RepoState - Replace KIND_PR with Kind::GitPullRequest - Replace KIND_PR_UPDATE with Kind::GitPullRequestUpdate - Replace KIND_USER_GRASP_LIST with Kind::GitUserGraspList - Replace KIND_PATCH with Kind::GitPatch - Replace KIND_ISSUE with Kind::GitIssue - Replace KIND_COMMENT with Kind::Comment - Replace all Kind::Custom(30617|30618|1617|1618|1619|1621|1111|10317) patterns - Remove all hardcoded KIND_* constants from events.rs - Update all match statements to use Kind enum directly - Update all filter builders to use Kind variants - Update all test helpers and assertions Benefits: - Type safety: compiler prevents wrong kind numbers - Readability: Kind::GitRepoAnnouncement is self-documenting - Maintainability: single source of truth (rust-nostr) - IDE support: full autocompletion and refactoring - Standards: aligns with rust-nostr best practices Files modified: 21 Constants removed: 9 Patterns replaced: 100+ Tests passing: 222/222
2026-01-08chore: upgrade nostr-* packages to rev 4767ad13DanConwayDev
- Update nostr-relay-builder, nostr-sdk, nostr-lmdb to latest revision - Update grasp-audit nostr-sdk dependency - Fix clippy warnings: - Replace .clone() with std::slice::from_ref() in src/git/sync.rs - Change &PathBuf to &Path in tests/common/git_server.rs - Replace vec![] with array literal in src/purgatory/sync/functions.rs - Update PR_TEST_COMMIT_HASH in grasp-audit due to event generation changes All 249 tests passing, no breaking changes required.
2026-01-08chore: cargo fmtDanConwayDev
2026-01-08test: disable GPG signing in all test helpersDanConwayDev
Prevent GPG signing prompts (including Yubikey activation) during test runs by explicitly disabling commit.gpgsign and tag.gpgsign in all test repository creation helpers. Modified: - tests/common/purgatory_helpers.rs: create_test_repo_with_commit() - src/git/mod.rs: create_test_repo_with_commit() - src/purgatory/helpers.rs: create_test_repo_with_commit() All test repositories now have GPG signing disabled regardless of global git configuration.
2026-01-08feat(purgatory): track expired events to prevent infinite re-sync loopsDanConwayDev
Adds expired event tracking to prevent proactive sync from repeatedly fetching and re-adding events that expired from purgatory without finding git data. Key features: - Track expired events for 7 days to prevent re-sync loops - Distinguish synced vs user-submitted events (via socket address) - Allow users to retry expired events (git data might now be available) - Reject synced expired events (prevents infinite loop) - Daily cleanup of expired event records older than 7 days Implementation: - Added expired_events: DashMap<EventId, Instant> to Purgatory - Updated event_ids() to include both purgatory + expired events - Added is_expired(), mark_expired(), cleanup_expired_events() - Updated cleanup() to mark expired events automatically - Added is_synced detection in WritePolicy (localhost:0 = synced) - Policy layer checks is_synced && is_expired() before rejecting Behavior: - Negentropy: Filters expired events before fetching (optimal) - REQ+EOSE: Rejects synced expired events at policy layer - User submissions: Always allowed to retry (skip expired check) Testing: - Added 5 new tests for expired event tracking - All 222 tests passing Fixes the infinite re-sync loop where events without git data would expire, get synced again, expire again, repeat forever.
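The rule "reject only when the event is both synced and already expired" is small enough to sketch directly. This uses a plain `HashMap` and string ids rather than the `DashMap<EventId, Instant>` named in the commit, passes the clock in explicitly, and introduces `ExpiredTracker` and `reject_as_expired` as hypothetical names.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Expired-event records are kept for 7 days, per the commit message.
pub const EXPIRED_RETENTION: Duration = Duration::from_secs(7 * 24 * 60 * 60);

#[derive(Default)]
pub struct ExpiredTracker {
    expired_events: HashMap<String, Instant>, // event id -> time it expired
}

impl ExpiredTracker {
    pub fn mark_expired(&mut self, id: &str, now: Instant) {
        self.expired_events.insert(id.to_string(), now);
    }

    pub fn is_expired(&self, id: &str) -> bool {
        self.expired_events.contains_key(id)
    }

    /// Drop records older than the retention window.
    pub fn cleanup_expired_events(&mut self, now: Instant) {
        self.expired_events
            .retain(|_, at| now.duration_since(*at) < EXPIRED_RETENTION);
    }
}

/// Synced copies of expired events are rejected; user submissions may retry.
pub fn reject_as_expired(tracker: &ExpiredTracker, id: &str, is_synced: bool) -> bool {
    is_synced && tracker.is_expired(id)
}
```

Letting user submissions through is what keeps the retry path open when git data becomes available later, while the synced path no longer loops.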
2026-01-07refactor: unify event processing logicDanConwayDev
Eliminates code duplication by extracting core event processing into reusable functions. All state and PR event processing now uses the same unified logic from src/git/process.rs. Changes: - Add src/git/process.rs with unified processing functions - process_state_with_git_data() for state events - process_pr_with_git_data() for PR events - Pure functions with comprehensive result types - Refactor policy handlers to use unified processing - src/nostr/policy/state.rs: Remove ~70 lines of duplicated logic - src/nostr/policy/pr_event.rs: Remove ~40 lines of duplicated logic - Refactor purgatory processing to use unified functions - src/git/sync.rs: Remove ~125 lines of duplicated logic - Make extract_owner_from_repo_path() public for reuse Benefits: - DRY: Single source of truth for event processing - Testable: Pure functions with clear contracts - Maintainable: Changes happen in one place - Consistent: All code paths use same logic All 217 unit tests + 40 integration tests pass (257/257).
2026-01-07fix: refs/nostr/<event-id> gets removed after 30m if no event arrivesDanConwayDev
We forgot to add the placeholder entry.
2026-01-07Add Git protocol v2 support to fix modern git client compatibilityDanConwayDev
Modern git clients (observed with 2.51.0) default to protocol v2 and send the Git-Protocol header. The server must pass the header's value to git processes via the GIT_PROTOCOL environment variable for proper negotiation. Changes: - Extract Git-Protocol header in HTTP layer (src/http/mod.rs) - Pass git_protocol parameter through all handler functions - Set GIT_PROTOCOL env var when spawning git subprocesses - Update all tests to pass None for backward compatibility This fixes hangs/timeouts when modern git clients connect to the server. Fixes issue discovered in work/2025-01-07-pr-clone-tag-sync-investigation.md
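Forwarding the header amounts to setting one environment variable on the child process. The sketch below uses only `std::process::Command`; the `git_command` function and its parameters are hypothetical and do not mirror the project's handler signatures.

```rust
use std::process::Command;

/// Build the git subprocess for a smart-HTTP request.
/// `git_protocol` is the value of the client's Git-Protocol header, if any
/// (for example "version=2").
pub fn git_command(service: &str, repo_path: &str, git_protocol: Option<&str>) -> Command {
    let mut cmd = Command::new("git");
    cmd.arg(service).arg("--stateless-rpc").arg(repo_path);
    if let Some(proto) = git_protocol {
        // Without this, git falls back to the older protocol and a v2 client
        // can stall waiting for a v2 response.
        cmd.env("GIT_PROTOCOL", proto);
    }
    cmd
}
```

Passing `None` leaves the environment untouched, which matches the "pass None for backward compatibility" note for existing tests.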
2026-01-07fix: resolve clippy warningsDanConwayDev
- Prefix unused variable auth_result with underscore - Prefix unused field git_data_path with underscore in Purgatory struct - Add #[allow(clippy::too_many_arguments)] to handle_receive_pack - Replace len() >= 1 with !is_empty() - Replace .last() with .next_back() on DoubleEndedIterator - Fix doc list item overindentation - Replace map_or(true, ...) with is_none_or(...) - Replace map_or(false, ...) with is_some_and(...)
2026-01-07feat(sync): extract clone URLs from PR events in purgatoryDanConwayDev
Add support for extracting clone URLs from PR/PR-Update events (kind 1618/1619) during purgatory sync, per NIP-34 specification. This enables fetching PR commits from URLs specified in the PR event itself, not just from repository announcement clone URLs. Changes: - Add collect_pr_clone_urls() to SyncContext trait - Implement in RealSyncContext: extract clone tags from PR events in purgatory - Implement in MockSyncContext: configurable PR clone URLs for testing - Update sync_identifier_next_url to merge PR clone URLs with announcement URLs - Update get_throttled_domains_with_untried_urls with same merge logic - Add unit tests for PR clone URL extraction and filtering
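One plausible shape for the merge step is an order-preserving union that skips the relay's own domain. This is a sketch under assumptions: `merge_clone_urls` and `host_of` are hypothetical names, the host extraction is deliberately naive (no ports or userinfo handling), and throttling is left out entirely.

```rust
use std::collections::HashSet;

/// Naive host extraction: text between "://" and the next '/'.
fn host_of(url: &str) -> &str {
    let rest = url.split("://").nth(1).unwrap_or(url);
    rest.split('/').next().unwrap_or(rest)
}

/// Merge announcement and PR clone URLs, keeping first-seen order,
/// dropping duplicates and URLs hosted on our own domain.
pub fn merge_clone_urls(
    announcement_urls: &[&str],
    pr_urls: &[&str],
    our_domain: &str,
) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for url in announcement_urls.iter().chain(pr_urls.iter()) {
        if host_of(url) == our_domain {
            continue; // fetching from ourselves cannot supply missing data
        }
        if seen.insert(*url) {
            merged.push(url.to_string());
        }
    }
    merged
}
```

Listing announcement URLs first keeps the previous fetch order intact and only appends PR-specific sources after them.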
2026-01-07test: add test_state_event_syncs_from_remote integration testDanConwayDev
Implements Phase 3 of the purgatory sync integration test plan. Key changes: - Add immediate sync triggering for sync-received events that go to purgatory (instead of default 3-minute delay for user-submitted events) - TestRelay now respects RUST_LOG environment variable for debugging - New test verifies end-to-end flow: state event syncs from source relay, enters purgatory, git data is fetched from source's clone URL, and event is released and served
2026-01-07Wire up new purgatory sync loop, remove legacy sync_state_git_dataDanConwayDev
Phase 13 of purgatory-sync-redesign: - Add sync loop startup in main.rs (RealSyncContext + ThrottleManager + start_sync_loop) - Update add_state() and add_pr() to automatically enqueue for background sync - Remove start_state_sync() call from state.rs (now handled by sync loop) - Remove orphaned legacy functions: sync_state_git_data, fetch_missing_oids_from_server, get_most_complete_local_repo, identify_missing_oids, get_date_of_most_recent_commit_on_default_branch - Clean up unused imports in purgatory/mod.rs
2026-01-07Add RealSyncContext implementation for production purgatory syncDanConwayDev
Implement the production SyncContext that connects to real systems: - RealSyncContext struct holding purgatory, database, git_data_path, our_domain, and local_relay references - fetch_repository_data: delegates to git::authorization module - collect_needed_oids: collects commit hashes from state events (branches/tags) and PR events (c-tag) in purgatory - oid_exists: delegates to git::oid_exists function - fetch_oids: uses git fetch --depth=1 to retrieve specific OIDs from remote servers, running in spawn_blocking for async safety - process_newly_available_git_data: delegates to the unified function in git::sync module for consistent post-git-data processing - has_pending_events: delegates to purgatory method - find_target_repo: finds first existing owner repository on disk - our_domain: returns configured domain for clone URL filtering This enables the purgatory sync loop to use real database queries, git operations, and event processing instead of mocks.
2026-01-07refactor: remove align_repository_with_state duplicationDanConwayDev
- Remove duplicate AlignmentResult struct from nostr/policy/state.rs - Remove duplicate align_repository_with_state method from StatePolicy - Import and use the canonical implementation from git::sync - Re-export AlignmentResult from git::sync in policy/mod.rs The git::sync version is preferred as it: - Handles symbolic refs (ref:) properly by skipping them - Uses git::oid_exists which is more general than git::commit_exists - Has a cleaner iteration pattern (delete first, then update/create)
2026-01-07git: removed duplicate default branch updateDanConwayDev
this is now handled through process_newly_available_git_data
2026-01-07purgatory: more robust process_purgatory_state_events syncingDanConwayDev
2026-01-07purgatory: improve process_newly_available_git_data state event syncDanConwayDev
2026-01-07Refactor handle_receive_pack to use unified process_newly_available_git_dataDanConwayDev
Replace ~100 lines of duplicated post-push processing in handle_receive_pack with a single call to the unified process_newly_available_git_data function. The unified function handles all post-git-data-available processing: - Discovering satisfiable events from purgatory (state and PR events) - Syncing OIDs to authorized owner repos - Aligning refs (+ setting HEAD) in all owner repos - Saving events to database - Notifying WebSocket subscribers - Removing from purgatory This ensures consistent behavior regardless of how git data arrives (git push vs purgatory sync fetching from remote servers). Also mark test-only internal methods with #[cfg(test)] to silence dead code warnings.
2026-01-07Add unified process_newly_available_git_data functionDanConwayDev
Implement the unified function that handles all post-git-data-available processing, regardless of how data arrived (git push or purgatory sync). This function: - Discovers satisfiable events from purgatory (state and PR events) - Syncs OIDs to authorized owner repos - Aligns refs and sets HEAD - Saves events to database - Notifies WebSocket subscribers - Removes from purgatory New additions: - ProcessResult struct for tracking processing outcomes - process_newly_available_git_data async function in src/git/sync.rs - Helper functions: extract_identifier_from_repo_path, extract_identifier_from_pr_event - Purgatory::find_prs_for_identifier method for PR event discovery - Unit tests for all helper functions Also fixes: - Simplified extract_domain to avoid url crate dependency - Removed unused imports in sync/loop.rs
2026-01-07Add background sync loop for purgatory identifier processingDanConwayDev
Implement the main sync loop that runs in the background and processes identifiers that are ready for git data synchronization: - Runs every 1 second (hardcoded interval, not configurable) - Finds all ready identifiers where !in_progress && next_attempt <= now - Spawns parallel tasks for each ready identifier - Each task calls sync_identifier to try fetching git data from remotes - Applies backoff when sync completes but events remain in purgatory - Removes identifiers from queue when sync completes or no events remain The loop integrates with the existing sync infrastructure: - Uses SyncContext trait for testability - Uses ThrottleManager for domain-based rate limiting - Uses sync_identifier for the actual fetch orchestration This enables automatic background fetching of git data for events in purgatory, complementing the existing push-triggered sync path.
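The readiness check named above (`!in_progress && next_attempt <= now`) can be sketched as a pure function. The struct and function names here are illustrative assumptions; the real queue lives inside Purgatory and the real loop spawns async tasks for each result.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical, trimmed-down queue entry with only the fields the check needs.
pub struct QueueEntrySketch {
    pub next_attempt: Instant,
    pub in_progress: bool,
}

/// Return the identifiers that are due for a sync attempt at `now`.
/// Sorted so the output is deterministic regardless of HashMap order.
pub fn ready_identifiers(queue: &HashMap<String, QueueEntrySketch>, now: Instant) -> Vec<String> {
    let mut ready: Vec<String> = queue
        .iter()
        .filter(|(_, entry)| !entry.in_progress && entry.next_attempt <= now)
        .map(|(identifier, _)| identifier.clone())
        .collect();
    ready.sort();
    ready
}
```

Taking `now` as a parameter rather than calling `Instant::now()` inside keeps the check testable without sleeping.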
2026-01-07Add sync queue to Purgatory with enqueue_sync and has_pending_eventsDanConwayDev
- Add sync_queue field to Purgatory struct for tracking identifiers that need background git data fetching - Implement enqueue_sync() with debouncing - resets attempt_count and updates next_attempt when new events arrive for an identifier already in queue - Add enqueue_sync_default() for user-submitted events (3 minute delay to wait for git push) - Add enqueue_sync_immediate() for sync-triggered events (500ms delay for batching burst arrivals) - Implement has_pending_events() to check if an identifier has state events or PR events in purgatory - Add helper methods: sync_queue(), remove_from_sync_queue(), sync_queue_size() - Add unit tests for debouncing behavior and pending event detection
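A rough sketch of the debouncing behaviour follows. The delays (3 minutes and 500ms) and method names come from the commit message; the struct layout, the explicit `now` parameter, and the lack of locking are simplifying assumptions.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-identifier queue state.
pub struct EntrySketch {
    pub next_attempt: Instant,
    pub attempt_count: u32,
    pub in_progress: bool,
}

#[derive(Default)]
pub struct SyncQueueSketch {
    pub entries: HashMap<String, EntrySketch>,
}

impl SyncQueueSketch {
    /// Insert the identifier, or debounce it if already queued: the backoff
    /// counter is reset and the next attempt is pushed out to `now + delay`.
    pub fn enqueue_sync(&mut self, identifier: &str, delay: Duration, now: Instant) {
        let entry = self.entries.entry(identifier.to_string()).or_insert(EntrySketch {
            next_attempt: now + delay,
            attempt_count: 0,
            in_progress: false,
        });
        entry.attempt_count = 0;
        entry.next_attempt = now + delay;
    }

    /// User-submitted events: wait 3 minutes for the matching git push.
    pub fn enqueue_sync_default(&mut self, identifier: &str, now: Instant) {
        self.enqueue_sync(identifier, Duration::from_secs(180), now);
    }

    /// Sync-received events: wait 500ms so bursts are batched together.
    pub fn enqueue_sync_immediate(&mut self, identifier: &str, now: Instant) {
        self.enqueue_sync(identifier, Duration::from_millis(500), now);
    }
}
```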
2026-01-07Add sync_identifier orchestration and ThrottleManager queue processingDanConwayDev
Implement the main sync orchestration function and trigger-based queue processing for throttled domains: sync_identifier function: - Orchestrates syncing git data for a single identifier - Tries all non-throttled URLs in sequence - Checks completion after each fetch (no pending events or all OIDs fetched) - Enqueues with throttled domains when non-throttled URLs are exhausted - Returns true if complete, false if events remain (for backoff) ThrottleManager enhancements: - Add set_context() to provide SyncContext for queue processing - Add try_process_next() to spawn tasks when capacity frees - Add process_queued_identifier() to handle queued work - Update complete_request() to trigger processing on completion - Update enqueue_identifier() to trigger processing when capacity available - Add internal methods for non-Arc testing compatibility Generic function updates: - Add ?Sized bound to sync_identifier_next_url, sync_identifier_from_url, sync_identifier, and get_throttled_domains_with_untried_urls for dynamic dispatch support (Arc<dyn SyncContext>) Tests: - sync_identifier_tries_multiple_urls_until_complete: verifies sequential URL fetching until all OIDs are available - sync_identifier_enqueues_throttled_domains_when_incomplete: verifies throttled domains get the identifier enqueued for later processing - has_queued_work_reflects_queue_state: verifies queue state tracking
2026-01-07Add core sync functions for identifier-based purgatory synchronizationDanConwayDev
Implement sync_identifier_next_url and sync_identifier_from_url functions that provide the core URL selection and fetch logic for purgatory sync. sync_identifier_next_url: - Pure URL selection logic with no side effects - Filters out our own domain and already-tried URLs - Respects domain throttling when domain parameter is None - Can target a specific domain when domain parameter is Some sync_identifier_from_url: - Fetches OIDs from a specific URL via the SyncContext - Tracks request start/completion with ThrottleManager for rate limiting - Calls process_newly_available_git_data on successful fetch Also adds get_throttled_domains_with_untried_urls helper for the main sync loop to know which DomainThrottle queues to enqueue identifiers to. These functions are designed to be called by both: - Main sync loop (tries non-throttled URLs immediately) - DomainThrottle queue processing (when capacity frees up) Includes 10 unit tests covering: - Throttled domain skipping - Tried URL skipping - Our domain filtering - Specific domain targeting - Fetch success/failure handling - Throttle request tracking
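The URL selection rules listed above can be sketched as a side-effect-free function. This is an approximation under assumptions: the parameter list, the `HashSet` inputs, and this simplified `extract_domain` are all hypothetical stand-ins for the real SyncContext and ThrottleManager lookups.

```rust
use std::collections::HashSet;

/// Simplified host extraction: strips scheme, path, userinfo and port.
/// Hypothetical; not the repository's extract_domain.
fn extract_domain(url: &str) -> Option<&str> {
    let (_, rest) = url.split_once("://")?;
    let authority = rest.split('/').next()?;
    let host_port = authority.rsplit('@').next()?;
    let host = host_port.split(':').next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// Pick the next clone URL to try, or None if nothing is eligible.
/// With `target_domain = None`, throttled domains are skipped; with
/// `Some(domain)`, only URLs on that domain are considered.
pub fn next_url<'a>(
    clone_urls: &'a [String],
    our_domain: &str,
    tried: &HashSet<String>,
    throttled: &HashSet<String>,
    target_domain: Option<&str>,
) -> Option<&'a str> {
    for url in clone_urls {
        let Some(domain) = extract_domain(url) else { continue };
        if domain == our_domain || tried.contains(url.as_str()) {
            continue; // never fetch from ourselves or retry a URL
        }
        let eligible = match target_domain {
            Some(target) => domain == target,
            None => !throttled.contains(domain),
        };
        if eligible {
            return Some(url.as_str());
        }
    }
    None
}
```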
2026-01-07Add SyncContext trait and MockSyncContext for purgatory syncDanConwayDev
Implement the abstraction layer for purgatory sync operations: - SyncContext trait: defines interface for repository data fetching, OID existence checks, git fetch operations, and event processing - ProcessResult: captures outcomes when releasing events from purgatory - MockSyncContext: test mock with builder pattern for configuring: - Clone URLs and which OIDs each URL provides - Needed OIDs (simulates purgatory state) - URL failure simulation - Fetch logging for assertions The trait uses async_trait for async method support and requires Send + Sync for use in concurrent sync operations. This abstraction enables unit testing of sync logic without I/O, while the real implementation (to be added later) will connect to actual database, git, and relay systems.
2026-01-07Add ThrottleManager for cross-domain rate limitingDanConwayDev
Implements ThrottleManager which manages all per-domain DomainThrottle instances and provides: - Throttle status checking via is_throttled() for sync URL selection - Request tracking via start_request()/complete_request() - Identifier queue management via enqueue_identifier() - Automatic domain throttle creation on first access - Thread-safe access via DashMap with Mutex-wrapped throttles The manager uses the configured max_concurrent and max_per_minute limits for all domains. Trigger-based queue processing (set_context, process_queued_identifier) will be added after SyncContext is available. Tests verify: - is_throttled reflects domain capacity correctly - enqueue_identifier creates domain throttle if needed - start_request creates domain throttle if needed
2026-01-07Add DomainThrottle for per-domain rate limitingDanConwayDev
Implement per-domain throttling for purgatory sync operations: - Concurrent request limit (max in-flight requests per domain) - Rate limit (max requests per minute via sliding window) - Fair round-robin queue processing across identifiers - In-progress tracking to prevent duplicate fetches - Tried URL tracking per identifier Add indexmap dependency for ordered iteration in round-robin queue. Includes 6 unit tests covering: - Concurrent limit enforcement - Rate limit enforcement (sliding window) - Round-robin fair processing - In-progress identifier skipping - Round-robin index adjustment on removal - Tried URL merging on re-enqueue
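The two limits (concurrent requests and a per-minute sliding window) can be illustrated with the sketch below. It omits the round-robin queue and tried-URL tracking, and the type name and method signatures are assumptions chosen to echo the names in these commits rather than the actual DomainThrottle API.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

pub struct DomainThrottleSketch {
    max_concurrent: usize,
    max_per_minute: usize,
    in_flight: usize,
    recent_starts: VecDeque<Instant>, // start times within the last minute
}

impl DomainThrottleSketch {
    pub fn new(max_concurrent: usize, max_per_minute: usize) -> Self {
        Self { max_concurrent, max_per_minute, in_flight: 0, recent_starts: VecDeque::new() }
    }

    /// Drop start times that have slid out of the 60-second window.
    fn prune(&mut self, now: Instant) {
        while let Some(&oldest) = self.recent_starts.front() {
            if now.duration_since(oldest) >= Duration::from_secs(60) {
                self.recent_starts.pop_front();
            } else {
                break;
            }
        }
    }

    /// True if either limit is currently reached.
    pub fn is_throttled(&mut self, now: Instant) -> bool {
        self.prune(now);
        self.in_flight >= self.max_concurrent || self.recent_starts.len() >= self.max_per_minute
    }

    /// Try to reserve a slot; returns false if the domain is throttled.
    pub fn start_request(&mut self, now: Instant) -> bool {
        if self.is_throttled(now) {
            return false;
        }
        self.in_flight += 1;
        self.recent_starts.push_back(now);
        true
    }

    /// Release a concurrency slot; the rate window is unaffected.
    pub fn complete_request(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }
}
```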
2026-01-07Add SyncQueueEntry with exponential backoff for purgatory syncDanConwayDev
Implement the sync queue entry struct that tracks sync state per identifier: - next_attempt: when the next sync should be attempted - attempt_count: for backoff calculation (resets on new events) - in_progress: prevents concurrent syncs for same identifier Backoff schedule: 20s → 40s → 80s → 120s (capped at 2 minutes) This is the foundation for the identifier-based purgatory sync system that will replace the current per-event syncing approach.
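The backoff schedule can be expressed as a small function. This sketch assumes `attempt_count` starts at 0 for the first delay and doubles from 20 seconds; the function name and constants are illustrative, and the repository may compute the same schedule differently.

```rust
use std::time::Duration;

/// Delay before the next sync attempt: 20s, 40s, 80s, then capped at 120s.
pub fn backoff_delay(attempt_count: u32) -> Duration {
    const BASE_SECS: u64 = 20;
    const CAP_SECS: u64 = 120;
    // Clamp the exponent so very large attempt counts cannot overflow the shift.
    let exponent = attempt_count.min(10);
    let secs = BASE_SECS.saturating_mul(1u64 << exponent).min(CAP_SECS);
    Duration::from_secs(secs)
}
```

Because `attempt_count` resets when new events arrive, an identifier that receives fresh events returns to the 20-second delay.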
2026-01-05sync PR refs to all relevant reposDanConwayDev
2026-01-05sync PR refs (refs/nostr/<event-id>) to all owner repos when push receivedDanConwayDev
When a push to refs/nostr/<event-id> is received (PR data), the git data is now synced to all other owner repositories that share maintainers with the source owner. This mirrors the behavior added for state event data. Changes: - Add sync_pr_refs_to_owner_repos() function in git/sync.rs - Add PrSyncResult struct to track sync statistics - Add copy_single_commit_between_repos() helper function - Call PR sync in handle_receive_pack after successful push - Add unit test for PrSyncResult default values
2026-01-05sync all repos when authorised state data push receivedDanConwayDev
2026-01-05purgatory: git data sync applies state and saves eventDanConwayDev
2026-01-05purgatory: state git data sync use single command to fetch oidsDanConwayDev
2026-01-05purgatory: add state git data syncDanConwayDev
2026-01-02sync: use purgatoryDanConwayDev
Don't save new events destined for purgatory directly to the db or serve them on websockets. Don't download events already in purgatory via negentropy sync.
2025-12-31purgatory: when state data received sync across repositoriesDanConwayDev
2025-12-31purgatory: fix pr event receive codeDanConwayDev
2025-12-31purgatory: fix state event receive codeDanConwayDev