upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

summaryrefslogtreecommitdiff
path: root/src/purgatory/sync/mod.rs
AgeCommit message (Collapse)Author
2026-01-07Add RealSyncContext implementation for production purgatory syncDanConwayDev
Implement the production SyncContext that connects to real systems: - RealSyncContext struct holding purgatory, database, git_data_path, our_domain, and local_relay references - fetch_repository_data: delegates to git::authorization module - collect_needed_oids: collects commit hashes from state events (branches/tags) and PR events (c-tag) in purgatory - oid_exists: delegates to git::oid_exists function - fetch_oids: uses git fetch --depth=1 to retrieve specific OIDs from remote servers, running in spawn_blocking for async safety - process_newly_available_git_data: delegates to the unified function in git::sync module for consistent post-git-data processing - has_pending_events: delegates to purgatory method - find_target_repo: finds first existing owner repository on disk - our_domain: returns configured domain for clone URL filtering This enables the purgatory sync loop to use real database queries, git operations, and event processing instead of mocks.
2026-01-07Add background sync loop for purgatory identifier processingDanConwayDev
Implement the main sync loop that runs in the background and processes identifiers that are ready for git data synchronization: - Runs every 1 second (hardcoded interval, not configurable) - Finds all ready identifiers where !in_progress && next_attempt <= now - Spawns parallel tasks for each ready identifier - Each task calls sync_identifier to try fetching git data from remotes - Applies backoff when sync completes but events remain in purgatory - Removes identifiers from queue when sync completes or no events remain The loop integrates with the existing sync infrastructure: - Uses SyncContext trait for testability - Uses ThrottleManager for domain-based rate limiting - Uses sync_identifier for the actual fetch orchestration This enables automatic background fetching of git data for events in purgatory, complementing the existing push-triggered sync path.
2026-01-07Add sync_identifier orchestration and ThrottleManager queue processingDanConwayDev
Implement the main sync orchestration function and trigger-based queue processing for throttled domains: sync_identifier function: - Orchestrates syncing git data for a single identifier - Tries all non-throttled URLs in sequence - Checks completion after each fetch (no pending events or all OIDs fetched) - Enqueues with throttled domains when non-throttled URLs are exhausted - Returns true if complete, false if events remain (for backoff) ThrottleManager enhancements: - Add set_context() to provide SyncContext for queue processing - Add try_process_next() to spawn tasks when capacity frees - Add process_queued_identifier() to handle queued work - Update complete_request() to trigger processing on completion - Update enqueue_identifier() to trigger processing when capacity available - Add internal methods for non-Arc testing compatibility Generic function updates: - Add ?Sized bound to sync_identifier_next_url, sync_identifier_from_url, sync_identifier, and get_throttled_domains_with_untried_urls for dynamic dispatch support (Arc<dyn SyncContext>) Tests: - sync_identifier_tries_multiple_urls_until_complete: verifies sequential URL fetching until all OIDs are available - sync_identifier_enqueues_throttled_domains_when_incomplete: verifies throttled domains get the identifier enqueued for later processing - has_queued_work_reflects_queue_state: verifies queue state tracking
2026-01-07Add core sync functions for identifier-based purgatory synchronizationDanConwayDev
Implement sync_identifier_next_url and sync_identifier_from_url functions that provide the core URL selection and fetch logic for purgatory sync. sync_identifier_next_url: - Pure URL selection logic with no side effects - Filters out our own domain and already-tried URLs - Respects domain throttling when domain parameter is None - Can target a specific domain when domain parameter is Some sync_identifier_from_url: - Fetches OIDs from a specific URL via the SyncContext - Tracks request start/completion with ThrottleManager for rate limiting - Calls process_newly_available_git_data on successful fetch Also adds get_throttled_domains_with_untried_urls helper for the main sync loop to know which DomainThrottle queues to enqueue identifiers to. These functions are designed to be called by both: - Main sync loop (tries non-throttled URLs immediately) - DomainThrottle queue processing (when capacity frees up) Includes 10 unit tests covering: - Throttled domain skipping - Tried URL skipping - Our domain filtering - Specific domain targeting - Fetch success/failure handling - Throttle request tracking
2026-01-07Add SyncContext trait and MockSyncContext for purgatory syncDanConwayDev
Implement the abstraction layer for purgatory sync operations: - SyncContext trait: defines interface for repository data fetching, OID existence checks, git fetch operations, and event processing - ProcessResult: captures outcomes when releasing events from purgatory - MockSyncContext: test mock with builder pattern for configuring: - Clone URLs and which OIDs each URL provides - Needed OIDs (simulates purgatory state) - URL failure simulation - Fetch logging for assertions The trait uses async_trait for async method support and requires Send + Sync for use in concurrent sync operations. This abstraction enables unit testing of sync logic without I/O, while the real implementation (to be added later) will connect to actual database, git, and relay systems.
2026-01-07Add ThrottleManager for cross-domain rate limitingDanConwayDev
Implements ThrottleManager which manages all per-domain DomainThrottle instances and provides: - Throttle status checking via is_throttled() for sync URL selection - Request tracking via start_request()/complete_request() - Identifier queue management via enqueue_identifier() - Automatic domain throttle creation on first access - Thread-safe access via DashMap with Mutex-wrapped throttles The manager uses the configured max_concurrent and max_per_minute limits for all domains. Trigger-based queue processing (set_context, process_queued_identifier) will be added after SyncContext is available. Tests verify: - is_throttled reflects domain capacity correctly - enqueue_identifier creates domain throttle if needed - start_request creates domain throttle if needed
2026-01-07Add DomainThrottle for per-domain rate limitingDanConwayDev
Implement per-domain throttling for purgatory sync operations: - Concurrent request limit (max in-flight requests per domain) - Rate limit (max requests per minute via sliding window) - Fair round-robin queue processing across identifiers - In-progress tracking to prevent duplicate fetches - Tried URL tracking per identifier Add indexmap dependency for ordered iteration in round-robin queue. Includes 6 unit tests covering: - Concurrent limit enforcement - Rate limit enforcement (sliding window) - Round-robin fair processing - In-progress identifier skipping - Round-robin index adjustment on removal - Tried URL merging on re-enqueue
2026-01-07Add SyncQueueEntry with exponential backoff for purgatory syncDanConwayDev
Implement the sync queue entry struct that tracks sync state per identifier: - next_attempt: when the next sync should be attempted - attempt_count: for backoff calculation (resets on new events) - in_progress: prevents concurrent syncs for same identifier Backoff schedule: 20s → 40s → 80s → 120s (capped at 2 minutes) This is the foundation for the identifier-based purgatory sync system that will replace the current per-event syncing approach.