upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

author DanConwayDev <DanConwayDev@protonmail.com> 2025-12-08 20:31:18 +0000
committer DanConwayDev <DanConwayDev@protonmail.com> 2025-12-08 20:31:18 +0000
commit 103ede51485601892af1df6dab9f96f232b10f49
tree f4c7f7d4f5afc235d54dfe4437310c3c96cc97c8 /docs/explanation
parent 20388e4e76864195860dfab81a5b80725184e7c9
redesign sync - architecture tweaks
Diffstat (limited to 'docs/explanation')
-rw-r--r-- docs/explanation/grasp-02-proactive-sync-v2.md | 72
1 file changed, 69 insertions, 3 deletions
diff --git a/docs/explanation/grasp-02-proactive-sync-v2.md b/docs/explanation/grasp-02-proactive-sync-v2.md
index faa9255..d1bbe14 100644
--- a/docs/explanation/grasp-02-proactive-sync-v2.md
+++ b/docs/explanation/grasp-02-proactive-sync-v2.md
@@ -11,6 +11,34 @@ This document presents a simplified redesign of the proactive sync module. The k
 3. **Efficiency**: Minimize connections and bandwidth through filter consolidation
 4. **Consistency**: Use unified filters for both live sync and catchup
 
+## Scale Targets & Upper Bounds
+
+This design targets the following scale:
+
+| Metric | Target | Notes |
+|--------|--------|-------|
+| **Repositories** | 1,000 | Repos we host/track |
+| **Root events per repo** | 50 (avg) | PRs, Issues, Patches per repo |
+| **Total relays in ecosystem** | 100 | Unique relays across all repos |
+| **Relays per repo** | 5 (avg) | Relays listed in each repo's announcement |
+| **Total root events** | ~50,000 | 1,000 repos × 50 events |
+| **Sync connections** | ~50-100 | Based on relay overlap |
+
+**Memory Estimate (in-memory HashMaps):**
+
+- `FollowingRepoRootEvents`: ~1,000 entries × 50 EventIds = ~3-5 MB
+- `SyncRelays`: ~100 entries × varying repo counts = ~2-3 MB
+- **Total in-memory state**: ~10 MB (well within acceptable limits)
+
+**Upper Bounds (redesign triggers):**
+
+- **10,000+ repos**: Consider database-backed state instead of in-memory HashMaps
+- **500+ sync relays**: Consider connection pooling or relay prioritization
+- **500+ root events per repo**: Consider per-repo pagination in Layer 3 filters
+- **Sustained >100 events/second**: Consider write batching to database
+
+Beyond these limits, the in-memory HashMap model may need to evolve to a database-backed approach with lazy loading.
+
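As a rough cross-check of the scale table and redesign triggers added above, the arithmetic can be sketched in Rust. This is an illustrative sketch only: `ScaleSnapshot`, `RedesignTrigger`, `raw_event_id_bytes`, and `redesign_triggers` are hypothetical names that do not appear in the document. The 32-byte figure is the size of a raw nostr event ID and excludes HashMap, key, and allocation overhead, which is presumably why the document budgets ~3-5 MB rather than the raw figure.

```rust
/// Size in bytes of a raw nostr event ID (a SHA-256 hash).
const EVENT_ID_BYTES: usize = 32;

/// Hypothetical snapshot of the metrics listed in the scale table.
struct ScaleSnapshot {
    repos: usize,
    sync_relays: usize,
    max_root_events_per_repo: usize,
    sustained_events_per_sec: f64,
}

/// Hypothetical labels for the four "redesign trigger" bullets.
#[derive(Debug, PartialEq)]
enum RedesignTrigger {
    DatabaseBackedState,
    RelayPrioritization,
    PerRepoPagination,
    WriteBatching,
}

/// Raw event-ID payload only: repos × events per repo × 32 bytes.
fn raw_event_id_bytes(repos: usize, avg_root_events_per_repo: usize) -> usize {
    repos * avg_root_events_per_repo * EVENT_ID_BYTES
}

/// Returns every upper bound from the list that the snapshot has reached.
fn redesign_triggers(s: &ScaleSnapshot) -> Vec<RedesignTrigger> {
    let mut out = Vec::new();
    if s.repos >= 10_000 {
        out.push(RedesignTrigger::DatabaseBackedState);
    }
    if s.sync_relays >= 500 {
        out.push(RedesignTrigger::RelayPrioritization);
    }
    if s.max_root_events_per_repo >= 500 {
        out.push(RedesignTrigger::PerRepoPagination);
    }
    // The document words this bound as strictly greater than 100.
    if s.sustained_events_per_sec > 100.0 {
        out.push(RedesignTrigger::WriteBatching);
    }
    out
}
```

At the target scale (1,000 repos × 50 events) the raw ID payload comes to 1,600,000 bytes, and none of the triggers fire.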
 ## Core Data Structures
 
 The entire sync filter state is captured in two HashMaps, initialized from database queries at startup:
@@ -216,6 +244,8 @@ A single self-subscriber watches for new events from **our own relay** and updat
 
 The batch timer **starts only when the first event arrives**, not on a fixed interval. This prevents the scenario where an event arriving at second 4 of a 5-second interval only gets 1 second before the batch fires.
 
+**Important:** Once the batch timer starts, it does NOT reset when additional events arrive. The batch will fire exactly 5 seconds after the first event, regardless of how many subsequent events are queued. This ensures predictable latency and prevents indefinite batching during high-activity periods.
+
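The no-reset behaviour described above can be modelled without any async runtime. The following is a minimal sketch, not the module's actual implementation: `BatchWindow`, `on_event`, and `poll` are hypothetical names, and the current time is passed in explicitly so the behaviour is easy to check.

```rust
use std::time::{Duration, Instant};

/// Hypothetical model of the batch window: the deadline is set by the
/// first event and is never pushed back by later events.
struct BatchWindow {
    window: Duration,
    deadline: Option<Instant>,
    pending: usize,
}

impl BatchWindow {
    fn new(window: Duration) -> Self {
        Self { window, deadline: None, pending: 0 }
    }

    /// Record one event observed at `now`.
    fn on_event(&mut self, now: Instant) {
        self.pending += 1;
        // Start the timer only if no batch is currently open.
        if self.deadline.is_none() {
            self.deadline = Some(now + self.window);
        }
    }

    /// Returns the number of batched events once the deadline has passed,
    /// and clears the window so the next event opens a fresh one.
    fn poll(&mut self, now: Instant) -> Option<usize> {
        match self.deadline {
            Some(deadline) if now >= deadline => {
                self.deadline = None;
                Some(std::mem::take(&mut self.pending))
            }
            _ => None,
        }
    }
}
```

With a 5-second window, an event at t=0 followed by another at t=4 still fires as one batch of two at t=5. The second event does not extend the deadline.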
 ```rust
 impl SelfSubscriber {
     async fn run(&self) {
@@ -326,6 +356,9 @@ impl SelfSubscriber {
         let current_filter_count = self.get_filter_count_for_relay(relay_url).await;
         let new_filter_count = current_filter_count + incremental_filters.len();
 
+        // Note: 70 is a conservative threshold that may need tuning based on
+        // production observations. It was chosen to trigger consolidation earlier
+        // than v1's 150, but optimal value depends on relay behavior.
         if new_filter_count > 70 {
             // Consolidate: add incremental filters first (no since), wait for EOSE,
             // then close all and resubscribe with consolidated filters (with since)
@@ -425,6 +458,37 @@ async fn consolidate_relay_subscription(
 }
 ```
 
+## Daily Full Catchup
+
+To capture events that may have taken longer than 15 minutes to propagate through the nostr network, we perform a daily full catchup:
+
+```rust
+impl SyncManager {
+    /// Runs approximately every 24 hours per relay connection
+    async fn daily_catchup(&mut self, relay_url: &str) {
+        // Close all current subscriptions for this relay
+        self.close_all_subscriptions(relay_url).await;
+
+        // Rebuild fresh filters from current HashMap state
+        let filters = self.build_three_layer_filters_for_relay(relay_url).await;
+
+        // Subscribe WITHOUT since filter to get full historical sync
+        for filter in filters {
+            self.subscribe_to_relay(relay_url, filter).await;
+        }
+
+        // After EOSE, switch back to live mode with since filter
+        self.wait_for_eose(relay_url).await;
+
+        // Re-add since filter for ongoing live sync
+        let since = Timestamp::now() - 900; // 15 minutes ago
+        self.resubscribe_with_since(relay_url, since).await;
+    }
+}
+```
+
+**Rationale:** The 15-minute reconnection window is standard for nostr event propagation, but some events may take longer. Rather than increasing the window (which would cause more duplicate processing), we do a daily full catchup to ensure nothing is missed. This adds minimal complexity while providing comprehensive coverage.
+
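The two time windows in this section (a 15-minute lookback on reconnect and a full catchup roughly every 24 hours) can be sketched with plain Unix timestamps. This is an illustrative sketch under stated assumptions: `reconnect_since` and `full_catchup_due` are hypothetical helpers, timestamps are `u64` seconds rather than the `Timestamp` type used in the document's code, and treating a relay with no recorded catchup as "due" is an assumption, not something the document states.

```rust
/// 15-minute lookback applied when resubscribing with `since`.
const RECONNECT_LOOKBACK_SECS: u64 = 15 * 60;
/// Approximate interval between full (no-`since`) catchups.
const DAILY_CATCHUP_INTERVAL_SECS: u64 = 24 * 60 * 60;

/// `since = last_successful - 15min`, clamped at zero so that very small
/// timestamps cannot underflow.
fn reconnect_since(last_successful: u64) -> u64 {
    last_successful.saturating_sub(RECONNECT_LOOKBACK_SECS)
}

/// Whether a relay connection should run a full catchup at `now`.
/// Assumption: a connection that has never run one is treated as due.
fn full_catchup_due(last_full_catchup: Option<u64>, now: u64) -> bool {
    match last_full_catchup {
        None => true,
        Some(last) => now.saturating_sub(last) >= DAILY_CATCHUP_INTERVAL_SECS,
    }
}
```

Using `saturating_sub` keeps both helpers well-defined even if clocks or stored timestamps are unexpectedly small or out of order.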
 ## Sync Relay Connections
 
 Each sync relay connection uses the three-layer filter strategy:
@@ -628,16 +692,18 @@ src/sync/
 
 1. **Single Source of Truth**: Two HashMaps represent all sync state, initialized from database
 2. **Event-Driven Updates**: Self-subscriber updates HashMaps; relay connections read from them
-3. **Batched Filter Updates**: 5-second window that starts on first event (not fixed interval)
+3. **Batched Filter Updates**: 5-second window that starts on first event (timer does NOT reset on subsequent events)
 4. **Uniform Reconnection**: Always use `since = last_successful - 15min`
 5. **No Jitter**: Trade-offs not worth it - orphan filters and inefficiency outweigh thundering herd concerns
-6. **Bootstrap Relay Protected**: Never removed from sync_relays; subscribes with Layer 1 even if empty
+6. **Bootstrap Relay Protected**: Never removed from sync_relays - ensures at least one sync connection exists even when no repositories currently list our service (cold start / recovery scenario)
 7. **30618 Remote-Only**: Maintainer state synced from remote relays, not self-subscribed
-8. **70 Filter Consolidation Threshold**: Lower than v1's 150 for earlier consolidation
+8. **70 Filter Consolidation Threshold**: Lower than v1's 150 for earlier consolidation (conservative value that may need tuning based on production observation)
 9. **100-Tag Batching**: Consistent batch size for Layer 2 and Layer 3 filters
 10. **Layer 2 All Kinds**: Subscribe to ALL events with 'a' tags, not just NIP-34 kinds
 11. **Two-Phase Consolidation**: Incremental filters WITHOUT since first, then consolidated WITH since
 12. **Multiple Repo Refs**: Handle events that tag multiple repos correctly
+13. **Daily Full Catchup**: Periodic sync restart without `since` filter (~24h) to catch slow-propagating events
+14. **Dual DB Queries at Startup**: Separate queries for root events and announcements. Could be combined into a single query, but some relays cache by kind which may make separate queries more efficient. Trade-off deferred for future optimization.
 
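Items 8 and 9 in the list above (the 70-filter consolidation threshold and 100-tag batching) reduce to two small rules. The sketch below is illustrative only: `batch_tags` and `needs_consolidation` are hypothetical helpers, and tag values are plain `String`s rather than whatever filter types the module actually uses.

```rust
/// Maximum number of tag values placed in a single filter (item 9).
const TAGS_PER_FILTER: usize = 100;
/// Filter count above which a relay subscription is consolidated (item 8).
const CONSOLIDATION_THRESHOLD: usize = 70;

/// Split tag values into batches of at most `TAGS_PER_FILTER`,
/// one batch per filter.
fn batch_tags(tags: &[String]) -> Vec<Vec<String>> {
    tags.chunks(TAGS_PER_FILTER)
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Mirrors the `new_filter_count > 70` check: consolidation triggers only
/// when the combined count strictly exceeds the threshold.
fn needs_consolidation(current_filter_count: usize, incremental_filters: usize) -> bool {
    current_filter_count + incremental_filters > CONSOLIDATION_THRESHOLD
}
```

For example, 250 tag values yield three filters (100, 100, and 50 values). A relay at 65 filters can take 5 more without consolidating, while a sixth pushes it over the threshold.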
 ---
 