upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

author DanConwayDev <DanConwayDev@protonmail.com> 2026-02-23 15:20:59 +0000
committer DanConwayDev <DanConwayDev@protonmail.com> 2026-02-23 15:20:59 +0000
commit 113928aa84894ea8f65c247d9987527e792b32a9
tree ec967d6195d9f7ec4f061449596611afe3a0950f /docs/how-to
parent 26f608e5011b9d1ad6036da75b89272835e69695
parent e0ad39a489b3398f8208713bf728db0cb11475b0
Merge master into 3ca0-announcements-purgatory
Diffstat (limited to 'docs/how-to')
-rw-r--r-- docs/how-to/README.md 12
-rw-r--r-- docs/how-to/production-sync-testing.md 533
2 files changed, 0 insertions, 545 deletions
diff --git a/docs/how-to/README.md b/docs/how-to/README.md
index ba58c08..087ae53 100644
--- a/docs/how-to/README.md
+++ b/docs/how-to/README.md
@@ -110,18 +110,6 @@ How-to guides are **recipes** that show you how to solve specific problems or ac
110
111---
112
113### Migrate from ngit-relay
114**Status:** 🔜 Planned
115
116**Problem:** Switch from reference implementation
117**You'll learn:**
118- Export data from ngit-relay
119- Import to ngit-grasp
120- Update repository URLs
121- Verify migration
122
123---
124
125## How to Use How-To Guides
126
1271. **Find your problem** - Browse or search for what you need
diff --git a/docs/how-to/production-sync-testing.md b/docs/how-to/production-sync-testing.md
deleted file mode 100644
index 3a273a7..0000000
--- a/docs/how-to/production-sync-testing.md
+++ /dev/null
@@ -1,533 +0,0 @@
1# How-To: Test Sync Against Production Data
2
3> **Quick Start Prompt:** Check work/active-issues/ for existing issues. If issues exist, pick the most important, fix it, test with cargo test, run clippy and fmt, commit, and report back with a brief 1-2 sentence summary of each issue you identified. If no issues exist, run a 60-second production sync test, analyze logs, create individual issue files in work/active-issues/ (one per issue with minimal description), then report summary listing each issue in 1-2 sentences.
4
5**Problem:** Debug and improve sync behavior using real-world data from production relays
6**Difficulty:** Intermediate
7**Time:** 30 minutes per iteration
8
9## Two-Mode Workflow
10
11This guide operates in two modes:
12
13### Mode 1: Fix Existing Issues
14**When:** There are files in `work/active-issues/` (excluding README.md)
15
161. Check for active issues: `ls work/active-issues/`
172. Pick the most important issue to fix
183. **Review proposed fix and ask for permission before implementing**
194. Implement the fix (after approval)
205. Run `cargo test` to verify tests pass
216. Run `cargo clippy` to check for warnings
227. Run `cargo fmt` to format code
238. Commit changes with descriptive message
249. Report back - **DO NOT** do another issue or run more tests
25
26### Mode 2: Discover New Issues
27**When:** No active issues in `work/active-issues/`
28
291. Run 60-second production sync test (logs saved to `tmp/run-{timestamp}/`)
302. Analyze logs for errors, warnings, unexpected patterns
313. Document each issue as a separate markdown file in `work/active-issues/`
324. Keep issue files minimal - just enough to identify the issue
335. Report brief summary listing each issue in 1-2 sentences
346. **DO NOT** create separate detailed analysis files
357. **DO NOT** do thorough investigation or root cause analysis
36
37## Overview
38
39This guide helps you run ngit-grasp's sync system against production relays to discover unexpected errors, inefficiencies, and edge cases that don't appear in controlled tests.
40
41**Why production testing matters:**
42- Real data has inconsistencies, malformed events, and edge cases
43- Production relays may behave differently (rate limiting, timeouts, partial NIP-77 support)
44- Volume and patterns reveal performance bottlenecks
45- Sync discovery leads to cascading subscriptions we can't predict in tests
46
47## Prerequisites
48
49- ngit-grasp compiles successfully (`cargo build`)
50- Familiarity with [GRASP-02 Proactive Sync](../explanation/grasp-02-proactive-sync.md)
51- Understanding of log levels and tracing
52
53## Test Setup
54
55### 1. Choose a Test Identity
56
57Pick a domain with manageable sync volume. Smaller domains mean fewer repos to sync, making logs tractable.
58
59**Recommended starting point:**
60```bash
61--domain ngit.danconwaydev.com
62```
63
64This domain has few repo announcements listing it, so sync stays manageable.
65
66### 2. Choose a Bootstrap Relay
67
68The bootstrap relay provides the initial set of announcements to discover repos:
69
70```bash
71--sync-bootstrap-relay-url wss://git.shakespeare.diy
72```
73
74### 3. Run with Time Limit
75
76Run for **60 seconds** to allow the full sync cascade to complete. This duration allows:
77- Layer 1: Discovery of repo announcements (0-5s)
78- Layer 2: Sending `#a`, `#A`, `#q` filters for repos (5-30s)
79- Layer 3: Receiving issues, patches, PRs (30-60s)
80- Layer 4: Receiving comments on root events (40-60s)
81
82Each run creates its own subdirectory in `tmp/` to keep data and logs isolated:
83
84```bash
85# Create run directory with timestamp
86RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)"
87mkdir -p "$RUN_DIR"
88
89# Run for 60 seconds, saving both raw and sanitized logs
90timeout 60s cargo run -- \
91 --sync-bootstrap-relay-url wss://git.shakespeare.diy \
92 --domain ngit.danconwaydev.com \
93 --git-data-path "$RUN_DIR/git-data" \
94 --relay-data-path "$RUN_DIR/relay-data" \
95 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log"
96```
97
98**Why 60 seconds?** The sync system uses a 5-second batch window for aggregating discovered repos. Repos discovered late in the sync need time for:
991. Batch window to expire (5s)
1002. Layer 2 filters to be sent and processed
1013. Layer 3 events (issues/patches/PRs) to be returned
1024. Layer 4 filters for root events to be sent
1035. Comments and threaded replies to be returned
104
105Testing shows 30 seconds is too short for late-discovered repos to complete the Layer 2→3→4 cascade.
106
107**Note:** `timeout` exits with status 124 when it stops the command at the time limit, which is expected here.
108
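The exit-code note can be sketched as a small wrapper. This is only an illustration: `run_with_limit` is a hypothetical helper, not part of ngit-grasp. It treats status 124 as success and passes every other status through. Note that in the piped command above the shell reports the status of the last `tee`, not of `timeout`, unless `set -o pipefail` is in effect.

```bash
# Hypothetical helper: run a command under `timeout` and treat the
# expected "time limit reached" status (124) as success.
run_with_limit() {
  local limit="$1" status
  shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    # The time limit was reached, which is the normal end of a sync test run.
    return 0
  fi
  # Any other status (including 0) comes from the command itself.
  return "$status"
}
```

Usage would look like `run_with_limit 60s cargo run -- [args]`; a non-zero result then indicates a real failure rather than the expected timeout.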
109**Directory structure after run:**
110```
111tmp/
112└── run-20260109-143022/
113    ├── git-data/          # Git repository data
114    ├── relay-data/        # Relay database
115    ├── sync.log           # Sanitized log (for quick analysis)
116    └── sync-raw.log       # Raw log (for full details when needed)
117```
118
119**When to use which log:**
120- **sync.log** - Use for quick scanning and pattern recognition (long lines truncated)
121- **sync-raw.log** - Use when you need full details (e.g., complete rejection reasons, full event data)
122
123## Log Sanitization
124
125Raw logs include full events and hundreds of event IDs per line, making them unwieldy for analysis. The sanitizer truncates long lines:
126
127```bash
128./scripts/sanitize-logs.sh < raw.log > sanitized.log
129
130# Or pipe directly
131cargo run -- [args] 2>&1 | ./scripts/sanitize-logs.sh
132```
133
134**Options:**
135- `--head-chars N` - First N characters to show (default: 200)
136- `--tail-chars N` - Last N characters to show (default: 100)
137
138Example output:
139```
1402024-01-09T10:00:00Z DEBUG sync: Processing events ids=[abc123, def456, ghi789, jkl012...<1847 chars>...xyz999, end123]
141```
142
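To make the truncation behaviour concrete, here is a minimal sketch of the same idea in `awk`. It is an approximation written for illustration, not the contents of `scripts/sanitize-logs.sh`; the function name `truncate_long_lines` and the exact marker format are assumptions based on the example output above.

```bash
# Hypothetical stand-in for the sanitizer: keep the first HEAD and last TAIL
# characters of each long line and note how many characters were dropped.
truncate_long_lines() {
  local head_chars="${1:-200}" tail_chars="${2:-100}"
  awk -v h="$head_chars" -v t="$tail_chars" '{
    n = length($0)
    # Lines that fit within head + tail are passed through unchanged.
    if (n <= h + t) { print; next }
    printf "%s...<%d chars>...%s\n", substr($0, 1, h), n - h - t, substr($0, n - t + 1)
  }'
}
```

For example, `truncate_long_lines 200 100 < raw.log` would mirror the documented defaults.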
143### Retrieving Full Details from Raw Logs
144
145When sanitized logs show truncated messages (e.g., rejection reasons), use the raw log to see the complete content:
146
147```bash
148# Find specific error in raw log
149grep "Rejected repository announcement" "$RUN_DIR/sync-raw.log"
150
151# Extract full line for specific event ID
152grep "note1z5ys7wf3ms5yxhnp3kfw7hpu5asfkx4jngzt5zgs4tm4tnvggnsqjfqeyt" "$RUN_DIR/sync-raw.log"
153
154# View context around a truncated warning
155grep -A 2 -B 2 "pattern from sanitized log" "$RUN_DIR/sync-raw.log"
156```
157
158The raw log contains complete, untruncated messages including full rejection reasons, event data, and debug details.
159
160## What to Look For
161
162### Phase 1: Connection & Bootstrap (0-5 seconds)
163
164**Expected behavior:**
165- Connection to bootstrap relay succeeds
166- Layer 1 (announcement) subscription starts
167- First batch of 30617/30618 events received
168
169**Red flags:**
170- Connection timeout or failure
171- NIP-77 negentropy errors (should fall back gracefully)
172- Immediate rate limiting
173
174### Phase 2: Discovery Cascade (5-15 seconds)
175
176**Expected behavior:**
177- Self-subscriber batches fire as announcements are processed
178- New relays discovered from announcement `relays` tags
179- Layer 2 (repo tags) subscriptions created
180
181**Red flags:**
182- Excessive relay discovery (>10 relays rapidly)
183- Filter consolidation warnings (>70 filters)
184- Missing self-subscriber batch logs
185
186### Phase 3: Steady State (15+ seconds)
187
188**Expected behavior:**
189- Historic sync batches completing (EOSE received)
190- Periodic health checks running
191- Events being saved to database
192
193**Red flags:**
194- Pending batches never confirming
195- Repeated connection/disconnect cycles
196- Memory growth (check with `top` in another terminal)
197
198## Debugging Checklist
199
200When analyzing logs, look for these patterns:
201
202### Errors to Investigate
203
204| Pattern | Possible Cause | Action |
205|---------|----------------|--------|
206| `error` (any) | Unexpected failure | Investigate immediately |
207| `connection failed` | Network/relay issue | Check relay URL, try different relay |
208| `rate limit` | Too many requests | Check consolidation, increase backoff |
209| `negentropy` + `error` | NIP-77 incompatibility | Should fall back - verify it does |
210| `timeout` | Slow relay or large sync | Increase timeouts or reduce scope |
211
212### Warnings to Monitor
213
214| Pattern | Meaning | Action |
215|---------|---------|--------|
216| `consolidating filters` | Filter count high | Expected, but frequent = problem |
217| `backing off` | Health tracker retry | Normal, but watch for excessive |
218| `batch failed` | Historic sync incomplete | Check which batches, why |
219
220### Debug Patterns to Verify
221
222| Pattern | What it shows |
223|---------|---------------|
224| `fresh_start` | Full sync initiated |
225| `quick_reconnect` | Incremental sync (<15min gap) |
226| `historic sync complete` | Sync finished successfully |
227| `sync_live` | Live subscriptions active |
228| `PendingBatch` | Items awaiting EOSE confirmation |
229
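A quick way to check the debug patterns above is to count how often each appears in a run's log. The helper below is a sketch: `count_patterns` is a hypothetical name, and it assumes the pattern strings appear literally in the log lines.

```bash
# Hypothetical helper: print "<count><TAB><pattern>" for each debug pattern
# from the table above, matching the patterns as fixed strings.
count_patterns() {
  local log="$1" pattern
  for pattern in "fresh_start" "quick_reconnect" "historic sync complete" \
                 "sync_live" "PendingBatch"; do
    # grep -c prints 0 (and exits non-zero) when nothing matches; only the
    # printed count is used here.
    printf '%s\t%s\n' "$(grep -cF -- "$pattern" "$log")" "$pattern"
  done
}
```

Running `count_patterns "$RUN_DIR/sync.log"` gives a one-screen view of which sync stages were reached; a count of 0 for `historic sync complete` after 60 seconds would be worth a closer look.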
230## Mode 1: Fix Existing Issues (Detailed)
231
232When `work/active-issues/` contains issue files:
233
234### Step 1: Check for Active Issues
235
236```bash
237ls work/active-issues/
238```
239
240If any `.md` files exist (excluding README.md), you're in Mode 1.
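The mode check can also be scripted. The sketch below assumes issue files sit directly in the directory (no subfolders); `detect_mode` is a hypothetical helper name.

```bash
# Hypothetical helper: print "mode1" if any .md file other than README.md
# exists directly in the issues directory, otherwise "mode2".
detect_mode() {
  local dir="${1:-work/active-issues}"
  if find "$dir" -maxdepth 1 -type f -name '*.md' ! -name 'README.md' 2>/dev/null \
      | grep -q .; then
    echo "mode1"
  else
    echo "mode2"
  fi
}
```

A missing directory is treated the same as an empty one, so the helper falls through to Mode 2.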
241
242### Step 2: Pick Most Important Issue
243
244Review issue files and select based on:
245- Severity (errors > warnings > log quality)
246- Impact (functionality > performance > UX)
247- Complexity (quick fixes first to clear backlog)
248
249### Step 3: Review Proposed Fix and Get Permission
250
251**IMPORTANT:** Before implementing any changes:
252
2531. Read relevant code files to understand the issue
2542. Analyze the root cause
2553. Propose a fix with explanation of what will change and why
2564. Summarize the proposed fix in 2-3 sentences
2575. **Ask for user permission to proceed**
258
259**Do NOT implement changes without explicit approval.**
260
261### Step 4: Implement the Fix
262
263After receiving permission, make the necessary code changes based on the issue description and approved plan.
264
265### Step 5: Test, Lint, Format
266
267```bash
268# Run tests
269cargo test
270
271# Check for warnings
272cargo clippy
273
274# Format code
275cargo fmt
276```
277
278### Step 6: Commit
279
280```bash
281git add .
282git commit -m "fix: [brief description of what was fixed]"
283```
284
285### Step 7: Report Back
286
287**STOP HERE.** Report what was fixed. Do NOT:
288- Fix another issue
289- Run production sync test
290- Do additional investigation
291
292The workflow will cycle back through Mode 1 if more issues remain.
293
294## Mode 2: Discover New Issues (Detailed)
295
296When `work/active-issues/` is empty (or only contains README.md):
297
298### Step 1: Run Production Sync Test
299
300```bash
301# Create run directory with timestamp
302RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)"
303mkdir -p "$RUN_DIR"
304
305# Run 60-second test, saving both raw and sanitized logs
306timeout 60s cargo run -- \
307 --sync-bootstrap-relay-url wss://git.shakespeare.diy \
308 --domain ngit.danconwaydev.com \
309 --git-data-path "$RUN_DIR/git-data" \
310 --relay-data-path "$RUN_DIR/relay-data" \
311 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log"
312```
313
314Each run is isolated in its own timestamped directory under `tmp/`, keeping data and logs organized. Both raw and sanitized logs are saved for flexible analysis.
315
316**Note:** 60 seconds allows the full sync cascade (Layer 1→2→3→4) to complete for late-discovered repos.
317
318### Step 2: Analyze Logs
319
320Scan for errors and unexpected patterns:
321```bash
322# Find the most recent run
323LATEST_RUN=$(ls -1t tmp/run-*/sync.log | head -n1)
324LATEST_RAW=$(ls -1t tmp/run-*/sync-raw.log | head -n1)
325
326# Analyze sanitized log for quick scanning
327grep -i error "$LATEST_RUN"
328grep -i warn "$LATEST_RUN"
329grep -i panic "$LATEST_RUN"
330
331# If you find truncated messages, check the raw log for full details
332grep "pattern from truncated message" "$LATEST_RAW"
333```
334
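Because run directories are named with a sortable timestamp, the newest run can also be found by name rather than by modification time, which avoids surprises if an old log is touched later. This is a sketch; `latest_run_dir` is a hypothetical helper and assumes the `run-YYYYMMDD-HHMMSS` naming used above.

```bash
# Hypothetical helper: print the run directory with the latest timestamp in
# its name. Glob results are sorted, so the last match is the newest run.
latest_run_dir() {
  local base="${1:-tmp}" dir latest=""
  for dir in "$base"/run-*/; do
    # With no matches the glob stays literal, so skip anything that is not a directory.
    [ -d "$dir" ] || continue
    latest="${dir%/}"
  done
  # Return non-zero when no run directory exists.
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

Usage: `RUN_DIR="$(latest_run_dir)" && grep -i error "$RUN_DIR/sync.log"`.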
335### Step 3: Document Issues
336
337Create **one markdown file per issue** in `work/active-issues/`:
338
339```bash
340# Example: Minimal issue documentation
341cat > work/active-issues/bootstrap-disconnect.md <<'EOF'
342# Bootstrap relay disconnects when empty
343
344Bootstrap relay wss://git.shakespeare.diy disconnects after sync finds 0 events. Should persist since user-specified.
345
346Log: "Disconnecting empty relay relay=wss://git.shakespeare.diy"
347File: src/sync/mod.rs (check_disconnects function)
348EOF
349```
350
351**Keep each file brief:**
352- Descriptive title (one line)
353- What happens (1-2 sentences max)
354- Relevant log excerpt (one line)
355- File/function location if obvious (one line)
356- **NO** separate detailed analysis files
357- **NO** root cause analysis
358- **NO** proposed solutions (unless immediately obvious)
359
360### Step 4: Report Summary
361
362Provide a brief closing message with 1-2 sentence summary of **each issue** identified:
363- State what the issue is
364- Where it occurs (file/component)
365- Keep it concise
366
367**STOP HERE.** Do NOT:
368- Fix the issues immediately
369- Create separate detailed analysis markdown files
370- Do thorough investigations
371- Write lengthy explanations
372
373The workflow will cycle back through Mode 1 to fix issues one at a time.
374
375## Logging Improvements
376
377If the logs aren't helpful enough, improve them. Common needs:
378
379### Add Context to Existing Logs
380
381```rust
382// Before
383tracing::debug!("Processing events");
384
385// After
386tracing::debug!(
387 relay = %relay_url,
388 event_count = events.len(),
389 "Processing events"
390);
391```
392
393### Add New Log Points
394
395Key places that may need more logging:
396- `src/sync/mod.rs` - SyncManager state transitions
397- `src/sync/relay_connection.rs` - Connection lifecycle
398- `src/sync/self_subscriber.rs` - Batch processing
399
400### Reduce Noise
401
402If a log line appears too frequently:
403```rust
404// Change from debug! to trace!
405tracing::trace!("Per-event detail that's too noisy");
406```
407
408## Managing Active Issues
409
410Issues are tracked in `work/active-issues/` as individual markdown files.
411
412**Check for active issues:**
413```bash
414ls work/active-issues/
415```
416
417**After fixing an issue:**
418```bash
419# Delete the resolved issue file
420rm work/active-issues/issue-name.md
421
422# Or archive if important for future reference
423mv work/active-issues/issue-name.md docs/archive/2026-01-09-issue-name.md
424```
425
426**Issue file format (minimal):**
427```markdown
428# Brief title
429
430What happens (1-2 sentences).
431
432Log evidence: "relevant log line"
433File: src/path/to/file.rs (function_name if known)
434```
435
436Keep documentation minimal - just enough to identify and locate the issue.
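To keep issue files consistently minimal, the format above can be wrapped in a small helper. This is a sketch under assumptions: `new_issue` and the `ISSUES_DIR` override are hypothetical names introduced here, not part of the repository.

```bash
# Hypothetical helper: write a minimal issue file in the format shown above.
# Arguments: slug, title, what happens, log line, file location.
new_issue() {
  local slug="$1" title="$2" what="$3" log_line="$4" location="$5"
  # ISSUES_DIR is an assumed override, mainly useful for trying the helper out.
  local dir="${ISSUES_DIR:-work/active-issues}"
  mkdir -p "$dir"
  printf '# %s\n\n%s\n\nLog evidence: "%s"\nFile: %s\n' \
    "$title" "$what" "$log_line" "$location" > "$dir/$slug.md"
}
```

For example: `new_issue bootstrap-disconnect "Bootstrap relay disconnects when empty" "Bootstrap relay disconnects after sync finds 0 events." "Disconnecting empty relay relay=wss://git.shakespeare.diy" "src/sync/mod.rs (check_disconnects function)"`.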
437
438---
439
440## Workflow Summary
441
442```
443Check work/active-issues/
444         │
445         ├─ Has issues? ──► Mode 1: Pick one issue
446         │                      │
447         │                      ├─ Review & propose fix
448         │                      ├─ Ask permission
449         │                      ├─ Fix code (after approval)
450         │                      ├─ cargo test
451         │                      ├─ cargo clippy
452         │                      ├─ cargo fmt
453         │                      ├─ git commit
454         │                      └─ Report & STOP
455         │
456         └─ No issues? ──► Mode 2: Run production sync
457                                │
458                                ├─ timeout 60s cargo run ...
459                                ├─ Analyze logs
460                                ├─ Document issues (minimal)
461                                └─ Report summary & STOP
462```
463
464**Key Rules:**
465- Only do ONE thing per cycle (fix one issue OR discover issues)
466- Always stop after reporting
467- Keep issue documentation minimal
468- No root cause analysis during discovery
469
470---
471
472## Quick Reference
473
474### Minimal Test Command
475
476```bash
477# Create run directory
478RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)"
479mkdir -p "$RUN_DIR"
480
481# Run test with both raw and sanitized logs (60s for full cascade)
482timeout 60s cargo run -- \
483 --sync-bootstrap-relay-url wss://git.shakespeare.diy \
484 --domain ngit.danconwaydev.com \
485 --git-data-path "$RUN_DIR/git-data" \
486 --relay-data-path "$RUN_DIR/relay-data" \
487 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log"
488```
489
490### With Metrics Endpoint
491
492```bash
493# Create run directory
494RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)"
495mkdir -p "$RUN_DIR"
496
497# Run with metrics and both log formats (60s for full cascade)
498timeout 60s cargo run -- \
499 --sync-bootstrap-relay-url wss://git.shakespeare.diy \
500 --domain ngit.danconwaydev.com \
501 --git-data-path "$RUN_DIR/git-data" \
502 --relay-data-path "$RUN_DIR/relay-data" \
503 --metrics-address 127.0.0.1:9090 \
504 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log"
505```
506
507Then in another terminal: `curl http://127.0.0.1:9090/metrics`
508
509### Cleanup Old Runs
510
511```bash
512# Remove runs older than 7 days
513find tmp -maxdepth 1 -type d -name 'run-*' -mtime +7 -exec rm -rf {} +
514
515# Remove all test runs
516rm -rf tmp/run-*
517```
518
519### Different Log Level
520
521The default is DEBUG. For more detail:
522```bash
523RUST_LOG=trace cargo run -- [args]
524```
525
526For less noise:
527```bash
528RUST_LOG=info cargo run -- [args]
529```
530
531---
532
533*Part of the [ngit-grasp documentation](../README.md) using the [Diátaxis](https://diataxis.fr/) framework.*