| author | DanConwayDev <DanConwayDev@protonmail.com> | 2026-01-10 03:24:45 +0000 |
|---|---|---|
| committer | DanConwayDev <DanConwayDev@protonmail.com> | 2026-01-10 03:24:45 +0000 |
| commit | 0bae1738ace1af196272a333b5d835a7e497861b (patch) | |
| tree | a02124797b96e6ef7c334e7e8d04e00e7a7dd823 /docs | |
| parent | 3d5c6102e39e881edf056dc69cdc0dcb9b6d281b (diff) | |
docs: update production sync testing to require 60 seconds

The sync system uses a 5-second batch window for discovered repos.
Repos discovered late in a 30-second test don't have enough time for
the full Layer 2→3→4 cascade:

- Layer 1: Discover repo announcements (0-5s)
- Layer 2: Send #a, #A, #q filters for repos (5-30s)
- Layer 3: Receive issues, patches, PRs (30-60s)
- Layer 4: Receive comments on root events (40-60s)

Testing confirmed that 60 seconds allows late-discovered repos
(gitworkshop, ngit) to complete all layers, while a 30-second run
leaves them only 1 second after their Layer 2 filters are sent.

Updated all references from 30s to 60s throughout the guide and added
an explanation of why this duration is necessary.
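
The timing argument above can be sketched as simple arithmetic. This is a rough illustration, not code from the repository: the function name `remaining_after_layer2`, the model that Layer 2 filters go out once the batch window expires after discovery, and the 24-second discovery time are all assumptions, chosen so the result matches the 1-second figure reported above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: seconds left in a run after Layer 2 filters are sent.
# Assumed model: filters are sent when the batch window expires after discovery.

BATCH_WINDOW=5   # seconds; the batch window described in the commit message

remaining_after_layer2() {
  local run_duration=$1    # total run length in seconds (the `timeout` value)
  local discovered_at=$2   # second at which the repo announcement was seen
  local filters_sent_at=$(( discovered_at + BATCH_WINDOW ))
  local remaining=$(( run_duration - filters_sent_at ))
  # A repo whose filters would go out after the run ends gets no time at all
  if (( remaining < 0 )); then
    remaining=0
  fi
  echo "$remaining"
}

# Illustrative discovery time of 24s (an assumption, not a measured value)
echo "30s run: $(remaining_after_layer2 30 24)s left for Layers 3 and 4"   # 1s
echo "60s run: $(remaining_after_layer2 60 24)s left for Layers 3 and 4"   # 31s
```

Under these assumptions, doubling the run length raises the time available for Layers 3 and 4 from 1 second to 31 seconds for the same late-discovered repo.
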
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/how-to/production-sync-testing.md | 41 |
1 files changed, 29 insertions, 12 deletions
diff --git a/docs/how-to/production-sync-testing.md b/docs/how-to/production-sync-testing.md
index 3b1d4e4..3a273a7 100644
--- a/docs/how-to/production-sync-testing.md
+++ b/docs/how-to/production-sync-testing.md
| @@ -1,6 +1,6 @@ | |||
| 1 | # How-To: Test Sync Against Production Data | 1 | # How-To: Test Sync Against Production Data |
| 2 | 2 | ||
| 3 | > **Quick Start Prompt:** Check work/active-issues/ for existing issues. If issues exist, pick the most important, fix it, test with cargo test, run clippy and fmt, commit, and report back with a brief 1-2 sentence summary of each issue you identified. If no issues exist, run a 30-second production sync test, analyze logs, create individual issue files in work/active-issues/ (one per issue with minimal description), then report summary listing each issue in 1-2 sentences. | 3 | > **Quick Start Prompt:** Check work/active-issues/ for existing issues. If issues exist, pick the most important, fix it, test with cargo test, run clippy and fmt, commit, and report back with a brief 1-2 sentence summary of each issue you identified. If no issues exist, run a 60-second production sync test, analyze logs, create individual issue files in work/active-issues/ (one per issue with minimal description), then report summary listing each issue in 1-2 sentences. |
| 4 | 4 | ||
| 5 | **Problem:** Debug and improve sync behavior using real-world data from production relays | 5 | **Problem:** Debug and improve sync behavior using real-world data from production relays |
| 6 | **Difficulty:** Intermediate | 6 | **Difficulty:** Intermediate |
| @@ -26,7 +26,7 @@ This guide operates in two modes: | |||
| 26 | ### Mode 2: Discover New Issues | 26 | ### Mode 2: Discover New Issues |
| 27 | **When:** No active issues in `work/active-issues/` | 27 | **When:** No active issues in `work/active-issues/` |
| 28 | 28 | ||
| 29 | 1. Run 30-second production sync test (logs saved to `tmp/run-{timestamp}/`) | 29 | 1. Run 60-second production sync test (logs saved to `tmp/run-{timestamp}/`) |
| 30 | 2. Analyze logs for errors, warnings, unexpected patterns | 30 | 2. Analyze logs for errors, warnings, unexpected patterns |
| 31 | 3. Document each issue as a separate markdown file in `work/active-issues/` | 31 | 3. Document each issue as a separate markdown file in `work/active-issues/` |
| 32 | 4. Keep issue files minimal - just enough to identify the issue | 32 | 4. Keep issue files minimal - just enough to identify the issue |
| @@ -73,15 +73,21 @@ The bootstrap relay provides the initial set of announcements to discover repos: | |||
| 73 | 73 | ||
| 74 | ### 3. Run with Time Limit | 74 | ### 3. Run with Time Limit |
| 75 | 75 | ||
| 76 | Start with short runs (30 seconds) to capture manageable log volumes. Each run creates its own subdirectory in `tmp/` to keep data and logs isolated: | 76 | Run for **60 seconds** to allow the full sync cascade to complete. This duration allows: |
| 77 | - Layer 1: Discovery of repo announcements (0-5s) | ||
| 78 | - Layer 2: Sending `#a`, `#A`, `#q` filters for repos (5-30s) | ||
| 79 | - Layer 3: Receiving issues, patches, PRs (30-60s) | ||
| 80 | - Layer 4: Receiving comments on root events (40-60s) | ||
| 81 | |||
| 82 | Each run creates its own subdirectory in `tmp/` to keep data and logs isolated: | ||
| 77 | 83 | ||
| 78 | ```bash | 84 | ```bash |
| 79 | # Create run directory with timestamp | 85 | # Create run directory with timestamp |
| 80 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" | 86 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" |
| 81 | mkdir -p "$RUN_DIR" | 87 | mkdir -p "$RUN_DIR" |
| 82 | 88 | ||
| 83 | # Run for 30 seconds, saving both raw and sanitized logs | 89 | # Run for 60 seconds, saving both raw and sanitized logs |
| 84 | timeout 30s cargo run -- \ | 90 | timeout 60s cargo run -- \ |
| 85 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ | 91 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ |
| 86 | --domain ngit.danconwaydev.com \ | 92 | --domain ngit.danconwaydev.com \ |
| 87 | --git-data-path "$RUN_DIR/git-data" \ | 93 | --git-data-path "$RUN_DIR/git-data" \ |
| @@ -89,6 +95,15 @@ timeout 30s cargo run -- \ | |||
| 89 | 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log" | 95 | 2>&1 | tee "$RUN_DIR/sync-raw.log" | ./scripts/sanitize-logs.sh | tee "$RUN_DIR/sync.log" |
| 90 | ``` | 96 | ``` |
| 91 | 97 | ||
| 98 | **Why 60 seconds?** The sync system uses a 5-second batch window for aggregating discovered repos. Repos discovered late in the sync need time for: | ||
| 99 | 1. Batch window to expire (5s) | ||
| 100 | 2. Layer 2 filters to be sent and processed | ||
| 101 | 3. Layer 3 events (issues/patches/PRs) to be returned | ||
| 102 | 4. Layer 4 filters for root events to be sent | ||
| 103 | 5. Comments and threaded replies to be returned | ||
| 104 | |||
| 105 | Testing shows 30 seconds is too short for late-discovered repos to complete the Layer 2→3→4 cascade. | ||
| 106 | |||
| 92 | **Note:** The `timeout` command returns exit code 124, which is expected. | 107 | **Note:** The `timeout` command returns exit code 124, which is expected. |
| 93 | 108 | ||
| 94 | **Directory structure after run:** | 109 | **Directory structure after run:** |
| @@ -287,8 +302,8 @@ When `work/active-issues/` is empty (or only contains README.md): | |||
| 287 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" | 302 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" |
| 288 | mkdir -p "$RUN_DIR" | 303 | mkdir -p "$RUN_DIR" |
| 289 | 304 | ||
| 290 | # Run 30-second test, saving both raw and sanitized logs | 305 | # Run 60-second test, saving both raw and sanitized logs |
| 291 | timeout 30s cargo run -- \ | 306 | timeout 60s cargo run -- \ |
| 292 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ | 307 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ |
| 293 | --domain ngit.danconwaydev.com \ | 308 | --domain ngit.danconwaydev.com \ |
| 294 | --git-data-path "$RUN_DIR/git-data" \ | 309 | --git-data-path "$RUN_DIR/git-data" \ |
| @@ -298,6 +313,8 @@ timeout 30s cargo run -- \ | |||
| 298 | 313 | ||
| 299 | Each run is isolated in its own timestamped directory under `tmp/`, keeping data and logs organized. Both raw and sanitized logs are saved for flexible analysis. | 314 | Each run is isolated in its own timestamped directory under `tmp/`, keeping data and logs organized. Both raw and sanitized logs are saved for flexible analysis. |
| 300 | 315 | ||
| 316 | **Note:** 60 seconds allows the full sync cascade (Layer 1→2→3→4) to complete for late-discovered repos. | ||
| 317 | |||
| 301 | ### Step 2: Analyze Logs | 318 | ### Step 2: Analyze Logs |
| 302 | 319 | ||
| 303 | Scan for errors and unexpected patterns: | 320 | Scan for errors and unexpected patterns: |
| @@ -438,7 +455,7 @@ Check work/active-issues/ | |||
| 438 | │ | 455 | │ |
| 439 | └─ No issues? ──► Mode 2: Run production sync | 456 | └─ No issues? ──► Mode 2: Run production sync |
| 440 | │ | 457 | │ |
| 441 | ├─ timeout 30s cargo run ... | 458 | ├─ timeout 60s cargo run ... |
| 442 | ├─ Analyze logs | 459 | ├─ Analyze logs |
| 443 | ├─ Document issues (minimal) | 460 | ├─ Document issues (minimal) |
| 444 | └─ Report summary & STOP | 461 | └─ Report summary & STOP |
| @@ -461,8 +478,8 @@ Check work/active-issues/ | |||
| 461 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" | 478 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" |
| 462 | mkdir -p "$RUN_DIR" | 479 | mkdir -p "$RUN_DIR" |
| 463 | 480 | ||
| 464 | # Run test with both raw and sanitized logs | 481 | # Run test with both raw and sanitized logs (60s for full cascade) |
| 465 | timeout 30s cargo run -- \ | 482 | timeout 60s cargo run -- \ |
| 466 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ | 483 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ |
| 467 | --domain ngit.danconwaydev.com \ | 484 | --domain ngit.danconwaydev.com \ |
| 468 | --git-data-path "$RUN_DIR/git-data" \ | 485 | --git-data-path "$RUN_DIR/git-data" \ |
| @@ -477,8 +494,8 @@ timeout 30s cargo run -- \ | |||
| 477 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" | 494 | RUN_DIR="tmp/run-$(date +%Y%m%d-%H%M%S)" |
| 478 | mkdir -p "$RUN_DIR" | 495 | mkdir -p "$RUN_DIR" |
| 479 | 496 | ||
| 480 | # Run with metrics and both log formats | 497 | # Run with metrics and both log formats (60s for full cascade) |
| 481 | timeout 30s cargo run -- \ | 498 | timeout 60s cargo run -- \ |
| 482 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ | 499 | --sync-bootstrap-relay-url wss://git.shakespeare.diy \ |
| 483 | --domain ngit.danconwaydev.com \ | 500 | --domain ngit.danconwaydev.com \ |
| 484 | --git-data-path "$RUN_DIR/git-data" \ | 501 | --git-data-path "$RUN_DIR/git-data" \ |