Diffstat (limited to 'docs/how-to/migrate-to-ngit-grasp.md')
| -rw-r--r-- | docs/how-to/migrate-to-ngit-grasp.md | 412 |
1 files changed, 412 insertions, 0 deletions
diff --git a/docs/how-to/migrate-to-ngit-grasp.md b/docs/how-to/migrate-to-ngit-grasp.md
new file mode 100644
index 0000000..f4dff86
--- /dev/null
+++ b/docs/how-to/migrate-to-ngit-grasp.md
| @@ -0,0 +1,412 @@ | |||
| 1 | # Migrate to ngit-grasp from another GRASP implementation | ||
| 2 | |||
| 3 | This guide walks you through migrating a production GRASP relay to ngit-grasp. The process involves analyzing your existing data to identify repositories that need attention before switching over. | ||
| 4 | |||
| 5 | ## Compatibility | ||
| 6 | |||
| 7 | This migration process works with any GRASP implementation that: | ||
| 8 | |||
| 9 | - Stores git data in the `<npub>/<identifier>.git` directory structure | ||
| 10 | - Uses standard GRASP events (kind 30617 announcements, kind 30618 state, kind 5 deletions) | ||
| 11 | - Exposes a Nostr relay WebSocket endpoint | ||
| 12 | |||
| 13 | **Known compatible implementations:** | ||
| 14 | - ngit-relay (reference implementation) | ||
| 15 | - ngit-grasp (when migrating between instances or from archive mode) | ||
| 16 | - Other GRASP-compliant relays following the specification | ||
| 17 | |||
| 18 | The migration scripts analyze Nostr events and git data directly, making them implementation-agnostic. | ||
| 19 | |||
| 20 | ## Quick Start | ||
| 21 | |||
| 22 | Run the migration analysis with a single command: | ||
| 23 | |||
| 24 | ```bash | ||
| 25 | # Basic analysis (fetches events, compares relays) | ||
| 26 | ./docs/how-to/migration-scripts/run-migration-analysis.sh \ | ||
| 27 | --prod-relay wss://source-relay.example.com \ | ||
| 28 | --archive-relay wss://target-relay.example.com | ||
| 29 | |||
| 30 | # Full analysis (includes git sync check - run on VPS) | ||
| 31 | ./docs/how-to/migration-scripts/run-migration-analysis.sh \ | ||
| 32 | --prod-relay wss://source-relay.example.com \ | ||
| 33 | --archive-relay wss://target-relay.example.com \ | ||
| 34 | --prod-git /var/lib/grasp-relay/git \ | ||
| 35 | --archive-git /var/lib/ngit-grasp/git \ | ||
| 36 | --service ngit-grasp.service | ||
| 37 | ``` | ||
| 38 | |||
| 39 | The script writes three classification files, plus an overview in `results/summary.txt`: | ||
| 40 | - `results/no-action-required.txt` - Repos ready for migration | ||
| 41 | - `results/action-required.txt` - Repos needing intervention | ||
| 42 | - `results/manual-investigation.txt` - Repos needing human review | ||
| 43 | |||
| 44 | See [Running the Analysis](#running-the-analysis) for detailed options. | ||
| 45 | |||
| 46 | ## Prerequisites | ||
| 47 | |||
| 48 | ### Required Tools | ||
| 49 | |||
| 50 | - **nak** - Nostr Army Knife for fetching events ([install](https://github.com/fiatjaf/nak)) | ||
| 51 | - **jq** - JSON processing (install via package manager) | ||
| 52 | |||
| 53 | ### For Full Analysis (VPS) | ||
| 54 | |||
| 55 | - SSH access to the VPS running your source relay | ||
| 56 | - Read access to git data directories | ||
| 57 | - Access to systemd journal (for log extraction) | ||
| 58 | |||
| 59 | ### Verify Installation | ||
| 60 | |||
| 61 | ```bash | ||
| 62 | # Check required tools | ||
| 63 | nak --version | ||
| 64 | jq --version | ||
| 65 | |||
| 66 | # Check optional tools (for VPS phases) | ||
| 67 | journalctl --version | ||
| 68 | ``` | ||
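The checks above can also be wrapped in a small preflight helper, so that a missing tool is reported before a long run starts. This is a minimal sketch: `require_tools` is a hypothetical helper name and is not part of the migration scripts.

```bash
#!/bin/sh
# Hypothetical helper: report every missing tool on stderr,
# and return non-zero if at least one is absent.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (left commented so that sourcing this file has no side effects):
# require_tools nak jq || exit 1
# require_tools journalctl || echo "journalctl not found: VPS phases unavailable"
```

Checking all tools before returning, rather than stopping at the first failure, lets you install everything that is missing in one pass.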
| 69 | |||
| 70 | ## Migration Overview | ||
| 71 | |||
| 72 | The migration process has three stages: | ||
| 73 | |||
| 74 | ### Stage 1: Deploy Archive Instance | ||
| 75 | |||
| 76 | Deploy ngit-grasp alongside your production relay: | ||
| 77 | |||
| 78 | 1. Configure ngit-grasp with: | ||
| 79 | - `domain` set to `<prod-domain>.internal` (temporary) | ||
| 80 | - `archiveService` set to your production domain | ||
| 81 | - Running on a different port | ||
| 82 | |||
| 83 | 2. Let it sync for ~1 hour to gather all events and git data | ||
| 84 | |||
| 85 | ### Stage 2: Analyze Data | ||
| 86 | |||
| 87 | Run the migration analysis to identify: | ||
| 88 | - Repositories successfully migrated (no action needed) | ||
| 89 | - Repositories with incomplete data (need investigation) | ||
| 90 | - Repositories with parse failures (may need re-announcement) | ||
| 91 | |||
| 92 | ### Stage 3: Switch Over | ||
| 93 | |||
| 94 | Once all issues are resolved: | ||
| 95 | 1. Set `domain` to your production domain | ||
| 96 | 2. Disable archive mode | ||
| 97 | 3. Update your reverse proxy to point to ngit-grasp | ||
| 98 | |||
| 99 | ## Running the Analysis | ||
| 100 | |||
| 101 | ### Basic Usage | ||
| 102 | |||
| 103 | ```bash | ||
| 104 | # Preview what will happen (dry run) | ||
| 105 | ./run-migration-analysis.sh \ | ||
| 106 | --prod-relay wss://source-relay.example.com \ | ||
| 107 | --archive-relay wss://target-relay.example.com \ | ||
| 108 | --dry-run | ||
| 109 | |||
| 110 | # Run the analysis | ||
| 111 | ./run-migration-analysis.sh \ | ||
| 112 | --prod-relay wss://source-relay.example.com \ | ||
| 113 | --archive-relay wss://target-relay.example.com | ||
| 114 | ``` | ||
| 115 | |||
| 116 | ### Full Analysis on VPS | ||
| 117 | |||
| 118 | ```bash | ||
| 119 | ./run-migration-analysis.sh \ | ||
| 120 | --prod-relay wss://source-relay.example.com \ | ||
| 121 | --archive-relay wss://target-relay.example.com \ | ||
| 122 | --prod-git /var/lib/grasp-relay/git \ | ||
| 123 | --archive-git /var/lib/ngit-grasp/git \ | ||
| 124 | --service ngit-grasp.service | ||
| 125 | ``` | ||
| 126 | |||
| 127 | ### Phase Control | ||
| 128 | |||
| 129 | Skip or run specific phases: | ||
| 130 | |||
| 131 | ```bash | ||
| 132 | # Skip Phase 2 (use cached git sync data) | ||
| 133 | ./run-migration-analysis.sh ... --skip-phase-2 | ||
| 134 | |||
| 135 | # Run only Phase 1 (fetch events) | ||
| 136 | ./run-migration-analysis.sh ... --only-phase-1 | ||
| 137 | |||
| 138 | # Resume from Phase 3 (using existing data) | ||
| 139 | ./run-migration-analysis.sh ... --from-phase-3 --output work/migration-analysis-20260122-1430 | ||
| 140 | ``` | ||
| 141 | |||
| 142 | ### All Options | ||
| 143 | |||
| 144 | | Option | Description | | ||
| 145 | |--------|-------------| | ||
| 146 | | `--prod-relay <url>` | Source relay WebSocket URL (required) | | ||
| 147 | | `--archive-relay <url>` | Target relay WebSocket URL (required) | | ||
| 148 | | `--prod-git <path>` | Git base directory for prod (enables Phase 2) | | ||
| 149 | | `--archive-git <path>` | Git base directory for archive (enables Phase 2) | | ||
| 150 | | `--service <name>` | Systemd service name (enables Phase 4) | | ||
| 151 | | `--output <dir>` | Output directory (default: auto-generated) | | ||
| 152 | | `--skip-phase-N` | Skip phase N (1-5) | | ||
| 153 | | `--only-phase-N` | Run only phase N | | ||
| 154 | | `--from-phase-N` | Start from phase N | | ||
| 155 | | `--dry-run` | Show what would be executed | | ||
| 156 | | `--continue-on-error` | Continue even if a phase fails | | ||
| 157 | |||
| 158 | ## Understanding Results | ||
| 159 | |||
| 160 | ### Summary File | ||
| 161 | |||
| 162 | The `results/summary.txt` file provides an overview: | ||
| 163 | |||
| 164 | ``` | ||
| 165 | ## Overview | ||
| 166 | |||
| 167 | | Category | Count | Percentage | | ||
| 168 | |----------|-------|------------| | ||
| 169 | | No Action Required | 450 | 85.7% | | ||
| 170 | | Action Required | 52 | 9.9% | | ||
| 171 | | Manual Investigation | 23 | 4.4% | | ||
| 172 | ``` | ||
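The percentages in the sample are each category's share of all analyzed repositories (450 + 52 + 23 = 525). A minimal sketch of that arithmetic follows. `pct` is a hypothetical helper, and the real scripts may compute the figures differently.

```bash
#!/bin/sh
# Hypothetical helper: percentage of "part" in "total", one decimal place.
pct() {
  awk -v part="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "0.0"; exit }
    printf "%.1f\n", part * 100 / total
  }'
}

# Figures taken from the sample summary above (total = 525).
pct 450 525   # 85.7
pct 52 525    # 9.9
pct 23 525    # 4.4
```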
| 173 | |||
| 174 | ### No Action Required | ||
| 175 | |||
| 176 | Repositories in `no-action-required.txt` are ready for migration: | ||
| 177 | |||
| 178 | ``` | ||
| 179 | myrepo | npub1abc... | complete in both prod and archive | ||
| 180 | oldrepo | npub1def... | deleted by user | ||
| 181 | testrepo | npub1ghi... | empty/blank in both (user never pushed) | ||
| 182 | ``` | ||
| 183 | |||
| 184 | **Common reasons:** | ||
| 185 | - `complete in both prod and archive` - Successfully migrated | ||
| 186 | - `deleted by user` - User requested deletion (kind 5 event) | ||
| 187 | - `empty/blank in both` - No git data was ever pushed | ||
| 188 | - `purgatory expired` - System already handled the timeout | ||
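To see how the repositories are distributed across these reasons, the result file can be tallied by its third column. The sketch below assumes the fields are separated by ` | ` exactly as in the sample lines above. `tally_reasons` is a hypothetical helper name.

```bash
#!/bin/sh
# Hypothetical helper: count how often each reason (third " | "-separated
# field) occurs. Reads the named files, or stdin when no file is given.
tally_reasons() {
  awk -F' [|] ' '
    NF >= 3 { count[$3]++ }
    END { for (reason in count) printf "%d\t%s\n", count[reason], reason }
  ' "$@" | sort -rn
}

# Example usage:
# tally_reasons results/no-action-required.txt
```

The output is one line per reason, most frequent first, with the count and the reason separated by a tab.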
| 189 | |||
| 190 | ### Action Required | ||
| 191 | |||
| 192 | Repositories in `action-required.txt` need intervention: | ||
| 193 | |||
| 194 | ``` | ||
| 195 | myrepo | npub1abc... | complete in prod, missing from archive | trigger re-sync or investigate | ||
| 196 | otherrepo | npub1def... | incomplete in both (prod=cat3, archive=cat2) | investigate git data source | ||
| 197 | ``` | ||
| 198 | |||
| 199 | **Common actions:** | ||
| 200 | - **Re-sync needed**: Trigger the archive to re-fetch from the source | ||
| 201 | - **Wait for sync**: Archive sync may still be in progress | ||
| 202 | - **Investigate git source**: Original git data may be incomplete | ||
| 203 | - **Fix parse failure**: Event format issue, may need re-announcement | ||
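When working through re-sync candidates, it can help to turn the matching lines into repository paths. The sketch below makes two assumptions: the first column is the repository identifier, and the second is the owner's npub. It prints paths in the `<npub>/<identifier>.git` layout described under Compatibility. `repos_missing_from_archive` is a hypothetical helper, not one of the migration scripts.

```bash
#!/bin/sh
# Hypothetical helper: print "<npub>/<identifier>.git" for every line whose
# state (third " | "-separated field) mentions "missing from archive".
# Reads the named files, or stdin when no file is given.
repos_missing_from_archive() {
  awk -F' [|] ' '$3 ~ /missing from archive/ { print $2 "/" $1 ".git" }' "$@"
}

# Example usage:
# repos_missing_from_archive results/action-required.txt
```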
| 204 | |||
| 205 | ### Manual Investigation | ||
| 206 | |||
| 207 | Repositories in `manual-investigation.txt` have unusual states: | ||
| 208 | |||
| 209 | ``` | ||
| 210 | weirdrepo | npub1abc... | in archive (cat1) but not in prod | may be new announcement or deleted from prod | ||
| 211 | conflictrepo | npub1def... | complete in prod, missing from archive, parse failure logged | investigate parse failure | ||
| 212 | ``` | ||
| 213 | |||
| 214 | These require human judgment to determine the correct action. | ||
| 215 | |||
| 216 | ## Troubleshooting | ||
| 217 | |||
| 218 | ### "nak not found" | ||
| 219 | |||
| 220 | Install nak from https://github.com/fiatjaf/nak: | ||
| 221 | |||
| 222 | ```bash | ||
| 223 | # Using Go | ||
| 224 | go install github.com/fiatjaf/nak@latest | ||
| 225 | |||
| 226 | # Or download binary from releases | ||
| 227 | ``` | ||
| 228 | |||
| 229 | ### "Permission denied" on git directories | ||
| 230 | |||
| 231 | Run with sudo or ensure your user has read access: | ||
| 232 | |||
| 233 | ```bash | ||
| 234 | # Check permissions | ||
| 235 | ls -la /var/lib/grasp-relay/git | ||
| 236 | |||
| 237 | # Run with sudo if needed | ||
| 238 | sudo ./run-migration-analysis.sh ... | ||
| 239 | ``` | ||
| 240 | |||
| 241 | ### Phase 2 takes too long | ||
| 242 | |||
| 243 | The git sync check processes each repository individually (roughly 20 minutes per relay). To speed up iteration: | ||
| 244 | |||
| 245 | 1. Run Phase 2 once and save the output | ||
| 246 | 2. Use `--skip-phase-2` for subsequent runs | ||
| 247 | 3. Use `--from-phase-3` to re-run classification with existing data | ||
| 248 | |||
| 249 | ### No parse failures found | ||
| 250 | |||
| 251 | This is expected if: | ||
| 252 | - ngit-grasp logging improvements aren't deployed yet | ||
| 253 | - No events actually failed to parse | ||
| 254 | |||
| 255 | The analysis will continue without log data. | ||
| 256 | |||
| 257 | ### Event counts are multiples of 250 | ||
| 258 | |||
| 259 | This suggests pagination may have failed. The scripts use `--paginate` by default, but if a count comes out at exactly 250, 500, or 750 events, verify that the relay is responding correctly. | ||
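A quick way to flag such counts is a modulo check. This is a minimal sketch: `is_suspicious_count` is a hypothetical helper, and the commented usage assumes the Phase 1 files hold one event per line, which may not match the actual file layout.

```bash
#!/bin/sh
# Hypothetical helper: succeed when the count is a positive multiple of 250.
is_suspicious_count() {
  [ "$1" -gt 0 ] && [ $(( $1 % 250 )) -eq 0 ]
}

# Example usage (assumes one event per line in the raw file):
# count=$(wc -l < prod/raw/state-events.json | tr -d ' ')
# is_suspicious_count "$count" && echo "possible pagination cutoff: $count events"
```

A genuine count can still be a multiple of 250 by coincidence, so treat a hit as a prompt to re-check, not as proof of a failure.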
| 260 | |||
| 261 | ## Architecture | ||
| 262 | |||
| 263 | ### Analysis Phases | ||
| 264 | |||
| 265 | The analysis is split into 5 modular phases: | ||
| 266 | |||
| 267 | | Phase | Name | Time | Location | Description | | ||
| 268 | |-------|------|------|----------|-------------| | ||
| 269 | | 1 | Fetch Events | ~30s per relay | Local | Fetch events from both relays | | ||
| 270 | | 2 | Git Sync Check | ~20 min per relay | VPS | Compare state events to git data | | ||
| 271 | | 3 | Categorize & Compare | <1s | Local | Categorize and compare results | | ||
| 272 | | 4 | Extract Logs | <30s | VPS | Extract parse failures and purgatory expiry | | ||
| 273 | | 5 | Final Classification | <5s | Local | Combine all data into actionable results | | ||
| 274 | |||
| 275 | ### Phase Flow Diagram | ||
| 276 | |||
| 277 | ``` | ||
| 278 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 279 | │ PHASE 1: Fetch Events (~30s, local) │ | ||
| 280 | │ Fetches kind 30618 (state), 30617 (announcements), 5 (deletion) │ | ||
| 281 | │ Run twice: once for prod, once for archive │ | ||
| 282 | └─────────────────────────────────────────────────────────────────┘ | ||
| 283 | ↓ | ||
| 284 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 285 | │ PHASE 2: Git Sync Check (~20 mins, VPS required) │ | ||
| 286 | │ Compares state event refs to actual git data on disk │ | ||
| 287 | │ Categorizes into: complete, empty, partial, no-match │ | ||
| 288 | └─────────────────────────────────────────────────────────────────┘ | ||
| 289 | ↓ | ||
| 290 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 291 | │ PHASE 3: Categorize & Compare (fast, local) │ | ||
| 292 | │ Compares prod vs archive categories │ | ||
| 293 | │ Identifies gaps and sync issues │ | ||
| 294 | └─────────────────────────────────────────────────────────────────┘ | ||
| 295 | ↓ | ||
| 296 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 297 | │ PHASE 4: Log-Based Categories (VPS required) │ | ||
| 298 | │ Extracts [PARSE_FAIL] and [PURGATORY_EXPIRED] from logs │ | ||
| 299 | │ Provides context for why repos failed to sync │ | ||
| 300 | └─────────────────────────────────────────────────────────────────┘ | ||
| 301 | ↓ | ||
| 302 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 303 | │ PHASE 5: Final Classification (fast, local) │ | ||
| 304 | │ Combines all data sources │ | ||
| 305 | │ Outputs: no-action, action-required, manual-investigation │ | ||
| 306 | └─────────────────────────────────────────────────────────────────┘ | ||
| 307 | ``` | ||
| 308 | |||
| 309 | ### Git Sync Categories | ||
| 310 | |||
| 311 | Phase 2 categorizes repositories into 4 categories: | ||
| 312 | |||
| 313 | | Category | Description | Meaning | | ||
| 314 | |----------|-------------|---------| | ||
| 315 | | 1 | Complete Match | All refs in state event match git data | | ||
| 316 | | 2 | Empty/Blank | No git data available | | ||
| 317 | | 3 | Partial Match | Some refs match, some don't | | ||
| 318 | | 4 | No Match | Git data exists but refs don't match | | ||
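The four categories can be illustrated with a small comparison of two ref lists. This is a sketch under stated assumptions, not the logic of `10-check-git-sync.sh`: both inputs are files of `refname sha` lines, refs on disk that the state event does not mention are ignored, and `classify_sync` is a hypothetical name.

```bash
#!/bin/sh
# Hypothetical helper: print the category (1-4) for one repository.
#   $1: "refname sha" lines taken from the state event
#   $2: "refname sha" lines read from the repository on disk
classify_sync() {
  expected=$1
  actual=$2
  # Category 2: no git data available on disk.
  if [ ! -s "$actual" ]; then
    echo 2
    return
  fi
  # Load the on-disk refs first, then check each expected ref against them.
  awk '
    NR == FNR { have[$1] = $2; next }
    { total++; if (have[$1] == $2) matched++ }
    END {
      if (total > 0 && matched == total) print 1   # complete match
      else if (matched > 0) print 3                # partial match
      else print 4                                 # no match
    }
  ' "$actual" "$expected"
}

# Example usage. The on-disk list comes from a standard git command; the jq
# filter assumes state events carry ["refs/...", "<sha>"] tags.
# git -C "$repo" for-each-ref --format='%(refname) %(objectname)' > actual.txt
# jq -r '.tags[] | select(.[0] | startswith("refs/")) | "\(.[0]) \(.[1])"' event.json > expected.txt
# classify_sync expected.txt actual.txt
```

In this sketch, a state event that lists no refs at all lands in category 4 when git data exists. The real scripts may treat that case differently.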
| 319 | |||
| 320 | ### Output Directory Structure | ||
| 321 | |||
| 322 | ``` | ||
| 323 | work/migration-analysis-YYYYMMDD-HHMM/ | ||
| 324 | ├── prod/ | ||
| 325 | │ ├── raw/ | ||
| 326 | │ │ ├── state-events.json # Phase 1 | ||
| 327 | │ │ ├── announcements.json # Phase 1 | ||
| 328 | │ │ └── deletions.json # Phase 1 | ||
| 329 | │ ├── git-sync-status.tsv # Phase 2 | ||
| 330 | │ └── category*.txt # Phase 2/3 | ||
| 331 | ├── archive/ | ||
| 332 | │ └── (same structure as prod) | ||
| 333 | ├── comparison/ | ||
| 334 | │ ├── complete-in-both.txt # Phase 3 | ||
| 335 | │ ├── complete-prod-missing-archive.txt | ||
| 336 | │ ├── complete-prod-incomplete-archive.txt | ||
| 337 | │ ├── incomplete-in-both.txt | ||
| 338 | │ ├── in-archive-not-prod.txt | ||
| 339 | │ └── summary.txt | ||
| 340 | ├── logs/ | ||
| 341 | │ ├── parse-failures.txt # Phase 4 | ||
| 342 | │ └── purgatory-expired.txt # Phase 4 | ||
| 343 | └── results/ | ||
| 344 | ├── no-action-required.txt # Phase 5 | ||
| 345 | ├── action-required.txt # Phase 5 | ||
| 346 | ├── manual-investigation.txt # Phase 5 | ||
| 347 | └── summary.txt # Phase 5 | ||
| 348 | ``` | ||
| 349 | |||
| 350 | ## Why Migration May Require Attention | ||
| 351 | |||
| 352 | Different GRASP implementations may handle edge cases differently. ngit-grasp has stricter validation and better observability, which can surface issues that were previously hidden: | ||
| 353 | |||
| 354 | | Aspect | Typical Source Relay | ngit-grasp | | ||
| 355 | |--------|---------------------|------------| | ||
| 356 | | Git data validation | May accept partial data | Requires all git data to reproduce state | | ||
| 357 | | PR refs cleanup | May not clear `refs/nostr/<event-id>` | Properly manages PR refs | | ||
| 358 | | Parse failures | May silently ignore | Logs structured `[PARSE_FAIL]` entries | | ||
| 359 | | Sync timeout | May have no timeout | Purgatory expires after configurable period | | ||
| 360 | |||
| 361 | These differences explain why some repositories need attention during migration: ngit-grasp reports issues that other implementations may have silently accepted. | ||
| 362 | |||
| 363 | ## Next Steps | ||
| 364 | |||
| 365 | After running the analysis: | ||
| 366 | |||
| 367 | 1. **Review the summary** - Check `results/summary.txt` for the overview | ||
| 368 | 2. **Address action items** - Work through `results/action-required.txt` | ||
| 369 | 3. **Investigate edge cases** - Review `results/manual-investigation.txt` | ||
| 370 | 4. **Re-run analysis** - After fixing issues, re-run to verify | ||
| 371 | 5. **Plan cutover** - Schedule the switch when all issues are resolved | ||
| 372 | |||
| 373 | ### When to Re-run | ||
| 374 | |||
| 375 | Re-run the analysis when: | ||
| 376 | - Archive sync has had time to complete | ||
| 377 | - You've fixed parse failures or re-announced events | ||
| 378 | - You want to verify fixes before cutover | ||
| 379 | |||
| 380 | ```bash | ||
| 381 | # Re-run with existing Phase 2 data (faster) | ||
| 382 | ./run-migration-analysis.sh ... --skip-phase-2 --output work/migration-analysis-20260122-1430 | ||
| 383 | ``` | ||
| 384 | |||
| 385 | ## Individual Scripts | ||
| 386 | |||
| 387 | For advanced usage, you can run individual phase scripts: | ||
| 388 | |||
| 389 | ```bash | ||
| 390 | # Phase 1: Fetch events | ||
| 391 | ./migration-scripts/01-fetch-events.sh wss://source-relay.example.com output/prod | ||
| 392 | |||
| 393 | # Phase 2: Git sync check | ||
| 394 | ./migration-scripts/10-check-git-sync.sh output/prod/raw/state-events.json /var/lib/grasp-relay/git output/prod --categorize | ||
| 395 | |||
| 396 | # Phase 3a: Categorize | ||
| 397 | ./migration-scripts/20-categorize.sh output/prod/git-sync-status.tsv output/prod | ||
| 398 | |||
| 399 | # Phase 3b: Compare relays | ||
| 400 | ./migration-scripts/21-compare-relays.sh output/prod output/archive output/comparison | ||
| 401 | |||
| 402 | # Phase 4a: Extract parse failures | ||
| 403 | ./migration-scripts/30-extract-parse-failures.sh ngit-grasp.service output/logs | ||
| 404 | |||
| 405 | # Phase 4b: Extract purgatory expiry | ||
| 406 | ./migration-scripts/31-extract-purgatory-expiry.sh ngit-grasp.service output/logs | ||
| 407 | |||
| 408 | # Phase 5: Final classification | ||
| 409 | ./migration-scripts/40-classify-actions.sh work/migration-analysis-20260122-1430 | ||
| 410 | ``` | ||
| 411 | |||
| 412 | Each script has detailed help available with `--help` or by reading the script header. | ||