upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

Diffstat (limited to 'docs')
-rw-r--r-- docs/how-to/migrate-ngit-relay-to-ngit-grasp.md | 484
1 files changed, 336 insertions, 148 deletions
diff --git a/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md b/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
index 4c3a4ba..975eb4c 100644
--- a/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
+++ b/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
@@ -1,207 +1,395 @@
1# Migrate ngit-relay to ngit-grasp on NixOS VPS 1# Migrate ngit-relay to ngit-grasp
2 2
3**Goal:** Replace an ngit-relay instance on a VPS running NixOS with ngit-grasp. 3This guide walks you through migrating a production ngit-relay instance to ngit-grasp. The process involves analyzing your existing data to identify repositories that need attention before switching over.
4 4
5**Specifics:** VPS running NixOS. 5## Quick Start
6 6
7## Approach 7Run the migration analysis with a single command:
8 8
91. Deploy ngit-grasp with 'domain' of `<prod-domain>.internal` and an `archiveService` of `<prod-domain>`, running on a different port. This will gather all the events and git data from the production service and from the relays/git servers/grasp servers for repositories that list the service in their announcement event. Syncing all git data may take an hour. 9```bash
10# Basic analysis (fetches events, compares relays)
11./docs/how-to/migration-scripts/run-migration-analysis.sh \
12 --prod-relay wss://relay.ngit.dev \
13 --archive-relay wss://archive.relay.ngit.dev
10 14
112. Analyze the data to see which repositories have not been moved with complete data. Understand why and for each decide if action is needed / not needed to move it. 15# Full analysis (includes git sync check - run on VPS)
16./docs/how-to/migration-scripts/run-migration-analysis.sh \
17 --prod-relay wss://relay.ngit.dev \
18 --archive-relay wss://archive.relay.ngit.dev \
19 --prod-git /var/lib/ngit-relay/git \
20 --archive-git /var/lib/ngit-relay-archive/git \
21 --service ngit-grasp.service
22```
23
24The script produces three output files:
25- `results/no-action-required.txt` - Repos ready for migration
26- `results/action-required.txt` - Repos needing intervention
27- `results/manual-investigation.txt` - Repos needing human review
28
29See [Running the Analysis](#running-the-analysis) for detailed options.
30
31## Prerequisites
32
33### Required Tools
34
35- **nak** - Nostr Army Knife for fetching events ([install](https://github.com/fiatjaf/nak))
36- **jq** - JSON processing (install via package manager)
37
38### For Full Analysis (VPS)
39
40- SSH access to the VPS running ngit-relay
41- Read access to git data directories
42- Access to systemd journal (for log extraction)
43
44### Verify Installation
45
46```bash
47# Check required tools
48nak --version
49jq --version
50
51# Check optional tools (for VPS phases)
52journalctl --version
53```
54
55## Migration Overview
56
57The migration process has three stages:
58
59### Stage 1: Deploy Archive Instance
60
61Deploy ngit-grasp alongside your production ngit-relay:
62
631. Configure ngit-grasp with:
64 - `domain` set to `<prod-domain>.internal` (temporary)
65 - `archiveService` set to your production domain
66 - Running on a different port
67
682. Let it sync for ~1 hour to gather all events and git data
69
70### Stage 2: Analyze Data
71
72Run the migration analysis to identify:
73- Repositories successfully migrated (no action needed)
74- Repositories with incomplete data (need investigation)
75- Repositories with parse failures (may need re-announcement)
76
77### Stage 3: Switch Over
78
79Once all issues are resolved:
801. Set `domain` to your production URL
812. Disable archive mode
823. Update your reverse proxy to point to ngit-grasp
83
84## Running the Analysis
85
86### Basic Usage
87
88```bash
89# Preview what will happen (dry run)
90./run-migration-analysis.sh \
91 --prod-relay wss://relay.ngit.dev \
92 --archive-relay wss://archive.relay.ngit.dev \
93 --dry-run
94
95# Run the analysis
96./run-migration-analysis.sh \
97 --prod-relay wss://relay.ngit.dev \
98 --archive-relay wss://archive.relay.ngit.dev
99```
100
101### Full Analysis on VPS
102
103```bash
104./run-migration-analysis.sh \
105 --prod-relay wss://relay.ngit.dev \
106 --archive-relay wss://archive.relay.ngit.dev \
107 --prod-git /var/lib/ngit-relay/git \
108 --archive-git /var/lib/ngit-relay-archive/git \
109 --service ngit-grasp.service
110```
111
112### Phase Control
113
114Skip or run specific phases:
115
116```bash
117# Skip Phase 2 (use cached git sync data)
118./run-migration-analysis.sh ... --skip-phase-2
119
120# Run only Phase 1 (fetch events)
121./run-migration-analysis.sh ... --only-phase-1
122
123# Resume from Phase 3 (using existing data)
124./run-migration-analysis.sh ... --from-phase-3 --output work/migration-analysis-20260122-1430
125```
126
127### All Options
128
129| Option | Description |
130|--------|-------------|
131| `--prod-relay <url>` | Production relay WebSocket URL (required) |
132| `--archive-relay <url>` | Archive relay WebSocket URL (required) |
133| `--prod-git <path>` | Git base directory for prod (enables Phase 2) |
134| `--archive-git <path>` | Git base directory for archive (enables Phase 2) |
135| `--service <name>` | Systemd service name (enables Phase 4) |
136| `--output <dir>` | Output directory (default: auto-generated) |
137| `--skip-phase-N` | Skip phase N (1-5) |
138| `--only-phase-N` | Run only phase N |
139| `--from-phase-N` | Start from phase N |
140| `--dry-run` | Show what would be executed |
141| `--continue-on-error` | Continue even if a phase fails |
142
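The options table implies that some phases only run when their inputs are supplied. A minimal sketch of that gating follows. `enabled_phases` is a hypothetical helper, not part of the migration scripts, and it assumes Phase 2 needs both git directories and that Phases 1, 3, and 5 always run.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the migration scripts): list the phases that
# would run for a given set of options. Pass "" for an option that was omitted.
# Usage: enabled_phases <prod-git> <archive-git> <service>
enabled_phases() {
  local prod_git="$1" archive_git="$2" service="$3"
  local phases="1"

  # Assumption: Phase 2 needs the git base directory for both prod and archive.
  if [ -n "$prod_git" ] && [ -n "$archive_git" ]; then
    phases="$phases 2"
  fi

  phases="$phases 3"

  # Phase 4 reads the systemd journal, so it needs a service name.
  if [ -n "$service" ]; then
    phases="$phases 4"
  fi

  echo "$phases 5"
}

# Basic analysis: relay URLs only.
enabled_phases "" "" ""

# Full analysis on the VPS.
enabled_phases /var/lib/ngit-relay/git /var/lib/ngit-relay-archive/git ngit-grasp.service
```

The real orchestrator may apply the `--skip-phase-N`, `--only-phase-N`, and `--from-phase-N` flags on top of this. Those are left out of the sketch.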
143## Understanding Results
144
145### Summary File
146
147The `results/summary.txt` file provides an overview:
148
149```
150## Overview
151
152| Category | Count | Percentage |
153|----------|-------|------------|
154| No Action Required | 450 | 85.7% |
155| Action Required | 52 | 9.9% |
156| Manual Investigation | 23 | 4.4% |
157```
158
159### No Action Required
12 160
133. Set the 'domain' to production URL, turn off archive mode, and point your reverse proxy at the new port. 161Repositories in `no-action-required.txt` are ready for migration:
14 162
15## Challenges 163```
164myrepo | npub1abc... | complete in both prod and archive
165oldrepo | npub1def... | deleted by user
166testrepo | npub1ghi... | empty/blank in both (user never pushed)
167```
168
169**Common reasons:**
170- `complete in both prod and archive` - Successfully migrated
171- `deleted by user` - User requested deletion (kind 5 event)
172- `empty/blank in both` - No git data was ever pushed
173- `purgatory expired` - System already handled the timeout
174
175### Action Required
176
177Repositories in `action-required.txt` need intervention:
178
179```
180myrepo | npub1abc... | complete in prod, missing from archive | trigger re-sync or investigate
181otherrepo | npub1def... | incomplete in both (prod=cat3, archive=cat2) | investigate git data source
182```
183
184**Common actions:**
185- **Re-sync needed**: Trigger the archive to re-fetch from the source
186- **Wait for sync**: Archive sync may still be in progress
187- **Investigate git source**: Original git data may be incomplete
188- **Fix parse failure**: Event format issue, may need re-announcement
16 189
17- **ngit-relay accepts any commits/annotated tags** that were at that point in time referenced in the latest state event. **ngit-grasp requires all the git data** to reproduce the latest state. So if the git data is incomplete, it won't accept the repository. 190### Manual Investigation
18 191
19- **ngit-relay doesn't clear out refs/nostr/<event-id>** where it doesn't have a PR event. Fortunately the 'PR' (as opposed to patches) functionality is not widely used so we just need to check a few repositories (shakespeare, ngit and gitworkshop). 192Repositories in `manual-investigation.txt` have unusual states:
193
194```
195weirdrepo | npub1abc... | in archive (cat1) but not in prod | may be new announcement or deleted from prod
196conflictrepo | npub1def... | complete in prod, missing from archive, parse failure logged | investigate parse failure
197```
20 198
21## Analysis Categories 199These require human judgment to determine the correct action.
22 200
23### No action required: 201## Troubleshooting
24 202
25| Category | How to Detect | Source | 203### "nak not found"
26|----------|---------------|--------|
27| **Git Data Complete - Moved** | prod cat1 AND archive cat1 (same repo) | Git sync check |
28| **Invalid Announcement** (Won't Parse) | Log: `[PARSE_FAIL] kind=30617` | Archive logs |
29| **Deletion Request** | kind 5 event tagging announcement | Event fetch |
30| **Announcement Not on Prod But In Archive** | In archive announcements, not in prod | Event comparison |
31 204
32### Action/decision required: 205Install nak from https://github.com/fiatjaf/nak:
33 206
34| Category | How to Detect | Source | 207```bash
35|----------|---------------|--------| 208# Using Go
36| **Invalid State Event** (Won't Parse) | Log: `[PARSE_FAIL] kind=30618` | Archive logs | 209go install github.com/fiatjaf/nak@latest
37| **Purgatory Expired** (sync should have worked) | Log: `[PURGATORY_EXPIRED]` | Archive logs |
38| **Incomplete Git Data** (both relays) | prod cat2/3/4 AND archive cat2/3/4 | Git sync check |
39| **No Announcement In Archive** | In prod, not in archive, no deletion | Event comparison |
40| **State but incomplete git in Archive** | archive cat3 or cat4 | Git sync check |
41 210
42### Manual investigation required: 211# Or download binary from releases
212```
43 213
44- Repos that don't fit above categories 214### "Permission denied" on git directories
45- Repos with unexpected state (e.g., complete in prod, missing in archive, no log entries)
46 215
47## Analysis Script Architecture 216Run with sudo or ensure your user has read access:
48 217
49The analysis is split into modular phases for fast iteration. Phases 1, 3, and 5 can run locally; Phases 2 and 4 require VPS access. 218```bash
219# Check permissions
220ls -la /var/lib/ngit-relay/git
221
222# Run with sudo if needed
223sudo ./run-migration-analysis.sh ...
224```
225
226### Phase 2 takes too long
227
228The git sync check processes each repository individually (~20 minutes per relay). To speed up iteration:
229
2301. Run Phase 2 once and save the output
2312. Use `--skip-phase-2` for subsequent runs
2323. Use `--from-phase-3` to re-run classification with existing data
233
234### No parse failures found
235
236This is expected if:
237- ngit-grasp logging improvements aren't deployed yet
238- No events actually failed to parse
239
240The analysis will continue without log data.
241
242### Event counts are multiples of 250
243
244This suggests pagination may have failed. The scripts use `--paginate` by default, but if you see exactly 250, 500, or 750 events, verify the relay is responding correctly.
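The multiple-of-250 check is easy to automate once you have an event count. The sketch below is illustrative only: `looks_truncated` is a hypothetical helper name, and the page size of 250 comes from the text above rather than from any relay setting.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when an event count is a non-zero
# exact multiple of the page size, which hints that pagination stopped early.
# Usage: looks_truncated <count> [page-size]
looks_truncated() {
  local count="$1" page_size="${2:-250}"
  [ "$count" -gt 0 ] && [ $((count % page_size)) -eq 0 ]
}

# Example counts; substitute the number of events you actually fetched.
for n in 250 500 487; do
  if looks_truncated "$n"; then
    echo "$n: suspicious, re-check pagination"
  else
    echo "$n: ok"
  fi
done
```

A flagged count is only a hint: a relay can legitimately hold exactly 500 events, so treat it as a prompt to re-fetch and compare.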
245
246## Architecture
247
248### Analysis Phases
249
250The analysis is split into 5 modular phases:
251
252| Phase | Name | Time | Location | Description |
253|-------|------|------|----------|-------------|
254| 1 | Fetch Events | ~30s each | Local | Fetch events from both relays |
255| 2 | Git Sync Check | ~20 min each | VPS | Compare state events to git data |
256| 3 | Categorize & Compare | <1s | Local | Categorize and compare results |
257| 4 | Extract Logs | <30s | VPS | Extract parse failures and purgatory expiry |
258| 5 | Final Classification | <5s | Local | Combine all data into actionable results |
259
260### Phase Flow Diagram
50 261
51``` 262```
52┌─────────────────────────────────────────────────────────────────┐ 263┌─────────────────────────────────────────────────────────────────┐
53│ PHASE 1: Fetch Events (~30s, local) │ 264│ PHASE 1: Fetch Events (~30s, local) │
54│ migration-scripts/01-fetch-events.sh <relay> <output-dir> │ 265│ Fetches kind 30618 (state), 30617 (announcements), 5 (deletion) │
55├─────────────────────────────────────────────────────────────────┤ 266│ Run twice: once for prod, once for archive │
56│ Fetches from relay: │
57│ - kind 30618 (state events) │
58│ - kind 30617 (announcements) │
59│ - kind 5 (deletion requests) │
60│ │
61│ Run twice: once for prod (relay.ngit.dev), once for archive │
62│ Output: <output-dir>/{state,announcements,deletions}.json │
63└─────────────────────────────────────────────────────────────────┘ 267└─────────────────────────────────────────────────────────────────┘
64 268
65┌─────────────────────────────────────────────────────────────────┐ 269┌─────────────────────────────────────────────────────────────────┐
66│ PHASE 2: Git Sync Check (~20 mins, VPS required) │ 270│ PHASE 2: Git Sync Check (~20 mins, VPS required) │
67│ migration-scripts/10-check-git-sync.sh <events> <git-base> <out>│ 271│ Compares state event refs to actual git data on disk │
68├─────────────────────────────────────────────────────────────────┤ 272│ Categorizes into: complete, empty, partial, no-match │
69│ For each state event, compares refs to actual git data on disk. │
70│ │
71│ Run twice: │
72│ - prod: GIT_BASE=/persistent/relay-ngit-dev-ngit-relay/... │
73│ - archive: GIT_BASE=/persistent/grasp/sync-archive/git │
74│ │
75│ Output: git-sync-status.tsv │
76│ repo|npub|state_refs|git_refs|matches|status │
77└─────────────────────────────────────────────────────────────────┘ 273└─────────────────────────────────────────────────────────────────┘
78 274
79┌─────────────────────────────────────────────────────────────────┐ 275┌─────────────────────────────────────────────────────────────────┐
80│ PHASE 3: Categorize & Compare (fast, local) │ 276│ PHASE 3: Categorize & Compare (fast, local) │
81│ migration-scripts/20-categorize.sh <sync-status> <output-dir> │ 277│ Compares prod vs archive categories │
82│ migration-scripts/21-compare-relays.sh <prod> <archive> <out> │ 278│ Identifies gaps and sync issues │
83├─────────────────────────────────────────────────────────────────┤
84│ 20-categorize.sh applies 4-category logic: │
85│ - cat1: complete match (all refs match) │
86│ - cat2: empty/blank (no git data) │
87│ - cat3: partial match (some refs match) │
88│ - cat4: no match (git exists but refs don't match) │
89│ │
90│ 21-compare-relays.sh compares prod vs archive: │
91│ - complete-in-both.txt (no action needed) │
92│ - complete-prod-missing-archive.txt (needs investigation) │
93│ - complete-prod-incomplete-archive.txt (sync in progress?) │
94│ - incomplete-in-both.txt (git data incomplete) │
95│ - in-archive-not-prod.txt (deleted or new) │
96│ │
97│ Output: category-{1,2,3,4}.txt, comparison/*.txt, summary.txt │
98└─────────────────────────────────────────────────────────────────┘ 279└─────────────────────────────────────────────────────────────────┘
99 280
100┌─────────────────────────────────────────────────────────────────┐ 281┌─────────────────────────────────────────────────────────────────┐
101│ PHASE 4: Log-Based Categories (VPS required) │ 282│ PHASE 4: Log-Based Categories (VPS required) │
102│ migration-scripts/30-extract-parse-failures.sh <service> <out> │ 283│ Extracts [PARSE_FAIL] and [PURGATORY_EXPIRED] from logs │
103│ migration-scripts/31-extract-purgatory-expiry.sh <service> <out>│ 284│ Provides context for why repos failed to sync │
104├─────────────────────────────────────────────────────────────────┤
105│ Extracts structured log entries from journalctl: │
106│ - Parse failures: [PARSE_FAIL] kind=X event_id=Y reason=Z │
107│ - Purgatory expiry: [PURGATORY_EXPIRED] repo=X npub=Y │
108│ │
109│ NOTE: Requires logging improvements in ngit-grasp to emit │
110│ these structured log entries. See issue: TBD │
111│ │
112│ Output: parse-failures.txt, purgatory-expired.txt │
113└─────────────────────────────────────────────────────────────────┘ 285└─────────────────────────────────────────────────────────────────┘
114 286
115┌─────────────────────────────────────────────────────────────────┐ 287┌─────────────────────────────────────────────────────────────────┐
116│ PHASE 5: Final Classification (fast, local) │ 288│ PHASE 5: Final Classification (fast, local) │
117│ migration-scripts/40-classify-actions.sh <all-inputs> <out> │ 289│ Combines all data sources │
118├─────────────────────────────────────────────────────────────────┤ 290│ Outputs: no-action, action-required, manual-investigation │
119│ Combines all data sources to produce final classification: │
120│ │
121│ Inputs: │
122│ - category files (prod and archive) │
123│ - relay-gaps.txt │
124│ - parse-failures.txt │
125│ - purgatory-expired.txt │
126│ - deletions.json │
127│ │
128│ Output: │
129│ - no-action-required.txt (repo|reason) │
130│ - action-required.txt (repo|reason|suggested_action) │
131│ - manual-investigation.txt (repo|notes) │
132└─────────────────────────────────────────────────────────────────┘ 291└─────────────────────────────────────────────────────────────────┘
133``` 292```
134 293
135## Directory Structure 294### Git Sync Categories
295
296Phase 2 categorizes repositories into 4 categories:
297
298| Category | Description | Meaning |
299|----------|-------------|---------|
300| 1 | Complete Match | All refs in state event match git data |
301| 2 | Empty/Blank | No git data available |
302| 3 | Partial Match | Some refs match, some don't |
303| 4 | No Match | Git data exists but refs don't match |
304
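The four categories above can be expressed as a small decision function. This is a sketch under assumptions, not the logic of `20-categorize.sh`. It assumes the `state_refs`, `git_refs`, and `matches` columns of `git-sync-status.tsv` hold counts, and `classify_repo` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map ref counts to one of the four git sync categories.
# Assumption: the three arguments are integer counts.
# Usage: classify_repo <state_refs> <git_refs> <matches>
classify_repo() {
  local state_refs="$1" git_refs="$2" matches="$3"
  if [ "$git_refs" -eq 0 ]; then
    echo 2   # Empty/Blank: no git data available
  elif [ "$state_refs" -gt 0 ] && [ "$matches" -eq "$state_refs" ]; then
    echo 1   # Complete Match: every ref in the state event matches
  elif [ "$matches" -gt 0 ]; then
    echo 3   # Partial Match: some refs match, some don't
  else
    echo 4   # No Match: git data exists but no refs match
  fi
}

# Sample rows in repo|npub|state_refs|git_refs|matches order (made-up data).
while IFS='|' read -r repo npub state_refs git_refs matches _; do
  echo "$repo cat$(classify_repo "$state_refs" "$git_refs" "$matches")"
done <<'EOF'
myrepo|npub1abc...|3|3|3
testrepo|npub1ghi...|2|0|0
otherrepo|npub1def...|4|5|2
EOF
```

Checking for empty git data first means a repository with no refs on disk is never reported as a mismatch, which matches how category 2 is described in the table.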
305### Output Directory Structure
136 306
137``` 307```
138work/migration-analysis-YYYYMMDD-HHMM/ 308work/migration-analysis-YYYYMMDD-HHMM/
139├── prod/ 309├── prod/
140│ ├── raw/ 310│ ├── raw/
141│ │ ├── state-events.json # Phase 1 output 311│ │ ├── state-events.json # Phase 1
142│ │ ├── announcements.json # Phase 1 output 312│ │ ├── announcements.json # Phase 1
143│ │ └── deletions.json # Phase 1 output 313│ │ └── deletions.json # Phase 1
144│ ├── git-sync-status.tsv # Phase 2 output (optional) 314│ ├── git-sync-status.tsv # Phase 2
145│ ├── category1-complete-match.txt # Phase 2/3 output 315│ └── category*.txt # Phase 2/3
146│ ├── category2-empty-blank.txt # Phase 2/3 output
147│ ├── category3-partial-match.txt # Phase 2/3 output
148│ └── category4-no-match.txt # Phase 2/3 output
149├── archive/ 316├── archive/
150│ ├── raw/ 317│ └── (same structure as prod)
151│ │ ├── state-events.json
152│ │ ├── announcements.json
153│ │ └── deletions.json
154│ ├── git-sync-status.tsv
155│ ├── category1-complete-match.txt
156│ ├── category2-empty-blank.txt
157│ ├── category3-partial-match.txt
158│ └── category4-no-match.txt
159├── logs/
160│ ├── parse-failures.txt # Phase 4 output
161│ └── purgatory-expired.txt # Phase 4 output
162├── comparison/ 318├── comparison/
163│ ├── complete-in-both.txt # Phase 3 output (no action) 319│ ├── complete-in-both.txt # Phase 3
164│ ├── complete-prod-missing-archive.txt # Phase 3 output (investigate) 320│ ├── complete-prod-missing-archive.txt
165│ ├── complete-prod-incomplete-archive.txt # Phase 3 output (sync in progress?) 321│ ├── complete-prod-incomplete-archive.txt
166│ ├── incomplete-in-both.txt # Phase 3 output (git incomplete) 322│ ├── incomplete-in-both.txt
167│ ├── in-archive-not-prod.txt # Phase 3 output (deleted/new) 323│ ├── in-archive-not-prod.txt
168│ └── summary.txt # Phase 3 output (human-readable) 324│ └── summary.txt
325├── logs/
326│ ├── parse-failures.txt # Phase 4
327│ └── purgatory-expired.txt # Phase 4
169└── results/ 328└── results/
170 ├── no-action-required.txt # Phase 5 output 329 ├── no-action-required.txt # Phase 5
171 ├── action-required.txt # Phase 5 output 330 ├── action-required.txt # Phase 5
172 └── manual-investigation.txt # Phase 5 output 331 ├── manual-investigation.txt # Phase 5
332 └── summary.txt # Phase 5
173``` 333```
174 334
175## Prerequisites 335## Key Differences: ngit-relay vs ngit-grasp
336
337Understanding these differences helps explain why some repositories need attention:
176 338
177- `nak` - Nostr Army Knife for fetching events 339| Aspect | ngit-relay | ngit-grasp |
178- `jq` - JSON processing 340|--------|------------|------------|
179- SSH access to VPS for Phase 2 and 4 341| Git data validation | Accepts commits/tags referenced in state event | Requires all git data to reproduce state |
180- Logging improvements in ngit-grasp for Phase 4 (see Dependencies) 342| PR refs cleanup | Doesn't clear `refs/nostr/<event-id>` | Properly manages PR refs |
343| Parse failures | Silently ignores | Logs structured `[PARSE_FAIL]` entries |
344| Sync timeout | No timeout | Purgatory expires after configurable period |
181 345
182## Dependencies 346## Next Steps
183 347
184Phase 4 requires structured logging in ngit-grasp. Create a separate issue to add: 348After running the analysis:
185 349
186```rust 3501. **Review the summary** - Check `results/summary.txt` for the overview
187// On parse failure: 3512. **Address action items** - Work through `results/action-required.txt`
188tracing::warn!( 3523. **Investigate edge cases** - Review `results/manual-investigation.txt`
189 target: "migration", 3534. **Re-run analysis** - After fixing issues, re-run to verify
190 "[PARSE_FAIL] kind={} event_id={} reason=\"{}\"", 3545. **Plan cutover** - Schedule the switch when all issues are resolved
191 event.kind, event.id, reason
192);
193 355
194// On purgatory expiry: 356### When to Re-run
195tracing::warn!( 357
196 target: "migration", 358Re-run the analysis when:
197 "[PURGATORY_EXPIRED] repo={} npub={}", 359- Archive sync has had time to complete
198 identifier, npub 360- You've fixed parse failures or re-announced events
199); 361- You want to verify fixes before cutover
362
363```bash
364# Re-run with existing Phase 2 data (faster)
365./run-migration-analysis.sh ... --skip-phase-2 --output work/migration-analysis-20260122-1430
200``` 366```
201 367
202## Gotchas 368## Individual Scripts
369
370For advanced usage, you can run individual phase scripts:
371
372```bash
373# Phase 1: Fetch events
374./migration-scripts/01-fetch-events.sh wss://relay.ngit.dev output/prod
375
376# Phase 2: Git sync check
377./migration-scripts/10-check-git-sync.sh output/prod/raw/state-events.json /var/lib/ngit-relay/git output/prod --categorize
378
379# Phase 3a: Categorize
380./migration-scripts/20-categorize.sh output/prod/git-sync-status.tsv output/prod
381
382# Phase 3b: Compare relays
383./migration-scripts/21-compare-relays.sh output/prod output/archive output/comparison
384
385# Phase 4a: Extract parse failures
386./migration-scripts/30-extract-parse-failures.sh ngit-grasp.service output/logs
387
388# Phase 4b: Extract purgatory expiry
389./migration-scripts/31-extract-purgatory-expiry.sh ngit-grasp.service output/logs
390
391# Phase 5: Final classification
392./migration-scripts/40-classify-actions.sh work/migration-analysis-20260122-1430
393```
203 394
204- Always use `nak req` with `--paginate` flag so we don't miss any events. If we receive increments of 250 (e.g., exactly 500) then it's a red flag that we are not paginating and there are probably more events. 395Each script has detailed help available with `--help` or by reading the script header.
205- Phase 1 and 2 should run back-to-back for an accurate snapshot.
206- The git sync check (Phase 2) takes ~20 minutes per relay - this is the slow part.
207- Existing analysis data from Jan 22 can be used for developing Phase 3/5 logic before re-running Phase 2.
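When developing the Phase 3 logic against saved data, the prod-versus-archive comparison can be prototyped as a lookup from a pair of categories to one of the comparison files listed in the directory structure. This is only a sketch. `compare_bucket` is a hypothetical name, `-` is an assumed placeholder for "repository absent on that side", and the fallback `unclassified` bucket is not one of the documented output files.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map (prod category, archive category) to a comparison
# bucket. Use "-" when the repository is absent on that side (assumption).
# Usage: compare_bucket <prod-cat> <archive-cat>
compare_bucket() {
  local prod="$1" archive="$2"
  case "$prod:$archive" in
    1:1)         echo "complete-in-both" ;;
    1:-)         echo "complete-prod-missing-archive" ;;
    1:[234])     echo "complete-prod-incomplete-archive" ;;
    [234]:[234]) echo "incomplete-in-both" ;;
    -:[1234])    echo "in-archive-not-prod" ;;
    *)           echo "unclassified" ;;   # not a documented bucket
  esac
}

compare_bucket 1 1
compare_bucket 1 -
compare_bucket 3 2
```

Pairs that fall through to `unclassified`, such as incomplete in prod but complete in the archive, are the kind of unexpected state the guide routes to manual investigation.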