| -rw-r--r-- | docs/how-to/migrate-ngit-relay-to-ngit-grasp.md | 180 |
1 files changed, 164 insertions, 16 deletions
diff --git a/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md b/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
index e17ba0a..d01bbf2 100644
--- a/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
+++ b/docs/how-to/migrate-ngit-relay-to-ngit-grasp.md
| @@ -22,25 +22,173 @@ | |||
| 22 | 22 | ||
| 23 | ### No action required: | 23 | ### No action required: |
| 24 | 24 | ||
| 25 | - **Git Data Complete - Moved** (state event exists in archive and git data reflects it) | 25 | | Category | How to Detect | Source | |
| 26 | - **Invalid Repositories Announcement** (Won't Parse) | 26 | |----------|---------------|--------| |
| 27 | - **Deletion Request** (kind 5) tagging announcement event in archive | 27 | | **Git Data Complete - Moved** | prod cat1 AND archive cat1 (same repo) | Git sync check | |
| 28 | - **Announcement Not on Production But In Archive** that lists service | 28 | | **Invalid Announcement** (Won't Parse) | Log: `[PARSE_FAIL] kind=30617` | Archive logs | |
| 29 | | **Deletion Request** | kind 5 event tagging announcement | Event fetch | | ||
| 30 | | **Announcement Not on Prod But In Archive** | In archive announcements, not in prod | Event comparison | | ||
| 29 | 31 | ||
| 30 | ### Action/decision required: | 32 | ### Action/decision required: |
| 31 | 33 | ||
| 32 | - **Invalid State Event** (Won't Parse) | 34 | | Category | How to Detect | Source | |
| 33 | - **Incomplete Git Data** (at source and destination) And No State Event at Destination | 35 | |----------|---------------|--------| |
| 34 | - **No Announcement In Archive** (and no related delete event) | 36 | | **Invalid State Event** (Won't Parse) | Log: `[PARSE_FAIL] kind=30618` | Archive logs | |
| 35 | - **Complete Git Data at source, Announcement but no State Event in Archive** and empty bare git repo | 37 | | **Purgatory Expired** (sync should have worked) | Log: `[PURGATORY_EXPIRED]` | Archive logs | |
| 36 | - **State event but incomplete git data in Archive** | 38 | | **Incomplete Git Data** (both relays) | prod cat2/3/4 AND archive cat2/3/4 | Git sync check | |
| 37 | 39 | | **No Announcement In Archive** | In prod, not in archive, no deletion | Event comparison | | |
| 38 | ## Analysis Approach | 40 | | **State but incomplete git in Archive** | archive cat3 or cat4 | Git sync check | |
| 39 | 41 | ||
| 40 | This analysis and categorization should be scripted to facilitate easy review and decision making. | 42 | ### Manual investigation required: |
| 41 | 43 | ||
| 42 | There are already some scripts that we need to build on in the old issue worktree to help facilitate this. | 44 | - Repos that don't fit the above categories |
| 45 | - Repos with unexpected state (e.g., complete in prod, missing in archive, no log entries) | ||
| 46 | |||
| 47 | ## Analysis Script Architecture | ||
| 48 | |||
| 49 | The analysis is split into modular phases for fast iteration. Phases 1, 3, and 5 can run locally; Phases 2 and 4 require VPS access. | ||
| 50 | |||
| 51 | ``` | ||
| 52 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 53 | │ PHASE 1: Fetch Events (~30s, local) │ | ||
| 54 | │ scripts/migration/01-fetch-events.sh <relay> <output-dir> │ | ||
| 55 | ├─────────────────────────────────────────────────────────────────┤ | ||
| 56 | │ Fetches from relay: │ | ||
| 57 | │ - kind 30618 (state events) │ | ||
| 58 | │ - kind 30617 (announcements) │ | ||
| 59 | │ - kind 5 (deletion requests) │ | ||
| 60 | │ │ | ||
| 61 | │ Run twice: once for prod (relay.ngit.dev), once for archive │ | ||
| 62 | │ Output: <output-dir>/{state,announcements,deletions}.json │ | ||
| 63 | └─────────────────────────────────────────────────────────────────┘ | ||
| 64 | ↓ | ||
| 65 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 66 | │ PHASE 2: Git Sync Check (~20 mins, VPS required) │ | ||
| 67 | │ scripts/migration/10-check-git-sync.sh <events> <git-base> <out>│ | ||
| 68 | ├─────────────────────────────────────────────────────────────────┤ | ||
| 69 | │ For each state event, compares refs to actual git data on disk. │ | ||
| 70 | │ │ | ||
| 71 | │ Run twice: │ | ||
| 72 | │ - prod: GIT_BASE=/persistent/relay-ngit-dev-ngit-relay/... │ | ||
| 73 | │ - archive: GIT_BASE=/persistent/grasp/sync-archive/git │ | ||
| 74 | │ │ | ||
| 75 | │ Output: git-sync-status.tsv │ | ||
| 76 | │ repo|npub|state_refs|git_refs|matches|status │ | ||
| 77 | └─────────────────────────────────────────────────────────────────┘ | ||
| 78 | ↓ | ||
| 79 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 80 | │ PHASE 3: Categorize & Compare (fast, local) │ | ||
| 81 | │ scripts/migration/20-categorize.sh <sync-status> <output-dir> │ | ||
| 82 | │ scripts/migration/21-compare-relays.sh <prod> <archive> <out> │ | ||
| 83 | ├─────────────────────────────────────────────────────────────────┤ | ||
| 84 | │ 20-categorize.sh applies 4-category logic: │ | ||
| 85 | │ - cat1: complete match (all refs match) │ | ||
| 86 | │ - cat2: empty/blank (no git data) │ | ||
| 87 | │ - cat3: partial match (some refs match) │ | ||
| 88 | │ - cat4: no match (git exists but refs don't match) │ | ||
| 89 | │ │ | ||
| 90 | │ 21-compare-relays.sh finds gaps: │ | ||
| 91 | │ - in prod but not archive │ | ||
| 92 | │ - in archive but not prod │ | ||
| 93 | │ - different status between relays │ | ||
| 94 | │ │ | ||
| 95 | │ Output: category-{1,2,3,4}.txt, relay-gaps.txt │ | ||
| 96 | └─────────────────────────────────────────────────────────────────┘ | ||
| 97 | ↓ | ||
| 98 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 99 | │ PHASE 4: Log-Based Categories (VPS required) │ | ||
| 100 | │ scripts/migration/30-extract-parse-failures.sh <service> <out> │ | ||
| 101 | │ scripts/migration/31-extract-purgatory-expiry.sh <service> <out>│ | ||
| 102 | ├─────────────────────────────────────────────────────────────────┤ | ||
| 103 | │ Extracts structured log entries from journalctl: │ | ||
| 104 | │ - Parse failures: [PARSE_FAIL] kind=X event_id=Y reason=Z │ | ||
| 105 | │ - Purgatory expiry: [PURGATORY_EXPIRED] repo=X npub=Y │ | ||
| 106 | │ │ | ||
| 107 | │ NOTE: Requires logging improvements in ngit-grasp to emit │ | ||
| 108 | │ these structured log entries. See issue: TBD │ | ||
| 109 | │ │ | ||
| 110 | │ Output: parse-failures.txt, purgatory-expired.txt │ | ||
| 111 | └─────────────────────────────────────────────────────────────────┘ | ||
| 112 | ↓ | ||
| 113 | ┌─────────────────────────────────────────────────────────────────┐ | ||
| 114 | │ PHASE 5: Final Classification (fast, local) │ | ||
| 115 | │ scripts/migration/40-classify-actions.sh <all-inputs> <out> │ | ||
| 116 | ├─────────────────────────────────────────────────────────────────┤ | ||
| 117 | │ Combines all data sources to produce final classification: │ | ||
| 118 | │ │ | ||
| 119 | │ Inputs: │ | ||
| 120 | │ - category files (prod and archive) │ | ||
| 121 | │ - relay-gaps.txt │ | ||
| 122 | │ - parse-failures.txt │ | ||
| 123 | │ - purgatory-expired.txt │ | ||
| 124 | │ - deletions.json │ | ||
| 125 | │ │ | ||
| 126 | │ Output: │ | ||
| 127 | │ - no-action-required.txt (repo|reason) │ | ||
| 128 | │ - action-required.txt (repo|reason|suggested_action) │ | ||
| 129 | │ - manual-investigation.txt (repo|notes) │ | ||
| 130 | └─────────────────────────────────────────────────────────────────┘ | ||
| 131 | ``` | ||
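
The 4-category logic in Phase 3 could be sketched as below. This is only an illustration, not the actual `20-categorize.sh`: it assumes that `state_refs`, `git_refs`, and `matches` in `git-sync-status.tsv` are integer counts, and the function names `categorize` and `categorize_file` are hypothetical.

```shell
# Hypothetical sketch of the cat1..cat4 rules; column meanings are assumed.

# categorize STATE_REFS GIT_REFS MATCHES -> prints cat1, cat2, cat3 or cat4
categorize() {
  local state_refs="$1" git_refs="$2" matches="$3"
  if [ "$git_refs" -eq 0 ]; then
    echo cat2   # empty/blank: no git data on disk
  elif [ "$matches" -gt 0 ] && [ "$matches" -eq "$state_refs" ]; then
    echo cat1   # complete match: every ref in the state event matches
  elif [ "$matches" -gt 0 ]; then
    echo cat3   # partial match: some refs match
  else
    echo cat4   # git data exists but no refs match
  fi
}

# Reads rows shaped like repo|npub|state_refs|git_refs|matches|status on
# stdin and prints repo|npub|category for each data row.
categorize_file() {
  local repo npub state_refs git_refs matches _status
  while IFS='|' read -r repo npub state_refs git_refs matches _status; do
    [ -z "$repo" ] && continue          # skip blank lines
    [ "$repo" = "repo" ] && continue    # skip an optional header row
    printf '%s|%s|%s\n' "$repo" "$npub" \
      "$(categorize "$state_refs" "$git_refs" "$matches")"
  done
}
```

Checking `git_refs` first keeps an empty bare repo in cat2 even when the state event lists refs. Splitting the output into `category-{1,2,3,4}.txt` would then be a filter on the last field.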
| 132 | |||
| 133 | ## Directory Structure | ||
| 134 | |||
| 135 | ``` | ||
| 136 | work/migration-analysis-YYYYMMDD-HHMM/ | ||
| 137 | ├── prod/ | ||
| 138 | │ ├── raw/ | ||
| 139 | │ │ ├── state-events.json | ||
| 140 | │ │ ├── announcements.json | ||
| 141 | │ │ └── deletions.json | ||
| 142 | │ ├── git-sync-status.tsv | ||
| 143 | │ └── category-{1,2,3,4}.txt | ||
| 144 | ├── archive/ | ||
| 145 | │ ├── raw/ | ||
| 146 | │ │ ├── state-events.json | ||
| 147 | │ │ ├── announcements.json | ||
| 148 | │ │ └── deletions.json | ||
| 149 | │ ├── git-sync-status.tsv | ||
| 150 | │ └── category-{1,2,3,4}.txt | ||
| 151 | ├── logs/ | ||
| 152 | │ ├── parse-failures.txt | ||
| 153 | │ └── purgatory-expired.txt | ||
| 154 | ├── comparison/ | ||
| 155 | │ └── relay-gaps.txt | ||
| 156 | └── results/ | ||
| 157 | ├── no-action-required.txt | ||
| 158 | ├── action-required.txt | ||
| 159 | └── manual-investigation.txt | ||
| 160 | ``` | ||
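
The layout above could be created up front so each phase has a place to write its output. A sketch only: `make_workdir` is a hypothetical helper, and the default `work` base directory is taken from the tree above.

```shell
# Hypothetical helper: creates the analysis tree and prints its path.
# Usage: make_workdir [BASE_DIR]   (BASE_DIR defaults to "work")
make_workdir() {
  local base="${1:-work}"
  local dir
  # Timestamp follows the YYYYMMDD-HHMM pattern used in the tree above.
  dir="$base/migration-analysis-$(date +%Y%m%d-%H%M)"
  mkdir -p \
    "$dir/prod/raw" "$dir/archive/raw" \
    "$dir/logs" "$dir/comparison" "$dir/results"
  printf '%s\n' "$dir"
}
```

Capturing the printed path (for example `out=$(make_workdir)`) lets later phases share one timestamped directory instead of recomputing the name.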
| 161 | |||
| 162 | ## Prerequisites | ||
| 163 | |||
| 164 | - `nak` - Nostr Army Knife for fetching events | ||
| 165 | - `jq` - JSON processing | ||
| 166 | - SSH access to the VPS for Phases 2 and 4 | ||
| 167 | - Logging improvements in ngit-grasp for Phase 4 (see Dependencies) | ||
| 168 | |||
| 169 | ## Dependencies | ||
| 170 | |||
| 171 | Phase 4 requires structured logging in ngit-grasp. Create a separate issue to add: | ||
| 172 | |||
| 173 | ```rust | ||
| 174 | // On parse failure: | ||
| 175 | tracing::warn!( | ||
| 176 | target: "migration", | ||
| 177 | "[PARSE_FAIL] kind={} event_id={} reason=\"{}\"", | ||
| 178 | event.kind, event.id, reason | ||
| 179 | ); | ||
| 180 | |||
| 181 | // On purgatory expiry: | ||
| 182 | tracing::warn!( | ||
| 183 | target: "migration", | ||
| 184 | "[PURGATORY_EXPIRED] repo={} npub={}", | ||
| 185 | identifier, npub | ||
| 186 | ); | ||
| 187 | ``` | ||
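
Once ngit-grasp emits the log lines proposed above, the Phase 4 extraction could be as small as a pair of `sed` filters. A minimal sketch, assuming the message layout is exactly as in the Rust snippet (with `kind` rendered as a number). `parse_failures` and `purgatory_expired` are hypothetical names, not the real `30-`/`31-` scripts.

```shell
# Both filters read log lines on stdin and ignore lines without the marker.

# Prints kind|event_id|reason for each [PARSE_FAIL] entry.
parse_failures() {
  sed -n 's/.*\[PARSE_FAIL] kind=\([0-9][0-9]*\) event_id=\([^ ]*\) reason="\(.*\)".*/\1|\2|\3/p'
}

# Prints repo|npub for each [PURGATORY_EXPIRED] entry.
purgatory_expired() {
  sed -n 's/.*\[PURGATORY_EXPIRED] repo=\([^ ]*\) npub=\([^ ]*\).*/\1|\2/p'
}
```

Because the filters read from stdin, they can be fed from `journalctl -u <service>` on the VPS or from a saved log file, whichever the final scripts end up using.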
| 43 | 188 | ||
| 44 | ## Gotchas | 189 | ## Gotchas |
| 45 | 190 | ||
| 46 | Always use `nak req` with `--paginate` flag so we don't miss any events. If we receive increments of 250 eg 500 then it's a red flag that we are not paginating and there are probably more events. | 191 | - Always run `nak req` with the `--paginate` flag so we don't miss any events. If the number of events received is an exact multiple of 250 (e.g., exactly 500), that is a red flag that we are not paginating and there are probably more events. |
| 192 | - Phases 1 and 2 should run back-to-back so that the fetched events and the git data reflect the same snapshot. | ||
| 193 | - The git sync check (Phase 2) takes ~20 minutes per relay - this is the slow part. | ||
| 194 | - Existing analysis data from Jan 22 can be used for developing Phase 3/5 logic before re-running Phase 2. | ||
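
The pagination red flag from the first gotcha can be checked mechanically. A minimal sketch, assuming the event count has already been obtained (for example by counting lines, if events are stored one per line); `looks_unpaginated` is a hypothetical helper.

```shell
# looks_unpaginated COUNT
# Succeeds (exit 0) when COUNT is a positive exact multiple of 250, which
# suggests the relay's page limit was hit and more events may exist.
looks_unpaginated() {
  local count="$1"
  [ "$count" -gt 0 ] && [ $((count % 250)) -eq 0 ]
}
```

A Phase 1 script could call it after each fetch and print a warning rather than abort, since a relay can legitimately hold exactly 250 or 500 events.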