| author | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-03 14:50:22 +0000 |
|---|---|---|
| committer | DanConwayDev <DanConwayDev@protonmail.com> | 2026-02-03 15:18:23 +0000 |
| commit | 874a8abe1d076cfafd9baf919ec23d7d58200698 (patch) | |
| tree | dce0d0d36bddc496ff32f8555a8790d8dc7be7e4 /docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh | |
| parent | 9fd4350c57bbe986ebf65bf3ea4c996572e81884 (diff) | |
| parent | 92a9a3bfe0bc522e8ae411991a366a3a6310d525 (diff) | |
Merge relay.ngit.dev migration: bug fixes and migration tooling

This merge includes critical bug fixes and comprehensive migration tooling
developed during the relay.ngit.dev migration effort.

Bug Fixes:
- Fix git protocol error handling to return HTTP 200 with ERR pkt-line
- Fix naughty list false positives and DNS failure identification
- Fix database query filters in load_existing_events (remove .since())
- Fix OID fetch tracking to distinguish 0 OIDs from successful fetches
- Fix purgatory event source tracking for filtered expiry logging
- Implement OID retry logic for 'not our ref' errors

Migration Tools & Documentation:
- Complete 5-phase migration analysis pipeline with orchestration script
  - Phase 1: Event fetching from source relay
  - Phase 2: Git sync verification
  - Phase 3: Categorization and relay comparison
  - Phase 4: Log extraction (parse failures, purgatory expiry)
  - Phase 5: Action classification for migration decisions
- Comprehensive migration guide with lessons learned
- Troubleshooting guide for permission and corruption issues

Configuration:
- Add NGIT_LOG_LEVEL configuration option
- Update git throttle limits to 60/minute
- Improve logging throughout for better observability
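
The first bug fix above mentions returning an ERR pkt-line. As a rough illustration of that framing only (this is not the relay's code, and the function name `pkt_line_err` is hypothetical), a pkt-line is a four-digit hex length that counts its own four bytes, followed by the payload. A minimal sketch in bash, assuming the error text is passed as a single argument and no trailing newline is added:

```bash
#!/usr/bin/env bash
# Sketch: frame an error message as a git pkt-line "ERR" packet.
# pkt_line_err is a hypothetical helper name, not part of ngit-grasp.
pkt_line_err() {
    local LC_ALL=C                 # make ${#payload} count bytes, not characters
    local payload="ERR $1"
    # Length prefix: 4 lowercase hex digits, including the 4 prefix bytes.
    printf '%04x%s' "$(( ${#payload} + 4 ))" "$payload"
}
```

For example, `pkt_line_err "not our ref"` frames a 15-byte payload, so the prefix encodes 19 bytes in total.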
Diffstat (limited to 'docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh')
| -rwxr-xr-x | docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh | 408 |
1 files changed, 408 insertions, 0 deletions
diff --git a/docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh b/docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh
new file mode 100755
index 0000000..a0c8ad0
--- /dev/null
+++ b/docs/archive/2026-01-relay-ngit-dev-migration/scripts/31-extract-purgatory-expiry.sh
@@ -0,0 +1,408 @@
| 1 | #!/usr/bin/env bash | ||
| 2 | # | ||
| 3 | # 31-extract-purgatory-expiry.sh - Extract purgatory expiry events from systemd logs | ||
| 4 | # | ||
| 5 | # PHASE 4b of the GRASP relay to ngit-grasp migration analysis pipeline. | ||
| 6 | # Extracts structured [PURGATORY_EXPIRED] log entries from journalctl. | ||
| 7 | # | ||
| 8 | # USAGE: | ||
| 9 | # ./31-extract-purgatory-expiry.sh <service-name> <output-dir> [options] | ||
| 10 | # | ||
| 11 | # EXAMPLES: | ||
| 12 | # # Extract from ngit-grasp service (last 30 days, default) | ||
| 13 | # ./31-extract-purgatory-expiry.sh ngit-grasp.service output/logs | ||
| 14 | # | ||
| 15 | # # Extract with custom time range | ||
| 16 | # ./31-extract-purgatory-expiry.sh ngit-grasp.service output/logs --since "2026-01-01" | ||
| 17 | # | ||
| 18 | # # Extract from specific time window | ||
| 19 | # ./31-extract-purgatory-expiry.sh ngit-grasp.service output/logs --since "2026-01-15" --until "2026-01-22" | ||
| 20 | # | ||
| 21 | # OPTIONS: | ||
| 22 | # --since <date> Start date for log extraction (default: 30 days ago) | ||
| 23 | # --until <date> End date for log extraction (default: now) | ||
| 24 | # --dry-run Show what would be extracted without writing files | ||
| 25 | # | ||
| 26 | # OUTPUT: | ||
| 27 | # <output-dir>/purgatory-expired.txt | ||
| 28 | # | ||
| 29 | # OUTPUT FORMAT (TSV): | ||
| 30 | # repo<TAB>npub<TAB>timestamp<TAB>reason | ||
| 31 | # | ||
| 32 | # EXPECTED LOG FORMAT: | ||
| 33 | # The script looks for structured log entries in this format | ||
| 34 | # (shown as printed by journalctl -o short-iso, which this script uses): | ||
| 35 | # 2026-01-22T10:30:45+0000 <hostname> ngit-grasp[1234]: [PURGATORY_EXPIRED] repo=myrepo npub=npub1... reason="clone URL unreachable after 7 days" | ||
| 36 | # | ||
| 37 | # Required fields: repo, npub | ||
| 38 | # Optional fields: reason (explains why purgatory expired) | ||
| 39 | # | ||
| 40 | # BACKGROUND: | ||
| 41 | # "Purgatory" is the state where ngit-grasp has received an announcement event | ||
| 42 | # but cannot yet sync the git data (e.g., clone URL unreachable, git server down). | ||
| 43 | # After a configurable timeout (default 7 days), the repository is marked as | ||
| 44 | # expired and removed from purgatory. | ||
| 45 | # | ||
| 46 | # Purgatory expiry during migration analysis indicates repositories that: | ||
| 47 | # - Had valid announcements on the production relay | ||
| 48 | # - Could not be synced to the archive relay | ||
| 49 | # - May need manual intervention or investigation | ||
| 50 | # | ||
| 51 | # DEPENDENCY: | ||
| 52 | # This script requires logging improvements in ngit-grasp to emit structured | ||
| 53 | # [PURGATORY_EXPIRED] log entries. Until those are implemented, this script | ||
| 54 | # will find no matching entries (which is handled gracefully). | ||
| 55 | # | ||
| 56 | # See: docs/how-to/migrate-to-ngit-grasp.md (Dependencies section) | ||
| 57 | # | ||
| 58 | # Expected Rust logging code: | ||
| 59 | # tracing::warn!( | ||
| 60 | # target: "migration", | ||
| 61 | # "[PURGATORY_EXPIRED] repo={} npub={} reason=\"{}\"", | ||
| 62 | # identifier, npub, reason | ||
| 63 | # ); | ||
| 64 | # | ||
| 65 | # PREREQUISITES: | ||
| 66 | # - journalctl (systemd) | ||
| 67 | # - grep, awk (standard Unix tools) | ||
| 68 | # - Access to systemd journal (may require sudo or journal group membership) | ||
| 69 | # | ||
| 70 | # RUNTIME: Depends on log volume, typically < 30 seconds | ||
| 71 | # | ||
| 72 | # SEE ALSO: | ||
| 73 | # docs/how-to/migrate-to-ngit-grasp.md - Full migration guide | ||
| 74 | # 30-extract-parse-failures.sh - Companion script for parse failure logs | ||
| 75 | # | ||
| 76 | |||
| 77 | set -euo pipefail | ||
| 78 | |||
| 79 | # Get script directory for sourcing helpers | ||
| 80 | SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||
| 81 | |||
| 82 | # Source the service validation helper | ||
| 83 | if [[ -f "$SCRIPT_DIR/validate-service.sh" ]]; then | ||
| 84 | source "$SCRIPT_DIR/validate-service.sh" | ||
| 85 | fi | ||
| 86 | |||
| 87 | # Colors for output (disabled if not a terminal) | ||
| 88 | if [[ -t 1 ]]; then | ||
| 89 | RED='\033[0;31m' | ||
| 90 | GREEN='\033[0;32m' | ||
| 91 | YELLOW='\033[0;33m' | ||
| 92 | BLUE='\033[0;34m' | ||
| 93 | NC='\033[0m' | ||
| 94 | else | ||
| 95 | RED='' | ||
| 96 | GREEN='' | ||
| 97 | YELLOW='' | ||
| 98 | BLUE='' | ||
| 99 | NC='' | ||
| 100 | fi | ||
| 101 | |||
| 102 | log_info() { | ||
| 103 | echo -e "${BLUE}[INFO]${NC} $*" >&2 | ||
| 104 | } | ||
| 105 | |||
| 106 | log_success() { | ||
| 107 | echo -e "${GREEN}[OK]${NC} $*" >&2 | ||
| 108 | } | ||
| 109 | |||
| 110 | log_warn() { | ||
| 111 | echo -e "${YELLOW}[WARN]${NC} $*" >&2 | ||
| 112 | } | ||
| 113 | |||
| 114 | log_error() { | ||
| 115 | echo -e "${RED}[ERROR]${NC} $*" >&2 | ||
| 116 | } | ||
| 117 | |||
| 118 | usage() { | ||
| 119 | echo "Usage: $0 <service-name> <output-dir> [options]" | ||
| 120 | echo "" | ||
| 121 | echo "Arguments:" | ||
| 122 | echo " service-name Systemd service name (e.g., ngit-grasp.service)" | ||
| 123 | echo " output-dir Directory to store extracted log data" | ||
| 124 | echo "" | ||
| 125 | echo "Options:" | ||
| 126 | echo " --since <date> Start date (default: 30 days ago)" | ||
| 127 | echo " --until <date> End date (default: now)" | ||
| 128 | echo " --dry-run Show what would be extracted without writing" | ||
| 129 | echo "" | ||
| 130 | echo "Examples:" | ||
| 131 | echo " $0 ngit-grasp.service output/logs" | ||
| 132 | echo " $0 ngit-grasp.service output/logs --since '2026-01-01'" | ||
| 133 | echo " $0 ngit-grasp.service output/logs --since '2026-01-15' --until '2026-01-22'" | ||
| 134 | echo "" | ||
| 135 | echo "Expected log format:" | ||
| 136 | echo " [PURGATORY_EXPIRED] repo=myrepo npub=npub1... reason=\"...\"" | ||
| 137 | exit 1 | ||
| 138 | } | ||
| 139 | |||
| 140 | # Parse a single log line and extract fields | ||
| 141 | # Input: log line containing [PURGATORY_EXPIRED] | ||
| 142 | # Output: TSV line: repo<TAB>npub<TAB>timestamp<TAB>reason | ||
| 143 | parse_log_line() { | ||
| 144 | local line="$1" | ||
| 145 | |||
| 146 | # Fields are pulled out of a journalctl short-iso line with grep -oP (requires GNU grep). | ||
| 147 | # Timestamp format: 2026-01-22T10:30:45+0000 or similar; the timezone offset is not captured. | ||
| 148 | local timestamp repo npub reason | ||
| 149 | |||
| 150 | # Extract ISO timestamp (date and time only) from beginning of line | ||
| 151 | timestamp=$(echo "$line" | grep -oP '^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}' || echo "") | ||
| 152 | |||
| 153 | # Extract repo=VALUE (unquoted identifier) | ||
| 154 | repo=$(echo "$line" | grep -oP 'repo=\K[^ ]+' || echo "") | ||
| 155 | |||
| 156 | # Extract npub=VALUE (npub1... format) | ||
| 157 | npub=$(echo "$line" | grep -oP 'npub=\K[^ ]+' || echo "") | ||
| 158 | |||
| 159 | # Extract reason="VALUE" (quoted string, optional) | ||
| 160 | reason=$(echo "$line" | grep -oP 'reason="\K[^"]*' || echo "") | ||
| 161 | |||
| 162 | # Only output if we have the required fields | ||
| 163 | if [[ -n "$repo" && -n "$npub" ]]; then | ||
| 164 | printf '%s\t%s\t%s\t%s\n' "$repo" "$npub" "$timestamp" "$reason" | ||
| 165 | fi | ||
| 166 | } | ||
| 167 | |||
| 168 | # Main | ||
| 169 | main() { | ||
| 170 | if [[ $# -lt 2 ]]; then | ||
| 171 | usage | ||
| 172 | fi | ||
| 173 | |||
| 174 | local service="$1" | ||
| 175 | local output_dir="$2" | ||
| 176 | shift 2 | ||
| 177 | |||
| 178 | # Default time range: last 30 days | ||
| 179 | local since_date | ||
| 180 | since_date=$(date -d "30 days ago" "+%Y-%m-%d" 2>/dev/null || date -v-30d "+%Y-%m-%d" 2>/dev/null || echo "") | ||
| 181 | local until_date="" | ||
| 182 | local dry_run=false | ||
| 183 | |||
| 184 | # Parse options | ||
| 185 | while [[ $# -gt 0 ]]; do | ||
| 186 | case "$1" in | ||
| 187 | --since) | ||
| 188 | since_date="${2:?--since requires a date argument}" | ||
| 189 | shift 2 | ||
| 190 | ;; | ||
| 191 | --until) | ||
| 192 | until_date="${2:?--until requires a date argument}" | ||
| 193 | shift 2 | ||
| 194 | ;; | ||
| 195 | --dry-run) | ||
| 196 | dry_run=true | ||
| 197 | shift | ||
| 198 | ;; | ||
| 199 | *) | ||
| 200 | log_error "Unknown option: $1" | ||
| 201 | usage | ||
| 202 | ;; | ||
| 203 | esac | ||
| 204 | done | ||
| 205 | |||
| 206 | # Normalize service name: append the .service suffix if it is missing | ||
| 207 | if [[ ! "$service" =~ \.service$ ]]; then | ||
| 208 | service="${service}.service" | ||
| 209 | fi | ||
| 210 | |||
| 211 | # Validate service is appropriate for structured logging | ||
| 212 | # This prevents the common mistake of using ngit-relay instead of ngit-grasp | ||
| 213 | if type validate_service_for_structured_logging &>/dev/null; then | ||
| 214 | # Use non-interactive mode if not a terminal, skip log check (we'll do our own) | ||
| 215 | local interactive="true" | ||
| 216 | [[ ! -t 0 ]] && interactive="false" | ||
| 217 | |||
| 218 | if ! validate_service_for_structured_logging "$service" "false" "$interactive"; then | ||
| 219 | log_error "Service validation failed. Use an ngit-grasp service for structured logging." | ||
| 220 | exit 1 | ||
| 221 | fi | ||
| 222 | else | ||
| 223 | # Fallback validation if helper not available | ||
| 224 | if [[ "$service" == *"ngit-relay"* ]]; then | ||
| 225 | log_error "Service name appears to be ngit-relay: $service" | ||
| 226 | log_error "Structured logging ([PURGATORY_EXPIRED]) only exists in ngit-grasp services." | ||
| 227 | log_error "Please use the ngit-grasp archive service instead." | ||
| 228 | log_error "" | ||
| 229 | log_error "To find the correct service:" | ||
| 230 | log_error " systemctl list-units 'ngit-grasp*' --all" | ||
| 231 | exit 1 | ||
| 232 | fi | ||
| 233 | fi | ||
| 234 | |||
| 235 | log_info "Extracting purgatory expiry events from systemd logs" | ||
| 236 | log_info "Service: $service" | ||
| 237 | log_info "Output: $output_dir" | ||
| 238 | log_info "Time range: ${since_date:-beginning} to ${until_date:-now}" | ||
| 239 | |||
| 240 | # Check if journalctl is available | ||
| 241 | if ! command -v journalctl &> /dev/null; then | ||
| 242 | log_error "journalctl not found. This script requires systemd." | ||
| 243 | exit 1 | ||
| 244 | fi | ||
| 245 | |||
| 246 | # Validate service exists (check if journalctl can find any logs for it) | ||
| 247 | # Note: We don't require the service to be running, just that it has logs | ||
| 248 | if ! journalctl --no-pager -u "$service" -n 1 &>/dev/null; then | ||
| 249 | log_warn "Could not query logs for service: $service" | ||
| 250 | log_warn "This may indicate the service doesn't exist or you lack permissions." | ||
| 251 | log_warn "" | ||
| 252 | log_warn "To list available ngit-grasp services:" | ||
| 253 | log_warn " systemctl list-units 'ngit-grasp*' --all" | ||
| 254 | log_warn " journalctl --list-boots # Check if you have journal access" | ||
| 255 | log_warn "" | ||
| 256 | # Continue anyway - the service might exist but have no logs yet | ||
| 257 | fi | ||
| 258 | |||
| 259 | # Build journalctl command (values are shell-escaped with printf %q because the string is run via eval) | ||
| 260 | local journal_cmd="journalctl -u $(printf '%q' "$service") --no-pager -o short-iso" | ||
| 261 | |||
| 262 | if [[ -n "$since_date" ]]; then | ||
| 263 | journal_cmd="$journal_cmd --since $(printf '%q' "$since_date")" | ||
| 264 | fi | ||
| 265 | |||
| 266 | if [[ -n "$until_date" ]]; then | ||
| 267 | journal_cmd="$journal_cmd --until $(printf '%q' "$until_date")" | ||
| 268 | fi | ||
| 269 | |||
| 270 | log_info "Running: $journal_cmd | grep '\\[PURGATORY_EXPIRED\\]'" | ||
| 271 | |||
| 272 | if [[ "$dry_run" == true ]]; then | ||
| 273 | log_info "[DRY RUN] Would extract to: $output_dir/purgatory-expired.txt" | ||
| 274 | |||
| 275 | # Show sample of what would be extracted | ||
| 276 | log_info "Checking for matching log entries..." | ||
| 277 | local sample_count | ||
| 278 | sample_count=$(eval "$journal_cmd" 2>/dev/null | grep -c '\[PURGATORY_EXPIRED\]' || true) # grep -c already prints 0 when nothing matches | ||
| 279 | sample_count="${sample_count//[^0-9]/}" # Strip non-numeric characters | ||
| 280 | sample_count="${sample_count:-0}" | ||
| 281 | log_info "Found $sample_count matching log entries" | ||
| 282 | |||
| 283 | if [[ "$sample_count" -eq 0 ]]; then | ||
| 284 | log_warn "No [PURGATORY_EXPIRED] entries found in logs." | ||
| 285 | log_warn "This is expected if ngit-grasp logging improvements are not yet deployed." | ||
| 286 | log_warn "See: docs/how-to/migrate-to-ngit-grasp.md (Dependencies section)" | ||
| 287 | fi | ||
| 288 | |||
| 289 | exit 0 | ||
| 290 | fi | ||
| 291 | |||
| 292 | # Create output directory | ||
| 293 | mkdir -p "$output_dir" | ||
| 294 | |||
| 295 | local output_file="$output_dir/purgatory-expired.txt" | ||
| 296 | local temp_file | ||
| 297 | temp_file=$(mktemp); trap "rm -f '$temp_file'" EXIT # removed on every exit path, including the early exit below | ||
| 298 | |||
| 299 | # Extract and parse log entries | ||
| 300 | log_info "Extracting log entries..." | ||
| 301 | |||
| 302 | # Get raw log lines containing [PURGATORY_EXPIRED] | ||
| 303 | # Capture stderr separately to detect journalctl errors | ||
| 304 | local raw_lines journal_stderr journal_exit | ||
| 305 | local temp_stderr | ||
| 306 | temp_stderr=$(mktemp) | ||
| 307 | |||
| 308 | raw_lines=$(eval "$journal_cmd" 2>"$temp_stderr" | grep '\[PURGATORY_EXPIRED\]' || true) | ||
| 309 | journal_exit=$? # always 0 because of '|| true'; journalctl problems are reported via stderr below | ||
| 310 | journal_stderr=$(cat "$temp_stderr" 2>/dev/null || true) | ||
| 311 | rm -f "$temp_stderr" | ||
| 312 | |||
| 313 | # Report any journalctl errors (but don't fail - empty logs are valid) | ||
| 314 | if [[ -n "$journal_stderr" ]]; then | ||
| 315 | log_warn "journalctl reported: $journal_stderr" | ||
| 316 | fi | ||
| 317 | |||
| 318 | if [[ -z "$raw_lines" ]]; then | ||
| 319 | log_warn "No [PURGATORY_EXPIRED] entries found in logs." | ||
| 320 | log_warn "" | ||
| 321 | log_warn "This is expected if ngit-grasp logging improvements are not yet deployed." | ||
| 322 | log_warn "The structured log format required by this script:" | ||
| 323 | log_warn "" | ||
| 324 | log_warn " [PURGATORY_EXPIRED] repo=myrepo npub=npub1... reason=\"...\"" | ||
| 325 | log_warn "" | ||
| 326 | log_warn "See: docs/how-to/migrate-to-ngit-grasp.md (Dependencies section)" | ||
| 327 | log_warn "" | ||
| 328 | |||
| 329 | # Create empty output file with header comment | ||
| 330 | { | ||
| 331 | echo "# Purgatory expiry events extracted from $service" | ||
| 332 | echo "# Time range: ${since_date:-beginning} to ${until_date:-now}" | ||
| 333 | echo "# Extracted: $(date -Iseconds)" | ||
| 334 | echo "# Format: repo<TAB>npub<TAB>timestamp<TAB>reason" | ||
| 335 | echo "#" | ||
| 336 | echo "# NOTE: No [PURGATORY_EXPIRED] entries found." | ||
| 337 | echo "# This is expected if ngit-grasp logging improvements are not yet deployed." | ||
| 338 | } > "$output_file" | ||
| 339 | |||
| 340 | log_info "Created empty output file: $output_file" | ||
| 341 | exit 0 | ||
| 342 | fi | ||
| 343 | |||
| 344 | # Write header | ||
| 345 | { | ||
| 346 | echo "# Purgatory expiry events extracted from $service" | ||
| 347 | echo "# Time range: ${since_date:-beginning} to ${until_date:-now}" | ||
| 348 | echo "# Extracted: $(date -Iseconds)" | ||
| 349 | echo "# Format: repo<TAB>npub<TAB>timestamp<TAB>reason" | ||
| 350 | } > "$output_file" | ||
| 351 | |||
| 352 | # Parse each line | ||
| 353 | local count=0 | ||
| 354 | while IFS= read -r line; do | ||
| 355 | local parsed | ||
| 356 | parsed=$(parse_log_line "$line") | ||
| 357 | if [[ -n "$parsed" ]]; then | ||
| 358 | echo "$parsed" >> "$output_file" | ||
| 359 | count=$((count + 1)) | ||
| 360 | fi | ||
| 361 | done <<< "$raw_lines" | ||
| 362 | |||
| 363 | rm -f "$temp_file" | ||
| 364 | |||
| 365 | # Summary | ||
| 366 | echo "" | ||
| 367 | log_info "=== Extraction Summary ===" | ||
| 368 | log_info "Service: $service" | ||
| 369 | log_info "Time range: ${since_date:-beginning} to ${until_date:-now}" | ||
| 370 | log_success "Extracted $count purgatory expiry entries" | ||
| 371 | echo "" | ||
| 372 | log_info "Output file: $output_file" | ||
| 373 | |||
| 374 | if [[ $count -gt 0 ]]; then | ||
| 375 | echo "" | ||
| 376 | log_info "Sample entries (first 5):" | ||
| 377 | # Use a subshell to avoid SIGPIPE issues with set -e | ||
| 378 | (tail -n +5 "$output_file" | head -5 | while IFS=$'\t' read -r repo npub timestamp reason; do | ||
| 379 | echo " repo=$repo npub=${npub:0:20}... timestamp=$timestamp" | ||
| 380 | done) || true | ||
| 381 | fi | ||
| 382 | |||
| 383 | # Show unique repos affected | ||
| 384 | if [[ $count -gt 0 ]]; then | ||
| 385 | echo "" | ||
| 386 | local unique_repos | ||
| 387 | unique_repos=$(tail -n +5 "$output_file" | awk -F'\t' '{print $1}' | sort -u | wc -l) | ||
| 388 | log_info "Unique repositories affected: $unique_repos" | ||
| 389 | |||
| 390 | echo "" | ||
| 391 | log_info "Repositories with purgatory expiry:" | ||
| 392 | # Use a subshell to avoid SIGPIPE issues with set -e | ||
| 393 | (tail -n +5 "$output_file" | awk -F'\t' '{print $1}' | sort | uniq -c | sort -rn | head -10 | while read -r cnt repo; do | ||
| 394 | echo " $repo: $cnt expiry events" | ||
| 395 | done) || true | ||
| 396 | |||
| 397 | local total_repos | ||
| 398 | total_repos="$unique_repos" # reuse the unique-repository count computed above | ||
| 399 | if [[ $total_repos -gt 10 ]]; then | ||
| 400 | echo " ... and $((total_repos - 10)) more repositories" | ||
| 401 | fi | ||
| 402 | fi | ||
| 403 | |||
| 404 | # Explicit success exit | ||
| 405 | exit 0 | ||
| 406 | } | ||
| 407 | |||
| 408 | main "$@" | ||
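
The summary step of the script skips the header with `tail -n +5`, which depends on the header being exactly four lines long. As a hedged alternative for consuming the TSV output downstream, the sketch below filters on the `#` prefix instead. The function name `summarise_expiries` is hypothetical and not part of the commit; it assumes the `repo<TAB>npub<TAB>timestamp<TAB>reason` format documented in the script header.

```bash
#!/usr/bin/env bash
# Sketch: count expiry events per repository in purgatory-expired.txt.
# summarise_expiries is a hypothetical helper, not part of the commit.
summarise_expiries() {
    local file="$1"
    awk -F'\t' '
        /^#/ { next }                        # skip header and comment lines
        NF >= 2 && $1 != "" { n[$1]++ }      # column 1 is the repo identifier
        END { for (r in n) printf "%d\t%s\n", n[r], r }
    ' "$file" | sort -t $'\t' -k1,1nr -k2,2  # highest count first, then by name
}
```

Called as `summarise_expiries output/logs/purgatory-expired.txt`, it prints one `count<TAB>repo` line per repository, and it also works on the empty-result file, whose header is longer than four lines.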