upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/docs/how-to/migrate-to-ngit-grasp.md
Diffstat (limited to 'docs/how-to/migrate-to-ngit-grasp.md')
-rw-r--r-- docs/how-to/migrate-to-ngit-grasp.md | 314
1 files changed, 314 insertions, 0 deletions
diff --git a/docs/how-to/migrate-to-ngit-grasp.md b/docs/how-to/migrate-to-ngit-grasp.md
index 62cad87..abe2191 100644
--- a/docs/how-to/migrate-to-ngit-grasp.md
+++ b/docs/how-to/migrate-to-ngit-grasp.md
@@ -714,3 +714,317 @@ This section documents the specific configuration and lessons learned from migra
2. **Investigate 5 edge cases**: Manual review of unusual states
3. **Monitor purgatory**: 382 expired entries indicate sync issues to investigate
4. **Plan cutover**: Once re-sync complete, switch DNS/proxy to ngit-grasp

## ngit-relay Troubleshooting

This section covers common issues encountered when running ngit-relay in production, including git permission errors and repository corruption. These issues were discovered during the relay.ngit.dev migration and may affect other deployments.

### Git Permission Denied Errors

#### Symptoms

When cloning repositories, you see:

```bash
$ git clone https://relay.ngit.dev/npub.../repo.git
Cloning into 'repo'...
remote: warning: unable to access '/root/.config/git/attributes': Permission denied
```

Or in container logs:

```
warning: unable to access '/root/.config/git/attributes': Permission denied
```

#### Explanation

This occurs when:
1. Git operations run as a non-root user (typically `nginx` user, UID 101)
2. Git tries to access `/root/.config/git/attributes` for global git configuration
3. The `/root` directory has permissions `0700` (drwx------), preventing non-root users from traversing into it
4. Even though the `attributes` file itself may be world-readable, the nginx user cannot reach it due to parent directory permissions

**Root cause:** The container runs git commands via fcgiwrap as the nginx user, but `/root` is only accessible by root.
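
The traversal rule described above can be checked mechanically. The following is a minimal sketch, not part of ngit-relay: `blocked_dirs` is a hypothetical helper name, it assumes a `stat` that supports `-c '%a'` (GNU coreutils or BusyBox), and it only models the "other" permission class, which is the one that applies to the nginx user when it neither owns the directory nor belongs to its group.

```shell
#!/usr/bin/env bash
# blocked_dirs PATH  (hypothetical helper, for illustration only)
# Print "<mode> <dir>" for every parent directory of PATH whose "other"
# execute bit is unset, i.e. every directory an unrelated user cannot traverse.
blocked_dirs() {
  local dir mode
  dir=$(dirname -- "$1")
  while :; do
    mode=$(stat -c '%a' -- "$dir") || return 1
    # The lowest bit of the octal mode is the "other" execute bit.
    if (( (8#$mode & 1) == 0 )); then
      printf '%s %s\n' "$mode" "$dir"
    fi
    # Stop at the filesystem root (or "." for relative paths).
    if [ "$dir" = "/" ] || [ "$dir" = "." ]; then
      break
    fi
    dir=$(dirname -- "$dir")
  done
}
```

Run inside the container, for example as `blocked_dirs /root/.config/git/attributes`; with the permissions described above it would be expected to list `/root` with mode `700`.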

#### Quick Fix (Temporary - Does Not Survive Container Restart)

This fix resolves the issue immediately but will be lost when containers restart:

```bash
# For each ngit-relay container, exec in and create the git config directory
sudo podman exec <container-name> sh -c "mkdir -p /root/.config/git && touch /root/.config/git/attributes && chmod 644 /root/.config/git/attributes"

# Example for specific containers:
sudo podman exec gitnostr-com-ngit-relay sh -c "mkdir -p /root/.config/git && touch /root/.config/git/attributes && chmod 644 /root/.config/git/attributes"

sudo podman exec relay-ngit-dev-ngit-relay sh -c "mkdir -p /root/.config/git && touch /root/.config/git/attributes && chmod 644 /root/.config/git/attributes"
```

**Important:** For a permanent solution, see the NixOS configuration below.

#### Permanent Fix (NixOS Configuration)

For NixOS deployments, add a systemd service per container that automatically fixes `/root` permissions after each container start:

```nix
# In your ngit-relay service configuration (e.g., services/relay-ngit-dev-ngit-relay.nix)

systemd.services.relay-ngit-dev-fix-root-perms = {
  description = "Fix /root permissions in relay.ngit.dev container for git access";
  after = [ "podman-relay-ngit-dev-ngit-relay.service" ];
  requires = [ "podman-relay-ngit-dev-ngit-relay.service" ];
  wantedBy = [ "multi-user.target" ];
  serviceConfig = {
    Type = "oneshot";
    RemainAfterExit = true;
    ExecStart = "${pkgs.bash}/bin/bash -c 'sleep 5 && ${pkgs.podman}/bin/podman exec relay-ngit-dev-ngit-relay chmod 711 /root'";
    Restart = "on-failure";
    RestartSec = "10s";
  };
};
```

This changes `/root` permissions from `0700` to `0711`, allowing the nginx user to traverse through `/root` to reach `/root/.config/git/`.

**Why 711?**
- `7` (owner/root): Full read/write/execute
- `1` (group): Execute only (traverse)
- `1` (other): Execute only (traverse)

This allows non-root users to traverse through `/root` to reach subdirectories whose path they already know, while still preventing them from listing the contents of `/root`. Files and directories below `/root` remain protected by their own permissions.
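
The unit above waits a fixed five seconds before running `podman exec`, and relies on `Restart = "on-failure"` if the container is not yet ready. As an alternative, the wait can be replaced by a bounded retry loop. This is only a sketch under assumptions: `retry_until` is a hypothetical helper, not something ngit-relay or podman provides, and the attempt count and delay are illustrative.

```shell
#!/usr/bin/env bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]  (hypothetical helper)
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
retry_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    # Do not sleep after the final failed attempt.
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}
```

A call such as `retry_until 10 1 podman exec relay-ngit-dev-ngit-relay chmod 711 /root` would then stand in for the `sleep 5 && ...` command; wiring it into the Nix unit (for instance through a wrapper script) is left as an assumption.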

#### Verification

After applying the fix:

```bash
# Test that cloning works without permission warnings
git clone https://relay.ngit.dev/npub.../repo.git

# Should clone successfully with no "Permission denied" warnings

# Verify /root permissions inside container
sudo podman exec relay-ngit-dev-ngit-relay ls -ld /root
# Should show: drwx--x--x (711)

# Verify nginx user can access git config
sudo podman exec relay-ngit-dev-ngit-relay su -s /bin/sh nginx -c "cat /root/.config/git/attributes"
# Should succeed without "Permission denied"
```

### Git Repository Corruption

#### Symptoms

When cloning repositories, you see:

```bash
$ git clone https://relay.ngit.dev/npub.../repo.git
Cloning into 'repo'...
remote: fatal: bad tree object 8b765235809eb27159657eb4c97fb37d21c29bf0
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
```

Or when running `git fsck` on the server:

```
broken link from tree 7d60270e1904c30ae6cef7b465ef842a9f9f63c3
  to tree 8b765235809eb27159657eb4c97fb37d21c29bf0
missing tree 8b765235809eb27159657eb4c97fb37d21c29bf0
```

#### Explanation

Repository corruption typically occurs due to:

1. **Incomplete push operations**: A git push was interrupted mid-transfer, creating a commit that references objects that were never written to disk
2. **Permission issues during push**: The git-receive-pack process couldn't write objects due to permission problems (e.g., files owned by wrong user)
3. **Disk/filesystem issues**: Rare cases of disk errors or filesystem corruption

**Common pattern:** A commit exists with references to tree objects, but those tree objects are missing from the repository. Sometimes individual blobs (files) exist as "dangling" objects but were never properly linked into the tree structure.

**Warning signs:**
- HEAD file or objects owned by root when they should be owned by the service user (UID 101)
- Dangling blobs in `git fsck` output
- Recent permission denied errors in logs
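
The first warning sign can be checked with `find`. A minimal sketch, assuming the service user is UID 101 as stated above; `wrong_owner` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# wrong_owner REPO [EXPECTED_UID]  (hypothetical helper)
# List every path under REPO that is NOT owned by EXPECTED_UID (default 101).
# Empty output means ownership is consistent.
wrong_owner() {
  local repo=$1 uid=${2:-101}
  # find treats a numeric -user argument as a UID when no user has that name.
  find "$repo" ! -user "$uid" -print
}
```

For example, `wrong_owner /persistent/relay-ngit-dev-ngit-relay/data/repos/npub.../repo.git` would print a root-owned `HEAD` file if one exists.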

#### How to Fix

**Step 1: Locate the corrupted repository**

```bash
# SSH to the server
ssh dc@ngit.dev

# Find the repository path
# For relay.ngit.dev: /persistent/relay-ngit-dev-ngit-relay/data/repos/npub.../repo.git
# For gitnostr.com: /persistent/gitnostr-com-ngit-relay/data/repos/npub.../repo.git

cd /persistent/relay-ngit-dev-ngit-relay/data/repos/npub1c03rad0r6q833vh57kyd3ndu2jry30nkr0wepqfpsm05vq7he25slryrnw/axepool.git
```

**Step 2: Diagnose the corruption**

```bash
# Run git fsck to identify missing/corrupted objects
git fsck --full

# Example output:
# broken link from tree 7d60270e1904c30ae6cef7b465ef842a9f9f63c3
#   to tree 8b765235809eb27159657eb4c97fb37d21c29bf0
# missing tree 8b765235809eb27159657eb4c97fb37d21c29bf0
# dangling blob 94490b902c9bceb6f901cd0c7c25b685e3685d87

# Check which commit references the missing object
git log --all --oneline | head -10

# Inspect the broken commit
git cat-file -p <commit-hash>
# This will show which tree is missing
```
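
When `git fsck` prints many lines, it can help to reduce the output to just the missing or dangling objects. The following sketch assumes the three-field line format shown in the example output above (`missing tree <hash>`, `dangling blob <hash>`); `fsck_objects` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# fsck_objects STATE  (hypothetical helper)
# Read `git fsck --full` output on stdin and print "<type> <hash>" for every
# line of the form "STATE <type> <hash>", where STATE is e.g. "missing" or
# "dangling". Duplicates are removed.
fsck_objects() {
  awk -v state="$1" '$1 == state && NF == 3 { print $2, $3 }' | sort -u
}
```

Typical use would be `git fsck --full 2>&1 | fsck_objects missing`, and `fsck_objects dangling` to list candidates for the manual reconstruction step.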
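
**Step 3: Attempt automatic repair**

Try these in order:

```bash
# Back up the repository first: --prune=now deletes unreachable objects,
# including dangling blobs that Step 4 may need for reconstruction.

# Option A: Repack and garbage collect
git gc --aggressive --prune=now

# Then check if corruption is fixed
git fsck --full

# Option B: If that doesn't work, try recovering from pack files.
# A bare repository keeps packs under objects/pack (there is no .git directory).
# git unpack-objects skips objects the repository already has, so move the
# packs out of the object store before unpacking them one at a time.
mkdir -p /tmp/packs && mv objects/pack/* /tmp/packs/
for pack in /tmp/packs/*.pack; do git unpack-objects < "$pack"; done
git fsck --full
```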
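
Because `git gc --prune=now` removes unreachable objects permanently, taking a copy of the repository before any repair attempt is a cheap safeguard. A minimal sketch, assuming there is enough disk space next to the repository; `snapshot_repo` and the `.bak-<timestamp>` naming are hypothetical.

```shell
#!/usr/bin/env bash
# snapshot_repo REPO  (hypothetical helper)
# Copy REPO to a sibling directory named REPO.bak-<timestamp>, preserving
# ownership and permissions where possible, and print the snapshot path.
snapshot_repo() {
  local repo=${1%/} dest
  dest="${repo}.bak-$(date +%Y%m%d%H%M%S)"
  cp -a -- "$repo" "$dest" && printf '%s\n' "$dest"
}
```

On the server this would typically need `sudo`, since the repository files belong to UID 101.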

**Step 4: Manual reconstruction (if automatic repair fails)**

If the missing tree object can be reconstructed from dangling blobs:

```bash
# 1. Identify what should be in the missing tree
# Look at the commit message and nearby commits to understand the structure

# 2. Find dangling blobs that might belong to the tree
git fsck --full | grep "dangling blob"

# 3. Examine each dangling blob to identify files
git cat-file -p 94490b902c9bceb6f901cd0c7c25b685e3685d87

# 4. Reconstruct the tree manually
# This requires creating a new tree object with the correct structure.
# git mktree expects "<mode> <type> <hash>", then a TAB, then the filename,
# so printf writes the TAB explicitly (it repeats the format for each pair).
# Example (advanced):
printf '100644 blob %s\t%s\n' \
  "<blob-hash>" "filename1.rs" \
  "<blob-hash>" "filename2.rs" | git mktree
# This outputs a new tree hash. If it equals the missing tree's hash, the
# original commit is whole again and steps 5-6 are unnecessary.

# 5. Create a new commit with the fixed tree
# (<new-tree-hash> must be the commit's root tree; if the rebuilt tree is a
# subdirectory, rebuild the root tree around it with git mktree first)
git commit-tree <new-tree-hash> -p <parent-commit> -m "Reconstructed commit message"
# This outputs a new commit hash

# 6. Update the branch reference
git update-ref refs/heads/<branch-name> <new-commit-hash>

# 7. Clean up
git gc --prune=now
```
939**Step 5: Verify the fix**
940
941```bash
942# Run fsck again - should show no errors
943git fsck --full
944
945# Test clone locally
946git clone /path/to/repo.git /tmp/test-clone
947
948# Test clone via HTTP
949git clone https://relay.ngit.dev/npub.../repo.git /tmp/test-clone-http
950```
951
952**Step 6: Fix ownership and permissions**
953
954Ensure all repository files are owned by the correct user:
955
956```bash
957# For ngit-relay containers, files should be owned by UID 101 (nginx user)
958sudo chown -R 101:101 /persistent/relay-ngit-dev-ngit-relay/data/repos/npub.../repo.git
959
960# Verify
961ls -la /persistent/relay-ngit-dev-ngit-relay/data/repos/npub.../repo.git
962```
963
964**Step 7: Replicate fix to other instances (if applicable)**
965
966If you have multiple relay instances (e.g., gitnostr.com and relay.ngit.dev), replicate the fix:
967
968```bash
969# Copy the repaired pack files
970sudo cp /persistent/relay-ngit-dev-ngit-relay/data/repos/npub.../repo.git/objects/pack/* \
971 /persistent/gitnostr-com-ngit-relay/data/repos/npub.../repo.git/objects/pack/
972
973# Update the branch reference
974cd /persistent/gitnostr-com-ngit-relay/data/repos/npub.../repo.git
975git update-ref refs/heads/<branch-name> <new-commit-hash>
976
977# Fix ownership
978sudo chown -R 101:101 /persistent/gitnostr-com-ngit-relay/data/repos/npub.../repo.git
979
980# Clean up
981git gc --prune=now
982```

#### Prevention

To prevent future corruption:

1. **Fix permission issues first**: Ensure the permission denied errors are resolved (see previous section)
2. **Monitor for root-owned files**: Files in git repositories should be owned by UID 101, not root
3. **Check disk health**: Run `df -h` to confirm there is free space and `smartctl -H <device>` to check that the disk is healthy
4. **Enable git fsck in monitoring**: Periodically run `git fsck` on repositories to catch corruption early

```bash
# Add to monitoring/cron (example)
# IFS= and -r keep repository paths intact even with spaces or backslashes
find /persistent/*/data/repos -name "*.git" -type d | while IFS= read -r repo; do
  echo "Checking $repo"
  git -C "$repo" fsck --full 2>&1 | grep -v "^Checking\|^dangling"
done
```

#### Real-World Example: axepool.git Corruption

During the relay.ngit.dev migration, the `axepool.git` repository was corrupted:

**Problem:**
- Commit `e84518b` referenced tree `8b765235...` (the `src` directory)
- Tree `8b765235...` was missing from the repository
- Blob `94490b90...` (mint_client.rs) existed as a dangling object but wasn't linked

**Root cause:**
- An incomplete push operation
- Permission issues (HEAD file was owned by root)
- The commit was created but the tree object was never written

**Solution:**
1. Identified the missing tree should contain: `lib.rs`, `main.rs`, `mint_client.rs`
2. Found the dangling blob `94490b90...` was `mint_client.rs`
3. Reconstructed the `src` tree with all three files
4. Created new commit `e12bc3cf...` with the fixed tree
5. Updated `refs/heads/add-missing-hooks` to point to the new commit
6. Ran `git gc --prune=now` to clean up
7. Replicated fix to gitnostr.com instance

**Result:** Both relays now clone successfully with all files intact.

### Additional Resources

- **ngit-relay repository**: https://github.com/danconwaydev/ngit-relay
- **Git internals documentation**: https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain
- **Podman documentation**: https://docs.podman.io/