upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

path: root/docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md
Diffstat (limited to 'docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md')
-rw-r--r-- docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md 714
1 files changed, 0 insertions, 714 deletions
diff --git a/docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md b/docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md
deleted file mode 100644
index 26d0526..0000000
--- a/docs/archive/2025-11-04-evening/2025-11-04-git-http-backend-deep-dive.md
+++ /dev/null
@@ -1,714 +0,0 @@
**ARCHIVED: 2025-11-04**
**Reason:** Analysis complete, crate validated
**Outcome:** Confirmed suitable for use (with fork for authorization)

---

# git-http-backend Crate Deep Dive

**Date:** 2025-11-04
**Status:** ✅ ARCHIVED - Analysis Complete
**Purpose:** Validate the recommendation in `work/current_status.md` regarding git-http-backend crate

---

## Executive Summary

**Recommendation Status:** ✅ **VALIDATED WITH CAVEATS**

The `git-http-backend` crate (v0.1.3) is a **good foundation** but requires significant customization for our inline authorization needs. The hybrid approach recommended in `current_status.md` is sound, but we'll need to:

1. **Fork or vendor** the crate for customization
2. **Add interception points** for authorization
3. **Enhance error handling** for better push rejection messages
4. **Add CORS support** (missing from current implementation)

---

## Crate Overview

### Basic Info
- **Name:** `git-http-backend`
- **Version:** 0.1.3
- **Author:** lazhenyi
- **License:** MIT
- **Repository:** https://github.com/lazhenyi/git-http-backend
- **Documentation:** https://docs.rs/git-http-backend/0.1.3

### Dependencies
```toml
tokio = { version = "1", features = ["sync","macros","rt", "rt-multi-thread","net"] }
actix-web = { version = "4.9.0", features = ["default"] }
actix-files = { version = "0.6.6", features = ["actix-server"] }
futures-util = { version = "0.3.31", features = ["futures-channel"] }
flate2 = "1.0.35" # Gzip compression
async-stream = "0.3.6" # Streaming responses
async-trait = "0.1.83" # Async trait support
```

**Good news:** Already uses actix-web 4.9.0 (same as we plan to use)

---

## Architecture Analysis

### Core Design

The crate provides:

1. **GitConfig Trait** - Path rewriting abstraction
2. **Actix Router** - Pre-configured routes for Git Smart HTTP
3. **Protocol Handlers** - Upload-pack, receive-pack, info/refs
4. **System Git Integration** - Spawns `git` subprocess

### URL Structure

```
/{namespace}/{repo}/info/refs?service=git-upload-pack
/{namespace}/{repo}/git-upload-pack
/{namespace}/{repo}/git-receive-pack
/{namespace}/{repo}/HEAD
/{namespace}/{repo}/objects/info/packs
/{namespace}/{repo}/objects/pack/{pack}
```

**Good match** for our `/{npub}/{identifier}.git/` structure: the two leading segments line up, with `{namespace}` carrying the npub and `{repo}` carrying `{identifier}.git`.
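
To make the path mapping concrete, here is a minimal sketch of how the two leading segments could be pulled out of a request path. The function name `extract_repo_info` mirrors the helper assumed later in this document; its signature, the `npub1` prefix check, and the rejection of dot-prefixed identifiers are assumptions for illustration, not part of the crate.

```rust
/// Split `/{npub}/{identifier}.git/...` into `(npub, identifier)`.
/// Hypothetical helper: returns `None` for anything that does not fit the layout.
pub fn extract_repo_info(path: &str) -> Option<(String, String)> {
    let mut segments = path.trim_start_matches('/').split('/');
    let npub = segments.next()?;
    let repo = segments.next()?;

    // Assumption: only the bech32 prefix is checked here, not the checksum.
    if !npub.starts_with("npub1") {
        return None;
    }

    // The repo segment must end in `.git`; the identifier is what precedes it.
    let identifier = repo.strip_suffix(".git")?;

    // Reject empty and dot-prefixed identifiers (guards against `..` traversal).
    if identifier.is_empty() || identifier.starts_with('.') {
        return None;
    }

    Some((npub.to_string(), identifier.to_string()))
}
```

Returning `None` rather than panicking keeps malformed URLs on a normal 4xx path in whichever handler calls this.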

### Request Flow

```
HTTP Request
    ↓
Actix Router → Handler Function
    ↓
GitConfig::rewrite() → Path resolution
    ↓
Spawn git subprocess (upload-pack/receive-pack)
    ↓
Stream response back to client
```

---

## Key Handlers Analysis

### 1. info/refs Handler (refs.rs)

**Purpose:** Advertise repository refs (clone/fetch/push discovery)

**Flow:**
1. Parse `service` query param (`git-upload-pack` or `git-receive-pack`)
2. Resolve repository path via `GitConfig::rewrite()`
3. Spawn `git upload-pack --stateless-rpc --advertise-refs .` (or `receive-pack`, depending on `service`)
4. Return with proper content-type header

**Code:**
```rust
// Abridged excerpt; bindings that are not shown are noted in comments.
pub async fn info_refs(request: HttpRequest, service: web::Data<impl GitConfig>) -> impl Responder {
    let uri = request.uri();
    let path = uri.path().to_string().replace("/info/refs", "");
    let path = service.rewrite(path).await;

    // Parse service from query (`query` binding elided).
    // Indexing with [1] panics when the query contains no '='.
    let service = query.split('=').map(|x| x.to_string()).collect::<Vec<_>>()[1].clone();

    // Spawn git (`service_name` binding elided)
    let mut cmd = Command::new("git");
    cmd.arg(service_name.clone());
    cmd.arg("--stateless-rpc");
    cmd.arg("--advertise-refs");
    cmd.arg(".");
    cmd.current_dir(path);

    // Return response with proper headers (`resp` construction elided)
    resp.append_header(("Content-Type", format!("application/x-git-{}-advertisement", service_name)));
    resp.append_header(("Cache-Control", "no-cache, max-age=0, must-revalidate"));
}
```

**Good:**
- ✅ Proper content-type headers
- ✅ Cache control headers
- ✅ Git protocol version support (Git-Protocol header)

**Issues:**
- ❌ No CORS headers
- ❌ No error handling for missing repos
- ❌ Query parsing is fragile (will panic on malformed input)
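
The fragile query parsing noted above could be replaced in a fork with something along these lines. This is a sketch under assumptions: `GitService` and `parse_service` are hypothetical names that do not exist in the crate, and the query string is assumed to arrive un-decoded in `key=value&key=value` form.

```rust
/// The two services Git Smart HTTP can advertise (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitService {
    UploadPack,
    ReceivePack,
}

impl GitService {
    /// Subcommand passed to the `git` binary.
    pub fn subcommand(self) -> &'static str {
        match self {
            GitService::UploadPack => "upload-pack",
            GitService::ReceivePack => "receive-pack",
        }
    }

    /// Content-Type of the ref advertisement response.
    pub fn advertisement_content_type(self) -> String {
        format!("application/x-git-{}-advertisement", self.subcommand())
    }
}

/// Find `service=...` in a raw query string without indexing or panicking.
pub fn parse_service(query: &str) -> Option<GitService> {
    query.split('&').find_map(|pair| {
        let (key, value) = pair.split_once('=')?;
        if key != "service" {
            return None;
        }
        match value {
            "git-upload-pack" => Some(GitService::UploadPack),
            "git-receive-pack" => Some(GitService::ReceivePack),
            _ => None, // unknown services are never forwarded to `git`
        }
    })
}
```

Matching against a fixed set also means a client-supplied string never reaches `Command::arg` directly, and a `None` result can be turned into a 400 response.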

### 2. git-upload-pack Handler (git_upload_pack.rs)

**Purpose:** Handle clone/fetch operations (read-only)

**Flow:**
1. Resolve repository path
2. Read request body (may be gzipped)
3. Spawn `git upload-pack --stateless-rpc .`
4. Stream response back

**Code:**
```rust
// Abridged excerpt; bindings that are not shown are noted in comments.
pub async fn git_upload_pack(
    request: HttpRequest,
    mut payload: Payload,
    service: web::Data<impl GitConfig>,
) -> impl Responder {
    // Resolve path (derivation of the input `path` from the request is elided)
    let path = service.rewrite(path).await;

    // Spawn git
    let mut cmd = Command::new("git");
    cmd.arg("upload-pack");
    cmd.arg("--stateless-rpc");
    cmd.arg(".");
    cmd.current_dir(path);

    let mut span = cmd.spawn()?;
    let mut stdin = span.stdin.take().unwrap();
    let mut stdout = span.stdout.take().unwrap();

    // Read request body (unwrapping each `chunk` into `data` is elided)
    let mut bytes = web::BytesMut::new();
    while let Some(chunk) = payload.next().await {
        bytes.extend_from_slice(&data);
    }

    // Handle gzip (`encoding` binding elided)
    let body_data = match encoding {
        Some("gzip") => decode_gzip(bytes),
        _ => bytes.to_vec(),
    };

    // Write to git stdin
    stdin.write_all(&body_data)?;
    drop(stdin);

    // Stream response in 8 KiB chunks read from git's stdout
    let body_stream = actix_web::body::BodyStream::new(async_stream::stream! {
        let mut buffer = [0; 8192];
        loop {
            match stdout.read(&mut buffer) {
                Ok(0) => break,
                Ok(n) => yield Ok(web::Bytes::copy_from_slice(&buffer[..n])),
                Err(e) => break,
            }
        }
    });
    resp.body(body_stream)
}
```

**Good:**
- ✅ Handles gzip compression
- ✅ Streams response (efficient for large repos)
- ✅ Proper content-type headers

**Issues:**
- ❌ No CORS headers
- ❌ No repository existence check
- ❌ Error handling uses eprintln! (not tracing)

**For our use:** Upload-pack is read-only, so we can use as-is (just add CORS)
213### 3. git-receive-pack Handler (git_receive_pack.rs) ⚠️
214
215**Purpose:** Handle push operations (write)
216
217**This is the critical handler for inline authorization!**
218
219**Current Flow:**
2201. Resolve repository path
2212. **Check if bare repository** (good!)
2223. Read request body (may be gzipped)
2234. Spawn `git receive-pack --stateless-rpc .`
2245. Stream response back
225
226**Code:**
227```rust
228pub async fn git_receive_pack(
229 request: HttpRequest,
230 mut payload: Payload,
231 service: web::Data<impl GitConfig>,
232) -> impl Responder {
233 let path = service.rewrite(path).await;
234
235 // Check repository exists
236 if !path.join("HEAD").exists() || !path.join("config").exists() {
237 return HttpResponse::BadRequest().body("Repository not found or invalid.");
238 }
239
240 // Check if bare
241 let is_bare_repo = match std::fs::read_to_string(path.join("config")) {
242 Ok(config) => config.contains("bare = true"),
243 Err(_) => false,
244 };
245 if !is_bare_repo {
246 return HttpResponse::BadRequest().body("Push operation requires a bare repository.");
247 }
248
249 // Spawn git receive-pack
250 let mut cmd = Command::new("git");
251 cmd.arg("receive-pack");
252 cmd.arg("--stateless-rpc");
253 cmd.arg(".");
254 cmd.current_dir(&path);
255
256 let mut git_process = cmd.spawn()?;
257 let mut stdin = git_process.stdin.take().unwrap();
258 let mut stdout = git_process.stdout.take().unwrap();
259
260 // Read request body
261 let mut bytes = web::BytesMut::new();
262 while let Some(chunk) = payload.next().await {
263 bytes.extend_from_slice(&data);
264 }
265
266 // Decode if gzipped
267 let body_data = match encoding {
268 Some(encoding) if encoding.contains("gzip") => decode_gzip(bytes),
269 _ => bytes.to_vec(),
270 };
271
272 // Write to git stdin
273 stdin.write_all(&body_data)?;
274 drop(stdin);
275
276 // Stream response
277 let body_stream = /* stream stdout */;
278 resp.body(body_stream)
279}
280```
281
282**Good:**
283- ✅ Validates repository exists
284- ✅ Validates bare repository
285- ✅ Handles gzip compression
286- ✅ Streams response
287
288**Critical Issues for Our Use:**
289- ❌ **No authorization hook!** Spawns git immediately
290- ❌ **No way to inspect push data** before spawning git
291- ❌ **No CORS headers**
292- ❌ **Can't reject unauthorized pushes** with custom error
293
294**This is where we need customization!**
295
296---
297
## Customization Requirements

### 1. Authorization Interception Point

**Need to add BEFORE spawning git:**

```rust
pub async fn git_receive_pack(
    request: HttpRequest,
    mut payload: Payload,
    service: web::Data<impl GitConfig>,
    validator: web::Data<PushValidator>, // ← ADD THIS
) -> impl Responder {
    let path = service.rewrite(path).await;

    // Existing checks...

    // Read request body
    // (sketch: the `?` operators below assume the return type becomes a `Result`,
    // or that each is replaced by an explicit early return)
    let body_data = read_and_decode_body(&mut payload, &request).await?;

    // ← ADD AUTHORIZATION HERE
    let ref_updates = parse_receive_pack_request(&body_data)?;

    // Extract npub and identifier from path
    let (npub, identifier) = extract_repo_info(request.uri().path())?;

    // Validate against Nostr state
    if let Err(e) = validator.validate_push(&npub, &identifier, &ref_updates).await {
        return HttpResponse::Forbidden()
            .json(json!({
                "error": "unauthorized",
                "message": e.to_string(),
                "ref_updates": ref_updates, // requires `RefUpdate: Serialize`
            }));
    }

    // Only spawn git if authorized
    let mut cmd = Command::new("git");
    // ... rest of existing code
}
```
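
The handler above leaves `PushValidator` undefined. As one possible shape for its core check, the sketch below compares each pushed ref against the refs announced in a repository state event. The rule itself (a pushed tip must equal the announced tip, and only deletions are accepted for unannounced refs) is an assumption made for illustration, as are the names `validate_ref_updates` and `PushRejection`; relay queries and maintainer checks are left out.

```rust
use std::collections::HashMap;

/// All-zero object id, used by git for "ref does not exist".
pub const ZERO_OID: &str = "0000000000000000000000000000000000000000";

#[derive(Debug, Clone, PartialEq)]
pub struct RefUpdate {
    pub old_oid: String,
    pub new_oid: String,
    pub ref_name: String,
}

/// Why a push was refused (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum PushRejection {
    RefNotAnnounced(String),
    OidMismatch { ref_name: String, expected: String, pushed: String },
}

/// Check every ref update against `announced` (ref name -> object id).
/// Assumed rule, not confirmed by any spec text in this document.
pub fn validate_ref_updates(
    announced: &HashMap<String, String>,
    updates: &[RefUpdate],
) -> Result<(), PushRejection> {
    for update in updates {
        match announced.get(&update.ref_name) {
            // Announced ref: the pushed tip must match exactly.
            Some(expected) if *expected == update.new_oid => {}
            Some(expected) => {
                return Err(PushRejection::OidMismatch {
                    ref_name: update.ref_name.clone(),
                    expected: expected.clone(),
                    pushed: update.new_oid.clone(),
                });
            }
            // Unannounced ref: only a deletion is acceptable.
            None if update.new_oid == ZERO_OID => {}
            None => return Err(PushRejection::RefNotAnnounced(update.ref_name.clone())),
        }
    }
    Ok(())
}
```

Keeping this check a pure function over already-fetched state makes it easy to unit test separately from the relay lookup.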

### 2. Parse Git Protocol

**Need to add protocol parsing:**

```rust
// src/git/protocol.rs

#[derive(Debug, Clone, PartialEq)]
pub struct RefUpdate {
    pub old_oid: String,
    pub new_oid: String,
    pub ref_name: String,
}

/// Parse the command list at the start of a receive-pack request body.
///
/// Each command is one pkt-line: 4 hex digits giving the total length
/// (including those 4 bytes), followed by
/// `<old-oid> <new-oid> <ref-name>[\0<capabilities>]\n`.
/// Example: `0000000000000000000000000000000000000000 a1b2c3d4... refs/heads/main\0 report-status\n`
/// A flush packet (`0000`) ends the list. Binary pack data follows it, so the
/// body is walked by length prefix and never split on newlines.
pub fn parse_receive_pack_request(body: &[u8]) -> Result<Vec<RefUpdate>, String> {
    let mut updates = Vec::new();
    let mut pos = 0;

    while pos + 4 <= body.len() {
        let pkt_len = parse_pkt_len(&body[pos..pos + 4])?;
        if pkt_len == 0 {
            break; // flush packet: end of commands
        }
        if pkt_len < 4 || pos + pkt_len > body.len() {
            return Err(format!("invalid pkt-line length {pkt_len} at offset {pos}"));
        }
        let mut data = &body[pos + 4..pos + pkt_len];
        pos += pkt_len;

        // Capabilities follow a NUL on the first command; drop them.
        if let Some(null_pos) = data.iter().position(|&b| b == b'\0') {
            data = &data[..null_pos];
        }
        // Strip the trailing newline, if present.
        if data.last() == Some(&b'\n') {
            data = &data[..data.len() - 1];
        }

        // Fail closed: `shallow` lines and push certificates are not handled
        // by this minimal parser and are rejected rather than skipped.
        let parts: Vec<&[u8]> = data.splitn(3, |&b| b == b' ').collect();
        if parts.len() != 3 {
            return Err("malformed ref update command".to_string());
        }

        updates.push(RefUpdate {
            old_oid: String::from_utf8_lossy(parts[0]).into_owned(),
            new_oid: String::from_utf8_lossy(parts[1]).into_owned(),
            ref_name: String::from_utf8_lossy(parts[2]).into_owned(),
        });
    }

    Ok(updates)
}

/// Decode the 4-hex-digit pkt-line length prefix.
fn parse_pkt_len(prefix: &[u8]) -> Result<usize, String> {
    let text = std::str::from_utf8(prefix).map_err(|e| e.to_string())?;
    usize::from_str_radix(text, 16).map_err(|e| e.to_string())
}
```

**Note:** Git pack protocol is complex. We may want to use a library for this:
- Check whether the `git2` crate exposes usable protocol parsing (it is primarily a repository API, so this needs verification)
- Or we can implement minimal parsing for our needs
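
Whichever parsing route is chosen, unit tests need realistic request bodies. A small pkt-line encoder, sketched below, can build them. The helper names `pkt_line` and `single_update_body` are hypothetical, and the fixture only covers the command section of a push (no pack data).

```rust
/// Flush packet that terminates the command list.
pub const FLUSH_PKT: &[u8] = b"0000";

/// Encode one pkt-line: 4 hex digits of total length (prefix included),
/// followed by the payload. Payloads are assumed to be small test data,
/// well under the protocol's per-line size limit.
pub fn pkt_line(payload: &[u8]) -> Vec<u8> {
    let mut out = format!("{:04x}", payload.len() + 4).into_bytes();
    out.extend_from_slice(payload);
    out
}

/// Build the command section of a receive-pack request for one ref update:
/// `<old> <new> <ref>\0<capabilities>\n` as a pkt-line, then a flush packet.
pub fn single_update_body(
    old_oid: &str,
    new_oid: &str,
    ref_name: &str,
    capabilities: &str,
) -> Vec<u8> {
    let command = format!("{old_oid} {new_oid} {ref_name}\0{capabilities}\n");
    let mut body = pkt_line(command.as_bytes());
    body.extend_from_slice(FLUSH_PKT);
    body
}
```

Appending arbitrary bytes after the flush packet in a test is a cheap way to confirm that a parser stops at the flush and never looks into pack data.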

### 3. Add CORS Support

**Need to add to all handlers:**

```rust
// Add CORS middleware or headers to all responses
resp.append_header(("Access-Control-Allow-Origin", "*"));
resp.append_header(("Access-Control-Allow-Methods", "GET, POST, OPTIONS"));
resp.append_header(("Access-Control-Allow-Headers", "Content-Type, Git-Protocol"));
```
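
If origins end up being restricted rather than wildcarded, the header values depend on the request's `Origin`. The sketch below computes them as plain tuples so it stays independent of actix-web; `CorsPolicy` and `cors_headers` are hypothetical names, and the method and header lists simply repeat the ones shown above.

```rust
/// CORS policy (hypothetical type): allow any origin, or only a fixed list.
pub enum CorsPolicy {
    AnyOrigin,
    AllowList(Vec<String>),
}

/// Compute CORS response headers for a request's `Origin` header value.
/// Returns an empty list when the origin is not allowed.
pub fn cors_headers(policy: &CorsPolicy, origin: Option<&str>) -> Vec<(&'static str, String)> {
    let allow_origin = match policy {
        CorsPolicy::AnyOrigin => Some("*".to_string()),
        CorsPolicy::AllowList(allowed) => origin
            .filter(|o| allowed.iter().any(|a| a.as_str() == *o))
            .map(str::to_string),
    };
    let Some(allow_origin) = allow_origin else {
        return Vec::new();
    };

    let mut headers = vec![
        ("Access-Control-Allow-Origin", allow_origin),
        ("Access-Control-Allow-Methods", "GET, POST, OPTIONS".to_string()),
        ("Access-Control-Allow-Headers", "Content-Type, Git-Protocol".to_string()),
    ];
    if matches!(policy, CorsPolicy::AllowList(_)) {
        // The response differs per origin, so caches must key on it.
        headers.push(("Vary", "Origin".to_string()));
    }
    headers
}
```

Each returned pair can be passed to `append_header` in the same way as the literal tuples above.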

### 4. Better Error Handling

**Replace eprintln! with tracing:**

```rust
use tracing::{error, info, debug};

// Instead of:
eprintln!("Error running command: {}", e);

// Use:
error!(error = ?e, "Failed to spawn git process");
```

---

## Integration Strategy

### Option A: Fork the Crate ✅ RECOMMENDED

**Pros:**
- Full control over authorization logic
- Can add CORS, error handling, protocol parsing
- Can publish as `ngit-grasp-git-http-backend`
- Keep upstream changes visible

**Cons:**
- Need to maintain fork
- Diverges from upstream

**Implementation:**
1. Fork https://github.com/lazhenyi/git-http-backend
2. Add to our workspace as git submodule or copy
3. Modify `git_receive_pack.rs` to add authorization
4. Add protocol parsing module
5. Add CORS support
6. Improve error handling

### Option B: Vendor the Code

**Pros:**
- Complete control
- No external dependency
- Can heavily customize

**Cons:**
- Lose upstream updates
- More code to maintain

**Implementation:**
1. Copy source into `src/git/http_backend/`
2. Modify as needed
3. No external dependency

### Option C: Wrap the Crate

**Pros:**
- Keep upstream crate
- Add authorization via middleware

**Cons:**
- ❌ **Can't intercept before git spawns!**
- Would need to parse response, too late
- Complex to inject validator

**Not recommended** - can't achieve inline authorization

---

## Recommended Approach

### Use Forked git-http-backend + git2 + System Git

**Architecture:**

```
HTTP Request
    ↓
Actix Router (from forked git-http-backend)
    ↓
Custom GitConfig Implementation
    ↓
git_receive_pack Handler (MODIFIED)
    ↓
┌─────────────────────────────────┐
│ 1. Read request body            │
│ 2. Parse ref updates (protocol) │ ← ADD THIS
│ 3. Validate via PushValidator   │ ← ADD THIS
│    ├─ Query Nostr relay         │
│    ├─ Check state event         │
│    └─ Validate maintainers      │
│ 4. If authorized:               │
│    └─ Spawn git receive-pack    │ ← EXISTING
│ 5. If unauthorized:             │
│    └─ Return 403 with error     │ ← ADD THIS
└─────────────────────────────────┘
    ↓
Stream response to client
```

**Dependencies:**

```toml
[dependencies]
# Fork of git-http-backend (or vendored code)
git-http-backend = { git = "https://github.com/our-org/git-http-backend", branch = "ngit-grasp" }

# Or vendor it:
# (no dependency, code in src/git/http_backend/)

# Git operations
git2 = "0.20" # For repository management, ref queries

# Already have:
actix-web = "4.9"
tokio = { version = "1", features = ["full"] }
nostr-sdk = "0.43"
```

**Implementation Plan:**

1. **Phase 1: Fork & Setup**
   - Fork git-http-backend
   - Add to our project (git submodule or copy)
   - Verify existing functionality works

2. **Phase 2: Protocol Parsing**
   - Add `src/git/protocol.rs`
   - Implement `parse_receive_pack_request()`
   - Unit tests for protocol parsing

3. **Phase 3: Authorization Integration**
   - Modify `git_receive_pack.rs`
   - Add `PushValidator` parameter
   - Call validator before spawning git
   - Return 403 on unauthorized

4. **Phase 4: CORS & Polish**
   - Add CORS headers to all handlers
   - Improve error messages
   - Add tracing instead of eprintln!

5. **Phase 5: Testing**
   - Unit tests for authorization
   - Integration tests with real git
   - GRASP-01 compliance tests

---

## Validation of current_status.md Recommendations

### Hybrid Approach ✅ VALIDATED

**Original recommendation:**
> 1. **git-http-backend** - HTTP protocol handling
> 2. **git2-rs** - Repository management, ref validation
> 3. **System git** - Actual pack operations (upload-pack/receive-pack)

**Analysis:**
- ✅ **git-http-backend** - Good foundation, needs customization
- ✅ **git2** - Perfect for repo management (init, refs, validation)
- ✅ **System git** - Proven pack protocol implementation

**Verdict:** Sound approach, but need to fork/vendor git-http-backend

### Tool Selection ✅ CORRECT

**Original analysis:**
- git2 for repository management ✅
- System git for pack operations ✅
- git-http-backend for HTTP layer ✅ (with modifications)

**Additional findings:**
- Need protocol parsing (can use git2 or implement minimal)
- Need CORS support (add to fork)
- Need better error handling (add to fork)

### Inline Authorization ✅ ACHIEVABLE

**Original goal:**
> We intercept the `git-receive-pack` operation before spawning the Git process

**Analysis:**
- ✅ Possible by modifying `git_receive_pack.rs`
- ✅ Can parse request body before spawning git
- ✅ Can return 403 before git touches repository

**Requirement:**
- Must fork or vendor git-http-backend
- Can't achieve with unmodified crate

---

## Updated Implementation Plan

### Week 1: Foundation (UPDATED)

1. ✅ Add git2 dependency
2. **Fork git-http-backend** (NEW)
3. **Add protocol parsing** (NEW)
4. Implement GitRepository (Phase 1)
5. Write unit tests for repository operations
6. Test repository creation from announcements

### Week 2: Protocol & Authorization

1. Implement protocol parsing (Phase 2)
2. Implement authorization logic (Phase 3)
3. **Modify git_receive_pack handler** (NEW)
4. Write unit tests for both
5. Integration tests for validation

### Week 3: HTTP & Integration

1. **Add CORS support to fork** (NEW)
2. Implement HTTP handlers (Phase 4)
3. Integrate with Nostr events (Phase 5)
4. Integration tests for full flow
5. Error handling improvements

### Week 4: E2E & Polish

1. E2E tests with real git (Phase 6)
2. Performance testing
3. GRASP-01 compliance testing
4. Documentation and examples

---

## Risks & Mitigations

### Risk 1: Fork Maintenance

**Risk:** Fork diverges from upstream, miss updates

**Mitigation:**
- Keep fork minimal (only modify git_receive_pack.rs)
- Document all changes clearly
- Consider upstreaming authorization hooks
- Monitor upstream for security fixes

### Risk 2: Protocol Parsing Complexity

**Risk:** Git pack protocol is complex, may miss edge cases

**Mitigation:**
- Use git2 for protocol parsing if available
- Implement minimal parsing (just ref updates)
- Extensive testing with real git clients
- Refer to Git protocol documentation

### Risk 3: Performance

**Risk:** Authorization adds latency to push operations

**Mitigation:**
- Keep validation logic fast (< 100ms target)
- Cache state events in memory
- Async validation (don't block)
- Profile and optimize
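
The "cache state events in memory" mitigation could start from something as small as the sketch below. `TtlCache` is a hypothetical type; the key format and the choice of a fixed time-to-live (rather than invalidating when a newer state event arrives) are assumptions.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory cache whose entries expire after a fixed time-to-live.
pub struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Store `value` under `key`, stamping it with the current time.
    pub fn insert(&mut self, key: impl Into<String>, value: V) {
        self.entries.insert(key.into(), (Instant::now(), value));
    }

    /// Return the value if it is still fresh; evict it if it has expired.
    pub fn get(&mut self, key: &str) -> Option<V> {
        let expired = match self.entries.get(key) {
            Some((stored_at, value)) if stored_at.elapsed() < self.ttl => {
                return Some(value.clone());
            }
            Some(_) => true,
            None => false,
        };
        if expired {
            self.entries.remove(key);
        }
        None
    }
}
```

Inside async handlers this would need to sit behind a lock (for example a mutex held only for the lookup), and a cache miss falls back to querying the relay.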

---

## Conclusion

### Summary

The **hybrid approach** recommended in `current_status.md` is **sound and validated**, with these adjustments:

1. **Fork or vendor git-http-backend** - Can't use unmodified crate
2. **Add protocol parsing** - Need to parse ref updates from request
3. **Modify git_receive_pack handler** - Add authorization before spawning git
4. **Add CORS support** - Missing from current implementation
5. **Improve error handling** - Better messages for push rejections

### Next Steps

1. ✅ **Review this analysis** - Confirm approach
2. **Fork git-http-backend** - Set up fork/vendor
3. **Start Phase 1** - Add git2, implement GitRepository
4. **Add protocol parsing** - Parse ref updates from pack protocol
5. **Modify receive-pack handler** - Add authorization logic

### Questions for Review

1. **Fork vs. Vendor?** Fork allows upstream tracking, vendor gives full control
2. **Protocol parsing?** Use git2 or implement minimal parser?
3. **CORS scope?** Support all origins or restrict?
4. **Error detail?** How much info to expose in 403 responses?
5. **Performance target?** Is < 100ms for auth validation reasonable?

---

**Status:** ✅ Analysis complete, ready to proceed with implementation

**Recommendation:** Fork git-http-backend, add authorization to git_receive_pack, use git2 for repo management

---

*Analysis Date: November 4, 2025*