upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

Diffstat (limited to 'docs')
-rw-r--r-- docs/explanation/deletion-requests.md 432
1 files changed, 432 insertions, 0 deletions
diff --git a/docs/explanation/deletion-requests.md b/docs/explanation/deletion-requests.md
new file mode 100644
index 0000000..5d6284b
--- /dev/null
+++ b/docs/explanation/deletion-requests.md
@@ -0,0 +1,432 @@
1# Deletion Request Support (NIP-09)
2
3**Status:** 🚧 **PLANNED - NOT YET IMPLEMENTED** 🚧
4
5This document describes the planned architecture for NIP-09 deletion request support in ngit-grasp. Implementation is scheduled as a 6-week phased rollout. See `work/active-issues/deletion-request-support.md` for implementation tracking.
6
7---
8
9## Overview
10
11ngit-grasp will implement optional support for NIP-09 deletion requests, allowing repository owners to remove their repositories from the relay while providing safeguards against the "left-pad problem" through configurable archival behavior.
12
13## The Left-Pad Problem
14
15The "left-pad problem" refers to a 2016 incident where a critical npm package was unpublished, breaking thousands of dependent projects. In the context of decentralized Git hosting, this translates to:
16
17**Scenario:** A popular repository with many PRs, issues, and community contributions gets deleted by its owner. All dependent work (forks, patches, discussions) becomes inaccessible, potentially breaking workflows and losing community knowledge.
18
19**Our Solution:** The `deletion_request_disrespector` configuration option (CLI flag `--deletion-request-disrespector`) allows operators to run **archival relays** that preserve deleted content, ensuring community work survives repository deletion while still respecting deletion requests on standard relays.
20
21## Architecture
22
23### Three-Database Design
24
25The deletion system uses three separate data stores:
26
27```
28┌─────────────────────────────────────────────────────────┐
29│ Main Database │
30│ (Live events - actively served) │
31│ LMDB/NostrDB/Memory backend │
32└─────────────────────────────────────────────────────────┘
33 ↓ deletion request
34┌─────────────────────────────────────────────────────────┐
35│ Holding Database │
36│ (Archived events - recovery window) │
37│ Same backend type as main │
38│ Retention: configurable (default 90 days) │
39└─────────────────────────────────────────────────────────┘
40 ↓ expiry
41┌─────────────────────────────────────────────────────────┐
42│ Permanent Deletion │
43│ (Events removed from holding DB) │
44└─────────────────────────────────────────────────────────┘
45
46 Git Data Flow
47┌─────────────────────────────────────────────────────────┐
48│ Git Repository (Live) │
49│ <git_data_path>/<npub>/<identifier>.git │
50└─────────────────────────────────────────────────────────┘
51 ↓ deletion request
52┌─────────────────────────────────────────────────────────┐
53│ Archive Filesystem │
54│ .archive/<npub>/<identifier>-<timestamp>.tar.gz │
55│ + metadata.json │
56│ Retention: configurable (default 90 days) │
57└─────────────────────────────────────────────────────────┘
58 ↓ expiry
59┌─────────────────────────────────────────────────────────┐
60│ Permanent Deletion │
61│ (Archive files removed) │
62└─────────────────────────────────────────────────────────┘
63```
64
65### Why Three Stores?
66
671. **Main Database:** Fast queries, clean data model (deleted = gone)
682. **Holding Database:** Recovery mechanism, prevents accidental permanent deletion
693. **Archive Filesystem:** Git data backup, compressed storage
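
As a rough illustration of the two on-disk layouts shown in the diagram, the following Rust sketch builds the live and archive paths. The function names, the placement of `.archive` under the git data path, and the use of Unix seconds for `<timestamp>` are assumptions made for this example, not decisions recorded in this plan.

```rust
use std::path::{Path, PathBuf};

/// Live repository location: <git_data_path>/<npub>/<identifier>.git
fn live_repo_path(git_data_path: &Path, npub: &str, identifier: &str) -> PathBuf {
    git_data_path.join(npub).join(format!("{identifier}.git"))
}

/// Archive location: .archive/<npub>/<identifier>-<timestamp>.tar.gz
/// Assumption: `.archive` lives under the git data path and the
/// timestamp is expressed in Unix seconds.
fn archive_path(git_data_path: &Path, npub: &str, identifier: &str, unix_ts: u64) -> PathBuf {
    git_data_path
        .join(".archive")
        .join(npub)
        .join(format!("{identifier}-{unix_ts}.tar.gz"))
}
```

Including the timestamp in the archive name means a repository that is deleted, restored, and deleted again never overwrites an earlier archive.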
70
71## Deletion Flow
72
73### Standard Mode (Respects Deletions)
74
75```
761. Kind 5 deletion request arrives
77
782. Validate: author matches announcement pubkey
79
803. Query dependent events (PRs, issues, patches, comments)
81
824. Archive git repository to .archive/<npub>/<identifier>-<timestamp>.tar.gz
83
845. Move events to holding database:
85 - Announcement
86 - All dependent events (cascade delete)
87
886. Delete events from main database
89
907. Events no longer served in queries
91
928. Background task (daily):
93 - Check holding database for expired entries
94 - Delete events older than retention period
95 - Delete corresponding archive files
96```
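
Steps 5 to 8 can be pictured with in-memory maps standing in for the main and holding databases. This is a minimal sketch only: `Stores`, `move_to_holding`, and `purge_expired` are hypothetical names, events are reduced to raw strings, and a real backend would need the move to be transactional.

```rust
use std::collections::HashMap;

/// In-memory stand-ins for the main and holding databases (hypothetical).
#[derive(Default)]
struct Stores {
    /// event id -> raw event
    main: HashMap<String, String>,
    /// event id -> (raw event, archived_at in Unix seconds)
    holding: HashMap<String, (String, u64)>,
}

impl Stores {
    /// Steps 5-6: move the announcement and its dependents out of the
    /// main store. Unknown ids are skipped, so repeating the call is a no-op.
    fn move_to_holding(&mut self, ids: &[&str], now: u64) -> usize {
        let mut moved = 0;
        for id in ids {
            if let Some(raw) = self.main.remove(*id) {
                self.holding.insert(id.to_string(), (raw, now));
                moved += 1;
            }
        }
        moved
    }

    /// Step 8: drop holding entries whose retention period has elapsed.
    /// Returns the number of entries permanently deleted.
    fn purge_expired(&mut self, now: u64, retention_secs: u64) -> usize {
        let before = self.holding.len();
        self.holding
            .retain(|_, (_, archived_at)| now.saturating_sub(*archived_at) < retention_secs);
        before - self.holding.len()
    }
}
```

Because step 7 follows directly from step 6 (an event absent from `main` cannot be served), the sketch needs no separate "hidden" flag on events.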
97
98### Archival Mode (Disrespector)
99
100When `deletion_request_disrespector = true`:
101
102```
1031. Kind 5 deletion request arrives
104
1052. Store deletion request event in main database
106
1073. Do NOT process deletion
108
1094. Repository and events remain fully accessible
110
111Result: Archival relay preserves all content
112```
113
114**Implementation Note:** We need to verify that `nostr-relay-builder` doesn't automatically process deletion requests at the relay library level. If it does, we'll need to override or disable this behavior when disrespector mode is enabled. This will be investigated in Phase 6.
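
The branch between the two modes can be sketched as a pure decision function. `DeletionAction` and `decide` are hypothetical names. The sketch assumes the author check has already been reduced to a boolean, and it assumes that a mismatched request in standard mode is simply not acted on.

```rust
#[derive(Debug, PartialEq)]
enum DeletionAction {
    /// Archival mode: keep the kind 5 event, leave the target untouched.
    StoreRequestOnly,
    /// Standard mode: archive git data, move events to holding, then delete.
    ArchiveAndDelete,
    /// Standard mode, but the request author is not the announcement author.
    IgnoreAuthorMismatch,
}

fn decide(deletion_request_disrespector: bool, author_matches: bool) -> DeletionAction {
    if deletion_request_disrespector {
        DeletionAction::StoreRequestOnly
    } else if author_matches {
        DeletionAction::ArchiveAndDelete
    } else {
        DeletionAction::IgnoreAuthorMismatch
    }
}
```

Keeping the decision separate from its side effects would let the disrespector behavior be unit tested without a relay or a filesystem.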
115
116## Recovery Mechanism
117
118The holding database enables **accidental deletion recovery**:
119
120```
121Scenario: Owner deletes repository, then changes their mind
122
1231. Owner publishes new announcement with same identifier
124
1252. System detects matching entry in holding database
126
1273. Check: Is entry within retention period?
128
1294. If YES:
130 - Extract git data from archive tar.gz
131 - Restore to <git_data_path>/<npub>/<identifier>.git
132 - Move events from holding DB → main DB
133 - Re-run acceptance policy (should now pass)
134 - Delete archive records
135 - Return: "Restored X events"
136
1375. If NO (expired):
138 - Process as new repository
139 - Return: "New repository created"
140```
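
The check in steps 2 to 5 can be sketched as follows. `HoldingEntry`, `ReannounceOutcome`, and `on_reannounce` are hypothetical names, and treating an entry whose age equals the retention period as already expired is an assumption of this example.

```rust
/// Hypothetical record kept for an archived announcement.
struct HoldingEntry {
    owner_pubkey: String,
    identifier: String,
    archived_at: u64, // Unix seconds
}

#[derive(Debug, PartialEq)]
enum ReannounceOutcome {
    Restore,
    NewRepository,
}

/// Decide whether a new announcement should restore archived data.
fn on_reannounce(
    entry: Option<&HoldingEntry>,
    owner_pubkey: &str,
    identifier: &str,
    now: u64,
    retention_secs: u64,
) -> ReannounceOutcome {
    match entry {
        // Same owner, same identifier, and still inside the retention window.
        Some(e)
            if e.owner_pubkey == owner_pubkey
                && e.identifier == identifier
                && now.saturating_sub(e.archived_at) < retention_secs =>
        {
            ReannounceOutcome::Restore
        }
        // No entry, a different owner, or an expired entry.
        _ => ReannounceOutcome::NewRepository,
    }
}
```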
141
142## Cascade Deletion Strategy
143
144When a repository announcement is deleted, we cascade delete **all dependent events**:
145
146### Rationale
147
148**Decision:** Delete all dependent events, not just owner's events.
149
150**Why?**
1511. **Deletion Intent:** Owner wants repository gone - includes all associated data
1522. **Data Integrity:** Orphaned PRs/issues without context are confusing
1533. **Consistency:** Matches user expectation that "delete repo" means "delete everything"
1544. **Recovery Available:** Holding database preserves everything for recovery window
155
156**Community Protection:**
157- Archival relays (`deletion_request_disrespector = true`) preserve community work
158- 90-day default retention allows time for recovery
159- Other maintainers can continue the repository under a different identifier
160
161### Event Cascade Hierarchy
162
163```
164Repository Announcement (30617)
165 ↓ deleted
166├─→ State Events (30618) - same identifier
167├─→ Pull Requests (1618) - tag via 'a'
168├─→ Issues (1621) - tag via 'a'
169├─→ Patches (1617) - tag via 'a'
170 ↓ all above deleted
171 └─→ Comments (1111) - tag via 'e'
172 ├─→ Reactions (7) - tag via 'e'
173 └─→ Text Notes (1) - tag via 'e'
174```
175
176**Implementation:** Recursive dependency graph traversal starting from announcement.
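
A minimal sketch of the recursive traversal, assuming the `a`/`e` tag relationships have already been resolved into a map from each event id to the ids of the events that tag it. The `dependents` map and both function names are hypothetical. The `seen` set is an addition for safety so that an event reachable by two paths is collected only once.

```rust
use std::collections::{HashMap, HashSet};

/// Depth-first visit: record `id`, then recurse into everything that tags it.
fn visit(
    id: &str,
    dependents: &HashMap<String, Vec<String>>,
    seen: &mut HashSet<String>,
    out: &mut Vec<String>,
) {
    // `insert` returns false when the id was already visited.
    if !seen.insert(id.to_string()) {
        return;
    }
    out.push(id.to_string());
    if let Some(children) = dependents.get(id) {
        for child in children {
            visit(child, dependents, seen, out);
        }
    }
}

/// Collect the announcement and every event that transitively depends on it.
fn cascade(announcement_id: &str, dependents: &HashMap<String, Vec<String>>) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    visit(announcement_id, dependents, &mut seen, &mut out);
    out
}
```

The returned list is the set of events that would be moved to the holding database together.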
177
178## Multi-Maintainer Scenarios
179
180### Challenge
181
182Multiple maintainers can have announcements for the same `identifier`:
183- `npub1alice.../my-repo`
184- `npub1bob.../my-repo`
185
186Git data is synced between their repositories. When ONE maintainer deletes, what happens?
187
188### Solution: Graph-Based Retention Algorithm
189
190```
191When npub1alice deletes her announcement:
192
1931. Archive HER git directory:
194 .archive/npub1alice.../my-repo-<timestamp>.tar.gz
195
1962. Query all events that referenced her announcement
197
1983. Re-evaluate each event through acceptance policy:
199 - WITHOUT alice's announcement
200 - WITH bob's announcement still present
201
2024. Build retention graph:
203 Event A kept because:
204 - References bob's announcement ✓
205 Event B kept because:
206 - References Event A ✓
207 Event C orphaned because:
208 - Only referenced alice's announcement ✗
209
2105. Delete orphaned events, keep retained events
211
2126. Handle circular dependencies:
213 - Event X kept because references Event Y
214 - Event Y kept because references Event X
215 - Neither has external anchor → both deleted
216```
217
218### Graph Algorithm Details
219
220**Reachability Traversal:**
2211. Start from remaining announcements (roots)
2222. Traverse dependency edges (a/e/q tags)
2233. Mark reachable events as "keep"
2244. Mark unreachable events as "delete"
225
226**Max Depth Limit:**
227- Configurable maximum traversal depth (prevent infinite loops)
228- Default: 100 levels
229- Note: Will analyze edge cases where this limit matters
230
231**Complexity:**
232- Deletion events are rare (not performance critical)
233- Compute on-demand when deletion request arrives
234- No pre-computation or caching needed at current scale
235- Note: Will analyze large-scale scenarios in future
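
The keep/delete marking can be sketched as a breadth-first reachability pass from the remaining announcements. This is an illustration under assumptions: `retained` and the `dependents` map (event id to the ids of events that tag it) are hypothetical, and the depth limit is interpreted as "do not expand events that sit `max_depth` edges away from a root".

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Return every event id reachable from the remaining announcements.
/// Anything not in the returned set is orphaned and would be deleted.
fn retained<'a>(
    roots: &[&'a str],
    dependents: &HashMap<&'a str, Vec<&'a str>>,
    max_depth: usize,
) -> HashSet<&'a str> {
    let mut keep: HashSet<&'a str> = HashSet::new();
    let mut queue: VecDeque<(&'a str, usize)> = VecDeque::new();
    for &root in roots {
        if keep.insert(root) {
            queue.push_back((root, 0));
        }
    }
    while let Some((id, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // depth limit reached: keep this event, do not expand it
        }
        if let Some(children) = dependents.get(id) {
            for &child in children {
                // The `keep` set doubles as the visited set, so cycles terminate.
                if keep.insert(child) {
                    queue.push_back((child, depth + 1));
                }
            }
        }
    }
    keep
}
```

Two events that reference only each other are never reached from a root, so they fall out of the keep set without any dedicated cycle detection.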
236
237## Configuration
238
239### deletion_request_disrespector
240
241- **Type:** `bool`
242- **Default:** `false` (respects deletion requests)
243- **CLI:** `--deletion-request-disrespector`
244- **Env:** `NGIT_DELETION_REQUEST_DISRESPECTOR`
245
246**Description:**
247When `true`, the relay ignores deletion requests and acts as an archival server. This is critical for preventing left-pad scenarios, because it ensures that at least some relays preserve deleted content.
248
249**Use Cases:**
250- Community archival relays
251- Research/historical preservation
252- Backup/mirror relays
253- GRASP-05 archive mode (future)
254
255### archive_retention_secs
256
257- **Type:** `u64`
258- **Default:** `7776000` (90 days in seconds)
259- **CLI:** `--archive-retention-secs`
260- **Env:** `NGIT_ARCHIVE_RETENTION_SECS`
261
262**Description:**
263How long to retain archived events and git data before permanent deletion. This provides a recovery window for accidental deletions.
264
265**Recommended Values:**
266- Development/Testing: `5` seconds (fast test cycles)
267- Staging: `300` seconds (5 minutes)
268- Production: `7776000` seconds (90 days, default)
269- Archival Relay: `31536000` seconds (1 year) or higher
270
271**Notes:**
272- Configurable in seconds for testing flexibility
273- Background cleanup task runs daily (configuration for testing interval TBD in Phase 6)
274- Check occurs on startup to handle offline periods
275- **Testing Challenge:** Daily cleanup doesn't work well with 3-5 second retention for tests - alternative timing strategy needed
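
The default value and the expiry test can be sketched as below. The constant and function names are hypothetical. Treating an entry as expired once its age reaches the retention period, and surfacing an unparsable environment value as an error rather than silently using the default, are assumptions of this example.

```rust
use std::num::ParseIntError;

/// 90 days expressed in seconds: 90 * 24 * 60 * 60 = 7_776_000.
const DEFAULT_ARCHIVE_RETENTION_SECS: u64 = 90 * 24 * 60 * 60;

/// Resolve the retention period from an optional raw setting, for example
/// the value of NGIT_ARCHIVE_RETENTION_SECS if it is set.
fn retention_from_raw(raw: Option<&str>) -> Result<u64, ParseIntError> {
    match raw {
        Some(s) => s.trim().parse::<u64>(),
        None => Ok(DEFAULT_ARCHIVE_RETENTION_SECS),
    }
}

/// True once an archived entry has outlived the retention period.
/// `saturating_sub` avoids underflow if the clock moved backwards.
fn is_expired(archived_at: u64, now: u64, retention_secs: u64) -> bool {
    now.saturating_sub(archived_at) >= retention_secs
}
```

Passing `now` in as a parameter, rather than reading the system clock inside `is_expired`, is one way to test short retention periods without waiting for a daily cleanup run.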
276
277## NIP-11 Advertisement
278
279Deletion support is **conditionally advertised** in NIP-11 relay information:
280
281- **When `deletion_request_disrespector = false`:** Include `9` (NIP-09) in the `supported_nips` array
282- **When `deletion_request_disrespector = true`:** Do NOT include `9` (archival mode doesn't honor deletions)
283
284This allows clients to discover whether a relay respects deletion requests.
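
NIP-11 lists supported NIPs as integers in its `supported_nips` field, so the conditional advertisement amounts to adding or omitting `9`. A minimal sketch, assuming a hypothetical `supported_nips` helper and a caller-supplied base list of the other NIPs the relay supports:

```rust
/// Build the NIP-11 `supported_nips` list.
/// NIP-09 (9) is advertised only when deletion requests are honored.
fn supported_nips(base: &[u16], deletion_request_disrespector: bool) -> Vec<u16> {
    // Drop any 9 from the base list so the flag alone decides the outcome.
    let mut nips: Vec<u16> = base.iter().copied().filter(|&n| n != 9).collect();
    if !deletion_request_disrespector {
        nips.push(9);
    }
    nips.sort_unstable();
    nips.dedup();
    nips
}
```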
285
286## Documentation Updates
287
288When implementation is complete, the following documentation will be updated:
289
290**README.md:**
291- Add NIP-09 deletion request support to feature list
292- Document cascade deletion behavior
293- Update "Delete Events" roadmap section (mark as implemented)
294- Link to this explanation document
295
296**docs/explanation/architecture.md:**
297- Add deletion request system overview
298- Document cascade deletion strategy
299- Reference this document for detailed information
300
301## Implementation Status
302
303**Phase 1: Core Deletion + Simple Cascade** 🔄 (Planned)
304- Config options
305- Holding database
306- Kind 5 processing
307- Simple cascade delete
308
309**Phase 2: Git Archival & Cleanup** 🔄 (Planned)
310- Archive tar.gz creation
311- Background cleanup task
312- Metadata storage
313
314**Phase 3: Multi-Maintainer Graph Algorithm** 🔄 (Planned)
315- Dependency graph building
316- Re-evaluation through acceptance policy
317- Circular dependency detection
318
319**Phase 4: Recovery Mechanism** 🔄 (Planned)
320- Re-announcement detection
321- Archive restoration
322- Event recovery from holding DB
323
324**Phase 5: Extended Cascade Deletion** 🔄 (Planned)
325- Patches (1617) cascade
326- Issues (1621) cascade
327- PR Updates (1619) cascade
328- Full event type coverage
329
330**Phase 6: Analysis & Edge Cases** 🔄 (Planned)
331- Background cleanup timing strategy (daily doesn't work with 3-second test retention)
332- rust-nostr deletion behavior investigation (does relay builder auto-process deletions?)
333- Author validation enforcement and testing
334- Max depth edge case analysis
335- Large-scale testing
336- Race condition investigation
337- Lock strategy finalization
338
339## Security Considerations
340
341### Validation
342
3431. **Author Matching:** Deletion request pubkey MUST match announcement pubkey
344 - **Critical Requirement:** We ONLY honor deletion requests where the deletion request author is the same as the deleted event author
345 - This prevents malicious actors from deleting other people's repositories
346 - Enforced at validation layer before any deletion processing
3472. **Signature Verification:** Handled by nostr-relay-builder (already implemented)
3483. **Timestamp Check:** For addressable events, delete versions up to deletion `created_at`
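
Checks 1 and 3 can be sketched together as a single predicate. The `Event` struct here is a reduced, hypothetical stand-in for a real Nostr event, and signature verification (check 2) is assumed to have happened before this point.

```rust
/// Reduced stand-in for a Nostr event (hypothetical, for illustration only).
struct Event {
    pubkey: String,
    kind: u16,
    created_at: u64, // Unix seconds
}

const KIND_DELETION_REQUEST: u16 = 5;

/// True if `request` is allowed to delete this version of `target`.
fn may_delete(request: &Event, target: &Event) -> bool {
    request.kind == KIND_DELETION_REQUEST
        // Author matching: only the target's own author may delete it.
        && request.pubkey == target.pubkey
        // Timestamp check: versions newer than the request survive.
        && target.created_at <= request.created_at
}
```

With the timestamp comparison in place, an announcement republished after the deletion request is not removed by that older request, which is what makes re-announcement possible.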
349
350### Attack Vectors
351
352**DoS via Deletion Spam:**
353- Mitigation: Deletion requests only processed if announcement exists
354- Mitigation: Idempotent (deleting already-deleted announcement is no-op)
355
356**Archive Disk Exhaustion:**
357- Mitigation: Background cleanup enforces retention limits
358- Mitigation: Compressed tar.gz archives
359- Mitigation: Configurable retention period
360
361**Recovery Abuse:**
362- Mitigation: Recovery only within retention window
363- Mitigation: Must be original owner (pubkey match)
364- Mitigation: Normal announcement validation applies
365
366## Monitoring & Metrics
367
368**Prometheus Metrics (Planned):**
369- `ngit_deletion_requests_total` - Count of deletion requests received
370- `ngit_deletion_requests_processed` - Count actually processed (disrespector mode = 0)
371- `ngit_holding_database_events` - Current event count in holding DB
372- `ngit_holding_database_size_bytes` - Holding DB disk usage
373- `ngit_archive_files_total` - Count of archive tar.gz files
374- `ngit_archive_size_bytes` - Total archive disk usage
375- `ngit_recoveries_total` - Count of successful recoveries
376- `ngit_permanent_deletions_total` - Count of events permanently deleted (post-retention)
377
378## Testing Strategy
379
380### Unit Tests
381- Kind 5 validation and parsing
382- Author matching logic
383- Cascade dependency query
384- Graph traversal algorithm
385- Recovery detection
386
387### Integration Tests
388- Full deletion workflow (3-5 second retention)
389- Multi-maintainer scenarios
390- Recovery mechanism
391- Disrespector mode behavior
392- Background cleanup timing (mocked)
393
394### Audit Tests
395- NIP-09 compliance validation
396- Event re-submission after deletion (rejected)
397- Deletion request event itself (stored)
398- Archival mode relay behavior
399
400## Related Documentation
401
402- **NIP-09 Specification:** `/persistent/dcdev/clones/nips/09.md`
403- **Architecture Overview:** `docs/explanation/architecture.md`
404- **Configuration Reference:** `docs/reference/configuration.md`
405- **Roadmap:** `README.md` lines 198-206
406
407## Future Enhancements
408
409### GRASP-05 Archive Mode
410Once GRASP-05 is specified, `deletion_request_disrespector` mode can form the foundation for archive relay requirements.
411
412### Selective Disrespect
413Allow configuration to disrespect deletions only for specific criteria:
414- Popular repositories (e.g., >N PRs)
415- Repositories with community contributions
416- Specific identifiers (allowlist)
417
418### Distributed Archive Network
419Coordinate between archival relays to ensure redundant preservation of deleted content.
420
421### Recovery Notifications
422Notify repository owner when content is recovered from holding database, allowing them to confirm or re-delete.
423
424## Conclusion
425
426The deletion request system balances three competing needs:
427
4281. **User Agency:** Owners can delete their repositories
4292. **Community Protection:** Archival relays prevent left-pad scenarios
4303. **Recovery Grace Period:** Holding database prevents accidental permanent deletion
431
432By making deletion behavior **configurable** rather than mandatory, we enable a heterogeneous relay network where some relays respect deletions (user privacy) while others preserve content (community resilience).