upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

author DanConwayDev <DanConwayDev@protonmail.com> 2026-01-14 13:40:20 +0000
committer DanConwayDev <DanConwayDev@protonmail.com> 2026-01-14 13:40:20 +0000
commit 2821578202d1313c23c30a5dbae39548822e3c55
tree cbe4d2447312b7bc7653bef874b6fb23d60a0ede
parent 50000cd9d47681390c3c45feef98fe51c7b79a0f
docs: add defensive measures explanation
Add comprehensive documentation explaining the defensive features implemented in ngit-grasp. The detailed analysis of other relay implementations is now preserved in commit history (e3792b9).
-rw-r--r-- README.md 42
-rw-r--r-- docs/explanation/README.md 42
-rw-r--r-- docs/explanation/defensive-analysis-of-other-relays.md 669
-rw-r--r-- docs/explanation/defensive-measures.md 64
4 files changed, 93 insertions, 724 deletions
diff --git a/README.md b/README.md
index e0e39fd..189478c 100644
--- a/README.md
+++ b/README.md
@@ -237,6 +237,48 @@ NGIT_EVENT_BLACKLIST=npub1spam1...,npub1spam2...
237 237
238 238 **See**: [Configuration Reference](docs/reference/configuration.md) for complete details
239 239
240## Defensive Measures & Rate Limiting
241
242ngit-grasp implements multiple layers of defense against abuse, spam, and denial-of-service attacks:
243
244**Per-Connection Rate Limits:**
245- Max 500 concurrent subscriptions per connection
246- Max 60 events published per minute per connection
247- Built into rust-nostr relay-builder
248
249**Per-IP Connection Monitoring:**
250- Tracks connections per IP address (default threshold: 10)
251- Flags potential abusers in logs and metrics
252- **Does NOT enforce limits** (monitoring only)
253- Privacy-preserving (IP addresses never exposed in Prometheus)
254
255**Content Filtering (Blacklists/Whitelists):**
256- **Event blacklist** - Block ALL events from specific authors (npubs)
257- **Repository blacklist** - Block specific repositories/developers/identifiers
258- **Repository whitelist** - Curate which repositories are accepted (GRASP-01 mode)
259- **Archive whitelist** - Mirror specific repositories (GRASP-05 mode)
260- See [Curation & Moderation](#curation--moderation) section above for details
261
262**Relay Sync Protection (GRASP-02):**
263- **Exponential backoff** - Failed connections: 5s → 10s → 20s → ... → 1 hour max
264- **Naughty list** - Track relays with infrastructure issues separately (12h expiry)
265- **Rate limit detection** - Auto 65s cooldown when remote relays rate limit us
266- **Domain throttling** - Max 5 concurrent, 30/min per domain for git data fetching
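
The exponential backoff schedule listed above (5s, 10s, 20s, ..., capped at 1 hour) can be sketched as a small pure function. This is a minimal illustration, not the ngit-grasp implementation: the function name `backoff_delay` and the zero-based `retry` index are assumptions made for the example.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before reconnect number `retry` (zero-based).
/// Doubles from 5 seconds and is capped at 1 hour, matching the schedule
/// described above.
fn backoff_delay(retry: u32) -> Duration {
    const BASE_SECS: u64 = 5;
    const MAX_SECS: u64 = 60 * 60;
    // 2^retry, saturating instead of overflowing for very large retry counts.
    let factor = 1u64.checked_shl(retry).unwrap_or(u64::MAX);
    Duration::from_secs(BASE_SECS.saturating_mul(factor).min(MAX_SECS))
}
```

With this schedule the cap is reached on the eleventh attempt (5 × 2^10 = 5120 seconds exceeds one hour), so a relay that stays down is retried roughly hourly from then on.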
267
268**Event Validation:**
269- Strict GRASP-01 protocol validation via WritePolicy plugin system
270- Extensible for custom validation logic (has access to client IP address)
271
272**Total Connection Limit:**
273- Max 500 total connections (configurable via `NGIT_MAX_CONNECTIONS`)
274- Prevents connection exhaustion DoS attacks
275
276**Not Implemented:**
277- Per-IP connection limits (only monitored, not enforced)
278- Per-IP event rate limits (tracked per connection, not per IP)
279
280**See**: [Defensive Measures](docs/explanation/defensive-measures.md) for complete details and future enhancements.
281
240 282 ## Roadmap
241 283
242 284 ### GRASP-02 Enhancements
diff --git a/docs/explanation/README.md b/docs/explanation/README.md
index f477b73..58cc46f 100644
--- a/docs/explanation/README.md
+++ b/docs/explanation/README.md
@@ -151,6 +151,48 @@ Explanation documentation helps you **understand concepts** and design decisions
151 151
152 152 ---
153 153
154### [Defensive Measures & Rate Limiting](defensive-measures.md)
155**Protection against abuse, spam, and denial-of-service attacks**
156
157**Topics:**
158- Connection and subscription management
159- Event publishing rate limits
160- Content filtering (blacklists/whitelists)
161- Event validation plugin system (WritePolicy/QueryPolicy)
162- Relay health management (naughty list, exponential backoff)
163- Privacy-preserving IP tracking
164- Future enhancements (per-IP rate limiting)
165
166**Read when:** You want to understand how ngit-grasp protects against abuse and what defensive features are available
167
168---
169
170### [GRASP-05 Archive Mode](grasp-05-archive.md)
171**Read-only mirroring of repositories**
172
173**Topics:**
174- Archive whitelist configuration
175- Archive-all mode
176- Read-only mode defaults
177- Use cases for backup/mirror relays
178
179**Read when:** You want to understand how to run an archive/backup relay
180
181---
182
183### [Deletion Requests](deletion-requests.md)
184**Handling repository and event deletion**
185
186**Topics:**
187- Deletion request architecture
188- Delete disrespector concept
189- Preventing left-pad scenarios
190- Archival policies
191
192**Read when:** You want to understand how ngit-grasp handles deletion events (planned feature)
193
194---
195
154 196 ## Planned Explanation Documentation
155 197
156 198 ### GRASP Protocol Design
diff --git a/docs/explanation/defensive-analysis-of-other-relays.md b/docs/explanation/defensive-analysis-of-other-relays.md
deleted file mode 100644
index eb5f020..0000000
--- a/docs/explanation/defensive-analysis-of-other-relays.md
+++ /dev/null
@@ -1,669 +0,0 @@
1# Defensive Analysis of Other Nostr Relays
2
3**Issue:** d6ee - Defensive Relay Features
4**Date:** 2026-01-13
5**Purpose:** Research findings on rate limiting and defensive features in major Nostr relay implementations to inform ngit-grasp's defensive strategy.
6
7## Executive Summary
8
9This analysis examines how three major Nostr relay implementations (strfry, nostr-rs-relay, and khatru) handle rate limiting, connection management, and DoS protection. The goal is to identify industry best practices and concrete defaults to implement in ngit-grasp.
10
11**Key Finding:** Most relays have VERY permissive defaults or no limits at all, relying on operators to configure appropriately or use external reverse proxies. Only khatru provides opinionated secure-by-default settings.
12
13## Current State of ngit-grasp
14
15### Existing Defensive Features ✅
16
17#### 1. Connection Tracking & Abuse Detection
18**Location:** `src/metrics/connection.rs`
19
20- Per-IP connection counting via `ConnectionTracker`
21- Abuse threshold detection (default: 10 connections per IP)
22- Privacy-preserving metrics (IPs never exposed to Prometheus)
23- Tracks: total connections, unique IPs, flagged abusers
24
25**Configuration:**
26```rust
27// src/config.rs:366-372
28pub metrics_connection_per_ip_abuse_threshold: u32 = 10
29```
30
31**Limitations:**
32- ⚠️ **Display-only** - Detection happens but no enforcement
33- ⚠️ No connection limit enforcement
34- ⚠️ No per-IP subscription limits
35- ⚠️ No time-based rate limits
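
To make the "detection without enforcement" behaviour concrete, here is a minimal sketch of a monitoring-only tracker. The names `ConnectionTracker`, `on_connect`, and `on_disconnect` appear in this analysis, but the fields, signatures, and the "strictly greater than the threshold" rule below are assumptions for illustration, not the code in `src/metrics/connection.rs`.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical monitoring-only tracker: counts connections per IP and
/// reports when an IP exceeds the threshold, but never rejects anything.
struct ConnectionTracker {
    abuse_threshold: u32,
    per_ip: HashMap<IpAddr, u32>,
}

impl ConnectionTracker {
    fn new(abuse_threshold: u32) -> Self {
        Self { abuse_threshold, per_ip: HashMap::new() }
    }

    /// Records a connection; returns true if this IP is now over the threshold.
    /// The caller may log a warning but still accepts the connection.
    fn on_connect(&mut self, ip: IpAddr) -> bool {
        let count = self.per_ip.entry(ip).or_insert(0);
        *count += 1;
        *count > self.abuse_threshold
    }

    /// Records a disconnect and drops the entry once no connections remain.
    fn on_disconnect(&mut self, ip: IpAddr) {
        let now_empty = match self.per_ip.get_mut(&ip) {
            Some(count) => {
                *count = count.saturating_sub(1);
                *count == 0
            }
            None => false,
        };
        if now_empty {
            self.per_ip.remove(&ip);
        }
    }

    /// Aggregate values only: suitable for metrics without exposing any IP.
    fn unique_ips(&self) -> usize {
        self.per_ip.len()
    }

    fn flagged_ips(&self) -> usize {
        self.per_ip.values().filter(|&&c| c > self.abuse_threshold).count()
    }
}
```

Only the aggregate counters would be exported, which is how per-IP tracking can stay privacy-preserving while still surfacing abuse.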
36
37#### 2. Git Remote Throttling (Purgatory Sync)
38**Location:** `src/purgatory/sync/throttle.rs`
39
40- Sophisticated domain-based rate limiting for outbound git fetch requests
41- Per-domain concurrent request limits (default: 5)
42- Per-domain rate limits (default: 30 requests/minute)
43- Round-robin queue management for fairness
44- Sliding window implementation
45
46**How it works:**
47```rust
48// Line 159: Default throttle manager creation
49let throttle_manager = Arc::new(ThrottleManager::new(5, 30));
50
51// Lines 96-106: DomainThrottle tracks concurrent and rate limits
52pub fn new(domain: String, max_concurrent: u32, max_per_minute: u32)
53
54// Lines 113-129: Checks both limits before allowing requests
55```
56
57**Note:** Only applies to **outbound** git fetches, not incoming client connections.
58
59#### 3. Event Blacklisting
60**Location:** `src/config.rs` (lines 247-281, 658-668), `src/nostr/builder.rs` (lines 75-86, 495-505)
61
62- Event author blacklist - Block all events from specific npubs
63- Repository blacklist - Block announcements for specific repos/identifiers/npubs
64- Blacklist checked FIRST in write policy (overrides everything)
65
66**Configuration:**
67```bash
68NGIT_EVENT_BLACKLIST="npub1...,npub2..."
69NGIT_REPOSITORY_BLACKLIST="npub1.../identifier,identifier"
70```
71
72#### 4. Naughty List for Problematic Git Remotes
73**Location:** `src/sync/naughty_list.rs`
74
75- Tracks git remote domains with persistent infrastructure errors
76- Classifies errors (DNS, TLS, protocol, WebSocket)
77- Temporary blacklisting with expiration (default: 12 hours)
78- Used to skip unreliable relays during sync
79
80#### 5. Metrics & Monitoring
81**Location:** `src/metrics/mod.rs`
82
83- WebSocket connection metrics (total, duration, messages by type)
84- Git operation tracking (clone, fetch, push by status)
85- Nostr event metrics (received, stored, rejected by kind and reason)
86- Sync metrics (connections, attempts, failures)
87- Repository count tracking
88
89**No enforcement capabilities** - purely observability.
90
91### What's Missing ❌
92
931. **WebSocket Connection Limits** - No global or per-IP enforcement
942. **Subscription (REQ) Limits** - Clients can open unlimited REQs
953. **Event Rate Limiting** - No per-client/per-IP limits
964. **HTTP Endpoint Protection** - All endpoints unprotected
975. **Message Size Limits** - No WebSocket/event size caps
986. **Rate Limiting Crates** - No rate limiting crate is currently among the project's dependencies
99
100### Integration Points for Implementation
101
102#### WebSocket Connection Accept Point
103**Location:** `src/http/mod.rs:402-424`
104
105```rust
106tokio::spawn(async move {
107 match hyper::upgrade::on(req).await {
108 Ok(upgraded) => {
109 // Track connection
110 m.connection_tracker().on_connect(addr.ip());
111 // ⬅️ COULD ADD: Check connection limits here
112
113 relay.take_connection(TokioIo::new(upgraded), addr).await
114
115 m.connection_tracker().on_disconnect(addr.ip());
116 }
117 }
118});
119```
120
121This is the ideal location to add connection limit enforcement before accepting the WebSocket upgrade.
122
123## Analysis of Other Relays
124
125### 1. strfry (C++, by hoytech)
126
127**Repository:** https://github.com/hoytech/strfry
128**Stars:** 623 | **Focus:** High performance, custom LMDB schema
129
130#### Configuration File: `strfry.conf`
131
132##### Event Limits
133```conf
134maxEventSize = 65536 # 64 KB - Maximum normalized JSON size
135maxNumTags = 2000 # Maximum number of tags allowed
136maxTagValSize = 1024 # 1 KB - Maximum tag value size
137rejectEventsNewerThanSeconds = 900 # 15 minutes - Reject future events
138rejectEventsOlderThanSeconds = 94608000 # ~3 years - Reject old events
139rejectEphemeralEventsOlderThanSeconds = 60 # 60s - Ephemeral cutoff
140ephemeralEventsLifetimeSeconds = 300 # 5 minutes - Ephemeral retention
141```
142
143##### Connection & WebSocket Limits
144```conf
145maxWebsocketPayloadSize = 131072 # 128 KB - Max WebSocket frame size
146nofiles = 1000000 # OS-limit on max open files/sockets
147autoPingSeconds = 55 # WebSocket PING frequency
148```
149
150##### Query & Subscription Limits
151```conf
152maxReqFilterSize = 200 # Max filters allowed in a REQ
153maxSubsPerConnection = 20 # Max concurrent subscriptions per connection
154maxFilterLimit = 500 # Max records returned per filter
155queryTimesliceBudgetMicroseconds = 10000 # 10ms - Max CPU per query timeslice
156```
157
158##### Thread Pool Configuration
159```conf
160ingester = 3 # Route incoming requests, validate events/sigs
161reqWorker = 3 # Handle initial DB scan for events
162reqMonitor = 3 # Handle filtering of new events
163negentropy = 2 # Handle negentropy protocol messages
164```
165
166##### Compression
167```conf
168compression:
169 enabled = true # permessage-deflate compression
170 slidingWindow = true # Maintains sliding window (better compression, more memory)
171```
172
173#### Implementation Approach
174
175**Architecture Highlights:**
176- **No explicit per-IP rate limiting** in config - relies on external reverse proxy or plugin system
177- **Query pause/resume**: Long-running queries can be paused (stored as few hundred to few thousand bytes) and resumed when socket buffer drains
178- **Query prioritization**: New queries processed before resuming queries that already ran >10ms
179- **LMDB-based**: Zero-copy access from page cache, read path requires no locking
180- **Batching**: Events written in batches with single fsync for efficiency
181- **Plugin system**: External programs (any language) can implement write policies via line-based JSON interface
182
183**Rate Limiting Strategy:**
184- Delegates to external plugins for event acceptance policies
185- Relies on reverse proxy (nginx, etc.) for connection-level rate limiting
186- Focus on efficient query handling rather than built-in rate limits
187
188**Strengths:**
189- Extremely high performance
190- Sophisticated query engine with pause/resume
191- Flexible plugin system
192
193**Weaknesses:**
194- No built-in connection or event rate limiting
195- Requires external infrastructure for DoS protection
196- More complex to deploy securely
197
198---
199
200### 2. nostr-rs-relay (Rust, by scsibug/gheartsfield)
201
202**Repository:** https://git.sr.ht/~gheartsfield/nostr-rs-relay
203**Focus:** Rust implementation with SQLite or PostgreSQL backend
204
205#### Configuration File: `config.toml`
206
207##### Rate Limiting
208```toml
209# DEFAULT: 0 (unlimited) - Events created per second (server-wide, averaged over 1 minute)
210# RECOMMENDED: Set to low value like 5 for public relays
211messages_per_sec = 0
212
213# DEFAULT: 0 (unlimited) - Client subscriptions created (averaged over 1 minute)
214# RECOMMENDED: Set to low value like 10
215subscriptions_per_min = 0
216
217# DEFAULT: 0 (unlimited) - Concurrent DB connections per client
218db_conns_per_client = 0
219```
220
221##### Event & Message Size Limits
222```toml
223max_event_bytes = 131072 # 128 KB - Maximum EVENT message size
224max_ws_message_bytes = 131072 # 128 KB - Maximum WebSocket message
225max_ws_frame_bytes = 131072 # 128 KB - Maximum WebSocket frame
226```
227
228##### Buffering & Backpressure
229```toml
230broadcast_buffer = 16384 # Buffer for subscribers (prevents slow readers consuming memory)
231event_persist_buffer = 4096 # Buffer for DB commits (provides backpressure if DB writes slow)
232max_blocking_threads = 16 # Limit blocking threads for DB connections
233```
234
235##### Time-based Restrictions
236```toml
237# Reject events with timestamps this far in future
238# RECOMMENDED: 30 minutes, but defaults to allowing any date if not set
239reject_future_seconds = 1800 # 30 minutes
240```
241
242##### Connection Pool
243```toml
244min_conn = 4 # Minimum reader connections
245max_conn = 8 # Maximum reader connections (recommended: approx number of cores)
246```
247
248##### WebSocket
249```toml
250ping_interval = 300 # 5 minutes - WebSocket ping interval
251```
252
253##### Event Kind Filtering
254```toml
255# Optional - Specific event kinds to discard
256event_kind_blacklist = []
257
258# Optional - Only accept these event kinds
259event_kind_allowlist = []
260
261# Rejects imprecise requests (kind-only, author-only) to improve outbox model adoption
262limit_scrapers = false
263```
264
265#### Implementation Approach
266
267**Architecture Highlights:**
268- **Tokio async runtime**: Non-blocking I/O
269- **SQLite or PostgreSQL**: Configurable database backend
270- **gRPC plugin support**: External authorization service via `event_admission_server`
271- **Rate limiting**: Averaged over time windows (1 minute), applied server-wide
272- **No per-IP limits by default**: Relies on configuration or external proxy
273
274**Rate Limiting Strategy:**
275- Provides configuration options but defaults to UNLIMITED
276- Operators MUST configure limits for production use
277- Time-window averaging (1 minute) for rate calculations
278- Server-wide limits, not per-IP
279
280**Strengths:**
281- Well-documented configuration options
282- Flexible database backends
283- Buffer-based backpressure mechanism
284
285**Weaknesses:**
286- **Dangerously permissive defaults** - unlimited by default
287- No per-IP rate limiting built-in
288- Requires active operator configuration for security
289
290---
291
292### 3. khatru (Go framework, by fiatjaf)
293
294**Repository:** https://github.com/fiatjaf/khatru
295**Stars:** 133 | **Focus:** Framework for custom relays, not a standalone relay
296
297#### Default Configuration (from `relay.go` and `policies/`)
298
299##### Built-in Defaults (NewRelay)
300```go
301ReadBufferSize: 1024 // bytes
302WriteBufferSize: 1024 // bytes
303WriteWait: 10 * time.Second // Time allowed to write message to peer
304PongWait: 60 * time.Second // Time allowed to read next pong from peer
305PingPeriod: 30 * time.Second // Send pings with this period (must be < PongWait)
306MaxMessageSize: 512000 // ~500 KB - Maximum message size from peer
307```
308
309##### Sane Defaults Policy (`ApplySaneDefaults`)
310
311**Event Rate Limiting:**
312```go
313EventIPRateLimiter(
314 tokensPerInterval: 2, // events
315 interval: 180, // 3 minutes (180 seconds)
316 maxTokens: 10 // burst capacity
317)
318// Effective rate: ~0.67 events/minute per IP, burst up to 10
319```
320
321**Filter (REQ) Rate Limiting:**
322```go
323FilterIPRateLimiter(
324 tokensPerInterval: 20, // requests
325 interval: 60, // 1 minute
326 maxTokens: 100 // burst capacity
327)
328// Effective rate: 20 REQs/minute per IP, burst up to 100
329```
330
331**Connection Rate Limiting:**
332```go
333ConnectionRateLimiter(
334 tokensPerInterval: 1, // connection
335 interval: 300, // 5 minutes
336 maxTokens: 100 // burst capacity
337)
338// Effective rate: 1 connection per 5 minutes per IP, burst up to 100
339```
340
341**Event Policies:**
342- `RejectEventsWithBase64Media` - Rejects events containing `data:image/` or `data:video/`
343- `NoComplexFilters` - Rejects filters with >4 total items AND >2 tag filters
344
345#### Available Rate Limiter Functions
346
3471. **`EventIPRateLimiter(tokensPerInterval, interval, maxTokens)`** - Rate limit events by IP
3482. **`EventPubKeyRateLimiter(tokensPerInterval, interval, maxTokens)`** - Rate limit by pubkey
3493. **`EventAuthedPubKeyRateLimiter(tokensPerInterval, interval, maxTokens)`** - Rate limit authenticated users
3504. **`ConnectionRateLimiter(tokensPerInterval, interval, maxTokens)`** - Rate limit new connections
3515. **`FilterIPRateLimiter(tokensPerInterval, interval, maxTokens)`** - Rate limit REQ messages
352
353#### Other Available Policies
354
355**Event Rejection:**
356- `PreventTooManyIndexableTags(max, ignoreKinds, onlyKinds)` - Limit indexable tags
357- `PreventLargeTags(maxTagValueLen)` - Reject large tag values (default: 100 bytes)
358- `RestrictToSpecifiedKinds(allowEphemeral, kinds...)` - Whitelist specific kinds
359- `PreventTimestampsInThePast(threshold)` - Reject old events
360- `PreventTimestampsInTheFuture(threshold)` - Reject future-dated events
361
362**Filter Policies:**
363- `NoComplexFilters` - Max 4 items total, max 2 tag filters
364- `NoEmptyFilters` - Require at least one filter criterion
365- `AntiSyncBots` - Require author for kind:1 queries
366- `NoSearchQueries` - Disable search functionality
367- `MustAuth` - Require NIP-42 authentication
368
369#### Implementation Approach
370
371**Architecture Highlights:**
372- **Token bucket algorithm**: Implemented in `startRateLimitSystem[K]` using atomic counters
373- **Per-key tracking**: Uses `xsync.MapOf` for concurrent map access
374- **Automatic cleanup**: Goroutine periodically decrements buckets and removes zero/negative entries
375- **Framework design**: Relay operators compose policies by adding functions to hook slices
376- **No global defaults enforced**: Operators must explicitly apply policies
377- **Lightweight**: Pure Go, no external dependencies for rate limiting
378
379**Rate Limiting Strategy:**
380- **Most opinionated defaults** of all three relays
381- Token bucket with automatic refill
382- Per-IP tracking for all limits
383- Composable policy system
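
For readers unfamiliar with the algorithm, a generic per-IP token bucket can be sketched in Rust as follows. This is not khatru's code (which is Go and, as described above, uses atomic counters with a periodic cleanup goroutine); the type and method names are hypothetical, and the parameters simply mirror the `tokensPerInterval`, `interval`, `maxTokens` triple used by the khatru limiters.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Per-key bucket state: current tokens and the time of the last refill.
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Hypothetical per-IP token bucket limiter (illustrative only).
struct TokenBucketLimiter {
    tokens_per_interval: f64,
    interval: Duration,
    max_tokens: f64,
    buckets: HashMap<IpAddr, Bucket>,
}

impl TokenBucketLimiter {
    fn new(tokens_per_interval: u32, interval: Duration, max_tokens: u32) -> Self {
        Self {
            tokens_per_interval: f64::from(tokens_per_interval),
            interval,
            max_tokens: f64::from(max_tokens),
            buckets: HashMap::new(),
        }
    }

    /// Returns true if `ip` may perform one action at time `now`.
    /// Taking `now` as a parameter keeps the logic deterministic to test.
    fn allow_at(&mut self, ip: IpAddr, now: Instant) -> bool {
        let rate_per_sec = self.tokens_per_interval / self.interval.as_secs_f64();
        let max = self.max_tokens;
        // New keys start with a full bucket, which is what permits bursts.
        let bucket = self
            .buckets
            .entry(ip)
            .or_insert(Bucket { tokens: max, last_refill: now });
        // Refill in proportion to elapsed time, never above the burst cap.
        let elapsed = now.saturating_duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * rate_per_sec).min(max);
        bucket.last_refill = now;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With the event defaults quoted earlier (2 tokens per 180 seconds, burst of 10), a new IP can publish 10 events immediately and then about two more every three minutes.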
384
385**Strengths:**
386- **Secure by default** when using `ApplySaneDefaults`
387- Very clear, composable policy API
388- Lightweight token bucket implementation
389- Well-suited for custom relay development
390
391**Weaknesses:**
392- Framework, not standalone relay (requires custom code)
393- Aggressive defaults might be too restrictive for some use cases
394- Go-based (not applicable to ngit-grasp, but worth noting)
395
396---
397
398## Comparative Summary
399
400| Feature | strfry | nostr-rs-relay | khatru (sane defaults) |
401|---------|--------|----------------|------------------------|
402| **Max Event Size** | 64 KB | 128 KB | 500 KB |
403| **Max WS Message** | 128 KB | 128 KB | 500 KB |
404| **Max Subs/Connection** | 20 | ∞ (unlimited) | ∞ (unlimited) |
405| **Max Filters/REQ** | 200 | ∞ (unlimited) | Complexity-based (4 items, 2 tags) |
406| **Event Rate Limit** | Plugin-based | 0 (unlimited default) | **2 per 3min per IP** |
407| **REQ Rate Limit** | None built-in | 0 (unlimited default) | **20/min per IP** |
408| **Connection Rate** | None built-in | None | **1 per 5min per IP** |
409| **Future Event Rejection** | 15 minutes | 30 minutes | Policy-based |
410| **Rate Limit Technique** | External plugins | Averaged over 1 minute | Token bucket (atomic) |
411| **Backpressure** | Query pause/resume | Buffering + blocking | Framework hooks |
412| **Default Philosophy** | Permissive + plugins | **Dangerously permissive** | **Conservative** |
413| **Per-IP Tracking** | Metrics only | No | Yes (all limits) |
414| **Production Ready** | Yes (with config) | Yes (with config) | Framework (DIY) |
415
416## Rust Rate Limiting Ecosystem
417
418### Governor Crate
419
420**Repository:** https://github.com/boinkor-net/governor
421**Documentation:** https://docs.rs/governor/
422**Version:** 0.10.4 (stable)
423
424#### Overview
425
426Governor is the most popular rate limiting library in the Rust ecosystem. It implements the **Generic Cell Rate Algorithm (GCRA)**, which is equivalent to a token bucket but more space-efficient.
427
428#### Features
429
430- **Thread-safe**: Uses atomic operations for lock-free operation
431- **Per-key rate limiting**: Built-in support via `DefaultKeyedRateLimiter`
432- **Direct rate limiting**: Single-state limiter via `DefaultDirectRateLimiter`
433- **Async/await support**: Works with Tokio and other async runtimes
434- **Jitter support**: Built-in jitter for avoiding thundering herd
435- **Dashmap integration**: Uses `dashmap` for concurrent key-value storage
436- **Quota system**: Flexible quota definitions (per second, minute, hour, etc.)
437
438#### Example Usage
439
440```rust
441use std::num::NonZeroU32;
442use nonzero_ext::*;
443use governor::{Quota, RateLimiter};
444
445// Simple direct rate limiter
446let mut lim = RateLimiter::direct(Quota::per_second(nonzero!(50u32)));
447assert_eq!(Ok(()), lim.check());
448
449// Keyed rate limiter (e.g., per IP)
450use governor::state::{InMemoryState, keyed::DefaultKeyedRateLimiter};
451use std::net::IpAddr;
452
453let limiter = RateLimiter::keyed(Quota::per_minute(nonzero!(10u32)));
454let ip: IpAddr = "192.168.1.1".parse().unwrap();
455if limiter.check_key(&ip).is_err() {
456 // Rate limit exceeded for this IP
457}
458```
459
460#### Dependencies
461
462- `cfg-if` - Configuration
463- `dashmap` (optional) - Concurrent hashmap for keyed limiters
464- `parking_lot` (optional) - More efficient mutexes
465- `quanta` (optional) - High-resolution timing
466- `portable-atomic` - Atomic operations
467- `nonzero_ext` - NonZero integer utilities
468
469#### Pros
470
471- Industry standard, widely used
472- Well-maintained and documented
473- Efficient implementation (atomic operations)
474- Flexible quota system
475- Works with async
476
477#### Cons
478
479- Additional dependency (though well-vetted)
480- Slightly more complex API than hand-rolled solution
481- Uses more memory for keyed limiters with many keys
482
483### Alternative: Extend Existing ThrottleManager
484
485ngit-grasp already has a working rate limiter in `src/purgatory/sync/throttle.rs`:
486
487```rust
488pub struct ThrottleManager {
489 throttles: DashMap<String, Mutex<DomainThrottle>>,
490 max_concurrent_per_domain: u32,
491 max_per_minute_per_domain: u32,
492}
493```
494
495**Sliding window implementation:**
496```rust
497let recent_count = self.request_times
498 .iter()
499 .filter(|t| now.duration_since(**t) < window)
500 .count();
501recent_count < self.max_per_minute as usize
502```
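
The fragment above shows only the counting step. A self-contained sliding window limiter in the same spirit might look like the sketch below. It is an illustration under assumptions, not the contents of `throttle.rs`: the `SlidingWindow` type, its fields, and the `try_acquire_at` method are hypothetical names.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical sliding window limiter: allows at most `max_per_window`
/// requests within any trailing `window` of time.
struct SlidingWindow {
    max_per_window: usize,
    window: Duration,
    request_times: VecDeque<Instant>,
}

impl SlidingWindow {
    fn new(max_per_window: usize, window: Duration) -> Self {
        Self { max_per_window, window, request_times: VecDeque::new() }
    }

    /// Records and allows a request at `now` if the window has capacity.
    fn try_acquire_at(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window so memory use
        // stays bounded by `max_per_window`.
        while let Some(&oldest) = self.request_times.front() {
            if now.saturating_duration_since(oldest) >= self.window {
                self.request_times.pop_front();
            } else {
                break;
            }
        }
        if self.request_times.len() < self.max_per_window {
            self.request_times.push_back(now);
            true
        } else {
            false
        }
    }
}
```

The trade-off against GCRA is visible here: a sliding window stores one timestamp per recent request, whereas GCRA keeps a single value per key.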
503
504#### Pros of Reusing
505
506- No new dependencies
507- Already proven to work in production
508- Team familiarity with the code
509- Consistent patterns across codebase
510
511#### Cons of Reusing
512
513- More maintenance burden
514- May not handle all edge cases
515- Less efficient than GCRA algorithm
516- Would need to be generalized for different use cases
517
518## Recommendations for ngit-grasp
519
520### 1. Rate Limiting Library Choice
521
522**Recommendation: Use `governor` crate**
523
524**Reasoning:**
525- Industry standard with proven track record
526- More efficient than our sliding window approach
527- Handles edge cases we might miss
528- Good async support for our Tokio-based architecture
529- Active maintenance and community support
530- Minimal overhead (atomic operations, lock-free)
531
532### 2. Default Philosophy
533
534**Recommendation: Conservative defaults with clear relaxation path**
535
536**Reasoning:**
537- Following khatru's approach: secure by default
538- Better to start restrictive and allow operators to relax
539- Prevents "configuration debt" where operators forget to harden
540- ngit-grasp is infrastructure software - security should be default
541- Clear documentation on how to adjust for different use cases
542
543### 3. Proposed Default Values
544
545Based on research and ngit-grasp's specific use case (git-over-nostr relay):
546
547```toml
548# Connection Limits
549NGIT_MAX_CONNECTIONS_GLOBAL = 1000
550NGIT_MAX_CONNECTIONS_PER_IP = 10
551NGIT_CONNECTION_RATE_PER_IP = "5/minute" # 5 connections per minute per IP
552
553# Subscription (REQ) Limits
554NGIT_MAX_SUBSCRIPTIONS_PER_CONNECTION = 20
555NGIT_MAX_FILTERS_PER_REQ = 100
556NGIT_SUBSCRIPTION_RATE_PER_IP = "30/minute" # 30 REQs per minute per IP
557
558# Event Ingestion Limits
559NGIT_EVENT_RATE_PER_IP = "10/minute" # 10 events per minute per IP
560NGIT_EVENT_RATE_BURST = 30 # Allow burst up to 30
561NGIT_MAX_EVENT_SIZE_BYTES = 131072 # 128 KB (matches nostr-rs-relay)
562NGIT_MAX_WEBSOCKET_MESSAGE_BYTES = 131072 # 128 KB
563
564# HTTP Endpoint Protection
565NGIT_HTTP_RATE_PER_IP = "60/minute" # 60 HTTP requests per minute per IP
566
567# Time-based Event Restrictions
568NGIT_REJECT_EVENTS_NEWER_THAN_SECONDS = 900 # 15 minutes (matches strfry)
569NGIT_REJECT_EVENTS_OLDER_THAN_SECONDS = 94608000 # ~3 years (matches strfry)
570
571# Whitelist
572NGIT_RATE_LIMIT_WHITELIST_IPS = "" # Comma-separated IPs exempt from rate limits
573```
574
575**Rationale for values:**
576- **Connections:** 10/IP is conservative but allows legitimate multi-client use
577- **Subscriptions:** 20/connection matches strfry, reasonable for typical clients
578- **Events:** 10/min is more permissive than khatru (2 per 3min) but still protective
579- **Message size:** 128 KB matches industry standard (nostr-rs-relay, strfry's WS message size)
580- **HTTP:** 60/min allows normal browsing without allowing scraping abuse
581
582### 4. Implementation Phases
583
584**Phase 1: Core DoS Prevention (High Priority)**
585- Connection limits (global and per-IP)
586- Basic event rate limiting (per-IP)
587- Message size limits
588- WebSocket message limits
589
590**Phase 2: Advanced Subscription Protection (Medium Priority)**
591- Subscription limits per connection
592- Filter complexity limits
593- Subscription rate limiting per IP
594
595**Phase 3: HTTP & Advanced Features (Lower Priority)**
596- HTTP endpoint rate limiting
597- IP whitelisting
598- Fine-grained metrics for rate limit hits
599- Configurable rejection messages
600
601### 5. Configuration Management
602
603Following AGENTS.md requirements, ALL configuration changes must update:
604
6051. **`src/config.rs`** - Add fields with proper env var names and defaults
6062. **`docs/reference/configuration.md`** - Document each option with examples
6073. **`nix/module.nix`** - Add NixOS options in `instanceOptions`
6084. **`.env.example`** - Add options with comments
609
610### 6. Metrics & Observability
611
612Add Prometheus metrics for:
613- `ngit_rate_limit_hits_total{limit_type, reason}` - Counter of rate limit hits
614- `ngit_connections_active` - Current active connections
615- `ngit_connections_per_ip` - Histogram of connections per IP
616- `ngit_subscriptions_active` - Current active subscriptions
617- `ngit_rate_limit_whitelisted_requests_total` - Requests from whitelisted IPs
618
619### 7. Testing Strategy
620
621- **Unit tests**: Test rate limiter logic in isolation
622- **Integration tests**: Use `TestRelay` to verify limits enforced
623- **Fuzz testing**: Random patterns to ensure no panics
624- **Load testing**: Verify performance under rate-limited load
625- **Metrics verification**: Ensure metrics accurately reflect limit hits
626
627## Common Attack Patterns
628
629Based on production relay operator experiences:
630
6311. **Connection flooding** - Open thousands of connections to exhaust file descriptors
6322. **Subscription spam** - Open many REQs per connection to consume memory
6333. **Event spam** - Submit events rapidly to overwhelm storage/processing
6344. **Large message attacks** - Send huge WebSocket frames to consume bandwidth
6355. **Complex filter DoS** - Submit filters with thousands of authors/kinds to slow queries
6366. **Slow read attack** - Connect but never read, filling write buffers
6377. **Time-based attacks** - Events with extreme timestamps to bypass caching
6388. **Metrics scraping** - Hammer `/metrics` endpoint to consume CPU
639
640All of these are addressed by the proposed implementation.
641
642## Open Questions
643
6441. **Should we implement per-pubkey rate limiting** (like khatru) in addition to per-IP?
645 - Useful for authenticated scenarios
646 - Requires NIP-42 AUTH support
647 - Could be Phase 4
648
6492. **Should ephemeral events have different limits?**
650 - strfry has special handling for ephemeral events
651 - Consider separate retention and rate limits
652
6533. **Should we support dynamic limit adjustment?**
654 - Allow hot-reloading of limits without restart
655 - Useful for responding to active attacks
656
6574. **How should we handle IPv6?**
658 - Rate limit by /64 or /128?
659 - Per-address might be too granular for IPv6
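
If the /64 option were chosen, the rate limit key could be derived by masking the interface identifier before looking up the limiter. The sketch below is one possible approach using only the standard library; the function name `rate_limit_key` is hypothetical and the /64 prefix length is an assumption, since the question above is still open.

```rust
use std::net::{IpAddr, Ipv6Addr};

/// Hypothetical helper: maps a client address to the key used for rate
/// limiting. IPv4 addresses are kept as-is; IPv6 addresses are truncated
/// to their /64 prefix so one subnet shares a single bucket.
fn rate_limit_key(ip: IpAddr) -> IpAddr {
    match ip {
        IpAddr::V4(_) => ip,
        IpAddr::V6(v6) => {
            let s = v6.segments();
            // Keep the first four 16-bit segments (64 bits), zero the rest.
            IpAddr::V6(Ipv6Addr::new(s[0], s[1], s[2], s[3], 0, 0, 0, 0))
        }
    }
}
```

A keyed limiter would then be queried with `rate_limit_key(addr.ip())` instead of the raw address, so a client cannot evade limits by rotating addresses within its own /64.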
660
661## References
662
663- strfry repository: https://github.com/hoytech/strfry
664- strfry config: https://github.com/hoytech/strfry/blob/master/strfry.conf
665- nostr-rs-relay repository: https://git.sr.ht/~gheartsfield/nostr-rs-relay
666- khatru repository: https://github.com/fiatjaf/khatru
667- khatru policies: https://github.com/fiatjaf/khatru/tree/master/policies
668- governor crate: https://docs.rs/governor/
669- GCRA algorithm: https://en.wikipedia.org/wiki/Generic_cell_rate_algorithm
diff --git a/docs/explanation/defensive-measures.md b/docs/explanation/defensive-measures.md
index f7abc30..51f7278 100644
--- a/docs/explanation/defensive-measures.md
+++ b/docs/explanation/defensive-measures.md
@@ -2,6 +2,8 @@
2 2
3 3 This document describes the defensive measures implemented in ngit-grasp to protect against abuse, spam, and denial-of-service attacks.
4 4
5**Note:** A point-in-time analysis of defensive measures in other Nostr relays (strfry, nostr-rs-relay, khatru) was conducted to inform these design decisions. The analysis examined connection limits, rate limiting approaches, and per-IP enforcement strategies across the ecosystem.
6
5## Overview 7## Overview
6 8
7ngit-grasp employs multiple layers of defense: 9ngit-grasp employs multiple layers of defense:
@@ -35,7 +37,7 @@ These limits prevent individual connections from overwhelming the relay.
 - **Privacy:** IP addresses never exposed in Prometheus metrics, only aggregate counts
 - Logs warnings when threshold exceeded
 
-**Future:** Could be extended to enforce per-IP connection limits.
+**Note on enforcement:** Per-IP connection limits are not built into rust-nostr relay-builder (tracks per WebSocket connection, not per IP). If abuse is detected via metrics, enforcement should be implemented as a PR to rust-nostr/relay-builder to benefit the entire Nostr ecosystem, rather than custom code in ngit-grasp.
 
 ### Content Filtering (Blacklists/Whitelists)
 
@@ -111,69 +113,21 @@ These limits prevent individual connections from overwhelming the relay.
 
 **To implement:** Would require custom middleware/WritePolicy to aggregate across connections from the same IP.
 
-### Total Connection Limit
-
-**Status:** Supported by relay-builder but not currently configured in ngit-grasp.
-
-**To implement:** Add `max_connections(n)` to relay builder configuration.
-
 ### Query Filtering
 
 **Status:** QueryPolicy trait available but not currently used.
 
 **Potential uses:** Rate limit queries per IP, block expensive queries, restrict access to certain event kinds.
 
-## Future Enhancements: Per-IP Rate Limiting (Deferred)
-
-### Decision: Defer Until Abuse Detected
-
-After comprehensive review (2026-01-14), we decided to defer per-IP rate limiting (Phase 2 & 3) until abuse patterns are detected in production.
-
-**Current protection (Phase 1):**
-- Per-connection limits: 500 subscriptions, 60 events/min
-- Total connection limit: 500 (configurable via `NGIT_MAX_CONNECTIONS`)
-- Connection monitoring: Tracks IPs, flags abuse at 10 connections
-- Content filtering: Event blacklist, repository blacklist/whitelist
-
-**Deferred features (Phase 2 & 3):**
-- Per-IP connection enforcement (reject after 10 connections)
-- Per-IP event rate limiting (reject after 100 events/min)
+## Future Enhancements
 
-### Rationale for Deferral
-
-1. **Config-only approach sufficient** - Total connection limit addresses primary DoS vector
-2. **Git relay context** - Developer users less likely to abuse than general public
-3. **Existing protections strong** - Per-connection limits + content filtering already robust
-4. **Data-driven approach** - Monitor ConnectionTracker metrics, implement if needed
-5. **Minimal maintenance** - Avoid custom rate limiting code until proven necessary
-
-### Implementation Path if Needed
-
-**Preferred approach:** Contribute to rust-nostr/relay-builder as PR
-- Propose IP-based rate limiting as optional feature
-- Let upstream maintain the code
-- Benefits entire Nostr ecosystem
-
-**Fallback:** Implement in ngit-grasp
-- Per-IP connection enforcement via actix middleware
-- Per-IP event rate limiting via token bucket in WritePolicy
-- See issue d6ee for detailed implementation plan
-
-### Monitoring for Abuse
-
-Watch these metrics to determine if Phase 2 is needed:
-- `ngit_connections_per_ip` - IPs exceeding 10 connections
-- `ngit_flagged_abusers` - IPs flagged by ConnectionTracker
-- Event publishing patterns from single IPs
+### Per-IP Rate Limiting
 
-**Trigger for Phase 2:** If abuse detected for 2-4 weeks after Phase 1 deployment
+Per-IP connection and event rate limiting were considered but deferred until abuse is detected in production. The current protections (per-connection limits, total connection limit, content filtering) are sufficient for the git relay use case.
 
-### Related Work
+**Decision rationale:** The primary DoS vector is connection exhaustion, which is addressed by the total connection limit (`NGIT_MAX_CONNECTIONS`). Per-IP enforcement would require custom middleware in rust-nostr relay-builder (which currently tracks limits per WebSocket connection, not per IP). If abuse is detected via the per-IP monitoring metrics, enforcement should be implemented as a PR to rust-nostr/relay-builder to benefit the entire Nostr ecosystem.
 
-**Git endpoint throttling:** Separate concern, tracked in issue ff38
-- Git HTTP endpoints have different threat model (bandwidth/CPU intensive)
-- Requires separate IP-based throttling (5 concurrent, 30/min per IP)
-- No interaction with relay code
+**Related:** Git endpoint throttling (issue ff38) is a separate concern with different requirements.
 
 ## Summary Table
 
@@ -198,7 +152,7 @@ Watch these metrics to determine if Phase 2 is needed:
 | Naughty list | ✅ Active | Yes | Yes (12h default) |
 | Rate limit detection | ✅ Active | Yes | Automatic |
 | Domain throttling | ✅ Active | Yes | Hardcoded (5/30) |
-| **Deferred (Phase 2)** |
+| **Not Implemented** |
 | Per-IP connection limit | ⚠️ Deferred | No | - |
 | Per-IP rate limiting | ⚠️ Deferred | No | - |
 | Query filtering | ⚠️ Available | No | Not implemented |