# Monitoring
ngit-grasp exposes Prometheus metrics at `/metrics` for monitoring WebSocket connections, Git operations, Nostr events, and system health.
## Architecture
```mermaid
flowchart TB
subgraph ngit-grasp
HTTP[HTTP Service]
WS[WebSocket Handler]
GIT[Git Handlers]
RELAY[Nostr Relay]
subgraph Metrics Module
REG[Prometheus Registry]
CT[ConnectionTracker]
MC[Metric Counters]
end
ME["/metrics endpoint"]
end
subgraph External
PROM[Prometheus Server]
GRAF[Grafana]
ADMIN[Admin Browser]
end
HTTP --> ME
WS --> CT
WS --> MC
GIT --> MC
RELAY --> MC
CT --> REG
MC --> REG
REG --> ME
PROM -->|scrape /metrics| ME
GRAF -->|query| PROM
ADMIN -->|view dashboards| GRAF
```
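As a rough illustration of the flow in the diagram, the sketch below shows handlers incrementing shared counters that a registry renders in the Prometheus text exposition format. It uses only the standard library. The `Counter` and `Registry` types and the metric name `ngit_git_operations_total` are hypothetical and not taken from the ngit-grasp source, which may well use a Prometheus client crate instead.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// A monotonically increasing counter shared between a handler and the registry.
#[derive(Default)]
pub struct Counter(AtomicU64);

impl Counter {
    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical registry: maps metric name -> (help text, counter).
#[derive(Default)]
pub struct Registry {
    counters: BTreeMap<String, (String, Arc<Counter>)>,
}

impl Registry {
    /// Register a counter and hand a shared handle back to the caller.
    pub fn register(&mut self, name: &str, help: &str) -> Arc<Counter> {
        let counter = Arc::new(Counter::default());
        self.counters
            .insert(name.to_string(), (help.to_string(), Arc::clone(&counter)));
        counter
    }

    /// Render all counters in the Prometheus text exposition format,
    /// which is what a `/metrics` handler would return as its body.
    pub fn render(&self) -> String {
        let mut out = String::new();
        for (name, (help, counter)) in &self.counters {
            out.push_str(&format!("# HELP {name} {help}\n"));
            out.push_str(&format!("# TYPE {name} counter\n"));
            out.push_str(&format!("{name} {}\n", counter.get()));
        }
        out
    }
}
```

A handler would keep the `Arc<Counter>` returned by `register` and call `inc()` on each operation, while the `/metrics` endpoint only ever calls `render()`.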
## Configuration
| Option | CLI Flag | Environment Variable | Default | Description |
|--------|----------|---------------------|---------|-------------|
| Metrics enabled | `--metrics-enabled` | `NGIT_METRICS_ENABLED` | `true` | Enable the `/metrics` endpoint |
| Abuse threshold | `--abuse-threshold` | `NGIT_ABUSE_THRESHOLD` | `10` | Max connections per IP before flagging |
| Top N repos | `--top-n-repos` | `NGIT_TOP_N_REPOS` | `10` | Number of top bandwidth repos to track |
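
The table above can be read as a small configuration record with defaults. The sketch below resolves the three environment variables through a caller-supplied lookup function, so it can be exercised without touching the real process environment. `MetricsConfig` and `from_lookup` are hypothetical names. How ngit-grasp actually parses values (for example, which strings count as `true`), how it treats invalid input, and how CLI flags take precedence are assumptions here, not confirmed behavior.

```rust
/// Hypothetical mirror of the options in the configuration table.
#[derive(Debug, Clone, PartialEq)]
pub struct MetricsConfig {
    pub metrics_enabled: bool,
    pub abuse_threshold: u32,
    pub top_n_repos: usize,
}

impl Default for MetricsConfig {
    fn default() -> Self {
        // Defaults taken from the configuration table.
        Self {
            metrics_enabled: true,
            abuse_threshold: 10,
            top_n_repos: 10,
        }
    }
}

impl MetricsConfig {
    /// Build a config from any key -> value lookup,
    /// e.g. `|key| std::env::var(key).ok()`.
    /// Assumption: missing or unparsable values fall back to the defaults.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let defaults = Self::default();
        Self {
            metrics_enabled: lookup("NGIT_METRICS_ENABLED")
                .and_then(|v| v.trim().parse::<bool>().ok())
                .unwrap_or(defaults.metrics_enabled),
            abuse_threshold: lookup("NGIT_ABUSE_THRESHOLD")
                .and_then(|v| v.trim().parse::<u32>().ok())
                .unwrap_or(defaults.abuse_threshold),
            top_n_repos: lookup("NGIT_TOP_N_REPOS")
                .and_then(|v| v.trim().parse::<usize>().ok())
                .unwrap_or(defaults.top_n_repos),
        }
    }
}
```

Calling `MetricsConfig::from_lookup(|key| std::env::var(key).ok())` would read the real environment under these assumptions.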
## Privacy Model
IP addresses are **never exposed in Prometheus metrics**. The connection tracker maintains per-IP counts internally only for abuse detection:
| Data | Exposed in Metrics? |
|------|---------------------|
| Total connections | ✅ Yes |
| Unique IP count | ✅ Yes |
| Flagged abuser count | ✅ Yes |
| Actual IP addresses | ❌ No (internal only) |
| IP + abuse flag | ⚠️ Logs only (when flagged) |
When an IP exceeds the abuse threshold, a warning containing that IP is written to the logs, but the IP is never exposed via Prometheus.
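
One way to picture this split between internal state and exported data is a tracker whose per-IP map is private and whose only public view is a set of aggregates. The following is a minimal sketch under that assumption. The method names, the `ConnectionSnapshot` type, and the choice to signal "newly flagged" through a return value are hypothetical and do not describe the actual `ConnectionTracker` in ngit-grasp.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Aggregates that are safe to export; note there is no IP field.
#[derive(Debug, PartialEq)]
pub struct ConnectionSnapshot {
    pub total_connections: u64,
    pub unique_ips: u64,
    pub flagged_abusers: u64,
}

/// Hypothetical tracker: per-IP counts stay private to this struct.
pub struct ConnectionTracker {
    per_ip: HashMap<IpAddr, u64>,
    abuse_threshold: u64,
}

impl ConnectionTracker {
    pub fn new(abuse_threshold: u64) -> Self {
        Self { per_ip: HashMap::new(), abuse_threshold }
    }

    /// Record a new connection. Returns true only for the connection that
    /// first pushes the IP over the threshold, so the caller can log once.
    pub fn connect(&mut self, ip: IpAddr) -> bool {
        let count = self.per_ip.entry(ip).or_insert(0);
        *count += 1;
        *count == self.abuse_threshold + 1
    }

    /// Record a closed connection and drop the entry once it reaches zero.
    pub fn disconnect(&mut self, ip: IpAddr) {
        let now_empty = match self.per_ip.get_mut(&ip) {
            Some(count) => {
                *count -= 1;
                *count == 0
            }
            None => false,
        };
        if now_empty {
            self.per_ip.remove(&ip);
        }
    }

    /// The only view intended for the metrics endpoint: counts, never addresses.
    pub fn snapshot(&self) -> ConnectionSnapshot {
        ConnectionSnapshot {
            total_connections: self.per_ip.values().sum(),
            unique_ips: self.per_ip.len() as u64,
            flagged_abusers: self
                .per_ip
                .values()
                .filter(|&&count| count > self.abuse_threshold)
                .count() as u64,
        }
    }
}
```

Because `ConnectionSnapshot` has no address field, nothing built from it can leak an IP; the caller of `connect` is the only place where the address and the flag meet, which is where the log warning would be emitted.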
## Deployment
See [Prometheus Setup Guide](../how-to/prometheus-setup.md) for NixOS configuration and Grafana dashboard provisioning.
## Future: Load-Based Sync Scheduling (GRASP-02)
The metrics infrastructure enables future load-based scheduling for GRASP-02 sync jobs:
```mermaid
flowchart TD
SYNC[Sync Manager] --> CHECK{Check Load}
CHECK --> MET[Query Metrics]
MET --> CONN{Connections > N?}
CONN -->|Yes| DELAY[Delay 5 min]
CONN -->|No| RUN[Run Sync Job]
DELAY --> CHECK
```
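
The loop in the diagram could be sketched as follows. This is not part of ngit-grasp today: `decide`, `run_when_idle`, and `SyncDecision` are hypothetical names, the value of `N` (`max_connections`) is not specified in this document, and the connection count is supplied by a closure standing in for whatever metrics query the sync manager would use.

```rust
use std::time::Duration;

/// What the sync manager should do next.
#[derive(Debug, PartialEq)]
pub enum SyncDecision {
    RunNow,
    Delay(Duration),
}

/// The 5 minute delay from the diagram.
pub const BUSY_DELAY: Duration = Duration::from_secs(5 * 60);

/// Decide based on the current connection count; `max_connections` is the
/// `N` in the diagram (assumed to be configurable).
pub fn decide(active_connections: u64, max_connections: u64) -> SyncDecision {
    if active_connections > max_connections {
        SyncDecision::Delay(BUSY_DELAY)
    } else {
        SyncDecision::RunNow
    }
}

/// Re-check the load until it drops to `max_connections` or below, then run
/// the job. `wait` is injected so callers can sleep for real while tests do
/// not; the function returns the total delay that was requested.
pub fn run_when_idle(
    mut read_connections: impl FnMut() -> u64,
    max_connections: u64,
    mut wait: impl FnMut(Duration),
    run_job: impl FnOnce(),
) -> Duration {
    let mut waited = Duration::ZERO;
    loop {
        match decide(read_connections(), max_connections) {
            SyncDecision::RunNow => break,
            SyncDecision::Delay(delay) => {
                wait(delay);
                waited += delay;
            }
        }
    }
    run_job();
    waited
}
```

In a blocking context `wait` could simply be `std::thread::sleep`; an async sync manager would need an equivalent built on its runtime's timer.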
## Future: Loki for Detailed Logging
For detailed per-repository investigation at scale, consider adding **Loki** (log aggregation):
- Structured logging with the `tracing` crate is already in place
- Loki queries enable ad-hoc deep dives (e.g., finding all transfers larger than 10 MB)
- Loki complements Prometheus, which remains the source for long-term trends
## Future: Sync Metrics (GRASP-02)
When GRASP-02 proactive sync is implemented, additional metrics will track:
- Events received from sync (live vs catchup)
- Active outbound relay connections
- Catchup gap (events found during catchup indicating sync failures)