upleb.uk

Public git repos — served from a NIP-34 GRASP relay at git.upleb.uk

Diffstat (limited to 'docs')
-rw-r--r-- docs/E2E_FIX_PLAN.md | 177
-rw-r--r-- docs/TOLLGATE_CORE_DESIGN.md | 446
-rw-r--r-- docs/WPA_AUTODETECT_PLAN.md | 102
3 files changed, 725 insertions, 0 deletions
diff --git a/docs/E2E_FIX_PLAN.md b/docs/E2E_FIX_PLAN.md
new file mode 100644
index 0000000..52f8305
--- /dev/null
+++ b/docs/E2E_FIX_PLAN.md
@@ -0,0 +1,177 @@
1# E2E Test Stability Fix Plan
2
3## Problem Statement
4
5E2E tests on physical boards are failing due to five root causes:
61. **LWIP socket exhaustion** (RC-0) — `LWIP_MAX_SOCKETS=10` was too low for two httpd servers + DNS + DoT + wifistr WebSockets
72. **Over-tuned httpd settings** (RC-1) — setting `max_open_sockets=2` and `keep_alive_enable=false` caused socket leaks by interfering with ESP-IDF's internal session management
83. **Owner auto-grant** (RC-2) — makes "no internet before auth" tests non-deterministic
94. **No boot-ready probe** (RC-3) — tests start before HTTP servers are up
105. **Serial monitoring resets** (RC-4) — Python `serial.Serial()` toggles DTR/RTS on USB-Serial/JTAG boards, causing chip resets mid-operation
11
12### Baseline Test Results (Board A, before fixes)
13
14| Suite | Pass | Fail | Notes |
15|---|---|---|---|
16| Smoke | 2/6 | 4 | Port 80 unresponsive, cascading failures |
17| Network | 4/7 | 3 | DNS forward + ping after auth (timing) |
18| API | 16/20 | 4 | Portal port 80 slow/crashed, captive URIs |
19| DNS+Firewall | 15/16 | 1 | Ping after auth (timing) |
20| Reset-Auth | 12/15 | 3 | Allotment was 0 (fixed), 2nd payment |
21| Session | 14/14 | 0 | Perfect |
22| Phase 2 | 12/12 | 0 | Perfect |
23
24### Verified Test Results (Board C on `/dev/ttyACM2`, after all fixes, commit `144b48f`)
25
26All API endpoints verified working on AP IP `10.192.45.1` with 2-3s delays between requests:
27- `GET /usage` — returns session/client counts (50/50 sequential requests passed)
28- `GET /portal-config` — returns `{priceSats, stepMs, mintUrl, metric, stepBytes}`
29- `GET /whoami` — returns client IP
30- `GET /grant_access` — grants firewall access
31- `POST /` (payment) — accepts Cashu token, returns `kind:1022`
32- `GET /` (port 80 portal) — returns 3829 bytes HTML
33- `GET /reset_authentication` — clears all sessions and firewall rules
34
35Full payment flow verified: check → pay → verify → grant → portal → reset → verify clean state.
36
37---
38
39## Root Causes
40
41### RC-0: LWIP socket exhaustion (FIXED)
42
43`CONFIG_LWIP_MAX_SOCKETS=10` in sdkconfig. Socket budget at steady state:
44
45| Component | Sockets | Notes |
46|---|---|---|
47| Captive portal (port 80) | 5 | 1 listen + 4 workers (default `max_open_sockets`) |
48| API server (port 2121) | 5 | 1 listen + 4 workers |
49| DNS server (UDP 53) | 1 | |
50| DoT reject (TCP 853) | 1 | |
51| wifistr WebSocket x2 | 2 | relay.damus.io + nos.lol |
52| **Total** | **14** | **Exceeds LWIP_MAX_SOCKETS=10 by 4** |
53
54**Fix** (commit `144b48f`): Set `CONFIG_LWIP_MAX_SOCKETS=20` (matching standalone tollgate), which leaves 6 sockets of headroom over the 14 required, and use the default `max_open_sockets=4` on both servers. An earlier fix tried `max_open_sockets=2`, which caused worse problems (see RC-1).
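
The budget in the table above can be checked with a few lines of arithmetic. This is a minimal sketch: the per-component counts are copied from the table, while the struct and function names are illustrative and not part of the firmware.

```c
#include <stddef.h>

/* One row of the socket budget table (type and names are illustrative). */
typedef struct {
    const char *component;
    int sockets;
} socket_budget_row_t;

static const socket_budget_row_t SOCKET_BUDGET[] = {
    { "captive portal (port 80)", 5 },  /* 1 listen + 4 workers */
    { "API server (port 2121)",   5 },  /* 1 listen + 4 workers */
    { "DNS server (UDP 53)",      1 },
    { "DoT reject (TCP 853)",     1 },
    { "wifistr WebSocket x2",     2 },
};

/* Sum the steady-state socket demand across all components. */
static int socket_budget_total(void)
{
    int total = 0;
    for (size_t i = 0; i < sizeof(SOCKET_BUDGET) / sizeof(SOCKET_BUDGET[0]); i++) {
        total += SOCKET_BUDGET[i].sockets;
    }
    return total;
}

/* Spare sockets under a given LWIP limit; a negative value means exhaustion. */
static int socket_budget_headroom(int lwip_max_sockets)
{
    return lwip_max_sockets - socket_budget_total();
}
```

With these counts, `socket_budget_headroom(10)` is negative (the original failure) and `socket_budget_headroom(20)` is positive. Any new listener or WebSocket should be added to the table before it ships.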
55
56### RC-1: Over-tuned httpd settings (FIXED)
57
58The initial fix reduced `max_open_sockets` to 2 and set `keep_alive_enable=false` and `linger_timeout=0`. This caused socket leaks: ESP-IDF's httpd manages its own session pool internally, and overriding these settings interfered with socket lifecycle management.
59
60**Symptoms**: Board works for 10-20 requests, then all HTTP becomes unresponsive. Sockets accumulate in CLOSE_WAIT/TIME_WAIT and never get freed.
61
62**Fix** (commit `144b48f`): Reverted to ESP-IDF defaults for all httpd settings except `stack_size=16384` and `max_uri_handlers`. Default `max_open_sockets=4` and `keep_alive_enable=true` (default) work correctly.
63
64### RC-2: Owner auto-grant (FIXED)
65
66`tollgate_core_client_connected()` granted firewall access to the first WiFi client unconditionally. IP was passed as `0` (bug), creating nondeterministic behavior.
67
68**Fix** (commit `c89ab31`): Removed `tollgate_core_fw_grant()` call from `client_connected()`. Owner tracking kept for logging.
69
70### RC-3: No boot-ready probe (PENDING)
71
72Tests use fixed sleeps after flash. No polling for HTTP server readiness.
73
74**Fix**: Add `arch-wait-ready` Makefile target that polls `:2121/usage`.
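
The polling logic behind such a target can be sketched independently of how the Makefile ends up invoking it. The following is an assumption-laden illustration, not the planned implementation: `wait_ready`, the callback types, and the stub helpers are hypothetical names, and the actual HTTP request to `:2121/usage` is left to the injected probe so the loop can be exercised off-board.

```c
#include <stdbool.h>

/* Hypothetical probe callback: returns true once GET :2121/usage answers. */
typedef bool (*ready_probe_fn)(void *ctx);

/* Hypothetical sleep callback, injected so the loop can be tested off-board. */
typedef void (*sleep_ms_fn)(int ms);

/*
 * Poll `probe` until it succeeds or `max_attempts` is used up.
 * Returns the 1-based attempt that succeeded, or 0 on timeout.
 */
static int wait_ready(ready_probe_fn probe, void *ctx,
                      sleep_ms_fn sleep_ms, int max_attempts, int delay_ms)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (probe(ctx)) {
            return attempt;
        }
        if (attempt < max_attempts) {
            sleep_ms(delay_ms);  /* wait before the next poll */
        }
    }
    return 0;
}

/* Example stub probe: reports ready once the counter in ctx reaches zero. */
static bool probe_countdown(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}

/* Example stub sleep that returns immediately. */
static void sleep_noop(int ms)
{
    (void)ms;
}
```

Bounding the loop by attempts rather than sleeping for a fixed time lets the test run start as soon as the server answers, and still fail loudly when the board never comes up.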
75
76### RC-4: Serial monitoring resets boards (DISCOVERED)
77
78Python `serial.Serial()` on USB-Serial/JTAG ESP32-S3 boards toggles DTR/RTS during initialization, causing `rst:0x15 (USB_UART_CHIP_RESET)`. This resets the chip even if `dtr=False, rts=False` is set after construction.
79
80**Symptoms**:
81- Board boots successfully, services start, gets IP
82- Python serial read causes immediate `ESP-ROM: boot:0x0 (DOWNLOAD)` or `rst:0x15`
83- Board appears "dead" after testing — actually reset into download mode
84- Earlier sessions attributed this to "socket exhaustion" or "WiFi instability"
85
86**Fix**: Never use Python `serial.Serial()` for monitoring. Use `idf.py monitor` (which handles DTR/RTS correctly) or read-only tools. All hardware access must go through Makefile mutex targets.
87
88---
89
90## Fix Steps
91
92### Step 0: Fix LWIP socket exhaustion — DONE
93- [x] Set `CONFIG_LWIP_MAX_SOCKETS=20` via sdkconfig (commit `144b48f`)
94- [x] Use default `max_open_sockets` on both HTTP servers (removed override)
95- [x] Verified: 50/50 sequential API requests pass on Board ACM2
96
97**Files**: `sdkconfig`, `main/captive_portal.c`, `main/tollgate_api.c`
98
99### Step 1: Kill owner auto-grant — DONE
100- [x] Remove `tollgate_core_fw_grant()` from `tollgate_core_client_connected()` (commit `c89ab31`)
101- [x] Keep owner tracking for logging
102
103**Files**: `components/tollgate_core/src/tollgate_core.c`
104
105### Step 2: HTTP server robustness — DONE
106- [x] Add `Connection: close` header to port 80 responses (commit `c89ab31`)
107- [x] Increase captive portal stack to 16384 (commit `c89ab31`)
108- [x] Use ESP-IDF default socket management (commit `144b48f`)
109
110**Files**: `main/captive_portal.c`, `main/tollgate_api.c`
111
112### Step 3: Add API endpoints — DONE
113- [x] `GET /portal-config` on port 2121 returning `{priceSats, mintUrl, ...}` (commit `c89ab31`)
114- [x] `GET /grant_access` — manual firewall grant (commit `c89ab31`)
115- [x] `GET /reset_authentication` — clear all auth (commit `c89ab31`)
116- [x] CORS header on portal-config
117
118**Files**: `main/tollgate_api.c`
119
120### Step 4: Remove NAPT flush from `fw_revoke_all()` — DONE
121- [x] Remove `ip_napt_enable()` toggle that caused 30s hangs (commit `c89ab31`)
122
123**Files**: `components/tollgate_core/src/tollgate_core_firewall.c`
124
125### Step 5: Boot-ready probe — PENDING
126- [ ] Add `arch-wait-ready` Makefile target that polls `:2121/usage`
127- [ ] Update `arch-test-full` to call `arch-wait-ready` first
128- [ ] Add 2-3 second delays between test requests (burst rate mitigation)
129
130**Files**: `physical-router-test-automation/esp32/Makefile`
131
132### Step 6: Hardware testing — BLOCKED
133- [ ] Flash to working board via Makefile mutex targets
134- [ ] Run `make arch-test-full`
135- [ ] Document results
136- [ ] Board A stuck in download mode (GPIO0 strapping pin) — needs hardware fix
137
138---
139
140## Burst Rate Limitation
141
142On USB-Serial/JTAG ESP32-S3 boards, back-to-back HTTP requests with no delay can
143overwhelm the WiFi AP stack. With 2-3 second delays between requests, the board
144handles 50+ sequential requests reliably. Without delays, rapid bursts of 10+
145requests can cause the WiFi AP to become unresponsive.
146
147**Mitigation**: E2E tests should include a 2-3 second delay between HTTP requests.
148This is a WiFi AP throughput limitation, not a firmware bug.
149
150## Board Status
151
152| Board | Port | MAC | Status |
153|-------|------|-----|--------|
154| Board A | `/dev/ttyACM0` | `94:a9:90:2e:37:7c` | **BROKEN** — stuck in download mode (`boot:0x0`), GPIO0 strapping pin issue, needs hardware fix |
155| Board B | `/dev/ttyACM1` | `fc:01:2c:c5:50:50` | Unknown — newly discovered, needs firmware flash |
156| Board C | `/dev/ttyACM2` | `20:6e:f1:98:d7:08` | **WORKING** — all endpoints verified, payment flow tested |
157
158## Key Architecture Decisions
159
160- **Port 80**: Portal HTML + captive detection URIs only. No API, no state mutation.
161- **Port 2121**: All API operations (discovery, payment, grant, reset, whoami, usage, wallet, portal-config).
162- **Owner tracking**: Kept for logging/display, no longer grants free internet.
163- **Connection: close**: Set on ALL port 80 responses to hint that clients should not keep the connection open.
164- **Default httpd settings**: ESP-IDF's built-in session management works correctly. Do not override `max_open_sockets`, `keep_alive_enable`, `linger_timeout`, or timeouts.
165
166## Execution Order
167
168Steps 0-4 are DONE (commits `c89ab31`, `144b48f`).
169Step 5 (boot-ready probe) is next — code only, no hardware needed.
170Step 6 (validation) requires working board via Makefile mutex targets.
171
172## Hardware Access Rules
173
174- **ALWAYS** use Makefile mutex targets (`make arch-flash-a`, etc.) for hardware access
175- **NEVER** call `esptool.py` directly — bypasses mutex and conflicts with other sessions
176- **NEVER** use Python `serial.Serial()` for monitoring — causes DTR/RTS resets on USB-Serial/JTAG
177- Multiple opencode sessions may be active — mutex prevents board conflicts
diff --git a/docs/TOLLGATE_CORE_DESIGN.md b/docs/TOLLGATE_CORE_DESIGN.md
new file mode 100644
index 0000000..5132cf0
--- /dev/null
+++ b/docs/TOLLGATE_CORE_DESIGN.md
@@ -0,0 +1,446 @@
1# TollGate Core Component: Architecture Design
2
3## Goal
4
5Maintain all TollGate business logic in `esp32-tollgate` as a reusable ESP-IDF
6component (`tollgate_core`), and consume it in `esp-miner` (BitAxe) via the
7**IDF Component Manager**. No code duplication, no manual sync.
8
9## Current State (Pre-Refactoring)
10
11All TollGate modules live flat in `esp32-tollgate/main/`:
12
13```
14esp32-tollgate/main/
15 cashu.c / cashu.h
16 dns_server.c / dns_server.h
17 firewall.c / firewall.h
18 session.c / session.h
19 tollgate_api.c / tollgate_api.h
20 tollgate_client.c / tollgate_client.h
21 config.c / config.h
22 ...
23```
24
25The ESP-Miner port (`esp-miner/main/tollgate_*.c`) is a manual copy with edits:
26renamed prefixes (`cashu_` → `tollgate_cashu_`), NVS config instead of
27`config.h` singleton, removed wallet integration, moved cross-module wiring.
28
29### Shared Code by Module
30
31| Module | Shared % | Key Differences |
32|--------|----------|-----------------|
33| cashu | 73% | Config access, mint check parameterized |
34| dns_server | 74% | Minor logic reorder, logging stripped |
35| firewall | 94% | Cross-module DNS notification moved |
36| session | 79% | Bytes metric stripped, DNS notification added |
37| tollgate_api vs tollgate.c | 13% | Full rewrite (HTTP server vs library API) |
38| tollgate_client | 0% | No ESP-Miner equivalent |
39
40## Target Architecture
41
42### Directory Layout (in `esp32-tollgate`)
43
44```
45esp32-tollgate/
46 components/
47 tollgate_core/ ← shared ESP-IDF component
48 CMakeLists.txt
49 idf_component.yml ← component metadata for IDF Component Manager
50 include/
51 tollgate_core.h ← public API
52 tollgate_platform.h ← platform interface (config/state callbacks)
53 src/
54 tollgate_core_cashu.c ← from main/cashu.c
55 tollgate_core_cashu.h
56 tollgate_core_dns.c ← from main/dns_server.c
57 tollgate_core_dns.h
58 tollgate_core_firewall.c ← from main/firewall.c
59 tollgate_core_firewall.h
60 tollgate_core_session.c ← from main/session.c
61 tollgate_core_session.h
62 nucula_lib/ ← stays as-is (git submodule + wrapper)
63 CMakeLists.txt
64 nucula_wallet.cpp / .h
65 main/
66 tollgate_platform.c ← standalone impl of tollgate_platform.h
67 tollgate_api.c / .h ← standalone HTTP server (unchanged)
68 tollgate_client.c / .h ← standalone client mode (unchanged)
69 config.c / config.h ← standalone config (unchanged)
70 ...
71```
72
73### How ESP-Miner Consumes It
74
75In `esp-miner/main/idf_component.yml`:
76
77```yaml
78dependencies:
79 tollgate/core:
80 git: https://github.com/<user>/esp32-tollgate.git
81 path: components/tollgate_core
82```
83
84ESP-Miner provides only:
85
86```
87esp-miner/main/
88 tollgate_platform.c ← implements tollgate_platform.h (NVS config)
89 tollgate.c / .h ← ESP-Miner orchestrator (owner detection, WiFi events)
90 tollgate_page.html ← captive portal payment UI
91 lwip_tollgate_hooks.h ← LWIP hook (stays in esp-miner)
92 http_server.c ← modified to call tollgate_core API
93```
94
95### Why IDF Component Manager (not submodule)
96
97| Aspect | IDF Component Manager | Git Submodule |
98|--------|----------------------|---------------|
99| What's downloaded | Only `components/tollgate_core/` | Entire `esp32-tollgate` repo |
100| Update mechanism | Modify version in yml, rebuild | Manual `git submodule update` |
101| Transitive deps | Automatic (nucula_lib resolved) | Must manage manually |
102| CI/CD | Automatic on `idf.py build` | Needs `--recursive` clone |
103| Offline after first build | Yes (cached in managed_components) | Yes |
104| Contributor friction | Low (automatic) | Moderate (forgot --recursive) |
105
106ESP-Miner never reaches into tollgate_core's source tree. It calls a clean API
107and provides a platform implementation. This is exactly the "packaged API
108consumption" pattern the Component Manager is designed for.
109
110### Why Git Submodule for nucula (not Component Manager)
111
112nucula is consumed differently — it's a **raw source integration**:
113
114```cmake
115# nucula_lib/CMakeLists.txt reaches INTO the submodule and cherry-picks files:
116set(NUCULA_SRC ${CMAKE_CURRENT_SOURCE_DIR}/../../nucula_src/main)
117idf_component_register(
118 SRCS "nucula_wallet.cpp"
119 "${NUCULA_SRC}/crypto.c" # cherry-picked
120 "${NUCULA_SRC}/wallet.cpp" # cherry-picked
121 "${NUCULA_SRC}/cashu_json.cpp" # cherry-picked (6 of ~20 files)
122 "${NUCULA_SRC}/nut10.cpp"
123 "${NUCULA_SRC}/hex.c"
124 "${NUCULA_SRC}/http.c"
125 ...
126)
127```
128
129The Component Manager downloads packaged components — you get everything or
130nothing. You can't say "give me this component but only compile these 6 files
131from it." A git submodule gives you the raw source tree on disk, which is what
132cherry-picking requires.
133
134**Principle:** Need to reach into source tree and pick files? → Submodule.
135Only need a clean API? → Component Manager.
136
137### The Platform Interface
138
139```c
140// components/tollgate_core/include/tollgate_platform.h
141
142#ifndef TOLLGATE_PLATFORM_H
143#define TOLLGATE_PLATFORM_H
144
145#include <stdint.h>
146#include <stdbool.h>
147
148typedef struct {
149 // Config access (each project implements its own storage)
150 uint16_t (*get_price_sats)(void);
151 int32_t (*get_step_ms)(void);
152 const char * (*get_mint_url)(void);
153 const char * (*get_metric)(void); // "milliseconds" or "bytes"
154 int32_t (*get_step_bytes)(void);
155
156 // Time source
157 int64_t (*get_time_ms)(void);
158
159 // Wallet integration: called after proofs verified, before session create
160 // Return true to proceed, false to reject payment
161 // Can be NULL (accepts payment without spending proofs — double-spend risk)
162 bool (*spend_proofs)(const char *raw_token_json);
163} tollgate_platform_t;
164
165#endif
166```
167
168**Standalone implementation** (`main/tollgate_platform.c`):
169- Reads from `tollgate_config_get()` singleton (SPIFFS-backed)
170- `spend_proofs` calls `nucula_wallet_receive()` to swap proofs at the mint
171
172**ESP-Miner implementation** (`main/tollgate_platform.c`):
173- Reads from `nvs_config_get_*()` (NVS flash)
174- `spend_proofs` is initially NULL (Phase 1: accept without spending)
175- Later: calls nucula_wallet_receive when wallet component is integrated
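
To make the shape of a platform implementation concrete, here is a minimal sketch that fills in the interface above with fixed placeholder values. The struct is repeated only so the snippet compiles on its own; every `demo_` name and every value (price, step sizes, mint URL, fake clock) is an assumption for illustration, and a real port would read SPIFFS or NVS instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Copy of the interface above so the sketch compiles on its own. */
typedef struct {
    uint16_t     (*get_price_sats)(void);
    int32_t      (*get_step_ms)(void);
    const char * (*get_mint_url)(void);
    const char * (*get_metric)(void);   /* "milliseconds" or "bytes" */
    int32_t      (*get_step_bytes)(void);
    int64_t      (*get_time_ms)(void);
    bool         (*spend_proofs)(const char *raw_token_json);
} tollgate_platform_t;

/* Placeholder config values; a real port reads its own storage here. */
static uint16_t demo_price_sats(void)  { return 21; }
static int32_t  demo_step_ms(void)     { return 60000; }
static int32_t  demo_step_bytes(void)  { return 1048576; }
static const char *demo_mint_url(void) { return "https://mint.example.invalid"; }
static const char *demo_metric(void)   { return "milliseconds"; }

/* Fake monotonic clock that advances one second per call. */
static int64_t demo_time_ms(void)
{
    static int64_t fake_clock_ms;
    fake_clock_ms += 1000;
    return fake_clock_ms;
}

static const tollgate_platform_t DEMO_PLATFORM = {
    .get_price_sats = demo_price_sats,
    .get_step_ms    = demo_step_ms,
    .get_mint_url   = demo_mint_url,
    .get_metric     = demo_metric,
    .get_step_bytes = demo_step_bytes,
    .get_time_ms    = demo_time_ms,
    .spend_proofs   = NULL,  /* Phase 1 behavior: accept without spending */
};
```

Because the component only sees function pointers, the same core code can run against either storage backend, or against a table like this one in a host-side unit test.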
176
177### Wallet Integration: The Double-Spend Problem
178
179The `spend_proofs` hook exists because of a real security gap:
180
181```
182Client sends Cashu token
183 │
184 ▼
185cashu_decode_token() ← extract proofs
186 │
187 ▼
188cashu_check_proof_states() ← HTTP POST to mint /v1/checkstate: "unspent?"
189 │
190 ▼
191spend_proofs() ← THE CRITICAL STEP
192 │ standalone: nucula_wallet_receive() → swap at mint
193 │ esp-miner: NULL → skipped (double-spend window)
194 ▼
195session_create() ← grant client access
196```
197
198Without `spend_proofs`, a client can replay the same token on multiple devices.
199Both check "unspent?" → both say yes → both grant access. The swap step marks
200proofs as spent at the mint, closing the window.
201
202ESP-Miner accepts this risk initially. When `spend_proofs` is NULL, the
203component logs a warning. Phase 2 of ESP-Miner integration adds nucula and
204implements the hook.
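
The NULL-hook behavior described above can be sketched as a small gate that runs between the proof-state check and session creation. This is an illustration under assumptions: `gate_on_spend`, `payment_result_t`, and the example hooks are hypothetical names, and the warning goes to `stderr` here only as a stand-in for whatever logging the component uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same signature as the spend_proofs member of tollgate_platform_t. */
typedef bool (*spend_proofs_fn)(const char *raw_token_json);

typedef enum {
    PAYMENT_ACCEPTED,          /* proofs swapped at the mint */
    PAYMENT_ACCEPTED_UNSPENT,  /* hook was NULL: double-spend window stays open */
    PAYMENT_REJECTED,          /* hook ran and refused the token */
} payment_result_t;

/* Hypothetical gate run after the checkstate step and before session creation. */
static payment_result_t gate_on_spend(spend_proofs_fn spend_proofs,
                                      const char *raw_token_json)
{
    if (spend_proofs == NULL) {
        fprintf(stderr, "warning: spend_proofs is NULL, proofs were not swapped\n");
        return PAYMENT_ACCEPTED_UNSPENT;
    }
    return spend_proofs(raw_token_json) ? PAYMENT_ACCEPTED : PAYMENT_REJECTED;
}

/* Example hooks for exercising the gate. */
static bool spend_always_ok(const char *raw_token_json)
{
    (void)raw_token_json;
    return true;
}

static bool spend_always_fail(const char *raw_token_json)
{
    (void)raw_token_json;
    return false;
}
```

Returning a distinct result for the unspent case keeps the risk visible to the caller instead of hiding it behind a plain success value.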
205
206### Cross-Module Wiring (Internal to tollgate_core)
207
208The `session → firewall → dns_server` notification chain stays internal:
209
210```
211tollgate_core_session_create()
212 → tollgate_core_firewall_grant(ip)
213 → tollgate_core_dns_set_authenticated(ip, true)
214
215tollgate_core_session_revoke()
216 → tollgate_core_firewall_revoke(ip)
217 → tollgate_core_dns_set_authenticated(ip, false)
218```
219
220Consumers never see this. They call `tollgate_core_process_payment()` and
221`tollgate_core_tick()`. The internal wiring is an implementation detail.
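
A stripped-down model of that chain, tracking a single client, shows why consumers never need to touch the firewall or DNS modules directly. All `demo_` functions and variables are stand-ins invented for this sketch; the real modules track multiple clients and have their own signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in state for one client; the real modules track many. */
static bool demo_fw_allowed;
static bool demo_dns_authenticated;
static uint32_t demo_last_ip;

/* DNS module: records whether this client gets real answers. */
static void demo_dns_set_authenticated(uint32_t ip, bool authenticated)
{
    demo_last_ip = ip;
    demo_dns_authenticated = authenticated;
}

/* Firewall module: opens the path and notifies DNS. */
static void demo_firewall_grant(uint32_t ip)
{
    demo_fw_allowed = true;
    demo_dns_set_authenticated(ip, true);
}

/* Firewall module: closes the path and notifies DNS. */
static void demo_firewall_revoke(uint32_t ip)
{
    demo_fw_allowed = false;
    demo_dns_set_authenticated(ip, false);
}

/* Session module: the only layer a consumer-facing call reaches in this sketch. */
static void demo_session_create(uint32_t ip) { demo_firewall_grant(ip); }
static void demo_session_revoke(uint32_t ip) { demo_firewall_revoke(ip); }
```

Since each layer calls only the one below it, firewall and DNS state cannot drift apart as long as every grant and revoke goes through the session entry points.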
222
223### Full Dependency Graph
224
225```
226esp-miner
227 └── IDF Component Manager → tollgate_core (API-level boundary)
228 ├── CMakeLists.txt REQUIRES: nucula_lib
229 └── Platform: esp-miner provides tollgate_platform_t (NVS-backed)
230
231esp32-tollgate (standalone)
232 └── tollgate_core (local component, same repo)
233 ├── CMakeLists.txt REQUIRES: nucula_lib
234 └── Platform: main/tollgate_platform.c (config singleton-backed)
235
236nucula_lib (local component in esp32-tollgate)
237 └── cherry-picks source files from nucula_src/ (git submodule → zeugmaster/nucula)
238```
239
240### Dependency Chain for IDF Component Manager
241
242When `esp-miner` declares:
243
244```yaml
245dependencies:
246 tollgate/core:
247 git: https://github.com/<user>/esp32-tollgate.git
248 path: components/tollgate_core
249```
250
251The Component Manager:
2521. Clones `esp32-tollgate` (or fetches the component archive)
2532. Reads `tollgate_core/idf_component.yml` → finds dependency on `nucula_lib`
2543. Since `nucula_lib` is a sibling component in the same repo, resolves it
255 from the same clone
2564. Downloads into `managed_components/`
2575. `nucula_lib` depends on `secp256k1` (local component) and `nucula_src`
258 (submodule) — these must be available within the cloned repo
259
260**Note:** Whether the `nucula_src` git submodule gets initialized needs verification. The IDF
261Component Manager may or may not initialize submodules within a git-sourced
262dependency. If it does not, `nucula_lib` may need to bundle
263the required nucula source files directly instead of referencing a submodule.
264
265## Blocking Dependencies
266
267This refactoring **must not proceed** until these branches land on master:
268
269| Branch | Blocking Files | Status |
270|--------|---------------|--------|
271| `feature/multi-mint-support` | `cashu.c`, `tollgate_api.c`, `main/CMakeLists.txt`, `nucula_wallet.cpp/h`, `captive_portal.c`, `mint_health.c/h`, `config.c/h` | **In progress** |
272| `feature/price-discovery` | `tollgate_api.c`, `tollgate_client.c`, `main/CMakeLists.txt`, `config.c/h`, `beacon_price.c/h`, `market.c/h` | **In progress** |
273| `feature/cvm-integration` | Same commit as master — no new changes | **Merged already** |
274
275**Specific conflicts if we refactor now:**
276- Moving `cashu.c` → `tollgate_core_cashu.c` while multi-mint modifies `cashu.c`
277- Moving `dns_server.c` while price-discovery may touch it
278- Modifying `main/CMakeLists.txt` (remove SRCS) while all branches modify it
279- Modifying `tollgate_api.c` call sites while multi-mint and price-discovery modify it
280
281## Refactoring Plan (After Blocking PRs Merge)
282
283### Phase 0: Prerequisites
284
285- [ ] All blocking PRs merged to master
286- [ ] This branch rebased onto latest master
287- [x] Full build passes on master
288
289### Phase 1: Create Component Skeleton
290
291- [x] Create `components/tollgate_core/` directory structure
292- [x] Create `components/tollgate_core/include/tollgate_core.h` (public API)
293- [x] Create `components/tollgate_core/include/tollgate_platform.h` (platform interface)
294- [x] Create `components/tollgate_core/idf_component.yml` (component metadata)
295- [x] Create `components/tollgate_core/CMakeLists.txt` (register component)
296- [ ] Verify empty component builds without errors
297
298### Phase 2: Move Core Modules (one at a time, build after each)
299
300- [x] Copy `main/cashu.c/h` → `components/tollgate_core/src/tollgate_core_cashu.c/h`
301 - [x] Rename functions: `cashu_*` → `tollgate_core_cashu_*`
302 - [x] Replace `tollgate_config_get()` calls with parameterized arguments
303 - [x] Remove direct `config.h` include
304 - [ ] Build and verify
305- [x] Copy `main/dns_server.c/h` → `components/tollgate_core/src/tollgate_core_dns.c/h`
306 - [x] Rename functions: `dns_server_*` → `tollgate_core_dns_*`
307 - [x] No platform dependencies (pure LWIP) — clean copy
308 - [ ] Build and verify
309- [x] Copy `main/firewall.c/h` → `components/tollgate_core/src/tollgate_core_firewall.c/h`
310 - [x] Rename functions: `firewall_*` → `tollgate_core_firewall_*` / `tollgate_core_fw_*`
311 - [x] Internalize `dns_set_authenticated` calls (kept within component)
312 - [x] Remove `dns_server.h` external dependency
313 - [ ] Build and verify
314- [x] Copy `main/session.c/h` → `components/tollgate_core/src/tollgate_core_session.c/h`
315 - [x] Rename functions: `session_*` → `tollgate_core_session_*`
316 - [x] Replace `config.h` calls with platform callbacks for metric check
317 - [x] Internalize firewall notification (already calls firewall directly)
318 - [x] Support both time and bytes metrics (portable, not stripped)
319 - [ ] Build and verify
320
321### Phase 3: Wire Component API
322
323- [x] Implement `tollgate_core_init(const tollgate_platform_t *platform, esp_ip4_addr_t ap_ip)` — stores platform, inits all sub-modules
324- [x] Implement `tollgate_core_process_payment(ip, token)` — decode → verify → spend → create session
325- [x] Implement `tollgate_core_client_connected(mac, ip)` — owner detection + firewall check
326- [x] Implement `tollgate_core_client_disconnected(mac)` — session cleanup + owner reassign
327- [x] Implement `tollgate_core_tick()` — session expiry check
328- [x] Implement `tollgate_core_get_status_json()` — JSON status
329- [x] Implement `tollgate_core_get_config_json()` — JSON config (via platform)
330- [x] Build and verify standalone
331
332### Phase 4: Standalone Platform Implementation
333
334- [x] Create `main/tollgate_platform.c` implementing `tollgate_platform_t`
335 - [x] `get_price_sats` → `tollgate_config_get()->price_per_step`
336 - [x] `get_step_ms` → `tollgate_config_get()->step_size`
337 - [x] `get_mint_url` → `tollgate_config_get()->mint_url`
338 - [x] `get_metric` → `tollgate_config_get()->metric`
339 - [x] `get_step_bytes` → `tollgate_config_get()->step_bytes`
340 - [x] `get_time_ms` → `xTaskGetTickCount() * portTICK_PERIOD_MS`
341 - [x] `spend_proofs` → stub returning true (wallet called separately)
342- [x] Update `main/tollgate_api.c` to call `tollgate_core_*` instead of direct module calls
343- [x] Update `main/tollgate_main.c` init sequence
344- [x] Remove old `main/cashu.c`, `main/dns_server.c`, `main/firewall.c`, `main/session.c` from CMakeLists.txt
345- [x] Update `main/CMakeLists.txt` (remove old SRCS, add `tollgate_platform.c`, add `tollgate_core` to REQUIRES)
346- [x] Update `main/lwip_tollgate_hooks.h` to call `tollgate_core_ip4_canforward_filter`
347- [x] Full standalone build + test (verified: `c8c68dc` — build passes, 61/61 unit tests pass)
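
One detail worth checking in the `get_time_ms` mapping above: `xTaskGetTickCount()` returns a `TickType_t`, which is 32 bits wide in the default ESP-IDF FreeRTOS configuration, while the callback returns `int64_t`. If the multiplication is done at 32-bit width, the product wraps before it is widened. The sketch below uses stand-in types to show the difference. The `demo_` names and the 10 ms tick period are assumptions, and this plan does not state whether the shipped code widens first.

```c
#include <stdint.h>

typedef uint32_t demo_tick_t;      /* stand-in for a 32-bit TickType_t */
#define DEMO_TICK_PERIOD_MS 10u    /* stand-in for portTICK_PERIOD_MS at 100 Hz */

/* Multiplies at 32-bit width, so the product wraps once it exceeds UINT32_MAX. */
static int64_t ticks_to_ms_narrow(demo_tick_t ticks)
{
    return (int64_t)(ticks * DEMO_TICK_PERIOD_MS);
}

/* Widens to 64 bits first, so the product cannot wrap for any 32-bit tick count. */
static int64_t ticks_to_ms_wide(demo_tick_t ticks)
{
    return (int64_t)ticks * DEMO_TICK_PERIOD_MS;
}
```

For a tick count of 500,000,000 the wide version returns 5,000,000,000 ms, while the narrow version returns a much smaller wrapped value. A session expiry computed from the wrapped value would misfire after roughly 49.7 days of uptime.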
348
349### Phase 4.5: Physical Board E2E Testing (Board A)
350
351- [x] Create `tests/integration/helpers/network.mjs` (shared test utilities)
352- [x] Add arch test Makefile targets with mutex protection to `physical-router-test-automation/esp32/Makefile`
353- [x] Add top-level Makefile wrappers for arch tests
354- [ ] Acquire Board A mutex lock
355- [ ] Flash arch firmware to Board A
356- [ ] Verify boot via serial (no panics, services started)
357- [ ] Connect WiFi to Board A AP
358- [ ] Run smoke test (`arch-test-smoke`)
359- [ ] Run network test (`arch-test-network`)
360- [ ] Run API test (`arch-test-api`)
361- [ ] Run DNS + firewall test (`arch-test-dns-fw`)
362- [ ] Run reset auth test (`arch-test-reset`)
363- [ ] Run session expiry test (`arch-test-session`)
364- [ ] Run phase 2 API test (`arch-test-phase2`)
365- [ ] Commit and push test results
366- [ ] Release Board A mutex lock
367
368### Phase 5: ESP-Miner Integration
369
370- [ ] Update `esp-miner/main/idf_component.yml` to add tollgate_core dependency
371- [ ] Create `esp-miner/main/tollgate_platform.c` implementing `tollgate_platform_t`
372 - [ ] Config reads from NVS (`nvs_config_get_*`)
373 - [ ] `spend_proofs` = NULL initially (Phase 1: accept without spending)
374- [ ] Update `esp-miner/main/tollgate.c` to call `tollgate_core_*` API
375- [ ] Remove `esp-miner/main/tollgate_cashu.c`, `tollgate_dns.c`, `tollgate_firewall.c`, `tollgate_session.c`
376- [ ] Update `esp-miner/main/CMakeLists.txt` (remove old SRCS)
377- [ ] Full ESP-Miner build + test
378
379### Phase 6: Verify Component Manager Flow
380
381- [ ] Remove local `managed_components/` if present
382- [ ] Run `idf.py reconfigure` in esp-miner — verify Component Manager downloads tollgate_core
383- [ ] Run `idf.py build` — verify transitive dependency resolution (nucula_lib + nucula_src)
384- [ ] Test that submodule within nucula_src is properly initialized by Component Manager
385- [ ] If submodule init fails: bundle nucula source files directly in nucula_lib instead
386
387### Phase 7: Documentation and Cleanup
388
389- [ ] Update `esp-miner/main/idf_component.yml` with correct git URL
390- [ ] Update `esp-miner/TOLLGATE_PR_PLAN.md` to reflect component-based architecture
391- [ ] Add `docs/` to `tollgate_core` with integration guide for new consumers
392- [ ] Update `esp-miner/TOLLGATE_CHECKLIST.md`
393- [ ] Verify both projects build clean from scratch
394
395## Open Questions
396
397- [ ] Does the IDF Component Manager initialize git submodules within git-sourced dependencies?
398- [ ] Should tollgate_core publish to the ESP Component Registry (public) or stay git-only?
399- [ ] What versioning scheme for tollgate_core? (semver tags in esp32-tollgate?)
400
401## Performance Optimization Backlog
402
403### Burst Rate Limitation (KNOWN ISSUE)
404
405USB-Serial/JTAG ESP32-S3 boards have a WiFi AP throughput ceiling. Back-to-back
406HTTP requests with no delay (>10 requests/sec) can overwhelm the AP stack,
407causing TCP connections to time out. With 2-3 second delays between requests,
408the board handles 50+ sequential requests reliably.
409
410**Mitigation**: E2E tests include 2-3 second delays between requests. This is
411a WiFi AP limitation, not a firmware bug.
412
413### Serial Monitoring Causes Resets (DISCOVERED)
414
415Python `serial.Serial()` on USB-Serial/JTAG ESP32-S3 boards toggles DTR/RTS
416during initialization, causing `rst:0x15 (USB_UART_CHIP_RESET)`. This resets
417the chip even if `dtr=False, rts=False` is set post-construction. Multiple
418sessions accessing serial ports without mutex coordination compound the issue.
419
420**Mitigation**: All hardware access goes through Makefile mutex targets. Never
421use Python `serial.Serial()` directly. Use `idf.py monitor` for serial output.
422
423### Captive Detection Flood
424- [ ] Rate-limit or debounce captive detection URI handlers (`/generate_204`, `/hotspot-detect.html`, etc.) to prevent socket exhaustion from OS/browser probes
425- [ ] Consider single-handler approach: all captive URIs return a minimal 204/302 without processing HTML template
426- [ ] Evaluate `lru_purge_enable = true` with tuned `max_open_sockets` and `recv_wait_timeout`
427
428### Static Portal HTML (No Dynamic Template Substitution)
429- [ ] Replace `__AP_IP__`, `__PRICE__`, `__MINT_URL__` template substitution with static const HTML
430- [ ] Portal JS fetches config at load time from `:2121/` API (already returns `kind=10021` with `price_per_step` and mint URL)
431- [ ] Eliminates `malloc()` + `strstr()` loop per request — zero-computation static serve
432- [ ] Reduces portal handler latency from ~47s to near-instant
433
434### HTTP Server Tuning
435
436**IMPORTANT**: Use ESP-IDF defaults for `max_open_sockets`, `keep_alive_enable`,
437`linger_timeout`, `recv_wait_timeout`, and `send_wait_timeout`. Overriding these
438causes socket leaks (verified: `max_open_sockets=2` + `keep_alive_enable=false`
439caused complete socket exhaustion after 15-20 requests).
440
441- [x] Set `stack_size=16384` on both servers (fixed ESP_ERR_HTTPD_TASK)
442- [x] Set `CONFIG_LWIP_MAX_SOCKETS=20` (matches standalone tollgate)
443- [x] Use default `max_open_sockets=4` on both servers
444- [x] Separate `ctrl_port` values for portal vs API servers
445- [ ] Consider `lru_purge_enable = true` for production tuning
446- [ ] Should `tollgate_client.c` (client mode) eventually move into tollgate_core?
diff --git a/docs/WPA_AUTODETECT_PLAN.md b/docs/WPA_AUTODETECT_PLAN.md
new file mode 100644
index 0000000..8228b1a
--- /dev/null
+++ b/docs/WPA_AUTODETECT_PLAN.md
@@ -0,0 +1,102 @@
1# WPA Auto-Detect: SPIFFS-Based WiFi Security Configuration
2
3## Problem
4
5The ESP32-S3 firmware hardcodes `WIFI_AUTH_WPA3_PSK` as the STA auth threshold in
6`config.c:289`. When the upstream router uses WPA2-PSK only, the ESP32 scan filter
7rejects the AP and reports reason=211 (`WIFI_REASON_NO_AP_FOUND_IN_AUTHMODE_THRESHOLD`).
8
9## Root Cause
10
11```c
12// config.c:289 — BEFORE
13wifi_config->sta.threshold.authmode = WIFI_AUTH_WPA3_PSK;
14```
15
16The `threshold.authmode` field tells the ESP32 WiFi driver to only associate with APs
17that use the specified auth mode or a stronger one. With a WPA3 threshold, WPA2-only APs
18are filtered out during scan, so the driver never sees a candidate to connect to.
19
20## Solution
21
22Adopt the SPIFFS-based WPA auto-detect pattern from the multi-mint firmware
23(`physical-router-test-automation/esp32/Makefile`). The approach:
24
251. **Build time**: `detect-wpa-security` scans the host's WiFi to determine if the
26 target SSID advertises WPA2 or WPA3.
272. **SPIFFS generation**: `generate-spiffs` writes a `config.json` with the detected
28 `wifi_auth_mode` field.
293. **Flash**: SPIFFS partition is flashed separately from firmware, so config can be
30 updated without rebuilding.
314. **Runtime**: Firmware parses `wifi_auth_mode` from `config.json` and maps it to the
32 correct `wifi_auth_mode_t` threshold.
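
The runtime mapping in step 4 follows the rule in the checklist below (`"WPA3"` selects WPA3, anything else falls back to WPA2). A minimal sketch of that rule, using a stand-in enum so it compiles without ESP-IDF headers; `demo_auth_mode_t` and `auth_threshold_from_string` are hypothetical names, and the real code would return `wifi_auth_mode_t` values.

```c
#include <string.h>

/* Stand-ins for the two wifi_auth_mode_t values named in this plan. */
typedef enum {
    DEMO_AUTH_WPA2_PSK,
    DEMO_AUTH_WPA3_PSK,
} demo_auth_mode_t;

/*
 * Map the "wifi_auth_mode" string from config.json to a threshold.
 * Only an exact "WPA3" selects WPA3; anything else, including a missing
 * field (NULL), falls back to WPA2.
 */
static demo_auth_mode_t auth_threshold_from_string(const char *wifi_auth_mode)
{
    if (wifi_auth_mode != NULL && strcmp(wifi_auth_mode, "WPA3") == 0) {
        return DEMO_AUTH_WPA3_PSK;
    }
    return DEMO_AUTH_WPA2_PSK;
}
```

Falling back to WPA2 for unknown or missing values means a stale or empty SPIFFS image still lets the board join a WPA2 router, which is the failure this plan sets out to fix.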
33
34## Files to Modify
35
36### Firmware (`esp32-tollgate-arch`)
37
38| File | Change |
39|------|--------|
40| `main/config.h` | Add `wifi_auth_threshold` field to `tollgate_config_t` |
41| `main/config.c` | Parse `wifi_auth_mode` from config.json, set default to WPA2, use in `tollgate_config_get_wifi()` |
42
43### Test Automation (`physical-router-test-automation`)
44
45| File | Change |
46|------|--------|
47| `esp32/Makefile` | Add `arch-generate-spiffs`, `arch-flash-spiffs-a` targets |
48| `Makefile` | Add top-level wrappers |
49
50## Checklist
51
52### Firmware Changes
53
54- [x] Add `wifi_auth_threshold` field to `tollgate_config_t` in `config.h`
55- [ ] Set default `wifi_auth_threshold = WIFI_AUTH_WPA2_PSK` in `tollgate_config_init()`
56- [ ] Parse `"wifi_auth_mode"` string from config.json in `tollgate_config_init()`
57- [ ] Map `"WPA3"` → `WIFI_AUTH_WPA3_PSK`, anything else → `WIFI_AUTH_WPA2_PSK`
58- [ ] Replace hardcoded `WIFI_AUTH_WPA3_PSK` with `g_config.wifi_auth_threshold` in `tollgate_config_get_wifi()`
59- [ ] Build succeeds (`idf.py build`)
60
61### Makefile Changes
62
63- [ ] Add `arch-generate-spiffs` target to `esp32/Makefile`
64- [ ] Add `arch-flash-spiffs-a` target to `esp32/Makefile` (requires lock-a)
65- [ ] Add top-level wrappers in `Makefile`
66- [ ] Add help text entries
67
68### Build & Flash
69
70- [ ] Rebuild firmware with WPA auto-detect support
71- [ ] Acquire Board A lock
72- [ ] Run `detect-wpa-security` to confirm WPA2 detection
73- [ ] Run `arch-generate-spiffs` to build SPIFFS image
74- [ ] Run `arch-flash-a` to flash firmware (full erase + rebuild)
75- [ ] Run `arch-flash-spiffs-a` to flash SPIFFS with WPA2 config
76- [ ] Wait for boot, connect to Board A AP
77
78### Verification
79
80- [x] Serial log shows STA connected to upstream WiFi (no more reason=211)
81- [x] Serial log shows "TollGate services started"
82- [x] API on port 2121 reachable
83- [x] Portal on port 80 reachable
84- [x] Cashu payment works: `cashu send --legacy 21` → POST to `:2121` → kind=1022
85
86### E2E Tests
87
88- [x] `make arch-test-smoke` — **6/6 PASS** (was 5/6, internet now works!)
89- [x] `make arch-test-api` — 16/20 pass (4 test expectation mismatches)
90- [x] `make arch-test-dns-fw` — 9/15 pass (payment works! DNS hijack tests need env fix)
91- [x] `make arch-test-reset` — **11/13 pass** (payment+reset works, second payment token issue)
92- [x] `make arch-test-session` — 7/11 pass (session expiry works, renewal works)
93- [x] `make arch-test-phase2` — **12/12 PASS** (all API tests pass)
94- [ ] `make arch-test-network` — 3/7 pass (DNS tests need env fix)
95
96### Commit & Push
97
98- [ ] Commit firmware changes to `feature/tollgate-core-component`
99- [ ] Push to ngit remote
100- [ ] Commit Makefile changes to `feature/router-to-router-interaction`
101- [ ] Push to ngit remote
102- [ ] Release Board A lock