From eb10e85f199266affd3bca0a3d4cd934f74f3e7f Mon Sep 17 00:00:00 2001
From: DanConwayDev
Date: Fri, 9 Jan 2026 09:24:17 +0000
Subject: feat(sync): prevent infinite retry loop in negentropy validation

Add retry protection to negentropy event validation:
- Track retry_count in PendingBatch (incremented on each retry attempt)
- Detect when retry makes zero progress (relay returns no requested events)
- Abort retry and complete batch with partial results when stuck
- Log error with full details when retry protection triggers

This prevents infinite loops when:
- Relay has bugs and returns wrong events for ID queries
- Relay is malicious and returns unrelated events
- Relay has eventual consistency issues
- Network corruption causes incorrect responses

The protection triggers when received_count == 0 on a retry (relay
returned nothing we asked for), indicating the relay will never provide
the missing events.

Future work: Track failed batches in Prometheus metrics
(sync_failed_batches_total) for monitoring and alerting.
---
 src/sync/algorithms.rs | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'src/sync/algorithms.rs')

diff --git a/src/sync/algorithms.rs b/src/sync/algorithms.rs
index 4679986..e083dc8 100644
--- a/src/sync/algorithms.rs
+++ b/src/sync/algorithms.rs
@@ -404,6 +404,7 @@ mod tests {
                 pagination_state: HashMap::new(),
                 requested_event_ids: None,
                 received_event_ids: None,
+                retry_count: 0,
             }],
         );
 
@@ -518,6 +519,7 @@ mod tests {
                 pagination_state: HashMap::new(),
                 requested_event_ids: None,
                 received_event_ids: None,
+                retry_count: 0,
             }],
         );
 
-- 
cgit v1.2.3
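
Illustrative note, not part of the patch: the path-limited diff above shows only the test fixtures, so here is a minimal sketch of how the zero-progress check described in the commit message might look. Only the field names retry_count, requested_event_ids and received_event_ids come from the diff. The field types, the trimmed-down PendingBatch, the Validation enum and validate_response are all assumptions made for illustration, and the real implementation may differ.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down stand-in for the real PendingBatch.
/// Field names are taken from the diff; the types are assumed.
#[derive(Debug, Default)]
struct PendingBatch {
    requested_event_ids: Option<HashSet<String>>,
    received_event_ids: Option<HashSet<String>>,
    retry_count: u32,
}

/// Hypothetical outcome of validating one relay response.
#[derive(Debug, PartialEq)]
enum Validation {
    /// Every requested event has arrived (or nothing was tracked).
    Complete,
    /// Some events are missing and the caller should ask again.
    Retry { missing: usize },
    /// A retry made zero progress; finish with partial results.
    Aborted { missing: usize },
}

/// Records the IDs returned by the relay and decides what to do next.
fn validate_response(batch: &mut PendingBatch, response_ids: &[String]) -> Validation {
    // Without a requested set there is nothing to validate.
    let Some(requested) = batch.requested_event_ids.as_ref() else {
        return Validation::Complete;
    };
    let received = batch.received_event_ids.get_or_insert_with(HashSet::new);

    // Count only events that were requested and not seen before,
    // so unrelated or duplicate events do not count as progress.
    let mut received_count = 0usize;
    for id in response_ids {
        if requested.contains(id) && received.insert(id.clone()) {
            received_count += 1;
        }
    }

    let missing = requested.difference(received).count();
    if missing == 0 {
        return Validation::Complete;
    }

    // Retry protection: this response answered a retry and delivered
    // nothing we asked for, so stop instead of looping.
    if batch.retry_count > 0 && received_count == 0 {
        eprintln!(
            "retry protection triggered: retry_count={}, missing={}",
            batch.retry_count, missing
        );
        return Validation::Aborted { missing };
    }

    batch.retry_count += 1;
    Validation::Retry { missing }
}
```

In this sketch a first incomplete response always earns one retry. Only a response to a retry that contains none of the requested IDs aborts the batch, which matches the "received_count == 0 on a retry" condition in the commit message. A retry that delivers at least one requested event is allowed to continue.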