feat(l1): reintroduce proper Kademlia k-bucket routing table#6458
Conversation
Replace the flat IndexMap-based contact storage with a proper Kademlia routing table using 256 k-buckets (16 contacts per bucket) with replacement lists, as described in the Kademlia paper. Key changes: - Add KBucket struct with main contact list and replacement list - Store local_node_id in PeerTableServer for bucket computation - Use XOR distance (raw H256 comparison) for closest-node queries instead of logarithmic distance for accurate ordering - Promote replacements when contacts are pruned from buckets - Remove redundant local_node_id parameter from new_contacts and new_contact_records protocol methods Closes #4245
Lines of code reportTotal lines added: Detailed view |
🤖 Codex Code ReviewFindings
I did not run Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
🤖 Claude Code ReviewHere is the review of PR #6458: Review: feat(l1): reintroduce proper Kademlia k-bucket routing tableOverall the approach is solid — the structural change from a flat Bug 1 —
|
🤖 Codex Code Review
No EVM, gas-accounting, trie, or consensus-rule logic is touched here; the review surface is discovery/routing behavior only. I could not run the affected tests in this sandbox because Cargo needs to resolve the git dependency Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
Greptile SummaryReplaces the flat
Confidence Score: 4/5Safe to merge after fixing the replacement-list duplicate insertion bug in do_new_contacts. One P1 defect (duplicate replacement entries in do_new_contacts) needs to be resolved before merge; remaining findings are P2 quality issues that do not block correctness of the primary discovery path. crates/networking/p2p/peer_table.rs — specifically do_new_contacts (line ~1129) and insert_contact (line ~937)
|
| Filename | Overview |
|---|---|
| crates/networking/p2p/peer_table.rs | Core Kademlia routing table refactor — adds KBucket struct with main/replacement lists; do_new_contacts has a duplicate-insertion bug for replacement-list entries and insert_contact discards bucket insert result causing metric overcounting |
| crates/networking/p2p/discv4/server.rs | Removes redundant local_node_id parameter from new_contacts calls; straightforward cleanup matching updated PeerTableServer API |
| crates/networking/p2p/discv5/server.rs | Removes redundant local_node_id parameter from new_contacts/new_contact_records calls; test helpers updated to pass local_node_id to PeerTableServer::spawn; get_nodes_at_distances still passes local_node_id redundantly |
| cmd/ethrex/initializers.rs | Passes local_node.node_id() to PeerTableServer::spawn as required by new API; trivial change |
| cmd/ethrex/l2/initializers.rs | Same local_node_id threading as L1 initializer; no issues |
| crates/networking/rpc/test_utils.rs | Test utility updated to pass a default H256 as local_node_id to PeerTableServer::spawn; acceptable for testing context |
| test/tests/p2p/discovery/discv5_server_tests.rs | Integration tests updated to work with new routing table; no functional changes to test logic |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[do_new_contacts / do_new_contact_records] --> B{discarded or local node?}
B -- yes --> Z[skip]
B -- no --> C{get_contact_mut found?\nmain list only}
C -- yes --> D[add_protocol on existing contact]
C -- no --> E{contact_exists?\nchecks replacements too}
E -- yes, do_new_contact_records --> F[update replacement entry]
E -- no / do_new_contacts skips this check --> G[insert_contact]
G --> H[bucket_index XOR distance]
H --> I{main list full?}
I -- no --> J[push to contacts main list\nreturn true]
I -- yes --> K[insert_replacement\nreturn false — IGNORED by insert_contact]
K --> L[⚠ duplicate if node already in replacements]
J --> M[record_new_discovery metric]
K --> M
Comments Outside Diff (1)
-
crates/networking/p2p/peer_table.rs, line 387-391 (link)Redundant
local_node_idparameter inget_nodes_at_distancesPeerTableServernow ownsself.local_node_id, so passinglocal_node_idfrom outside is redundant — callers must pass the same value that the routing table was constructed with, otherwise distance calculations diverge from bucket placement.do_get_nodes_at_distancescould simply useself.local_node_idand the parameter could be dropped from both the protocol definition and the public API.Prompt To Fix With AI
This is a comment left during a code review. Path: crates/networking/p2p/peer_table.rs Line: 387-391 Comment: **Redundant `local_node_id` parameter in `get_nodes_at_distances`** `PeerTableServer` now owns `self.local_node_id`, so passing `local_node_id` from outside is redundant — callers must pass the same value that the routing table was constructed with, otherwise distance calculations diverge from bucket placement. `do_get_nodes_at_distances` could simply use `self.local_node_id` and the parameter could be dropped from both the protocol definition and the public API. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/networking/p2p/peer_table.rs
Line: 1129-1138
Comment:
**Replacement-list contacts are silently duplicated**
`get_contact_mut` only searches `bucket.contacts` (the main list), not `bucket.replacements`. When a node is already sitting in the replacement list and `do_new_contacts` is called for it again, the `if let Some(contact)` branch is skipped, a brand-new `Contact` is constructed, and `insert_contact` → `KBucket::insert_replacement` pushes a **second copy** of the same node ID into `replacements`. The protocol field on the original replacement entry is also never updated.
`do_new_contact_records` avoids this by checking `self.contact_exists(&node_id)` (which covers both lists). `do_new_contacts` should do the same:
```rust
async fn do_new_contacts(&mut self, nodes: Vec<Node>, protocol: DiscoveryProtocol) {
for node in nodes {
let node_id = node.node_id();
if self.discarded_contacts.contains(&node_id) || node_id == self.local_node_id {
continue;
}
if let Some(contact) = self.get_contact_mut(&node_id) {
contact.add_protocol(protocol);
} else if self.contact_exists(&node_id) {
// Node is in the replacement list; update its protocol entry there.
if let Some(idx) = self.bucket_for(&node_id) {
if let Some((_, c)) = self.buckets[idx]
.replacements
.iter_mut()
.find(|(id, _)| id == &node_id)
{
c.add_protocol(protocol);
}
}
} else {
let contact = Contact::new(node, protocol);
if self.insert_contact(node_id, contact) {
METRICS.record_new_discovery().await;
}
}
}
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/networking/p2p/peer_table.rs
Line: 937-943
Comment:
**`insert_contact` discards `KBucket::insert`'s return value, over-counting discoveries**
`KBucket::insert` returns `false` when the bucket is full and the contact is placed in the replacement list instead. `insert_contact` ignores that value and unconditionally returns `true` for any non-local node — so callers (e.g. `do_new_contacts`, `handle_insert_if_new`) record `METRICS.record_new_discovery()` even for contacts that only land in the replacement list.
```rust
fn insert_contact(&mut self, node_id: H256, contact: Contact) -> bool {
let Some(idx) = self.bucket_for(&node_id) else {
return false;
};
self.buckets[idx].insert(node_id, contact) // propagate the bucket's return value
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/networking/p2p/peer_table.rs
Line: 387-391
Comment:
**Redundant `local_node_id` parameter in `get_nodes_at_distances`**
`PeerTableServer` now owns `self.local_node_id`, so passing `local_node_id` from outside is redundant — callers must pass the same value that the routing table was constructed with, otherwise distance calculations diverge from bucket placement. `do_get_nodes_at_distances` could simply use `self.local_node_id` and the parameter could be dropped from both the protocol definition and the public API.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "feat(l1): reintroduce proper Kademlia k-..." | Re-trigger Greptile
| if let Some(contact) = self.get_contact_mut(&node_id) { | ||
| // Contact already exists, just add the protocol | ||
| contact.add_protocol(protocol); | ||
| } else { | ||
| let contact = Contact::new(node, protocol); | ||
| if self.insert_contact(node_id, contact) { | ||
| METRICS.record_new_discovery().await; | ||
| } | ||
| Entry::Occupied(mut occupied_entry) => { | ||
| // Contact already exists, just add the protocol | ||
| occupied_entry.get_mut().add_protocol(protocol); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Replacement-list contacts are silently duplicated
get_contact_mut only searches bucket.contacts (the main list), not bucket.replacements. When a node is already sitting in the replacement list and do_new_contacts is called for it again, the if let Some(contact) branch is skipped, a brand-new Contact is constructed, and insert_contact → KBucket::insert_replacement pushes a second copy of the same node ID into replacements. The protocol field on the original replacement entry is also never updated.
do_new_contact_records avoids this by checking self.contact_exists(&node_id) (which covers both lists). do_new_contacts should do the same:
async fn do_new_contacts(&mut self, nodes: Vec<Node>, protocol: DiscoveryProtocol) {
for node in nodes {
let node_id = node.node_id();
if self.discarded_contacts.contains(&node_id) || node_id == self.local_node_id {
continue;
}
if let Some(contact) = self.get_contact_mut(&node_id) {
contact.add_protocol(protocol);
} else if self.contact_exists(&node_id) {
// Node is in the replacement list; update its protocol entry there.
if let Some(idx) = self.bucket_for(&node_id) {
if let Some((_, c)) = self.buckets[idx]
.replacements
.iter_mut()
.find(|(id, _)| id == &node_id)
{
c.add_protocol(protocol);
}
}
} else {
let contact = Contact::new(node, protocol);
if self.insert_contact(node_id, contact) {
METRICS.record_new_discovery().await;
}
}
}
}Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/peer_table.rs
Line: 1129-1138
Comment:
**Replacement-list contacts are silently duplicated**
`get_contact_mut` only searches `bucket.contacts` (the main list), not `bucket.replacements`. When a node is already sitting in the replacement list and `do_new_contacts` is called for it again, the `if let Some(contact)` branch is skipped, a brand-new `Contact` is constructed, and `insert_contact` → `KBucket::insert_replacement` pushes a **second copy** of the same node ID into `replacements`. The protocol field on the original replacement entry is also never updated.
`do_new_contact_records` avoids this by checking `self.contact_exists(&node_id)` (which covers both lists). `do_new_contacts` should do the same:
```rust
async fn do_new_contacts(&mut self, nodes: Vec<Node>, protocol: DiscoveryProtocol) {
for node in nodes {
let node_id = node.node_id();
if self.discarded_contacts.contains(&node_id) || node_id == self.local_node_id {
continue;
}
if let Some(contact) = self.get_contact_mut(&node_id) {
contact.add_protocol(protocol);
} else if self.contact_exists(&node_id) {
// Node is in the replacement list; update its protocol entry there.
if let Some(idx) = self.bucket_for(&node_id) {
if let Some((_, c)) = self.buckets[idx]
.replacements
.iter_mut()
.find(|(id, _)| id == &node_id)
{
c.add_protocol(protocol);
}
}
} else {
let contact = Contact::new(node, protocol);
if self.insert_contact(node_id, contact) {
METRICS.record_new_discovery().await;
}
}
}
}
```
How can I resolve this? If you propose a fix, please make it concise.| fn insert_contact(&mut self, node_id: H256, contact: Contact) -> bool { | ||
| let Some(idx) = self.bucket_for(&node_id) else { | ||
| return false; | ||
| }; | ||
| self.buckets[idx].insert(node_id, contact); | ||
| true | ||
| } |
There was a problem hiding this comment.
insert_contact discards KBucket::insert's return value, over-counting discoveries
KBucket::insert returns false when the bucket is full and the contact is placed in the replacement list instead. insert_contact ignores that value and unconditionally returns true for any non-local node — so callers (e.g. do_new_contacts, handle_insert_if_new) record METRICS.record_new_discovery() even for contacts that only land in the replacement list.
fn insert_contact(&mut self, node_id: H256, contact: Contact) -> bool {
let Some(idx) = self.bucket_for(&node_id) else {
return false;
};
self.buckets[idx].insert(node_id, contact) // propagate the bucket's return value
}Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/peer_table.rs
Line: 937-943
Comment:
**`insert_contact` discards `KBucket::insert`'s return value, over-counting discoveries**
`KBucket::insert` returns `false` when the bucket is full and the contact is placed in the replacement list instead. `insert_contact` ignores that value and unconditionally returns `true` for any non-local node — so callers (e.g. `do_new_contacts`, `handle_insert_if_new`) record `METRICS.record_new_discovery()` even for contacts that only land in the replacement list.
```rust
fn insert_contact(&mut self, node_id: H256, contact: Contact) -> bool {
let Some(idx) = self.bucket_for(&node_id) else {
return false;
};
self.buckets[idx].insert(node_id, contact) // propagate the bucket's return value
}
```
How can I resolve this? If you propose a fix, please make it concise.- Fix clippy lints: collapsible if, let...else to ? operator - Fix duplicate replacement entries: use contact_exists() in do_new_contacts to check both main and replacement lists - Fix insert_contact: propagate KBucket::insert return value so metrics only count contacts inserted into main list - Fix remove_and_promote: use FIFO (promote oldest replacement) instead of LIFO, matching standard Kademlia behavior - Remove redundant local_node_id param from get_nodes_at_distances protocol method (now uses self.local_node_id) - Make KBucket fields pub(crate) instead of pub
🤖 Claude Code ReviewNow I have everything I need for a thorough review. Let me compile my findings. Review: feat(l1): reintroduce proper Kademlia k-bucket routing table (#6458)The overall structure is sound — moving from a flat Bugs1.
if let Some(contact) = self.get_contact_mut(&node_id) {
contact.add_protocol(protocol);
} else {
// ...
self.insert_contact(node_id, contact); // ← runs even when node is in replacements
}
The correct guard is 2. Replacement promotion is LIFO instead of FIFO — diverges from Kademlia spec
fn remove_and_promote(&mut self, node_id: &H256) -> Option<H256> {
// ...
if let Some((replacement_id, replacement)) = self.replacements.pop() { // ← newest
Fix: change 3. Silent ENR record loss for nodes that are only in the replacement list
When This is a consequence of the inconsistency between 4.
fn insert_contact(&mut self, node_id: H256, contact: Contact) -> bool {
let Some(idx) = self.bucket_for(&node_id) else { return false; };
self.buckets[idx].insert(node_id, contact); // return value ignored
true // ← true even when contact only reached the replacement list
}
Minor Issues5.
fn do_get_nodes_at_distances(&self, local_node_id: H256, distances: &[u32]) -> Vec<NodeRecord> {The whole point of this PR is to store 6. Redundant variable in
if let Some((replacement_id, replacement)) = self.replacements.pop() {
let promoted_id = replacement_id; // ← dead rename
self.contacts.push((replacement_id, replacement));
Some(promoted_id)
7.
pub struct KBucket {
pub contacts: Vec<(H256, Contact)>,
pub replacements: Vec<(H256, Contact)>,
}Exposing both fields as fully What Looks Good
Automated review by Claude (Anthropic) · sonnet · custom prompt |
- Run rustfmt on all affected files - Add get_any/get_contact_or_replacement accessors that search both main and replacement lists - Fix do_new_contact_records: ENR updates for replacement-list nodes are no longer silently dropped
Add Prometheus histograms for measuring k-bucket operation latencies: - ethrex_kademlia_insert_contact_duration_seconds: times insert_contact - ethrex_kademlia_iter_contacts_duration_seconds: times full table scans (via collect_contacts, used by get_closest_nodes) Add three Grafana panels to the P2P Packets dashboard: - insert_contact p50/p99 latency - iter_contacts full-scan p50/p99 latency - Operations rate (ops/s) for both
Add Prometheus histograms to measure peer table operation latencies on the current flat IndexMap implementation. These metrics provide the baseline for comparing against the k-bucket implementation in #6458. Metrics added: - ethrex_kademlia_insert_contact_duration_seconds: times contact insertion via IndexMap::entry in do_new_contacts - ethrex_kademlia_iter_contacts_duration_seconds: times full table scan in do_get_closest_nodes Also adds three Grafana panels to the P2P Packets dashboard for visualizing p50/p99 latencies and operation rates.
Fixes bug where add_protocol was silently skipped for contacts in the replacement list, since get_contact_mut only searches the main bucket list.
## Summary Add Prometheus histogram metrics to measure peer table operation latencies on the current flat `IndexMap` implementation. This provides the **baseline** for comparing against the k-bucket implementation in #6458. **Metrics added** (behind `#[cfg(feature = "metrics")]`): - `ethrex_kademlia_insert_contact_duration_seconds` — times contact insertion via `IndexMap::entry` in `do_new_contacts` - `ethrex_kademlia_iter_contacts_duration_seconds` — times full table scan in `do_get_closest_nodes` **Grafana panels** added to P2P Packets dashboard: - insert_contact p50/p99 latency - iter_contacts full-scan p50/p99 latency - Operations rate (ops/s) ## Workflow 1. Merge this PR → deploy to mainnet → collect baseline data 2. Merge #6458 (k-bucket implementation) → deploy → compare latencies on the same Grafana panels ## Test plan - [x] `cargo clippy` passes with and without `metrics` feature - [x] All 21 p2p tests pass - [ ] Verify metrics appear in Prometheus endpoint with `--metrics` flag
avilagaston9
left a comment
There was a problem hiding this comment.
LGTM. It'd be great to add some unit tests for the KBucket struct itself — insert, remove_and_promote, bucket_index edge cases, replacement eviction, etc.
|
|
||
| for (contact_id, contact) in &self.contacts { | ||
| let dist = Self::distance(&node_id, contact_id); | ||
| for (contact_id, contact) in self.collect_contacts() { |
There was a problem hiding this comment.
nit: collect_contacts() clones every contact across all 256 buckets into a new Vec just to iterate over them. You can use iter_contacts() here instead (yields references, zero allocations) and only clone the 16 selected nodes. After that collect_contacts() becomes dead code and can be removed.
- Replace collect_contacts() with iter_contacts() in do_get_closest_nodes to avoid cloning all contacts across 256 buckets. Remove the now-dead collect_contacts() method. - Add unit tests for KBucket: insert, contains, get/get_any, remove_and_promote, replacement eviction, and bucket_index edge cases.
…talls Fixes snapsync failures where peer count stays constant and sync eventually fails with "Failed to receive block headers" after hours of operation. Root cause: After PR #6458 introduced Kademlia k-buckets, peers that became unresponsive during sync weren't marked as disposable, so they remained in the routing table indefinitely. New peers went into replacement lists but were never promoted because dead peers weren't pruned. Changes: - Enhanced prune() to remove disposable contacts from both main and replacement lists, with automatic promotion of replacements - Mark peers as disposable when they timeout during RLPx operations (block headers, block bodies, sync head requests) - Added periodic pruning in the snap_sync main loop to ensure dead peers are regularly removed and replaced Evidence from CI artifacts showed peer count stuck at 6 throughout 3h35m sync before failure. This fix enables peer rotation so healthy peers from replacement lists can take over when active peers become unresponsive.
…talls Fixes snapsync failures where peer count stays constant and sync eventually fails with "Failed to receive block headers" after hours of operation. Root cause: After PR #6458 introduced Kademlia k-buckets, peers that became unresponsive during sync weren't marked as disposable, so they remained in the routing table indefinitely. New peers went into replacement lists but were never promoted because dead peers weren't pruned. Changes: - Enhanced prune() to remove disposable contacts from both main and replacement lists, with automatic promotion of replacements - Mark peers as disposable when they timeout during RLPx operations (block headers, block bodies, sync head requests) - Added periodic pruning in the snap_sync main loop to ensure dead peers are regularly removed and replaced Evidence from CI artifacts showed peer count stuck at 6 throughout 3h35m sync before failure. This fix enables peer rotation so healthy peers from replacement lists can take over when active peers become unresponsive.
…e" (lambdaclass#6505) **Motivation** We observed that lambdaclass#6458 caused a performance regression in snapsync. **Description** We are reverting for now, until the cause is addressed.
Summary
Closes #4245
IndexMap<H256, Contact>with proper Kademlia k-bucket routing table (256 buckets, 16 contacts per bucket, 10 replacements per bucket)H256comparison) for closest-node queries instead of logarithmic distancelocal_node_idinPeerTableServerand remove redundant parameter fromnew_contacts/new_contact_recordsprotocol methodsAlready addressed by prior work:
Arc<Mutex<...>>per field (already in place)Test plan
cargo checkpasses with no warnings