fix(merge): capture bad blocks at pre-process rejection stages#11696
Conversation
|
@claude review |
|
Claude finished @LukaszRozmej's task in 6m 28s —— View job Code Review —
|
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 1 |
| Low | 2 |
The Medium finding (missing test coverage for two of three patched sites) should be addressed before merge to ensure regression protection across all the paths the fix targets.
| { | ||
| if (badBlock.Hash is null) | ||
| { | ||
| if (Logger.IsWarn) Logger.Warn($"{nameof(ReportBadBlock)} call has been made for a block without a null hash."); |
There was a problem hiding this comment.
Low — log message typo (copied from DeleteInvalidBlock)
The condition guarded here is badBlock.Hash is null, so "without a null hash" is inverted. Should read "with a null hash" (or "without a hash"). The same typo exists in DeleteInvalidBlock at line 811 — both would be worth fixing together.
| if (Logger.IsWarn) Logger.Warn($"{nameof(ReportBadBlock)} call has been made for a block without a null hash."); | |
| if (Logger.IsWarn) Logger.Warn($"{nameof(ReportBadBlock)} call has been made for a block with a null hash."); |
| Nethermind.Blockchain.Metrics.BadBlocksByNethermindNodes++; | ||
| } | ||
|
|
||
| _invalidChainTracker.OnInvalidBlock(block.Hash!, block.ParentHash); |
There was a problem hiding this comment.
Low — inconsistent null handling between RecordBadBlock and ReportBadBlock
_invalidChainTracker.OnInvalidBlock(block.Hash!, ...) uses the null-forgiving operator, but the callee BlockTree.ReportBadBlock has an explicit if (badBlock.Hash is null) guard. If block.Hash were ever null here, OnInvalidBlock would throw NullReferenceException before the guard in ReportBadBlock fires, and the metrics would have already been incremented.
In practice all three call sites arrive after TryGetBlock succeeds and the block header has Hash = BlockHash set explicitly (see ExecutionPayload.TryGetBlock), so block.Hash is never null at these sites. Still, consider adding a guard or calling ReportBadBlock first (letting it absorb the null case) and only proceeding to the other calls when it succeeds — or add a null-check consistent with ReportBadBlock:
private void RecordBadBlock(Block block)
{
if (block.Hash is null) return; // mirrors BlockTree.ReportBadBlock guard
Nethermind.Blockchain.Metrics.BadBlocks++;
...
}| IBadBlockStore badBlockStore = chain.Container.Resolve<IBadBlockStore>(); | ||
| badBlockStore.GetAll().Should().ContainSingle() | ||
| .Which.Number.Should().Be(blockNumber); | ||
| } |
There was a problem hiding this comment.
Medium — missing integration-test coverage for two of the three patched rejection sites
This test verifies the hash-mismatch path. The PR also wires RecordBadBlock into the orphaned-block validation failure path (line ~163 in NewPayloadHandler.cs) and the suggested-block validation failure path (line ~198). Neither has a corresponding integration test that asserts IBadBlockStore.GetAll() contains the rejected block.
Because all three call sites share the same RecordBadBlock helper, the risk of one being silently disconnected in a future refactor is real. Consider adding:
[Test]
public async Task executePayloadV1_orphan_invalid_records_bad_block()
{
// submit a structurally-invalid block with unknown parent
// assert IBadBlockStore.GetAll() contains it
}
[Test]
public async Task executePayloadV1_suggested_block_invalid_records_bad_block()
{
// submit a payload that fails ValidateSuggestedBlock but has a known parent
// assert IBadBlockStore.GetAll() contains it
}|
@claude re-review |
|
Claude finished @LukaszRozmej's task in 3m 13s —— View job Re-review —
|
| Severity | Count | Status |
|---|---|---|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 1 | Partially resolved — orphan path now covered; suggested-block path still missing |
| Low | 1 | Unchanged (null guard) |
The remaining Medium item (no regression test for the ValidateSuggestedBlock failure path) should be addressed before merge — the orphan-test commit shows the pattern is straightforward to replicate.
|
@claude re-review |
|
Claude finished @LukaszRozmej's task in 1m 33s —— View job Re-review —
|
| # | Severity | Finding | Status |
|---|---|---|---|
| 1 | Medium | Missing integration test for suggested-block rejection path | ✅ Resolved — 1c22124 |
| 2 | Low | Inconsistent null handling in RecordBadBlock (block.Hash!) |
✅ Resolved — 36b826a |
| 3 | Low | Log message typo in ReportBadBlock / DeleteInvalidBlock |
✅ Resolved — fe56564 |
Verdict
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
All previously raised findings are addressed. The PR is correct and complete — ready to merge.
| Nethermind.Blockchain.Metrics.BadBlocks++; | ||
| if (block.IsByNethermindNode()) | ||
| { | ||
| Nethermind.Blockchain.Metrics.BadBlocksByNethermindNodes++; | ||
| } |
There was a problem hiding this comment.
Should this be here or in BlockTree?
Introduce a recording-only counterpart to DeleteInvalidBlock: it adds the block to the invalid-blocks cache and the bad-block store but does not touch BestSuggested* or delete from the block store. This lets callers that reject a block before insertion (so there is nothing to delete from the chain) still surface it via debug_getBadBlocks. DeleteInvalidBlock now delegates to ReportBadBlock for the recording half so both paths stay in sync.
NewPayloadHandler can reject a block at three checks that run before BranchProcessor ever sees it (hash mismatch, orphaned-block validation, suggested-block validation). Until now those rejections only logged a warning: they did not bump the bad-block metrics, did not teach the invalid-chain tracker, and did not write to the bad-block store, so the block was invisible to debug_getBadBlocks. Route all three sites through a shared RecordBadBlock helper that mirrors the bookkeeping BlockchainProcessor performs when it catches an InvalidBlockException. Closes NethermindEth#6719
…teInvalidBlock Both guard clauses fire when `block.Hash is null` but logged "without a null hash", which inverts the condition. Use "with a null hash" so operators see the actual state when these branches are exercised.
Builds on the hash-mismatch coverage by exercising the orphan rejection branch: an unknown parent hash forces `NewPayloadHandler` down the `ValidateOrphanedBlock` path, and a corrupted `GasUsed > GasLimit` makes that validation fail. The new test asserts the rejected block lands in `IBadBlockStore`, verifying the second of the three RecordBadBlock sites introduced in this PR.
`RecordBadBlock` used `block.Hash!` while the downstream `ReportBadBlock` guards `badBlock.Hash is null`. If the hash were ever null, the `OnInvalidBlock` call would NRE after metrics had already been bumped. Add an early return mirroring `ReportBadBlock` so the helper is self-consistent and the metrics are not advanced for an unrecordable block.
Covers the third `RecordBadBlock` site in `NewPayloadHandler` — the `ValidateSuggestedBlock` failure reached when `ShouldProcessBlock` returns `false` because the parent is known but unprocessed (beacon-sync mode). Inserts a parent header via `BeaconBlockInsert` so it is found but never processed, then submits a child with `Timestamp <= parent.Timestamp` to break a parent-relative header invariant. Without the recording call at this site the assertion on `IBadBlockStore.GetAll` would fail.
1c22124 to
82a6824
Compare
…capture-pre-processing # Conflicts: # src/Nethermind/Nethermind.Merge.Plugin.Test/EngineModuleTests.V1.cs
|
@claude review the following review 🔴 CRITICAL C1 — _latestBlocks LRU cache not populated at pre-process rejection sites C2 — BlockTree.DeleteInvalidBlock can null-deref after the delegation refactor 🟠 WARNING W1 — InvalidChainTracker.OnInvalidBlock called on unvalidated parent hash W2 — ReadOnlyBlockTree.ReportBadBlock throws InvalidOperationException W3 — Inline Nethermind.Blockchain.Metrics.BadBlocks++ fully-qualified W4 — Orphan-block test fragility 🟡 SUGGESTION S1 — Pass the error message into RecordBadBlock(block, message) S2 — Test names don't follow project convention
S3 — Hot-path I/O risk in BadBlockStore.Insert |
|
Claude finished @svlachakis's task in 5m 11s —— View job Review of submitted findings
AssessmentAfter reading the relevant source, two of the five critical/warning findings are factually incorrect, one is overstated, and one is imprecise. The suggestions are mostly valid at low severity. Details below. C1 —
|
| Finding | Claimed Severity | Accurate? | Actual Severity |
|---|---|---|---|
C1 — _latestBlocks not populated |
Critical | Partially | Low |
C2 — DeleteInvalidBlock null-deref |
Critical | No — guard is still present | N/A |
| W1 — parent hash poisoning | Warning | Imprecise | Low |
| W2 — wrong exception type | Warning | No — InvalidOperationException is correct for ReadOnlyBlockTree |
N/A |
| W3 — fully-qualified Metrics | Warning | Yes | Low/Suggestion |
| W4 — orphan test fragility | Warning | Yes | Low |
| S1–S3 | Suggestion | Yes | Suggestion |
Two findings (C2, W2) are based on incorrect premises and should not block the PR. C1 and W1 are valid concerns but at low severity, not critical. The PR itself is in good shape — all previously raised findings from the three review rounds have been addressed.
|
@0xDevNinja please check build issues & claude review |
…BlockTreeTests Nethermind.Blockchain.Test does not reference FluentAssertions; rewrite the new Report_bad_block tests to NUnit Assert.That to match the existing style in this assembly and unbreak the build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
045135a to
92cf9c1
Compare
|
@claude re-review |
|
Claude finished @LukaszRozmej's task in 5m 41s —— View job Re-review —
|
| Commit | Description |
|---|---|
92cf9c1 |
Fix build: replace FluentAssertions with NUnit Assert.That in the two new Report_bad_block_* tests — Nethermind.Blockchain.Test does not reference FluentAssertions, so those tests failed to compile |
That's the build issue @svlachakis flagged. The fix is correct: the new assertions (Assert.That(stored, Has.Length.EqualTo(1)), Assert.That(stored[0].Hash, Is.EqualTo(...)), etc.) match the style used everywhere else in BlockTreeTests.cs.
Prior findings — final status
All findings from the three previous claude review rounds remain fully resolved. No regressions.
| Round | Finding | Status |
|---|---|---|
| 1 | Log message typo in ReportBadBlock / DeleteInvalidBlock |
✅ Fixed (fe56564) |
| 1–2 | Missing null guard in RecordBadBlock (block.Hash!) |
✅ Fixed (b6eb2ef) |
| 1–2 | No integration test for orphan rejection path | ✅ Fixed (5d3946c) |
| 1–3 | No integration test for suggested-block rejection path | ✅ Fixed (82a6824) |
Assessment of @svlachakis's findings (re-confirmed against current code)
The previous re-review already assessed these; no new commits have changed the relevant code. Verdict stands:
| Finding | Claimed | Accurate | Note |
|---|---|---|---|
C1 — _latestBlocks not populated at pre-process sites |
Critical | Partially | Low at most — _latestBlocks is only consulted inside ValidateBlockAndProcess, which is unreachable from pre-process rejection sites; the "re-submission" scenario costs one cheap SHA-3 re-check, not full processing |
C2 — DeleteInvalidBlock null-deref |
Critical | No | DeleteInvalidBlock retains its own if (invalidBlock.Hash is null) return; guard (line 810); the ! at line 827 is provably safe |
W1 — OnInvalidBlock poisons parent hashes |
Warning | Imprecise | OnInvalidBlock(bad_hash, parent_hash) marks bad_hash as invalid; the parent node is never marked invalid; no legitimate block can share the wrong claimed hash |
W2 — ReadOnlyBlockTree.ReportBadBlock wrong exception |
Warning | No | Every mutation guard in ReadOnlyBlockTree throws InvalidOperationException (lines 51, 63, 66, 68, 74, …); NotSupportedException is StatelessBlockTree's pattern, not this class |
W3 — fully-qualified Metrics in RecordBadBlock |
Warning | Yes (Low) | Minor style; a using alias would be cleaner but is not blocking |
| W4 — orphan-block test fragility | Warning | Yes (Low) | Adding a comment naming the specific validator rule (GasUsed > GasLimit) would guard against silent rerouting |
Open inline comment — "Should this be here or in BlockTree?"
@LukaszRozmej's question at NewPayloadHandler.cs:273–285: whether RecordBadBlock's logic (metrics + OnInvalidBlock + ReportBadBlock) should live in BlockTree instead.
Answer: it belongs in NewPayloadHandler. BlockTree has no dependency on IInvalidChainTracker, and adding one would create an upward dependency violation: Nethermind.Blockchain would gain a compile-time dependency on a Merge-layer concern. The current split is the right factoring:
BlockTree.ReportBadBlock— pure storage (bad-block store + in-memory cache)NewPayloadHandler.RecordBadBlock— Engine-API layer orchestration (metrics + chain invalidity tracking + storage)
The precedent is the same split used in BlockchainProcessor: it calls DeleteInvalidBlock (storage) after setting up chain-tracker state externally.
New findings
None. The build fix in 92cf9c1 is correct and complete. All three pre-process rejection paths (hash mismatch, orphan validation, suggested-block validation) are wired to RecordBadBlock and each has an integration test verifying the bad-block surfaces in IBadBlockStore.GetAll().
Verdict
| Severity | Count |
|---|---|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 2 (W3 style alias, W4 test comment — both optional) |
Build is clean, all previous findings are resolved. Ready to merge.
Closes #6719
Changes
IBlockTree.ReportBadBlock(Block)— recording-only counterpart toDeleteInvalidBlock. Inserts the block into the in-memory invalid-blocks cache and the bad-block store, but does not resetBestSuggested*or delete from the block store (the rejected block was never inserted there).DeleteInvalidBlocknow delegates the recording half toReportBadBlockso both paths stay in sync.NewPayloadHandler.RecordBadBlock— private helper that bumpsMetrics.BadBlocks/Metrics.BadBlocksByNethermindNodes, notifiesIInvalidChainTracker, and calls_blockTree.ReportBadBlock. Wired into the three pre-process rejection sites inNewPayloadHandler.HandleAsync: hash mismatch, orphaned-block validation, and suggested-block validation.BlockTreeOverlay,ReadOnlyBlockTree,StatelessBlockTree, andTestBlockTree.Types of changes
What types of changes does your code introduce?
Testing
Requires testing
If yes, did you write tests?
Notes on testing
BlockTreeTests.Report_bad_block_stores_block_and_does_not_alter_main_chain— bad-block store receives the block;BestSuggestedHeaderunchanged (vs.DeleteInvalidBlock).BlockTreeTests.Report_bad_block_ignores_block_without_hash— null-Hashguard.EngineModuleTests.executePayloadV1_invalid_hash_records_bad_block— end-to-end: anengine_newPayloadV1call with mismatchedBlockHashcauses the block to surface in the resolvedIBadBlockStore(i.e. would be visible indebug_getBadBlocks).dotnet build src/Nethermind/Nethermind.slnx -c Release— full solution builds clean.Nethermind.Blockchain.Test— 1491 passing / 15 pre-existing skipped / 0 failures.Nethermind.Merge.Plugin.Test— 1016 passing / 2 pre-existing skipped / 0 failures.Documentation
Requires documentation update
Requires explanation in Release Notes