fix root hash calculation during reorgs by marcello33 · Pull Request #2080 · 0xPolygon/bor

marcello33 · 2026-02-24T09:51:54Z

Description

Fix root hash calculation during bor reorgs by always checking the end block's hash instead of current header.
More context here: #1950

Changes

Bugfix (non-breaking change that solves an issue)
Hotfix (change that solves an urgent issue, and requires immediate attention)
New feature (non-breaking change that adds functionality)
Breaking change (change that is not backwards-compatible and/or changes current functionality)
Changes only for a subset of nodes

Breaking changes

Please complete this section if any breaking changes have been made, otherwise delete it

Nodes audience

In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)

Checklist

I have added at least 2 reviewer or the whole pos-v1 team
I have added sufficient documentation in code
I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
Includes RPC methods changes, and the Notion documentation has been updated

Cross repository changes

This PR requires changes to heimdall
- In case link the PR here:
This PR requires changes to matic-cli
- In case link the PR here:

Testing

I have added unit tests
I have added tests to CI
I have tested this code manually on local environment
I have tested this code manually on remote devnet using express-cli
I have tested this code manually on amoy
I have created new e2e tests into express-cli

Manual tests

Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it

Additional comments

Please post additional comments in this section if you have them, otherwise delete it

codecov · 2026-02-24T10:55:02Z

Codecov Report

❌ Patch coverage is 86.36364% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.61%. Comparing base (d7b2be8) to head (eca8045).
⚠️ Report is 5 commits behind head on develop.

Files with missing lines	Patch %	Lines
consensus/bor/api.go	86.36%	2 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (86.36%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2080      +/-   ##
===========================================
- Coverage    50.62%   50.61%   -0.01%     
===========================================
  Files          875      875              
  Lines       151794   151799       +5     
===========================================
- Hits         76842    76835       -7     
- Misses       69879    69895      +16     
+ Partials      5073     5069       -4

Files with missing lines	Coverage Δ
consensus/bor/api.go	`86.12% <86.36%> (-1.93%)`	⬇️

... and 15 files with indirect coverage changes

Files with missing lines	Coverage Δ
consensus/bor/api.go	`86.12% <86.36%> (-1.93%)`	⬇️

... and 15 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

lucca30 · 2026-02-24T11:38:42Z

@claude Can you review this replacement check from the root hash to the end hash on the function and analyze the impacts? Do a deep analysis based on your deep knowledge on reorgs and how they should be handled. Get the context of the affected function and analyze the impact of this change. By the end evaluate if the tests made for these changes are compatible with the changes or if we should cover any other scenario

claude · 2026-02-24T11:39:02Z

Claude finished @lucca30's task in 1m 24s —— View job

Deep Analysis: Root Hash Calculation During Reorgs

Read and understand the changed files
Analyze the core fix and its implications for reorg handling
Examine how this relates to blockchain reorg fundamentals
Evaluate test coverage and identify any gaps

Summary

This PR fixes a subtle but important bug in the GetRootHash function related to cache invalidation during blockchain reorganizations. The fix changes from tracking the chain tip hash to tracking the end block hash of the checkpoint range, which is a more precise and correct approach.

Core Change Analysis

Before (Bug)

type rootHashCacheEntry struct {
    headHash common.Hash  // Tracked chain tip
    root     string
}

// Cache invalidation checked:
if entry.headHash == headHash { /* cache hit */ }

// Reorg detection checked:
newHead := api.chain.CurrentHeader()
if newHead.Hash() != headHash { /* reorg detected */ }

After (Fix)

type rootHashCacheEntry struct {
    endHash common.Hash  // Tracks end block of range
    root    string
}

// Cache invalidation checks:
if entry.endHash == endHash { /* cache hit */ }

// Reorg detection checks:
newEndHeader := api.chain.GetHeaderByNumber(end)
if newEndHeader.Hash() != endHash { /* reorg detected */ }

Why This Fix is Correct

The Problem with Tracking Head Hash

Consider a checkpoint request for blocks [100, 200] when the chain is at block 500:

Scenario: Normal block production (no reorg)

Call GetRootHash(100, 200) → computes root, caches with headHash=hash(500)
Chain advances to block 501
Call GetRootHash(100, 200) → cache miss because hash(500) != hash(501)
Unnecessarily recomputes the same root hash

This caused spurious cache invalidations whenever the chain tip advanced, even though the blocks [100, 200] hadn't changed.

The Solution: Track End Block Hash

By tracking endHash (hash of block 200 in the range [100, 200]):

Call GetRootHash(100, 200) → computes root, caches with endHash=hash(200)
Chain advances to block 501
Call GetRootHash(100, 200) → cache hit because hash(200) is unchanged
No recomputation needed

Reorg Detection Remains Sound

The fix also correctly updates the reorg detection mechanism:

Scenario	Old Behavior	New Behavior
Tip advances past `end` (normal)	Detects as "reorg" (false positive)	No false positive
Actual reorg affecting `[start, end]`	Correctly detected	Correctly detected
Reorg only affecting blocks after `end`	Detected (unnecessary)	Not detected (correct - irrelevant)

Defense in Depth: Multiple Reorg Safety Mechanisms

The function employs three layers of protection against mixed-fork data:

End-block hash binding (lines 302-318): Cache entries are invalidated if the canonical block at end changes
Post-computation verification (lines 351-353): After fetching all headers concurrently, re-checks that the end block hasn't changed mid-flight
Continuity validation (lines 367-372): Verifies H[n].ParentHash == H[n-1].Hash() for all headers, catching any interleaved forks

This multi-layered approach is sound from a reorg-safety perspective.

Test Coverage Evaluation

Tests Added/Modified

Test	Purpose	Coverage
`TestGetRootHash_ReorgDuringCall_ReturnsError`	In-flight reorg detection	✅
`TestGetRootHash_PureChainsStable`	Stable roots for consistent chains	✅
`TestGetRootHashCacheInvalidatedOnReorg`	Cache invalidation on actual reorg	✅
`TestAPI_GetRootHash_StableWhenTipAdvancesPastEnd`	NEW: Cache hit when tip advances	✅

The New Test is Key

The new test TestAPI_GetRootHash_StableWhenTipAdvancesPastEnd specifically validates the fix:

Computes root for range [0, 3]
Advances tip to block 5
Verifies the same root is returned (cache hit)
Confirms only 1 additional read (the end header check), not a full recomputation

This directly tests the scenario that was broken before.

Potential Gap: Non-contiguous Header Error Test

The errNonContiguousHeaderRange path (line 371) is not explicitly tested in the reorg tests. While the continuity check is a safety mechanism that would catch mixed-fork headers, adding a test that triggers this specific error path would improve coverage.

Suggested additional test scenario:

Configure a mid-flight reorg that switches chains after fetching some headers but before the end-hash check
Verify that either errReorgDuringRootComputation or errNonContiguousHeaderRange is returned

However, this is a minor gap since:

The existing TestGetRootHash_ReorgDuringCall_ReturnsError covers the end-hash detection
The continuity check is a secondary safety net

Conclusion

This fix is correct and improves both performance and precision:

Aspect	Assessment
Correctness	✅ Fixes spurious cache invalidation
Reorg safety	✅ Maintains protection via end-hash binding + continuity check
Performance	✅ Eliminates unnecessary recomputation when tip advances
Test coverage	✅ New test validates the specific fix
Backward compatibility	✅ No API changes, only internal behavior improvement

The change from headHash to endHash is semantically correct: the root hash for range [start, end] depends only on the blocks in that range, not on blocks that come after. Binding the cache to the end block hash precisely captures this invariant.

lucca30

lgtm

lucca30 · 2026-02-24T11:45:22Z

govuln check is pointing some errors, also kurtosis e2e is not properly running. Would be good having both working before merging.

marcello33 · 2026-02-24T11:57:05Z

govuln check is pointing some errors, also kurtosis e2e is not properly running. Would be good having both working before merging.

@lucca30
govuln is fixed here #2081
And kurtosis is failing due to pumba which was fixed in my PRs and in @kamuikatsurgi's too

sonarqubecloud · 2026-02-24T12:02:23Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

…othash fix root hash calculation during reorgs (cherry picked from commit bbc3b53)

marcello33 added 3 commits February 24, 2026 10:50

fix root hash calculation during reorgs

5f72e85

add regression test

ca8d34a

fix lint

ccd567a

kamuikatsurgi approved these changes Feb 24, 2026

View reviewed changes

lucca30 approved these changes Feb 24, 2026

View reviewed changes

Merge branch 'develop' into mardizzone/reorg_detection_roothash

eca8045

marcello33 merged commit bbc3b53 into develop Feb 25, 2026
20 of 24 checks passed

lucca30 pushed a commit that referenced this pull request Mar 5, 2026

Merge pull request #2080 from 0xPolygon/mardizzone/reorg_detection_ro…

752f439

…othash fix root hash calculation during reorgs (cherry picked from commit bbc3b53)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix root hash calculation during reorgs#2080

fix root hash calculation during reorgs#2080
marcello33 merged 4 commits intodevelopfrom
mardizzone/reorg_detection_roothash

marcello33 commented Feb 24, 2026

Uh oh!

codecov bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

lucca30 commented Feb 24, 2026

Uh oh!

claude bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

lucca30 left a comment

Uh oh!

lucca30 commented Feb 24, 2026

Uh oh!

marcello33 commented Feb 24, 2026

Uh oh!

sonarqubecloud bot commented Feb 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

marcello33 commented Feb 24, 2026

Description

Changes

Breaking changes

Nodes audience

Checklist

Cross repository changes

Testing

Manual tests

Additional comments

Uh oh!

codecov bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

lucca30 commented Feb 24, 2026

Uh oh!

claude bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deep Analysis: Root Hash Calculation During Reorgs

Summary

Core Change Analysis

Before (Bug)

After (Fix)

Why This Fix is Correct

The Problem with Tracking Head Hash

The Solution: Track End Block Hash

Reorg Detection Remains Sound

Defense in Depth: Multiple Reorg Safety Mechanisms

Test Coverage Evaluation

Tests Added/Modified

The New Test is Key

Potential Gap: Non-contiguous Header Error Test

Conclusion

Uh oh!

lucca30 left a comment

Choose a reason for hiding this comment

Uh oh!

lucca30 commented Feb 24, 2026

Uh oh!

marcello33 commented Feb 24, 2026

Uh oh!

sonarqubecloud bot commented Feb 24, 2026

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov bot commented Feb 24, 2026 •

edited

Loading

claude bot commented Feb 24, 2026 •

edited

Loading