Skip to content

rpc: fix debug_getModifiedAccountsByNumber/Hash to match Geth semantics#20043

Merged
lupin012 merged 12 commits into
mainfrom
lupin012/align_debug_getModifiedAccountsBy_to_geth_resp
Mar 26, 2026
Merged

rpc: fix debug_getModifiedAccountsByNumber/Hash to match Geth semantics#20043
lupin012 merged 12 commits into
mainfrom
lupin012/align_debug_getModifiedAccountsBy_to_geth_resp

Conversation

@lupin012

Copy link
Copy Markdown
Contributor

Three bugs fixed:

  1. Block range semantics: adopt Geth's exclusive-start convention.

    • Single param N → cover block N: [Min(N), Max(N)+1)
    • Two params (S,E] → cover blocks S+1…E: [Max(S)+1, Max(E)+1)
  2. Storage-only modified accounts: add a StorageDomain pass. In Geth's MPT every storage write updates storageRoot in the account node, so contracts with only storage changes appear in the trie diff automatically. In Erigon's flat model AccountsDomain and StorageDomain are separate; a storage-only write leaves no trace in AccountsDomain. We now iterate StorageDomain and union the results.

  3. Filters to match Geth's new-trie-walk behaviour:

    • Precompiles touched but unchanged (bytes.Equal check): skip.
    • Self-destructed accounts (postVal empty): excluded via deletedAddrs.
    • Storage slots of deleted accounts: skipped via deletedAddrs with no extra GetAsOf(Account) calls — a destroyed account cannot write storage, so any StorageDomain entry for a non-deleted address is guaranteed to belong to a live account.

waiting RPC-test TAG

lupin012 and others added 9 commits March 20, 2026 22:41
Three bugs fixed:

1. Block range semantics: adopt Geth's exclusive-start convention.
   - Single param N  → cover block N:        [Min(N), Max(N)+1)
   - Two params (S,E] → cover blocks S+1…E: [Max(S)+1, Max(E)+1)
   Previously Erigon used Min(endNum)-1 as upper bound, producing a
   completely wrong and disjoint range from Geth.

2. Storage-only modified accounts: add a StorageDomain pass.
   In Geth's MPT every storage write updates storageRoot in the account
   node, so contracts with only storage changes appear in the trie diff
   automatically. In Erigon's flat model AccountsDomain and StorageDomain
   are separate; a storage-only write leaves no trace in AccountsDomain.
   We now iterate StorageDomain and union the results.

3. Filters to match Geth's new-trie-walk behaviour:
   - Precompiles touched but unchanged (bytes.Equal check): skip.
   - Self-destructed accounts (postVal empty): excluded via deletedAddrs.
   - Storage slots of deleted accounts: skipped via deletedAddrs with no
     extra GetAsOf(Account) calls — a destroyed account cannot write
     storage, so any StorageDomain entry for a non-deleted address is
     guaranteed to belong to a live account.

Performance: eliminating GetAsOf(Account) from the StorageDomain pass
reduces random reads on snapshot files significantly (3:20 → ~1 min
cold, ~38 s warm on 38-test suite vs Geth's 30 s cold / 0.8 s warm).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
endBlock > latestBlock is an error per Geth behaviour; move that case
from correct_input to invalid_input and remove the incorrect NoError
assertion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- getModifiedAccounts: clarify HistoryRange comment (one entry per
  unique key; pre-value equals state before range start)
- GetModifiedAccountsByHash: fetch latestBlock before hash lookups,
  consistent with GetModifiedAccountsByNumber; add endNum > latestBlock
  guard matching the ByNumber variant
- TestGetModifiedAccountsByNumber: fix stale comment, add address
  verification for (1,2] case, add storage-only contracts sub-test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lupin012 lupin012 requested a review from yperbasis March 26, 2026 10:28

@yperbasis yperbasis left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking:

  1. GetAsOf ok return value discarded — debug_api.go (diff lines):
    postVal, _, err := tx.GetAsOf(kv.AccountsDomain, k, endTxNum)
    When ok == false, postVal is nil/empty, which makes len(postVal) == 0 true → account is treated as deleted. This happens to produce the right behavior (key not found at end-of-range = effectively deleted),
    but it's relying on an implicit invariant. A brief comment noting this would make the intent clearer.
  2. checkPruneHistory only checks the range start — In both the ByNumber and ByHash two-param paths, only startNum+1 is checked for pruning. This is correct under sequential-pruning semantics (if block N is
    available, all blocks > N are too), but if pruning semantics ever change, this would silently return incomplete results. A comment noting the sequential-pruning assumption would be helpful.
  3. Test: discarded result in old code was a silent bug — The old test assigned _, err = GetModifiedAccountsByNumber(...) then checked require.Len(t, result, 3) against a stale result from a prior call. The PR
    correctly removes this, but doesn't call it out — might be worth a note in the PR description since it shows the old test had false coverage.

Minor Suggestions

  • debug_api.go — The addAddr helper captures addrLen from closure scope. This is fine, but since it's only used inside getModifiedAccounts, consider making addrLen a const (const addrLen = 20) instead of
    computing len(common.Address{}) at runtime — it's zero-cost but signals intent.
  • The large block comment on getModifiedAccounts (~40 lines) is genuinely useful documentation. If the function ever gets split or moved, it may be worth extracting the Geth-vs-Erigon architectural explanation
    into a package-level doc comment or a doc.go, but for now it's fine where it is.

- getModifiedAccounts: addrLen as const (compile-time, signals intent)
- getModifiedAccounts: comment on GetAsOf ok==false → nil postVal
  correctly maps to "deleted" (key absent from end state)
- GetModifiedAccountsByNumber/Hash: comment on checkPruneHistory only
  checking startNum+1, noting the sequential-pruning assumption

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lupin012 lupin012 marked this pull request as ready for review March 26, 2026 12:01
@lupin012 lupin012 force-pushed the lupin012/align_debug_getModifiedAccountsBy_to_geth_resp branch from b7867ae to 7f98988 Compare March 26, 2026 18:06
@lupin012 lupin012 added this pull request to the merge queue Mar 26, 2026
Merged via the queue into main with commit 6847589 Mar 26, 2026
35 checks passed
@lupin012 lupin012 deleted the lupin012/align_debug_getModifiedAccountsBy_to_geth_resp branch March 26, 2026 19:56
lupin012 added a commit that referenced this pull request May 31, 2026
… semantics fix to release/3.4 (#21507)

**[SharovBot]**

## Summary

Backport of PR #20043 (`rpc: fix debug_getModifiedAccountsByNumber/Hash
to match Geth semantics`) which was merged to `main` on 2026-03-26 but
never cherry-picked to `release/3.4`.

## Root cause

The `mainnet-rpc-integ-tests` CI job has been failing on `release/3.4`
with 37 tests failing:
- `debug_getModifiedAccountsByHash/test_01` through `test_19`
- `debug_getModifiedAccountsByNumber/test_01` through `test_19`

All these pass on `main` (confirmed from artifact comparison).

## Three bugs fixed

**1. Block range semantics** — adopted Geth's exclusive-start
convention:
- Single param N → cover exactly block N: `[Min(N), Max(N)+1)`
- Two params (S, E] → cover blocks S+1…E: `[Max(S)+1, Max(E)+1)`

**2. Storage-only modified accounts** — added a `StorageDomain` pass:
- In Geth's MPT, every storage write changes `storageRoot` in the
account node, so a trie diff automatically surfaces ALL modified
accounts including contracts with only storage changes.
- In Erigon's flat model, `AccountsDomain` and `StorageDomain` are
separate. A storage-only write leaves no trace in `AccountsDomain`. We
now iterate `StorageDomain` and union the results.

**3. Filters to match Geth output exactly:**
- Precompiles touched but unchanged (`bytes.Equal(preVal, postVal)`):
skip.
- Self-destructed accounts (`len(postVal) == 0`): excluded via
`deletedAddrs`.
- Storage slots of deleted accounts: skipped via `deletedAddrs`.

## Also

Disabled `ots_searchTransactionsAfter/test_11.json` and `test_12.json` —
these are tip-dependent tests whose expected values change as the chain
advances. Already disabled on `main` via PR #21348.

## Testing

- `go build ./rpc/...` passes cleanly
- All 37 previously failing tests expected to pass after this backport
(verified by comparing with passing main artifacts)
- Detected by `erigon-ci-monitor-release-3.4` cron monitoring job at
commit `4c7849ff`

---------

Co-authored-by: SharovBot <erigon-ci-bot@erigontech.io>
Co-authored-by: Giulio Rebuffo <giulio.rebuffo@gmail.com>
Co-authored-by: erigon-copilot[bot] <erigon-copilot[bot]@users.noreply.github.com>
Co-authored-by: Erigon CI Bot <erigon-ci-bot@erigon.tech>
Co-authored-by: lupin012 <58134934+lupin012@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants