Skip to content

(feat): tracing support for state-sync transactions#2172

Closed
manav2401 wants to merge 44 commits into
developfrom
state-sync-tracing
Closed

(feat): tracing support for state-sync transactions#2172
manav2401 wants to merge 44 commits into
developfrom
state-sync-tracing

Conversation

@manav2401

@manav2401 manav2401 commented Mar 31, 2026

Copy link
Copy Markdown
Member

Description

Extension of #2062 from @milando12.

This PR

  • Refactors almost all tracing methods as state-sync transactions weren't handled correctly.
    • This involves major simplification of mostly all tracing methods (e.g. debug_traceBlockByNumber, debug_traceBlockByHash, debug_traceChain, etc).
    • It mimics the actual production environment for tracing state-sync events which wasn't the case earlier (and hence used to give incorrect results).
  • Adds live tracing support for state-sync transactions. This means if someone has live tracing enabled while running a node, the tracing will also cover the state-sync transactions which contains the bridge events.
  • Adds new custom hooks for tracing state-sync transactions which wraps the inner tracer's hooks. More explanation on this below.
  • Adds e2e tests to actually test a state-sync transaction trace via state receiver contract instead of using mock data.

The reason why we add custom hooks is because state-sync transactions can contain multiple bridge events and all of them are executed as separate EVM calls which are packed into 1 transaction. This breaks assumptions for certain tracer e.g. in callTracer, it should only have 1 root call frame per transaction. This would be invalid for state-sync transactions with multiple state-sync events (because it would have multiple independent root call frames). Hence, referencing the erigon implementation, this PR adds wrapper hooks which crafts a single synthetic root call frame and all bridge event calls are nested sub-calls inside it.

Changes

  • Bugfix (non-breaking change that solves an issue)
  • Hotfix (change that solves an urgent issue, and requires immediate attention)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)
  • Changes only for a subset of nodes

Breaking changes

Please complete this section if any breaking changes have been made, otherwise delete it

Nodes audience

In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)

Checklist

  • I have added at least 2 reviewer or the whole pos-v1 team
  • I have added sufficient documentation in code
  • I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
  • Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
  • Includes RPC methods changes, and the Notion documentation has been updated

Cross repository changes

  • This PR requires changes to heimdall
    • In case link the PR here:
  • This PR requires changes to matic-cli
    • In case link the PR here:

Testing

  • I have added unit tests
  • I have added tests to CI
  • I have tested this code manually on local environment
  • I have tested this code manually on remote devnet using express-cli
  • I have tested this code manually on amoy
  • I have created new e2e tests into express-cli

Manual tests

Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it

Additional comments

Please post additional comments in this section if you have them, otherwise delete it

Add live tracing and refactor tracing methods for state-sync transactions.

Co-authored-by: Milan Došlić <milan.doslic@tenderly.co>
Copilot AI review requested due to automatic review settings March 31, 2026 10:26

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@manav2401

Copy link
Copy Markdown
Member Author

@claude review

@claude

claude Bot commented Mar 31, 2026

Copy link
Copy Markdown

Claude finished @manav2401's task in 3m 51s —— View job


Code Review: State-Sync Tracing Support

  • Gather context and read changed files
  • Review core tracing changes (eth/tracers/api.go)
  • Review state-sync processor changes (consensus/bor/statefull/processor.go)
  • Review transaction type changes (core/types/transaction.go)
  • Review state processor changes (core/state_processor.go, core/parallel_state_processor.go)
  • Review test changes
  • Review remaining files
  • Post final review

Summary

Good PR overall. The refactoring consolidates state-sync tracing into the standard tracing codepaths (removing the separate api_bor.go and TraceBorBlock endpoint), adds live tracing support via vmConfig propagation through the consensus engine, and uses real StateReceiver contract bytecode in e2e tests. The net deletion of ~700 lines is a positive sign.

Below are the findings, organized by severity.


Issues

1. [HIGH] OnTxEnd called even when no state-sync receipt was added

File: core/state_processor.go:179-181 and core/parallel_state_processor.go:451-453

if hooks := evm.Config.Tracer; hooks != nil && hooks.OnTxEnd != nil {
    hooks.OnTxEnd(receipts[len(receipts)-1], nil)
}

OnTxEnd is called unconditionally after the Madhugiri check, regardless of whether appliedNewStateSyncReceipt is true. If no new state-sync receipt was added (i.e., appliedNewStateSyncReceipt == false), this will still call OnTxEnd with the last regular transaction's receipt, creating a mismatched OnTxStart/OnTxEnd pair. The OnTxStart at line 157 fires only when a state-sync tx is present, but OnTxEnd fires whenever IsMadhugiri is true.

Suggested fix: Guard OnTxEnd with the same condition:

if appliedNewStateSyncReceipt {
    allLogs = append(allLogs, receipts[len(receipts)-1].Logs...)
    if hooks := evm.Config.Tracer; hooks != nil && hooks.OnTxEnd != nil {
        hooks.OnTxEnd(receipts[len(receipts)-1], nil)
    }
}

Similarly, OnTxStart at line 156-159 should also be guarded — it fires if the last tx in the block is a state-sync tx, but that doesn't guarantee Finalize will actually produce a receipt for it. Consider moving both hooks to be conditional on receipt production.

Fix this →


2. [MEDIUM] Duplicate import in consensus/bor/heimdall/span/spanner.go

File: consensus/bor/heimdall/span/spanner.go:13-14

"github.com/ethereum/go-ethereum/consensus/bor/abi"
borabi "github.com/ethereum/go-ethereum/consensus/bor/abi"

The same package is imported twice — once unnamed and once as borabi. Only borabi is used (for borabi.SystemTxGas and borabi.ValidatorSet()). The unused abi import will cause a compile error unless the linter/compiler is lenient about it (it shouldn't be).

Note: Looking more carefully, the abi import was already there before the PR (for the abi.ValidatorSet() call), and the diff adds borabi as an alias. The fix should be to remove the unnamed import and keep only the aliased one, or just use abi.SystemTxGas without the alias.

Fix this →


3. [MEDIUM] traceChain doesn't handle state-sync transactions

File: eth/tracers/api.go:290-313

In traceChain, the inner loop iterates over task.block.Transactions() and calls traceTx for each tx. For state-sync transactions, traceTx calls ApplyStateSyncEvents which sets up a tx context via vmenv.SetTxContext(core.NewEVMTxContext(message)). However, the traceChain worker calls core.TransactionToMessage(tx, signer, ...) for every transaction including state-sync ones. TransactionToMessage uses the signer to recover the sender — for state-sync transactions (which have no signature), this may produce unexpected results or errors that are silently discarded by the _ assignment.

While traceTx internally handles state-sync via the StateSyncTxType branch, the message created by traceChain's outer loop at line 297 isn't meaningful for state-sync txs. This isn't a correctness bug per se (the message is passed through but traceTx overrides the behavior), but it's worth noting.

More importantly: traceBlockParallel (line 715) correctly skips state-sync txs in the parallel phase and traces them sequentially at the end (lines 807-824). But traceChain (line 290) processes all txs including state-sync in the parallel workers without this special handling. This means state-sync txs in traceChain are traced in the worker thread using a copied statedb, which may not have the correct accumulated state from prior transactions (since state-sync must be the last tx and depends on all prior state changes).


4. [MEDIUM] ApplyStateSyncEvents uses time.Now() for sync time

File: consensus/bor/statefull/processor.go:181

now := time.Now().Unix()

Using time.Now() as the syncTime parameter for commitState during tracing means:

  • Different tracing runs of the same block will produce different calldata (different timestamp).
  • This is non-deterministic and means trace results may not be reproducible.

For tracing (replay) purposes, consider using the block header's timestamp instead, which would be available from the EVM's block context. This would make traces deterministic and reproducible.

Fix this →


5. [LOW] GetSystemMessage uses params.MaxTxGas instead of block gas limit

File: consensus/bor/statefull/processor.go:77

Gas: params.MaxTxGas, // should be more than enough for state-sync related syscalls

MaxTxGas is used as the gas limit for the system message in ApplyStateSyncEvents. This is fine for production state-sync (where gas usage is bounded), but the comment acknowledges it's "should be more than enough." The original CommitState path via GenesisContractsClient also uses this value, so at least it's consistent.


6. [LOW] api_bor.go deletion removes TraceBorBlock and DebugBlock type

File: eth/tracers/api_bor.go (deleted), internal/cli/server/service.go:232

The TraceBorBlock RPC method, BlockTraceResult, TxTraceResult, and TraceBlockRequest types are removed. The DebugBlock gRPC handler is updated to use TraceBlockByNumber instead. This is a breaking change for any external callers of the debug_traceBorBlock RPC method. Make sure this is documented and communicated.


7. [LOW] Config reordering in DefaultConfig() changes TOML output order

File: internal/cli/server/config.go:810-1029

The fields in DefaultConfig() have been reordered and new defaults explicitly added (Ethstats, VMTrace, VMTraceJsonConfig, SyncMode, etc.). While functionally equivalent, this changes the output of make docs (the default_config.toml and example_config.toml diffs confirm this). No functional concern, but worth noting in case any tooling depends on field ordering.


Positive Observations

  1. Good refactoring: Removing the separate api_bor.go and consolidating state-sync handling into the standard tracing methods is the right approach. It reduces code duplication and ensures all tracing codepaths handle state-sync consistently.

  2. ABI extraction: Moving ABI definitions from consensus/bor/contract/ to consensus/bor/abi/ is a clean separation that allows the statefull package to use them without circular dependencies.

  3. Bounds check added: The new bounds check in eth/state_accessor.go:268-269 is a good defensive improvement.

  4. Strong test coverage: The e2e tests use real StateReceiver contract bytecode and verify actual EVM execution traces (LOG0 opcode counts, intermediate roots). This is much better than the previous mock-based approach.

  5. Live tracing support: Propagating vmConfig through Bor.SetVMConfigCommitState/CommitSpanstatefull.ApplyMessage is a clean way to enable live tracing for system transactions without major architectural changes.


Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Bor-specific tracing support so state-sync (bridge) transactions are included in VM/live traces, and refactors tracer codepaths to treat state-sync txs as first-class transactions (post-Madhugiri). Also updates consensus “system tx” execution to propagate VM tracer config and adds end-to-end tests that replay state-sync events via the StateReceiver contract.

Changes:

  • Extend tracing APIs to replay and trace post-Madhugiri StateSyncTx execution, including correct log indexing/cumulative context plumbing.
  • Propagate vmtrace VM config into Bor consensus system calls (span/state commits) so live tracing covers bridge processing.
  • Add E2E state-sync tracing tests using actual StateReceiver bytecode and update docs/config defaults for VM tracing.

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/bor/mocks/IHeimdallClient.go Regenerated mocks (mockgen metadata/any updates).
tests/bor/helper.go Update mock expectations for CommitSpan signature change.
tests/bor/bor_test.go Update mock expectations for CommitSpan signature change.
params/protocol_params.go Introduce params.BorSystemAddress for Bor system transactions.
internal/cli/server/service.go Switch DebugBlock tracing implementation to tracer API’s TraceBlockByNumber.
internal/cli/server/flags.go Clarify vmtrace flag usage text.
internal/cli/server/config.go Extend/reorder defaults (vmtrace/jsonconfig, rpc limits, sync/gc defaults, etc.).
eth/tracers/internal/tracetest/supply_test.go Update expected supply trace fields (issuance reward).
eth/tracers/dir.go Add Bor-specific context fields for state-sync tracing (cumulative gas/log index).
eth/tracers/api.go Major tracer refactor: state-sync tx replay/tracing, new traceTx return signature, removal of BorTraceEnabled paths.
eth/tracers/api_test.go Remove IO tracing dump test and related helpers.
eth/tracers/api_statesync_test.go Add E2E tests for tracing state-sync txs via StateReceiver + target contract.
eth/tracers/api_bor.go Remove Bor-only tracing API (TraceBorBlock) implementation.
eth/state_accessor.go Add bounds check for txIndex in stateAtTransaction.
eth/filters/IIterator.go Regenerated mocks (mockgen metadata/structure).
eth/filters/IDatabase.go Regenerated mocks + parameter naming cleanup/any updates.
eth/filters/IBatch.go Regenerated mocks + parameter naming cleanup/any updates.
eth/filters/IBackend.go Regenerated mocks + interface additions reflected in mock.
eth/ethconfig/config.go Use centralized Bor ABI package for validator set ABI.
eth/backend.go Pass VM tracer config into Bor engine for system tx tracing (SetVMConfig).
docs/cli/server.md Document updated vmtrace / vmtrace.jsonconfig defaults.
docs/cli/example_config.toml Add VM tracing example config entries.
docs/cli/default_config.toml Default vmtrace.jsonconfig to {} instead of empty string.
core/types/transaction.go Add Transaction.GetStateSyncData() helper for StateSyncTx.
core/state_processor.go Fire tracer hooks around state-sync tx processing and pass hooked state into Finalize.
core/parallel_state_processor.go Fire tracer hooks around state-sync tx processing in parallel processor.
consensus/bor/statefull/processor.go Add replay helper ApplyStateSyncEvents and plumb VM config into system calls.
consensus/bor/span.go Extend Spanner.CommitSpan signature to accept vm.Config.
consensus/bor/span_mock.go Regenerated mock for updated Spanner interface.
consensus/bor/heimdall/span/spanner.go Pass VM config through to system call execution; use centralized SystemTxGas constant.
consensus/bor/genesis.go Extend GenesisContract CommitState signature to accept vm.Config.
consensus/bor/genesis_contract_mock.go Regenerated mock for updated GenesisContract interface.
consensus/bor/contract/client.go Move ABI + SystemTxGas constants to consensus/bor/abi.
consensus/bor/bor.go Store VM config in Bor engine and use it for span/state system transactions.
consensus/bor/bor_test.go Update fakes/mocks for new CommitSpan / CommitState signatures.
consensus/bor/abi/common.go New centralized Bor ABI + SystemTxGas definitions.
cmd/keeper/go.sum Dependency checksum updates.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/cli/server/service.go Outdated
Comment thread consensus/bor/statefull/processor.go Outdated
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api_statesync_test.go Outdated
@claude

claude Bot commented Mar 31, 2026

Copy link
Copy Markdown

Code Review

Found 3 issues:

1. Bug: Duplicate import causes compilation failure - consensus/bor/heimdall/span/spanner.go:13-14

The same package consensus/bor/abi is imported twice (once as abi, once as borabi). Go does not allow this. Fix: Remove the unaliased import and use borabi consistently.

2. Bug: OnTxEnd called without guard - panics on empty blocks - core/state_processor.go:176-181, core/parallel_state_processor.go:448-453

OnTxEnd is inside the IsMadhugiri block but NOT guarded by appliedNewStateSyncReceipt. On empty blocks, receipts[len(receipts)-1] panics with index out of range. On blocks without state-sync tx, OnTxEnd fires without matching OnTxStart. Fix: Move OnTxEnd inside the appliedNewStateSyncReceipt guard.

3. Bug: time.Now() produces non-deterministic traces - consensus/bor/statefull/processor.go:181

ApplyStateSyncEvents uses time.Now().Unix() as syncTime, but production path uses deterministic event.Time from Heimdall. Makes trace results non-reproducible. Root cause: StateSyncData lacks a Time field. Fix: Add Time field to StateSyncData and use it instead of time.Now().

Copilot AI review requested due to automatic review settings April 2, 2026 10:14

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread params/protocol_params.go
Comment thread eth/tracers/api_statesync_test.go Outdated
Comment thread consensus/bor/statefull/processor.go Outdated
Comment thread eth/tracers/api.go
@codecov

codecov Bot commented Apr 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 40.64039% with 482 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.76%. Comparing base (23b80b7) to head (57a903e).
⚠️ Report is 11 commits behind head on develop.

Files with missing lines Patch % Lines
eth/filters/IBackend.go 3.03% 256 Missing ⚠️
eth/tracers/api.go 67.08% 71 Missing and 8 partials ⚠️
eth/filters/IDatabase.go 0.00% 61 Missing ⚠️
eth/filters/IBatch.go 0.00% 16 Missing ⚠️
eth/tracers/native/prestate.go 0.00% 13 Missing ⚠️
core/state_processor.go 40.00% 11 Missing and 1 partial ⚠️
tests/bor/mocks/IHeimdallClient.go 0.00% 10 Missing ⚠️
consensus/bor/statefull/processor.go 81.63% 6 Missing and 3 partials ⚠️
eth/backend.go 0.00% 9 Missing ⚠️
eth/state_accessor.go 0.00% 5 Missing ⚠️
... and 5 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2172      +/-   ##
===========================================
+ Coverage    52.38%   52.76%   +0.38%     
===========================================
  Files          885      886       +1     
  Lines       155855   156759     +904     
===========================================
+ Hits         81644    82716    +1072     
+ Misses       68979    68793     -186     
- Partials      5232     5250      +18     
Files with missing lines Coverage Δ
consensus/bor/abi/common.go 100.00% <100.00%> (ø)
core/parallel_state_processor.go 22.15% <100.00%> (+0.24%) ⬆️
core/types/bor_receipt.go 0.00% <ø> (ø)
core/types/transaction.go 51.55% <100.00%> (+3.00%) ⬆️
core/types/transaction_signing.go 54.54% <100.00%> (+3.43%) ⬆️
core/types/tx_state_sync.go 70.96% <100.00%> (+19.35%) ⬆️
eth/filters/IIterator.go 0.00% <ø> (ø)
eth/tracers/dir.go 76.47% <ø> (+23.52%) ⬆️
eth/tracers/state_sync_tracing_hooks.go 100.00% <100.00%> (ø)
internal/cli/server/config.go 63.99% <100.00%> (+0.11%) ⬆️
... and 18 more

... and 34 files with indirect coverage changes

Files with missing lines Coverage Δ
consensus/bor/abi/common.go 100.00% <100.00%> (ø)
core/parallel_state_processor.go 22.15% <100.00%> (+0.24%) ⬆️
core/types/bor_receipt.go 0.00% <ø> (ø)
core/types/transaction.go 51.55% <100.00%> (+3.00%) ⬆️
core/types/transaction_signing.go 54.54% <100.00%> (+3.43%) ⬆️
core/types/tx_state_sync.go 70.96% <100.00%> (+19.35%) ⬆️
eth/filters/IIterator.go 0.00% <ø> (ø)
eth/tracers/dir.go 76.47% <ø> (+23.52%) ⬆️
eth/tracers/state_sync_tracing_hooks.go 100.00% <100.00%> (ø)
internal/cli/server/config.go 63.99% <100.00%> (+0.11%) ⬆️
... and 18 more

... and 34 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread core/state_processor.go
Comment thread eth/tracers/api.go Outdated
Comment thread eth/tracers/api.go
Copilot AI review requested due to automatic review settings April 2, 2026 19:54
@manav2401

Copy link
Copy Markdown
Member Author

@claude review

Comment thread core/parallel_state_processor.go
Comment thread core/state_processor.go Outdated
)
if hooks := cfg.Tracer; hooks != nil && hasStateSyncTx {
if hooks.OnTxStart != nil {
hooks.OnTxStart(evm.GetVMContext(), txs[len(txs)-1], params.BorSystemAddress)

@vbhattaccmu vbhattaccmu May 21, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state-sync trace starts before Finalize,

but consensus/bor/bor.go’s Finalize() method also does non-state-sync work like span commits and contract-code updates.

Since the tracer wrapper is active for that whole window, unrelated calls can show up inside the synthetic state-sync tx. I think we can move the OnTxStart / OnTxEnd bracketing closer to the actual CommitStates call so the trace only wraps state-sync events.

hooks.OnTxStart(...)

stateSyncData, err = c.CommitStates(...)

hooks.OnTxEnd(...)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. Post rio hard fork, we don't do span commits so that's eliminated. Agree that contract code change via genesis will also be captured but that hasn't happen in quite sometime. It's an accepted trade-off because otherwise we'd have to move tracing to consensus which is not idea. Consensus should not cater to any tracing related things / conditions.

Copilot AI review requested due to automatic review settings May 21, 2026 09:47
@sonarqubecloud

Copy link
Copy Markdown

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 45 out of 50 changed files in this pull request and generated 3 comments.

Files not reviewed (5)
  • consensus/bor/genesis_contract_mock.go: Language not supported
  • consensus/bor/span_mock.go: Language not supported
  • eth/filters/IBackend.go: Language not supported
  • eth/filters/IBatch.go: Language not supported
  • eth/filters/IDatabase.go: Language not supported

Comment thread eth/tracers/api.go
Comment thread params/protocol_params.go
Comment thread core/types/tx_state_sync.go
@manav2401

Copy link
Copy Markdown
Member Author

Few things to note.

I've done extensive review locally (using the new review skills) and addressed all findings (and rejected those which are not needed) in this PR. If it's reported again, I'll comment with a rationale of why it wasn't addressed in the first place.

Re. diffguard suggestions, I've tried to simplify code complexity at multiple places. Other things flagged (mainly in the trace APIs) is pre-existing complexity and is not introduced in this PR. This PR rather simplified some code paths and remove unused ones in those methods. Few suggestions and not worth adding as breaking things down further worsens the readability and introduces many changes in tests (which are not necessary).

Re. code coverage, here's a rationale behind why it's low overall.

  • The tool also includes mocks generated during the change (the ones prefixed by I e.g. eth/filters/IBackend.go). That is not ideal and mocks shouldn't be included during coverage. Most of the file changes are due to that.
  • The tool captured some changes from our cli where as this PR doesn't change anything in that. Those should be addressed in a separate PR and shouldn't block this.
  • The tool ignores the integration tests which also covers some code paths but isn't captured in the report.

I will make changes to the code coverage config in a separate PR to address some of these concerns. Thanks!

@claude

claude Bot commented May 21, 2026

Copy link
Copy Markdown

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@manav2401

Copy link
Copy Markdown
Member Author

Closing this to create a fresh PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants