Skip to content

tracers: fix Stop/GetResult race in native tracers#21123

Merged
AskAlexSharov merged 2 commits into
erigontech:mainfrom
Sahil-4555:fix/native-tracer-stop-race
May 12, 2026
Merged

tracers: fix Stop/GetResult race in native tracers#21123
AskAlexSharov merged 2 commits into
erigontech:mainfrom
Sahil-4555:fix/native-tracer-stop-race

Conversation

@Sahil-4555

Copy link
Copy Markdown
Contributor

This ports the relevant native tracer race fix to Erigon.

Native tracers that support Stop store an interruption reason from the trace timeout path, while GetResult can read that reason concurrently from the RPC handler path. The reason was stored as a plain error interface, so concurrent reads and writes were unsynchronized.

Because error is an interface value, racing access can observe an inconsistent interface state. This replaces the plain reason field with atomic.Pointer[error] in Erigon’s native tracers:

  • execution/tracing/tracers/native/call.go
  • execution/tracing/tracers/native/4byte.go
  • execution/tracing/tracers/native/prestate.go
    flatCallTracer shares the underlying callTracer.reason, so its GetResult now reads that pointer atomically as well.

Also adds TestTracerStopRace, covering concurrent Stop and GetResult for:

  • callTracer
  • flatCallTracer
  • 4byteTracer
  • prestateTracer

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a data race between Stop (timeout watchdog) and GetResult (RPC handler) in Erigon’s native tracers by making the interruption reason concurrently readable/writable via an atomic pointer, and adds a regression test intended to cover the concurrent Stop/GetResult path.

Changes:

  • Replace plain error interruption reason fields with atomic.Pointer[error] in callTracer, fourByteTracer, and prestateTracer.
  • Update GetResult implementations (including flatCallTracer) to read the reason via Load() and handle the unset (nil) case.
  • Add TestTracerStopRace to exercise concurrent Stop and GetResult across multiple native tracers.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
execution/tracing/tracers/native/tracer_test.go Adds a concurrency regression test for Stop vs GetResult on several native tracers.
execution/tracing/tracers/native/prestate.go Makes Stop/GetResult interruption reason access atomic in prestateTracer.
execution/tracing/tracers/native/call.go Makes Stop/GetResult interruption reason access atomic in callTracer.
execution/tracing/tracers/native/call_flat.go Reads shared callTracer interruption reason atomically in flatCallTracer.GetResult.
execution/tracing/tracers/native/4byte.go Makes Stop/GetResult interruption reason access atomic in fourByteTracer.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +39 to +43
// callTracer and flatCallTracer's GetResult short-circuits on
// an empty callstack ("incorrect number of top-level calls") before loading
// the reason. For those tracers the test pushes a single top-level call frame
// via OnEnter so GetResult reaches the reason.Load() path where the race can
// be observed under -race.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The important bit here is that both tracers return before reaching the reason.Load() path when the callstack is empty, but you’re right that the quoted error only matches callTracer; flatCallTracer returns "invalid number of calls" instead.

Comment thread execution/tracing/tracers/native/tracer_test.go Outdated
@AskAlexSharov AskAlexSharov enabled auto-merge May 12, 2026 05:30
@AskAlexSharov AskAlexSharov added this pull request to the merge queue May 12, 2026
Merged via the queue into erigontech:main with commit d048265 May 12, 2026
34 of 35 checks passed
@Sahil-4555 Sahil-4555 deleted the fix/native-tracer-stop-race branch May 12, 2026 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants