Skip to content

parallel-exec: remove skipCheck and the heuristics compensating for skip-overridden results #21319

@mh0lt

Description

@mh0lt

Goal

Remove skipCheck from parallel-exec result validation so the validator is the single source of truth, and remove the heuristic fixes that have accreted to compensate for the slightly-wrong results skipCheck-overridden commits produce.

Background

In blockExecutor.nextResult (execution/stagedsync/exec3_parallel.go):

valid := be.skipCheck[tx] || validity == state.VersionValid

skipCheck[tx] is set when a task is dispatched while every preceding task is already finished and validated (tx == maxValidated+1) — under that precondition the result is correct by construction. But the || means a true here overrides a validator VersionInvalid verdict. Where the validator and skipCheck disagree, the disagreements are validator false positives — so the override has been "correct", but it means the validator has never been the forcing function for its own correctness, and downstream heuristics have grown around skip-overridden results.

(A reference doc explaining how/why skipCheck works is on the blockExecutor.skipCheck field in the WIP branch below.)

WIP branch

mh/parallel-exec-skipcheck-removal-wip (commit 97c05b5e02) — not for merge as-is.

Contains: skipCheck no longer masks VersionInvalid, plus three validator/recording false-positive fixes it surfaced:

  1. validateReadImpl MVReadResultNone — mirror the recording-side fold of a sub-field MapRead onto AddressPath.
  2. versionedRead — a per-account-field read of a self-destructed (not-revived) account records a SelfDestructPath dependency instead of a bare StorageRead.
  3. CreateAccount — the synthetic IncarnationPath read is stamped with the IncarnationPath cell's own version.

State

  • With skipCheck removed + the 3 fixes: prefund and SD/CREATE2-collision false positives resolved — cancun stable 8389/8389.
  • Remaining bug: 6 test_recreate[recreate_on_separate_block_False] fail invalid state root hash — a genuine same-block self-destruct-then-recreate parallel-commitment correctness bug (not a validator false positive).
  • The 3 fixes are coupled to skipCheck removal: each regresses state-root correctness if applied with skipCheck still in place (double-suppression — skipCheck and a more-permissive validator both dropping invalidations). They cannot be split into a skipCheck-on PR.

Tasks

  • Fix the same-block self-destruct-then-recreate state-root bug.
  • Audit and remove the heuristic fixes that compensate for skip-overridden results (comments of the form "without this, TestX fails with gas exactly N short" in validateReadImpl / versionedio.go / normalizeWriteSet). For each: remove it, re-run the test that justified it — still green ⇒ it was masking; red ⇒ a real bug to fix properly.
  • Land the 3 false-positive fixes together with skipCheck removal.
  • Sub-item: validateReadImpl recursive cross-validate probes are inert no-ops — parallel-exec: validateReadImpl recursive cross-validate probes are inert no-ops #21318.

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions