Erigon trails geth/besu/nethermind by ~1.5–2× on cold-SSTORE-bloated workloads; planned fix blocked by --experimental.concurrent-commitment wrong-trie-root bug #20920

@mh0lt

Summary

On the EthPandaOps osaka-repricings-stateful-jochem benchmarkoor SSTORE-bloated workload (4,200 cold SSTOREs against a single 10 GB EOA storage trie in one 30M-gas block), erigon (bal-devnet-3 base, sequential commitment) sits at 6.13 Mgas/s cold / 7.95 Mgas/s warm, against geth / besu / nethermind at ~10–14 Mgas/s. Those clients also read 3–4× less from disk per run than erigon's cold run, so the gap is real I/O work, not noise. Erigon's bal-devnet-3 baseline is already a ~4× lift over the published canonical erigon number for the same test (1.7 Mgas/s), so the recent commitment + parallel-exec work has paid off, but we still trail the leaders.

The natural next optimization is --experimental.concurrent-commitment, which moves the per-block hashing from one goroutine to 16. We expected this to close some of the gap. Instead, it deterministically produces a wrong trie root on this benchmark at block 24358305, which is the first concurrent-commitment batch in the run and comes before the test block even fires, so we can't measure it. Fixing or working around that bug is the gating step before we can establish whether concurrent-commitment alone closes the gap, or whether deeper work (storage-trie sub-fanout) is needed on top.


Performance comparison

Test: test_sstore_bloated[10GB-fork_Osaka-NO_CACHE-existing_slots_True-write_new_value_True-30M] — 4,200 cold SSTOREs against a 10 GB EOA's storage trie, 30M gas, no cache.

Hardware envelope (matches canonical EIP-7870 fullnode): 6 vCPUs / 32 GB RAM container, cpu_freq pinned 3.6 GHz, no turbo, performance governor, swap disabled, drop_memory_caches: "steps".

| Client | Source | Test time (s, avg / min) | Mgas/s | Disk read (GB) | Read IOPS | CPU (s) | Mem (GB) |
|---|---|---|---|---|---|---|---|
| erigon (bal-devnet-3 base, sequential, cold) | local | 4.89 / 4.86 | 6.13 | 2.69 | 657k | 3.63 | 26 |
| erigon (bal-devnet-3 base, sequential, warm) | local | 3.78 / 3.73 | 7.95 | 1.94 | 472k | 2.80 | 19 |
| erigon | published canonical | 17.53 / 17.35 | 1.70 | 5.31 | 169k | 21.98 | 25 |
| besu | published canonical | 2.72 / 2.32 | 10.99 | 0.87 | 49k | 9.67 | 2.5 |
| geth | published canonical | 2.83 / 2.36 | 10.56 | 0.65 | 42k | 4.62 | 1.18 |
| nethermind | published canonical | 2.15 / 1.40 | 13.91 | 0.73 | 47k | 6.25 | 12.2 |

(reth's 1.1 Mgas/s line excluded — that's a missed test on their end, not the comparison we should anchor on.)

Headlines:

  • bal-devnet-3 sequential commitment is already ~4× faster than the published canonical erigon (1.70 → 6.13 Mgas/s cold, 7.95 warm). The recent BAL/parallel-exec/commitment work has paid off.
  • We still trail geth / besu / nethermind by ~1.5–2× on this workload, and read 3–4× more from disk per run. The disk-read gap is the part most likely to yield to commitment-side parallelism.
  • Memory footprint is the other outlier: 19–26 GB vs 1–12 GB for everyone else. That's the snapshot mmap working set on a 246 GB MDBX + 2 TB segments dataset and is largely orthogonal to throughput — but worth keeping in mind for total-cost-of-ownership comparisons.
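The headline ratios can be re-derived from the table. The snippet below is a small sanity check, not part of the benchmark tooling: the dictionary keys are labels invented here, and the figures are copied from the table rows above.

```python
# Figures copied from the comparison table (Mgas/s and disk read in GB).
# The dictionary keys are labels made up for this sketch.
RESULTS = {
    "erigon-local-cold": {"mgas_s": 6.13, "read_gb": 2.69},
    "erigon-local-warm": {"mgas_s": 7.95, "read_gb": 1.94},
    "erigon-published": {"mgas_s": 1.70, "read_gb": 5.31},
    "besu": {"mgas_s": 10.99, "read_gb": 0.87},
    "geth": {"mgas_s": 10.56, "read_gb": 0.65},
    "nethermind": {"mgas_s": 13.91, "read_gb": 0.73},
}


def ratio(numerator, denominator, field):
    """Return RESULTS[numerator][field] / RESULTS[denominator][field]."""
    return RESULTS[numerator][field] / RESULTS[denominator][field]


if __name__ == "__main__":
    base = "erigon-local-cold"
    lift = ratio(base, "erigon-published", "mgas_s")
    print(f"lift over published erigon (cold): {lift:.2f}x")
    for leader in ("besu", "geth", "nethermind"):
        speed = ratio(leader, base, "mgas_s")
        reads = ratio(base, leader, "read_gb")
        print(f"{leader}: {speed:.2f}x faster, erigon reads {reads:.2f}x more")
```

Swapping `base` to `"erigon-local-warm"` gives the warm-cache view of the same comparison.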

Why the gap (read-amplification hypothesis)

Geth / besu / nethermind read 0.65–0.87 GB to do this block; bal-devnet-3 reads 2.69 GB cold. With one shared 10 GB storage trie and 4,200 cold slots, the difference is how the 4,200 trie traversals are batched and which intermediate pages we re-read. Erigon's HexPatriciaHashed does 16-way concurrent-commitment fanout on the first nibble of keccak256(plainKey), but on this workload all 4,200 slots share the same keccak256(addr)[0] (one EOA → one nibble), so 1 of 16 subtries does 100% of the work. The other 15 are idle. The leaders presumably do better because they batch and dedupe storage-trie reads inside the single account.

Two paths to close the gap:

  1. Make --experimental.concurrent-commitment actually run. Removing the wrong-root bug below would let us measure whether concurrent-commitment alone is enough, even with the 1-of-16 imbalance (it might still help on cross-account workloads).
  2. Storage-trie sub-fanout — within the dominant subtrie, fanout 16 ways on keccak256(slot)[0]. We have a design and a Phase 1 detection commit for this on feat/storage-parallel-trie. Phase 2 (the real mount-at-depth-64 fold) is unwritten and depends on concurrent-commitment producing correct roots first.
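The 1-of-16 imbalance and the proposed sub-fanout can be illustrated with a toy model. This is a sketch under stated assumptions, not Erigon code: it assumes the hashed storage key is hash(address) followed by hash(slot), it uses `hashlib.sha3_256` as a stand-in (this is not Ethereum's keccak256, which pads differently, but it is uniform enough to show the distribution), and the address and slot numbers are made up.

```python
import hashlib
from collections import Counter


def h(data: bytes) -> bytes:
    # Stand-in hash. NOT keccak256: sha3_256 uses different padding,
    # but only the uniform distribution of the output matters here.
    return hashlib.sha3_256(data).digest()


def first_nibble(digest: bytes) -> int:
    """High 4 bits of the first byte, i.e. the first hex nibble."""
    return digest[0] >> 4


def storage_key(addr: bytes, slot: int) -> bytes:
    # Assumed layout: hash(address) || hash(slot), 32 bytes each.
    return h(addr) + h(slot.to_bytes(32, "big"))


ADDR = bytes.fromhex("11" * 20)  # hypothetical account address
KEYS = [storage_key(ADDR, slot) for slot in range(4200)]

# Top-level fanout: bucket by the first nibble of the whole key (from the address hash).
top_level = Counter(first_nibble(k) for k in KEYS)

# Sub-fanout: bucket by the first nibble of the slot hash (nibble index 64).
sub_fanout = Counter(first_nibble(k[32:]) for k in KEYS)

if __name__ == "__main__":
    print("top-level buckets used:", len(top_level))
    print("sub-fanout buckets used:", len(sub_fanout))
    print("largest sub-fanout bucket:", max(sub_fanout.values()))
```

With a single account, every key shares the address-hash prefix, so the top-level split puts all 4,200 keys in one bucket, while the split at nibble 64 spreads them roughly evenly across 16.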

Reproducer (end-to-end, for an external machine)

Access prerequisites (do this first)

Several of the URLs and APIs below sit behind EthPandaOps access controls. Confirm you have what you need before starting — discovering it after a 1.7 TB download is no fun.

  • Snapshot bucket (https://snapshots.ethpandaops.io/...): NOT publicly accessible. Our working download was authorised against mh0lt's GitHub account; the bucket is gated by GitHub identity / EthPandaOps allowlist. An external agent on a fresh machine will likely get a 404/403. To unblock: coordinate with mh0lt (or whoever owns the bal-devnet-3 work) to either (a) request that EthPandaOps allowlist the new identity, (b) be issued a presigned URL, or (c) receive a forwarded copy of the artefact via another channel. Confirm with curl -sI <url> before starting the download: any status other than 200 (e.g. HTTP/2 200) means access is not yet granted.
  • Test fixture (https://data.ethpandaops.io/benchmarkoor/osaka-repricings-stateful-jochem.tar.gz) and opcode trace (https://data.ethpandaops.io/benchmarkoor/opcode_trace_results.json): currently public, downloaded by benchmarkoor itself at run time. If those 404 from your IP, the same fix applies.
  • Canonical published numbers (https://benchmarkoor-api.core.ethpandaops.io/api/v1/index/...): requires a bearer token from EthPandaOps. Get one before writing code that depends on it. Pattern: curl -H 'Authorization: Bearer bmk_...' '<url>'.
  • Genesis gist (https://gist.githubusercontent.com/skylenet/...): publicly hosted on GitHub Gist — typically fine, but if the gist is deleted you'll need a copy. Save the JSON locally as a fallback.
  • Docker Hub for golang:1.24-bookworm (benchmarkoor build): standard public pull, but rate-limited if unauthenticated. Consider docker login if you'll be rebuilding.
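The access checks above can be scripted so they run before any large download. The following is a minimal sketch using only the Python standard library: `BENCHMARKOOR_TOKEN` is a hypothetical environment variable name, and whether each endpoint answers HEAD requests is an assumption.

```python
import os
import urllib.error
import urllib.request


def build_request(url, token=None, method="HEAD"):
    """Build a request, adding a bearer token only when one is supplied."""
    req = urllib.request.Request(url, method=method)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


def probe(url, token=None, timeout=30):
    """Return the HTTP status for url, or None if the host is unreachable."""
    try:
        req = build_request(url, token)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403/404 when access has not been granted
    except urllib.error.URLError:
        return None


if __name__ == "__main__":
    # Hypothetical variable name; only needed for token-gated endpoints.
    token = os.environ.get("BENCHMARKOOR_TOKEN")
    base = "https://data.ethpandaops.io/benchmarkoor/"
    urls = [
        "https://snapshots.ethpandaops.io/perf-devnet-3/erigon/24358000/snapshot.tar.zst",
        base + "osaka-repricings-stateful-jochem.tar.gz",
        base + "opcode_trace_results.json",
    ]
    for url in urls:
        status = probe(url, token)
        print("OK  " if status == 200 else "FAIL", status, url)
```

Anything other than 200 on the snapshot URL means the allowlist or presigned-URL step still has to happen.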

Hardware / OS requirements

  • Linux x86_64 (we used Ubuntu 24.04, kernel 6.8).
  • 6+ cores you can pin via cpuset (we used AMD EPYC 4244P; the canonical EIP-7870 envelope is 6 logical / 3 physical, plus 1–2 cores of headroom for the host).
  • 32 GB RAM allocated to the EL container; host with ≥48 GB recommended.
  • NVMe SSD storage strongly recommended — benchmark is read-IOPS heavy (470k–660k IOPS during the test step).
  • Docker (we used 27.x).
  • Root access on host (for vm.drop_caches, cpu_freq pinning, cgroup memory caps).
  • zstd, aria2c, jq, python3 installed.

Disk space requirements

| Item | Size | When |
|---|---|---|
| Compressed snapshot (snapshot.tar.zst) | 1.69 TB | downloaded once |
| Extracted snapshot (datadir) | 2.3 TB | persistent |
| Tarball + extract simultaneously (peak) | ~4 TB | extraction window only |
| MDBX runtime growth during a run | ~340 GB | persistent after first run |
| Docker image (with embedded erigon binary) | ~530 MB | persistent |
| Overlayfs upper/work dirs per run | ~5 GB | ephemeral, cleaned on container stop |

Recommendation: 4 TB free on the volume that holds the snapshot. 3 TB is enough only if snapshot.tar.zst does not share that volume during extraction, because the tarball and the extracted datadir coexist until extraction finishes (~4 TB peak). We hit "no space left on device" mid-extract on a 3 TB volume, which corrupts the snapshot; see "Snapshot integrity check" below.
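The space arithmetic can be checked against the target volume before downloading anything. This is a rough preflight sketch: the tarball size is the byte count quoted in Step 1, the extracted size is the approximate 2.3 TB from the table, and the function names are invented here.

```python
import shutil
import sys

TARBALL_BYTES = 1_690_501_673_719  # snapshot.tar.zst, from Step 1
EXTRACTED_BYTES = 2_300_000_000_000  # ~2.3 TB extracted datadir (approximate)


def required_free_bytes(tarball_on_same_volume=True):
    """Free space needed on the extract volume before the download starts."""
    extra = TARBALL_BYTES if tarball_on_same_volume else 0
    return EXTRACTED_BYTES + extra


def has_room(free_bytes, required):
    """True if the volume can hold the peak footprint."""
    return free_bytes >= required


if __name__ == "__main__":
    # Pass the mount point of the snapshot volume; defaults to "/".
    path = sys.argv[1] if len(sys.argv) > 1 else "/"
    free = shutil.disk_usage(path).free
    need = required_free_bytes(tarball_on_same_volume=True)
    verdict = "ok" if has_room(free, need) else "NOT ENOUGH"
    print(f"{path}: free={free / 1e12:.2f} TB need={need / 1e12:.2f} TB {verdict}")
```

The check does not account for MDBX growth or overlayfs work directories, which add a few hundred GB on top once runs start.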

Step 1: Download the snapshot

URL: https://snapshots.ethpandaops.io/perf-devnet-3/erigon/24358000/snapshot.tar.zst (1690501673719 bytes ≈ 1.69 TB, zstd-compressed, includes both EL chaindata and Caplin CL data).

Use aria2c, not curl. curl --retry truncates the file on retry without -C -, and we lost progress repeatedly. aria2c with 16 parallel segments achieved ~635 MiB/s on a 10 Gbit link.

mkdir -p /erigon-data/snapshots
aria2c -c -x 16 -s 16 \
  -d /erigon-data/snapshots \
  -o snapshot.tar.zst \
  'https://snapshots.ethpandaops.io/perf-devnet-3/erigon/24358000/snapshot.tar.zst'

-c = resume a partially downloaded file; -x 16 = up to 16 connections to the server; -s 16 = split the download into 16 segments.
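Before extracting, it is worth confirming the download actually finished. The sketch below compares the file size with the byte count quoted above and also looks for aria2c's `.aria2` control file, which aria2c keeps next to an incomplete download. The size alone can be misleading if aria2c preallocated the file. The function name is made up for illustration.

```python
import os

EXPECTED_BYTES = 1_690_501_673_719  # size quoted for snapshot.tar.zst above


def download_complete(path, expected=EXPECTED_BYTES):
    """True only if the file has the expected size and no aria2c control file remains."""
    if os.path.exists(path + ".aria2"):
        return False  # aria2c still considers the download unfinished
    return os.path.isfile(path) and os.path.getsize(path) == expected


if __name__ == "__main__":
    target = "/erigon-data/snapshots/snapshot.tar.zst"
    print("complete" if download_complete(target) else "incomplete or missing")
```

A size match does not prove content integrity; no checksum is published in this issue, so this only catches truncated or unfinished downloads.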

Step 2: Extract

mkdir -p /erigon-data/snapshots/erigon/perf-devnet-3/24358000
cd /erigon-data/snapshots/erigon/perf-devnet-3/24358000
tar -I zstd -xf /erigon-data/snapshots/snapshot.tar.zst

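(-I zstd tells tar to decompress through zstd; the archive is zstd-compressed, not gzipped. Recent GNU tar versions can also auto-detect zstd on extraction, but the explicit flag is unambiguous.)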

After extract, the directory should be ~2.3 TB and contain chaindata/, snapshots/, caplin/, plus salt-blocks.txt / salt-state.txt. Roughly 1606 segment files plus an MDBX mdbx.dat.

Critical: do not pass --keep-old-files when re-running an extraction that failed midway. That flag preserves the zero-byte stub files from the failed run, leaving the snapshot corrupt (we hit this; salt-blocks.txt was 0 bytes, expected 4). On failure: delete the partial extract and re-extract from scratch, OR re-run plain tar -I zstd -xf (which overwrites stubs).

Snapshot integrity check

Any zero-byte file ending in .seg, .kv, or .v, or matching salt-*.txt, is corrupt; re-extract if the check below prints anything:

find /erigon-data/snapshots/erigon/perf-devnet-3/24358000 -size 0 -type f \
  \( -name '*.seg' -o -name '*.kv' -o -name '*.v' -o -name 'salt-*.txt' \)
# expected: empty output

(Zero-byte *.lck files are benign — those are MDBX lock files. Sparse .bt/.kvei index sidecars can be zero — verify by content type if unsure.)

Step 3: Build benchmarkoor (with the timeout patch)

benchmarkoor's stock DefaultReadyTimeout is 120s. Erigon takes 2+ minutes to come up RPC-ready on a 246 GB MDBX + 2 TB segments dataset, so the harness gives up before erigon is ready. Patch it to 900s.

git clone https://github.com/ethpandaops/benchmarkoor /tmp/benchmarkoor-src
cd /tmp/benchmarkoor-src
sed -i 's/DefaultReadyTimeout = 120 \* time.Second/DefaultReadyTimeout = 900 * time.Second/' \
  pkg/runner/runner.go

The host is missing C deps benchmarkoor needs (libbtrfs / libgpgme / libdevmapper), so build inside Docker:

cat > /tmp/Dockerfile.benchmarkoor-build <<'EOF'
FROM golang:1.24-bookworm
RUN apt-get update && apt-get install -y --no-install-recommends \
    libbtrfs-dev libgpgme-dev libdevmapper-dev pkg-config \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /src
COPY . .
RUN go build -o /benchmarkoor ./cmd/benchmarkoor
EOF

mkdir -p $HOME/benchmarkoor
docker build -t benchmarkoor-build -f /tmp/Dockerfile.benchmarkoor-build /tmp/benchmarkoor-src
docker run --rm -v $HOME/benchmarkoor:/out benchmarkoor-build sh -c 'cp /benchmarkoor /out/benchmarkoor'
sudo chown root:root $HOME/benchmarkoor/benchmarkoor   # benchmarkoor's cgroup setup expects root-owned binary when run via sudo

Result: ~65 MB binary at ~/benchmarkoor/benchmarkoor.

Step 4: Build erigon (the binary you want to test)

Standard make erigon from the branch under test. For fast iteration we used a Docker fast-swap pattern (rebuild image in <5s vs 5+ min for the full Dockerfile):

# First time only: build the canonical image once
cd $ERIGON_REPO
make docker DOCKER_TAG=local/erigon:bal-devnet-3   # or use any base image with erigon at /usr/local/bin/erigon

# After every code change: just swap the freshly-built binary into the existing image
make erigon
cat > /tmp/Dockerfile.erigon-swap <<'EOF'
FROM local/erigon:bal-devnet-3
USER root
COPY erigon /usr/local/bin/erigon
RUN chmod +x /usr/local/bin/erigon
USER erigon
EOF
cp build/bin/erigon /tmp/erigon
cd /tmp && docker build -t local/erigon:bal-devnet-3 -f Dockerfile.erigon-swap .

Step 5: benchmarkoor config

cat > $HOME/benchmarkoor/run.erigon-osaka-sstore.yaml <<'EOF'
global:
  log_level: info

runner:
  client_logs_to_stdout: true
  docker_network: benchmarkoor
  cleanup_on_start: true

  live_reporting:
    enabled: false

  benchmark:
    generate_results_index: true
    generate_suite_stats: true

    tests:
      metadata:
        labels:
          name: perf-devnet-3-24358000-osaka-stateful-erigon-local
          chain: perf-devnet-3
          block: "24358000"
          test-type: stateful
          context: repricing
          fork: osaka
          erigon-build: local-bal-devnet-3

      filter: "sstore_bloated[10GB-fork_Osaka-benchmark_test-cache_strategy_CacheStrategy.NO_CACHE-existing_slots_True-write_new_value_True-benchmark_30M"

      source:
        archive:
          file: https://data.ethpandaops.io/benchmarkoor/osaka-repricings-stateful-jochem.tar.gz
          pre_run_steps:
            - "merged/gas-bump.txt"
            - "merged/funding.txt"
          steps:
            setup:
              - "merged/setup/*.txt"
            test:
              - "merged/testing/*.txt"

      opcode_source:
        file: https://data.ethpandaops.io/benchmarkoor/opcode_trace_results.json

  client:
    config:
      drop_memory_caches: "steps"
      rollback_strategy: container-recreate

      resource_limits:
        cpuset_count: 6
        cpu_freq: "3600MHz"
        cpu_turboboost: false
        cpu_freq_governor: performance
        memory: "32g"
        swap_disabled: true

      genesis:
        erigon: https://gist.githubusercontent.com/skylenet/85704e26f3e833a02a760f623aeaaf9b/raw/1b1dcf664b6cb6db997ba77cd869a51176b6ee06/genesis-perf-devnet-3-24358000-osaka-genesis.json

    datadirs:
      erigon:
        source_dir: /erigon-data/snapshots/erigon/perf-devnet-3/24358000/
        method: overlayfs

  instances:
    - id: erigon-bal-full
      client: erigon
      metadata:
        labels:
          bal-mode: full
      image: local/erigon:bal-devnet-3
      pull_policy: never
      environment:
        ERIGON_MAX_REORG_DEPTH: "512"
        EXEC_TERSE_LOGGER_LEVEL: "3"
      extra_args:
        - --networkid=12159
        - --fcu.background.commit=false
        # add --experimental.concurrent-commitment here to repro the wrong-root bug
      bootstrap_fcu:
        enabled: true
        max_retries: 60
        backoff: 30s
EOF

source_dir must point at the extracted snapshot dir from Step 2.

Datadir method (method:) — pick overlayfs

Three options; the trade-offs matter on a constrained box:

| Method | Speed | Disk overhead | Notes |
|---|---|---|---|
| overlayfs (kernel) | fastest | only the per-run diff (~5 GB) | What we used. Needs root + overlay kernel module. Cleanly unmounts on test end. |
| fuse-overlayfs | ~2× slower | only the per-run diff | Unprivileged, pure userspace. Use if kernel overlayfs is unavailable. |
| copy | fastest after the copy completes | full duplicate (+2.3 TB) | Requires 2 × snapshot_size free disk per run. We aborted this on a 4 TB box because 2.3 TB extracted + 2.3 TB copy + benchmarkoor work left no headroom. |

Stick with overlayfs unless you have specific reasons not to.

Step 6: cold-cache wrapper (recommended for cold-baseline numbers)

drop_memory_caches: "steps" calls vm.drop_caches=3 between steps but doesn't reliably evict snapshot mmap pages held by overlayfs lower-dirs. We confirmed empirically: drop fired but warm-run pages persisted (cold first run reads 2.69 GB, warm subsequent runs read 1.94 GB).

For repeatable cold numbers, use this wrapper:

cat > $HOME/benchmarkoor/run-cold.sh <<'EOF'
#!/usr/bin/env bash
# Run benchmarkoor with a forced cold host page cache.
# Usage: sudo ./run-cold.sh [extra benchmarkoor args]
set -euo pipefail

if [ "$(id -u)" -ne 0 ]; then
  echo "ERROR: must run as root (drop_caches + cpu_freq + cgroup limits)" >&2
  exit 1
fi

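# Resolve defaults relative to this script: under sudo, $HOME may point at root's home.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
CFG="${CFG:-$SCRIPT_DIR/run.erigon-osaka-sstore.yaml}"
BIN="${BIN:-$SCRIPT_DIR/benchmarkoor}"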

echo "[cold] tearing down stale erigon-bal-full container if any"
docker rm -f erigon-bal-full 2>/dev/null || true
docker ps -a --format '{{.Names}}' \
  | grep -E '^benchmarkoor-.*-erigon-bal-full$' \
  | xargs -r docker rm -f

echo "[cold] unmounting any leftover overlayfs mounts"
mount | awk '/benchmarkoor-overlay/ {print $3}' | while read -r m; do
  umount "$m" 2>/dev/null || umount -l "$m" 2>/dev/null || true
done

echo "[cold] sync + drop_caches"
sync
echo 3 > /proc/sys/vm/drop_caches
echo "[cold] page cache after drop:"
grep -E '^Cached|^Buffers' /proc/meminfo

echo "[cold] launching benchmarkoor"
exec "$BIN" run --config "$CFG" --log-level=info "$@"
EOF
chmod +x $HOME/benchmarkoor/run-cold.sh

Step 7: run

For the sequential-commitment baseline (current best erigon perf):

sudo $HOME/benchmarkoor/run-cold.sh 2>&1 | tee /tmp/bench-baseline-cold.log

Subsequent warm runs (without dropping cache):

sudo $HOME/benchmarkoor/benchmarkoor run --config $HOME/benchmarkoor/run.erigon-osaka-sstore.yaml --log-level=info \
  2>&1 | tee /tmp/bench-baseline-warm.log

To repro the wrong-trie-root bug, add --experimental.concurrent-commitment to extra_args in the yaml and re-run. The run will fail at block 24358305 during the setup phase (i.e. before the actual test block fires), so you'll see no result.json for the test step, only the failure logged in the console output.

Step 8: read results

LATEST=$(ls $HOME/benchmarkoor/results/runs/ | grep -v index.json | sort | tail -1)
cat $HOME/benchmarkoor/results/runs/$LATEST/result.json | python3 -c "
import json, sys
d = json.load(sys.stdin)
for n, t in d.get('tests', {}).items():
    s = t.get('steps', {}).get('test', {}).get('aggregated', {})
    if not s or not s.get('time_total'):
        continue
    rt = s.get('resource_totals', {})
    print(f'{n}')
    print(f'  test_time_s={s[\"time_total\"]/1e9:.3f}')
    print(f'  gas_used={s[\"gas_used_total\"]}')
    print(f'  mgas_per_s={(s[\"gas_used_total\"]/(s[\"time_total\"]/1e9))/1e6:.2f}')
    print(f'  disk_read_GB={rt.get(\"disk_read_bytes\",0)/1e9:.2f}')
    print(f'  disk_read_iops={rt.get(\"disk_read_iops\",0)}')
    print(f'  cpu_s={rt.get(\"cpu_usec\",0)/1e6:.2f}')
"

Common failure modes (so you don't repeat ours)

  1. "no space left on device" mid-extract → the tarball and the extract coexist during extraction (~4 TB peak), so either extract onto a different volume from the one holding snapshot.tar.zst, or get a 4 TB+ volume. If extract failed, do a clean re-extract (delete partial first, then tar -I zstd -xf without --keep-old-files).
  2. benchmarkoor times out before erigon is ready → confirm you used the patched 900s DefaultReadyTimeout.
  3. docker: image not found → benchmarkoor uses pull_policy: never, so the image must be local. Build with the fast-swap step first.
  4. First cold run is much slower than subsequent runs (4.9s vs 3.7s) → expected. Page cache warms after the first iteration. Use the cold wrapper for repeatable cold numbers.
  5. Permission denied on /proc/sys/vm/drop_caches → benchmarkoor must run as root. The cold wrapper enforces this.
  6. curl download repeatedly stalls / restarts from zero → curl --retry truncates without -C -. Use aria2c.
  7. Wrong-trie-root on block 24358305 with --experimental.concurrent-commitment → not your fault. That's the blocker bug below.

Blocker: --experimental.concurrent-commitment produces wrong trie root deterministically

Reproducer

Follow the end-to-end reproducer above through Step 5, but uncomment --experimental.concurrent-commitment in extra_args:

      extra_args:
        - --networkid=12159
        - --fcu.background.commit=false
        - --experimental.concurrent-commitment

Then run Step 7. Branch under test: bal-devnet-3 (HEAD 671ece6747) — bug also reproduces on feat/storage-parallel-trie (= bal-devnet-3 + Phase 1 detect + Phase 2a buffering); reverting Phase 2a's buffer-and-replay back to inline followAndUpdate (the original Phase 1 shape) reproduces the exact same wrong root, so the storage-parallel-trie commits are NOT the cause.

Failing block: 24358305 (the LAST setup block, before the SSTORE-bloated test block 24358306).

[5/5 Execution] Wrong trie root of block 24358305:
  computed d5b10024a44c952b458ef9fe5957d35c4f8bd3aa673b2b369cd489ab75cc3437
  expected dbb289601651fbd44fbfe8fac02d4e1ab5c2f2a47aff7a0b519a8423b6bf338f
Block hash: 05a6d80ceb1828354ff3768ea2730e0412591bb5fd8627681e83d781152355af
[5/5 Execution] rw exit err="invalid block: wrong trie root, block=24358305"
  stack="[exec3_parallel.go:192 exec3_parallel.go:468 exec3_parallel.go:468
         exec3.go:259 stage_execute.go:391 default_stages.go:328
         sync.go:500 sync.go:331 stageloop.go:598 executor.go:313
         fork_validator.go:297 fork_validator.go:259 exec_module.go:483 ...]"

Same hashes, same block, same code path on every run.

Why block 305 specifically

The setup phase plays 6 small blocks (24358300–24358305). The first commitment batch is always sequential per // first run always sequential (db/state/execctx/domain_shared.go, commitment.go:158). After each batch, ConcurrentPatriciaHashed.CanDoConcurrentNext() decides whether the next batch can run concurrent.

  • Blocks 300–304 → sequential commitment → succeed.
  • After 304, CanDoConcurrentNext() returns true (root has no extension; zero-prefix branch is large enough).
  • Block 305 → first concurrent batch → wrong root.

So this is the first concurrent-commitment batch in the run. The defect is in ParallelHashSort (execution/commitment/hex_concurrent_patricia_hashed.go) or its supporting unfold/fold mechanics, not in cumulative state divergence many batches later.
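The gating described above can be modelled in a few lines. This is an illustrative model only, not the Erigon implementation: it assumes one commitment batch per setup block, and it stands in for CanDoConcurrentNext() with a lookup table built from the observed behaviour.

```python
def plan_modes(blocks, concurrent_ok_after):
    """Map each block to 'sequential' or 'concurrent'.

    The first batch is always sequential. Each later batch is concurrent only
    if the predicate returned True after the previous batch.
    """
    modes = {}
    previous = None
    for block in blocks:
        if previous is not None and concurrent_ok_after.get(previous, False):
            modes[block] = "concurrent"
        else:
            modes[block] = "sequential"
        previous = block
    return modes


SETUP_BLOCKS = range(24358300, 24358306)

# Observed on this benchmark: the predicate first returns True after block
# 24358304. Treating it as False for the earlier blocks is an assumption that
# matches those blocks running sequentially.
CAN_DO_CONCURRENT_NEXT = {24358304: True}

if __name__ == "__main__":
    for block, mode in plan_modes(SETUP_BLOCKS, CAN_DO_CONCURRENT_NEXT).items():
        print(block, mode)
```

Under this model, 24358305 is the only setup block that takes the concurrent path, which matches where the wrong root appears.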

What we ruled out

| Hypothesis | Test | Outcome |
|---|---|---|
| Phase 2a's buffering broke ParallelHashSort | Reverted Phase 2a's buffer-and-replay back to inline followAndUpdate (the original Phase 1 shape) | Same wrong root; Phase 2a innocent |
| Some bal-devnet-3 BAL/parallel-exec interaction | All earlier benchmark runs on bal-devnet-3 without the flag → sequential commitment → run clean | bal-devnet-3 fine without the flag |
| --exec.no-prune interaction | Both 671ece6747 (base + no-prune fix) and pre-no-prune commits show the same failure | Unrelated |

What we did NOT yet test (handoff items)

  1. Build origin/main with --experimental.concurrent-commitment and run the same benchmark. Tells us whether this is a pre-existing upstream bug or a bal-devnet-3 regression.
  2. Bisect bal-devnet-3 if main is fine. Likely candidates: BAL system-address filter (gas_table.go), parallel-exec asynctx pattern fixes, the warmuper changes (#20877 "warmuper: blocking and more (#20819)" / #20884 "[bal-devnet-3] warmuper: blocking and more (#20877)"), the BAL-balance seeding fix (#20864 "[bal-devnet-3] execution/state: don't seed initial BAL balance from post-write reads"), and any commitment-side changes since the last known-good main concurrent-commitment baseline.
  3. ParallelHashSort invariants on this block. With dbg.SetTrace(true) on the concurrent trie and serial trie, capture the unfold/fold sequence for block 305 and diff. That should localise where the divergence happens.
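For handoff item 3, once both traces are saved to files, finding the first point of divergence is a line-by-line comparison. The helper below is a generic sketch: it assumes each trace is plain text with one event per line, and the actual trace format is not specified in this issue.

```python
import sys
from itertools import zip_longest


def first_divergence(serial, concurrent):
    """Return (index, serial_line, concurrent_line) for the first mismatch, or None.

    A missing line on either side is reported as None.
    """
    for i, (a, b) in enumerate(zip_longest(serial, concurrent)):
        if a != b:
            return i, a, b
    return None


def load(path):
    """Read a trace file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh]


if __name__ == "__main__":
    # Usage: python3 diff_trace.py serial.log concurrent.log
    result = first_divergence(load(sys.argv[1]), load(sys.argv[2]))
    if result is None:
        print("traces are identical")
    else:
        index, serial_line, concurrent_line = result
        print(f"first divergence at line {index + 1}")
        print(f"  serial:     {serial_line}")
        print(f"  concurrent: {concurrent_line}")
```

One caveat: the concurrent trace may interleave output from several goroutines, so the lines would probably need to be grouped per subtrie before a positional comparison is meaningful.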

Methodology notes

  • Canonical published numbers fetched from https://benchmarkoor-api.core.ethpandaops.io/api/v1/index/suites/2477940593a59252/stats?max_runs_per_client=25.

Storage-parallel-trie experiment (paused, branch preserved)

Sub-fanout idea: within the 1-of-16 dominant subtrie on this workload, split 16 ways on keccak256(slot)[0]. Two phases committed on feat/storage-parallel-trie:

  • 2bc8977800 — Phase 1: detect single-account-dominated subtries, log only.
  • 3a3bcf3c04 — Phase 2a: warmup-only fanout (16 inner goroutines that followAndUpdate clone subtries to populate the OS page cache for the canonical pass).

Phase 2a measurement (with --experimental.concurrent-commitment not enabled — i.e. dead code): ~0% delta, expected. Once we turned the flag on, the wrong-root bug above blocked everything. Branch preserved on GitHub, not merged. Both phases inert without --experimental.concurrent-commitment.

Phase 2b (real mount-at-depth-64 fanout with parallel CPU work) deferred until concurrent commitment is correct.


Acceptance

The performance gap to geth/besu/nethermind is the headline. The path to closing it routes through --experimental.concurrent-commitment, so step 1 of the handoff is a working concurrent-commitment baseline on this benchmark — either by fixing the divergence on block 24358305, or by documenting it as a pre-existing main-branch bug and filing the fix there.

Once that exists, the measurement that's actually interesting is: with-flag vs without-flag on the same SSTORE-bloated block, both warm and cold. If concurrent-commitment closes most of the gap, we're done. If not, Phase 2 of the storage-parallel-trie work is the follow-up.
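One way to make "closes most of the gap" concrete is to express the with-flag result as a fraction of the baseline-to-leader gap. This is a suggested metric, not one the benchmark tooling reports, and the function name is made up. The with-flag throughput is deliberately left as an input because it has not been measured yet.

```python
import sys


def gap_closed(baseline, with_flag, leader):
    """Fraction of the baseline-to-leader throughput gap recovered by with_flag.

    0.0 means no improvement over baseline; 1.0 means parity with the leader.
    """
    if leader <= baseline:
        raise ValueError("leader must be faster than baseline")
    return (with_flag - baseline) / (leader - baseline)


if __name__ == "__main__":
    # Cold-cache Mgas/s from the comparison table: erigon baseline vs geth.
    baseline, leader = 6.13, 10.56
    # Pass the measured with-flag Mgas/s as the first argument once it exists.
    with_flag = float(sys.argv[1]) if len(sys.argv) > 1 else baseline
    print(f"gap closed: {gap_closed(baseline, with_flag, leader):.0%}")
```

Computing it separately for cold and warm runs, and against each leader, keeps the comparison from hinging on a single pairing.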
