Reduce impact of background merge/compress to ChainTip by AskAlexSharov · Pull Request #18995 · erigontech/erigon

AskAlexSharov · 2026-02-06T01:36:40Z

No description provided.

oh wait, I think there might be some unintended changes to the execution-spec-tests and node/interfaces submodules?

taratorio · 2026-02-06T01:42:12Z

I think there might be some unintended changes to the execution-spec-tests and node/interfaces submodules?

## Reduce impact of synchronized aggregation across fleet nodes

### Problem

When running multiple Erigon nodes syncing the same chain, all nodes cross snapshot step boundaries at nearly the same time (within seconds of each other). This triggers `BuildFilesInBackground` simultaneously on every node, and the resulting aggregation I/O stalls block execution on all nodes at once.

In a load-balanced fleet this causes a total service outage: every backend falls behind the chain tip simultaneously, and the proxy has zero healthy backends to route traffic to.

### Real-world incident (April 7 2026)

We operate a 3-node fleet. After ~2 months of stable operation, all nodes hit aggregation step 2193 within 20 seconds of each other:

| Node | `BuildFilesInBackground step=2193` | Aggregation duration |
|------|-------------------------------------|---------------------|
| node-1 | 09:59:34 | 2m30s |
| node-2 | 09:59:28 | 2m29s |
| node-3 | 09:59:48 | still aggregating, was restarted |

During the aggregation, block execution throughput dropped from ~20 Mgas/s to ~1-5 Mgas/s. All nodes fell behind the chain tip. At 10:07:33 the fleet had **0 out of 3 healthy backends** for 60 seconds.

The aggregation step itself evicted ~16GB of page cache (RSS dropped from 48GB to 32GB on one node), starving block execution of I/O bandwidth. Each node recovered on its own within 10-15 minutes, but the synchronized nature of the stall meant there was no healthy node to absorb traffic during the event.

### Root cause

`BuildFilesInBackground` is triggered when `txNum` crosses a step boundary. Since all nodes process the same chain in real time, they all cross the boundary on the same block. The trigger is deterministic: there is no jitter or per-node offset.

### Solution

Add a configurable delay (`ERIGON_AGGREGATION_DELAY_MS`, default 0) at the start of `BuildFilesInBackground`, before the build loop begins. This follows the same pattern as the existing `COMPRESS_WORKERS` env var in `common/dbg/experiments.go`.

Operators running multi-node fleets can set different values per node to desynchronize aggregation:

```
node-1: ERIGON_AGGREGATION_DELAY_MS=0
node-2: ERIGON_AGGREGATION_DELAY_MS=60000
node-3: ERIGON_AGGREGATION_DELAY_MS=120000
```

This guarantees at least 60 seconds between each node starting its aggregation, which would have completely prevented the 0/3 healthy window in the incident above. Single-node operators are unaffected (default is 0).

### Notes

- This is complementary to `COMPRESS_WORKERS` (PR #18995) which reduces I/O pressure *within* each aggregation step. This PR addresses the *timing* of when aggregation starts across nodes.
- No impact on single-node deployments or initial sync (default delay is 0).

---------

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
Co-authored-by: Alexey Sharov <AskAlexSharov@gmail.com>

…ure (#20486)

### Problem

When Erigon is running at chain tip, `MergeLoop` executes merge steps back-to-back with no pause between iterations. Each merge step involves heavy disk I/O (reading, compressing, and writing state files). Running these steps consecutively saturates the disk, starving block execution of I/O bandwidth.

The result is periodic block processing stalls: the node's reported block number freezes for minutes at a time while background merges consume all available I/O, then bursts forward when a merge step completes. During these stalls the node falls behind the chain tip and is marked unhealthy by load balancers.

### Observed behavior

On a production fleet running Erigon v3.3.x on AWS Graviton instances (64GB RAM, EBS gp3 volumes), we observed the following pattern during MergeLoop activity on individual nodes:

- Block execution throughput drops from ~20 Mgas/s to 1-5 Mgas/s
- Node block number freezes for 8-16 minutes per merge step
- Page cache eviction of 16GB+ as merge I/O displaces cached state data
- Lag accumulates at ~5 blocks/minute during each stall
- Worst observed: 164 blocks behind over a 188-minute period of continuous merge activity

The node always recovers eventually, but the stalls cause the node to be removed from load balancer rotation, reducing fleet capacity.

### Solution

Add a configurable delay between `MergeLoop` iterations via the `MERGE_THROTTLE_MS` environment variable (default 0, preserving current behavior). The delay is inserted after each successful `mergeLoopStep`, giving block execution a window to access the disk before the next merge step begins.

```
Before (current):
  mergeLoopStep() → heavy I/O
  mergeLoopStep() → immediately, more heavy I/O
  mergeLoopStep() → immediately, more heavy I/O

After (with ERIGON_MERGE_THROTTLE_MS=2000):
  mergeLoopStep() → heavy I/O
  sleep(2s)       → block execution catches up
  mergeLoopStep() → heavy I/O
  sleep(2s)       → block execution catches up
```

### Production results

We have been running this patch on a 3-node production fleet since December 2025. Results:

- Individual node availability during merge-heavy periods improved from ~90% to >99%
- Block execution stalls reduced from 8-16 minutes to under 5 minutes
- Nodes maintain chain tip proximity during merge activity
- No negative impact on merge completion time (merges still finish, just spread over a slightly longer window)
- Fleet-wide availability (via load-balanced proxy) is near 99.99%, with the remaining downtime caused by synchronized stalls that this patch and `AGGREGATION_DELAY_MS` (PR #20391) address together

Recommended values based on our testing:

| Use case | Value (ms) | Effect |
|----------|------------|--------|
| Default (no throttle) | 0 | Current behavior, no change |
| Light throttle | 500 | Slight breathing room between merges |
| Production RPC nodes | 2000 | Good balance of merge progress and block execution |
| Heavy RPC workload | 5000 | Prioritize block execution over merge speed |

### Notes

- This is complementary to `COMPRESS_WORKERS` (PR #18995) which reduces I/O pressure *within* each merge step by limiting worker parallelism. This PR addresses I/O pressure *between* merge steps.
- This is also complementary to `AGGREGATION_DELAY_MS` (PR #20391, merged) which staggers the *start time* of aggregation across fleet nodes.
- No impact on single-node deployments or initial sync (default delay is 0).

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>

AskAlexSharov added 2 commits February 6, 2026 08:15

save

79e8992

save

ff6f939

AskAlexSharov requested review from Giulio2002, mh0lt and yperbasis as code owners February 6, 2026 01:36

AskAlexSharov requested a review from taratorio February 6, 2026 01:37

taratorio previously approved these changes Feb 6, 2026

View reviewed changes

AskAlexSharov mentioned this pull request Feb 6, 2026

cli, execution: add --compression.workers flag to limit snapshot compression workers #18971

Closed

save

79e54c0

AskAlexSharov enabled auto-merge (squash) February 6, 2026 02:05

taratorio approved these changes Feb 6, 2026

View reviewed changes

AskAlexSharov merged commit b2dc316 into release/3.3 Feb 6, 2026
11 checks passed

AskAlexSharov deleted the alex/comp_workers_reduce_33 branch February 6, 2026 02:10

yperbasis added the performance label Feb 9, 2026

lemenkov mentioned this pull request Apr 7, 2026

Reduce impact of synchronized aggregation across fleet nodes #20391

Merged

lemenkov mentioned this pull request Apr 10, 2026

db/state: add optional throttle to MergeLoop to reduce disk I/O pressure #20486

Merged

AskAlexSharov merged 3 commits into release/3.3 from alex/comp_workers_reduce_33
