db/state: add optional throttle to MergeLoop to reduce disk I/O pressure by lemenkov · Pull Request #20486 · erigontech/erigon

lemenkov · 2026-04-10T20:51:46Z

Problem

When Erigon is running at chain tip, MergeLoop executes merge steps back-to-back with no pause between iterations. Each merge step involves heavy disk I/O (reading, compressing, and writing state files). Running these steps consecutively saturates the disk, starving block execution of I/O bandwidth.

The result is periodic block processing stalls: the node's reported block number freezes for minutes at a time while background merges consume all available I/O, then bursts forward when a merge step completes. During these stalls the node falls behind the chain tip and is marked unhealthy by load balancers.

Observed behavior

On a production fleet running Erigon v3.3.x on AWS Graviton instances (64GB RAM, EBS gp3 volumes), we observed the following pattern during MergeLoop activity on individual nodes:

Block execution throughput drops from ~20 Mgas/s to 1-5 Mgas/s
Node block number freezes for 8-16 minutes per merge step
Page cache eviction of 16GB+ as merge I/O displaces cached state data
Lag accumulates at ~5 blocks/minute during each stall
Worst observed: 164 blocks behind over a 188-minute period of continuous merge activity

The node always recovers eventually, but the stalls cause the node to be removed from load balancer rotation, reducing fleet capacity.

Solution

Add a configurable delay between MergeLoop iterations via the ERIGON_MERGE_THROTTLE_MS environment variable (default 0, preserving current behavior). The delay is inserted after each successful mergeLoopStep, giving block execution a window to access the disk before the next merge step begins.

Before (current):
  mergeLoopStep()  → heavy I/O
  mergeLoopStep()  → immediately, more heavy I/O
  mergeLoopStep()  → immediately, more heavy I/O
 
After (with ERIGON_MERGE_THROTTLE_MS=2000):
  mergeLoopStep()  → heavy I/O
  sleep(2s)        → block execution catches up
  mergeLoopStep()  → heavy I/O
  sleep(2s)        → block execution catches up
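
A minimal Go sketch of the loop shape described above, not the code from this PR. The names parseThrottleMS, mergeLoop and runDemo are hypothetical, and the step callback is a stand-in for mergeLoopStep; only the ERIGON_MERGE_THROTTLE_MS variable, the default of 0, and the "pause after each successful step" rule come from the description.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"strconv"
	"time"
)

// parseThrottleMS converts a millisecond string into a delay.
// Empty, malformed, or negative input yields 0, i.e. no throttle.
func parseThrottleMS(raw string) time.Duration {
	ms, err := strconv.Atoi(raw)
	if err != nil || ms < 0 {
		return 0
	}
	return time.Duration(ms) * time.Millisecond
}

// mergeLoop (hypothetical) calls step until it reports no more work,
// pausing for delay after every successful step. The pause is
// interruptible so shutdown is not held up by the throttle.
func mergeLoop(ctx context.Context, delay time.Duration, step func() (bool, error)) (int, error) {
	steps := 0
	for {
		didWork, err := step()
		if err != nil {
			return steps, err
		}
		if !didWork {
			return steps, nil
		}
		steps++
		if delay <= 0 {
			continue // default: back-to-back steps, current behavior
		}
		select {
		case <-ctx.Done():
			return steps, ctx.Err()
		case <-time.After(delay):
		}
	}
}

// runDemo drives mergeLoop with a fake step that succeeds n times.
func runDemo(n int, delay time.Duration) (int, error) {
	remaining := n
	return mergeLoop(context.Background(), delay, func() (bool, error) {
		if remaining == 0 {
			return false, nil
		}
		remaining--
		return true, nil
	})
}

func main() {
	delay := parseThrottleMS(os.Getenv("ERIGON_MERGE_THROTTLE_MS"))
	steps, err := runDemo(3, delay)
	fmt.Println("steps:", steps, "delay:", delay, "err:", err)
}
```

With the variable unset the demo finishes immediately; with ERIGON_MERGE_THROTTLE_MS=2000 it pauses 2 seconds after each of the three fake steps.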

Production results

We have been running this patch on a 3-node production fleet since December 2025. Results:

Individual node availability during merge-heavy periods improved from ~90% to >99%
Block execution stalls reduced from 8-16 minutes to under 5 minutes
Nodes maintain chain tip proximity during merge activity
No negative impact on merge completion time (merges still finish, just spread over a slightly longer window)
Fleet-wide availability (via load-balanced proxy) is near 99.99%, with the remaining downtime caused by synchronized stalls that this patch and AGGREGATION_DELAY_MS (PR #20391, "Reduce impact of synchronized aggregation across fleet nodes") address together

Recommended values based on our testing:

Use case	Value (ms)	Effect
Default (no throttle)	0	Current behavior, no change
Light throttle	500	Slight breathing room between merges
Production RPC nodes	2000	Good balance of merge progress and block execution
Heavy RPC workload	5000	Prioritize block execution over merge speed

Notes

This is complementary to COMPRESS_WORKERS (PR #18995, "Reduce impact of background merge/compress to ChainTip"), which reduces I/O pressure within each merge step by limiting worker parallelism. This PR addresses I/O pressure between merge steps.
This is also complementary to AGGREGATION_DELAY_MS (PR #20391, "Reduce impact of synchronized aggregation across fleet nodes", merged), which staggers the start time of aggregation across fleet nodes.
No impact on single-node deployments or initial sync (default delay is 0).

AskAlexSharov · 2026-04-11T00:23:13Z

I'm not sure that

After (with ERIGON_MERGE_THROTTLE_MS=2000):
  mergeLoopStep()  → heavy I/O
  sleep(2s)        → block execution catches up
  mergeLoopStep()  → heavy I/O
  sleep(2s)        → block execution catches up

is better, because a single merge step can take 24 hours (on very large .kv files).

lemenkov · 2026-04-11T01:18:53Z

> I'm not sure that
>
> After (with ERIGON_MERGE_THROTTLE_MS=2000):
>   mergeLoopStep()  → heavy I/O
>   sleep(2s)        → block execution catches up
>   mergeLoopStep()  → heavy I/O
>   sleep(2s)        → block execution catches up
>
> is better, because a single merge step can take 24 hours (on very large .kv files).

Good point! The throttle wouldn't help during a single merge of large historical files (initial sync?). That's a different problem entirely (and NO_DEEP_MERGE_HISTORY=true is probably the right workaround for that).

What we're addressing is the chain-tip case: multiple smaller merge steps (e.g. step ranges 2048-2176, 2176-2192, 2192-2194 - real scenario btw) running back-to-back, each taking a few minutes. Without a pause between them, block execution is starved of I/O continuously across the full sequence. The customizable gap lets block execution process pending blocks between each step.

On our fleet at chain tip, individual merge steps take 1-5 minutes, not hours. The stalls come from running 3-5 of them consecutively with no break.

09:59:34  BuildFilesInBackground step=2193
10:02:09  aggregated step=2193 took=2m30s
10:02:09  MergeLoop throttle enabled delay_ms=2000
10:02:41  serial executed blk=24826479 gas/s=4.97M   ← catching up during merge
10:04:24  Execution DONE in=2m20s block=24826482
10:06:02  Execution DONE in=1m16s block=24826488
10:07:16  Execution DONE in=1m9s block=24826494

Without the throttle, those merge steps run immediately one after another and block execution stays at 1-5 Mgas/s for the entire duration. With the throttle, each 2-second gap allows a burst of block processing at closer to normal throughput.
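
For a sense of scale, the pause is small next to the merge work itself. The sketch below is plain arithmetic on the figures quoted in this thread (3-5 steps of 1-5 minutes each, a 2-second delay); throttleOverhead is a hypothetical helper, and it assumes one pause after every step.

```go
package main

import (
	"fmt"
	"time"
)

// throttleOverhead returns the total pause added to a run of merge steps
// and that pause as a fraction of the time spent merging.
// Assumes one pause of length delay after each step.
func throttleOverhead(steps int, stepDur, delay time.Duration) (time.Duration, float64) {
	added := time.Duration(steps) * delay
	work := time.Duration(steps) * stepDur
	if work <= 0 {
		return added, 0
	}
	return added, float64(added) / float64(work)
}

func main() {
	cases := []struct {
		steps   int
		stepDur time.Duration
	}{
		{3, 1 * time.Minute}, // shortest sequence quoted in the thread
		{5, 5 * time.Minute}, // longest sequence quoted in the thread
	}
	for _, c := range cases {
		added, frac := throttleOverhead(c.steps, c.stepDur, 2*time.Second)
		fmt.Printf("%d steps of %v: +%v (%.1f%% of merge time)\n",
			c.steps, c.stepDur, added, 100*frac)
	}
	// Prints:
	// 3 steps of 1m0s: +6s (3.3% of merge time)
	// 5 steps of 5m0s: +10s (0.7% of merge time)
}
```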

If you'd like to verify, here are a few easy approaches:

Side-by-side comparison: if you have two nodes at chain tip, set ERIGON_MERGE_THROTTLE_MS=2000 on one and leave the other at default. Compare mgas/s and block age during the next merge cycle.
Single node, restart at chain tip: let a node sync normally, then restart it with ERIGON_MERGE_THROTTLE_MS=2000. No re-sync needed, btw. The next MergeLoop cycle will use the throttle.
Just check existing logs: look at any node at chain tip during MergeLoop, and if you see consecutive merge steps completing with sustained mgas/s drops between them, that's exactly where the 2-second gap helps.

We're happy to provide more information; we're very open in this regard.

AskAlexSharov · 2026-04-11T02:39:21Z

"individual merge steps take 1-5 minutes" - in your case, is it acceptable to have 5 min of 1-5 Mgas/s?

AskAlexSharov · 2026-04-11T02:43:52Z

NO_DEEP_MERGE_HISTORY=true is a good workaround, but it only applies to history files (.ef/.v), not to domain files (.kv).

The real source of high I/O during .kv merges is #14809, but I'm not sure when we will be able to release a full fix for that issue.

So I would just merge your PR, because it adds some flexibility. And if we have time, we can replace this env variable with some sort of randomized time.
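
One possible reading of the "randomized time" idea, sketched in Go. This is an assumption about what such a replacement could look like, not something proposed in detail in this thread or implemented in Erigon. jitteredDelay is a hypothetical helper, and the spread of half to one and a half times the base delay is an arbitrary choice.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredDelay maps a uniform sample u in [0, 1) to a pause in
// [base/2, 3*base/2). Nodes drawing different samples pause for
// different lengths, so their merge steps drift apart over time.
func jitteredDelay(base time.Duration, u float64) time.Duration {
	if base <= 0 {
		return 0 // throttle disabled
	}
	return base/2 + time.Duration(u*float64(base))
}

func main() {
	base := 2 * time.Second
	for i := 0; i < 3; i++ {
		fmt.Println(jitteredDelay(base, rand.Float64()))
	}
}
```

Taking the random sample as an argument keeps the helper deterministic, so it can be tested without seeding a generator.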

The MergeLoop background goroutine performs continuous heavy disk I/O when merging state files. This saturates disk (90%+ utilization), blocking block execution which competes for the same storage.

Symptoms observed:
- Block drift increases during merge operations
- Synchronized stalls across nodes at similar block heights
- RPC timeouts and degraded service

Add ERIGON_MERGE_THROTTLE_MS environment variable to insert a pause between merge operations. This reduces disk contention and allows operators to desynchronize merge timing across a node fleet.

Usage: ERIGON_MERGE_THROTTLE_MS=500 ./erigon ...

Default behavior (no throttle) is preserved when unset.

Signed-off-by: Peter Lemenkov <lemenkov@gmail.com>
Assisted-by: Claude (Anthropic) <https://claude.ai>

lemenkov · 2026-04-11T03:04:03Z

> "individual merge steps take 1-5 minutes" - in your case, is it acceptable to have 5 min of 1-5 Mgas/s?

Yes. We rely on a proxy-server on top of it and it'll handle the dispatching.

lemenkov requested review from AskAlexSharov, Giulio2002 and sudeepdino008 as code owners April 10, 2026 20:51

AskAlexSharov approved these changes Apr 11, 2026

lemenkov force-pushed the merge_throttle branch from c93cd42 to b777f31 April 11, 2026 03:01

AskAlexSharov enabled auto-merge April 11, 2026 03:30

AskAlexSharov added this pull request to the merge queue Apr 11, 2026

Merged via the queue into erigontech:main with commit 302e128 Apr 11, 2026
33 checks passed

yperbasis mentioned this pull request May 19, 2026

Performance regression on the chain tip #21008

Closed
