Erigon is OOM killed by OS during eth_getLogs() #19719

@lupin012

Description

System information

A User (gautam.jha):
I’m running Erigon on my Kubernetes cluster for an Ethereum Sepolia node. The setup had been stable for about a month while running erigontech/erigon:v3.0.3 with a memory limit of 45GB.

Starting today (24/02/2026), the pod began getting OOMKilled repeatedly. Due to the frequent restarts, I increased the memory limit to 75GB, but memory consumption continued to rise and is now reaching approximately 130GB RAM.

After the repeated restarts, I upgraded the image from erigontech/erigon:v3.0.3 to erigontech/erigon:v3.3.8 (latest), but the high memory usage persists.

I’m trying to understand:

Why would memory usage suddenly spike after being stable for a month?
Could this be related to the version upgrade from 3.0.3 to 3.3.8?
Is there any known memory regression or behavior change in recent releases?
Could any of my runtime flags be contributing to this?

Below are the arguments I’m using:

--datadir=/data/erigon/
--chain=sepolia
--port=30303
--http
--http.api=eth,debug,net,trace,web3,erigon,txpool
--http.addr=0.0.0.0
--http.vhosts=any
--http.corsdomain='*'
--torrent.download.rate=512mb
--http.port=8545
--ws
--ws.port=8546
--authrpc.addr=0.0.0.0
--authrpc.port=8551
--authrpc.jwtsecret=/data/erigon/jwt.hex
--externalcl
--healthcheck
--metrics
--metrics.addr=0.0.0.0
--metrics.port=6061
--nat=auto

Erigon version: ./erigon --version

OS & Version: Windows/Linux/OSX

Kubernetes cluster for an Ethereum Sepolia node
erigontech/erigon:v3.0.3

Commit hash:

Erigon Command (with flags/config):

Consensus Layer:

Consensus Layer Command (with flags/config):

Chain/Network:

Expected behaviour

Actual behaviour

Starting today, the pod began getting OOMKilled repeatedly. Due to the frequent restarts, I increased the memory limit to 75GB, but memory consumption continued to rise and is now reaching approximately 130GB RAM.

If anyone has seen similar behavior or has suggestions on what to check (cache configuration, snapshot stages, torrent rate, trace/debug API impact, etc.), I’d really appreciate your guidance.
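To help narrow this down, one thing I could do is scrape the metrics endpoint enabled by the `--metrics` flags above and watch the memory-related gauges over time. The following is only a rough sketch: the URL path `/debug/metrics/prometheus`, the `localhost` host, and the keyword filter are assumptions rather than anything confirmed for this setup, and the helper names are made up for illustration. It parses the Prometheus text format in a simplified way (label values containing `}` are not handled).

```python
import re
from urllib.request import urlopen

# Assumption: host and path are guesses; the port comes from --metrics.port=6061.
METRICS_URL = "http://localhost:6061/debug/metrics/prometheus"

# Simplified matcher for one sample line: name, optional {labels}, value.
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text):
    """Parse Prometheus text exposition into {name_with_labels: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match:
            continue
        name, labels, raw = match.group(1), match.group(2) or "", match.group(3)
        try:
            samples[name + labels] = float(raw)
        except ValueError:
            continue  # ignore values that are not numeric
    return samples


def memory_metrics(samples, keywords=("mem", "heap", "rss")):
    """Keep only samples whose name hints at memory usage (hypothetical filter)."""
    return {
        key: value
        for key, value in samples.items()
        if any(word in key.lower() for word in keywords)
    }


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Download the raw metrics text from the (assumed) endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `memory_metrics(parse_metrics(fetch_metrics()))` every minute or so and logging the result would show whether the growth comes from the process heap or from somewhere else, which might make it easier to correlate the spike with specific RPC traffic.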

Thanks in advance 🙏
Steps to reproduce the behaviour

Backtrace

[backtrace]
