fix(beacon_node): add pruning of observed_column_sidecars #8531

Merged
mergify[bot] merged 1 commit into sigp:release-v8.0 from 0xMushow:fix/memory-leak-oberserved-column-sidecars on Dec 4, 2025

@0xMushow (Contributor) commented Dec 4, 2025

Issue Addressed

None

Proposed Changes

I noticed that observed_column_sidecars is missing its prune call in the finalization handler, which causes a slow memory leak on long-running nodes (about 7 MB/day). The handler's existing prune calls:

self.observed_block_producers.write().prune(
    new_view
        .finalized_checkpoint
        .epoch
        .start_slot(T::EthSpec::slots_per_epoch()),
);
self.observed_blob_sidecars.write().prune(
    new_view
        .finalized_checkpoint
        .epoch
        .start_slot(T::EthSpec::slots_per_epoch()),
);
self.observed_slashable.write().prune(
    new_view
        .finalized_checkpoint
        .epoch
        .start_slot(T::EthSpec::slots_per_epoch()),
);
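
The missing call would presumably mirror the blob-sidecar call above. The following is a minimal, self-contained sketch of that shape, not the actual diff: `SlotCache`, `Chain`, `on_finalization`, and the `SLOTS_PER_EPOCH` constant are hypothetical stand-ins, `std::sync::RwLock` replaces whatever lock type the real code uses, and the "drop everything at or before the finalized slot" rule is an assumption.

```rust
use std::collections::BTreeSet;
use std::sync::RwLock;

// Stand-in for T::EthSpec::slots_per_epoch(); 32 is the mainnet value.
const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical stand-in for an observed-sidecar cache: only the slots seen.
#[derive(Default)]
struct SlotCache {
    slots: BTreeSet<u64>,
}

impl SlotCache {
    /// Drop every slot at or before the finalized slot (assumed semantics).
    fn prune(&mut self, finalized_slot: u64) {
        self.slots.retain(|slot| *slot > finalized_slot);
    }
}

/// Hypothetical stand-in for the struct that owns both caches.
#[derive(Default)]
struct Chain {
    observed_blob_sidecars: RwLock<SlotCache>,
    observed_column_sidecars: RwLock<SlotCache>,
}

impl Chain {
    /// Called when the finalized checkpoint advances.
    fn on_finalization(&self, finalized_epoch: u64) {
        let finalized_slot = finalized_epoch * SLOTS_PER_EPOCH;
        // Already present before the fix.
        self.observed_blob_sidecars
            .write()
            .unwrap()
            .prune(finalized_slot);
        // The kind of call this PR adds: prune the column cache as well.
        self.observed_column_sidecars
            .write()
            .unwrap()
            .prune(finalized_slot);
    }
}
```

In this sketch, calling `on_finalization(2)` prunes both caches up to slot 64, so a column entry recorded at slot 64 is dropped while one at slot 65 survives.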

observed_blob_sidecars and observed_column_sidecars use the same generic type ObservedDataSidecars<T>:

/// Maintains a record of blob sidecars seen over the gossip network.
pub observed_blob_sidecars: RwLock<ObservedDataSidecars<BlobSidecar<T::EthSpec>>>,
/// Maintains a record of column sidecars seen over the gossip network.
pub observed_column_sidecars: RwLock<ObservedDataSidecars<DataColumnSidecar<T::EthSpec>>>,

The type's documentation explicitly requires manual pruning:

"The cache supports pruning based upon the finalized epoch. It does not automatically prune, you must call Self::prune manually."

/// Maintains a cache of seen `ObservableDataSidecar`s that are received over gossip
/// and have been gossip verified.
///
/// The cache supports pruning based upon the finalized epoch. It does not automatically prune, you
/// must call `Self::prune` manually.
///
/// Note: To prevent DoS attacks, this cache must include only items that have received some DoS resistance
/// like checking the proposer signature.
pub struct ObservedDataSidecars<T: ObservableDataSidecar> {

Currently:

  • observed_blob_sidecars => pruned
  • observed_column_sidecars => NOT pruned

Without pruning, the underlying HashMap accumulates entries indefinitely, causing continuous memory growth until the node restarts.
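
To illustrate the growth pattern, here is a simplified model of a manually pruned cache. It is a sketch under stated assumptions rather than the real `ObservedDataSidecars` implementation: the `ProposalKey` layout, the `observe` and `len` methods, the `entries_after` helper, and the two-epoch finality lag are all hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical key: one map entry per (slot, proposer) pair.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ProposalKey {
    slot: u64,
    proposer: u64,
}

/// Simplified model of a cache that only shrinks when `prune` is called.
#[derive(Default)]
struct ObservedSidecars {
    finalized_slot: u64,
    items: HashMap<ProposalKey, HashSet<u64>>,
}

impl ObservedSidecars {
    /// Record a sidecar index; returns true if it had already been observed.
    fn observe(&mut self, slot: u64, proposer: u64, index: u64) -> bool {
        let seen = self.items.entry(ProposalKey { slot, proposer }).or_default();
        !seen.insert(index)
    }

    /// Manual pruning: nothing is ever removed unless the owner calls this.
    fn prune(&mut self, finalized_slot: u64) {
        if finalized_slot <= self.finalized_slot {
            return;
        }
        self.finalized_slot = finalized_slot;
        self.items.retain(|key, _| key.slot > finalized_slot);
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

/// Observe one sidecar per slot and report how many entries remain.
/// With `prune_each_epoch`, prune at every 32-slot boundary, pretending
/// that finality trails the head by two epochs (64 slots).
fn entries_after(slots: u64, prune_each_epoch: bool) -> usize {
    let mut cache = ObservedSidecars::default();
    for slot in 1..=slots {
        cache.observe(slot, slot % 8, 0);
        if prune_each_epoch && slot % 32 == 0 {
            cache.prune(slot.saturating_sub(64));
        }
    }
    cache.len()
}
```

Under these assumptions, `entries_after(320, false)` returns 320 and keeps growing with the slot count, while `entries_after(320, true)` stays at 64, the size of the unfinalized window.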

Additional Info

None

@pawanjay176 changed the base branch from unstable to release-v8.0 December 4, 2025 08:20

@pawanjay176 (Member) commented

Changed the base to the release branch FYI

@pawanjay176 added the ready-for-review (The code is ready for review) and v8.0.2 (Post-Fulu patch release with small improvements) labels Dec 4, 2025
@jimmygchen added the ready-for-merge (This PR is ready to merge.) label and removed the ready-for-review (The code is ready for review) label Dec 4, 2025
mergify bot commented Dec 4, 2025

Merge Queue Status

✅ The pull request has been merged

This pull request spent 1 hour 1 minute 9 seconds in the queue, including 58 minutes 50 seconds running CI.
The checks were run on draft #8537.

Required conditions to merge
  • check-success=local-testnet-success
  • check-success=test-suite-success

mergify bot added the queued label Dec 4, 2025
mergify bot added a commit that referenced this pull request Dec 4, 2025
mergify bot merged commit 9ddd2d8 into sigp:release-v8.0 Dec 4, 2025
36 checks passed
mergify bot removed the queued label Dec 4, 2025
michaelsproul pushed a commit that referenced this pull request Feb 18, 2026
None


I noticed that `observed_column_sidecars` is missing its prune call in the finalization handler, which causes a slow memory leak on long-running nodes (about **7 MB/day**):

https://github.com/sigp/lighthouse/blob/13dfa9200f822c41ccd81b95a3f052df54c888e9/beacon_node/beacon_chain/src/canonical_head.rs#L940-L959

Both caches use the same generic type `ObservedDataSidecars<T>`:
https://github.com/sigp/lighthouse/blob/22ec4b327186c4a4a87d2c8c745caf3b36cb6dd6/beacon_node/beacon_chain/src/beacon_chain.rs#L413-L416

The type's documentation explicitly requires manual pruning:

>  "*The cache supports pruning based upon the finalized epoch. It does not automatically prune, you must call Self::prune manually.*"


https://github.com/sigp/lighthouse/blob/b4704eab4ac8edf0ea0282ed9a5758b784038dd2/beacon_node/beacon_chain/src/observed_data_sidecars.rs#L66-L74

Currently:
- `observed_blob_sidecars` => pruned
- `observed_column_sidecars` => **NOT** pruned

Without pruning, the underlying HashMap accumulates entries indefinitely, causing continuous memory growth until the node restarts.


Co-Authored-By: Antoine James <antoine@ethereum.org>

Labels

  • ready-for-merge: This PR is ready to merge.
  • v8.0.2: Post-Fulu patch release with small improvements
