Skip to content

perf bug: cache accumulation #186

@steph-rs

Description

@steph-rs

Problem

The cache accumulates stale data from previous blocks because it reused across canonical block boundaries.

The flow

  // 1. canonical block arrives
  fn process_canonical_block(...) {
      flashblocks.retain(|fb| fb.block_number > block.number);
      self.build_pending_state(prev_pending_blocks, &flashblocks) // state.rs:319
  }
  // 2. build new state, starting with OLD cache
  fn build_pending_state(prev_pending_blocks, flashblocks) {
      let mut db = match &prev_pending_blocks { // state.rs:417
          Some(pb) => CacheDB {
              cache: pb.get_db_cache(),  //  clone old cache
              db: state,
          },
          None => CacheDB::new(state),
      };

      // 3. execute transactions - cache grows
      for tx in transactions {
          evm.transact(tx)?;
          evm.db_mut().commit(state);  //  cache grows
      }

      // 4. save grown cache
      pending_blocks_builder.with_db_cache(db.cache); 

      // 5. return new PendingBlocks with grown cache
      Ok(Some(Arc::new(pending_blocks_builder.build()?)))
  }

  // 6. swap - grown cache becomes next call's input
  self.pending_blocks.swap(new_pending_blocks);

When this happens

Flashblocks for the new block n+1, could appear before the canonical block n. So, this process could happen constantly.

Quick Fix

    fn process_canonical_block(...) {
        flashblocks.retain(|fb| fb.block_number > block.number);
-     self.build_pending_state(prev_pending_blocks, &flashblocks)
+     self.build_pending_state(None, &flashblocks) // state.rs:319
    }

This prevents cache reuse, solving the accumulation bug.

Better fix

The current architecture makes a clean solution difficult:

  • PendingBlocks is immutable (Arc wrapped)
  • Functions pass state in/out instead of mutating self
  • Multiple clones at every step (get_db_cache(), get_flashblocks(), etc.)

The self.build_pending_state(prev_pending_blocks, &flashblocks) really just:

  1. Filtering out old block data from PendingBlocks (all fields except db_cache)
  2. Checking cache for transactions (we didn't get any new flashblock on process_canonical_block call. we got all flashblocks from pending blocks: let mut flashblocks = pending_blocks.get_flashblocks(); - also a clone of flashblocks)

The Right Fix (requires refactoring):

impl StateProcessor {
    fn handle_canonical_block(&self, block: &RecoveredBlock) {
        // mutate self.pending_blocks directly, no passing around
        self.prune_pending_blocks(block.number);
        self.validate_flashblocks_match(block);
    }

    fn prune_pending_blocks(&self, canonical_block_number: u64) {
        let mut pending = self.pending_blocks.load();
        pending.retain_blocks_above(canonical_block_number);
        // no cache reuse, no clones, no rebuild
    }
}

Lmk what you think! I'm happy to implement either the quick fix or tackle the full refactoring, would appreciate guidance on the refactoring approach if we go that route.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions