Conversation


@hanabi1224 hanabi1224 commented Aug 11, 2025

Summary of changes

Changes introduced in this pull request:

  • replaces all remaining `LruCache` usages with `SizeTrackingLruCache`

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Added heap-size reporting across core types (headers, tipsets, tickets, proofs, messages, addresses, token amounts) for improved memory accounting.
  • Refactor

    • Replaced lock-guarded LRU caches with size-tracking caches, requiring non-zero capacities and adjusting concurrency/access semantics.
    • Standardized cache metric names and prefixes for clearer observability.
  • Chores

    • Updated lint/config to discourage legacy cache constructors.
  • Tests

    • Updated tests to compute sizes dynamically instead of using hard-coded values.


coderabbitai bot commented Aug 11, 2025

Walkthrough

Adds GetSize derives/impls and a BigInt heap-size helper across many types; replaces several lock-guarded lru::LruCache usages with SizeTrackingLruCache (new API: get_cloned, push, contains, NonZeroUsize capacity); tweaks several metrics registry names and one test assertion.
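The new API surface (get_cloned, push, contains, NonZeroUsize capacity) can be sketched with a std-only stand-in. The type name and eviction policy below are illustrative only; the real SizeTrackingLruCache wraps lru::LruCache, tracks entry heap size via GetSize, and exports metrics:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::num::NonZeroUsize;
use std::sync::RwLock;

// Hypothetical std-only stand-in for the cache's public surface; not the
// real forest implementation.
struct SizeTrackingCacheSketch<K, V> {
    cap: NonZeroUsize,
    inner: RwLock<HashMap<K, V>>,
}

impl<K: Hash + Eq + Clone, V: Clone> SizeTrackingCacheSketch<K, V> {
    fn new(cap: NonZeroUsize) -> Self {
        Self { cap, inner: RwLock::new(HashMap::new()) }
    }

    // Insert, evicting when over capacity (the real impl evicts the
    // least-recently-used entry; this sketch evicts an arbitrary one).
    fn push(&self, k: K, v: V) {
        let mut m = self.inner.write().unwrap();
        if m.len() >= self.cap.get() && !m.contains_key(&k) {
            if let Some(victim) = m.keys().next().cloned() {
                m.remove(&victim);
            }
        }
        m.insert(k, v);
    }

    // Callers receive a clone rather than a guarded reference, so no lock
    // is held across caller code.
    fn get_cloned(&self, k: &K) -> Option<V> {
        self.inner.read().unwrap().get(k).cloned()
    }

    // Membership check; does not touch recency order.
    fn contains(&self, k: &K) -> bool {
        self.inner.read().unwrap().contains_key(k)
    }
}

fn main() {
    let cache = SizeTrackingCacheSketch::new(NonZeroUsize::new(2).unwrap());
    cache.push("a", 1u64);
    cache.push("b", 2);
    cache.push("c", 3); // capacity 2: one earlier entry is evicted
    assert!(cache.contains(&"c"));
    assert_eq!(cache.get_cloned(&"c"), Some(3));
    println!("ok");
}
```

The clone-out semantics are what allow dropping the external RwLock/Mutex wrappers the PR removes.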

Changes

Cohort / File(s) Summary
Clippy config updates
./.clippy.toml
Adds disallowed methods for lru::LruCache::new, with_hasher, unbounded, unbounded_with_hasher with guidance to prefer SizeTrackingLruCache.
Blocks: GetSize additions
src/blocks/election_proof.rs, src/blocks/header.rs, src/blocks/ticket.rs, src/blocks/tipset.rs, src/blocks/vrf_proof.rs
Add/GetSize derives and impls: ElectionProof derives GetSize; RawBlockHeader implements GetSize and CachingBlockHeader derives GetSize (cid ignored); Ticket and Tipset derive GetSize (custom vec helper for headers); VRFProof GetSize delegates to inner Vec.
Messages: GetSize derives
src/message/chain_message.rs, src/message/signed_message.rs
Derive GetSize for ChainMessage and SignedMessage; minor import reordering.
Shim types: GetSize support
src/shim/address.rs, src/shim/econ.rs, src/shim/message.rs, src/shim/sector.rs
Implement/derive GetSize: Address returns 0; TokenAmount uses big_int_heap_size_helper; Message derives GetSize with custom size_fn for params; PoStProof delegates to inner proof bytes.
Utils: BigInt helper & cache additions
src/utils/get_size/mod.rs, src/utils/cache/lru.rs
Add big_int_heap_size_helper(&BigInt) -> usize; add SizeTrackingLruCache::contains<Q>(&self, k: &Q) -> bool; update cache size metric name to prefix cache_.
Cache refactor: replace LruCache+locks
src/chain/store/chain_store.rs, src/chain/store/index.rs
Replace RwLock/Mutex-wrapped lru::LruCache with SizeTrackingLruCache; change constructors to accept NonZeroUsize; switch access to get_cloned, push, contains; adjust Default and tests for NonZeroUsize.
Metrics/registry name tweaks
src/beacon/drand.rs, src/chain_sync/bad_block_cache.rs, src/db/car/mod.rs, src/message_pool/msgpool/msg_pool.rs, src/state_manager/cache.rs, src/state_migration/common/mod.rs
Rename metrics/registry keys (examples: verified_beacons_cache → verified_beacons, bad_block_cache → bad_block, zstd_frame_cache → zstd_frame, bls_sig_cache → bls_sig, sig_val_cache → sig_val, migration_cache → migration, tipset state naming).
Test update
interop-tests/tests/dhat_get_size_beacon_entry.rs
Replace hard-coded 60 with computed inner_bytes from GetSize sums in the dhat HeapStats assertion.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant SizeTrackingLruCache
  participant Builder

  Caller->>SizeTrackingLruCache: get_or_insert_with(key, builder)
  alt cache.contains(key)
    SizeTrackingLruCache-->>Caller: get_cloned(key) (hit)
  else miss
    SizeTrackingLruCache->>Builder: compute value
    Builder-->>SizeTrackingLruCache: value
    SizeTrackingLruCache->>SizeTrackingLruCache: push(key, value)
    SizeTrackingLruCache-->>Caller: get_cloned(key)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • elmattic
  • LesnyRumcajs
  • akaladarshi

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13e6cab and d4dedaf.

📒 Files selected for processing (1)
  • src/chain/store/index.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/chain/store/index.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)


@hanabi1224 hanabi1224 marked this pull request as ready for review August 11, 2025 13:51
@hanabi1224 hanabi1224 requested a review from a team as a code owner August 11, 2025 13:51
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and sudo-shashank and removed request for a team August 11, 2025 13:51

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🔭 Outside diff range comments (2)
src/state_manager/cache.rs (1)

97-114: Pending entry not cleared on compute error can leave stuck “pending” entries

If compute().await fails, the key remains in inner.pending indefinitely. Subsequent calls will always go down the “Empty” path, lock the same mutex, and re-attempt compute (good), but the pending list will grow with permanently failing keys, causing memory bloat.

Clear the pending entry on error as well. Example change:

-                        let value = compute().await?;
-
-                        // Write back to cache, release lock and return value
-                        self.insert(key.clone(), value.clone());
-                        Ok(value)
+                        let res = compute().await;
+                        // Ensure we don't leave a stale pending entry on error.
+                        self.with_inner(|inner| {
+                            inner.pending.retain(|(k, _)| k != key);
+                        });
+                        match res {
+                            Ok(value) => {
+                                // Write back to cache, release lock and return value
+                                self.insert(key.clone(), value.clone());
+                                Ok(value)
+                            }
+                            Err(e) => Err(e),
+                        }

Also applies to: 123-146

src/chain/store/chain_store.rs (1)

577-584: Use a single lookup to avoid TOCTOU and remove the panic

The current contains→get path can race under concurrency and trigger the "cache entry disappeared!" panic. Replace it with one get (or get_cloned) call and fall back to insertion:

-        if self.cache.contains(key) {
-            Ok(self.get(key).expect("cache entry disappeared!"))
-        } else {
-            let v = f()?;
-            self.insert(key.clone(), v.clone());
-            Ok(v)
-        }
+        match self.get(key) {
+            Some(v) => Ok(v),
+            None => {
+                let v = f()?;
+                self.insert(key.clone(), v.clone());
+                Ok(v)
+            }
+        }

• This removes the benign TOCTOU and eliminates the panic.
• Since SizeTrackingLruCache doesn’t expose an atomic get-or-insert helper, duplicate computation under contention is still possible. We can add a get_or_try_insert_with API to the cache as a follow-up if you’d like.
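The follow-up get_or_try_insert_with idea can be sketched over a generic lock-guarded map. The free function below is hypothetical (the real cache type and method name may differ); the value is computed outside any lock, so duplicate computation under contention remains possible, but no lock is held across the fallible callback and no panic path exists:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

// Hypothetical sketch of a fallible get-or-insert helper.
fn get_or_try_insert_with<K, V, E>(
    cache: &RwLock<HashMap<K, V>>,
    key: &K,
    compute: impl FnOnce() -> Result<V, E>,
) -> Result<V, E>
where
    K: Hash + Eq + Clone,
    V: Clone,
{
    // Single lookup: avoids the contains -> get TOCTOU window.
    if let Some(v) = cache.read().unwrap().get(key).cloned() {
        return Ok(v);
    }
    let v = compute()?; // on error, nothing is inserted
    cache.write().unwrap().insert(key.clone(), v.clone());
    Ok(v)
}

fn main() {
    let cache: RwLock<HashMap<&str, u64>> = RwLock::new(HashMap::new());
    let first: Result<u64, ()> = get_or_try_insert_with(&cache, &"a", || Ok(5));
    assert_eq!(first, Ok(5));
    // Second call hits the cache; the (failing) closure never runs.
    let second: Result<u64, ()> = get_or_try_insert_with(&cache, &"a", || Err(()));
    assert_eq!(second, Ok(5));
    println!("ok");
}
```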

🧹 Nitpick comments (14)
src/chain_sync/bad_block_cache.rs (1)

28-29: Metrics label rename to "bad_block" — LGTM. Consider exposing contains().

The rename aligns with the metrics standardization.

Optional: provide a contains() that uses the new SizeTrackingLruCache.contains() to avoid cloning/allocations elsewhere.

 impl BadBlockCache {
     pub fn new(cap: NonZeroUsize) -> Self {
         Self {
             cache: SizeTrackingLruCache::new_with_default_metrics_registry("bad_block".into(), cap),
         }
     }
 
+    /// Returns true if the block CID is in the bad block cache.
+    /// Does not update LRU head position.
+    pub fn contains(&self, c: &Cid) -> bool {
+        self.cache.contains(&(*c).into())
+    }
+
     pub fn push(&self, c: Cid) {
         self.cache.push(c.into(), ());
     }
 
     /// Returns `Some` if the block CID is in bad block cache.
     /// This function does not update the head position of the `Cid` key.
     pub fn peek(&self, c: &Cid) -> Option<()> {
         self.cache.peek_cloned(&(*c).into())
     }
 }
src/beacon/drand.rs (1)

271-274: Use contains() for membership check to avoid locking + value compare.

You can simplify is_verified by checking key presence only; the round uniquely identifies the entry content.

-    fn is_verified(&self, entry: &BeaconEntry) -> bool {
-        let cache = self.verified_beacons.cache().read();
-        cache.peek(&entry.round()) == Some(entry)
-    }
+    fn is_verified(&self, entry: &BeaconEntry) -> bool {
+        self.verified_beacons.contains(&entry.round())
+    }

If there’s any possibility that the cached value for a round could differ from entry, keep the equality check.

src/shim/sector.rs (1)

22-22: GetSize for PoStProof — LGTM.

Accurately accounts for heap via proof_bytes. If fvm_shared adds additional heap fields in PoStProofV4 later, consider delegating to the inner type (if it ever implements GetSize) to avoid drift.

Also applies to: 381-385

src/shim/address.rs (1)

17-17: GetSize for Address — LGTM.

Returning 0 matches the zero-heap expectation for this wrapper.

Consider a brief comment explaining why heap size is 0 (helps future maintainers avoid double-counting).

Also applies to: 263-267

src/blocks/tipset.rs (1)

149-156: Derive GetSize for Tipset and annotate headers — LGTM.

The size_fn for nunny::Vec ensures precise heap tracking for the headers collection.

If you want similar visibility for FullTipset memory, consider deriving GetSize there too:

-#[derive(Debug, Clone, Eq)]
+#[derive(Debug, Clone, Eq, GetSize)]
 pub struct FullTipset {
-    blocks: NonEmpty<Block>,
+    #[get_size(size_fn = nunny_vec_heap_size_helper)]
+    blocks: NonEmpty<Block>,
     // key is lazily initialized via `fn key()`.
     key: OnceLock<TipsetKey>,
 }
src/utils/get_size/mod.rs (1)

54-58: MSRV-safe byte math and minor doc tweak for BigInt helper.

The estimate is fine. If MSRV predates div_ceil on integers, prefer the add-then-divide pattern. Also clarify the comment to reflect that this measures magnitude bytes (sign/overhead ignored).

-// This is a rough estimation. Use `b.allocation_size()`
-// once https://github.com/rust-num/num-bigint/pull/333 is accepted and released.
+// Rough estimation of magnitude bytes. Ignores sign/struct overhead.
+// Switch to `b.allocation_size()` once
+// https://github.com/rust-num/num-bigint/pull/333 is released.

-pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
-    b.bits().div_ceil(8) as usize
-}
+pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
+    // MSRV-safe: (bits + 7) / 8
+    let bits = b.bits() as usize;
+    (bits.saturating_add(7)) / 8
+}
src/blocks/vrf_proof.rs (1)

30-34: Optional: derive GetSize for VRFProof and drop the manual impl.

Keeps consistency with other types deriving GetSize and reduces boilerplate.

-#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash)]
+#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash, GetSize)]
 pub struct VRFProof(#[serde(with = "serde_byte_array")] pub Vec<u8>);

-impl GetSize for VRFProof {
-    fn get_heap_size(&self) -> usize {
-        self.0.get_heap_size()
-    }
-}
src/shim/message.rs (1)

44-49: Remove dead commented code.

Stale skeleton impl adds noise; derive is already in use.

-// impl GetSize for Message{
-//     fn get_heap_size(&self) -> usize {
-
-//     }
-// }
src/utils/cache/lru.rs (2)

127-134: New contains() is correct and concurrency-safe

Read lock with LruCache::contains is appropriate and does not update recency. Consider a short doc comment noting that contains does not affect LRU order, so callers shouldn’t rely on it for touch semantics.


186-191: Metric rename to "cache_..._size": ensure consistency and update dashboards

The size metric now uses the "cache_" prefix. Please:

  • Verify dashboards/alerts scraped the new name.
  • Consider aligning len/cap metrics with the same "cache_" prefix for consistency (optional).
src/blocks/header.rs (2)

201-233: Manual GetSize impl: clarify exclusion rationale and potential undercount

  • You explicitly exclude CIDs (state_root, message_receipts, messages). If this is to avoid relying on a non-existent GetSize impl for cid::Cid, please replace the comment “derive macro does not compile for some reason” with a concrete note so future readers don’t regress this.
  • If cid::Cid (or its multihash) allocates on the heap in your version, excluding it here will undercount. If the multihash is stack/inlined for 32-byte digests in your build, the exclusion is fine. Please confirm the intended behavior.

Nit: big_int_heap_size_helper(b) accounts only for payload bits; it doesn’t include Vec/limb overhead. If you want a slightly less optimistic estimate, consider adding a small constant overhead per BigInt, but that’s optional.

Example comment tweak:

-// The derive macro does not compile for some reason
+// Note: We can’t derive GetSize here because some external types (e.g., cid::Cid)
+// don’t implement it. We manually sum known heap-allocating fields and exclude CIDs,
+// which are treated as fixed-size/inlined for our purposes.

Would you like me to search the repo to verify whether there’s any GetSize impl for cid::Cid (or wrappers), and whether we should account for it here?


237-241: Deriving GetSize and ignoring OnceLock is reasonable—confirm CID treatment is consistent

Deriving GetSize on CachingBlockHeader while ignoring the cached CID avoids double-counting and also sidesteps the lack of a GetSize impl for Cid. This matches the manual exclusion in RawBlockHeader, but it means CID heap (if any) is never counted. If that’s intentional, consider adding a brief comment to the field.

Optional:

 #[derive(Debug, GetSize)]
 pub struct CachingBlockHeader {
     uncached: RawBlockHeader,
-    #[get_size(ignore)]
+    // CID heap is excluded; tracked via RawBlockHeader fields as needed (currently excluded too).
+    #[get_size(ignore)]
     cid: OnceLock<Cid>,
     has_ever_been_verified_against_any_signature: AtomicBool,
 }

If you’d like, I can also scan for a GetSize impl on TipsetKey and Signature to ensure size tracking stays sound across caches.

src/chain/store/chain_store.rs (2)

560-567: Constructor and metrics wiring LGTM

Using new_with_default_metrics_registry is consistent with similar caches. Consider standardizing the cache name (e.g., “msgs_in_tipset”) if you want all cache metrics to follow a consistent pluralization.


591-598: NonZeroUsize env parsing is fine; optional: guard invalid values explicitly

Current parsing silently falls back to DEFAULT. Optionally log or trace when parsing fails or when the value is zero/invalid to aid ops.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59ca796 and 76fefd2.

📒 Files selected for processing (23)
  • .clippy.toml (1 hunks)
  • interop-tests/tests/dhat_get_size_beacon_entry.rs (1 hunks)
  • src/beacon/drand.rs (1 hunks)
  • src/blocks/election_proof.rs (2 hunks)
  • src/blocks/header.rs (2 hunks)
  • src/blocks/ticket.rs (1 hunks)
  • src/blocks/tipset.rs (2 hunks)
  • src/blocks/vrf_proof.rs (2 hunks)
  • src/chain/store/chain_store.rs (4 hunks)
  • src/chain/store/index.rs (2 hunks)
  • src/chain_sync/bad_block_cache.rs (1 hunks)
  • src/db/car/mod.rs (1 hunks)
  • src/message/chain_message.rs (1 hunks)
  • src/message/signed_message.rs (1 hunks)
  • src/message_pool/msgpool/msg_pool.rs (1 hunks)
  • src/shim/address.rs (2 hunks)
  • src/shim/econ.rs (2 hunks)
  • src/shim/message.rs (2 hunks)
  • src/shim/sector.rs (2 hunks)
  • src/state_manager/cache.rs (1 hunks)
  • src/state_migration/common/mod.rs (1 hunks)
  • src/utils/cache/lru.rs (2 hunks)
  • src/utils/get_size/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

Applied to files:

  • src/blocks/tipset.rs
  • src/blocks/vrf_proof.rs
  • src/shim/address.rs
  • src/blocks/header.rs
  • src/message/signed_message.rs
  • src/shim/econ.rs
  • src/shim/sector.rs
  • src/shim/message.rs
🧬 Code Graph Analysis (13)
src/message_pool/msgpool/msg_pool.rs (2)
src/state_migration/common/mod.rs (1)
  • new (32-39)
src/utils/cache/lru.rs (1)
  • new_with_default_metrics_registry (95-100)
src/blocks/tipset.rs (8)
src/rpc/methods/chain.rs (1)
  • utils (497-497)
src/blocks/header.rs (2)
  • cid (69-71)
  • cid (304-306)
src/message/chain_message.rs (1)
  • cid (29-34)
src/message/signed_message.rs (1)
  • cid (78-85)
src/shim/message.rs (1)
  • cid (195-198)
src/blocks/block.rs (1)
  • cid (41-43)
src/libp2p_bitswap/store.rs (1)
  • cid (86-88)
src/utils/get_size/mod.rs (1)
  • nunny_vec_heap_size_helper (50-52)
src/blocks/vrf_proof.rs (4)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/address.rs (5)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/blocks/header.rs (3)
src/utils/encoding/mod.rs (1)
  • blake2b_256 (87-97)
src/utils/get_size/mod.rs (1)
  • big_int_heap_size_helper (56-58)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/chain/store/chain_store.rs (2)
src/utils/cache/lru.rs (2)
  • cache (119-121)
  • new_with_default_metrics_registry (95-100)
src/chain/store/index.rs (1)
  • new (40-46)
src/db/car/mod.rs (1)
src/utils/cache/lru.rs (1)
  • unbounded_with_default_metrics_registry (115-117)
src/chain_sync/bad_block_cache.rs (1)
src/utils/cache/lru.rs (2)
  • new_with_default_metrics_registry (95-100)
  • cap (159-161)
src/utils/cache/lru.rs (5)
src/blocks/tipset.rs (1)
  • contains (50-52)
src/chain/store/chain_store.rs (1)
  • contains (95-97)
src/cid_collections/small_cid_vec.rs (1)
  • contains (35-37)
src/db/car/many.rs (1)
  • contains (197-199)
src/state_manager/cache.rs (1)
  • cache_name (45-51)
src/shim/econ.rs (6)
src/utils/get_size/mod.rs (1)
  • big_int_heap_size_helper (56-58)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/shim/sector.rs (5)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/chain/store/index.rs (4)
src/utils/cache/lru.rs (2)
  • cache (119-121)
  • new_with_default_metrics_registry (95-100)
src/utils/misc/env.rs (1)
  • is_env_truthy (17-19)
src/message_pool/msgpool/provider.rs (2)
  • load_tipset (50-50)
  • load_tipset (114-120)
src/blocks/tipset.rs (4)
  • tsk (264-268)
  • load (263-271)
  • new (241-259)
  • new (493-510)
src/message/chain_message.rs (2)
src/message/signed_message.rs (1)
  • message (40-42)
src/cli/subcommands/chain_cmd.rs (1)
  • fvm_ipld_encoding (92-92)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (24)
src/db/car/mod.rs (1)

75-76: Metric label rename sanity-check complete

I ran rg -n --hidden -S 'zstd_frame_cache' and found no remaining code-level registrations or string literals using the old metric label. The only matches are:

  • src/db/car/mod.rs
    • Constant and env var names: ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE, FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE
    • Test name: test_zstd_frame_cache_size
  • CHANGELOG.md and docs/users/reference/env_variables.md: references to the env var

None of these affect the metric registry. Please verify that any external dashboards or alerting rules still expecting "zstd_frame_cache" have been updated.

src/beacon/drand.rs (1)

265-268: Verified complete label rename
The search for "verified_beacons_cache" across the repo returned no hits, confirming the old metric label has been fully removed and replaced. No further references found. 🎉

.clippy.toml (1)

53-54: Reason update for unbounded — LGTM.

Clearer guidance and aligns with the migration to SizeTrackingLruCache.

src/message/signed_message.rs (1)

15-20: Deriving GetSize for SignedMessage — LGTM.

This enables precise heap accounting for cache sizing; consistent with the broader PR.

src/blocks/tipset.rs (2)

14-15: Import nunny_vec_heap_size_helper — LGTM.

Needed for accurate heap-size accounting of nunny::Vec.


33-36: Derive GetSize for TipsetKey — LGTM.

Pairs well with shrink_to_fit in From<NonEmpty> to avoid over-counting spare capacity.

interop-tests/tests/dhat_get_size_beacon_entry.rs (1)

52-53: LGTM: assertion matches dynamic heap sizing.

Using inner_bytes aligns the expectation with actual per-element heap usage. Consistent with earlier assertions in the test.
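The dynamic-size pattern replacing a hard-coded literal can be sketched with a local trait standing in for get_size2's GetSize (heap-size part only); the field layout of BeaconEntrySketch is illustrative, not forest's actual BeaconEntry:

```rust
// Local stand-in for get_size2's `GetSize` trait (heap size only).
trait HeapSize {
    fn get_heap_size(&self) -> usize;
}

impl HeapSize for Vec<u8> {
    fn get_heap_size(&self) -> usize {
        self.capacity() // bytes allocated on the heap by this Vec
    }
}

// Hypothetical shape; the real BeaconEntry's fields may differ.
struct BeaconEntrySketch {
    data: Vec<u8>,
    signature: Vec<u8>,
}

impl HeapSize for BeaconEntrySketch {
    fn get_heap_size(&self) -> usize {
        self.data.get_heap_size() + self.signature.get_heap_size()
    }
}

fn main() {
    let entry = BeaconEntrySketch {
        data: vec![0u8; 32],
        signature: vec![0u8; 96],
    };
    // Compute the expected byte count from GetSize sums instead of
    // hard-coding a literal like 60.
    let inner_bytes = entry.get_heap_size();
    assert_eq!(inner_bytes, 32 + 96);
    println!("ok");
}
```

Summing per-field heap sizes keeps the assertion valid if field contents change size in future fixtures.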

src/shim/econ.rs (1)

39-43: LGTM: GetSize for TokenAmount correctly accounts for atto BigInt heap.

Accurate and zero-copy via atto(). This integrates with SizeTrackingLruCache as intended.

src/shim/message.rs (1)

18-18: LGTM: Deriving GetSize for Message.

Consistent with other message types and enables accurate size tracking.

src/message/chain_message.rs (1)

14-16: LGTM: GetSize derive for ChainMessage.

Enables correct sizing for both variants and integrates with size-tracking caches.

src/state_migration/common/mod.rs (1)

35-37: LGTM: metrics key normalized to "migration".

Aligns with repo-wide naming cleanup; no functional change.

src/message_pool/msgpool/msg_pool.rs (1)

479-485: LGTM: metrics keys shortened.

Consistent with naming across the PR; no runtime behavior change.

src/state_manager/cache.rs (1)

45-51: Metric name change: validate dashboards/alerts and ensure no collisions

cache_name() now returns "tipset_state_{type}" (previously included "cache" in the name per PR summary). Please confirm dashboards/alerts are updated and that no other registry entry reuses the same name.

src/blocks/election_proof.rs (1)

8-8: Derive GetSize on ElectionProof — good

Brings ElectionProof into heap-size accounting (including nested VRFProof). Looks correct.

Also applies to: 138-149

src/blocks/ticket.rs (1)

6-6: Derive GetSize on Ticket — good

Enables size tracking for Ticket via VRFProof. No logical changes; import and derive look correct.

Also applies to: 12-23

src/chain/store/index.rs (1)

52-58: Cache refactor to SizeTrackingLruCache — looks good

get_cloned() and push() usage is correct; metrics hit/miss logic preserved. The API remains thread-safe without an external Mutex.

Also applies to: 62-63

src/blocks/header.rs (2)

17-18: Importing big_int_heap_size_helper is appropriate

Brings in the helper used below; no issues.


23-23: GetSize import looks good

Required for the manual impl and derive below.

src/chain/store/chain_store.rs (6)

29-29: Import switch to SizeTrackingLruCache aligns with PR goals

Consistent with replacing LruCache usages.


555-557: MsgsInTipsetCache switched to SizeTrackingLruCache

Good step toward centralized size accounting. Please ensure TipsetKey and Vec implement GetSize so sizes are tracked correctly; otherwise entries will default to stack-only size.

Do you want me to scan the codebase to confirm GetSize impls for these types?


569-571: get via get_cloned is correct for copy-out semantics

No concerns.


587-588: Insert delegates to push—good

Matches the new cache API.


603-604: Default uses read_cache_size—LGTM

No issues.


801-801: Test update for NonZeroUsize capacity is correct

Constructs the cache with capacity 2; behavior validated below.

Comment on lines +37 to +42
fn raw_bytes_heap_size(b: &RawBytes) -> usize {
// Note: this is a cheap but inaccurate estimation,
// the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
// or `b.bytes.get_heap_size()` if `bytes` is made public.
b.bytes().get_heap_size()
}

@coderabbitai coderabbitai bot Aug 11, 2025


💡 Verification agent

🧩 Analysis chain

Confirm RawBytes::bytes() return type; fix misleading comment if needed.

If RawBytes::bytes() returns &[u8], get_heap_size() will undercount (likely 0). If it returns &Vec, this is correct. The comment currently contradicts the implementation; adjust accordingly after verification.

Please verify via docs: what is the return type of fvm_ipld_encoding::RawBytes::bytes()?

If it returns &[u8], change to a safe, accurate path (not ideal due to clone, but correct):

-fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
-}
+fn raw_bytes_heap_size(b: &RawBytes) -> usize {
+    // Accurate: obtain a Vec to measure capacity-based heap size.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
+}

🌐 Web query:

What is the return type of fvm_ipld_encoding::RawBytes::bytes()? Does it return &Vec<u8> or &[u8]?

💡 Result:

It returns a slice: pub fn bytes(&self) -> &[u8]. (docs.rs)

(So you get a &[u8], not a &Vec<u8>.)


Adjust raw_bytes_heap_size to clone into a Vec for an accurate heap‐size measurement

According to the fvm_ipld_encoding docs, RawBytes::bytes() returns a slice (&[u8]), so calling get_heap_size() on it will undercount (it effectively reports zero for the referenced heap data). To measure the actual heap allocation, clone the RawBytes into a Vec<u8> and call get_heap_size() on that:

src/shim/message.rs

 fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
+    // Accurate: clone into a Vec so we can measure its heap allocation.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
 }

hanabi1224 (Contributor Author):
I'd prefer performance over accuracy in this case, as commented.
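The trade-off under discussion can be illustrated with a toy, std-only sketch (this is not the get_size2 crate; `heap_size_of_slice` and `heap_size_of_vec` are hypothetical helpers): a borrowed slice owns no allocation, so a GetSize-style default reports 0 for it, while the accurate path has to materialize an owned `Vec` first and pay for the copy.

```rust
// Toy illustration (not the get_size2 crate): why measuring through a
// borrowed slice undercounts, and what the clone-based alternative costs.

fn heap_size_of_slice(_b: &[u8]) -> usize {
    // A borrowed slice owns no heap allocation, so a GetSize-style
    // default reports 0 for it.
    0
}

fn heap_size_of_vec(v: &Vec<u8>) -> usize {
    // An owned Vec reports its allocated capacity.
    v.capacity() * std::mem::size_of::<u8>()
}

fn main() {
    let owned: Vec<u8> = vec![0u8; 1024];

    // Cheap but inaccurate: measuring through the borrowed view.
    assert_eq!(heap_size_of_slice(owned.as_slice()), 0);

    // Accurate, but pays for a clone of the payload first.
    let cloned: Vec<u8> = owned.as_slice().to_vec();
    assert!(heap_size_of_vec(&cloned) >= 1024);
}
```

Since size tracking runs on every cache operation, avoiding the per-call clone in exchange for a documented undercount is a defensible choice.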

coderabbitai bot (Contributor):
@hanabi1224, understood! The performance trade-off makes sense here, especially since size tracking can be called frequently. The comment already documents that this is an intentional approximation for performance reasons. I appreciate the clarification on the design choice.


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
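The default-method pattern described in that learning can be sketched with a local trait that mirrors the documented defaults (plain std Rust, not the real get_size2 crate; `GetSizeLike` and `Nonce` are illustrative names):

```rust
// Sketch of the default-method pattern: stack size from size_of,
// heap size defaulting to 0, total size as their sum.

trait GetSizeLike: Sized {
    fn get_stack_size() -> usize {
        std::mem::size_of::<Self>()
    }
    fn get_heap_size(&self) -> usize {
        0 // default: assume no heap allocation
    }
    fn get_size(&self) -> usize {
        Self::get_stack_size() + self.get_heap_size()
    }
}

// A simple wrapper with no heap data: the empty impl is enough.
#[allow(dead_code)]
struct Nonce(u64);
impl GetSizeLike for Nonce {}

fn main() {
    let n = Nonce(42);
    assert_eq!(n.get_heap_size(), 0);
    assert_eq!(n.get_size(), std::mem::size_of::<Nonce>());
}
```

This is why `impl GetSize for MyType {}` suffices for wrapper types in this PR that do not allocate on the heap.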

LesnyRumcajs mentioned this pull request Aug 12, 2025
LesnyRumcajs previously approved these changes Aug 12, 2025
Co-authored-by: Hubert <hubert@chainsafe.io>
coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/chain/store/index.rs (1)

60-66: Cache-disabled env still performs writes and increments miss metrics — gate both

When the cache is disabled, we shouldn’t populate it or emit hit/miss metrics. Gate the push and miss counter on cache_enabled to avoid memory usage and misleading metrics when caching is disabled.

Apply this change:

-        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
-        if let Some(ts) = &ts_opt {
-            self.ts_cache.push(tsk.clone(), ts.clone());
-            metrics::LRU_CACHE_MISS
-                .get_or_create(&metrics::values::TIPSET)
-                .inc();
-        }
+        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
+        if let Some(ts) = &ts_opt {
+            if cache_enabled {
+                self.ts_cache.push(tsk.clone(), ts.clone());
+                metrics::LRU_CACHE_MISS
+                    .get_or_create(&metrics::values::TIPSET)
+                    .inc();
+            }
+        }
♻️ Duplicate comments (1)
.clippy.toml (1)

47-54: Ban coverage for LruCache constructors — LGTM; addresses earlier review

Blocking LruCache::new and LruCache::with_hasher aligns with the migration and closes the loophole called out earlier. Nice.

🧹 Nitpick comments (1)
.clippy.toml (1)

56-62: Ban unbounded constructors — good; unify reason wording and capitalization

The bans for unbounded and unbounded_with_hasher are correct. Minor consistency nit: standardize the reason string (capitalize LRU and simplify wording).

Apply this small wording cleanup:

- reason = """Avoid unbounded lru cache for potential memory leak, use SizeTrackingLruCache instead."""
+ reason = """Avoid unbounded LRU caches; use SizeTrackingLruCache instead."""

Repeat the same change for both entries (Lines 57 and 61).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76fefd2 and 13e6cab.

📒 Files selected for processing (4)
  • .clippy.toml (1 hunks)
  • src/chain/store/index.rs (2 hunks)
  • src/shim/address.rs (2 hunks)
  • src/utils/get_size/mod.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/shim/address.rs
  • src/utils/get_size/mod.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (3)
src/utils/misc/env.rs (1)
  • is_env_truthy (17-19)
src/blocks/tipset.rs (4)
  • tsk (264-268)
  • load (263-271)
  • new (241-259)
  • new (493-510)
src/beacon/drand.rs (1)
  • new (251-269)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Ubuntu
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
.clippy.toml (1)

47-62: Confirmed: No additional LruCache constructors to ban

  • A direct import of LruCache is present at src/utils/cache/lru.rs:16.
  • No occurrences of LruCache::try_new, ::try_with_hasher, ::new_default, or ::from_iter were found in the codebase.

The current disallowed-methods list covers all existing constructors. We can revisit this if the lru crate introduces new constructors in the future.

src/chain/store/index.rs (3)

11-11: Switch to SizeTrackingLruCache import — LGTM

Import is correct and aligns with the migration away from LruCache.


19-19: Type alias migrated to size-tracking cache — LGTM

Using SizeTrackingLruCache<TipsetKey, Arc<Tipset>> is appropriate for measured caching.


41-44: Initialize cache with metrics registry and NonZero capacity — LGTM

Name “tipset” and NonZero capacity are consistent with the new cache API and metrics conventions.
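The size-tracking cache pattern under review can be sketched with std only. This is a deliberately simplified toy, not Forest's actual `SizeTrackingLruCache` (which also wires in a metrics registry); the type and method names here are illustrative, though the `NonZeroUsize` capacity and the `push`/`get_cloned` shape echo the API described in this PR:

```rust
// Toy size-tracking LRU: enforce a NonZero capacity and keep a running
// byte count updated on every push and eviction.

use std::collections::VecDeque;
use std::num::NonZeroUsize;

struct ToySizeTrackingLru {
    cap: NonZeroUsize,
    entries: VecDeque<(String, Vec<u8>)>, // front = least recently used
    tracked_bytes: usize,                 // stands in for a metrics gauge
}

impl ToySizeTrackingLru {
    fn new(cap: NonZeroUsize) -> Self {
        Self { cap, entries: VecDeque::new(), tracked_bytes: 0 }
    }

    fn push(&mut self, key: String, value: Vec<u8>) {
        self.tracked_bytes += key.len() + value.len();
        self.entries.push_back((key, value));
        if self.entries.len() > self.cap.get() {
            if let Some((k, v)) = self.entries.pop_front() {
                // Evict the LRU entry and release its tracked bytes.
                self.tracked_bytes -= k.len() + v.len();
            }
        }
    }

    fn get_cloned(&mut self, key: &str) -> Option<Vec<u8>> {
        let pos = self.entries.iter().position(|(k, _)| k == key)?;
        let entry = self.entries.remove(pos)?;
        let value = entry.1.clone();
        self.entries.push_back(entry); // mark as most recently used
        Some(value)
    }
}

fn main() {
    let mut cache = ToySizeTrackingLru::new(NonZeroUsize::new(2).unwrap());
    cache.push("a".into(), vec![0u8; 10]);
    cache.push("b".into(), vec![0u8; 20]);
    assert_eq!(cache.tracked_bytes, 1 + 10 + 1 + 20);
    cache.push("c".into(), vec![0u8; 30]); // capacity 2, evicts "a"
    assert_eq!(cache.tracked_bytes, 1 + 20 + 1 + 30);
    assert!(cache.get_cloned("a").is_none());
    assert!(cache.get_cloned("b").is_some());
}
```

Requiring `NonZeroUsize` at construction makes the "unbounded cache" mistake unrepresentable, which is what the new `.clippy.toml` bans enforce at the call sites.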

hanabi1224 enabled auto-merge August 12, 2025 13:43
hanabi1224 added this pull request to the merge queue Aug 12, 2025
Merged via the queue into main with commit 6b53690 Aug 12, 2025
43 checks passed
hanabi1224 deleted the hm/track-all-lru-caches branch August 12, 2025 14:15

4 participants