Conversation


@hanabi1224 hanabi1224 commented Aug 11, 2025

Summary of changes

Changes introduced in this pull request:

  • replaces all remaining `LruCache` usages with `SizeTrackingLruCache`

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Added heap-size reporting across core types (headers, tipsets, tickets, proofs, messages, addresses, token amounts) for improved memory accounting.
  • Refactor

    • Replaced lock-guarded LRU caches with size-tracking caches, requiring non-zero capacities and adjusting concurrency/access semantics.
    • Standardized cache metric names and prefixes for clearer observability.
  • Chores

    • Updated lint/config to discourage legacy cache constructors.
  • Tests

    • Updated tests to compute sizes dynamically instead of using hard-coded values.


coderabbitai bot commented Aug 11, 2025

Walkthrough

Adds GetSize derives/impls and a BigInt heap-size helper across many types; replaces several lock-guarded lru::LruCache usages with SizeTrackingLruCache (new API: get_cloned, push, contains, NonZeroUsize capacity); tweaks several metrics registry names and one test assertion.
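The new API surface (get_cloned, push, contains, NonZeroUsize capacity) can be sketched with a std-only stand-in. The type name and eviction policy below are illustrative only; the real SizeTrackingLruCache wraps lru::LruCache, tracks entry heap size via GetSize, and exports metrics:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::num::NonZeroUsize;
use std::sync::RwLock;

// Hypothetical std-only stand-in for the cache's public surface; not the
// real forest implementation.
struct SizeTrackingCacheSketch<K, V> {
    cap: NonZeroUsize,
    inner: RwLock<HashMap<K, V>>,
}

impl<K: Hash + Eq + Clone, V: Clone> SizeTrackingCacheSketch<K, V> {
    fn new(cap: NonZeroUsize) -> Self {
        Self { cap, inner: RwLock::new(HashMap::new()) }
    }

    // Insert, evicting when over capacity (the real impl evicts the
    // least-recently-used entry; this sketch evicts an arbitrary one).
    fn push(&self, k: K, v: V) {
        let mut m = self.inner.write().unwrap();
        if m.len() >= self.cap.get() && !m.contains_key(&k) {
            if let Some(victim) = m.keys().next().cloned() {
                m.remove(&victim);
            }
        }
        m.insert(k, v);
    }

    // Callers receive a clone rather than a guarded reference, so no lock
    // is held across caller code.
    fn get_cloned(&self, k: &K) -> Option<V> {
        self.inner.read().unwrap().get(k).cloned()
    }

    // Membership check; does not touch recency order.
    fn contains(&self, k: &K) -> bool {
        self.inner.read().unwrap().contains_key(k)
    }
}

fn main() {
    let cache = SizeTrackingCacheSketch::new(NonZeroUsize::new(2).unwrap());
    cache.push("a", 1u64);
    cache.push("b", 2);
    cache.push("c", 3); // capacity 2: one earlier entry is evicted
    assert!(cache.contains(&"c"));
    assert_eq!(cache.get_cloned(&"c"), Some(3));
    println!("ok");
}
```

The clone-out semantics are what allow dropping the external RwLock/Mutex wrappers the PR removes.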

Changes

Cohort / File(s) Summary
Clippy config updates
./.clippy.toml
Adds disallowed methods for lru::LruCache::new, with_hasher, unbounded, unbounded_with_hasher with guidance to prefer SizeTrackingLruCache.
Blocks: GetSize additions
src/blocks/election_proof.rs, src/blocks/header.rs, src/blocks/ticket.rs, src/blocks/tipset.rs, src/blocks/vrf_proof.rs
Add/GetSize derives and impls: ElectionProof derives GetSize; RawBlockHeader implements GetSize and CachingBlockHeader derives GetSize (cid ignored); Ticket and Tipset derive GetSize (custom vec helper for headers); VRFProof GetSize delegates to inner Vec.
Messages: GetSize derives
src/message/chain_message.rs, src/message/signed_message.rs
Derive GetSize for ChainMessage and SignedMessage; minor import reordering.
Shim types: GetSize support
src/shim/address.rs, src/shim/econ.rs, src/shim/message.rs, src/shim/sector.rs
Implement/derive GetSize: Address returns 0; TokenAmount uses big_int_heap_size_helper; Message derives GetSize with custom size_fn for params; PoStProof delegates to inner proof bytes.
Utils: BigInt helper & cache additions
src/utils/get_size/mod.rs, src/utils/cache/lru.rs
Add big_int_heap_size_helper(&BigInt) -> usize; add SizeTrackingLruCache::contains<Q>(&self, k: &Q) -> bool; update cache size metric name to prefix cache_.
Cache refactor: replace LruCache+locks
src/chain/store/chain_store.rs, src/chain/store/index.rs
Replace RwLock/Mutex-wrapped lru::LruCache with SizeTrackingLruCache; change constructors to accept NonZeroUsize; switch access to get_cloned, push, contains; adjust Default and tests for NonZeroUsize.
Metrics/registry name tweaks
src/beacon/drand.rs, src/chain_sync/bad_block_cache.rs, src/db/car/mod.rs, src/message_pool/msgpool/msg_pool.rs, src/state_manager/cache.rs, src/state_migration/common/mod.rs
Rename metrics/registry keys (examples: verified_beacons_cache → verified_beacons, bad_block_cache → bad_block, zstd_frame_cache → zstd_frame, bls_sig_cache → bls_sig, sig_val_cache → sig_val, migration_cache → migration, tipset state naming).
Test update
interop-tests/tests/dhat_get_size_beacon_entry.rs
Replace hard-coded 60 with computed inner_bytes from GetSize sums in the dhat HeapStats assertion.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant SizeTrackingLruCache
  participant Builder

  Caller->>SizeTrackingLruCache: get_or_insert_with(key, builder)
  alt cache.contains(key)
    SizeTrackingLruCache-->>Caller: get_cloned(key) (hit)
  else miss
    SizeTrackingLruCache->>Builder: compute value
    Builder-->>SizeTrackingLruCache: value
    SizeTrackingLruCache->>SizeTrackingLruCache: push(key, value)
    SizeTrackingLruCache-->>Caller: get_cloned(key)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • elmattic
  • LesnyRumcajs
  • akaladarshi

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13e6cab and d4dedaf.

📒 Files selected for processing (1)
  • src/chain/store/index.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/chain/store/index.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)


@hanabi1224 hanabi1224 marked this pull request as ready for review August 11, 2025 13:51
@hanabi1224 hanabi1224 requested a review from a team as a code owner August 11, 2025 13:51
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and sudo-shashank and removed request for a team August 11, 2025 13:51

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🔭 Outside diff range comments (2)
src/state_manager/cache.rs (1)

97-114: Pending entry not cleared on compute error can leave stuck “pending” entries

If compute().await fails, the key remains in inner.pending indefinitely. Subsequent calls will always go down the “Empty” path, lock the same mutex, and re-attempt compute (good), but the pending list will grow with permanently failing keys, causing memory bloat.

Clear the pending entry on error as well. Example change:

-                        let value = compute().await?;
-
-                        // Write back to cache, release lock and return value
-                        self.insert(key.clone(), value.clone());
-                        Ok(value)
+                        let res = compute().await;
+                        // Ensure we don't leave a stale pending entry on error.
+                        self.with_inner(|inner| {
+                            inner.pending.retain(|(k, _)| k != key);
+                        });
+                        match res {
+                            Ok(value) => {
+                                // Write back to cache, release lock and return value
+                                self.insert(key.clone(), value.clone());
+                                Ok(value)
+                            }
+                            Err(e) => Err(e),
+                        }

Also applies to: 123-146

src/chain/store/chain_store.rs (1)

577-584: Use a single lookup to avoid TOCTOU and remove the panic

The current contains→get path can race under concurrency and trigger the "cache entry disappeared!" panic. Replace it with one get (or get_cloned) call and fall back to insertion:

-        if self.cache.contains(key) {
-            Ok(self.get(key).expect("cache entry disappeared!"))
-        } else {
-            let v = f()?;
-            self.insert(key.clone(), v.clone());
-            Ok(v)
-        }
+        match self.get(key) {
+            Some(v) => Ok(v),
+            None => {
+                let v = f()?;
+                self.insert(key.clone(), v.clone());
+                Ok(v)
+            }
+        }

• This removes the benign TOCTOU and eliminates the panic.
• Since SizeTrackingLruCache doesn’t expose an atomic get-or-insert helper, duplicate computation under contention is still possible. We can add a get_or_try_insert_with API to the cache as a follow-up if you’d like.
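The follow-up get_or_try_insert_with idea can be sketched over a generic lock-guarded map. The free function below is hypothetical (the real cache type and method name may differ); the value is computed outside any lock, so duplicate computation under contention remains possible, but no lock is held across the fallible callback and no panic path exists:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

// Hypothetical sketch of a fallible get-or-insert helper.
fn get_or_try_insert_with<K, V, E>(
    cache: &RwLock<HashMap<K, V>>,
    key: &K,
    compute: impl FnOnce() -> Result<V, E>,
) -> Result<V, E>
where
    K: Hash + Eq + Clone,
    V: Clone,
{
    // Single lookup: avoids the contains -> get TOCTOU window.
    if let Some(v) = cache.read().unwrap().get(key).cloned() {
        return Ok(v);
    }
    let v = compute()?; // on error, nothing is inserted
    cache.write().unwrap().insert(key.clone(), v.clone());
    Ok(v)
}

fn main() {
    let cache: RwLock<HashMap<&str, u64>> = RwLock::new(HashMap::new());
    let first: Result<u64, ()> = get_or_try_insert_with(&cache, &"a", || Ok(5));
    assert_eq!(first, Ok(5));
    // Second call hits the cache; the (failing) closure never runs.
    let second: Result<u64, ()> = get_or_try_insert_with(&cache, &"a", || Err(()));
    assert_eq!(second, Ok(5));
    println!("ok");
}
```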

🧹 Nitpick comments (14)
src/chain_sync/bad_block_cache.rs (1)

28-29: Metrics label rename to "bad_block" — LGTM. Consider exposing contains().

The rename aligns with the metrics standardization.

Optional: provide a contains() that uses the new SizeTrackingLruCache.contains() to avoid cloning/allocations elsewhere.

 impl BadBlockCache {
     pub fn new(cap: NonZeroUsize) -> Self {
         Self {
             cache: SizeTrackingLruCache::new_with_default_metrics_registry("bad_block".into(), cap),
         }
     }
 
+    /// Returns true if the block CID is in the bad block cache.
+    /// Does not update LRU head position.
+    pub fn contains(&self, c: &Cid) -> bool {
+        self.cache.contains(&(*c).into())
+    }
+
     pub fn push(&self, c: Cid) {
         self.cache.push(c.into(), ());
     }
 
     /// Returns `Some` if the block CID is in bad block cache.
     /// This function does not update the head position of the `Cid` key.
     pub fn peek(&self, c: &Cid) -> Option<()> {
         self.cache.peek_cloned(&(*c).into())
     }
 }
src/beacon/drand.rs (1)

271-274: Use contains() for membership check to avoid locking + value compare.

You can simplify is_verified by checking key presence only; the round uniquely identifies the entry content.

-    fn is_verified(&self, entry: &BeaconEntry) -> bool {
-        let cache = self.verified_beacons.cache().read();
-        cache.peek(&entry.round()) == Some(entry)
-    }
+    fn is_verified(&self, entry: &BeaconEntry) -> bool {
+        self.verified_beacons.contains(&entry.round())
+    }

If there’s any possibility that the cached value for a round could differ from entry, keep the equality check.

src/shim/sector.rs (1)

22-22: GetSize for PoStProof — LGTM.

Accurately accounts for heap via proof_bytes. If fvm_shared adds additional heap fields in PoStProofV4 later, consider delegating to the inner type (if it ever implements GetSize) to avoid drift.

Also applies to: 381-385

src/shim/address.rs (1)

17-17: GetSize for Address — LGTM.

Returning 0 matches the zero-heap expectation for this wrapper.

Consider a brief comment explaining why heap size is 0 (helps future maintainers avoid double-counting).

Also applies to: 263-267

src/blocks/tipset.rs (1)

149-156: Derive GetSize for Tipset and annotate headers — LGTM.

The size_fn for nunny::Vec ensures precise heap tracking for the headers collection.

If you want similar visibility for FullTipset memory, consider deriving GetSize there too:

-#[derive(Debug, Clone, Eq)]
+#[derive(Debug, Clone, Eq, GetSize)]
 pub struct FullTipset {
-    blocks: NonEmpty<Block>,
+    #[get_size(size_fn = nunny_vec_heap_size_helper)]
+    blocks: NonEmpty<Block>,
     // key is lazily initialized via `fn key()`.
     key: OnceLock<TipsetKey>,
 }
src/utils/get_size/mod.rs (1)

54-58: MSRV-safe byte math and minor doc tweak for BigInt helper.

The estimate is fine. If MSRV predates div_ceil on integers, prefer the add-then-divide pattern. Also clarify the comment to reflect that this measures magnitude bytes (sign/overhead ignored).

-// This is a rough estimation. Use `b.allocation_size()`
-// once https://github.com/rust-num/num-bigint/pull/333 is accepted and released.
+// Rough estimation of magnitude bytes. Ignores sign/struct overhead.
+// Switch to `b.allocation_size()` once
+// https://github.com/rust-num/num-bigint/pull/333 is released.

-pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
-    b.bits().div_ceil(8) as usize
-}
+pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
+    // MSRV-safe: (bits + 7) / 8
+    let bits = b.bits() as usize;
+    (bits.saturating_add(7)) / 8
+}
src/blocks/vrf_proof.rs (1)

30-34: Optional: derive GetSize for VRFProof and drop the manual impl.

Keeps consistency with other types deriving GetSize and reduces boilerplate.

-#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash)]
+#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash, GetSize)]
 pub struct VRFProof(#[serde(with = "serde_byte_array")] pub Vec<u8>);

-impl GetSize for VRFProof {
-    fn get_heap_size(&self) -> usize {
-        self.0.get_heap_size()
-    }
-}
src/shim/message.rs (1)

44-49: Remove dead commented code.

Stale skeleton impl adds noise; derive is already in use.

-// impl GetSize for Message{
-//     fn get_heap_size(&self) -> usize {
-
-//     }
-// }
src/utils/cache/lru.rs (2)

127-134: New contains() is correct and concurrency-safe

Read lock with LruCache::contains is appropriate and does not update recency. Consider a short doc comment noting that contains does not affect LRU order, so callers shouldn’t rely on it for touch semantics.


186-191: Metric rename to "cache_..._size": ensure consistency and update dashboards

The size metric now uses the "cache_" prefix. Please:

  • Verify dashboards/alerts scraped the new name.
  • Consider aligning len/cap metrics with the same "cache_" prefix for consistency (optional).
src/blocks/header.rs (2)

201-233: Manual GetSize impl: clarify exclusion rationale and potential undercount

  • You explicitly exclude CIDs (state_root, message_receipts, messages). If this is to avoid relying on a non-existent GetSize impl for cid::Cid, please replace the comment “derive macro does not compile for some reason” with a concrete note so future readers don’t regress this.
  • If cid::Cid (or its multihash) allocates on the heap in your version, excluding it here will undercount. If the multihash is stack/inlined for 32-byte digests in your build, the exclusion is fine. Please confirm the intended behavior.

Nit: big_int_heap_size_helper(b) accounts only for payload bits; it doesn’t include Vec/limb overhead. If you want a slightly less optimistic estimate, consider adding a small constant overhead per BigInt, but that’s optional.

Example comment tweak:

-// The derive macro does not compile for some reason
+// Note: We can’t derive GetSize here because some external types (e.g., cid::Cid)
+// don’t implement it. We manually sum known heap-allocating fields and exclude CIDs,
+// which are treated as fixed-size/inlined for our purposes.

Would you like me to search the repo to verify whether there’s any GetSize impl for cid::Cid (or wrappers), and whether we should account for it here?


237-241: Deriving GetSize and ignoring OnceLock is reasonable—confirm CID treatment is consistent

Deriving GetSize on CachingBlockHeader while ignoring the cached CID avoids double-counting and also sidesteps the lack of a GetSize impl for Cid. This matches the manual exclusion in RawBlockHeader, but it means CID heap (if any) is never counted. If that’s intentional, consider adding a brief comment to the field.

Optional:

 #[derive(Debug, GetSize)]
 pub struct CachingBlockHeader {
     uncached: RawBlockHeader,
-    #[get_size(ignore)]
+    // CID heap is excluded; tracked via RawBlockHeader fields as needed (currently excluded too).
+    #[get_size(ignore)]
     cid: OnceLock<Cid>,
     has_ever_been_verified_against_any_signature: AtomicBool,
 }

If you’d like, I can also scan for a GetSize impl on TipsetKey and Signature to ensure size tracking stays sound across caches.

src/chain/store/chain_store.rs (2)

560-567: Constructor and metrics wiring LGTM

Using new_with_default_metrics_registry is consistent with similar caches. Consider standardizing the cache name (e.g., “msgs_in_tipset”) if you want all cache metrics to follow a consistent pluralization.


591-598: NonZeroUsize env parsing is fine; optional: guard invalid values explicitly

Current parsing silently falls back to DEFAULT. Optionally log or trace when parsing fails or when the value is zero/invalid to aid ops.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59ca796 and 76fefd2.

📒 Files selected for processing (23)
  • .clippy.toml (1 hunks)
  • interop-tests/tests/dhat_get_size_beacon_entry.rs (1 hunks)
  • src/beacon/drand.rs (1 hunks)
  • src/blocks/election_proof.rs (2 hunks)
  • src/blocks/header.rs (2 hunks)
  • src/blocks/ticket.rs (1 hunks)
  • src/blocks/tipset.rs (2 hunks)
  • src/blocks/vrf_proof.rs (2 hunks)
  • src/chain/store/chain_store.rs (4 hunks)
  • src/chain/store/index.rs (2 hunks)
  • src/chain_sync/bad_block_cache.rs (1 hunks)
  • src/db/car/mod.rs (1 hunks)
  • src/message/chain_message.rs (1 hunks)
  • src/message/signed_message.rs (1 hunks)
  • src/message_pool/msgpool/msg_pool.rs (1 hunks)
  • src/shim/address.rs (2 hunks)
  • src/shim/econ.rs (2 hunks)
  • src/shim/message.rs (2 hunks)
  • src/shim/sector.rs (2 hunks)
  • src/state_manager/cache.rs (1 hunks)
  • src/state_migration/common/mod.rs (1 hunks)
  • src/utils/cache/lru.rs (2 hunks)
  • src/utils/get_size/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

Applied to files:

  • src/blocks/tipset.rs
  • src/blocks/vrf_proof.rs
  • src/shim/address.rs
  • src/blocks/header.rs
  • src/message/signed_message.rs
  • src/shim/econ.rs
  • src/shim/sector.rs
  • src/shim/message.rs
🧬 Code Graph Analysis (13)
src/message_pool/msgpool/msg_pool.rs (2)
src/state_migration/common/mod.rs (1)
  • new (32-39)
src/utils/cache/lru.rs (1)
  • new_with_default_metrics_registry (95-100)
src/blocks/tipset.rs (8)
src/rpc/methods/chain.rs (1)
  • utils (497-497)
src/blocks/header.rs (2)
  • cid (69-71)
  • cid (304-306)
src/message/chain_message.rs (1)
  • cid (29-34)
src/message/signed_message.rs (1)
  • cid (78-85)
src/shim/message.rs (1)
  • cid (195-198)
src/blocks/block.rs (1)
  • cid (41-43)
src/libp2p_bitswap/store.rs (1)
  • cid (86-88)
src/utils/get_size/mod.rs (1)
  • nunny_vec_heap_size_helper (50-52)
src/blocks/vrf_proof.rs (4)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/address.rs (5)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/blocks/header.rs (3)
src/utils/encoding/mod.rs (1)
  • blake2b_256 (87-97)
src/utils/get_size/mod.rs (1)
  • big_int_heap_size_helper (56-58)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/chain/store/chain_store.rs (2)
src/utils/cache/lru.rs (2)
  • cache (119-121)
  • new_with_default_metrics_registry (95-100)
src/chain/store/index.rs (1)
  • new (40-46)
src/db/car/mod.rs (1)
src/utils/cache/lru.rs (1)
  • unbounded_with_default_metrics_registry (115-117)
src/chain_sync/bad_block_cache.rs (1)
src/utils/cache/lru.rs (2)
  • new_with_default_metrics_registry (95-100)
  • cap (159-161)
src/utils/cache/lru.rs (5)
src/blocks/tipset.rs (1)
  • contains (50-52)
src/chain/store/chain_store.rs (1)
  • contains (95-97)
src/cid_collections/small_cid_vec.rs (1)
  • contains (35-37)
src/db/car/many.rs (1)
  • contains (197-199)
src/state_manager/cache.rs (1)
  • cache_name (45-51)
src/shim/econ.rs (6)
src/utils/get_size/mod.rs (1)
  • big_int_heap_size_helper (56-58)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/sector.rs (1)
  • get_heap_size (382-384)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/shim/sector.rs (5)
src/blocks/vrf_proof.rs (1)
  • get_heap_size (31-33)
src/blocks/header.rs (1)
  • get_heap_size (203-232)
src/shim/address.rs (1)
  • get_heap_size (264-266)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/chain/store/index.rs (4)
src/utils/cache/lru.rs (2)
  • cache (119-121)
  • new_with_default_metrics_registry (95-100)
src/utils/misc/env.rs (1)
  • is_env_truthy (17-19)
src/message_pool/msgpool/provider.rs (2)
  • load_tipset (50-50)
  • load_tipset (114-120)
src/blocks/tipset.rs (4)
  • tsk (264-268)
  • load (263-271)
  • new (241-259)
  • new (493-510)
src/message/chain_message.rs (2)
src/message/signed_message.rs (1)
  • message (40-42)
src/cli/subcommands/chain_cmd.rs (1)
  • fvm_ipld_encoding (92-92)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (24)
src/db/car/mod.rs (1)

75-76: Metric label rename sanity-check complete

I ran rg -n --hidden -S 'zstd_frame_cache' and found no remaining code-level registrations or string literals using the old metric label. The only matches are:

  • src/db/car/mod.rs
    • Constant and env var names: ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE, FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE
    • Test name: test_zstd_frame_cache_size
  • CHANGELOG.md and docs/users/reference/env_variables.md: references to the env var

None of these affect the metric registry. Please verify that any external dashboards or alerting rules still expecting "zstd_frame_cache" have been updated.

src/beacon/drand.rs (1)

265-268: Verified complete label rename
The search for "verified_beacons_cache" across the repo returned no hits, confirming the old metric label has been fully removed and replaced. No further references found. 🎉

.clippy.toml (1)

53-54: Reason update for unbounded — LGTM.

Clearer guidance and aligns with the migration to SizeTrackingLruCache.

src/message/signed_message.rs (1)

15-20: Deriving GetSize for SignedMessage — LGTM.

This enables precise heap accounting for cache sizing; consistent with the broader PR.

src/blocks/tipset.rs (2)

14-15: Import nunny_vec_heap_size_helper — LGTM.

Needed for accurate heap-size accounting of nunny::Vec.


33-36: Derive GetSize for TipsetKey — LGTM.

Pairs well with shrink_to_fit in From<NonEmpty> to avoid over-counting spare capacity.

interop-tests/tests/dhat_get_size_beacon_entry.rs (1)

52-53: LGTM: assertion matches dynamic heap sizing.

Using inner_bytes aligns the expectation with actual per-element heap usage. Consistent with earlier assertions in the test.
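The dynamic-size pattern replacing a hard-coded literal can be sketched with a local trait standing in for get_size2's GetSize (heap-size part only); the field layout of BeaconEntrySketch is illustrative, not forest's actual BeaconEntry:

```rust
// Local stand-in for get_size2's `GetSize` trait (heap size only).
trait HeapSize {
    fn get_heap_size(&self) -> usize;
}

impl HeapSize for Vec<u8> {
    fn get_heap_size(&self) -> usize {
        self.capacity() // bytes allocated on the heap by this Vec
    }
}

// Hypothetical shape; the real BeaconEntry's fields may differ.
struct BeaconEntrySketch {
    data: Vec<u8>,
    signature: Vec<u8>,
}

impl HeapSize for BeaconEntrySketch {
    fn get_heap_size(&self) -> usize {
        self.data.get_heap_size() + self.signature.get_heap_size()
    }
}

fn main() {
    let entry = BeaconEntrySketch {
        data: vec![0u8; 32],
        signature: vec![0u8; 96],
    };
    // Compute the expected byte count from GetSize sums instead of
    // hard-coding a literal like 60.
    let inner_bytes = entry.get_heap_size();
    assert_eq!(inner_bytes, 32 + 96);
    println!("ok");
}
```

Summing per-field heap sizes keeps the assertion valid if field contents change size in future fixtures.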

src/shim/econ.rs (1)

39-43: LGTM: GetSize for TokenAmount correctly accounts for atto BigInt heap.

Accurate and zero-copy via atto(). This integrates with SizeTrackingLruCache as intended.

src/shim/message.rs (1)

18-18: LGTM: Deriving GetSize for Message.

Consistent with other message types and enables accurate size tracking.

src/message/chain_message.rs (1)

14-16: LGTM: GetSize derive for ChainMessage.

Enables correct sizing for both variants and integrates with size-tracking caches.

src/state_migration/common/mod.rs (1)

35-37: LGTM: metrics key normalized to "migration".

Aligns with repo-wide naming cleanup; no functional change.

src/message_pool/msgpool/msg_pool.rs (1)

479-485: LGTM: metrics keys shortened.

Consistent with naming across the PR; no runtime behavior change.

src/state_manager/cache.rs (1)

45-51: Metric name change: validate dashboards/alerts and ensure no collisions

cache_name() now returns "tipset_state_{type}" (previously included "cache" in the name per PR summary). Please confirm dashboards/alerts are updated and that no other registry entry reuses the same name.

src/blocks/election_proof.rs (1)

8-8: Derive GetSize on ElectionProof — good

Brings ElectionProof into heap-size accounting (including nested VRFProof). Looks correct.

Also applies to: 138-149

src/blocks/ticket.rs (1)

6-6: Derive GetSize on Ticket — good

Enables size tracking for Ticket via VRFProof. No logical changes; import and derive look correct.

Also applies to: 12-23

src/chain/store/index.rs (1)

52-58: Cache refactor to SizeTrackingLruCache — looks good

get_cloned() and push() usage is correct; metrics hit/miss logic preserved. The API remains thread-safe without an external Mutex.

Also applies to: 62-63

src/blocks/header.rs (2)

17-18: Importing big_int_heap_size_helper is appropriate

Brings in the helper used below; no issues.


23-23: GetSize import looks good

Required for the manual impl and derive below.

src/chain/store/chain_store.rs (6)

29-29: Import switch to SizeTrackingLruCache aligns with PR goals

Consistent with replacing LruCache usages.


555-557: MsgsInTipsetCache switched to SizeTrackingLruCache

Good step toward centralized size accounting. Please ensure TipsetKey and Vec implement GetSize so sizes are tracked correctly; otherwise entries will default to stack-only size.

Do you want me to scan the codebase to confirm GetSize impls for these types?


569-571: get via get_cloned is correct for copy-out semantics

No concerns.


587-588: Insert delegates to push—good

Matches the new cache API.


603-604: Default uses read_cache_size—LGTM

No issues.


801-801: Test update for NonZeroUsize capacity is correct

Constructs the cache with capacity 2; behavior validated below.

Comment on lines +37 to +42
fn raw_bytes_heap_size(b: &RawBytes) -> usize {
// Note: this is a cheap but inaccurate estimation,
// the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
// or `b.bytes.get_heap_size()` if `bytes` is made public.
b.bytes().get_heap_size()
}

@coderabbitai coderabbitai bot Aug 11, 2025


💡 Verification agent

🧩 Analysis chain

Confirm RawBytes::bytes() return type; fix misleading comment if needed.

If RawBytes::bytes() returns &[u8], get_heap_size() will undercount (likely 0). If it returns &Vec, this is correct. The comment currently contradicts the implementation; adjust accordingly after verification.

Please verify via docs: what is the return type of fvm_ipld_encoding::RawBytes::bytes()?

If it returns &[u8], change to a safe, accurate path (not ideal due to clone, but correct):

-fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
-}
+fn raw_bytes_heap_size(b: &RawBytes) -> usize {
+    // Accurate: obtain a Vec to measure capacity-based heap size.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
+}

🌐 Web query:

What is the return type of fvm_ipld_encoding::RawBytes::bytes()? Does it return &Vec<u8> or &[u8]?

💡 Result:

It returns a slice: pub fn bytes(&self) -> &[u8]. (docs.rs)

(So you get a &[u8], not a &Vec<u8>.)


Adjust raw_bytes_heap_size to clone into a Vec for an accurate heap‐size measurement

According to the fvm_ipld_encoding docs, RawBytes::bytes() returns a slice (&[u8]), so calling get_heap_size() on it will undercount (it effectively reports zero for the referenced heap data). To measure the actual heap allocation, clone the RawBytes into a Vec<u8> and call get_heap_size() on that:

src/shim/message.rs

 fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
+    // Accurate: clone into a Vec so we can measure its heap allocation.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
 }

hanabi1224 (Contributor Author):
I'd prefer performance over accuracy in this case, as commented.
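The trade-off under discussion can be illustrated with a toy, std-only sketch (this is not the get_size2 crate; `heap_size_of_slice` and `heap_size_of_vec` are hypothetical helpers): a borrowed slice owns no allocation, so a GetSize-style default reports 0 for it, while the accurate path has to materialize an owned `Vec` first and pay for the copy.

```rust
// Toy illustration (not the get_size2 crate): why measuring through a
// borrowed slice undercounts, and what the clone-based alternative costs.

fn heap_size_of_slice(_b: &[u8]) -> usize {
    // A borrowed slice owns no heap allocation, so a GetSize-style
    // default reports 0 for it.
    0
}

fn heap_size_of_vec(v: &Vec<u8>) -> usize {
    // An owned Vec reports its allocated capacity.
    v.capacity() * std::mem::size_of::<u8>()
}

fn main() {
    let owned: Vec<u8> = vec![0u8; 1024];

    // Cheap but inaccurate: measuring through the borrowed view.
    assert_eq!(heap_size_of_slice(owned.as_slice()), 0);

    // Accurate, but pays for a clone of the payload first.
    let cloned: Vec<u8> = owned.as_slice().to_vec();
    assert!(heap_size_of_vec(&cloned) >= 1024);
}
```

Since size tracking runs on every cache operation, avoiding the per-call clone in exchange for a documented undercount is a defensible choice.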

coderabbitai bot (Contributor):
@hanabi1224, understood! The performance trade-off makes sense here, especially since size tracking can be called frequently. The comment already documents that this is an intentional approximation for performance reasons. I appreciate the clarification on the design choice.


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
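The default-method pattern described in that learning can be sketched with a local trait that mirrors the documented defaults (plain std Rust, not the real get_size2 crate; `GetSizeLike` and `Nonce` are illustrative names):

```rust
// Sketch of the default-method pattern: stack size from size_of,
// heap size defaulting to 0, total size as their sum.

trait GetSizeLike: Sized {
    fn get_stack_size() -> usize {
        std::mem::size_of::<Self>()
    }
    fn get_heap_size(&self) -> usize {
        0 // default: assume no heap allocation
    }
    fn get_size(&self) -> usize {
        Self::get_stack_size() + self.get_heap_size()
    }
}

// A simple wrapper with no heap data: the empty impl is enough.
#[allow(dead_code)]
struct Nonce(u64);
impl GetSizeLike for Nonce {}

fn main() {
    let n = Nonce(42);
    assert_eq!(n.get_heap_size(), 0);
    assert_eq!(n.get_size(), std::mem::size_of::<Nonce>());
}
```

This is why `impl GetSize for MyType {}` suffices for wrapper types in this PR that do not allocate on the heap.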

LesnyRumcajs mentioned this pull request Aug 12, 2025
LesnyRumcajs previously approved these changes Aug 12, 2025
Co-authored-by: Hubert <hubert@chainsafe.io>
coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/chain/store/index.rs (1)

60-66: Cache-disabled env still performs writes and increments miss metrics — gate both

When the cache is disabled, we shouldn’t populate it or emit hit/miss metrics. Gate the push and miss counter on cache_enabled to avoid memory usage and misleading metrics when caching is disabled.

Apply this change:

-        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
-        if let Some(ts) = &ts_opt {
-            self.ts_cache.push(tsk.clone(), ts.clone());
-            metrics::LRU_CACHE_MISS
-                .get_or_create(&metrics::values::TIPSET)
-                .inc();
-        }
+        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
+        if let Some(ts) = &ts_opt {
+            if cache_enabled {
+                self.ts_cache.push(tsk.clone(), ts.clone());
+                metrics::LRU_CACHE_MISS
+                    .get_or_create(&metrics::values::TIPSET)
+                    .inc();
+            }
+        }
♻️ Duplicate comments (1)
.clippy.toml (1)

47-54: Ban coverage for LruCache constructors — LGTM; addresses earlier review

Blocking LruCache::new and LruCache::with_hasher aligns with the migration and closes the loophole called out earlier. Nice.

🧹 Nitpick comments (1)
.clippy.toml (1)

56-62: Ban unbounded constructors — good; unify reason wording and capitalization

The bans for unbounded and unbounded_with_hasher are correct. Minor consistency nit: standardize the reason string (capitalize LRU and simplify wording).

Apply this small wording cleanup:

- reason = """Avoid unbounded lru cache for potential memory leak, use SizeTrackingLruCache instead."""
+ reason = """Avoid unbounded LRU caches; use SizeTrackingLruCache instead."""

Repeat the same change for both entries (Lines 57 and 61).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76fefd2 and 13e6cab.

📒 Files selected for processing (4)
  • .clippy.toml (1 hunks)
  • src/chain/store/index.rs (2 hunks)
  • src/shim/address.rs (2 hunks)
  • src/utils/get_size/mod.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/shim/address.rs
  • src/utils/get_size/mod.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (3)
src/utils/misc/env.rs (1)
  • is_env_truthy (17-19)
src/blocks/tipset.rs (4)
  • tsk (264-268)
  • load (263-271)
  • new (241-259)
  • new (493-510)
src/beacon/drand.rs (1)
  • new (251-269)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Ubuntu
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
.clippy.toml (1)

47-62: Confirmed: No additional LruCache constructors to ban

  • A direct import of LruCache is present at src/utils/cache/lru.rs:16.
  • No occurrences of LruCache::try_new, ::try_with_hasher, ::new_default, or ::from_iter were found in the codebase.

The current disallowed-methods list covers all existing constructors. We can revisit this if the lru crate introduces new constructors in the future.

src/chain/store/index.rs (3)

11-11: Switch to SizeTrackingLruCache import — LGTM

Import is correct and aligns with the migration away from LruCache.


19-19: Type alias migrated to size-tracking cache — LGTM

Using SizeTrackingLruCache<TipsetKey, Arc<Tipset>> is appropriate for measured caching.


41-44: Initialize cache with metrics registry and NonZero capacity — LGTM

Name “tipset” and NonZero capacity are consistent with the new cache API and metrics conventions.
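The size-tracking cache pattern under review can be sketched with std only. This is a deliberately simplified toy, not Forest's actual `SizeTrackingLruCache` (which also wires in a metrics registry); the type and method names here are illustrative, though the `NonZeroUsize` capacity and the `push`/`get_cloned` shape echo the API described in this PR:

```rust
// Toy size-tracking LRU: enforce a NonZero capacity and keep a running
// byte count updated on every push and eviction.

use std::collections::VecDeque;
use std::num::NonZeroUsize;

struct ToySizeTrackingLru {
    cap: NonZeroUsize,
    entries: VecDeque<(String, Vec<u8>)>, // front = least recently used
    tracked_bytes: usize,                 // stands in for a metrics gauge
}

impl ToySizeTrackingLru {
    fn new(cap: NonZeroUsize) -> Self {
        Self { cap, entries: VecDeque::new(), tracked_bytes: 0 }
    }

    fn push(&mut self, key: String, value: Vec<u8>) {
        self.tracked_bytes += key.len() + value.len();
        self.entries.push_back((key, value));
        if self.entries.len() > self.cap.get() {
            if let Some((k, v)) = self.entries.pop_front() {
                // Evict the LRU entry and release its tracked bytes.
                self.tracked_bytes -= k.len() + v.len();
            }
        }
    }

    fn get_cloned(&mut self, key: &str) -> Option<Vec<u8>> {
        let pos = self.entries.iter().position(|(k, _)| k == key)?;
        let entry = self.entries.remove(pos)?;
        let value = entry.1.clone();
        self.entries.push_back(entry); // mark as most recently used
        Some(value)
    }
}

fn main() {
    let mut cache = ToySizeTrackingLru::new(NonZeroUsize::new(2).unwrap());
    cache.push("a".into(), vec![0u8; 10]);
    cache.push("b".into(), vec![0u8; 20]);
    assert_eq!(cache.tracked_bytes, 1 + 10 + 1 + 20);
    cache.push("c".into(), vec![0u8; 30]); // capacity 2, evicts "a"
    assert_eq!(cache.tracked_bytes, 1 + 20 + 1 + 30);
    assert!(cache.get_cloned("a").is_none());
    assert!(cache.get_cloned("b").is_some());
}
```

Requiring `NonZeroUsize` at construction makes the "unbounded cache" mistake unrepresentable, which is what the new `.clippy.toml` bans enforce at the call sites.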

hanabi1224 enabled auto-merge August 12, 2025 13:43
hanabi1224 added this pull request to the merge queue Aug 12, 2025
Merged via the queue into main with commit 6b53690 Aug 12, 2025
43 checks passed
hanabi1224 deleted the hm/track-all-lru-caches branch August 12, 2025 14:15

4 participants