feat: tracking size of all LruCache with SizeTrackingLruCache #5920
Conversation
Walkthrough

Adds GetSize derives/impls and a BigInt heap-size helper across many types; replaces several lock-guarded lru::LruCache usages with SizeTrackingLruCache (new API: get_cloned, push, contains, NonZeroUsize capacity); tweaks several metrics registry names and one test assertion.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant SizeTrackingLruCache
    participant Builder
    Caller->>SizeTrackingLruCache: get_or_insert_with(key, builder)
    alt cache.contains(key)
        SizeTrackingLruCache-->>Caller: get_cloned(key) (hit)
    else miss
        SizeTrackingLruCache->>Builder: compute value
        Builder-->>SizeTrackingLruCache: value
        SizeTrackingLruCache->>SizeTrackingLruCache: push(key, value)
        SizeTrackingLruCache-->>Caller: get_cloned(key)
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested reviewers
Actionable comments posted: 4
🔭 Outside diff range comments (2)
src/state_manager/cache.rs (1)
97-114: Pending entry not cleared on compute error can leave stuck “pending” entries

If compute().await fails, the key remains in inner.pending indefinitely. Subsequent calls will always go down the “Empty” path, lock the same mutex, and re-attempt compute (good), but the pending list will grow with permanently failing keys, causing memory bloat.
Clear the pending entry on error as well. Example change:
```diff
-        let value = compute().await?;
-
-        // Write back to cache, release lock and return value
-        self.insert(key.clone(), value.clone());
-        Ok(value)
+        let res = compute().await;
+        // Ensure we don't leave a stale pending entry on error.
+        self.with_inner(|inner| {
+            inner.pending.retain(|(k, _)| k != key);
+        });
+        match res {
+            Ok(value) => {
+                // Write back to cache, release lock and return value
+                self.insert(key.clone(), value.clone());
+                Ok(value)
+            }
+            Err(e) => Err(e),
+        }
```

Also applies to: 123-146
src/chain/store/chain_store.rs (1)
577-584: Use a single lookup to avoid TOCTOU and remove the panic

The current contains→get path can race under concurrency and trigger the "cache entry disappeared!" panic. Replace it with one get (or get_cloned) call and fall back to insertion:

```diff
-        if self.cache.contains(key) {
-            Ok(self.get(key).expect("cache entry disappeared!"))
-        } else {
-            let v = f()?;
-            self.insert(key.clone(), v.clone());
-            Ok(v)
-        }
+        match self.get(key) {
+            Some(v) => Ok(v),
+            None => {
+                let v = f()?;
+                self.insert(key.clone(), v.clone());
+                Ok(v)
+            }
+        }
```

• This removes the benign TOCTOU and eliminates the panic.
• Since SizeTrackingLruCache doesn’t expose an atomic get-or-insert helper, duplicate computation under contention is still possible. We can add a get_or_try_insert_with API to the cache as a follow-up if you’d like.
🧹 Nitpick comments (14)
src/chain_sync/bad_block_cache.rs (1)
28-29: Metrics label rename to "bad_block" — LGTM. Consider exposing contains().

The rename aligns with the metrics standardization.
Optional: provide a contains() that uses the new SizeTrackingLruCache.contains() to avoid cloning/allocations elsewhere.
```diff
 impl BadBlockCache {
     pub fn new(cap: NonZeroUsize) -> Self {
         Self {
             cache: SizeTrackingLruCache::new_with_default_metrics_registry("bad_block".into(), cap),
         }
     }

+    /// Returns true if the block CID is in the bad block cache.
+    /// Does not update LRU head position.
+    pub fn contains(&self, c: &Cid) -> bool {
+        self.cache.contains(&(*c).into())
+    }
+
     pub fn push(&self, c: Cid) {
         self.cache.push(c.into(), ());
     }

     /// Returns `Some` if the block CID is in bad block cache.
     /// This function does not update the head position of the `Cid` key.
     pub fn peek(&self, c: &Cid) -> Option<()> {
         self.cache.peek_cloned(&(*c).into())
     }
 }
```

src/beacon/drand.rs (1)
271-274: Use contains() for membership check to avoid locking + value compare.

You can simplify is_verified by checking key presence only; the round uniquely identifies the entry content.

```diff
-    fn is_verified(&self, entry: &BeaconEntry) -> bool {
-        let cache = self.verified_beacons.cache().read();
-        cache.peek(&entry.round()) == Some(entry)
-    }
+    fn is_verified(&self, entry: &BeaconEntry) -> bool {
+        self.verified_beacons.contains(&entry.round())
+    }
```

If there’s any possibility that the cached value for a round could differ from entry, keep the equality check.
22-22: GetSize for PoStProof — LGTM.

Accurately accounts for heap via proof_bytes. If fvm_shared adds additional heap fields in PoStProofV4 later, consider delegating to the inner type (if it ever implements GetSize) to avoid drift.
Also applies to: 381-385
src/shim/address.rs (1)
17-17: GetSize for Address — LGTM.

Returning 0 matches the zero-heap expectation for this wrapper.
Consider a brief comment explaining why heap size is 0 (helps future maintainers avoid double-counting).
Also applies to: 263-267
src/blocks/tipset.rs (1)
149-156: Derive GetSize for Tipset and annotate headers — LGTM.

The size_fn for nunny::Vec ensures precise heap tracking for the headers collection.
If you want similar visibility for FullTipset memory, consider deriving GetSize there too:
```diff
-#[derive(Debug, Clone, Eq)]
+#[derive(Debug, Clone, Eq, GetSize)]
 pub struct FullTipset {
-    blocks: NonEmpty<Block>,
+    #[get_size(size_fn = nunny_vec_heap_size_helper)]
+    blocks: NonEmpty<Block>,
     // key is lazily initialized via `fn key()`.
     key: OnceLock<TipsetKey>,
 }
```
54-58: MSRV-safe byte math and minor doc tweak for BigInt helper.

The estimate is fine. If MSRV predates div_ceil on integers, prefer the add-then-divide pattern. Also clarify the comment to reflect that this measures magnitude bytes (sign/overhead ignored).

```diff
-// This is a rough estimation. Use `b.allocation_size()`
-// once https://github.com/rust-num/num-bigint/pull/333 is accepted and released.
-pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
-    b.bits().div_ceil(8) as usize
-}
+// Rough estimation of magnitude bytes. Ignores sign/struct overhead.
+// Switch to `b.allocation_size()` once
+// https://github.com/rust-num/num-bigint/pull/333 is released.
+pub fn big_int_heap_size_helper(b: &BigInt) -> usize {
+    // MSRV-safe: (bits + 7) / 8
+    let bits = b.bits() as usize;
+    (bits.saturating_add(7)) / 8
+}
```
30-34: Optional: derive GetSize for VRFProof and drop the manual impl.

Keeps consistency with other types deriving GetSize and reduces boilerplate.

```diff
-#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash)]
+#[derive(Clone, Debug, PartialEq, Eq, Ord, PartialOrd, Default, Serialize, Deserialize, Hash, GetSize)]
 pub struct VRFProof(#[serde(with = "serde_byte_array")] pub Vec<u8>);
-
-impl GetSize for VRFProof {
-    fn get_heap_size(&self) -> usize {
-        self.0.get_heap_size()
-    }
-}
```
44-49: Remove dead commented code.

Stale skeleton impl adds noise; derive is already in use.

```diff
-// impl GetSize for Message{
-//     fn get_heap_size(&self) -> usize {
-
-//     }
-// }
```
127-134: New contains() is correct and concurrency-safe

Read lock with LruCache::contains is appropriate and does not update recency. Consider a short doc comment noting that contains does not affect LRU order, so callers shouldn’t rely on it for touch semantics.
186-191: Metric rename to "cache_..._size": ensure consistency and update dashboards

The size metric now uses the "cache_" prefix. Please:
- Verify dashboards/alerts scraped the new name.
- Consider aligning len/cap metrics with the same "cache_" prefix for consistency (optional).
src/blocks/header.rs (2)
201-233: Manual GetSize impl: clarify exclusion rationale and potential undercount
- You explicitly exclude CIDs (state_root, message_receipts, messages). If this is to avoid relying on a non-existent GetSize impl for cid::Cid, please replace the comment “derive macro does not compile for some reason” with a concrete note so future readers don’t regress this.
- If cid::Cid (or its multihash) allocates on the heap in your version, excluding it here will undercount. If the multihash is stack/inlined for 32-byte digests in your build, the exclusion is fine. Please confirm the intended behavior.
Nit: big_int_heap_size_helper(b) accounts only for payload bits; it doesn’t include Vec/limb overhead. If you want a slightly less optimistic estimate, consider adding a small constant overhead per BigInt, but that’s optional.
Example comment tweak:
```diff
-// The derive macro does not compile for some reason
+// Note: We can’t derive GetSize here because some external types (e.g., cid::Cid)
+// don’t implement it. We manually sum known heap-allocating fields and exclude CIDs,
+// which are treated as fixed-size/inlined for our purposes.
```

Would you like me to search the repo to verify whether there’s any GetSize impl for cid::Cid (or wrappers), and whether we should account for it here?
237-241: Deriving GetSize and ignoring OnceLock is reasonable — confirm CID treatment is consistent

Deriving GetSize on CachingBlockHeader while ignoring the cached CID avoids double-counting and also sidesteps the lack of a GetSize impl for Cid. This matches the manual exclusion in RawBlockHeader, but it means CID heap (if any) is never counted. If that’s intentional, consider adding a brief comment to the field.
Optional:
```diff
 #[derive(Debug, GetSize)]
 pub struct CachingBlockHeader {
     uncached: RawBlockHeader,
+    // CID heap is excluded; tracked via RawBlockHeader fields as needed (currently excluded too).
     #[get_size(ignore)]
     cid: OnceLock<Cid>,
     has_ever_been_verified_against_any_signature: AtomicBool,
 }
```

If you’d like, I can also scan for a GetSize impl on TipsetKey and Signature to ensure size tracking stays sound across caches.
src/chain/store/chain_store.rs (2)
560-567: Constructor and metrics wiring LGTM

Using new_with_default_metrics_registry is consistent with similar caches. Consider standardizing the cache name (e.g., “msgs_in_tipset”) if you want all cache metrics to follow a consistent pluralization.
591-598: NonZeroUsize env parsing is fine; optional: guard invalid values explicitly

Current parsing silently falls back to DEFAULT. Optionally log or trace when parsing fails or when the value is zero/invalid to aid ops.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- .clippy.toml (1 hunks)
- interop-tests/tests/dhat_get_size_beacon_entry.rs (1 hunks)
- src/beacon/drand.rs (1 hunks)
- src/blocks/election_proof.rs (2 hunks)
- src/blocks/header.rs (2 hunks)
- src/blocks/ticket.rs (1 hunks)
- src/blocks/tipset.rs (2 hunks)
- src/blocks/vrf_proof.rs (2 hunks)
- src/chain/store/chain_store.rs (4 hunks)
- src/chain/store/index.rs (2 hunks)
- src/chain_sync/bad_block_cache.rs (1 hunks)
- src/db/car/mod.rs (1 hunks)
- src/message/chain_message.rs (1 hunks)
- src/message/signed_message.rs (1 hunks)
- src/message_pool/msgpool/msg_pool.rs (1 hunks)
- src/shim/address.rs (2 hunks)
- src/shim/econ.rs (2 hunks)
- src/shim/message.rs (2 hunks)
- src/shim/sector.rs (2 hunks)
- src/state_manager/cache.rs (1 hunks)
- src/state_migration/common/mod.rs (1 hunks)
- src/utils/cache/lru.rs (2 hunks)
- src/utils/get_size/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
Applied to files:
- src/blocks/tipset.rs
- src/blocks/vrf_proof.rs
- src/shim/address.rs
- src/blocks/header.rs
- src/message/signed_message.rs
- src/shim/econ.rs
- src/shim/sector.rs
- src/shim/message.rs
🧬 Code Graph Analysis (13)

src/message_pool/msgpool/msg_pool.rs (2)
- src/state_migration/common/mod.rs (1): new (32-39)
- src/utils/cache/lru.rs (1): new_with_default_metrics_registry (95-100)

src/blocks/tipset.rs (8)
- src/rpc/methods/chain.rs (1): utils (497-497)
- src/blocks/header.rs (2): cid (69-71), cid (304-306)
- src/message/chain_message.rs (1): cid (29-34)
- src/message/signed_message.rs (1): cid (78-85)
- src/shim/message.rs (1): cid (195-198)
- src/blocks/block.rs (1): cid (41-43)
- src/libp2p_bitswap/store.rs (1): cid (86-88)
- src/utils/get_size/mod.rs (1): nunny_vec_heap_size_helper (50-52)

src/blocks/vrf_proof.rs (4)
- src/blocks/header.rs (1): get_heap_size (203-232)
- src/shim/address.rs (1): get_heap_size (264-266)
- src/shim/econ.rs (1): get_heap_size (40-42)
- src/shim/sector.rs (1): get_heap_size (382-384)

src/shim/address.rs (5)
- src/blocks/vrf_proof.rs (1): get_heap_size (31-33)
- src/blocks/header.rs (1): get_heap_size (203-232)
- src/shim/econ.rs (1): get_heap_size (40-42)
- src/shim/sector.rs (1): get_heap_size (382-384)
- src/shim/executor.rs (2): get_heap_size (138-144), get_heap_size (348-357)

src/blocks/header.rs (3)
- src/utils/encoding/mod.rs (1): blake2b_256 (87-97)
- src/utils/get_size/mod.rs (1): big_int_heap_size_helper (56-58)
- src/shim/sector.rs (1): get_heap_size (382-384)

src/chain/store/chain_store.rs (2)
- src/utils/cache/lru.rs (2): cache (119-121), new_with_default_metrics_registry (95-100)
- src/chain/store/index.rs (1): new (40-46)

src/db/car/mod.rs (1)
- src/utils/cache/lru.rs (1): unbounded_with_default_metrics_registry (115-117)

src/chain_sync/bad_block_cache.rs (1)
- src/utils/cache/lru.rs (2): new_with_default_metrics_registry (95-100), cap (159-161)

src/utils/cache/lru.rs (5)
- src/blocks/tipset.rs (1): contains (50-52)
- src/chain/store/chain_store.rs (1): contains (95-97)
- src/cid_collections/small_cid_vec.rs (1): contains (35-37)
- src/db/car/many.rs (1): contains (197-199)
- src/state_manager/cache.rs (1): cache_name (45-51)

src/shim/econ.rs (6)
- src/utils/get_size/mod.rs (1): big_int_heap_size_helper (56-58)
- src/blocks/vrf_proof.rs (1): get_heap_size (31-33)
- src/blocks/header.rs (1): get_heap_size (203-232)
- src/shim/address.rs (1): get_heap_size (264-266)
- src/shim/sector.rs (1): get_heap_size (382-384)
- src/shim/executor.rs (2): get_heap_size (138-144), get_heap_size (348-357)

src/shim/sector.rs (5)
- src/blocks/vrf_proof.rs (1): get_heap_size (31-33)
- src/blocks/header.rs (1): get_heap_size (203-232)
- src/shim/address.rs (1): get_heap_size (264-266)
- src/shim/econ.rs (1): get_heap_size (40-42)
- src/shim/executor.rs (2): get_heap_size (138-144), get_heap_size (348-357)

src/chain/store/index.rs (4)
- src/utils/cache/lru.rs (2): cache (119-121), new_with_default_metrics_registry (95-100)
- src/utils/misc/env.rs (1): is_env_truthy (17-19)
- src/message_pool/msgpool/provider.rs (2): load_tipset (50-50), load_tipset (114-120)
- src/blocks/tipset.rs (4): tsk (264-268), load (263-271), new (241-259), new (493-510)

src/message/chain_message.rs (2)
- src/message/signed_message.rs (1): message (40-42)
- src/cli/subcommands/chain_cmd.rs (1): fvm_ipld_encoding (92-92)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: All lint checks
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (24)
src/db/car/mod.rs (1)
75-76: Metric label rename sanity-check complete

I ran `rg -n --hidden -S 'zstd_frame_cache'` and found no remaining code-level registrations or string literals using the old metric label. The only matches are:

- src/db/car/mod.rs
  • Constant and env var names: ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE, FOREST_ZSTD_FRAME_CACHE_DEFAULT_MAX_SIZE
  • Test name: test_zstd_frame_cache_size
- CHANGELOG.md and docs/users/reference/env_variables.md: references to the env var

None of these affect the metric registry. Please verify that any external dashboards or alerting rules still expecting "zstd_frame_cache" have been updated.

src/beacon/drand.rs (1)
265-268: Verified complete label rename
The search for "verified_beacons_cache" across the repo returned no hits, confirming the old metric label has been fully removed and replaced. No further references found. 🎉

.clippy.toml (1)
53-54: Reason update for unbounded — LGTM.

Clearer guidance and aligns with the migration to SizeTrackingLruCache.
src/message/signed_message.rs (1)
15-20: Deriving GetSize for SignedMessage — LGTM.

This enables precise heap accounting for cache sizing; consistent with the broader PR.
src/blocks/tipset.rs (2)
14-15: Import nunny_vec_heap_size_helper — LGTM.

Needed for accurate heap-size accounting of nunny::Vec.
33-36: Derive GetSize for TipsetKey — LGTM.

Pairs well with shrink_to_fit in From&lt;NonEmpty&gt; to avoid over-counting spare capacity.
interop-tests/tests/dhat_get_size_beacon_entry.rs (1)
52-53: LGTM: assertion matches dynamic heap sizing.

Using inner_bytes aligns the expectation with actual per-element heap usage. Consistent with earlier assertions in the test.
src/shim/econ.rs (1)
39-43: LGTM: GetSize for TokenAmount correctly accounts for atto BigInt heap.

Accurate and zero-copy via atto(). This integrates with SizeTrackingLruCache as intended.
src/shim/message.rs (1)
18-18: LGTM: Deriving GetSize for Message.

Consistent with other message types and enables accurate size tracking.
src/message/chain_message.rs (1)
14-16: LGTM: GetSize derive for ChainMessage.

Enables correct sizing for both variants and integrates with size-tracking caches.
src/state_migration/common/mod.rs (1)
35-37: LGTM: metrics key normalized to "migration".

Aligns with repo-wide naming cleanup; no functional change.
src/message_pool/msgpool/msg_pool.rs (1)
479-485: LGTM: metrics keys shortened.

Consistent with naming across the PR; no runtime behavior change.
src/state_manager/cache.rs (1)
45-51: Metric name change: validate dashboards/alerts and ensure no collisions

cache_name() now returns "tipset_state_{type}" (previously included "cache" in the name per PR summary). Please confirm dashboards/alerts are updated and that no other registry entry reuses the same name.
src/blocks/election_proof.rs (1)
8-8: Derive GetSize on ElectionProof — good

Brings ElectionProof into heap-size accounting (including nested VRFProof). Looks correct.
Also applies to: 138-149
src/blocks/ticket.rs (1)
6-6: Derive GetSize on Ticket — good

Enables size tracking for Ticket via VRFProof. No logical changes; import and derive look correct.
Also applies to: 12-23
src/chain/store/index.rs (1)
52-58: Cache refactor to SizeTrackingLruCache — looks good

get_cloned() and push() usage is correct; metrics hit/miss logic preserved. The API remains thread-safe without an external Mutex.
Also applies to: 62-63
src/blocks/header.rs (2)
17-18: Importing big_int_heap_size_helper is appropriate

Brings in the helper used below; no issues.
23-23: GetSize import looks good

Required for the manual impl and derive below.
src/chain/store/chain_store.rs (6)
29-29: Import switch to SizeTrackingLruCache aligns with PR goals

Consistent with replacing LruCache usages.
555-557: MsgsInTipsetCache switched to SizeTrackingLruCache

Good step toward centralized size accounting. Please ensure TipsetKey and Vec implement GetSize so sizes are tracked correctly; otherwise entries will default to stack-only size.
Do you want me to scan the codebase to confirm GetSize impls for these types?
569-571: get via get_cloned is correct for copy-out semantics

No concerns.
587-588: Insert delegates to push — good

Matches the new cache API.
603-604: Default uses read_cache_size — LGTM

No issues.
801-801: Test update for NonZeroUsize capacity is correct

Constructs the cache with capacity 2; behavior validated below.
```rust
fn raw_bytes_heap_size(b: &RawBytes) -> usize {
    // Note: this is a cheap but inaccurate estimation,
    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
    // or `b.bytes.get_heap_size()` if `bytes` is made public.
    b.bytes().get_heap_size()
}
```
💡 Verification agent
🧩 Analysis chain
Confirm RawBytes::bytes() return type; fix misleading comment if needed.
If RawBytes::bytes() returns &[u8], get_heap_size() will undercount (likely 0). If it returns &Vec&lt;u8&gt;, this is correct. The comment currently contradicts the implementation; adjust accordingly after verification.
Please verify via docs: what is the return type of fvm_ipld_encoding::RawBytes::bytes()?
If it returns &[u8], change to a safe, accurate path (not ideal due to clone, but correct):
```diff
-fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
-}
+fn raw_bytes_heap_size(b: &RawBytes) -> usize {
+    // Accurate: obtain a Vec to measure capacity-based heap size.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
+}
```

🌐 Web query:
What is the return type of fvm_ipld_encoding::RawBytes::bytes()? Does it return &Vec&lt;u8&gt; or &[u8]?
💡 Result:
It returns a slice: pub fn bytes(&self) -> &[u8]. (docs.rs)
(So you get a &[u8], not a &Vec&lt;u8&gt;.)
Adjust raw_bytes_heap_size to clone into a Vec for an accurate heap‐size measurement
According to the fvm_ipld_encoding docs, RawBytes::bytes() returns a slice (&[u8]), so calling get_heap_size() on it will undercount (it effectively reports zero for the referenced heap data). To measure the actual heap allocation, clone the RawBytes into a Vec<u8> and call get_heap_size() on that:
src/shim/message.rs
```diff
 fn raw_bytes_heap_size(b: &RawBytes) -> usize {
-    // Note: this is a cheap but inaccurate estimation,
-    // the correct implementation should be `Vec<u8>.from(b.clone()).get_heap_size()`,
-    // or `b.bytes.get_heap_size()` if `bytes` is made public.
-    b.bytes().get_heap_size()
+    // Accurate: clone into a Vec so we can measure its heap allocation.
+    let v: Vec<u8> = b.clone().into();
+    v.get_heap_size()
 }
```
I'd prefer performance over accuracy in this case, as commented.
@hanabi1224, understood! The performance trade-off makes sense here, especially since size tracking can be called frequently. The comment already documents that this is an intentional approximation for performance reasons. I appreciate the clarification on the design choice.
🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
Co-authored-by: Hubert <hubert@chainsafe.io>
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/chain/store/index.rs (1)
60-66: Cache-disabled env still performs writes and increments miss metrics — gate both

When the cache is disabled, we shouldn’t populate it or emit hit/miss metrics. Gate the push and miss counter on cache_enabled to avoid memory usage and misleading metrics when caching is disabled.

Apply this change:

```diff
-        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
-        if let Some(ts) = &ts_opt {
-            self.ts_cache.push(tsk.clone(), ts.clone());
-            metrics::LRU_CACHE_MISS
-                .get_or_create(&metrics::values::TIPSET)
-                .inc();
-        }
+        let ts_opt = Tipset::load(&self.db, tsk)?.map(Arc::new);
+        if let Some(ts) = &ts_opt {
+            if cache_enabled {
+                self.ts_cache.push(tsk.clone(), ts.clone());
+                metrics::LRU_CACHE_MISS
+                    .get_or_create(&metrics::values::TIPSET)
+                    .inc();
+            }
+        }
```
♻️ Duplicate comments (1)
.clippy.toml (1)
47-54: Ban coverage for LruCache constructors — LGTM; addresses earlier review

Blocking LruCache::new and LruCache::with_hasher aligns with the migration and closes the loophole called out earlier. Nice.
🧹 Nitpick comments (1)
.clippy.toml (1)
56-62: Ban unbounded constructors — good; unify reason wording and capitalization

The bans for unbounded and unbounded_with_hasher are correct. Minor consistency nit: standardize the reason string (capitalize LRU and simplify wording).

Apply this small wording cleanup:

```diff
-    reason = """Avoid unbounded lru cache for potential memory leak, use SizeTrackingLruCache instead."""
+    reason = """Avoid unbounded LRU caches; use SizeTrackingLruCache instead."""
```

Repeat the same change for both entries (Lines 57 and 61).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .clippy.toml (1 hunks)
- src/chain/store/index.rs (2 hunks)
- src/shim/address.rs (2 hunks)
- src/utils/get_size/mod.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/shim/address.rs
- src/utils/get_size/mod.rs
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
🧬 Code Graph Analysis (1)
src/chain/store/index.rs (3)
- src/utils/misc/env.rs (1): is_env_truthy (17-19)
- src/blocks/tipset.rs (4): tsk (264-268), load (263-271), new (241-259), new (493-510)
- src/beacon/drand.rs (1): new (251-269)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Ubuntu
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (4)
.clippy.toml (1)
47-62: Confirmed: No additional LruCache constructors to ban

- A direct import of LruCache is present at src/utils/cache/lru.rs:16.
- No occurrences of LruCache::try_new, ::try_with_hasher, ::new_default, or ::from_iter were found in the codebase.

The current disallowed-methods list covers all existing constructors. We can revisit this if the lru crate introduces new constructors in the future.

src/chain/store/index.rs (3)
11-11: Switch to SizeTrackingLruCache import — LGTM

Import is correct and aligns with the migration away from LruCache.

19-19: Type alias migrated to size-tracking cache — LGTM

Using SizeTrackingLruCache&lt;TipsetKey, Arc&lt;Tipset&gt;&gt; is appropriate for measured caching.

41-44: Initialize cache with metrics registry and NonZero capacity — LGTM

Name “tipset” and NonZero capacity are consistent with the new cache API and metrics conventions.
Summary of changes
Changes introduced in this pull request:
- Replace LruCache usages with SizeTrackingLruCache

Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit
New Features
Refactor
Chores
Tests