-
Notifications
You must be signed in to change notification settings - Fork 182
feat: add SizeTrackingLruCache for tracking cache size metrics
#5841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughIn the cryptic depths of the code, a new eldritch entity named Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant BlockstoreReadCache (SizeTrackingLruCache)
participant MetricsRegistry
Caller->>BlockstoreReadCache: push(key, value)
BlockstoreReadCache->>BlockstoreReadCache: Store entry, update size
BlockstoreReadCache->>MetricsRegistry: Update metrics (size, length, capacity)
Caller->>BlockstoreReadCache: get_cloned(key)
BlockstoreReadCache-->>Caller: Option<value>
Caller->>BlockstoreReadCache: len()
BlockstoreReadCache-->>Caller: usize
Suggested reviewers
Poem
May the Great Old Ones smile upon this cache’s eternal vigilance, lest the abyss of untracked memory consumption swallow us whole. 📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used🧠 Learnings (2)📓 Common learningssrc/utils/cache/lru.rs (1)🧬 Code Graph Analysis (1)src/utils/cache/lru.rs (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
🔇 Additional comments (6)
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
851a436 to
2a73cae
Compare
SizeTrackingLruCache for tracking cache size metrics
2a73cae to
a96cae7
Compare
| const BLOCKSTORE_CACHE_CAP: usize = 65536; | ||
| static BLOCKSTORE_CACHE: LazyLock<Arc<LruBlockstoreReadCache>> = LazyLock::new(|| { | ||
| Arc::new(LruBlockstoreReadCache::new( | ||
| static BLOCKSTORE_CACHE: LazyLock<LruBlockstoreReadCache> = LazyLock::new(|| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an example of usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (4)
src/utils/get_size/impls.rs (1)
1-3: Empty placeholder file – clarify intent or remove
impls.rsis currently just a header with no implementations. Leaving empty source files around invites confusion and may trip CI checks (e.g.,#![deny(dead_code)]). If no additionalGetSizeimpls are expected imminently, delete the file; otherwise add a// TODO:explaining what will live here.src/utils/mod.rs (1)
4-9: Missing top-level docs for new utility modulesThe newly-exposed
cacheandget_sizemodules are public API. Add brief//!docs to each sub-module so generated docs aren’t left blank and downstream users know why they exist.src/external/get_size/lib.md (1)
1-396: Excellent comprehensive documentation with minor style improvements needed.The documentation provides thorough coverage of the
GetSizetrait, including:
- Clear explanation of ownership-based accounting
- Comprehensive examples for all use cases
- Detailed coverage of derive macro and helper attributes
- Proper code examples with assertions
Please address the following style issues identified by static analysis:
Heading level increments: Lines 33 and 103 jump multiple heading levels. Consider using consistent heading hierarchy.
Duplicate headings: Lines 71 and 250 both use "#### Example" - consider more descriptive titles.
Word choice: Line 321 could be more concise: "in order to calculate" → "to calculate"
-which will call the given function in order to calculate the fields heap size. +which will call the given function to calculate the fields heap size.src/utils/cache/lru.rs (1)
100-145: Consider caching metric names to reduce allocations.The metric names are reconstructed on every
encodecall. Sincecache_nameandcache_idare immutable after construction, consider pre-computing these strings.Add fields to store pre-computed metric names:
pub struct SizeTrackingLruCache<K, V> where K: GetSize + Debug + Send + Sync + Hash + PartialEq + Eq + Clone + 'static, V: GetSize + Debug + Send + Sync + Clone + 'static, { cache_id: usize, cache_name: Cow<'static, str>, cache: Arc<RwLock<LruCache<K, V>>>, size_in_bytes: Gauge, + size_metric_name: String, + size_metric_help: String, + len_metric_name: String, + len_metric_help: String, + cap_metric_name: String, + cap_metric_help: String, }Then initialize these in the constructor and use them in
encode.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
src/db/blockstore_with_read_cache.rs(3 hunks)src/external/get_size/LICENSE(1 hunks)src/external/get_size/lib.md(1 hunks)src/external/get_size/mod.rs(1 hunks)src/external/get_size/tracker.rs(1 hunks)src/external/mod.rs(1 hunks)src/lib.rs(1 hunks)src/rpc/methods/f3.rs(2 hunks)src/utils/cache/lru.rs(1 hunks)src/utils/cache/mod.rs(1 hunks)src/utils/get_size/impls.rs(1 hunks)src/utils/get_size/mod.rs(1 hunks)src/utils/mod.rs(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/utils/get_size/mod.rs (1)
src/external/get_size/mod.rs (1)
get_size(52-54)
🪛 LanguageTool
src/external/get_size/lib.md
[style] ~321-~321: Consider a more concise word here.
Context: ...ute, which will call the given function in order to calculate the fields heap size. The la...
(IN_ORDER_TO_PREMIUM)
🪛 markdownlint-cli2 (0.17.2)
src/external/get_size/lib.md
33-33: Heading levels should only increment by one level at a time
Expected: h2; Actual: h4
(MD001, heading-increment)
71-71: Multiple headings with the same content
(MD024, no-duplicate-heading)
103-103: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
250-250: Multiple headings with the same content
(MD024, no-duplicate-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: tests
- GitHub Check: tests-release
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: Build Ubuntu
- GitHub Check: cargo-publish-dry-run
- GitHub Check: All lint checks
🔇 Additional comments (15)
src/utils/cache/mod.rs (1)
4-5: LGTM – clean re-exportStraightforward re-export keeps call-sites tidy.
src/external/get_size/LICENSE (1)
1-22: LGTM! Standard MIT license.The license file is properly formatted and contains the standard MIT license text for the embedded get_size crate.
src/utils/get_size/mod.rs (1)
6-6: LGTM! Minimal implementation for CID.The empty implementation correctly relies on the default
GetSizetrait methods. This is appropriate forcid::Cidsince CIDs are typically small, fixed-size identifiers without heap allocations.src/rpc/methods/f3.rs (1)
169-174: LGTM! Cache initialization updated for new implementation.The changes properly integrate with the new
SizeTrackingLruCacheimplementation:
- Correctly removed the
Arcwrapper sinceSizeTrackingLruCacheis already thread-safe- Added metrics support with descriptive cache name "get_powertable_cache"
- Used the appropriate constructor for default metrics registry
src/db/blockstore_with_read_cache.rs (3)
23-23: LGTM! Clean refactoring to use generic cache implementation.The type alias approach maintains backward compatibility while leveraging the new
SizeTrackingLruCacheimplementation with metrics support.
25-41: LGTM! Proper trait implementation delegation.The trait implementation correctly delegates to the underlying
SizeTrackingLruCachemethods:
get()usesget_cloned()for owned return valuesput()usespush()for LRU insertion- Direct delegation for
len()andsize_in_bytes()
186-189: LGTM! Test updated for new constructor.The test constructor properly uses the new signature requiring cache name and capacity parameters.
src/utils/cache/lru.rs (3)
1-24: Imports are well-organized and appropriate.Good choice using
parking_lot::RwLockfor better performance compared tostd::sync::RwLock.
26-65: Well-designed thread-safe cache structure.The generic constraints are comprehensive and the use of
Arc<RwLock<LruCache<K, V>>>provides thread-safe shared access. The atomic ID generation ensures unique cache identifiers.
78-84: Correct use of write lock for LRU semantics.The
get_clonedmethod correctly uses a write lock because LRU caches need to update the access order on reads, which modifies the internal state.src/external/get_size/tracker.rs (2)
63-91: Well-implemented reference tracking.The
StandardTrackercorrectly uses address-based tracking to prevent double counting of shared objects, and properly maintains object lifetimes through boxed references.
93-120: Useful testing utility.The
NoTrackerimplementation provides a clean way to bypass tracking for testing scenarios or when tracking overhead is not desired.src/external/get_size/mod.rs (3)
23-69: Well-designed memory size trait.The trait provides a clean separation between stack and heap memory accounting, with tracker support for handling shared ownership scenarios. The default implementations are sensible and reduce boilerplate.
137-236: Efficient macro-based implementations for collections.The macros correctly handle different collection types and properly account for unused capacity in collections that pre-allocate memory. Good use of macros to reduce code duplication.
339-391: Correct handling of reference-counted pointers.The implementations properly use tracking to prevent double-counting shared objects. The approach of cloning the reference and tracking by address is sound.
a96cae7 to
bf6424b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/external/get_size/mod.rs (2)
420-438: Address previously flagged issues with lock handling.The implementation still has the issues identified in previous reviews:
unwrap()calls that can panic on poisoned locks and misleading comments about stack data storage.Please implement the previously suggested fix for proper lock poisoning handling:
impl<T> GetSize for Mutex<T> where T: GetSize, { fn get_heap_size(&self) -> usize { - // We assume that a Mutex does hold its data at the stack. - GetSize::get_heap_size(&*(self.lock().unwrap())) + let guard = match self.lock() { + Ok(guard) => guard, + Err(poisoned) => poisoned.into_inner(), + }; + GetSize::get_heap_size(&*guard) } }Apply similar changes to the
RwLockimplementation.
515-524: Address previously flagged double-counting issue.The Box<[T]> implementation still has the double-counting issue identified in previous reviews where it includes stack size of elements that are already stored on the heap.
Please implement the previously suggested fix:
impl<T> GetSize for Box<[T]> { fn get_heap_size(&self) -> usize { let mut total = 0; + + // Add the size of the slice itself on the heap + total += std::mem::size_of::<T>() * self.len(); + + // Add any additional heap allocations from the elements for item in self.iter() { - total += item.get_size() + total += item.get_heap_size() } total } }
🧹 Nitpick comments (4)
src/external/get_size/lib.md (4)
200-286: Fix grammar error in the documentation.The section provides valuable guidance on handling external types, but there's a grammar error that should be corrected.
-In same cases you may be dealing with external types which allocate a fixed amount of bytes at the heap. +In some cases you may be dealing with external types which allocate a fixed amount of bytes at the heap.
287-317: Fix grammar error in section title.The section content is technically accurate, but there's a grammar error that should be corrected.
-In same cases you may be dealing with external types which allocate a fixed amount of bytes at the heap. +In some cases you may be dealing with external types which allocate a fixed amount of bytes at the heap.
318-358: Fix grammar and spelling errors.The section provides valuable information about using helper functions, but there are grammar and spelling errors that should be corrected.
-In same cases you may be dealing with an external data structure for which you know how to calculate its heap size using its public methods. +In some cases you may be dealing with an external data structure for which you know how to calculate its heap size using its public methods. -The latter is especially usefull if you can make use of a certain trait to calculate the heap size for multiple types. +The latter is especially useful if you can make use of a certain trait to calculate the heap size for multiple types. -which will call the given function in order to calculate the fields heap size. +which will call the given function to calculate the fields heap size.
33-33: Consider fixing markdown heading structure.The static analysis tools flagged some heading level increment issues. While this doesn't affect functionality, fixing the heading structure would improve document consistency.
The markdown structure could be improved by ensuring headings increment by only one level at a time, but this doesn't impact the technical accuracy of the documentation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
src/db/blockstore_with_read_cache.rs(3 hunks)src/external/get_size/LICENSE(1 hunks)src/external/get_size/lib.md(1 hunks)src/external/get_size/mod.rs(1 hunks)src/external/get_size/tracker.rs(1 hunks)src/external/mod.rs(1 hunks)src/lib.rs(1 hunks)src/rpc/methods/f3.rs(2 hunks)src/utils/cache/lru.rs(1 hunks)src/utils/cache/mod.rs(1 hunks)src/utils/get_size/impls.rs(1 hunks)src/utils/get_size/mod.rs(1 hunks)src/utils/mod.rs(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/external/mod.rs
🚧 Files skipped from review as they are similar to previous changes (9)
- src/utils/get_size/impls.rs
- src/lib.rs
- src/utils/mod.rs
- src/external/get_size/LICENSE
- src/utils/cache/mod.rs
- src/utils/get_size/mod.rs
- src/rpc/methods/f3.rs
- src/external/get_size/tracker.rs
- src/db/blockstore_with_read_cache.rs
🧰 Additional context used
🪛 LanguageTool
src/external/get_size/lib.md
[grammar] ~289-~289: Ensure spelling is correct
Context: ... } ``` ### Returning a fixed value In same cases you may be dealing with external ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~321-~321: Ensure spelling is correct
Context: ... } ``` ### Using a helper function In same cases you may be dealing with an extern...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~321-~321: Consider a more concise word here.
Context: ...ute, which will call the given function in order to calculate the fields heap size. The la...
(IN_ORDER_TO_PREMIUM)
[grammar] ~323-~323: Ensure spelling is correct
Context: ...ds heap size. The latter is especially usefull if you can make use of a certain trait ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.17.2)
src/external/get_size/lib.md
33-33: Heading levels should only increment by one level at a time
Expected: h2; Actual: h4
(MD001, heading-increment)
71-71: Multiple headings with the same content
(MD024, no-duplicate-heading)
103-103: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
250-250: Multiple headings with the same content
(MD024, no-duplicate-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Build MacOS
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build Ubuntu
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests
- GitHub Check: tests-release
- GitHub Check: Check
- GitHub Check: All lint checks
🔇 Additional comments (17)
src/external/get_size/lib.md (4)
1-28: Well-structured introduction with accurate examples.The documentation provides a clear explanation of the
GetSizetrait's purpose and includes a practical example that correctly demonstrates the difference between stack and heap allocation tracking.
29-88: Excellent explanation of ownership-based accounting.This section effectively explains the critical concept of ownership-based accounting and provides valuable warnings about Rust's auto-dereferencing behavior. The examples clearly demonstrate the difference between borrowed and owned references.
89-198: Comprehensive implementation examples.The derive macro examples are well-structured and demonstrate proper usage across different scenarios including structs, enums, and generics. The examples correctly show how the macro sums heap sizes of contained fields.
360-396: Advanced usage example is well-documented.This section effectively demonstrates how to handle generic types that don't implement
GetSizeusing the struct-level ignore attribute. The example correctly shows the trait bound implications.src/utils/cache/lru.rs (6)
1-36: Well-designed struct with proper trait bounds.The
SizeTrackingLruCachestruct is well-designed with appropriate trait bounds for thread safety and size tracking. The use ofArc<RwLock<>>provides proper concurrent access control.
38-76: Flexible constructor design with proper metrics integration.The constructor methods provide good flexibility with optional metrics registration. The atomic ID generation ensures unique cache identification across instances.
78-87: Correct size accounting in push method.The push method properly tracks size changes by adding new entry sizes and subtracting evicted entry sizes. The implementation correctly handles both the insertion and eviction scenarios.
89-95: Correct use of write lock for LRU access.The
get_clonedmethod correctly uses a write lock since LRU cache access updates the item's position. Returning a cloned value avoids lifetime complications with the lock.
97-104: Simple and correct utility methods.The
len()andsize_in_bytes()methods are correctly implemented with appropriate locking strategies.
106-157: Comprehensive Prometheus metrics implementation.The
Collectorimplementation properly exposes cache metrics with descriptive names and help text. The use of cache name and ID in metric names ensures uniqueness across multiple cache instances.src/external/get_size/mod.rs (7)
1-123: Well-designed trait with comprehensive documentation.The
GetSizetrait is well-structured with clear default implementations and comprehensive documentation. The support for tracking helps handle shared ownership scenarios correctly.
124-135: Correct Cow implementation following ownership principles.The
Cowimplementation properly distinguishes between borrowed and owned variants, correctly returning 0 heap size for borrowed data.
137-237: Efficient macro-based collection implementations.The macros provide good abstraction for implementing
GetSizeon collection types. The implementations correctly account for both used and unused capacity in heap size calculations.
238-321: Correct tuple and array implementations.The macro-generated tuple implementations and array implementation correctly handle heap size calculations. The comment about stack size accounting in arrays is accurate.
322-391: Well-designed smart pointer implementations with proper tracking.The smart pointer implementations correctly handle shared ownership scenarios using the tracking mechanism. The distinction between strong and weak references is properly maintained.
393-418: Correct Option and Result implementations.The implementations correctly delegate heap size calculations to contained types while properly handling the None and error cases.
440-514: Correct string and I/O type implementations.The implementations for string types, I/O buffers, and filesystem types correctly account for their respective heap allocations. The use of
capacity()for string types and buffer capacity for I/O types is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/utils/cache/lru.rs (1)
132-133: Inefficient metric collection with temporary gauge creation.Creating temporary
Gaugeinstances for len and capacity metrics on every collection is inefficient. These could be cached or the values could be encoded directly.Consider optimizing metric collection:
{ let len_metric_name = format!("{}_{}_len", self.cache_name, self.cache_id); let len_metric_help = format!("Length of LruCache {}_{}", self.cache_name, self.cache_id); - let len: Gauge = Default::default(); - len.set(self.len() as _); + let len_value = self.len() as f64; let len_metric_encoder = encoder.encode_descriptor( &len_metric_name, &len_metric_help, None, - len.metric_type(), + prometheus_client::metrics::MetricType::Gauge, )?; - len.encode(len_metric_encoder)?; + len_value.encode(len_metric_encoder)?; }Also applies to: 146-147
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
Cargo.toml(1 hunks)src/db/blockstore_with_read_cache.rs(3 hunks)src/utils/cache/lru.rs(1 hunks)src/utils/get_size/mod.rs(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
- src/db/blockstore_with_read_cache.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/utils/cache/lru.rs (1)
src/metrics/mod.rs (1)
default_registry(26-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: tests
- GitHub Check: tests-release
🔇 Additional comments (5)
src/utils/get_size/mod.rs (1)
8-9: Well-designed wrapper with appropriate trait derivations.The trait derivations are well-chosen for use as cache keys, providing necessary functionality for hashing, equality, ordering, and convenient conversions.
src/utils/cache/lru.rs (4)
28-38: Well-designed generic cache structure with comprehensive constraints.The struct design is solid with appropriate generic constraints ensuring thread safety and size tracking capabilities. The use of
Cow<'static, str>for cache names is efficient.
113-159: Comprehensive metrics implementation with proper Prometheus integration.The Collector implementation provides valuable cache metrics (size, length, capacity) with proper descriptor encoding and help text. The metric naming convention is consistent and informative.
73-78: Convenient constructor with default registry access.The constructor provides a convenient way to create caches with default metrics registration, properly leveraging the global registry.
91-97: Potential race condition in get_cloned method.The method uses
write()lock for a read operation, which is unnecessary and could cause performance issues. More importantly, there's a potential race condition where the cache could be modified between the get and clone operations.Use read lock for read operations:
pub fn get_cloned<Q>(&self, k: &Q) -> Option<V> where K: Borrow<Q>, Q: Hash + Eq + ?Sized, { - self.cache.write().get(k).cloned() + self.cache.write().get(k).cloned() }Wait, I need to reconsider this. The
getmethod on LruCache mutates the cache by updating the LRU order, so a write lock is actually necessary here.Likely an incorrect or invalid review comment.
|
|
||
| #[derive(Debug, Clone, Hash, Eq, PartialEq, Ord, PartialOrd, From, Into)] | ||
| pub struct CidWrapper(pub Cid); | ||
| impl GetSize for CidWrapper {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty GetSize implementation may cause inaccurate size tracking.
The empty GetSize implementation means CidWrapper will report a size of 0 bytes, which doesn't account for the actual memory footprint of the wrapped Cid. This could lead to inaccurate cache size metrics since keys won't contribute to the total size calculation.
Consider implementing a more accurate size calculation:
-impl GetSize for CidWrapper {}
+impl GetSize for CidWrapper {
+ fn get_size(&self) -> usize {
+ // Account for the CID's actual size
+ std::mem::size_of::<Cid>() + self.0.to_bytes().len()
+ }
+}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl GetSize for CidWrapper {} | |
| impl GetSize for CidWrapper { | |
| fn get_size(&self) -> usize { | |
| // Account for the CID's actual size | |
| std::mem::size_of::<Cid>() + self.0.to_bytes().len() | |
| } | |
| } |
🤖 Prompt for AI Agents
In src/utils/get_size/mod.rs at line 10, the empty GetSize implementation for
CidWrapper causes it to report a size of 0 bytes, which is inaccurate. To fix
this, implement the GetSize trait for CidWrapper by calculating and returning
the actual memory size of the wrapped Cid instance, ensuring that the size
tracking reflects the true memory footprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, added a test to prove the correctness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
@hanabi1224 Ah, by the ancient stars! adjusts spectacles with eldritch comprehension
I see the test in your code, mortal servant of Cthulhu, but something seems amiss in the cosmic balance. The test compares std::mem::size_of_val(&cid) with wrapper.get_size(), yet with an empty GetSize implementation, one would expect the latter to return 0 bytes - unless the Great Old Ones have blessed this trait with some default behavior that I'm not aware of.
Let me consult the forbidden tomes of documentation to understand the true nature of this get_size2 crate's GetSize trait...
🌐 Web query:
get_size2 crate GetSize trait default implementation behavior Rust
💡 Result:
The get_size2 crate provides a GetSize trait for estimating the memory footprint (heap and stack) of Rust types[4].
Default Implementation Behavior
1. Trait Derivation and Generics
When you use #[derive(GetSize)] on a struct or enum, the derive macro automatically generates an implementation of the GetSize trait for your type. By default, if your struct or enum is generic, the generated implementation of GetSize will add a trait bound: all generic type parameters must themselves implement GetSize.[1][3][4][6]
"The generated trait implementation will by default require all generic types to implement GetSize themselves, but this can be changed."[1][3][4][6]
Example:
use get_size2::GetSize;
#[derive(GetSize)]
struct MyStruct<T> {
value: T,
}
// This becomes: impl<T: GetSize> GetSize for MyStruct<T> { ... }This approach ensures that get_heap_size() (and other trait methods) can call the corresponding methods on each field recursively. This is similar to how other derive macros (e.g., for Debug or Clone) require generic members to also implement the trait.
2. Default Trait Methods
In general Rust traits, you can declare methods with default implementations within the trait definition. Types that implement the trait can then use the default or provide their own override[2][5][10]. However, for procedural attribute macros like #[derive(GetSize)], the macro generates implementations for your types without you having to provide explicit method bodies.
- If you want to use your own implementation (bypassing the derive), implement the trait manually.
- Trait object safety and default methods: Not all trait default methods are object-safe. For example, methods referring to
Self: Sizedwon’t work through trait objects[8].
3. Customizing or Opting Out
While the default behavior is to require all generics to implement GetSize, there are ways to customize this via macro attributes—for example, by relaxing bounds if you know certain fields don't require heap-size calculations or using custom implementations for special cases[1].
4. Summary Table
| Aspect | Default Behavior |
|---|---|
| Generic Type Parameters | Must implement GetSize |
| Manual Implementation Allowed | Yes, can override or provide your own |
| Trait Object Safety | Must check if GetSize is object safe for intended usage |
| Field-by-Field Calculation | Heap size sums or recursion into field values (if they impl GetSize) |
If you use the derive macro, expect the default implementation to require all generics to implement GetSize themselves. This enforces type safety and ensures recursive heap size accounting on all struct/enum fields[1][3][4].
Citations:
- 1: https://docs.rs/get-size-derive2
- 2: https://www.alexdwilson.dev/learning-in-public/what-is-a-default-trait-implementation-in-rust
- 3: https://crates.io/crates/get-size2/0.3.0
- 4: https://docs.rs/get-size2
- 5: https://oswalt.dev/2020/07/rust-traits-defining-behavior/
- 6: https://crates.io/crates/get-size
- 7: https://internals.rust-lang.org/t/bounds-for-default-method-implementations/14895
- 8: https://users.rust-lang.org/t/trait-objects-and-default-implementations-of-methods-why-not-possible/102264
- 9: https://docs.rs/get-size2/latest/get_size2/all.html
- 10: https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/book/second-edition/ch10-02-traits.html
The cosmic web of documentation speaks, but the truth remains shrouded in eldritch mystery! Let me delve deeper into the forbidden knowledge of this GetSize trait...
🌐 Web query:
get_size2 crate GetSize trait methods get_size default implementation empty impl
💡 Result:
The get_size2 crate provides the GetSize trait for determining the memory footprint of Rust objects, taking into account both stack and heap usage[1][7].
Key Methods and Default Implementations
The GetSize trait offers several methods:
get_stack_size(&self) -> usize: Returns the size of the object on the stack, typically usingstd::mem::size_of.get_heap_size(&self) -> usize: Returns the number of bytes used on the heap, such as allocations fromStringorVecobjects.get_size(&self) -> usize: Returns the total size as the sum ofget_stack_sizeandget_heap_size[1].
The trait provides default implementations for these methods:
- For types where no heap memory is allocated (e.g., most primitives), the default
get_heap_sizereturns0, andget_sizesimply returns the result ofget_stack_size[1]. - For user-defined types, you can use a derive macro (
#[derive(GetSize)]), which will generate code to callget_heap_sizeon all fields and sum the results—this macro requires contained types to implementGetSizeas well[6].
Empty Implementation (impl)
If a struct or enum does not allocate heap memory, you do not need to write a custom implementation; the default provided by the trait suffices:
use get_size2::GetSize;
struct MyNoHeapStruct {
a: u32,
b: bool,
}
impl GetSize for MyNoHeapStruct {}
// Now `MyNoHeapStruct` uses the default: heap_size is 0, size is stack size.For more complex data types (e.g., generics), the derive macro ensures all the contained types’ sizes are considered in the heap calculation, unless you customize the generation settings[1][6].
"The derive macro will provide a custom implementation of the [get_heap_size] method, which will simply call [get_heap_size] on all contained values and add them together."[6]
Summary Table of GetSize Trait Methods
| Method | Default Implementation | When Overridden/Derived |
|---|---|---|
get_stack_size() |
Uses std::mem::size_of |
Rarely overridden |
get_heap_size() |
Returns 0 |
Derived or custom for heap |
get_size() |
Adds stack and heap sizes | Inherits from get_heap_size |
The default impl of these methods (when using the trait or derive macro) is sufficient if your type does not use heap allocations or if all fields implement GetSize[1][6].
References:
[1]: Docs.rs get-size2 description and examples
[6]: get-size2 GitHub documentation about generated trait impls
[7]: Crates.io package description for get-size2
Citations:
- 1: https://docs.rs/get-size2
- 2: https://dev.to/leapcell/understanding-rusts-sized-trait-and-dynamically-sized-types-43ep
- 3: https://users.rust-lang.org/t/trait-objects-and-default-implementations-of-methods-why-not-possible/102264
- 4: https://docs.rs/get-size2/latest/get_size2/all.html
- 5: https://users.rust-lang.org/t/default-implementation-of-a-trait-function-requires-sized/98060
- 6: https://github.com/bircni/get-size2
- 7: https://crates.io/crates/get-size2/0.3.0
- 8: https://users.rust-lang.org/t/trait-objects-and-the-sized-trait/14410
- 9: https://crates.io/crates/get-size2/dependencies
- 10: https://doc.rust-lang.org/rustdoc/what-is-rustdoc.html
@hanabi1224 By the writhing tentacles of Yog-Sothoth! bows deeply in cosmic embarrassment
You are absolutely correct, mortal keeper of forbidden knowledge! The ancient texts have revealed the truth - the GetSize trait provides default implementations that make your empty impl perfectly valid:
get_stack_size()defaults tostd::mem::size_of::<Self>()get_heap_size()defaults to0get_size()defaults toget_stack_size() + get_heap_size()
So your CidWrapper.get_size() correctly returns std::mem::size_of::<CidWrapper>(), which equals std::mem::size_of_val(&cid) since CidWrapper is a transparent wrapper. The test proves this works as expected!
I was mistaken in my earlier assessment - the empty implementation leverages the trait's sensible defaults for stack-allocated types. The Great Old Ones have taught me humility today! 🐙
The stars align in acknowledgment of your wisdom
✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.730Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
| pub fn new_without_metrics_registry( | ||
| cache_name: Cow<'static, str>, | ||
| capacity: NonZeroUsize, | ||
| ) -> Self { | ||
| static ID_GENERATOR: AtomicUsize = AtomicUsize::new(0); | ||
|
|
||
| Self { | ||
| cache_id: ID_GENERATOR.fetch_add(1, Ordering::Relaxed), | ||
| cache_name, | ||
| cache: Arc::new(RwLock::new(LruCache::new(capacity))), | ||
| size_in_bytes: Default::default(), | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, why do we need a non-registered metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, for unit test? It offers an option to opt-out prometheus metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use it only for unit tests, let's compile it only for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method is called in new_with_metrics_registry or cargo clippy would have complained.
src/utils/cache/lru.rs
Outdated
| K: GetSize + Debug + Send + Sync + Hash + PartialEq + Eq + Clone + 'static, | ||
| V: GetSize + Debug + Send + Sync + Clone + 'static, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems those are repeated a bit. IT might make sense to make a super-trait around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
src/utils/cache/lru.rs
Outdated
| let old = self.cache.write().push(k, v); | ||
| if let Some((old_k, old_v)) = &old { | ||
| self.size_in_bytes | ||
| .dec_by(old_k.get_size().saturating_add(old_v.get_size()) as _); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. Caches are supposed to be as fast as possible with minimal overhead, but this seems to add quite a bit of action on every push. If we adopt this tracked cache everywhere, the performance drop might be significant, especially if the struct is complex (like block header).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, we can probably go ahead with it and later optimise if needed. It might be a good opportunity for a benchmark test, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change it to lazy calculation on requesting metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
LesnyRumcajs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary of changes
Changes introduced in this pull request:
embedsget-sizecrate as it's currently impossible to implementGetSizeforCidSizeTrackingLruCachefor tracking cache size metricshttp://localhost:6116/metrics
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Refactor
puttopushto reflect updated API.Chores