Skip to content

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Dec 4, 2025

Summary of changes

Fix/mitigate #5633 and #6149

Changes introduced in this pull request:

  • add eth_block_cache to improve performance of a few Eth RPC method
  • add cache for retrieving fn heaveist_tipset

Calibnet

# `eth_getBlockByNumber` @ main
    HTTP
    http_req_duration..............: avg=2.34ms min=22.36µs  med=2.33ms max=229.18ms p(90)=3.28ms p(95)=3.69ms
      { expected_response:true }...: avg=2.34ms min=22.36µs  med=2.33ms max=229.18ms p(90)=3.28ms p(95)=3.69ms
    http_req_failed................: 0.00%  0 out of 233885
    http_reqs......................: 233885 8271.839525/s

# `eth_getBlockByNumber` @ pr
    HTTP
    http_req_duration..............: avg=1.44ms min=339.28µs med=1.34ms max=13.35ms p(90)=2.34ms p(95)=2.77ms
      { expected_response:true }...: avg=1.44ms min=339.28µs med=1.34ms max=13.35ms p(90)=2.34ms p(95)=2.77ms
    http_req_failed................: 0.00%  0 out of 359372
    http_reqs......................: 359372 11978.279043/s

# `eth_getBlockByNumber` @ lotus
 HTTP
    http_req_duration..............: avg=1.41ms min=-1800252041ns med=1.21ms max=39.97ms p(90)=2.24ms p(95)=2.83ms
      { expected_response:true }...: avg=1.41ms min=-1800252041ns med=1.21ms max=39.97ms p(90)=2.24ms p(95)=2.83ms
    http_req_failed................: 0.00%  0 out of 358954
    http_reqs......................: 358954 12729.172151/s
# `eth_getBlockByHash` @ main
    HTTP
    http_req_duration..............: avg=5.94ms min=34.66µs med=5.63ms max=22.56ms p(90)=7.56ms p(95)=8.43ms
      { expected_response:true }...: avg=5.94ms min=34.66µs med=5.63ms max=22.56ms p(90)=7.56ms p(95)=8.43ms
    http_req_failed................: 0.00%  0 out of 94896
    http_reqs......................: 94896  3309.52354/s

# `eth_getBlockByHash` @ pr
    HTTP
    http_req_duration..............: avg=2.01ms min=-1232130651ns med=1.97ms max=26.88ms p(90)=3.25ms p(95)=3.79ms
      { expected_response:true }...: avg=2.01ms min=-1232130651ns med=1.97ms max=26.88ms p(90)=3.25ms p(95)=3.79ms
    http_req_failed................: 0.00%  0 out of 264833
    http_reqs......................: 264833 9205.819417/s

# `eth_getBlockByHash` @ lotus
    HTTP
    http_req_duration..............: avg=1.33ms min=-1325814894ns med=1.21ms max=29.8ms  p(90)=1.93ms p(95)=2.27ms
      { expected_response:true }...: avg=1.33ms min=-1325814894ns med=1.21ms max=29.8ms  p(90)=1.93ms p(95)=2.27ms
    http_req_failed................: 0.00%  0 out of 354996
    http_reqs......................: 354996 12380.031203/s

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Added FOREST_ETH_BLOCK_CACHE_SIZE environment variable (default: 500) to configure Ethereum block cache size.
    • Introduced block caching mechanisms to improve performance and reduce redundant computations.
  • Documentation

    • Updated dictionary with new entry.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

This PR introduces caching optimizations and cleaner type handling across the codebase. Key changes include: a per-block ETH cache in the RPC layer to improve block fetch performance; a heaviest-tipset in-memory cache in ChainStore with RwLock for concurrent reads; GetSize trait implementations for Ethereum types to enable heap-size accounting; systematic refactoring to use the nonzero! macro instead of verbose NonZeroUsize construction; and a new environment variable for configuring the ETH block cache size.

Changes

Cohort / File(s) Summary
Documentation
docs/dictionary.txt, docs/docs/users/reference/env_variables.md
Added dictionary entry "Eth" and new environment variable FOREST_ETH_BLOCK_CACHE_SIZE (default 500).
NonZero Macro Refactoring
src/beacon/drand.rs, src/cli/subcommands/chain_cmd/list.rs, src/libp2p/service.rs, src/state_migration/common/mod.rs, src/state_migration/common/state_migration.rs, src/tool/subcommands/state_compute_cmd.rs
Replaced verbose NonZeroUsize::new(...).expect() / .unwrap() patterns with the nonzero! macro for consistency and readability.
ETH Block Caching & Size-Tracking
src/rpc/methods/eth.rs, src/rpc/methods/eth/types.rs
Introduced static ETH_BLOCK_CACHE (SizeTrackingLruCache) for per-block caching; implemented GetSize trait for EthBigInt, Nonce, Bloom, EthAddress, EthHash, and derived GetSize on public types (EthUint64, Block, ApiEthTx, Transactions, EthBytes).
ChainStore Caching & Improvements
src/chain/store/chain_store.rs
Added heaviest_tipset_cache: Arc<RwLock<Option<Tipset>>> for thread-safe in-memory caching; updated set_heaviest_tipset and heaviest_tipset methods to use cache; migrated MsgsInTipsetCache to use nonzero! macro.
Cleanup
src/shim/message.rs
Removed commented-out GetSize impl block for Message.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring extra attention:

  • Cache coherence in src/rpc/methods/eth.rs: Verify that the ETH_BLOCK_CACHE logic correctly handles cache hits/misses and that the cached Block structure (including transactions) matches expected semantics for full vs. hash transaction inclusion.
  • Thread-safety of heaviest_tipset_cache in src/chain/store/chain_store.rs: Ensure RwLock usage is sound, check for potential deadlocks or TOCTOU issues between cache read and fallback-to-provider logic.
  • GetSize implementations: Confirm that size-tracking for EthBigInt, Nonce, Bloom, and other types correctly accounts for heap allocations (e.g., verify that returning 0 for fixed-size types is intentional).

Possibly related PRs

Suggested labels

RPC, performance, eth

Suggested reviewers

  • elmattic
  • sudo-shashank
  • LesnyRumcajs

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning Multiple changes introduce the nonzero! macro across unrelated modules (chain_cmd/list.rs, drand.rs, libp2p/service.rs, state_migration files, state_compute_cmd.rs) which appear to be refactoring outside the stated caching objectives. Isolate caching-related changes into focused commits. Move nonzero! macro refactoring into a separate preparatory PR for consistency and maintainability.
Docstring Coverage ⚠️ Warning Docstring coverage is 26.32% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the main objective: adding eth_block_cache to improve Eth RPC method performance, which aligns with the substantial caching implementation across multiple files.
Linked Issues check ✅ Passed The changes successfully implement eth_block_cache (src/rpc/methods/eth.rs, src/rpc/methods/eth/types.rs) and heaviest_tipset_cache (src/chain/store/chain_store.rs) to address performance issues in eth_getBlockByNumber and related methods outlined in issue #5633.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/eth_block_cache

Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 marked this pull request as ready for review December 4, 2025 12:49
@hanabi1224 hanabi1224 requested a review from a team as a code owner December 4, 2025 12:49
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and sudo-shashank and removed request for a team December 4, 2025 12:49
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/chain/store/chain_store.rs (1)

227-238: Consider populating the cache on miss.

The fallback path (lines 231-237) loads the tipset from the provider but doesn't update the cache. This means repeated calls before the first set_heaviest_tipset will keep hitting the provider.

For a frequently called method, consider populating the cache on miss:

     pub fn heaviest_tipset(&self) -> Tipset {
         if let Some(ts) = &*self.heaviest_tipset_cache.read() {
             return ts.clone();
         }
         let tsk = self
             .heaviest_tipset_key_provider
             .heaviest_tipset_key()
             .unwrap_or_else(|_| TipsetKey::from(nunny::vec![*self.genesis_block_header.cid()]));
-        self.chain_index
+        let ts = self.chain_index
             .load_required_tipset(&tsk)
-            .expect("failed to load heaviest tipset")
+            .expect("failed to load heaviest tipset");
+        *self.heaviest_tipset_cache.write() = Some(ts.clone());
+        ts
     }

This would eliminate redundant loads during startup or after restarts.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c7718e and a758707.

📒 Files selected for processing (12)
  • docs/dictionary.txt (1 hunks)
  • docs/docs/users/reference/env_variables.md (1 hunks)
  • src/beacon/drand.rs (3 hunks)
  • src/chain/store/chain_store.rs (6 hunks)
  • src/cli/subcommands/chain_cmd/list.rs (2 hunks)
  • src/libp2p/service.rs (2 hunks)
  • src/rpc/methods/eth.rs (9 hunks)
  • src/rpc/methods/eth/types.rs (4 hunks)
  • src/shim/message.rs (0 hunks)
  • src/state_migration/common/mod.rs (1 hunks)
  • src/state_migration/common/state_migration.rs (2 hunks)
  • src/tool/subcommands/state_compute_cmd.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • src/shim/message.rs
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6057
File: src/cli/subcommands/f3_cmd.rs:0-0
Timestamp: 2025-09-09T10:37:17.947Z
Learning: hanabi1224 prefers having default timeouts (like 10m for --no-progress-timeout) to prevent commands from hanging indefinitely, even when the timeout flag isn't explicitly provided by users. This fail-fast approach is preferred over requiring explicit flag usage.
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

Applied to files:

  • src/rpc/methods/eth/types.rs
  • src/rpc/methods/eth.rs
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_migration/common/state_migration.rs
  • src/state_migration/common/mod.rs
📚 Learning: 2025-08-18T03:09:47.932Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.

Applied to files:

  • src/chain/store/chain_store.rs
📚 Learning: 2025-10-17T14:24:47.046Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.

Applied to files:

  • src/chain/store/chain_store.rs
📚 Learning: 2025-08-08T12:10:45.218Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:553-558
Timestamp: 2025-08-08T12:10:45.218Z
Learning: Forest project targets Rust stable >=1.89; features stabilized in 1.88 like let-chains are acceptable in this codebase.

Applied to files:

  • src/chain/store/chain_store.rs
🧬 Code graph analysis (4)
src/rpc/methods/eth/types.rs (3)
src/rpc/methods/eth.rs (3)
  • get_heap_size (141-143)
  • get_heap_size (169-171)
  • get_heap_size (183-185)
src/shim/econ.rs (1)
  • get_heap_size (40-42)
src/shim/executor.rs (2)
  • get_heap_size (138-144)
  • get_heap_size (348-357)
src/state_migration/common/state_migration.rs (2)
src/beacon/drand.rs (1)
  • new (245-263)
src/state_migration/common/mod.rs (1)
  • new (32-39)
src/chain/store/chain_store.rs (1)
src/chain_sync/bad_block_cache.rs (1)
  • default (20-22)
src/rpc/methods/eth.rs (3)
src/utils/cache/lru.rs (2)
  • cache (99-101)
  • new_with_metrics (83-87)
src/utils/get_size/mod.rs (1)
  • big_int_heap_size_helper (56-58)
src/rpc/methods/eth/types.rs (5)
  • get_heap_size (123-125)
  • get_heap_size (407-409)
  • new (73-83)
  • new (423-428)
  • new (435-437)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Bootstrap checks - Lotus
  • GitHub Check: Devnet checks
  • GitHub Check: Diff snapshot export checks
  • GitHub Check: V2 snapshot export checks
  • GitHub Check: Wallet tests
  • GitHub Check: Calibnet eth mapping check
  • GitHub Check: Forest CLI checks
  • GitHub Check: Calibnet check
🔇 Additional comments (23)
docs/dictionary.txt (1)

31-31: LGTM!

The addition of "Eth" to the dictionary is correctly placed alphabetically and aligns with the Ethereum-related changes in this PR.

docs/docs/users/reference/env_variables.md (1)

54-54: LGTM!

The documentation for FOREST_ETH_BLOCK_CACHE_SIZE is clear and follows the established format. The default value of 500 is reasonable for a block cache.

src/libp2p/service.rs (1)

38-38: LGTM!

The refactoring to use nonzero!(20usize) is cleaner and more concise than the previous manual construction. The macro provides compile-time guarantees while maintaining identical runtime behavior.

Also applies to: 210-210

src/state_migration/common/mod.rs (1)

136-136: LGTM!

The test code refactoring to use the nonzero! macro improves readability while maintaining identical test behavior.

Also applies to: 140-140

src/tool/subcommands/state_compute_cmd.rs (1)

18-18: LGTM!

The refactoring of the default value to use nonzero!(1usize) is cleaner and more maintainable while preserving the same behavior.

Also applies to: 120-120

src/cli/subcommands/chain_cmd/list.rs (1)

8-8: LGTM!

The refactoring to use nonzero!(30usize) for the default count value is cleaner and aligns with the project-wide refactoring effort.

Also applies to: 29-29

src/state_migration/common/state_migration.rs (2)

7-11: LGTM!

The consolidated imports and addition of the nonzero macro import improve code organization and align with the refactoring effort.

Also applies to: 16-16


82-82: LGTM!

The refactoring to use nonzero!(10_000usize) is more concise and eliminates the need for the .expect("infallible") call while maintaining the same cache size.

src/beacon/drand.rs (1)

22-22: LGTM!

The refactoring to make CACHE_SIZE a NonZeroUsize constant using the nonzero! macro improves type safety and eliminates the need for runtime wrapping. This is a clean improvement to the code.

Also applies to: 247-247, 260-260

src/chain/store/chain_store.rs (4)

34-35: LGTM on imports.

The nonzero_ext::nonzero macro and RwLock imports are appropriate for the new caching functionality.


68-70: LGTM on cache field and initialization.

Using Arc<RwLock<Option<Tipset>>> is appropriate for a read-heavy cache with occasional writes. The cache will warm up naturally on first access to heaviest_tipset().

Also applies to: 138-138


150-158: LGTM on set_heaviest_tipset update.

The cache is correctly updated after persisting to the provider. The ordering ensures durability before caching.


610-617: LGTM on nonzero! macro usage.

Clean usage of the nonzero! macro for compile-time non-zero guarantees, consistent with the pattern used elsewhere in the codebase.

src/rpc/methods/eth/types.rs (3)

8-8: LGTM on GetSize for EthBytes.

Deriving GetSize on EthBytes correctly accounts for the Vec<u8> heap allocation. The derive macro will properly track the underlying vector's heap size.

Also applies to: 20-38


122-126: LGTM on GetSize for EthAddress.

Returning 0 for heap size is correct since ethereum_types::Address (H160) is a fixed-size [u8; 20] stored inline with no heap allocation. Based on learnings about the GetSize trait's default implementations.


406-410: LGTM on GetSize for EthHash.

Returning 0 for heap size is correct since ethereum_types::H256 is a fixed-size [u8; 32] stored inline with no heap allocation.

src/rpc/methods/eth.rs (7)

56-56: LGTM on new imports.

The imports for SizeTrackingLruCache, GetSize, NonZeroUsize, and nonzero are appropriately added to support the new caching functionality.

Also applies to: 59-59, 69-69, 72-72, 76-76


140-144: LGTM on GetSize for EthBigInt.

Using big_int_heap_size_helper correctly accounts for the BigInt's heap allocation, consistent with the pattern used in src/shim/econ.rs.


168-186: LGTM on GetSize for Nonce and Bloom.

Both types wrap fixed-size arrays (H64 is 8 bytes, ethereum_types::Bloom is 256 bytes) with no heap allocation, so returning 0 for heap size is correct.


206-206: LGTM on GetSize derives.

Deriving GetSize on EthUint64, Transactions, Block, and ApiEthTx enables accurate heap size tracking for the SizeTrackingLruCache. The derive macro will correctly propagate size calculations through nested types.

Also applies to: 474-474, 506-506, 551-551


1417-1425: LGTM on static cache initialization.

The LazyLock pattern ensures thread-safe single initialization. The 500-block default is reasonable, and the FOREST_ETH_BLOCK_CACHE_SIZE environment variable follows the project's pattern for configurable cache sizes.


1427-1487: Solid caching implementation with full transaction storage.

The approach of always caching blocks with Transactions::Full and converting to Transactions::Hash on-the-fly is a good design choice - it maximizes cache utility by storing the richer representation.

One minor note: the block is cloned from cache and then potentially mutated (line 1483). This is correct since get_cloned returns an owned copy, but worth noting for clarity.


1482-1485: LGTM on transaction format conversion.

The let-chain pattern is idiomatic and efficient. Converting from Transactions::Full to Transactions::Hash on-demand avoids duplicating cache entries.

@hanabi1224 hanabi1224 added the RPC requires calibnet RPC checks to run on CI label Dec 4, 2025
let (state_root, msgs_and_receipts) = execute_tipset(&data, &tipset).await?;
static ETH_BLOCK_CACHE: LazyLock<SizeTrackingLruCache<CidWrapper, Block>> =
LazyLock::new(|| {
const DEFAULT_CACHE_SIZE: NonZeroUsize = nonzero!(500usize);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the expected memory footprint of a maxed-out eth block cache?

Copy link
Contributor Author

@hanabi1224 hanabi1224 Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the metrics, a rough estimation would be 265108 * 500 / 3 = 44184667 ~= 42MiB

# HELP cache_eth_block_14_size_bytes Size of LruCache eth_block_14 in bytes
# TYPE cache_eth_block_14_size_bytes gauge
# UNIT cache_eth_block_14_size_bytes bytes
cache_eth_block_14_size_bytes 265108
# HELP eth_block_14_len Length of LruCache eth_block_14
# TYPE eth_block_14_len gauge
eth_block_14_len 3
# HELP eth_block_14_cap Capacity of LruCache eth_block_14
# TYPE eth_block_14_cap gauge
eth_block_14_cap 500

@LesnyRumcajs LesnyRumcajs added this pull request to the merge queue Dec 5, 2025
Merged via the queue into main with commit 979ce6f Dec 5, 2025
76 of 82 checks passed
@LesnyRumcajs LesnyRumcajs deleted the hm/eth_block_cache branch December 5, 2025 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants