feat: forest-tool state compute #6167

Conversation
Walkthrough
Adds …

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant CLI as forest-tool CLI
    participant StateCmd as StateCommand::Compute
    participant DB as Read-only DB (ManyCar/ParityDb)
    participant ChainStore as ChainStore
    participant SM as StateManager
    participant CAR as CAR Export
    User->>CLI: forest-tool state compute --epoch N --chain <network> [--db <path>] [--export-db-to <path>]
    CLI->>StateCmd: run()
    StateCmd->>DB: load read-only snapshot (load_db)
    StateCmd->>ChainStore: build ChainStore (with genesis header)
    StateCmd->>ChainStore: resolve tipset at epoch N
    StateCmd->>SM: create StateManager
    StateCmd->>SM: compute tipset state
    SM-->>StateCmd: state_root, receipt_root
    StateCmd->>CAR: optionally export CAR snapshot
    StateCmd->>User: print epoch, state_root, receipt_root, snapshot size
```
```mermaid
sequenceDiagram
    actor User
    participant CLI as forest-tool CLI
    participant ReplayCmd as StateCommand::ReplayCompute
    participant AnyCar as AnyCar (load snapshot)
    participant MemDB as In-memory DB (ManyCar<MemoryDB>)
    participant ChainStore as ChainStore
    participant SM as StateManager
    User->>CLI: forest-tool state replay-compute --snapshot <path> --chain <network> -n <count>
    CLI->>ReplayCmd: run()
    ReplayCmd->>AnyCar: load CAR snapshot
    AnyCar-->>ReplayCmd: heaviest tipset & epoch
    ReplayCmd->>MemDB: build in-memory DB from snapshot
    ReplayCmd->>ChainStore: build ChainStore (with genesis header)
    loop n times
        ReplayCmd->>SM: compute tipset state
        SM-->>ReplayCmd: state_root, receipt_root, duration
        ReplayCmd->>User: print epoch, state_root, receipt_root, duration
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Actionable comments posted: 2
🧹 Nitpick comments (4)
src/db/car/any.rs (1)

141-146: LGTM with an optional idiom note. The implementation correctly delegates to the existing `TryFrom<&Path>` via `as_path()`. This provides convenient conversion for callers holding `&PathBuf`.

Note: Rust API guidelines typically favor accepting `&Path` over `&PathBuf`, since `PathBuf` derefs to `Path`, making `&Path` more flexible. However, as a trait implementation (rather than a function parameter), this may simplify specific call sites. If this isn't used frequently, consider whether callers could use `.as_ref()` or `.as_path()` with the existing `TryFrom<&Path>` instead.
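As a standalone illustration of the delegation pattern under review (the type and error here are stand-ins, not Forest's actual API):

```rust
use std::path::{Path, PathBuf};

// Stand-in for the real CAR reader type.
struct MyCar;

impl TryFrom<&Path> for MyCar {
    type Error = std::io::Error;
    fn try_from(_path: &Path) -> Result<Self, Self::Error> {
        // The real implementation would open and validate the file here.
        Ok(MyCar)
    }
}

// Convenience impl for callers holding `&PathBuf`: delegate via `as_path()`.
impl TryFrom<&PathBuf> for MyCar {
    type Error = std::io::Error;
    fn try_from(path: &PathBuf) -> Result<Self, Self::Error> {
        MyCar::try_from(path.as_path())
    }
}

fn main() -> std::io::Result<()> {
    let path = PathBuf::from("snapshot.forest.car.zst");
    let _car = MyCar::try_from(&path)?; // no explicit .as_path() needed
    Ok(())
}
```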
src/tool/subcommands/state_compute_cmd.rs (3)

81-81: Avoid unwraps on fallible constructors; add context. Replace `unwrap()` with `?` plus `anyhow::Context` for clearer errors.

```diff
+use anyhow::Context as _;
 ...
-    )
-    .unwrap(),
+    )
+    .context("creating ChainStore")?,
 ...
-    let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
+    let state_manager = Arc::new(
+        StateManager::new(chain_store.clone()).context("creating StateManager")?
+    );
 ...
-    )
-    .unwrap(),
+    )
+    .context("creating ChainStore (replay)")?,
 ...
-    let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
+    let state_manager = Arc::new(
+        StateManager::new(chain_store.clone()).context("creating StateManager (replay)")?
+    );
```

Also applies to: 90-90, 145-145
100-107: Stream snapshot to file when exporting to avoid double buffering. Currently you always build a full in-memory `Vec<u8>` then optionally write it to disk. When `--export-db-to` is set, write directly to the file; fall back to in-memory only when not exporting.

```diff
-    let mut db_snapshot = vec![];
-    db.export_forest_car(&mut db_snapshot).await?;
-    println!(
-        "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
-        human_bytes::human_bytes(db_snapshot.len() as f64)
-    );
-    if let Some(export_db_to) = export_db_to {
-        std::fs::write(export_db_to, db_snapshot)?;
-    }
+    if let Some(export_db_to) = export_db_to {
+        let mut file = std::fs::File::create(&export_db_to)?;
+        db.export_forest_car(&mut file).await?;
+        let size = file.metadata()?.len();
+        println!(
+            "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
+            human_bytes::human_bytes(size as u128)
+        );
+    } else {
+        let mut db_snapshot = Vec::new();
+        db.export_forest_car(&mut db_snapshot).await?;
+        println!(
+            "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
+            human_bytes::human_bytes(db_snapshot.len() as u128)
+        );
+    }
```
153-164: Minor: consistent byte/time formatting across commands. You're using `humantime` and `human_bytes`; good. Consider adding the same byte-size line to replay runs (e.g., snapshot size once before the loop) for parity with `compute`. Optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- CHANGELOG.md (1 hunks)
- docs/docs/users/reference/cli.sh (1 hunks)
- src/db/car/any.rs (2 hunks)
- src/tool/main.rs (1 hunks)
- src/tool/subcommands/api_cmd.rs (1 hunks)
- src/tool/subcommands/api_cmd/api_compare_tests.rs (1 hunks)
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs (2 hunks)
- src/tool/subcommands/mod.rs (2 hunks)
- src/tool/subcommands/state_compute_cmd.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/tool/subcommands/state_compute_cmd.rs (6)
- src/cli_shared/mod.rs (2): chain_path (20-22), read_config (24-41)
- src/db/mod.rs (1): db_root (320-322)
- src/genesis/mod.rs (1): read_genesis_header (21-44)
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs (3): load_db (73-79), new (190-195), setting_keys (250-252)
- src/db/car/any.rs (4): new (34-53), try_from (129-131), try_from (136-138), try_from (143-145)
- src/networks/mod.rs (1): from_chain (404-414)
src/db/car/any.rs (2)
- src/db/car/forest.rs (2): std (109-109), try_from (241-243)
- src/db/car/many.rs (2): try_from (157-159), try_from (164-166)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
- src/tool/subcommands/shed_cmd/migration.rs (1): load_db (101-107)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Check
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: Deploy to Cloudflare Pages
- GitHub Check: All lint checks
🔇 Additional comments (2)
src/db/car/any.rs (1)
21-21: LGTM! The import update correctly adds `PathBuf` to support the new `TryFrom` implementation below.

src/tool/subcommands/state_compute_cmd.rs (1)
101-104: Let me verify the actual state by checking the codebase more carefully. Based on the verification, the `human_bytes` function in version 0.2.1 accepts both `u32` directly and `u64 as f64`, suggesting it uses generic type handling. However, I cannot definitively verify the exact API signature for version 0.4 from web sources. The inconsistency in the codebase reveals that:

- `archive_cmd.rs:336` passes `index_size_bytes` without an f64 cast
- `state_compute_cmd.rs:103` casts with `as f64`
- `snapshot.rs:357` also casts with `as f64`

If the code currently compiles with these different patterns, the function likely accepts multiple numeric types. The review comment's rationale about "precision loss" for large snapshots is reasonable (f64 loses precision at very large byte counts), but the suggestion to use `u128` specifically requires verification of the 0.4 API.

Use integer type for human_bytes to avoid precision loss

The current code casts to `f64`, which loses precision for large byte counts. However, before applying the suggested change to `u128`, verify that version 0.4 of `human_bytes` accepts `u128`. The codebase already uses this function inconsistently: `archive_cmd.rs:336` passes an integer directly without casting, while `state_compute_cmd.rs:103` and `snapshot.rs:357` both use `f64` casts. Align all usages after confirming the supported input types for `human_bytes` 0.4.
```rust
mod net_cmd;
mod shed_cmd;
mod snapshot_cmd;
mod state_compute_cmd;
```
Fix “private type in public interface” for Subcommand::State
`Subcommand` is pub, but it exposes `state_compute_cmd::StateCommand` from a private module, which breaks visibility (E0446). Re-export the type and reference the re-export, or make the module public. Minimal fix:

```diff
 mod state_compute_cmd;
+pub use self::state_compute_cmd::StateCommand;
 ...
-    #[command(subcommand)]
-    State(state_compute_cmd::StateCommand),
+    #[command(subcommand)]
+    State(StateCommand),
```

Also applies to: 85-87
🤖 Prompt for AI Agents
In src/tool/subcommands/mod.rs around lines 15 and also affecting lines 85-87,
Subcommand::State exposes StateCommand from a private module causing a "private
type in public interface" error; fix by either making the module public (change
the module declaration to pub mod state_compute_cmd) or re-exporting the type
(add pub use state_compute_cmd::StateCommand) and update Subcommand::State to
reference the re-exported StateCommand (e.g., use
crate::tool::subcommands::StateCommand) so the public enum only exposes public
types.
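A minimal sketch of the re-export fix, with the module inlined for self-containment (the real code uses a file module; whether the original triggers a hard error or a lint depends on the exact visibilities, but the re-export resolves it either way):

```rust
mod state_compute_cmd {
    // Reachable outside the private module only via the re-export below.
    pub enum StateCommand {
        Compute,
        ReplayCompute,
    }
}

// Re-export so the public enum below only references a publicly reachable type.
pub use state_compute_cmd::StateCommand;

pub enum Subcommand {
    State(StateCommand),
}

fn main() {
    let _ = Subcommand::State(StateCommand::Compute);
}
```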
```rust
SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
```
|
Wrong store used when writing HEAD; likely compile error and unintended DB mutation

`SettingsStoreExt::write_obj(&db.tracker, ...)` targets the read-ops tracker, not the settings store. This won't compile and, if pointed to the real DB, could also mutate a live DB's HEAD. Use the DB wrapper itself, or drop the HEAD write entirely (not needed when passing `ts` to `compute_tipset_state`).

```diff
-    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
+    // Optional: this write is not strictly necessary for compute; remove if you want no side-effects.
+    SettingsStoreExt::write_obj(&*db, crate::db::setting_keys::HEAD_KEY, &ts.key())?;
```

If you prefer method syntax:

```diff
-    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
+    db.write_obj(crate::db::setting_keys::HEAD_KEY, &ts.key())?;
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```rust
// Optional: this write is not strictly necessary for compute; remove if you want no side-effects.
SettingsStoreExt::write_obj(&*db, crate::db::setting_keys::HEAD_KEY, &ts.key())?;
let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
```
🤖 Prompt for AI Agents
In src/tool/subcommands/state_compute_cmd.rs around lines 89 to 91, the call
uses SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY,
ts.key()) which targets the tracker store (causing a compile error and risking
mutating the wrong DB); change the call to use the DB wrapper (e.g.
SettingsStoreExt::write_obj(&db, crate::db::setting_keys::HEAD_KEY, ts.key()))
or simply remove the HEAD write entirely since ts is passed into
compute_tipset_state; ensure you use the correct settings store reference or
drop the write to avoid unintended DB mutation.
hanabi1224: This is intended. Setting the head for exporting the snapshot from the tracker.
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Find the ReadOpsTrackingStore definition and understand its structure
ast-grep --pattern 'struct ReadOpsTrackingStore'
```

Length of output: 0

🏁 Script executed:

```bash
#!/bin/bash
# Search for how the snapshot/export mechanism works
rg -nP -C5 'export.*snapshot|snapshot.*export' --type=rust
```

Length of output: 19887

🏁 Script executed:

```bash
#!/bin/bash
# Look for the tracker field usage in the codebase
rg -nP '\btracker\b' --type=rust -C3 -g '!target/**'
```

Length of output: 17067

🏁 Script executed:

```bash
#!/bin/bash
# Find where export_snapshot_to_forest_car is implemented/called
rg -nP 'export_snapshot_to_forest_car|export.*forest.*car' --type=rust -C5
```

Length of output: 7646
@hanabi1224, thank you for the clarification! You're absolutely right. I now understand the architecture:
The ReadOpsTrackingStore uses its tracker (a MemoryDB) to accumulate all data accessed during the state computation. When export_forest_car is called, it exports from this tracker, creating a minimal snapshot containing only the blocks that were actually read.
Writing HEAD_KEY to db.tracker is indeed necessary because MemoryDB::export_forest_car reads the HEAD_KEY from the store it's exporting. Without this write, the export would fail.
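A minimal sketch of the pattern described above; the trait and type shapes are simplified stand-ins for Forest's actual `Blockstore`/`ReadOpsTrackingStore` types:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical simplified store interface; Forest's real traits differ.
trait Store {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&self, key: &[u8], value: &[u8]);
}

struct MemoryDb {
    inner: Mutex<HashMap<Vec<u8>, Vec<u8>>>,
}

impl MemoryDb {
    fn new() -> Self {
        Self { inner: Mutex::new(HashMap::new()) }
    }
}

impl Store for MemoryDb {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.inner.lock().unwrap().get(key).cloned()
    }
    fn put(&self, key: &[u8], value: &[u8]) {
        self.inner.lock().unwrap().insert(key.to_vec(), value.to_vec());
    }
}

// Wraps a backing store; every successful read is mirrored into `tracker`,
// so the tracker ends up holding exactly the data the computation touched.
struct ReadOpsTrackingStore<S> {
    backing: S,
    tracker: MemoryDb,
}

impl<S: Store> Store for ReadOpsTrackingStore<S> {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        let value = self.backing.get(key)?;
        self.tracker.put(key, &value); // record the read
        Some(value)
    }
    fn put(&self, key: &[u8], value: &[u8]) {
        self.backing.put(key, value);
    }
}

fn main() {
    let backing = MemoryDb::new();
    backing.put(b"block-a", b"payload");

    let store = ReadOpsTrackingStore { backing, tracker: MemoryDb::new() };
    let _ = store.get(b"block-a"); // read goes through and is recorded

    // The tracker now contains exactly the blocks that were read.
    assert!(store.tracker.get(b"block-a").is_some());
}
```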
I apologize for the incorrect initial assessment. The code is working as designed.
🐰
/)_/)
(•.•)
/>📚 Learning noted!
✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.006Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
```rust
        chain_config,
        genesis_header,
    )
    .unwrap(),
```
why unwrap? Is it infallible?
Fixed.
```rust
    )
    .unwrap(),
);
let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
```
why unwrap? Is it infallible?
Fixed.
```rust
)?;
let epoch = ts.epoch();
SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
```
why unwrap? Is it infallible?
Fixed.
```rust
        chain_config,
        genesis_header,
    )
    .unwrap(),
```
why unwrap? Is it infallible?
Fixed.
Co-authored-by: Hubert <hubert@chainsafe.io>
LesnyRumcajs left a comment:
I understand those are short-lived commands, but panics are really a bad user experience. Let's propagate errors with helpful messages, by the book.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/tool/subcommands/state_compute_cmd.rs (4)
61-66: Add error context when resolving DB path/config. Wrap fallible ops with `anyhow::Context` for actionable CLI errors.

Apply within the selected lines:

```diff
-    let (_, config) = read_config(None, Some(chain.clone()))?;
-    db_root(&chain_path(&config))?
+    let (_, config) = read_config(None, Some(chain.clone()))
+        .context("reading forest config")?;
+    db_root(&chain_path(&config))
+        .context("resolving database root path")?
```

Outside the selected lines, add the import near the other uses:

```diff
+use anyhow::Context;
```
72-82: Avoid unwraps; propagate with context. Prefer bubbling errors with context over panicking in a CLI tool.

```diff
-    let chain_store = Arc::new(
-        ChainStore::new(
+    let chain_store = Arc::new(
+        ChainStore::new(
             db.clone(),
             db.clone(),
             db.clone(),
             db.clone(),
             chain_config,
             genesis_header,
-        )
-        .unwrap(),
+        )
+        .context("initializing ChainStore")?,
     );
-    let state_manager = Arc::new(StateManager::new(chain_store.clone()).unwrap());
+    let state_manager = Arc::new(
+        StateManager::new(chain_store.clone()).context("initializing StateManager")?,
+    );
```

The same diff applies to the replay command's ChainStore and StateManager construction.

Also applies to: 90-90, 146-146, 148-148
89-89: Keep writing HEAD to the tracker; add a brief comment to document intent. This is correct for exporting minimal snapshots from `ReadOpsTrackingStore`. Add a note to prevent future "wrong store" regressions.

```diff
-    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
+    // Write HEAD to the tracker MemoryDB so export_forest_car() uses the accessed subset.
+    // This intentionally does NOT mutate the underlying on-disk DB.
+    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
```

Based on learnings
149-166: Print iteration index to ease benchmarking output parsing.Small UX win when running -n > 1.
- for _ in 0..n.get() { + for i in 1..=n.get() { let start = Instant::now(); @@ - println!( - "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, took {}.", - humantime::format_duration(start.elapsed()) - ); + println!( + "run {}/{}: epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, took {}.", + i, + n.get(), + humantime::format_duration(start.elapsed()) + );
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/tool/subcommands/state_compute_cmd.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.
📚 Learning: 2025-10-17T14:24:47.046Z
Learnt from: hanabi1224
PR: ChainSafe/forest#6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.
Applied to files:
src/tool/subcommands/state_compute_cmd.rs
🧬 Code graph analysis (1)
src/tool/subcommands/state_compute_cmd.rs (5)
- src/cli_shared/mod.rs (2): chain_path (20-22), read_config (24-41)
- src/db/mod.rs (1): db_root (320-322)
- src/genesis/mod.rs (1): read_genesis_header (21-44)
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs (2): load_db (73-79), new (190-195)
- src/db/car/any.rs (4): new (34-53), try_from (129-131), try_from (136-138), try_from (143-145)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: All lint checks
- GitHub Check: tests
- GitHub Check: tests-release
- GitHub Check: Check
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: Deploy to Cloudflare Pages
🔇 Additional comments (3)
src/tool/subcommands/state_compute_cmd.rs (3)

83-88: LGTM: robust tipset resolution across null rounds. Using `ResolveNullTipset::TakeOlder` with the heaviest tipset is appropriate here.

122-123: Infallible unwrap is fine here. `NonZeroUsize::new(1).unwrap()` is safe by construction; no change needed.

99-107: Review comment is incorrect and will not compile. The suggested diff uses `std::fs::File` and `std::io::BufWriter`, which implement the synchronous `std::io::Write` trait. However, `export_forest_car` requires `W: tokio::io::AsyncWrite + Unpin`; these are incompatible types. The code will fail to compile with the proposed changes.

The current implementation already works correctly: `Vec<u8>` properly implements `tokio::io::AsyncWrite`, so there is no trait compatibility issue. Additionally, the code is already correctly writing `HEAD_KEY` to `db.tracker` (line 89), as intended for `ReadOpsTrackingStore` snapshots.

If streaming is actually desired for performance reasons, use `tokio::fs::File` and `tokio::io::BufWriter` instead, but this refactoring is not necessary for correctness.

Likely an incorrect or invalid review comment.
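For reference, a sketch of what the streaming variant could look like under that `AsyncWrite + Unpin` bound; `export_forest_car` is stubbed here, since only the writer types matter for the point being made:

```rust
use std::path::Path;
use tokio::fs::File;
use tokio::io::{AsyncWrite, AsyncWriteExt, BufWriter};

// Stub standing in for the real export method on the DB type.
async fn export_forest_car<W: AsyncWrite + Unpin>(writer: &mut W) -> std::io::Result<()> {
    writer.write_all(b"...car bytes...").await
}

async fn export_to_file(path: &Path) -> std::io::Result<u64> {
    // tokio::fs::File implements tokio::io::AsyncWrite, unlike std::fs::File.
    let file = File::create(path).await?;
    let mut writer = BufWriter::new(file);
    export_forest_car(&mut writer).await?;
    writer.flush().await?; // ensure buffered bytes reach the file
    Ok(tokio::fs::metadata(path).await?.len())
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let size = export_to_file(Path::new("snapshot.forest.car.zst")).await?;
    println!("wrote {size} bytes");
    Ok(())
}
```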
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/tool/subcommands/state_compute_cmd.rs (1)

119-120: Default for `n` is fine. `NonZeroUsize::new(1).unwrap()` is infallible here; keeping it is acceptable.
🧹 Nitpick comments (4)
src/tool/subcommands/state_compute_cmd.rs (4)
86-86: Writing HEAD to tracker is correct for minimal CAR export; add a clarifying comment and avoid an extra alloc. This HEAD write must target the tracker MemoryDB so `export_forest_car` roots the snapshot correctly. Add a short comment to prevent regressions, and pass a reference to avoid an extra allocation.

Based on learnings

```diff
-    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, ts.key())?;
+    // Export reads from the tracker MemoryDB; seed HEAD so the CAR roots correctly.
+    SettingsStoreExt::write_obj(&db.tracker, crate::db::setting_keys::HEAD_KEY, &ts.key())?;
```
96-104: Avoid buffering entire snapshot in memory when writing to a file. Exporting to `Vec` allocates the full CAR in memory even when `--export-db-to` is provided. Stream directly to the file and derive size from metadata; fall back to `Vec` only when no path is given.

```diff
-    let mut db_snapshot = vec![];
-    db.export_forest_car(&mut db_snapshot).await?;
-    println!(
-        "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
-        human_bytes::human_bytes(db_snapshot.len() as f64)
-    );
-    if let Some(export_db_to) = export_db_to {
-        std::fs::write(export_db_to, db_snapshot)?;
-    }
+    if let Some(path) = export_db_to {
+        use std::io::Write;
+        let file = std::fs::File::create(&path)?;
+        let mut writer = std::io::BufWriter::new(file);
+        db.export_forest_car(&mut writer).await?;
+        writer.flush()?;
+        let size = std::fs::metadata(&path)?.len();
+        println!(
+            "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
+            human_bytes::human_bytes(size as f64)
+        );
+    } else {
+        let mut buf = Vec::new();
+        db.export_forest_car(&mut buf).await?;
+        println!(
+            "epoch: {epoch}, state_root: {state_root}, receipt_root: {receipt_root}, db_snapshot_size: {}",
+            human_bytes::human_bytes(buf.len() as f64)
+        );
+    }
```
45-51: CLI UX: add `ValueHint` for path args. Helps shell completions and tooling.

```diff
-    #[arg(long)]
+    #[arg(long, value_hint = clap::ValueHint::DirPath)]
     db: Option<PathBuf>,
-    #[arg(long)]
+    #[arg(long, value_hint = clap::ValueHint::FilePath)]
     export_db_to: Option<PathBuf>,
```
112-121: Snapshot path: add `ValueHint`. Small UX win for the positional arg.

```diff
 pub struct ReplayComputeCommand {
     /// Path to the database snapshot `CAR` file generated by `forest-tool state compute`
-    snapshot: PathBuf,
+    #[arg(value_hint = clap::ValueHint::FilePath)]
+    snapshot: PathBuf,
```
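Put together as a compilable fragment (struct and field names are illustrative, not the exact PR code):

```rust
use clap::{Parser, ValueHint};
use std::path::PathBuf;

#[derive(Parser)]
struct ReplayComputeArgs {
    /// Path to the snapshot CAR file; FilePath lets shells complete file names.
    #[arg(value_hint = ValueHint::FilePath)]
    snapshot: PathBuf,

    /// Optional database directory; DirPath hints directory completion.
    #[arg(long, value_hint = ValueHint::DirPath)]
    db: Option<PathBuf>,
}

fn main() {
    let args = ReplayComputeArgs::parse();
    println!("snapshot: {}", args.snapshot.display());
}
```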
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/tool/subcommands/state_compute_cmd.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.
📚 Learning: 2025-10-17T14:24:47.046Z
Learnt from: hanabi1224
PR: ChainSafe/forest#6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.
Applied to files:
src/tool/subcommands/state_compute_cmd.rs
🧬 Code graph analysis (1)
src/tool/subcommands/state_compute_cmd.rs (6)
- src/cli_shared/mod.rs (2): chain_path (20-22), read_config (24-41)
- src/db/mod.rs (1): db_root (320-322)
- src/genesis/mod.rs (1): read_genesis_header (21-44)
- src/tool/subcommands/api_cmd/generate_test_snapshot.rs (2): load_db (73-79), new (190-195)
- src/db/car/any.rs (4): new (34-53), try_from (129-131), try_from (136-138), try_from (143-145)
- src/networks/mod.rs (1): from_chain (404-414)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: tests
- GitHub Check: tests-release
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: Deploy to Cloudflare Pages
🔇 Additional comments (3)
src/tool/subcommands/state_compute_cmd.rs (3)

61-66: DB root resolution looks correct. Falls back to the config-driven chain path when `--db` is omitted. Clean and predictable.

126-133: `PathBuf` `TryFrom` and genesis init LGTM. Using `AnyCar::try_from(&snapshot)` and loading genesis via chain config over the in-memory store is sound.

102-104: The review comment is incorrect; `export_forest_car()` already produces the zstd-compressed `.forest.car.zst` format. The code at lines 102-104 is correct. The `export_forest_car()` method (src/db/memory.rs lines 28-56) explicitly:

- Calls `forest::Encoder::compress_stream_default()` to compress blocks into zstd frames
- Calls `forest::Encoder::write()` to write the compressed frames in `.forest.car.zst` format

The data written to disk is already compressed; no additional compression is needed. The PR summary correctly describes the output as `.forest.car.zst`.

Likely an incorrect or invalid review comment.
Summary of changes
This PR introduces a tool to take a db snapshot for validating a tipset. The snapshot can be used for benchmarks (may come in a subsequent PR) to justify critical dependency updates (e.g. `wasmtime` in `fvm`).

Changes introduced in this pull request:
- forest-tool state compute
- forest-tool state replay-compute

Example usage
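Hypothetical invocations, with flag spellings inferred from the sequence diagrams above rather than verified against the final CLI (epoch and file names are made up):

```bash
# Compute the state for an epoch and export a minimal DB snapshot.
forest-tool state compute --chain calibnet --epoch 2500000 --export-db-to snapshot.forest.car.zst

# Replay the computation from that snapshot three times, printing timings.
forest-tool state replay-compute --chain calibnet --snapshot snapshot.forest.car.zst -n 3
```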
Reference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit
New Features
- `forest-tool state compute` to generate DB snapshots for tipset validation, with optional export and diagnostics.
- `forest-tool state replay-compute` to replay tipset validation from a snapshot, supporting repeated runs and timing output.

Documentation
Changelog