-
Notifications
You must be signed in to change notification settings - Fork 182
feat: add --unordered to forest-cli snapshot export
#5867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds an Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User
participant CLI as forest-cli
participant RPC as Forest RPC
participant Chain as chain::export
participant IPLD as {stream_chain|unordered_stream_chain}
participant Writer as AsyncWriterWithChecksum
User->>CLI: snapshot export --unordered
CLI->>RPC: ForestChainExport(unordered=true, params)
RPC->>Chain: export(..., ExportOptions{unordered, skip_checksum, seen})
alt unordered == true
Chain->>IPLD: unordered_stream_chain(...).with_seen(seen)
else
Chain->>IPLD: stream_chain(...).with_seen(seen)
end
IPLD-->>Chain: CAR block stream
Chain->>Writer: write frames (checksum based on skip_checksum)
Writer-->>RPC: optional checksum
RPC-->>CLI: result
CLI-->>User: export completed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. 📜 Recent review detailsConfiguration used: CodeRabbit UI 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
CHANGELOG.md (1)
30-33: Align verb tense with existing changelog styleMost entries within the “Added” section use the imperative “Add …” rather than the past-tense “Added …”. Tweaking the verb keeps the style consistent.
- - [#5867](https://github.com/ChainSafe/forest/pull/5867) Added `--unordered` to `forest-cli snapshot export` for exporting `CAR` blocks in non-deterministic order for better performance with more parallelization. + - [#5867](https://github.com/ChainSafe/forest/pull/5867) Add `--unordered` to `forest-cli snapshot export` for exporting `CAR` blocks in a non-deterministic order to improve parallelisation performance.src/chain/mod.rs (1)
66-84: Optimize duplicateArc::clonein unordered branch.The
dbis cloned twice in the unordered branch but only once in the ordered branch. Consider storing the cloned Arc to avoid redundant cloning.if unordered { + let db_clone = Arc::clone(db); futures::future::Either::Left( unordered_stream_chain( - Arc::clone(db), - tipset.clone().chain_owned(Arc::clone(db)), + db_clone.clone(), + tipset.clone().chain_owned(db_clone), stateroot_lookup_limit, ) .with_seen(seen), ) } else {src/ipld/util.rs (1)
413-413: Consider making worker count configurable.The worker count is hard-coded to clamp between 1 and 8. Consider making this configurable through
ExportOptionsfor better flexibility in different environments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
CHANGELOG.md(1 hunks)src/blocks/header.rs(2 hunks)src/chain/mod.rs(4 hunks)src/cli/subcommands/snapshot_cmd.rs(6 hunks)src/db/gc/snapshot.rs(2 hunks)src/ipld/util.rs(15 hunks)src/rpc/methods/chain.rs(6 hunks)src/rpc/mod.rs(1 hunks)src/tool/subcommands/api_cmd/test_snapshots_ignored.txt(1 hunks)src/tool/subcommands/archive_cmd.rs(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Build Ubuntu
- GitHub Check: Build MacOS
- GitHub Check: cargo-publish-dry-run
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
🔇 Additional comments (23)
src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1)
81-81: Addition keeps the ignore list consistent – LGTMIncluding
Forest.ChainExportprevents the new RPC from polluting the snapshot baseline and the placement preserves the alphabetical order of theForest.*block. No further action required.src/db/gc/snapshot.rs (2)
46-46: LGTM: Clean import update for refactored export API.The import change from
CidHashSettoExportOptionscorrectly aligns with the export API refactoring.
228-231: LGTM: Proper adaptation to new ExportOptions struct.The change correctly uses the new
ExportOptionsstruct while maintaining the same behavior:
skip_checksum: truepreserves the previous behavior..Default::default()ensuresunordered: false(deterministic order for GC operations)- The API change is handled cleanly without affecting functionality
src/rpc/mod.rs (1)
81-81: LGTM: Proper registration of new ForestChainExport RPC method.The addition correctly integrates the new
ForestChainExportmethod into the RPC framework, ensuring it's properly registered and available for use.src/tool/subcommands/archive_cmd.rs (2)
30-30: LGTM: Correct import for refactored export API.The import of
ExportOptionsaligns with the export API refactoring across the codebase.
514-525: LGTM: Proper adaptation to ExportOptions struct.The export call correctly uses the new
ExportOptionsstruct:
- Maintains
skip_checksum: truebehavior- Preserves the
seenparameter for diff export functionality- Uses appropriate defaults for other options (e.g.,
unordered: falsefor deterministic archive export)- The function signature and behavior remain unchanged from the caller's perspective
src/blocks/header.rs (2)
17-18: LGTM: Updated imports for improved CID computation.The import changes support the new
car_blockmethod implementation with proper multihash handling.Also applies to: 22-22
100-110: LGTM: Efficient car_block method with CID caching.The new
car_blockmethod and updatedcidmethod provide an efficient way to get both CID and serialized data:
- Avoids duplicate serialization when both CID and data are needed (common in export scenarios)
- Uses proper CID v1 with DAG_CBOR codec and Blake2b-256 multihash
- The
cidmethod maintains backward compatibility by delegating tocar_block- Error handling is appropriate with
expect("Infallible")for the well-tested serialization pathsrc/cli/subcommands/snapshot_cmd.rs (6)
9-9: LGTM: Updated imports for new export functionality.The imports correctly support the new
ForestChainExportParams, timing functionality, and numeric operations for the enhanced export command.Also applies to: 14-16
38-40: LGTM: Well-documented unordered flag.The new
unorderedflag is properly documented, explaining its purpose for enabling non-deterministic traversal to improve performance through better parallelization.
53-53: LGTM: Proper parameter struct usage.The code correctly uses the new
ForestChainExportParamsstruct and includes theunorderedflag. The parameter mapping maintains all existing functionality while adding the new capability.Also applies to: 89-94
100-100: LGTM: Enhanced progress reporting with export speed.The progress reporting improvements are well-implemented:
- Records start time before spawning the progress task
- Calculates export speed as bytes per second
- Handles division by zero correctly
- Uses appropriate interval (0.5 seconds) for progress updates
- Provides useful feedback to users during long-running exports
Also applies to: 105-105, 117-127
136-136: LGTM: Updated to use new ForestChainExport RPC method.The RPC call correctly uses the new
ForestChainExportmethod with proper timeout handling for long-running snapshot exports.
142-147: LGTM: Proper dry run handling.The dry run logic correctly prevents file operations when
dry_runis true:
- Skips checksum file creation
- Skips persisting the temporary file
- Maintains the export process for testing without side effects
src/chain/mod.rs (2)
20-25: LGTM!The
ExportOptionsstruct is well-designed with appropriate derives and sensible defaults.
27-38: LGTM!The function correctly forwards the
ExportOptionsto theexportfunction.src/rpc/methods/chain.rs (3)
225-298: LGTM!The
ForestChainExportRPC method is well-implemented with proper concurrency control, input validation, and error handling.
300-337: LGTM!The wrapper correctly maintains backward compatibility by delegating to
ForestChainExportwithunordered: false.
886-898: LGTM!The
ForestChainExportParamsstruct is properly defined with appropriate serde and schema annotations.src/ipld/util.rs (4)
103-108: LGTM!The enhancement to carry optional block data in the
Emitvariant is a good optimization to avoid redundant database fetches.
122-130: LGTM!The builder-style methods follow Rust conventions and provide a clean API.
220-226: Good optimization for vector reservation.The check for non-empty vector before reserving capacity is a good performance optimization.
290-307: LGTM!The
Fusewrapper andPinnedDropimplementation ensure proper cleanup and stream termination.
akaladarshi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments otherwise LGTM. Very nice work 👌🏽
|
|
||
| for v in new_values.into_iter().rev() { | ||
| cid_vec.push_front(v) | ||
| if !new_values.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
if new_values.is_empty() { continue }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That changes the logic, it always returns sth here
| fail_on_dead_links: bool, | ||
| ) -> UnorderedChainStream<'a, DB, ITER> { | ||
| let (sender, receiver) = flume::bounded(BLOCK_CHANNEL_LIMIT); | ||
| let (extract_sender, extract_receiver) = flume::unbounded(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also use the flume::bounded for extract sender and receiver ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's tackle this in another PR
| let mut handles = JoinSet::new(); | ||
|
|
||
| for _ in 0..num_cpus::get() { | ||
| for _ in 0..num_cpus::get().clamp(1, 8) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it makes sense to accept this as params as well?
Basically allow user to pass the worker count otherwise use current values as default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's our responsibility to tune this value. It might turn out this unordered_stream_chain is not fast and we end up removing it.
src/ipld/util.rs
Outdated
| let mut new_values = extract_cids(&data)?; | ||
| cid_vec.append(&mut new_values); | ||
| if !new_values.is_empty() { | ||
| cid_vec.append(&mut new_values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor it to:
let new_values = extract_cids(&data)?;
if !new_values.is_empty() {
cid_vec.reserve(new_values.len());
cid_vec.extend(new_values);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been done internally in Vec::append
https://doc.rust-lang.org/stable/src/alloc/vec/mod.rs.html#2579
pub fn append(&mut self, other: &mut Self) {
unsafe {
self.append_elements(other.as_slice() as _);
other.set_len(0);
}
}
/// Appends elements to `self` from other buffer.
#[cfg(not(no_global_oom_handling))]
#[inline]
#[track_caller]
unsafe fn append_elements(&mut self, other: *const [T]) {
let count = other.len();
self.reserve(count);
let len = self.len();
unsafe { ptr::copy_nonoverlapping(other as *const T, self.as_mut_ptr().add(len), count) };
self.len += count;
}
src/ipld/util.rs
Outdated
| let mut new_values = extract_cids(&data)?; | ||
| cid_vec.append(&mut new_values); | ||
| if !new_values.is_empty() { | ||
| cid_vec.append(&mut new_values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think cid_vec could grow very large since we are also sending the state_root ?
Should we have batch processing here?
Something like this:
const MAX_CID_QUEUE_SIZE: usize = 1000;
if cid_vec.len() > MAX_CID_QUEUE_SIZE {
for cid in cid_vec.drain(MAX_CID_QUEUE_SIZE..) {
let _ = extract_sender.send(cid);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract_sender is bounded, and this will likely get stuck. Actually the original code gets stuck on CI and I have to change it to send_async
src/ipld/util.rs
Outdated
| ) | ||
| } | ||
| } else if this.extract_sender.is_empty() { | ||
| this.worker_handle.abort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we are aborting without checking if the workers are still processing the existing load.
Do you think we should make sure that all the workers have finished before aborting it here, we can send signal from the worker side (once all tipset tasks are done) and then abort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. The old logic did not handle the shutdown gracefully indeed. Fixed by closing the extract_sender instead of forcing worker_handle.abort
|
Also do you thinks it's possible to divide the Just a suggestion, we can use introduce couple of different structs to handle different tasks: |
|
Hey @akaladarshi Thanks for your review! Regarding |
|
@hanabi1224 I didn't realise this just for reviving the dead code. We can hold on to the changes for now, we can make them once we are sure this what we are going with. I will approve the PR, we can revisit the comments later. |
066fb5f
9a8ab04 to
f3897d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
scripts/tests/calibnet_eth_mapping_check.sh (1)
25-26: Off-by-one: loop iterates 201 tipsets instead of the requested 200
for ((i=0; i<=NUM_TIPSETS; i++)); doincludes both 0 and 200.
If you really want exactly$NUM_TIPSETStipsets, use a strict<upper bound.-for ((i=0; i<=NUM_TIPSETS; i++)); do +for ((i=0; i<NUM_TIPSETS; i++)); do
🧹 Nitpick comments (4)
scripts/tests/calibnet_eth_mapping_check.sh (1)
35-41: Redundant jq check adds needless subprocess work
jq -ealready exits with status 0 when the value is neithernullnorfalse.
The string comparison is superfluous—just test the exit code.- if [[ $(echo "$JSON" | jq -e '.result.transactions') != "null" ]]; then + if echo "$JSON" | jq -e '.result.transactions' >/dev/null; then.github/workflows/forest.yml (1)
309-330: Consider deduplicating the export-check jobs with a matrix
All three snapshot jobs differ only by one script path. A small matrix would
shrink ~70 duplicated lines and ease future maintenance.scripts/tests/calibnet_export_unordered_check.sh (2)
15-16: Typo in log message
unordred→unordered.-echo "Exporting zstd compressed snapshot with unordred graph traversal" +echo "Exporting zstd compressed snapshot with unordered graph traversal"
30-32: Hard-coded filename couples steps unnecessarily
Import path is fixed tounordered.forest.car.zst; using$(ls *.car.zst)
would make the step resilient to future name changes.-$FOREST_PATH ... --import-snapshot unordered.forest.car.zst +$FOREST_PATH ... --import-snapshot "$(ls *.car.zst)"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.github/workflows/forest.yml(1 hunks)CHANGELOG.md(1 hunks)scripts/tests/calibnet_eth_mapping_check.sh(1 hunks)scripts/tests/calibnet_export_unordered_check.sh(1 hunks)scripts/tests/harness.sh(1 hunks)src/chain/mod.rs(7 hunks)src/chain/tests.rs(1 hunks)src/cli/subcommands/snapshot_cmd.rs(4 hunks)src/ipld/util.rs(9 hunks)src/rpc/methods/chain.rs(7 hunks)src/tool/subcommands/archive_cmd.rs(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- scripts/tests/harness.sh
🚧 Files skipped from review as they are similar to previous changes (6)
- src/cli/subcommands/snapshot_cmd.rs
- src/tool/subcommands/archive_cmd.rs
- src/rpc/methods/chain.rs
- CHANGELOG.md
- src/ipld/util.rs
- src/chain/mod.rs
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.993Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.
📚 Learning: in forest project scripts, `forest_wait_api` (called within `forest_init`) ensures basic rpc service...
Learnt from: hanabi1224
PR: ChainSafe/forest#5886
File: scripts/tests/calibnet_eth_mapping_check.sh:49-49
Timestamp: 2025-08-07T09:37:03.079Z
Learning: In Forest project scripts, `forest_wait_api` (called within `forest_init`) ensures basic RPC service readiness, while `forest_wait_for_healthcheck_ready` performs additional comprehensive checks including DB backfill completion. When using `--backfill-db` flag, basic RPC operations can proceed after `forest_init`, but operations requiring complete DB backfill should wait for `forest_wait_for_healthcheck_ready`.
Applied to files:
scripts/tests/calibnet_eth_mapping_check.shscripts/tests/calibnet_export_unordered_check.sh
📚 Learning: in the forest codebase, `vec` can be used as an `asyncwrite` implementation in test contexts. th...
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.993Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.
Applied to files:
src/chain/tests.rs
🧬 Code Graph Analysis (1)
scripts/tests/calibnet_eth_mapping_check.sh (1)
scripts/tests/harness.sh (1)
forest_wait_for_healthcheck_ready(100-103)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: tests
- GitHub Check: tests-release
- GitHub Check: Build forest binaries on Linux AMD64
🔇 Additional comments (3)
src/chain/tests.rs (1)
62-62: LGTM! Function signature updates align with the API refactor.The changes correctly update the test function calls to use the new unified
ExportOptionsparameter. PassingNonemaintains the default behavior (no checksum skipping, ordered export, empty seen set), ensuring the tests continue to validate the same functionality while adapting to the refactored interface.Also applies to: 65-65
scripts/tests/calibnet_eth_mapping_check.sh (1)
49-49: Good call: wait for full back-fill before mapping checks
Replacing the ad-hoc health-check command with the helper ensures DB
back-fill is complete, preventing flaky failures in the mapping tests.scripts/tests/calibnet_export_unordered_check.sh (1)
8-14: Safety: wildcard delete may remove unintended files
rm --force --verbose ./*.{car,car.zst,sha256sum}expands to the literal
patterns when no matches exist, causing an error withset -eu.
Guard withshopt -s nullglobor quote the glob.+shopt -s nullglob rm --force --verbose ./*.{car,car.zst,sha256sum} +shopt -u nullglob
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/ipld/util.rs (1)
293-297: Graceful worker shutdown recommended.
PinnedDropsimply callsabort()on the worker task; this can leak in-flight
work and leave the DB in inconsistent state.
Consider:
- Dropping
extract_senderfirst so workers exit naturally.worker_handle.abort()followed byawaiting the join handle (with a short
timeout) to ensure cleanup completes.This will reduce resource leaks and make panics inside workers visible to the
caller.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/db/gc/snapshot.rs(2 hunks)src/ipld/util.rs(9 hunks)src/rpc/methods/chain.rs(7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/db/gc/snapshot.rs
- src/rpc/methods/chain.rs
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.993Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.
📚 Learning: 2025-08-06T15:42:43.520Z
Learnt from: LesnyRumcajs
PR: ChainSafe/forest#5907
File: Cargo.toml:25-36
Timestamp: 2025-08-06T15:42:43.520Z
Learning: The fvm_ipld_amt crate version 0.7.5 is required for the Forest project, but it's not yet published to crates.io, which is why the Git patch pointing to the ref-fvm repository is still necessary.
Applied to files:
src/ipld/util.rs
📚 Learning: 2025-08-06T15:42:43.520Z
Learnt from: LesnyRumcajs
PR: ChainSafe/forest#5907
File: Cargo.toml:25-36
Timestamp: 2025-08-06T15:42:43.520Z
Learning: Forest project requires fvm_ipld_amt version 0.7.5 functionality which is available in the Git revision a633547ae414a333b2d076beef87d4d30cdb7fb4 of the ref-fvm repository, but version 0.7.5 is not yet published to crates.io (only 0.7.4 is available). The Git patch must remain until 0.7.5 is published.
Applied to files:
src/ipld/util.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Build MacOS
- GitHub Check: tests-release
- GitHub Check: rubocop
- GitHub Check: Check
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/cli/subcommands/snapshot_cmd.rs (2)
39-41: Clarify nondeterminism and checksum expectations for--unordered.Unordered export produces a non-reproducible CAR (block order varies), so file checksums will differ across runs. Make this explicit in the flag help to avoid surprises.
Consider:
- /// Traverse chain in non-deterministic order for better performance with more parallelization. + /// Traverse chain in non-deterministic order for better performance with more parallelization. + /// Note: output CAR order (and thus file checksum) is non-deterministic and will vary across runs.
136-141: Dry-run: print explicit outcome and defensively gate checksum save.Minor UX tweak: explicitly state that dry-run writes nothing. Also guard saving checksum by the local
skip_checksumflag to avoid accidental writes if the server returns a hash unexpectedly.- if !dry_run { - if let Some(hash) = hash_result { - save_checksum(&output_path, hash).await?; - } - temp_path.persist(output_path)?; - } + if !dry_run { + if !skip_checksum { + if let Some(hash) = hash_result { + save_checksum(&output_path, hash).await?; + } + } + temp_path.persist(output_path)?; + } else { + println!("Dry run: no files were written and no checksum saved."); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/cli/subcommands/snapshot_cmd.rs(4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/cli/subcommands/snapshot_cmd.rs (3)
src/blocks/tipset.rs (1)
hash(470-472)src/blocks/block.rs (1)
hash(25-27)src/shim/sector.rs (1)
hash(344-351)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests
- GitHub Check: Check
- GitHub Check: tests-release
- GitHub Check: All lint checks
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (2)
src/cli/subcommands/snapshot_cmd.rs (2)
57-57: LGTM: flag plumbed through CLI.The new
unorderedflag is correctly threaded through the export args.
100-103: End-to-EndunorderedSupport VerifiedAll paths correctly propagate and honor the
unorderedflag:
- CLI (
snapshot_cmd.rs) exposes--unorderedand sends it inForestChainExportParams.- RPC (
ForestChainExportParams) includespub unordered: booland passes it intoExportOptionsunchanged.- Both V1 (
export) and V2 (export_v2) calls receiveExportOptions { unordered, … }and select betweenunordered_stream_chainand the default ordered stream.- Utility layer (
ipld::unordered_stream_chain/_graph) and existing tests/scripts (e.g.calibnet_export_unordered_check.sh,benchmark_unordered_graph_traversal) exercise unordered exports.No further changes required.
Summary of changes
(Will be ready for review after #5835 is merged)
Changes introduced in this pull request:
--unorderedtoforest-cli snapshot exportto make use of dead codefn unordered_chain_exportwhich exportsCARblocks in non-deterministic order for better performance with more parallelizationforest-cli snapshot export(splitted into feat: print average speed in forest-cli snapshot export #5869)chain_exportandunordered_chain_export(made a separate PR fix: reduce db read ops in chain export #5868 for this)Vec::reserveops inchain_exportReference issue to close (if applicable)
Closes
Other information and links
Change checklist
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Chores