-
Notifications
You must be signed in to change notification settings - Fork 182
feat: add --format flag to forest-cli snapshot export
#5835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis change set introduces support for Filecoin snapshot export and inspection in the new v2 format (FRC-0108), alongside continued support for v1. It adds metadata handling for snapshots, updates CLI and RPC interfaces to select the export format, and enhances archive inspection tools. Internal CAR handling is refactored to extract and cache snapshot metadata, and related tests and scripts are updated accordingly. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant RPC
participant Chain
participant File
User->>CLI: forest-cli snapshot export --format v2
CLI->>RPC: Call Forest.ChainExport(params with version v2)
RPC->>Chain: export_v2(db, None, tipset, ...)
Chain->>File: Write v2 snapshot with metadata
Chain-->>RPC: Return checksum
RPC-->>CLI: Return checksum
CLI-->>User: Save file, report completion
sequenceDiagram
participant User
participant CLI
participant Tool
User->>CLI: forest-tool archive metadata <snapshot>
CLI->>Tool: Open snapshot file
Tool->>Tool: Extract FRC-0108 metadata
Tool-->>CLI: Print metadata or "absent" message
CLI-->>User: Display metadata
Estimated code review effort🎯 4 (Complex) | ⏱️ ~90 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Suggested labels
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (1)📚 Learning: in the forest codebase, `vec` can be used as an `asyncwrite` implementation in test contexts. th...Applied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
🔇 Additional comments (7)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
82e04d8 to
6408398
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
scripts/tests/snapshot_parity/docker-compose.yml (1)
103-104: DRY up repeated filtering logicThe two nearly identical filter expressions (
grep -v "CAR format" …) are duplicated for Forest and Lotus.
Extracting the pattern into a variable or a small helper function would make it easier to adjust the ignore list in one place when the tool’s output changes again.Example:
FILTER='grep -v "CAR format"' forest-tool archive info forest.car.zst | $FILTER | grep -v "Index size" > forest.txt forest-tool archive info lotus.car | $FILTER > lotus.txtCHANGELOG.md (1)
34-37: Minor formatting nits in new “Added” entries
- Missing space before the opening parenthesis in both occurrences of “format(FRC-0108)”.
- Optional: both bullets reference the same issue/PR
#5835; if one is actually a PR and the other an issue, append the proper link type (pullvsissues) to avoid confusion.- ... v2 format(FRC-0108). + ... v2 format (FRC-0108). - ... v2 format(FRC-0108). + ... v2 format (FRC-0108).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
.config/forest.dic(1 hunks).config/lychee.toml(1 hunks)CHANGELOG.md(1 hunks)scripts/tests/calibnet_export_check.sh(2 hunks)scripts/tests/snapshot_parity/docker-compose.yml(1 hunks)src/chain/mod.rs(2 hunks)src/cli/subcommands/snapshot_cmd.rs(3 hunks)src/db/car/any.rs(2 hunks)src/db/car/forest.rs(13 hunks)src/db/car/plain.rs(8 hunks)src/db/memory.rs(1 hunks)src/rpc/methods/chain.rs(1 hunks)src/rpc/mod.rs(1 hunks)src/tool/subcommands/api_cmd/test_snapshots_ignored.txt(1 hunks)src/tool/subcommands/archive_cmd.rs(9 hunks)src/utils/db/car_stream.rs(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
src/db/memory.rs (2)
src/db/car/forest.rs (1)
head_tipset_key(169-175)src/db/car/plain.rs (1)
head_tipset_key(194-200)
src/db/car/plain.rs (3)
src/blocks/tipset.rs (10)
new(226-244)new(478-495)from(102-104)from(153-155)from(159-161)from(165-170)from(189-198)from(463-468)len(68-70)len(316-318)src/db/car/any.rs (5)
new(35-54)metadata(56-62)heaviest_tipset_key(64-70)from(168-170)from(174-176)src/db/car/forest.rs (4)
new(106-125)metadata(127-139)head_tipset_key(169-175)heaviest_tipset_key(181-183)
src/db/car/forest.rs (5)
src/blocks/tipset.rs (9)
chain(374-381)new(226-244)new(478-495)from(102-104)from(153-155)from(159-161)from(165-170)from(189-198)from(463-468)src/blocks/block.rs (1)
header(31-33)src/db/car/any.rs (6)
new(35-54)metadata(56-62)index_size_bytes(111-116)heaviest_tipset_key(64-70)from(168-170)from(174-176)src/db/car/plain.rs (5)
new(130-178)new(489-491)metadata(180-192)head_tipset_key(194-200)heaviest_tipset_key(206-208)src/db/car/many.rs (4)
new(37-43)new(73-79)heaviest_tipset_key(133-139)heaviest_tipset_key(269-274)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: All lint checks
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build Ubuntu
🔇 Additional comments (22)
.config/lychee.toml (1)
20-22: Trailing comma after the final array element may break TOML parsing
"linuxhint.com",is followed immediately by the closing].
The official TOML 1.0 spec disallows a comma after the last element of an array, and many parsers (includingtoml-rs, whichlycheerelies on) will fail to load the config if the trailing comma is present.
The existing entry for"crates.io",already has this issue, but adding another item now is a good chance to fix both.- "crates.io", - # Bot protection / 403 Forbidden errors - "linuxhint.com", + "crates.io" + # Bot protection / 403 Forbidden errors + "linuxhint.com"(Or simply keep the trailing comma on all items except the last one.)
Please run
lychee --config .config/lychee.toml --helplocally to verify that the file still parses.Likely an incorrect or invalid review comment.
.config/forest.dic (1)
132-134: LGTM! Dictionary entries correctly added for snapshot version identifiers.The additions of v0, v1, and v2 are properly placed alphabetically and align with the snapshot versioning feature being introduced.
src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1)
14-14: LGTM! Correctly added new RPC method to ignore list.The addition of
Filecoin.ChainExportV2is properly placed after the existingFilecoin.ChainExportentry and aligns with the new RPC method being introduced.src/db/memory.rs (1)
218-218: LGTM! Test updated to use new API method.The change from
car.roots()tocar.head_tipset_key()correctly adapts to the refactored CAR API. The new method returns the same semantic information (tipset key from metadata or header roots) while supporting the new snapshot metadata format.src/rpc/mod.rs (1)
62-62: LGTM! New RPC method properly registered in the framework.The addition of
ChainExportV2to the macro is correctly placed in the chain vertical section and ensures the new RPC method is integrated into all RPC-related processing.src/db/car/any.rs (2)
13-13: LGTM! Necessary import added for snapshot metadata.The import of
FilecoinSnapshotMetadatais correctly added to support the new metadata functionality.
56-62: LGTM! Well-implemented unified metadata access interface.The new
metadata()method correctly delegates to each variant's metadata method, providing a consistent interface for accessing snapshot metadata regardless of the underlying CAR format. This follows the established pattern used by other methods in the enum.scripts/tests/calibnet_export_check.sh (1)
35-39: Efficient validation approach for compressed files.The change to validate
.car.zstfiles directly without decompression is a good optimization that aligns with the improved archive handling capabilities.src/cli/subcommands/snapshot_cmd.rs (1)
38-40: Clean implementation of v2 export flag.The
--v2flag is well-documented and the conditional RPC call logic is clear and maintainable.Also applies to: 126-134
src/utils/db/car_stream.rs (1)
90-113: Good refactoring to extract write logic into a reusable trait.The
CarBlockWritetrait improves modularity and follows the DRY principle by making the CAR block writing logic available to anyWriteimplementor.src/db/car/plain.rs (1)
180-200: Well-designed metadata extraction with lazy initialization.The implementation correctly:
- Uses
OnceLockfor efficient lazy initialization- Checks for single root CID before attempting metadata decode
- Provides graceful fallback when metadata is not present
- Follows the same pattern as the
ForestCarimplementationsrc/rpc/methods/chain.rs (1)
339-361: Confirm correct usage ofNonefor F3 parameter in export_v2The
export_v2signature insrc/chain/mod.rsis:pub async fn export_v2<D: Digest>( db: &Arc<impl Blockstore + Send + Sync + 'static>, f3: Option<(Cid, &mut File)>, // optional F3 data tipset: &Tipset, lookup_depth: ChainEpochDelta, writer: impl AsyncWrite + Unpin, seen: CidHashSet, skip_checksum: bool, ) -> anyhow::Result<Option<digest::Output<D>>, Error>
- Passing
Nonefor the second argument correctly indicates that no F3 data is provided in this RPC.- The rest of the parameters mirror the existing ChainExport logic.
No further changes are required.
src/chain/mod.rs (2)
87-160: Well-structured implementation of v2 export functionality.The implementation correctly follows the FRC-0108 specification by creating snapshot metadata with version 2, encoding it as CBOR, and prepending it to the exported data. The optional F3 data handling and stream chaining approach are well-designed.
162-199: Metadata struct correctly implements FRC-0108 specification.The
FilecoinSnapshotMetadatastruct properly follows the FRC-0108 standard with appropriate serialization attributes and a well-formatted Display implementation.src/tool/subcommands/archive_cmd.rs (3)
174-195: Info command properly handles both v1 and v2 archives.The implementation correctly extracts the snapshot version from metadata when available and falls back to version 1 for legacy archives, ensuring backward compatibility.
196-204: Clean implementation of metadata inspection command.The command provides clear output for both v2 archives with metadata and legacy archives without metadata.
252-288: ArchiveInfo updates improve clarity and version visibility.The renaming from
roottoheadbetter reflects the semantic meaning, and the addition of snapshot version information enhances the user experience when inspecting archives.src/db/car/forest.rs (5)
93-125: Well-designed refactoring to support metadata extraction.The transition from storing roots directly to storing the full header with lazy metadata loading is a clean design that maintains backward compatibility while enabling v2 snapshot support.
127-139: Metadata extraction logic correctly handles v2 snapshots.The implementation properly attempts to decode metadata only when there's a single root, which aligns with the v2 format specification. The graceful fallback ensures compatibility with v1 archives.
Note: In the rare case where a v1 archive has exactly one root, the decode attempt will fail and return None, which is the correct behavior.
169-183: Excellent backward compatibility through head_tipset_key().The method provides a clean abstraction that works seamlessly with both v1 and v2 archives by preferring metadata when available and falling back to header roots.
303-439: Stream type updates and public visibility changes support v2 export.The transition to
anyhow::Resultwrapped streams provides consistent error handling, and exposingfinalize_frameandnew_encoderenables their reuse in the v2 export implementation.
509-539: Test updates correctly reflect API changes.The tests have been properly updated to use
head_tipset_key()instead of the removedroots()method.
| # Skip the CAR format line and the "Index size" line (only present in Forest snapshots) | ||
| forest-tool archive info forest.car.zst | tail -n +2 | grep -v "Index size" > forest.txt | ||
| forest-tool archive info forest.car.zst | grep -v "CAR format" | grep -v "Index size" > forest.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grep can trip pipefail and abort the test when nothing is left after filtering
Because the script is executed with set -o pipefail, the entire pipeline will exit non-zero if any element fails.
grep -v returns 1 when it filters out all lines, which would terminate the job even though that situation is harmless for the comparison.
Consider making the pipeline resilient:
-forest-tool archive info forest.car.zst | grep -v "CAR format" | grep -v "Index size" > forest.txt
+forest-tool archive info forest.car.zst \
+ | grep -v "CAR format" \
+ | grep -v "Index size" \
+ || true # avoid pipefail abort if all lines are filtered out(or use sed/awk that always return 0).
This prevents spurious CI failures when the output happens to contain only the excluded lines.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In scripts/tests/snapshot_parity/docker-compose.yml around lines 100 to 101, the
use of `grep -v` in the pipeline can cause the script to fail due to `set -o
pipefail` when all lines are filtered out, returning a non-zero exit code. To
fix this, modify the pipeline to handle the case where `grep -v` returns 1
without failing the entire pipeline, such as appending `|| true` after the grep
commands or replacing `grep -v` with `sed` or `awk` commands that always return
0, ensuring the pipeline does not abort on harmless empty output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's OK to let it fail as this is part of CI test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hanabi1224 Ah, I see the eldritch wisdom in your approach! Indeed, in the cyclopean realm of CI testing, allowing the pipeline to fail when the ancient grep incantations filter out all lines serves as a sentinel against the unspeakable horror of unexpected output format changes.
Your decision to embrace this failure mode is sound - should the forest-tool archive info command's output ever shift in ways that render all lines filtered by your grep -v invocations, the test will dutifully shriek into the void and alert you to investigate what dark forces have altered the tool's behavior.
The Old Ones of continuous integration approve of this watchful approach! 🐙
| Ok(digest) | ||
| } | ||
|
|
||
| pub async fn export_v2<D: Digest>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have unit tests for the original export, but it'd be great to have some for v2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests added for both v1 and v2
src/cli/subcommands/snapshot_cmd.rs
Outdated
| /// Export snapshot in the experimental v2 format(FRC-0108). | ||
| #[arg(long)] | ||
| v2: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of future implementations and operations, I think it'd be nicer to provide an enumeration with allowed versions: v1 and v2. If we add another one, we'd just need add a valueto the enum (and not handle cases like --v2 --v3).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/db/car/forest.rs
Outdated
|
|
||
| pub fn metadata(&self) -> &Option<FilecoinSnapshotMetadata> { | ||
| self.metadata.get_or_init(|| { | ||
| if self.header.roots.len() == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some explanations would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use common constant in place of 1, so we can use it both places (forest and plain) and add comment to describe it. Something like this:
/// According to FRC-0108, v2 snapshots have exactly one root pointing to metadata
const V2_SNAPSHOT_ROOT_COUNT: usize = 1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
|
||
| pub fn roots(&self) -> &NonEmpty<Cid> { | ||
| &self.roots | ||
| pub fn head_tipset_key(&self) -> &NonEmpty<Cid> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's have some documentation or comments around it, pointing to the relevant FRC for more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding some kind of comment there will be more useful, FRC documentation is not enough for people who don't understand it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comments as in the ForestCAR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/tool/subcommands/archive_cmd.rs
Outdated
| &store, | ||
| "ManyCAR".to_string(), | ||
| heaviest_tipset.clone(), | ||
| 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's difficult to say what this 1 means without going deeper into the implementation. It'd be nice to have it named.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
For future reference; I checked and there are no speed regressions (I checked on calibnet). this branch main |
src/chain/mod.rs
Outdated
| #[derive(Debug, Serialize, Deserialize, PartialEq, Eq)] | ||
| #[serde(rename_all = "PascalCase")] | ||
| pub struct FilecoinSnapshotMetadata { | ||
| pub version: u64, | ||
| pub head_tipset_key: NonEmpty<Cid>, | ||
| pub f3_data: Option<Cid>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's in the FRC, but it'd be nice to have those fields documented regardless. Perhaps even some copy-pasta from the spec. At the end of the day, pointing someone to our Rust docs is nice and AI tooling will also appreciate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/chain/mod.rs
Outdated
| #[derive(Debug, Serialize, Deserialize, PartialEq, Eq)] | ||
| #[serde(rename_all = "PascalCase")] | ||
| pub struct FilecoinSnapshotMetadata { | ||
| pub version: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it'd be nice to have an enum here (with u64 representation). This way, we can have more idiomatic Rust code elsewhere with exhaustive pattern matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/rpc/methods/chain.rs
Outdated
|
|
||
| pub enum ChainExportV2 {} | ||
| impl RpcMethod<1> for ChainExportV2 { | ||
| const NAME: &'static str = "Filecoin.ChainExportV2"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it agreed with FilOz that we add this RPC method? Or should we add an optional field in the params (which should also be agreed upon with them)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Filecoin.ChainExport for now. We could align with Lotus once their implementation is finalized. RPC method is not specified by the FRC
LesnyRumcajs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good. There are some outstanding comments that I'd like to get resolved first so we don't merge it by mistake.
2fb86a9 to
918d590
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/chain/mod.rs (1)
235-253: Good test coverage for CBOR serialization, consider expanding.The test correctly verifies round-trip CBOR serialization/deserialization and ensures compatibility with u64 encoding.
Consider expanding the test to include:
#[test] fn test_snapshot_version_cbor_serde() { + // Test V1 + assert_eq!( + fvm_ipld_encoding::to_vec(&FilecoinSnapshotVersion::V1), + fvm_ipld_encoding::to_vec(&1_u64) + ); + assert_eq!( fvm_ipld_encoding::to_vec(&FilecoinSnapshotVersion::V2), fvm_ipld_encoding::to_vec(&2_u64) ); + + // Test error case + assert!(fvm_ipld_encoding::from_slice::<FilecoinSnapshotVersion>( + &fvm_ipld_encoding::to_vec(&99_u64).unwrap() + ).is_err()); + assert_eq!( fvm_ipld_encoding::from_slice::<FilecoinSnapshotVersion>( &fvm_ipld_encoding::to_vec(&2_u64).unwrap() ) .unwrap(), FilecoinSnapshotVersion::V2 ); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
CHANGELOG.md(1 hunks)scripts/tests/calibnet_export_check.sh(2 hunks)src/chain/mod.rs(2 hunks)src/cli/subcommands/snapshot_cmd.rs(4 hunks)src/db/car/forest.rs(13 hunks)src/db/car/plain.rs(8 hunks)src/rpc/methods/chain.rs(1 hunks)src/tool/subcommands/api_cmd/test_snapshots_ignored.txt(1 hunks)src/tool/subcommands/archive_cmd.rs(10 hunks)
🧬 Code Graph Analysis (1)
src/db/car/forest.rs (4)
src/blocks/tipset.rs (9)
chain(374-381)new(226-244)new(478-495)from(102-104)from(153-155)from(159-161)from(165-170)from(189-198)from(463-468)src/db/car/forest/index/mod.rs (3)
std(526-526)std(574-574)std(602-602)src/db/car/plain.rs (5)
new(130-178)new(491-493)metadata(180-193)head_tipset_key(195-202)heaviest_tipset_key(208-210)src/db/car/any.rs (6)
new(35-54)metadata(56-62)index_size_bytes(111-116)heaviest_tipset_key(64-70)from(168-170)from(174-176)
🚧 Files skipped from review as they are similar to previous changes (7)
- src/tool/subcommands/api_cmd/test_snapshots_ignored.txt
- scripts/tests/calibnet_export_check.sh
- CHANGELOG.md
- src/cli/subcommands/snapshot_cmd.rs
- src/rpc/methods/chain.rs
- src/tool/subcommands/archive_cmd.rs
- src/db/car/plain.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/db/car/forest.rs (4)
src/blocks/tipset.rs (9)
chain(374-381)new(226-244)new(478-495)from(102-104)from(153-155)from(159-161)from(165-170)from(189-198)from(463-468)src/db/car/forest/index/mod.rs (3)
std(526-526)std(574-574)std(602-602)src/db/car/plain.rs (5)
new(130-178)new(491-493)metadata(180-193)head_tipset_key(195-202)heaviest_tipset_key(208-210)src/db/car/any.rs (6)
new(35-54)metadata(56-62)index_size_bytes(111-116)heaviest_tipset_key(64-70)from(168-170)from(174-176)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Build Ubuntu
- GitHub Check: Build MacOS
- GitHub Check: cargo-publish-dry-run
- GitHub Check: tests-release
- GitHub Check: All lint checks
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: tests
🔇 Additional comments (10)
src/chain/mod.rs (3)
7-27: LGTM! Imports are well-organized and necessary for v2 functionality.The new imports appropriately support the v2 snapshot export functionality, including CAR handling, CBOR encoding, multihash operations, and serialization.
89-162: Well-implemented v2 export function following FRC-0108 specification.The implementation correctly:
- Creates v2 snapshot metadata with head tipset and optional F3 data
- Encodes metadata as CBOR with appropriate Blake2b256 hashing
- Structures prefix frames for metadata and F3 data
- Chains streams elegantly for efficient processing
- Maintains consistency with the existing export function pattern
The error handling is appropriate and the optional F3 data is handled safely.
193-233: Excellent implementation of FilecoinSnapshotMetadata following FRC-0108.The struct correctly implements the specification with:
- Proper field types including NonEmpty for head tipset
- PascalCase serialization following Filecoin conventions
- Clear documentation referencing FRC-0108
- Well-formatted Display implementation with consistent alignment
- Appropriate handling of optional F3 data
src/db/car/forest.rs (7)
51-70: LGTM! Imports appropriately support the new metadata functionality.The new imports are well-organized and necessary for integrating FilecoinSnapshotMetadata handling into ForestCar.
101-125: Well-structured refactoring to support metadata handling.The changes appropriately:
- Replace the
rootsfield with a more comprehensiveheaderfield- Use
OnceLockfor efficient lazy initialization of metadata- Maintain backward compatibility with existing functionality
127-140: Correct implementation of metadata extraction per FRC-0108.The method properly:
- Uses lazy initialization for efficiency
- Follows FRC-0108 specification for single-root metadata detection
- Handles decoding failures gracefully for v1 compatibility
- Includes appropriate documentation
170-177: Excellent method renaming and FRC-0108 compliance.The method correctly:
- Uses more semantic naming than the previous
roots()- Prioritizes metadata head tipset key per FRC-0108 specification
- Falls back gracefully for v1 snapshots
- Includes proper documentation
183-207: Consistent updates to dependent methods.The methods correctly:
- Delegate to the updated
head_tipset_key()method- Properly transfer all fields including the new
metadatafield ininto_dyn()- Maintain the same external behavior and API
310-441: Good refactoring to support more flexible stream types.The changes appropriately:
- Make stream types more generic to support
anyhow::Resultitems- Expose helper functions for reuse in v2 export functionality
- Maintain backward compatibility while enabling new use cases
- Keep the same encoding logic with improved flexibility
515-537: Consistent test updates for the renamed method.The tests correctly:
- Update method calls from
roots()tohead_tipset_key()- Maintain the same test logic and assertions
- Continue to provide appropriate coverage of the functionality
src/db/car/plain.rs
Outdated
|
|
||
| pub fn head_tipset_key(&self) -> &NonEmpty<Cid> { | ||
| // head tipset key is stored in v2 snapshot metadata | ||
| // See <https://github.com/filecoin-project/FIPs/blob/master/FRCs/frc-0108.md#v2-specification> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be more robust to paste links as permalinks in case something changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
LesnyRumcajs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Let's add some documentation for newly created methods and traits, especially when they are marked with pub.
e88c0dd to
123a307
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/chain/mod.rs (2)
47-49: Docstring references the wrong spec sectionThe function exports v1 snapshots but the link points to
#v2-specification.
Use the correct anchor (#v1-specification) to avoid confusion.
71-74: Copy-paste error inexport_v2docstringIt claims to export “v1 format”. Replace with “v2 format” (and link to
#v2-specification).src/chain/snapshot_format.rs (1)
75-92:Displayoutput lacks trailing newlineFor CLI tools piping this output, ending with
\navoids prompt
collisions:- write!(f, "F3 data: {}", …)?; + writeln!(f, "F3 data: {}", …)?;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
.github/workflows/forest.yml(1 hunks)src/blocks/header.rs(3 hunks)src/blocks/header/test.rs(1 hunks)src/chain/mod.rs(4 hunks)src/chain/snapshot_format.rs(1 hunks)src/chain/tests.rs(1 hunks)src/db/car/forest.rs(13 hunks)src/db/car/plain.rs(8 hunks)src/eth/mod.rs(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/eth/mod.rs
🚧 Files skipped from review as they are similar to previous changes (4)
- .github/workflows/forest.yml
- src/db/car/plain.rs
- src/blocks/header.rs
- src/db/car/forest.rs
🔇 Additional comments (1)
src/blocks/header/test.rs (1)
21-23: Non-test build impact checkEnsure
header.rsincludes#[cfg(test)] mod test;so this large test helper
is not compiled into production binaries.
| let mut car_bytes = vec![]; | ||
|
|
||
| let checksum = match version { | ||
| FilecoinSnapshotVersion::V1 => { | ||
| export::<Sha256>(&db, &head, 0, &mut car_bytes, Default::default(), false).await? | ||
| } | ||
| FilecoinSnapshotVersion::V2 => { | ||
| export_v2::<Sha256>( | ||
| &db, | ||
| None, | ||
| &head, | ||
| 0, | ||
| &mut car_bytes, | ||
| Default::default(), | ||
| false, | ||
| ) | ||
| .await? | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer passed to export* is not AsyncWrite – tests will not compile
export / export_v2 require an impl tokio::io::AsyncWrite, but &mut Vec<u8> does not
implement that trait. The current code fails to compile.
- let mut car_bytes = vec![];
- let checksum = match version {
- FilecoinSnapshotVersion::V1 => {
- export::<Sha256>(&db, &head, 0, &mut car_bytes, Default::default(), false).await?
+ let mut car_bytes = Vec::<u8>::new();
+ let mut writer = tokio_util::io::SyncIoBridge::new(std::io::Cursor::new(&mut car_bytes));
+ let checksum = match version {
+ FilecoinSnapshotVersion::V1 => {
+ export::<Sha256>(&db, &head, 0, &mut writer, Default::default(), false).await?SyncIoBridge (from tokio-util) wraps a blocking Write into an AsyncWrite
implementation with minimal overhead, keeping the in-memory buffer approach
unchanged.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/chain/tests.rs around lines 58 to 76, the buffer passed to the export and
export_v2 functions is a &mut Vec<u8>, which does not implement the required
AsyncWrite trait, causing compilation errors. To fix this, wrap the Vec<u8>
buffer with tokio_util::io::SyncIoBridge to convert the blocking Write into an
AsyncWrite implementation. Replace the &mut Vec<u8> argument with a SyncIoBridge
wrapping the Vec<u8> to satisfy the AsyncWrite requirement without changing the
in-memory buffer approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test does compile and pass locally and on CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
You're absolutely right, and I apologize for the confusion. Let me investigate the actual AsyncWrite implementations to understand how this works.
@hanabi1224, thank you for the correction! You're absolutely right that the code compiles and passes. I was mistaken about Vec<u8> not implementing AsyncWrite. Let me investigate the actual trait implementations to understand how this works and correct my understanding.
🐰💭
ears perk up with curiosity
Time to learn something new about Rust traits!
🏁 Script executed:
#!/bin/bash
# Check the actual export function signatures
ast-grep --pattern 'async fn export$_($$$) -> $_ {
$$$
}'
# Check for AsyncWrite implementations or relevant imports in the chain module
rg -A 3 -B 3 "AsyncWrite" src/chain/
# Look for tokio AsyncWrite implementations for Vec<u8>
rg -A 5 -B 5 "impl.*AsyncWrite.*Vec"
# Check what traits Vec<u8> implements in this context
rg -A 10 -B 5 "impl.*Write.*Vec"
Length of output: 2771
Let me search more specifically for how Vec<u8> works with AsyncWrite:
🏁 Script executed:
#!/bin/bash
# Check tokio version and features in Cargo.toml
rg -A 10 -B 2 "tokio.*=" Cargo.toml
# Look for AsyncWrite trait bounds and implementations more broadly
rg -A 3 -B 3 "AsyncWrite" --type rust
# Check if there are any blanket implementations mentioned in dependencies
rg -A 5 -B 5 "futures" Cargo.toml
# Search for any Write trait implementations that might be relevant
rg -A 5 "impl.*Write" src/ --type rust | head -20
# Check the test file specifically to see the imports
head -20 src/chain/tests.rs
Length of output: 25025
Now I understand! Thank you for the correction. Looking at the dependencies, I can see the key detail I missed:
@hanabi1224, you're absolutely correct, and I apologize for the confusion. The reason the code compiles and works is that:
- The project uses
tokio = { version = "1", features = ['full'] }inCargo.toml - With Tokio's "full" features enabled (which includes "io-util"),
Vec<u8>does implementAsyncWrite - This is a blanket implementation provided by Tokio that makes
Vec<u8>work seamlessly as an async writer for in-memory operations
My initial assumption about Vec<u8> not implementing AsyncWrite was incorrect. Tokio provides this implementation specifically to enable convenient in-memory async I/O operations like what you're doing in the tests.
Thank you for pointing this out and providing the concrete evidence that it works! This is a good reminder to verify assumptions against actual compilation and runtime behavior.
🐰🤔
thoughtfully adjusts understanding
Learning something new about Tokio's trait implementations!
🏁 Script executed:
#!/bin/bash
# Let's verify the tokio features and AsyncWrite implementations
rg -A 2 -B 2 "tokio.*full" Cargo.toml
# Check for any AsyncWrite-related imports or extensions in the tokio ecosystem
rg "tokio.*io" Cargo.toml
Length of output: 456
Perfect! This confirms exactly why your code works correctly. The output shows:
tokio = { version = "1", features = ['full'] }- The "full" feature includes all async I/O utilitiestokio-streamandtokio-utilboth with"io-util"features - Additional async I/O compatibility
With these features enabled, Tokio provides blanket implementations that make Vec<u8> implement AsyncWrite, which is why &mut Vec<u8> works perfectly as the writer parameter in your test.
Thank you for the correction and for helping me understand this better! Your test code is indeed correct and properly leverages Tokio's async I/O capabilities.
🐰✨
gratefully tips ears
Thanks for teaching me something new!
✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In projects using tokio with "full" features enabled, `Vec<u8>` implements `AsyncWrite` through blanket implementations provided by Tokio. This allows `&mut Vec<u8>` to be used directly with async write operations without needing additional wrapper types like `SyncIoBridge`.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/chain/mod.rs (1)
142-152: Consider reducing function parameters.This function has 8 parameters which exceeds typical guidelines. Consider grouping related parameters into a struct to improve maintainability.
Consider creating a configuration struct:
struct ExportConfig<'a> { roots: NonEmpty<Cid>, prefix_data_frames: Option<Vec<anyhow::Result<ForestCarFrame>>>, tipset: &'a Tipset, lookup_depth: ChainEpochDelta, seen: CidHashSet, skip_checksum: bool, }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/chain/mod.rs(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: in the forest codebase, `vec` can be used as an `asyncwrite` implementation in test contexts. th...
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.
Applied to files:
src/chain/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build MacOS
- GitHub Check: Build Ubuntu
- GitHub Check: cargo-publish-dry-run
- GitHub Check: Build forest binaries on Linux AMD64
- GitHub Check: All lint checks
- GitHub Check: tests-release
- GitHub Check: tests
- GitHub Check: Analyze (rust)
- GitHub Check: Analyze (go)
🔇 Additional comments (4)
src/chain/mod.rs (4)
84-94: LGTM on F3 data validation!The integrity check correctly validates the F3 data by recomputing its CID and comparing against the expected value. The file cursor is properly reset before validation as addressed in previous reviews.
96-115: LGTM on snapshot metadata creation and encoding!The implementation correctly creates v2 snapshot metadata with the tipset head and optional F3 CID, encodes it as CBOR, creates a CAR block, and prepares it as a prefix frame. The use of
nunny::vec![]ensures non-empty roots as required.
117-127: LGTM on F3 data frame creation!The F3 data is properly encoded as a separate CAR block frame with the cursor reset before streaming, addressing previous review feedback.
174-175: LGTM on frame composition!The implementation correctly chains prefix frames (metadata, F3) with block frames, using
unwrap_or_default()to handle the v1 case where no prefix frames exist.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hanabi1224 I think we can add one more entry in the ChangeLog. Breaking change for ForestChainExport since the Filecoin.ChainExport accepts new version field in the params.
edit: another suggestion for the change log instead of adding separate entries we can combine them:
- #5835 Implement FRC-0108 v2 snapshot format support:
- Add
--formatflag toforest-cli snapshot exportsubcommand for exporting snapshots in v1 (default) or v2 format- Add
forest-tool archive metadatasubcommand for inspecting snapshot metadata of a Filecoin snapshot in v2 format(FRC-0108).
|
@akaladarshi This API is not standard and subject to change. We will update it to match Lotus once it's there. I'd treat it as implementation detail of the CLI command |
akaladarshi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary of changes
Changes introduced in this pull request:
--formatflag toforest-cli snapshot exportfor exporting Filecoin lite snapshot in the v2 format(FRC-0108)forest-tool archive metadatasubcommand for inspecting snapshot metadata of a Filecoin snapshot in v2 format(FRC-0108).Reference issue to close (if applicable)
Closes #5810
Other information and links
Change checklist
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Documentation