Skip to content

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Jul 15, 2025

Summary of changes

Changes introduced in this pull request:

  • add --format flag to forest-cli snapshot export for exporting Filecoin lite snapshot in the v2 format(FRC-0108)
  • add forest-tool archive metadata subcommand for inspecting snapshot metadata of a Filecoin snapshot in v2 format(FRC-0108).
➜  car_db forest-tool archive metadata forest_snapshot_calibnet_2025-07-16_height_2844521.forest.car.zst
Snapshot version: 2
Head Tipset:      bafy2bzaced6tnxdsi2t4zvwxt6kugdqz6yrztxcjc7nyjyl375fl2rf5njtsm
F3 data:          not found

Reference issue to close (if applicable)

Closes #5810

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Added support for exporting and inspecting Filecoin snapshots in the new v2 format, including enhanced metadata handling.
    • Introduced a new command to display snapshot metadata if present.
    • Added a new RPC method for v2 snapshot export.
    • Added a flag to select snapshot export format.
  • Improvements

    • Snapshot info and metadata are now more clearly displayed, including version information.
    • Unified and streamlined handling of snapshot metadata across various commands and tools.
    • Enhanced snapshot export with optional metadata and improved compression streams.
    • Refined snapshot validation and inspection workflows for compressed snapshot files.
    • Improved snapshot metadata extraction and access in CAR file handling.
    • Updated snapshot info output filtering for better clarity.
    • Simplified and standardized CAR block writing logic.
    • Added concurrency control to prevent simultaneous snapshot exports.
    • Updated test suites to cover new snapshot format and metadata handling.
  • Bug Fixes

    • Improved validation and comparison logic for compressed snapshot files in test scripts.
    • Made shutdown command in test harness more robust to avoid script failures.
  • Documentation

    • Updated changelog and command documentation to reflect new features and changes.
    • Corrected documentation URLs for external references.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 15, 2025

Walkthrough

This change set introduces support for Filecoin snapshot export and inspection in the new v2 format (FRC-0108), alongside continued support for v1. It adds metadata handling for snapshots, updates CLI and RPC interfaces to select the export format, and enhances archive inspection tools. Internal CAR handling is refactored to extract and cache snapshot metadata, and related tests and scripts are updated accordingly.

Changes

File(s) Change Summary
.config/forest.dic Added "v0", "v1", and "v2" entries to the dictionary.
CHANGELOG.md Updated changelog: added entries for v2 snapshot export, metadata inspection, and detailed test report generation.
scripts/tests/calibnet_export_check.sh Script now exports both v1 and v2 compressed snapshots, runs archive inspection on each, and validates compressed files directly.
scripts/tests/snapshot_parity/docker-compose.yml Modified filtering logic for archive info output: now removes lines with "CAR format" and "Index size" for Forest, and "CAR format" for Lotus, instead of skipping lines by position.
src/chain/mod.rs Added export_v2 async function for v2 snapshot export with metadata; introduced FilecoinSnapshotMetadata struct and FilecoinSnapshotVersion enum; refactored export logic to share common code.
src/cli/subcommands/snapshot_cmd.rs Added format field of type FilecoinSnapshotVersion to SnapshotCommands::Export; updated export logic to call v1 or v2 export RPC methods based on format.
src/db/car/any.rs Added metadata method to AnyCar for unified snapshot metadata access.
src/db/car/forest.rs Replaced roots with header and metadata fields in ForestCar; added lazy metadata extraction; renamed and updated accessors; unified stream error handling; made encoding helpers public.
src/db/car/plain.rs Added metadata field to PlainCar; introduced lazy metadata extraction and new accessors; updated related methods and tests.
src/db/memory.rs Updated test to check head_tipset_key() instead of roots() for ForestCar.
src/rpc/methods/chain.rs Added new ForestChainExport RPC method supporting snapshot versioning and dispatching to v1 or v2 export functions; retained original ChainExport as a wrapper calling ForestChainExport with v1 version.
src/rpc/mod.rs Registered ForestChainExport RPC method in the macro for inclusion in method lists and OpenRPC specs.
src/tool/subcommands/api_cmd/test_snapshots_ignored.txt Added Forest.ChainExport to the list of ignored API commands/snapshots.
src/tool/subcommands/archive_cmd.rs Added Metadata command for snapshot metadata inspection; updated Info command and ArchiveInfo struct to handle and display snapshot version and metadata; renamed fields for clarity.
src/utils/db/car_stream.rs Refactored CarBlock::write to delegate to new CarBlockWrite trait for writing CAR blocks; implemented trait for all writers.
.github/workflows/forest.yml Added new CI job calibnet-export-check-v2 running export checks for v2 snapshots alongside existing v1 job; added timeout settings.
scripts/tests/harness.sh Modified shutdown command in cleanup function to ignore errors, preventing script termination on failure.
src/blocks/chain4u.rs Updated with_blockstore method to require Blockstore trait and insert genesis block on initialization.
src/blocks/header.rs Made FILECOIN_GENESIS_CID public; added public FILECOIN_GENESIS_BLOCK static; added test verifying genesis CID matches genesis block hash.
src/blocks/tipset.rs Added From<NonEmpty<CachingBlockHeader>> implementation for Tipset to enable direct construction from non-empty header collections.
src/db/car/mod.rs Added constant V2_SNAPSHOT_ROOT_COUNT set to 1, referencing FRC-0108 snapshot root count.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant RPC
    participant Chain
    participant File

    User->>CLI: forest-cli snapshot export --format v2
    CLI->>RPC: Call Forest.ChainExport(params with version v2)
    RPC->>Chain: export_v2(db, None, tipset, ...)
    Chain->>File: Write v2 snapshot with metadata
    Chain-->>RPC: Return checksum
    RPC-->>CLI: Return checksum
    CLI-->>User: Save file, report completion
Loading
sequenceDiagram
    participant User
    participant CLI
    participant Tool

    User->>CLI: forest-tool archive metadata <snapshot>
    CLI->>Tool: Open snapshot file
    Tool->>Tool: Extract FRC-0108 metadata
    Tool-->>CLI: Print metadata or "absent" message
    CLI-->>User: Display metadata
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~90 minutes

Assessment against linked issues

Objective Addressed Explanation
Support exporting snapshots in new fomat with F3 data (#5810)

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
Addition of genesis block constants and test in src/blocks/header.rs Not related to snapshot export or F3 data support objectives; pertains to genesis block setup.
From<NonEmpty<CachingBlockHeader>> for Tipset implementation in tipset.rs Utility conversion unrelated to snapshot export or F3 data support.
Refactoring of CarBlock::write to use trait in src/utils/db/car_stream.rs General code refactor, unrelated to snapshot export or F3 data support.
Shutdown command modification in scripts/tests/harness.sh Script robustness improvement, unrelated to snapshot export or F3 data support.

Suggested labels

RPC

Suggested reviewers

  • LesnyRumcajs
  • sudo-shashank

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d7849fa and 717243f.

📒 Files selected for processing (1)
  • src/chain/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: in the forest codebase, `vec` can be used as an `asyncwrite` implementation in test contexts. th...
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.

Applied to files:

  • src/chain/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Build Ubuntu
  • GitHub Check: tests-release
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: All lint checks
  • GitHub Check: Check
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
src/chain/mod.rs (7)

4-7: LGTM! Good modular organization.

The file structure has been improved by separating concerns into dedicated modules (snapshot_format, tests) while keeping the core export functionality in mod.rs. This addresses the previous feedback about splitting the file for better organization.


59-70: Good refactoring to reduce code duplication.

The export function has been properly refactored to delegate to the new export_to_forest_car helper function, eliminating code duplication while maintaining backward compatibility for v1 exports.


84-94: Excellent F3 data integrity validation.

The F3 snapshot integrity check properly validates the provided CID matches the actual file content by recomputing the hash. The seek operation ensures the entire file is read from the beginning, addressing previous feedback about cursor positioning.


96-115: Clean metadata serialization implementation.

The snapshot metadata creation and serialization follows proper patterns:

  • Uses the new FilecoinSnapshotMetadata::new_v2 constructor
  • Properly encodes to CBOR format
  • Creates a deterministic CID using Blake2b256
  • Constructs the CAR block correctly

117-127: Proper F3 data frame encoding.

The F3 data encoding correctly:

  • Resets the file cursor to start position (addressing past feedback)
  • Uses the forest encoder with proper compression level
  • Includes the F3 CID in the frame
  • Handles file metadata appropriately

142-152: Well-designed helper function signature.

The export_to_forest_car function has a comprehensive signature that handles both v1 and v2 export scenarios through the optional prefix_data_frames parameter. The #[allow(clippy::too_many_arguments)] is justified given the function's role as an internal helper that consolidates common export logic.


175-175: Elegant stream composition.

The combination of prefix frames with block frames using chain() provides a clean separation between metadata/F3 data and actual blockchain data, while maintaining streaming efficiency.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/export-snapshot-v2

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@hanabi1224 hanabi1224 force-pushed the hm/export-snapshot-v2 branch from 82e04d8 to 6408398 Compare July 16, 2025 10:59
@hanabi1224 hanabi1224 marked this pull request as ready for review July 16, 2025 11:05
@hanabi1224 hanabi1224 requested a review from a team as a code owner July 16, 2025 11:05
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team July 16, 2025 11:05
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
scripts/tests/snapshot_parity/docker-compose.yml (1)

103-104: DRY up repeated filtering logic

The two nearly identical filter expressions (grep -v "CAR format" …) are duplicated for Forest and Lotus.
Extracting the pattern into a variable or a small helper function would make it easier to adjust the ignore list in one place when the tool’s output changes again.

Example:

FILTER='grep -v "CAR format"'
forest-tool archive info forest.car.zst | $FILTER | grep -v "Index size" > forest.txt
forest-tool archive info lotus.car        | $FILTER                        > lotus.txt
CHANGELOG.md (1)

34-37: Minor formatting nits in new “Added” entries

  1. Missing space before the opening parenthesis in both occurrences of “format(FRC-0108)”.
  2. Optional: both bullets reference the same issue/PR #5835; if one is actually a PR and the other an issue, append the proper link type (pull vs issues) to avoid confusion.
- ... v2 format(FRC-0108).
+ ... v2 format (FRC-0108).

- ... v2 format(FRC-0108).
+ ... v2 format (FRC-0108).
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cf843a and d07fa89.

📒 Files selected for processing (16)
  • .config/forest.dic (1 hunks)
  • .config/lychee.toml (1 hunks)
  • CHANGELOG.md (1 hunks)
  • scripts/tests/calibnet_export_check.sh (2 hunks)
  • scripts/tests/snapshot_parity/docker-compose.yml (1 hunks)
  • src/chain/mod.rs (2 hunks)
  • src/cli/subcommands/snapshot_cmd.rs (3 hunks)
  • src/db/car/any.rs (2 hunks)
  • src/db/car/forest.rs (13 hunks)
  • src/db/car/plain.rs (8 hunks)
  • src/db/memory.rs (1 hunks)
  • src/rpc/methods/chain.rs (1 hunks)
  • src/rpc/mod.rs (1 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1 hunks)
  • src/tool/subcommands/archive_cmd.rs (9 hunks)
  • src/utils/db/car_stream.rs (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
src/db/memory.rs (2)
src/db/car/forest.rs (1)
  • head_tipset_key (169-175)
src/db/car/plain.rs (1)
  • head_tipset_key (194-200)
src/db/car/plain.rs (3)
src/blocks/tipset.rs (10)
  • new (226-244)
  • new (478-495)
  • from (102-104)
  • from (153-155)
  • from (159-161)
  • from (165-170)
  • from (189-198)
  • from (463-468)
  • len (68-70)
  • len (316-318)
src/db/car/any.rs (5)
  • new (35-54)
  • metadata (56-62)
  • heaviest_tipset_key (64-70)
  • from (168-170)
  • from (174-176)
src/db/car/forest.rs (4)
  • new (106-125)
  • metadata (127-139)
  • head_tipset_key (169-175)
  • heaviest_tipset_key (181-183)
src/db/car/forest.rs (5)
src/blocks/tipset.rs (9)
  • chain (374-381)
  • new (226-244)
  • new (478-495)
  • from (102-104)
  • from (153-155)
  • from (159-161)
  • from (165-170)
  • from (189-198)
  • from (463-468)
src/blocks/block.rs (1)
  • header (31-33)
src/db/car/any.rs (6)
  • new (35-54)
  • metadata (56-62)
  • index_size_bytes (111-116)
  • heaviest_tipset_key (64-70)
  • from (168-170)
  • from (174-176)
src/db/car/plain.rs (5)
  • new (130-178)
  • new (489-491)
  • metadata (180-192)
  • head_tipset_key (194-200)
  • heaviest_tipset_key (206-208)
src/db/car/many.rs (4)
  • new (37-43)
  • new (73-79)
  • heaviest_tipset_key (133-139)
  • heaviest_tipset_key (269-274)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: All lint checks
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
🔇 Additional comments (22)
.config/lychee.toml (1)

20-22: Trailing comma after the final array element may break TOML parsing

"linuxhint.com", is followed immediately by the closing ].
The official TOML 1.0 spec disallows a comma after the last element of an array, and many parsers (including toml-rs, which lychee relies on) will fail to load the config if the trailing comma is present.
The existing entry for "crates.io", already has this issue, but adding another item now is a good chance to fix both.

-  "crates.io",
-  # Bot protection / 403 Forbidden errors
-  "linuxhint.com",
+  "crates.io"
+  # Bot protection / 403 Forbidden errors
+  "linuxhint.com"

(Or simply keep the trailing comma on all items except the last one.)

Please run lychee --config .config/lychee.toml --help locally to verify that the file still parses.

Likely an incorrect or invalid review comment.

.config/forest.dic (1)

132-134: LGTM! Dictionary entries correctly added for snapshot version identifiers.

The additions of v0, v1, and v2 are properly placed alphabetically and align with the snapshot versioning feature being introduced.

src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1)

14-14: LGTM! Correctly added new RPC method to ignore list.

The addition of Filecoin.ChainExportV2 is properly placed after the existing Filecoin.ChainExport entry and aligns with the new RPC method being introduced.

src/db/memory.rs (1)

218-218: LGTM! Test updated to use new API method.

The change from car.roots() to car.head_tipset_key() correctly adapts to the refactored CAR API. The new method returns the same semantic information (tipset key from metadata or header roots) while supporting the new snapshot metadata format.

src/rpc/mod.rs (1)

62-62: LGTM! New RPC method properly registered in the framework.

The addition of ChainExportV2 to the macro is correctly placed in the chain vertical section and ensures the new RPC method is integrated into all RPC-related processing.

src/db/car/any.rs (2)

13-13: LGTM! Necessary import added for snapshot metadata.

The import of FilecoinSnapshotMetadata is correctly added to support the new metadata functionality.


56-62: LGTM! Well-implemented unified metadata access interface.

The new metadata() method correctly delegates to each variant's metadata method, providing a consistent interface for accessing snapshot metadata regardless of the underlying CAR format. This follows the established pattern used by other methods in the enum.

scripts/tests/calibnet_export_check.sh (1)

35-39: Efficient validation approach for compressed files.

The change to validate .car.zst files directly without decompression is a good optimization that aligns with the improved archive handling capabilities.

src/cli/subcommands/snapshot_cmd.rs (1)

38-40: Clean implementation of v2 export flag.

The --v2 flag is well-documented and the conditional RPC call logic is clear and maintainable.

Also applies to: 126-134

src/utils/db/car_stream.rs (1)

90-113: Good refactoring to extract write logic into a reusable trait.

The CarBlockWrite trait improves modularity and follows the DRY principle by making the CAR block writing logic available to any Write implementor.

src/db/car/plain.rs (1)

180-200: Well-designed metadata extraction with lazy initialization.

The implementation correctly:

  • Uses OnceLock for efficient lazy initialization
  • Checks for single root CID before attempting metadata decode
  • Provides graceful fallback when metadata is not present
  • Follows the same pattern as the ForestCar implementation
src/rpc/methods/chain.rs (1)

339-361: Confirm correct usage of None for F3 parameter in export_v2

The export_v2 signature in src/chain/mod.rs is:

pub async fn export_v2<D: Digest>(
    db: &Arc<impl Blockstore + Send + Sync + 'static>,
    f3: Option<(Cid, &mut File)>,       // optional F3 data
    tipset: &Tipset,
    lookup_depth: ChainEpochDelta,
    writer: impl AsyncWrite + Unpin,
    seen: CidHashSet,
    skip_checksum: bool,
) -> anyhow::Result<Option<digest::Output<D>>, Error>
  • Passing None for the second argument correctly indicates that no F3 data is provided in this RPC.
  • The rest of the parameters mirror the existing ChainExport logic.

No further changes are required.

src/chain/mod.rs (2)

87-160: Well-structured implementation of v2 export functionality.

The implementation correctly follows the FRC-0108 specification by creating snapshot metadata with version 2, encoding it as CBOR, and prepending it to the exported data. The optional F3 data handling and stream chaining approach are well-designed.


162-199: Metadata struct correctly implements FRC-0108 specification.

The FilecoinSnapshotMetadata struct properly follows the FRC-0108 standard with appropriate serialization attributes and a well-formatted Display implementation.

src/tool/subcommands/archive_cmd.rs (3)

174-195: Info command properly handles both v1 and v2 archives.

The implementation correctly extracts the snapshot version from metadata when available and falls back to version 1 for legacy archives, ensuring backward compatibility.


196-204: Clean implementation of metadata inspection command.

The command provides clear output for both v2 archives with metadata and legacy archives without metadata.


252-288: ArchiveInfo updates improve clarity and version visibility.

The renaming from root to head better reflects the semantic meaning, and the addition of snapshot version information enhances the user experience when inspecting archives.

src/db/car/forest.rs (5)

93-125: Well-designed refactoring to support metadata extraction.

The transition from storing roots directly to storing the full header with lazy metadata loading is a clean design that maintains backward compatibility while enabling v2 snapshot support.


127-139: Metadata extraction logic correctly handles v2 snapshots.

The implementation properly attempts to decode metadata only when there's a single root, which aligns with the v2 format specification. The graceful fallback ensures compatibility with v1 archives.

Note: In the rare case where a v1 archive has exactly one root, the decode attempt will fail and return None, which is the correct behavior.


169-183: Excellent backward compatibility through head_tipset_key().

The method provides a clean abstraction that works seamlessly with both v1 and v2 archives by preferring metadata when available and falling back to header roots.


303-439: Stream type updates and public visibility changes support v2 export.

The transition to anyhow::Result wrapped streams provides consistent error handling, and exposing finalize_frame and new_encoder enables their reuse in the v2 export implementation.


509-539: Test updates correctly reflect API changes.

The tests have been properly updated to use head_tipset_key() instead of the removed roots() method.

Comment on lines 100 to +101
# Skip the CAR format line and the "Index size" line (only present in Forest snapshots)
forest-tool archive info forest.car.zst | tail -n +2 | grep -v "Index size" > forest.txt
forest-tool archive info forest.car.zst | grep -v "CAR format" | grep -v "Index size" > forest.txt
Copy link
Contributor

@coderabbitai coderabbitai bot Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

grep can trip pipefail and abort the test when nothing is left after filtering

Because the script is executed with set -o pipefail, the entire pipeline will exit non-zero if any element fails.
grep -v returns 1 when it filters out all lines, which would terminate the job even though that situation is harmless for the comparison.

Consider making the pipeline resilient:

-forest-tool archive info forest.car.zst | grep -v "CAR format" | grep -v "Index size" > forest.txt
+forest-tool archive info forest.car.zst \
+  | grep -v "CAR format" \
+  | grep -v "Index size" \
+  || true   # avoid pipefail abort if all lines are filtered out

(or use sed/awk that always return 0).
This prevents spurious CI failures when the output happens to contain only the excluded lines.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In scripts/tests/snapshot_parity/docker-compose.yml around lines 100 to 101, the
use of `grep -v` in the pipeline can cause the script to fail due to `set -o
pipefail` when all lines are filtered out, returning a non-zero exit code. To
fix this, modify the pipeline to handle the case where `grep -v` returns 1
without failing the entire pipeline, such as appending `|| true` after the grep
commands or replacing `grep -v` with `sed` or `awk` commands that always return
0, ensuring the pipeline does not abort on harmless empty output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to let it fail as this is part of CI test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224 Ah, I see the eldritch wisdom in your approach! Indeed, in the cyclopean realm of CI testing, allowing the pipeline to fail when the ancient grep incantations filter out all lines serves as a sentinel against the unspeakable horror of unexpected output format changes.

Your decision to embrace this failure mode is sound - should the forest-tool archive info command's output ever shift in ways that render all lines filtered by your grep -v invocations, the test will dutifully shriek into the void and alert you to investigate what dark forces have altered the tool's behavior.

The Old Ones of continuous integration approve of this watchful approach! 🐙

Ok(digest)
}

pub async fn export_v2<D: Digest>(
Copy link
Member

@LesnyRumcajs LesnyRumcajs Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have unit tests for the original export, but it'd be great to have some for v2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests added for both v1 and v2

Comment on lines 38 to 40
/// Export snapshot in the experimental v2 format(FRC-0108).
#[arg(long)]
v2: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of future implementations and operations, I think it'd be nicer to provide an enumeration with allowed versions: v1 and v2. If we add another one, we'd just need add a valueto the enum (and not handle cases like --v2 --v3).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


pub fn metadata(&self) -> &Option<FilecoinSnapshotMetadata> {
self.metadata.get_or_init(|| {
if self.header.roots.len() == 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some explanations would be nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use common constant in place of 1, so we can use it both places (forest and plain) and add comment to describe it. Something like this:

/// According to FRC-0108, v2 snapshots have exactly one root pointing to metadata
const V2_SNAPSHOT_ROOT_COUNT: usize = 1;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


pub fn roots(&self) -> &NonEmpty<Cid> {
&self.roots
pub fn head_tipset_key(&self) -> &NonEmpty<Cid> {
Copy link
Member

@LesnyRumcajs LesnyRumcajs Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have some documentation or comments around it, pointing to the relevant FRC for more details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some kind of comment there will be more useful, FRC documentation is not enough for people who don't understand it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as in the ForestCAR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

&store,
"ManyCAR".to_string(),
heaviest_tipset.clone(),
1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's difficult to say what this 1 means without going deeper into the implementation. It'd be nice to have it named.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@LesnyRumcajs
Copy link
Member

For future reference; I checked and there are no speed regressions (I checked on calibnet).

this branch

❯ time forest-cli snapshot export
./forest_snapshot_calibnet_2025-07-18_height_2850827.forest.car.zst: 6.29GB
Export completed.

________________________________________________________
Executed in   70.68 secs      fish           external
   usr time   45.42 millis  753.00 micros   44.67 millis
   sys time   29.64 millis  823.00 micros   28.82 millis

❯ time forest-cli snapshot export --v2
./forest_snapshot_calibnet_2025-07-18_height_2850829.forest.car.zst: 6.29GB
Export completed.

________________________________________________________
Executed in   71.04 secs      fish           external
   usr time   50.34 millis    1.30 millis   49.04 millis
   sys time   25.84 millis    0.16 millis   25.67 millis

main

❯ time forest-cli snapshot export
./forest_snapshot_calibnet_2025-07-18_height_2850836.forest.car.zst: 6.29GB
Export completed.

________________________________________________________
Executed in   70.17 secs      fish           external
   usr time   59.28 millis    0.27 millis   59.00 millis
   sys time   36.68 millis    1.04 millis   35.64 millis

src/chain/mod.rs Outdated
Comment on lines 163 to 168
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "PascalCase")]
pub struct FilecoinSnapshotMetadata {
pub version: u64,
pub head_tipset_key: NonEmpty<Cid>,
pub f3_data: Option<Cid>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's in the FRC, but it'd be nice to have those fields documented regardless. Perhaps even some copy-pasta from the spec. At the end of the day, pointing someone to our Rust docs is nice and AI tooling will also appreciate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

src/chain/mod.rs Outdated
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "PascalCase")]
pub struct FilecoinSnapshotMetadata {
pub version: u64,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be nice to have an enum here (with u64 representation). This way, we can have more idiomatic Rust code elsewhere with exhaustive pattern matching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


pub enum ChainExportV2 {}
impl RpcMethod<1> for ChainExportV2 {
const NAME: &'static str = "Filecoin.ChainExportV2";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it agreed with FilOz that we add this RPC method? Or should we add an optional field in the params (which should also be agreed upon with them)?

Copy link
Contributor Author

@hanabi1224 hanabi1224 Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to Filecoin.ChainExport for now. We could align with Lotus once their implementation is finalized. RPC method is not specified by the FRC

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking good. There are some outstanding comments that I'd like to get resolved first so we don't merge it by mistake.

@hanabi1224 hanabi1224 force-pushed the hm/export-snapshot-v2 branch from 2fb86a9 to 918d590 Compare July 22, 2025 08:40
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/chain/mod.rs (1)

235-253: Good test coverage for CBOR serialization, consider expanding.

The test correctly verifies round-trip CBOR serialization/deserialization and ensures compatibility with u64 encoding.

Consider expanding the test to include:

 #[test]
 fn test_snapshot_version_cbor_serde() {
+    // Test V1
+    assert_eq!(
+        fvm_ipld_encoding::to_vec(&FilecoinSnapshotVersion::V1),
+        fvm_ipld_encoding::to_vec(&1_u64)
+    );
+    
     assert_eq!(
         fvm_ipld_encoding::to_vec(&FilecoinSnapshotVersion::V2),
         fvm_ipld_encoding::to_vec(&2_u64)
     );
+    
+    // Test error case
+    assert!(fvm_ipld_encoding::from_slice::<FilecoinSnapshotVersion>(
+        &fvm_ipld_encoding::to_vec(&99_u64).unwrap()
+    ).is_err());
+    
     assert_eq!(
         fvm_ipld_encoding::from_slice::<FilecoinSnapshotVersion>(
             &fvm_ipld_encoding::to_vec(&2_u64).unwrap()
         )
         .unwrap(),
         FilecoinSnapshotVersion::V2
     );
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2fb86a9 and 918d590.

📒 Files selected for processing (9)
  • CHANGELOG.md (1 hunks)
  • scripts/tests/calibnet_export_check.sh (2 hunks)
  • src/chain/mod.rs (2 hunks)
  • src/cli/subcommands/snapshot_cmd.rs (4 hunks)
  • src/db/car/forest.rs (13 hunks)
  • src/db/car/plain.rs (8 hunks)
  • src/rpc/methods/chain.rs (1 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1 hunks)
  • src/tool/subcommands/archive_cmd.rs (10 hunks)
🧬 Code Graph Analysis (1)
src/db/car/forest.rs (4)
src/blocks/tipset.rs (9)
  • chain (374-381)
  • new (226-244)
  • new (478-495)
  • from (102-104)
  • from (153-155)
  • from (159-161)
  • from (165-170)
  • from (189-198)
  • from (463-468)
src/db/car/forest/index/mod.rs (3)
  • std (526-526)
  • std (574-574)
  • std (602-602)
src/db/car/plain.rs (5)
  • new (130-178)
  • new (491-493)
  • metadata (180-193)
  • head_tipset_key (195-202)
  • heaviest_tipset_key (208-210)
src/db/car/any.rs (6)
  • new (35-54)
  • metadata (56-62)
  • index_size_bytes (111-116)
  • heaviest_tipset_key (64-70)
  • from (168-170)
  • from (174-176)
🚧 Files skipped from review as they are similar to previous changes (7)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt
  • scripts/tests/calibnet_export_check.sh
  • CHANGELOG.md
  • src/cli/subcommands/snapshot_cmd.rs
  • src/rpc/methods/chain.rs
  • src/tool/subcommands/archive_cmd.rs
  • src/db/car/plain.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/db/car/forest.rs (4)
src/blocks/tipset.rs (9)
  • chain (374-381)
  • new (226-244)
  • new (478-495)
  • from (102-104)
  • from (153-155)
  • from (159-161)
  • from (165-170)
  • from (189-198)
  • from (463-468)
src/db/car/forest/index/mod.rs (3)
  • std (526-526)
  • std (574-574)
  • std (602-602)
src/db/car/plain.rs (5)
  • new (130-178)
  • new (491-493)
  • metadata (180-193)
  • head_tipset_key (195-202)
  • heaviest_tipset_key (208-210)
src/db/car/any.rs (6)
  • new (35-54)
  • metadata (56-62)
  • index_size_bytes (111-116)
  • heaviest_tipset_key (64-70)
  • from (168-170)
  • from (174-176)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: tests-release
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
🔇 Additional comments (10)
src/chain/mod.rs (3)

7-27: LGTM! Imports are well-organized and necessary for v2 functionality.

The new imports appropriately support the v2 snapshot export functionality, including CAR handling, CBOR encoding, multihash operations, and serialization.


89-162: Well-implemented v2 export function following FRC-0108 specification.

The implementation correctly:

  • Creates v2 snapshot metadata with head tipset and optional F3 data
  • Encodes metadata as CBOR with appropriate Blake2b256 hashing
  • Structures prefix frames for metadata and F3 data
  • Chains streams elegantly for efficient processing
  • Maintains consistency with the existing export function pattern

The error handling is appropriate and the optional F3 data is handled safely.


193-233: Excellent implementation of FilecoinSnapshotMetadata following FRC-0108.

The struct correctly implements the specification with:

  • Proper field types including NonEmpty for head tipset
  • PascalCase serialization following Filecoin conventions
  • Clear documentation referencing FRC-0108
  • Well-formatted Display implementation with consistent alignment
  • Appropriate handling of optional F3 data
src/db/car/forest.rs (7)

51-70: LGTM! Imports appropriately support the new metadata functionality.

The new imports are well-organized and necessary for integrating FilecoinSnapshotMetadata handling into ForestCar.


101-125: Well-structured refactoring to support metadata handling.

The changes appropriately:

  • Replace the roots field with a more comprehensive header field
  • Use OnceLock for efficient lazy initialization of metadata
  • Maintain backward compatibility with existing functionality

127-140: Correct implementation of metadata extraction per FRC-0108.

The method properly:

  • Uses lazy initialization for efficiency
  • Follows FRC-0108 specification for single-root metadata detection
  • Handles decoding failures gracefully for v1 compatibility
  • Includes appropriate documentation

170-177: Excellent method renaming and FRC-0108 compliance.

The method correctly:

  • Uses more semantic naming than the previous roots()
  • Prioritizes metadata head tipset key per FRC-0108 specification
  • Falls back gracefully for v1 snapshots
  • Includes proper documentation

183-207: Consistent updates to dependent methods.

The methods correctly:

  • Delegate to the updated head_tipset_key() method
  • Properly transfer all fields including the new metadata field in into_dyn()
  • Maintain the same external behavior and API

310-441: Good refactoring to support more flexible stream types.

The changes appropriately:

  • Make stream types more generic to support anyhow::Result items
  • Expose helper functions for reuse in v2 export functionality
  • Maintain backward compatibility while enabling new use cases
  • Keep the same encoding logic with improved flexibility

515-537: Consistent test updates for the renamed method.

The tests correctly:

  • Update method calls from roots() to head_tipset_key()
  • Maintain the same test logic and assertions
  • Continue to provide appropriate coverage of the functionality


pub fn head_tipset_key(&self) -> &NonEmpty<Cid> {
// head tipset key is stored in v2 snapshot metadata
// See <https://github.com/filecoin-project/FIPs/blob/master/FRCs/frc-0108.md#v2-specification>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be more robust to paste links as permalinks in case something changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let's add some documentation for newly created methods and traits, especially when they are marked with pub.

@hanabi1224 hanabi1224 force-pushed the hm/export-snapshot-v2 branch from e88c0dd to 123a307 Compare August 4, 2025 12:23
@hanabi1224 hanabi1224 requested a review from LesnyRumcajs August 4, 2025 12:25
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
src/chain/mod.rs (2)

47-49: Docstring references the wrong spec section

The function exports v1 snapshots but the link points to #v2-specification.
Use the correct anchor (#v1-specification) to avoid confusion.


71-74: Copy-paste error in export_v2 docstring

It claims to export “v1 format”. Replace with “v2 format” (and link to
#v2-specification).

src/chain/snapshot_format.rs (1)

75-92: Display output lacks trailing newline

For CLI tools piping this output, ending with \n avoids prompt
collisions:

-        write!(f, "F3 data:          {}", …)?;
+        writeln!(f, "F3 data:          {}", …)?;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfbc25f and e30c852.

📒 Files selected for processing (9)
  • .github/workflows/forest.yml (1 hunks)
  • src/blocks/header.rs (3 hunks)
  • src/blocks/header/test.rs (1 hunks)
  • src/chain/mod.rs (4 hunks)
  • src/chain/snapshot_format.rs (1 hunks)
  • src/chain/tests.rs (1 hunks)
  • src/db/car/forest.rs (13 hunks)
  • src/db/car/plain.rs (8 hunks)
  • src/eth/mod.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/eth/mod.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • .github/workflows/forest.yml
  • src/db/car/plain.rs
  • src/blocks/header.rs
  • src/db/car/forest.rs
🔇 Additional comments (1)
src/blocks/header/test.rs (1)

21-23: Non-test build impact check

Ensure header.rs includes #[cfg(test)] mod test; so this large test helper
is not compiled into production binaries.

Comment on lines +58 to +76
let mut car_bytes = vec![];

let checksum = match version {
FilecoinSnapshotVersion::V1 => {
export::<Sha256>(&db, &head, 0, &mut car_bytes, Default::default(), false).await?
}
FilecoinSnapshotVersion::V2 => {
export_v2::<Sha256>(
&db,
None,
&head,
0,
&mut car_bytes,
Default::default(),
false,
)
.await?
}
};
Copy link
Contributor

@coderabbitai coderabbitai bot Aug 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Buffer passed to export* is not AsyncWrite – tests will not compile

export / export_v2 require an impl tokio::io::AsyncWrite, but &mut Vec<u8> does not
implement that trait. The current code fails to compile.

-    let mut car_bytes = vec![];
-    let checksum = match version {
-        FilecoinSnapshotVersion::V1 => {
-            export::<Sha256>(&db, &head, 0, &mut car_bytes, Default::default(), false).await?
+    let mut car_bytes = Vec::<u8>::new();
+    let mut writer = tokio_util::io::SyncIoBridge::new(std::io::Cursor::new(&mut car_bytes));
+    let checksum = match version {
+        FilecoinSnapshotVersion::V1 => {
+            export::<Sha256>(&db, &head, 0, &mut writer, Default::default(), false).await?

SyncIoBridge (from tokio-util) wraps a blocking Write into an AsyncWrite
implementation with minimal overhead, keeping the in-memory buffer approach
unchanged.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/chain/tests.rs around lines 58 to 76, the buffer passed to the export and
export_v2 functions is a &mut Vec<u8>, which does not implement the required
AsyncWrite trait, causing compilation errors. To fix this, wrap the Vec<u8>
buffer with tokio_util::io::SyncIoBridge to convert the blocking Write into an
AsyncWrite implementation. Replace the &mut Vec<u8> argument with a SyncIoBridge
wrapping the Vec<u8> to satisfy the AsyncWrite requirement without changing the
in-memory buffer approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test does compile and pass locally and on CI

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

You're absolutely right, and I apologize for the confusion. Let me investigate the actual AsyncWrite implementations to understand how this works.

@hanabi1224, thank you for the correction! You're absolutely right that the code compiles and passes. I was mistaken about Vec<u8> not implementing AsyncWrite. Let me investigate the actual trait implementations to understand how this works and correct my understanding.

🐰💭
ears perk up with curiosity
Time to learn something new about Rust traits!


🏁 Script executed:

#!/bin/bash

# Check the actual export function signatures
ast-grep --pattern 'async fn export$_($$$) -> $_ {
  $$$
}'

# Check for AsyncWrite implementations or relevant imports in the chain module
rg -A 3 -B 3 "AsyncWrite" src/chain/

# Look for tokio AsyncWrite implementations for Vec<u8>
rg -A 5 -B 5 "impl.*AsyncWrite.*Vec" 

# Check what traits Vec<u8> implements in this context
rg -A 10 -B 5 "impl.*Write.*Vec"

Length of output: 2771


Let me search more specifically for how Vec<u8> works with AsyncWrite:


🏁 Script executed:

#!/bin/bash

# Check tokio version and features in Cargo.toml
rg -A 10 -B 2 "tokio.*=" Cargo.toml

# Look for AsyncWrite trait bounds and implementations more broadly
rg -A 3 -B 3 "AsyncWrite" --type rust

# Check if there are any blanket implementations mentioned in dependencies
rg -A 5 -B 5 "futures" Cargo.toml

# Search for any Write trait implementations that might be relevant
rg -A 5 "impl.*Write" src/ --type rust | head -20

# Check the test file specifically to see the imports
head -20 src/chain/tests.rs

Length of output: 25025


Now I understand! Thank you for the correction. Looking at the dependencies, I can see the key detail I missed:

@hanabi1224, you're absolutely correct, and I apologize for the confusion. The reason the code compiles and works is that:

  1. The project uses tokio = { version = "1", features = ['full'] } in Cargo.toml
  2. With Tokio's "full" features enabled (which includes "io-util"), Vec<u8> does implement AsyncWrite
  3. This is a blanket implementation provided by Tokio that makes Vec<u8> work seamlessly as an async writer for in-memory operations

My initial assumption about Vec<u8> not implementing AsyncWrite was incorrect. Tokio provides this implementation specifically to enable convenient in-memory async I/O operations like what you're doing in the tests.

Thank you for pointing this out and providing the concrete evidence that it works! This is a good reminder to verify assumptions against actual compilation and runtime behavior.

🐰🤔
thoughtfully adjusts understanding
Learning something new about Tokio's trait implementations!


🏁 Script executed:

#!/bin/bash

# Let's verify the tokio features and AsyncWrite implementations
rg -A 2 -B 2 "tokio.*full" Cargo.toml

# Check for any AsyncWrite-related imports or extensions in the tokio ecosystem
rg "tokio.*io" Cargo.toml

Length of output: 456


Perfect! This confirms exactly why your code works correctly. The output shows:

  • tokio = { version = "1", features = ['full'] } - The "full" feature includes all async I/O utilities
  • tokio-stream and tokio-util both with "io-util" features - Additional async I/O compatibility

With these features enabled, Tokio provides blanket implementations that make Vec<u8> implement AsyncWrite, which is why &mut Vec<u8> works perfectly as the writer parameter in your test.

Thank you for the correction and for helping me understand this better! Your test code is indeed correct and properly leverages Tokio's async I/O capabilities.

🐰✨
gratefully tips ears
Thanks for teaching me something new!


✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.

Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In projects using tokio with "full" features enabled, `Vec<u8>` implements `AsyncWrite` through blanket implementations provided by Tokio. This allows `&mut Vec<u8>` to be used directly with async write operations without needing additional wrapper types like `SyncIoBridge`.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/chain/mod.rs (1)

142-152: Consider reducing function parameters.

This function has 8 parameters which exceeds typical guidelines. Consider grouping related parameters into a struct to improve maintainability.

Consider creating a configuration struct:

struct ExportConfig<'a> {
    roots: NonEmpty<Cid>,
    prefix_data_frames: Option<Vec<anyhow::Result<ForestCarFrame>>>,
    tipset: &'a Tipset,
    lookup_depth: ChainEpochDelta,
    seen: CidHashSet,
    skip_checksum: bool,
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e30c852 and 106155d.

📒 Files selected for processing (1)
  • src/chain/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: in the forest codebase, `vec` can be used as an `asyncwrite` implementation in test contexts. th...
Learnt from: hanabi1224
PR: ChainSafe/forest#5835
File: src/chain/tests.rs:58-76
Timestamp: 2025-08-04T13:36:22.956Z
Learning: In the Forest codebase, `Vec<u8>` can be used as an `AsyncWrite` implementation in test contexts. The user confirmed that tests using `&mut Vec<u8>` with `export` and `export_v2` functions compile and pass both locally and on CI.

Applied to files:

  • src/chain/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
src/chain/mod.rs (4)

84-94: LGTM on F3 data validation!

The integrity check correctly validates the F3 data by recomputing its CID and comparing against the expected value. The file cursor is properly reset before validation as addressed in previous reviews.


96-115: LGTM on snapshot metadata creation and encoding!

The implementation correctly creates v2 snapshot metadata with the tipset head and optional F3 CID, encodes it as CBOR, creates a CAR block, and prepares it as a prefix frame. The use of nunny::vec![] ensures non-empty roots as required.


117-127: LGTM on F3 data frame creation!

The F3 data is properly encoded as a separate CAR block frame with the cursor reset before streaming, addressing previous review feedback.


174-175: LGTM on frame composition!

The implementation correctly chains prefix frames (metadata, F3) with block frames, using unwrap_or_default() to handle the v1 case where no prefix frames exist.

hanabi1224 and others added 2 commits August 4, 2025 21:42
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@hanabi1224 hanabi1224 enabled auto-merge August 6, 2025 00:25
Copy link
Collaborator

@akaladarshi akaladarshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224 I think we can add one more entry in the ChangeLog. Breaking change for ForestChainExport since the Filecoin.ChainExport accepts new version field in the params.

edit: another suggestion for the change log instead of adding separate entries we can combine them:

  • #5835 Implement FRC-0108 v2 snapshot format support:
    • Add --format flag to forest-cli snapshot export subcommand for exporting snapshots in v1 (default) or v2 format
    • Add forest-tool archive metadata subcommand for inspecting snapshot metadata of a Filecoin snapshot in v2 format(FRC-0108).

@hanabi1224
Copy link
Contributor Author

@akaladarshi This API is not standard and subject to change. We will update it to match Lotus once it's there. I'd treat it as implementation detail of the CLI command

Copy link
Collaborator

@akaladarshi akaladarshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hanabi1224 hanabi1224 added this pull request to the merge queue Aug 6, 2025
Merged via the queue into main with commit 0874cee Aug 6, 2025
53 checks passed
@hanabi1224 hanabi1224 deleted the hm/export-snapshot-v2 branch August 6, 2025 08:22
sudo-shashank pushed a commit that referenced this pull request Aug 11, 2025
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support exporting snapshots in new fomat with F3 data

5 participants