Skip to content

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Sep 12, 2025

Summary of changes

This PR adds forest-cli snapshot export-diff subcommand for exporting a diff snapshot from a live node, to achive what the offline tool forest-tool archive export does.

Note that, --diff-depth 900 in implied in the new tool

➜  snapshots forest-tool archive export --epoch 3008770 --depth 900 --diff 3007870 --diff-depth 900 -o diff1.forest.car.zst lite_2000_3008770.forest.car.zst
2025-09-12T11:50:16.652895Z  INFO forest::tool::subcommands::archive_cmd: looking up a tipset by epoch: 3008770
2025-09-12T11:52:29.696196Z  INFO forest::tool::subcommands::archive_cmd: exporting snapshot at location: diff1.forest.car.zst

➜  snapshots forest-cli snapshot export-diff --from 3008770 --to 3007870 -d 900 -o diff2.forest.car.zst
  Exporting diff2.forest.car.zst ... 36.05 MiB written in 5m (131.77 KiB/s)                                             Export completed.
  
➜  snapshots diff diff1.forest.car.zst diff2.forest.car.zst

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Added snapshot "export-diff" CLI command with progress display, automatic output naming, epoch/depth handling.
    • Added a remote API to request diff exports; export operations are now serialized to prevent concurrent exports.
  • Documentation

    • Changelog updated to announce the new diff export subcommand.
  • Tests

    • CI job added to validate diff snapshot exports on calibnet.
    • New test script verifies consistency between two diff export methods.
  • Refactor

    • Export streaming and export API surfaces updated to support diff export behavior.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 12, 2025

Walkthrough

Adds an end-to-end diff snapshot export flow: CLI subcommand, new RPC Forest.ChainExportDiff (with shared global export lock), archive/export plumbing changes (Arc, use of stream_chain, finality depth check), IPLD stateroot bound rename, CI job + test script, changelog update, and module visibility adjustment for archive tooling.

Changes

Cohort / File(s) Summary
CI workflow: calibnet diff export check
.github/workflows/forest.yml
Adds calibnet-export-diff-check job on ubuntu-24.04 that restores params cache, downloads forest artifact, and runs scripts/tests/calibnet_export_diff_check.sh.
Changelog
CHANGELOG.md
Replaces/updates Unreleased entry to document forest-cli snapshot export-diff (issue #6074).
Test script: calibnet diff check
scripts/tests/calibnet_export_diff_check.sh
New script that finds latest CAR snapshot epoch, runs forest-cli snapshot export-diff and forest-tool archive export to produce two diff snapshots, then diffs them.
CLI: snapshot export-diff subcommand
src/cli/subcommands/snapshot_cmd.rs
Adds ExportDiff variant to SnapshotCommands; constructs ForestChainExportDiff RPC params, handles temporary output path, progress polling task, and finalizes output file.
RPC: new diff export + shared lock
src/rpc/methods/chain.rs, src/rpc/mod.rs
Adds Forest.ChainExportDiff and ForestChainExportDiffParams; introduces global CHAIN_EXPORT_LOCK to serialize exports (shared with Forest.ChainExport); validates depth/finality and invokes archive_cmd::do_export. Registers the new method in RPC list.
Tooling: archive export API & call-sites
src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/mod.rs
Makes do_export pub and changes root param to Arc<Tipset>; switches diff streaming to stream_chain; enforces finality-based diff depth check; updates call sites/tests to pass Arc<Tipset>; makes archive_cmd pub(crate).
IPLD streaming bound change
src/ipld/util.rs
Renames stateroot_limitstateroot_limit_exclusive in ChainStream, stream_chain, and stream_graph; updates docs and boundary checks to treat the stateroot bound as exclusive.
API snapshot tests ignore
src/tool/subcommands/api_cmd/test_snapshots_ignored.txt
Adds Forest.ChainExportDiff to ignored API snapshot outputs.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as forest-cli
  participant RPC as Forest RPC
  participant Lock as CHAIN_EXPORT_LOCK
  participant Archive as archive_cmd::do_export
  participant Stream as stream_chain
  participant FS as Filesystem

  User->>CLI: snapshot export-diff --from F --to T --depth D -o path
  CLI->>RPC: Forest.ChainExportDiff{from:F,to:T,depth:D,output_path}
  RPC->>Lock: try_lock()
  alt lock acquired
    RPC->>Archive: do_export(store, Arc<TS@F>, out, epoch=F, depth=D, diff=T, diff_depth=D, force=true)
    Archive->>Stream: stream_chain(db, tipsets, stateroot_limit_exclusive=T)
    Stream-->>Archive: blocks/messages bounded by exclusive T
    Archive->>FS: write CAR.zst (temp)
    Archive-->>RPC: Ok
    RPC-->>CLI: Ok
    CLI->>FS: persist temp -> final path
    CLI-->>User: Export completed
  else busy
    RPC-->>CLI: Error: export in progress
    CLI-->>User: Failure
  end
Loading
sequenceDiagram
  autonumber
  participant Job1 as Export/ExportDiff #1
  participant Job2 as Export/ExportDiff #2
  participant Lock as CHAIN_EXPORT_LOCK

  Job1->>Lock: try_lock()
  Note right of Lock: First job acquires lock
  Lock-->>Job1: acquired
  Job2->>Lock: try_lock()
  Lock-->>Job2: error (already locked)
  Job1-->>Lock: release on completion
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

RPC

Suggested reviewers

  • akaladarshi
  • LesnyRumcajs
  • sudo-shashank

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "feat: forest-cli snapshot export-diff" is a concise, single-line summary that clearly identifies the primary change — adding a forest-cli subcommand to export diff snapshots; it matches the PR contents (CLI, RPC, archive/export logic, script, and changelog) and follows conventional commit-style prefixing for intent. This makes the main change discoverable for teammates scanning history without listing implementation details.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/snapshot-export-diff

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 049b324 and 55d935f.

📒 Files selected for processing (3)
  • CHANGELOG.md (1 hunks)
  • src/rpc/mod.rs (1 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt
  • src/rpc/mod.rs
  • CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: tests
  • GitHub Check: Build MacOS
  • GitHub Check: tests-release
  • GitHub Check: Build Ubuntu

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 marked this pull request as ready for review September 17, 2025 11:10
@hanabi1224 hanabi1224 requested a review from a team as a code owner September 17, 2025 11:10
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team September 17, 2025 11:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
.github/workflows/forest.yml (1)

621-647: Add the new job to the umbrella “integration-tests-status”.

The aggregator omits calibnet-export-diff-check, so the status job won’t reflect failures from this new check.

Apply:

       - calibnet-export-check-v2
+      - calibnet-export-diff-check
       - calibnet-no-discovery-checks
src/tool/subcommands/archive_cmd.rs (2)

262-267: Avoid borrowing a temporary Arc across .await; materialize the Arc first.

&store.into() takes a reference to a temporary Arc, which will not live across the async call and will fail to compile. Create an Arc binding and pass &arc instead.

Apply this diff:

-                let store = ManyCar::try_from(snapshot_files)?;
-                let heaviest_tipset = store.heaviest_tipset()?;
-                do_export(
-                    &store.into(),
+                let store: Arc<_> = ManyCar::try_from(snapshot_files)?.into();
+                let heaviest_tipset = store.heaviest_tipset()?;
+                do_export(
+                    &store,
                     heaviest_tipset.into(),
                     output_path,
                     epoch,
                     depth,
                     diff,
                     diff_depth,
                     force,
                 )

1168-1179: Test uses &store.into() temporary; fix lifetime as in production code.

Same borrow-of-temporary issue in test; make Arc explicit and pass &store.

Apply this diff:

-        let store = AnyCar::try_from(calibnet::DEFAULT_GENESIS).unwrap();
-        let heaviest_tipset = store.heaviest_tipset().unwrap();
-        do_export(
-            &store.into(),
+        let store: Arc<_> = AnyCar::try_from(calibnet::DEFAULT_GENESIS).unwrap().into();
+        let heaviest_tipset = store.heaviest_tipset().unwrap();
+        do_export(
+            &store,
             heaviest_tipset.into(),
             output_path.path().into(),
             Some(0),
             1,
             None,
             None,
             false,
         )
🧹 Nitpick comments (4)
CHANGELOG.md (1)

32-33: Fix link: reference the PR, not an Issue.

6074 is a PR. Use the PR URL to match the contribution guidelines in the header.

Apply:

- - [#6074](https://github.com/ChainSafe/forest/issues/6074) Added `forest-cli snapshot export-diff` subcommand for exporting a diff snapshot.
+ - [#6074](https://github.com/ChainSafe/forest/pull/6074) Added `forest-cli snapshot export-diff` subcommand for exporting a diff snapshot.
src/ipld/util.rs (1)

113-115: Doc/param naming drift: “exclusive” bound; align comments and names.

Behavior changes look correct, but docs still refer to stateroot_limit. Update to stateroot_limit_exclusive to prevent off‑by‑one confusion.

Apply:

-/// Stream all blocks that are reachable before the `stateroot_limit` epoch in a depth-first
+/// Stream all blocks that are reachable before the `stateroot_limit_exclusive` epoch (exclusive) in a depth-first
@@
-/// * `stateroot_limit` - An epoch that signifies how far back (exclusive) we need to inspect tipsets,
+/// * `stateroot_limit_exclusive` - Epoch strictly before which we stop deep inspection,

Also good call on:

  • Using > for the exclusive bound (Lines 256, 269).
  • Keeping the genesis special case (Line 269).

Also applies to: 146-150, 156-159, 165-169, 180-181, 256-265, 269-280

src/cli/subcommands/snapshot_cmd.rs (2)

44-58: Parameter UX: validate from > to and positive depth.

Defaulting depth to from - to is sensible, but add guards to fail fast on invalid inputs (e.g., from <= to, negative/zero depth).

Apply:

     ExportDiff {
@@
         depth: Option<crate::chain::ChainEpochDelta>,
     },

And in the handler (see below) add checks before issuing the RPC.


157-226: Create output directory and validate epochs before export.

  • Ensure the parent directory exists to avoid ENOENT from tempfile_in.
  • Validate from > to and that effective depth > 0 before RPC.

Apply:

@@
-                let depth = depth.unwrap_or_else(|| from - to);
+                let depth = depth.unwrap_or_else(|| from - to);
+                if from <= to {
+                    anyhow::bail!("`--from` ({from}) must be greater than `--to` ({to})");
+                }
+                if depth <= 0 {
+                    anyhow::bail!("`--depth` must be positive (got {depth})");
+                }
@@
-                let output_dir = output_path.parent().context("invalid output path")?;
+                let output_dir = output_path.parent().context("invalid output path")?;
+                std::fs::create_dir_all(output_dir)
+                    .with_context(|| format!("creating output dir {}", output_dir.display()))?;

Optional: preflight fetch Forest.Version/StateNetworkName to include chain finality in the CLI error, mirroring the server‑side check.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b65a82 and a019230.

📒 Files selected for processing (10)
  • .github/workflows/forest.yml (1 hunks)
  • CHANGELOG.md (1 hunks)
  • scripts/tests/calibnet_export_diff_check.sh (1 hunks)
  • src/cli/subcommands/snapshot_cmd.rs (3 hunks)
  • src/ipld/util.rs (6 hunks)
  • src/rpc/methods/chain.rs (4 hunks)
  • src/rpc/mod.rs (1 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1 hunks)
  • src/tool/subcommands/archive_cmd.rs (7 hunks)
  • src/tool/subcommands/mod.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code graph analysis (4)
src/cli/subcommands/snapshot_cmd.rs (3)
src/rpc/methods/chain.rs (18)
  • chain (359-359)
  • chain (390-390)
  • handle (141-153)
  • handle (167-182)
  • handle (197-211)
  • handle (226-258)
  • handle (271-279)
  • handle (292-303)
  • handle (316-404)
  • handle (417-460)
  • handle (473-497)
  • handle (513-522)
  • handle (537-542)
  • handle (557-595)
  • handle (610-626)
  • handle (641-646)
  • handle (719-730)
  • handle (748-759)
src/rpc/client.rs (3)
  • call (92-151)
  • client (114-114)
  • request (272-285)
src/db/car/forest.rs (2)
  • new_forest_car_temp_path_in (471-478)
  • std (110-110)
scripts/tests/calibnet_export_diff_check.sh (1)
scripts/tests/harness.sh (2)
  • forest_init (124-148)
  • forest_query_epoch (77-79)
src/rpc/methods/chain.rs (1)
src/tool/subcommands/archive_cmd.rs (1)
  • do_export (513-616)
src/tool/subcommands/archive_cmd.rs (4)
src/ipld/util.rs (2)
  • stream_chain (146-159)
  • stream_graph (163-169)
src/chain/store/chain_store.rs (1)
  • heaviest_tipset (221-229)
src/db/car/any.rs (1)
  • heaviest_tipset (73-79)
src/db/car/many.rs (1)
  • heaviest_tipset (141-147)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
🔇 Additional comments (10)
src/tool/subcommands/api_cmd/test_snapshots_ignored.txt (1)

81-82: LGTM; keep ignore list aligned with RPC set.

Placement after Forest.ChainExport is sensible. Please ensure the API snapshot generator and docs include the new method.

.github/workflows/forest.yml (1)

309-330: Good addition. Ensure artifacts (forest, forest-cli, forest-tool) are present in PATH.

Steps mirror other calibnet checks; looks fine.

Please confirm make install-slim-quick uploads forest-cli and forest-tool along with forest (the script needs both). If not, extend the upload glob.

src/tool/subcommands/mod.rs (1)

5-5: Visibility widening looks appropriate.

pub(crate) enables RPC to reuse archive internals without exposing them publicly.

src/rpc/mod.rs (1)

84-85: RPC registration LGTM.

ForestChainExportDiff is correctly added to the method set and will be included in OpenRPC.

src/tool/subcommands/archive_cmd.rs (4)

552-556: Off‑by‑one sanity for exclusive stateroot bound (request verification).

With an exclusive bound, diff_limit = diff_ts.epoch() - diff_depth should include exactly diff_depth state‑roots (e.g., 3007870..=3006971 for 900). Looks right; please confirm with a quick unit/integration check.


40-40: LGTM: Correctly switching diff pre-scan to stream_chain.

stream_chain with dead-link failures on by default is appropriate for computing the seen set; retaining stream_graph for the merge path is fine.


997-998: LGTM: Arc wrap of Tipset at call site.

root.into() to Arc<Tipset> matches the new do_export API.


1030-1031: LGTM: Arc wrap for diff export call.

Consistent with the new signature and usage.

src/rpc/methods/chain.rs (2)

407-461: Forest.ChainExportDiff wiring looks solid; reuse of do_export is good.

Parameter flow (from, to, depth) and defaulting diff_depth to finality match the CLI semantics. After the lock fix, this path should work as intended.

Consider adding a small RPC test to assert the lock error when a concurrent export is in progress.


1059-1067: LGTM: Params struct mirrors CLI flags and is JSON-friendly.

Field names and types align with expected usage; PathBuf is fine for server-side paths.

Comment on lines 1 to 25
#!/usr/bin/env bash
# This script is checking the correctness of
# the diff snapshot export feature.
# It requires both the `forest` and `forest-cli` binaries to be in the PATH.

set -eu

source "$(dirname "$0")/harness.sh"

forest_init "$@"

db_path=$($FOREST_TOOL_PATH db stats --chain calibnet | grep "Database path:" | cut -d':' -f2- | xargs)
snapshot=$(find "$db_path/car_db"/*.car.zst | tail -n 1)
snapshot_epoch=$(forest_query_epoch "$snapshot")

echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff"
$FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$((snapshot_epoch - 900))" -d 900 -o diff1

$FOREST_CLI_PATH shutdown --force

echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
$FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth 900 --diff "$((snapshot_epoch - 900))" --diff-depth 900 -o diff2 "$snapshot"

echo "Comparing diff snapshots"
diff diff1 diff2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Harden the script (pipefail, robust snapshot selection, and deterministic compare).

Current use of globbed find and plain diff is brittle. Add pipefail, fail fast if no snapshot, and pick the newest snapshot deterministically. Also prefer comparing normalized metadata to avoid false negatives from compression framing.

Apply:

@@
-set -eu
+set -euo pipefail
@@
-db_path=$($FOREST_TOOL_PATH db stats --chain calibnet | grep "Database path:" | cut -d':' -f2- | xargs)
-snapshot=$(find "$db_path/car_db"/*.car.zst | tail -n 1)
-snapshot_epoch=$(forest_query_epoch "$snapshot")
+db_path="$(
+  $FOREST_TOOL_PATH db stats --chain calibnet \
+    | grep -E "^Database path:" \
+    | cut -d':' -f2- \
+    | xargs
+)"
+# Pick latest snapshot by mtime; fail if none.
+snapshot="$(ls -1t "$db_path"/car_db/*.car.zst 2>/dev/null | head -n1 || true)"
+if [[ -z "${snapshot:-}" ]]; then
+  echo "Error: no *.car.zst snapshots found under: $db_path/car_db" >&2
+  exit 1
+fi
+snapshot_epoch="$(forest_query_epoch "$snapshot")"
@@
-echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff"
-$FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$((snapshot_epoch - 900))" -d 900 -o diff1
+DEPTH="${DEPTH:-900}"
+TO_EPOCH="$((snapshot_epoch - DEPTH))"
+echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff (to=$TO_EPOCH depth=$DEPTH)"
+$FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$TO_EPOCH" -d "$DEPTH" -o diff1
@@
-echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
-$FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth 900 --diff "$((snapshot_epoch - 900))" --diff-depth 900 -o diff2 "$snapshot"
+echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
+$FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth "$DEPTH" --diff "$TO_EPOCH" --diff-depth "$DEPTH" -o diff2 "$snapshot"
@@
-echo "Comparing diff snapshots"
-diff diff1 diff2
+echo "Comparing diff snapshots (metadata)"
+$FOREST_TOOL_PATH archive info diff1 > info1.txt
+$FOREST_TOOL_PATH archive info diff2 > info2.txt
+diff -u info1.txt info2.txt

Optional: if strict byte-for-byte comparison is desired and deterministic, keep a second diff diff1 diff2 step.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#!/usr/bin/env bash
# This script is checking the correctness of
# the diff snapshot export feature.
# It requires both the `forest` and `forest-cli` binaries to be in the PATH.
set -eu
source "$(dirname "$0")/harness.sh"
forest_init "$@"
db_path=$($FOREST_TOOL_PATH db stats --chain calibnet | grep "Database path:" | cut -d':' -f2- | xargs)
snapshot=$(find "$db_path/car_db"/*.car.zst | tail -n 1)
snapshot_epoch=$(forest_query_epoch "$snapshot")
echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff"
$FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$((snapshot_epoch - 900))" -d 900 -o diff1
$FOREST_CLI_PATH shutdown --force
echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
$FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth 900 --diff "$((snapshot_epoch - 900))" --diff-depth 900 -o diff2 "$snapshot"
echo "Comparing diff snapshots"
diff diff1 diff2
#!/usr/bin/env bash
# This script is checking the correctness of
# the diff snapshot export feature.
# It requires both the `forest` and `forest-cli` binaries to be in the PATH.
set -euo pipefail
source "$(dirname "$0")/harness.sh"
forest_init "$@"
db_path="$(
$FOREST_TOOL_PATH db stats --chain calibnet \
| grep -E "^Database path:" \
| cut -d':' -f2- \
| xargs
)"
# Pick latest snapshot by mtime; fail if none.
snapshot="$(ls -1t "$db_path"/car_db/*.car.zst 2>/dev/null | head -n1 || true)"
if [[ -z "${snapshot:-}" ]]; then
echo "Error: no *.car.zst snapshots found under: $db_path/car_db" >&2
exit 1
fi
snapshot_epoch="$(forest_query_epoch "$snapshot")"
DEPTH="${DEPTH:-900}"
TO_EPOCH="$((snapshot_epoch - DEPTH))"
echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff (to=$TO_EPOCH depth=$DEPTH)"
$FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$TO_EPOCH" -d "$DEPTH" -o diff1
$FOREST_CLI_PATH shutdown --force
echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
$FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth "$DEPTH" --diff "$TO_EPOCH" --diff-depth "$DEPTH" -o diff2 "$snapshot"
echo "Comparing diff snapshots (metadata)"
$FOREST_TOOL_PATH archive info diff1 > info1.txt
$FOREST_TOOL_PATH archive info diff2 > info2.txt
diff -u info1.txt info2.txt
🤖 Prompt for AI Agents
In scripts/tests/calibnet_export_diff_check.sh around lines 1 to 25, the script
is brittle: it lacks pipefail, doesn't fail if no snapshot is found, selects
snapshots non-deterministically, and uses a raw diff that can report false
differences due to compression framing. Add set -o pipefail alongside set -eu;
after finding snapshots, check and exit non-zero if none found; select the
newest snapshot deterministically (e.g. sort by mtime or use a reliable newest
selection) instead of relying on glob order; when comparing outputs, normalize
or decompress archives/metadata before diff (so compression framing won't
produce false mismatches) and keep an optional final strict byte-for-byte
comparison step if desired.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
scripts/tests/calibnet_export_diff_check.sh (1)

6-6: Good: pipefail enabled.

Thanks for adopting set -euo pipefail; this eliminates a class of silent failures in the test harness.

🧹 Nitpick comments (10)
scripts/tests/calibnet_export_diff_check.sh (4)

12-14: Harden snapshot discovery (deterministic latest, fail fast, robust parsing).

Globbing through find "$db_path/car_db"/*.car.zst | tail -n 1 is brittle and may pick an arbitrary file. Prefer mtime-based selection with an explicit “none found” guard and more robust db_path parsing.

- db_path=$($FOREST_TOOL_PATH db stats --chain calibnet | grep "Database path:" | cut -d':' -f2- | xargs)
- snapshot=$(find "$db_path/car_db"/*.car.zst | tail -n 1)
- snapshot_epoch=$(forest_query_epoch "$snapshot")
+ db_path="$(
+   $FOREST_TOOL_PATH db stats --chain calibnet \
+     | grep -E '^Database path:' \
+     | cut -d':' -f2- \
+     | xargs
+ )"
+ # Pick newest snapshot by mtime; fail clearly if none.
+ snapshot="$(ls -1t "$db_path"/car_db/*.car.zst 2>/dev/null | head -n1 || true)"
+ if [[ -z "${snapshot:-}" ]]; then
+   echo "Error: no *.car.zst snapshots found under: $db_path/car_db" >&2
+   exit 1
+ fi
+ snapshot_epoch="$(forest_query_epoch "$snapshot")"

16-23: Parametrize depth and compute target epoch once; add guards.

Make depth configurable, derive --to from it, and add a simple lower-bound guard to avoid negative epochs in edge cases.

- echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff"
- $FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$((snapshot_epoch - 900))" -d 900 -o diff1
+ DEPTH="${DEPTH:-900}"
+ TO_EPOCH="$((snapshot_epoch - DEPTH))"
+ if (( TO_EPOCH < 0 )); then
+   echo "Error: computed --to epoch ($TO_EPOCH) is negative (from=$snapshot_epoch depth=$DEPTH)" >&2
+   exit 1
+ fi
+ echo "Exporting diff snapshot @ $snapshot_epoch with forest-cli snapshot export-diff (to=$TO_EPOCH depth=$DEPTH)"
+ $FOREST_CLI_PATH snapshot export-diff --from "$snapshot_epoch" --to "$TO_EPOCH" -d "$DEPTH" -o diff1
@@
- echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
- $FOREST_TOOL_PATH archive export --epoch "$snapshot_epoch" --depth 900 --diff "$((snapshot_epoch - 900))" --diff-depth 900 -o diff2 "$snapshot"
+ echo "Exporting diff snapshot @ $snapshot_epoch with forest-tool archive export"
+ $FOREST_TOOL_PATH archive export \
+   --epoch "$snapshot_epoch" \
+   --depth "$DEPTH" \
+   --diff "$TO_EPOCH" \
+   --diff-depth "$DEPTH" \
+   -o diff2 "$snapshot"

24-25: Compare normalized metadata, not raw bytes (avoid compression framing noise).

Byte-for-byte diffs of compressed outputs can spuriously fail. Compare canonicalized metadata from archive info. Optionally keep strict compare behind a flag.

- echo "Comparing diff snapshots"
- diff diff1 diff2
+ echo "Comparing diff snapshots (metadata)"
+ $FOREST_TOOL_PATH archive info diff1 > info1.txt
+ $FOREST_TOOL_PATH archive info diff2 > info2.txt
+ diff -u info1.txt info2.txt
+ if [[ "${STRICT_BYTE_COMPARE:-0}" == "1" ]]; then
+   echo "Strict byte-by-byte compare enabled"
+   diff diff1 diff2
+ fi

6-10: Use tmpdir + trap to avoid file collisions and ensure cleanup.

Guard against parallel CI runs and leave a clean workspace.

 set -euo pipefail
 
 source "$(dirname "$0")/harness.sh"
 
 forest_init "$@"
+
+# Work in a temporary directory and clean up on exit.
+TMPDIR="$(mktemp -d -t calibnet-export-diff-XXXXXX)"
+trap 'rm -rf "$TMPDIR"' EXIT
+cd "$TMPDIR"
src/tool/subcommands/archive_cmd.rs (6)

262-275: Construct Arc once for the store; avoid temporary into() + reference.

This improves readability and avoids borrowing a temporary. It’s consistent with other call sites that already use Arc.

-                let store = ManyCar::try_from(snapshot_files)?;
-                let heaviest_tipset = store.heaviest_tipset()?;
-                do_export(
-                    &store.into(),
-                    heaviest_tipset.into(),
+                let store = Arc::new(ManyCar::try_from(snapshot_files)?);
+                let heaviest_tipset = store.heaviest_tipset()?;
+                do_export(
+                    &store,
+                    heaviest_tipset.into(),
                     output_path,
                     epoch,
                     depth,
                     diff,
                     diff_depth,
                     force,
                 )
                 .await

540-540: Prefer Arc::clone for clarity when cloning Arcs.

Makes intent explicit and avoids confusion with cloning the underlying store.

-    let index = ChainIndex::new(store.clone());
+    let index = ChainIndex::new(Arc::clone(store));

997-1005: Minor: avoid implicit into() for root in helpers.

Being explicit about Arc materialization reads clearer in these paths.

-    do_export(
-        &store,
-        root.into(),
+    let root = Arc::new(root);
+    do_export(
+        &store,
+        root,
         output_path.clone(),
         Some(epoch),
         depth,
         diff,
         diff_depth,
         force,
     )

1031-1038: Same root.into() nit as above in diff export helper.

-    do_export(
-        &store,
-        root.into(),
+    let root = Arc::new(root);
+    do_export(
+        &store,
+        root,
         output_path.clone(),
         Some(epoch),
         depth,
         diff,
         diff_depth,
         force,
     )

1171-1179: Test: construct Arc once to match production path.

Avoid borrowing a temporary; also mirrors the main code path suggested above.

-        do_export(
-            &store.into(),
-            heaviest_tipset.into(),
+        let store = Arc::new(store);
+        let heaviest_tipset = Arc::new(heaviest_tipset);
+        do_export(
+            &store,
+            heaviest_tipset,
             output_path.path().into(),
             Some(0),
             1,
             None,
             None,
             false,
         )

512-516: Change API to take Arc<impl Blockstore + Send + Sync + 'static> by value — update callers

Mirrors root: Arc<Tipset> and removes the odd &Arc borrow; update the callers below to pass the Arc (remove & or clone as needed):

  • src/rpc/methods/chain.rs:447 — do_export(&ctx.store_owned(), start_ts, output_path)
  • src/tool/subcommands/archive_cmd.rs:264 — do_export(&store.into(), heaviest_tipset.into(), output_path)
  • src/tool/subcommands/archive_cmd.rs:995 — do_export(&store, root.into(), output_path.clone())
  • src/tool/subcommands/archive_cmd.rs:1028 — do_export(&store, root.into(), output_path.clone())
  • src/tool/subcommands/archive_cmd.rs:1170 — do_export(&store.into(), heaviest_tipset.into(), output_path.path().into())

Also search for out-of-crate uses before merging.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a019230 and dcea4ab.

📒 Files selected for processing (2)
  • scripts/tests/calibnet_export_diff_check.sh (1 hunks)
  • src/tool/subcommands/archive_cmd.rs (8 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-09-16T12:55:26.955Z
Learnt from: hanabi1224
PR: ChainSafe/forest#6079
File: src/chain_sync/metrics.rs:6-7
Timestamp: 2025-09-16T12:55:26.955Z
Learning: HEAD_EPOCH references in shell scripts (like scripts/tests/calibnet_eth_mapping_check.sh) that extract data from `forest-cli info show` are unrelated to Rust metrics constants with the same name and should not be flagged when metrics cleanup is performed.

Applied to files:

  • scripts/tests/calibnet_export_diff_check.sh
🧬 Code graph analysis (2)
scripts/tests/calibnet_export_diff_check.sh (1)
scripts/tests/harness.sh (2)
  • forest_init (124-148)
  • forest_query_epoch (77-79)
src/tool/subcommands/archive_cmd.rs (3)
src/ipld/util.rs (3)
  • stream_chain (146-159)
  • stream_graph (163-169)
  • new (63-67)
src/chain/store/chain_store.rs (3)
  • heaviest_tipset (221-229)
  • new (116-142)
  • new (562-566)
src/blocks/tipset.rs (3)
  • genesis (415-452)
  • new (242-260)
  • new (494-511)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Check
🔇 Additional comments (1)
src/tool/subcommands/archive_cmd.rs (1)

552-556: Good: switched to stream_chain with dead-link failures.

Using stream_chain (fail_on_dead_links = true) is safer for constructing the diff “seen” set than stream_graph.

LesnyRumcajs
LesnyRumcajs previously approved these changes Sep 17, 2025
Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Co-authored-by: Hubert <hubert@chainsafe.io>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/cli/subcommands/snapshot_cmd.rs (1)

173-175: Nice: adopted prior depth defaulting suggestion.
Matches the earlier review note; thanks for folding it in.

🧹 Nitpick comments (5)
src/cli/subcommands/snapshot_cmd.rs (5)

44-58: Clarify flag semantics in help text.
Add that --from must be >= --to, and that depth defaults to (from - to). This reduces user confusion and mismatches with filenames.


44-58: Optional: parity with export flags.
Consider adding --dry-run/--skip-checksum (even if checksum is a no-op here) for UI parity between export and export-diff.


176-185: Make output path handling resilient (no parent, missing dirs).
Currently parent() None (e.g., "diff.car.zst") errors, and missing directories aren’t created. Default to . and ensure the directory exists.

Apply this diff:

-                let output_dir = output_path.parent().context("invalid output path")?;
+                let output_dir = output_path.parent().unwrap_or_else(|| Path::new("."));
+                std::fs::create_dir_all(output_dir)
+                    .with_context(|| format!("failed to create output dir {}", output_dir.display()))?;

Note: Consider applying the same improvement in the Export path (Lines 103–105) for consistency.


200-213: Ensure progress task cannot outlive on early errors.
The spawned reporter keeps running if the RPC call fails before reaching the abort. The next comment proposes a minimal restructure that guarantees abort/await on all paths.


214-223: Always abort/await the progress task even on RPC errors.
Restructure to handle abortion before propagating errors.

Apply this diff:

-                // Manually construct RpcRequest because snapshot export could
-                // take a few hours on mainnet
-                client
-                    .call(ForestChainExportDiff::request((params,))?.with_timeout(Duration::MAX))
-                    .await?;
-
-                handle.abort();
-                pb.finish();
-                _ = handle.await;
+                // Manually construct RpcRequest because snapshot export could
+                // take a few hours on mainnet
+                let res = client
+                    .call(ForestChainExportDiff::request((params,))?.with_timeout(Duration::MAX))
+                    .await;
+                handle.abort();
+                pb.finish();
+                _ = handle.await;
+                res?;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 791d55b and 049b324.

📒 Files selected for processing (1)
  • src/cli/subcommands/snapshot_cmd.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code graph analysis (1)
src/cli/subcommands/snapshot_cmd.rs (3)
src/rpc/methods/chain.rs (18)
  • chain (359-359)
  • chain (390-390)
  • handle (141-153)
  • handle (167-182)
  • handle (197-211)
  • handle (226-258)
  • handle (271-279)
  • handle (292-303)
  • handle (316-404)
  • handle (417-460)
  • handle (473-497)
  • handle (513-522)
  • handle (537-542)
  • handle (557-595)
  • handle (610-626)
  • handle (641-646)
  • handle (719-730)
  • handle (748-759)
src/rpc/client.rs (3)
  • call (92-151)
  • client (114-114)
  • request (272-285)
src/db/car/forest.rs (2)
  • new_forest_car_temp_path_in (471-478)
  • std (110-110)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: Check
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: All lint checks
🔇 Additional comments (4)
src/cli/subcommands/snapshot_cmd.rs (4)

9-9: LGTM: correct import for diff params.


163-171: LGTM: calibnet naming compat mirrors export path.


224-226: LGTM: atomic persist and completion message.


173-191: Require --from >= --to and derive default depth from the range; don't force equality unless the export API requires it.
ChainEpochDelta == ChainEpoch == i64, so casting (from - to) is safe. Implement: ensure from >= to, let expected_depth = (from - to) as crate::chain::ChainEpochDelta, depth = depth.unwrap_or(expected_depth), ensure depth > 0. Enforce depth == expected_depth only after confirming the export API requires exact equality.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/cli/subcommands/snapshot_cmd.rs (1)

173-175: Also enforce from > to (independent of depth).

With an explicit positive depth, from <= to currently passes CLI validation and will rely on server‑side behavior. Make intent explicit and fail early.

-                let depth = depth.unwrap_or_else(|| from - to);
-                anyhow::ensure!(depth > 0, "depth must be positive");
+                anyhow::ensure!(from > to, "`from` must be greater than `to`");
+                let depth = depth.unwrap_or_else(|| from - to);
+                anyhow::ensure!(depth > 0, "depth must be positive");
🧹 Nitpick comments (4)
src/cli/subcommands/snapshot_cmd.rs (4)

44-58: Validate epochs at parse-time to fail fast.

Enforce non‑negative epochs via clap so users get immediate feedback; keep runtime checks too.

     /// Epoch to export from
-    #[arg(long)]
+    #[arg(long, value_parser = clap::value_parser!(i64).range(0..))]
     from: i64,
     /// Epoch to diff against
-    #[arg(long)]
+    #[arg(long, value_parser = clap::value_parser!(i64).range(0..))]
     to: i64,

163-171: DRY up chain name mapping.

The calibnet mapping is duplicated in Export and ExportDiff. Consider extracting a small helper and reusing it in both branches.

Example (outside this hunk):

fn snapshot_chain_name(raw: &str) -> &str {
    if raw == calibnet::NETWORK_GENESIS_NAME {
        calibnet::NETWORK_COMMON_NAME
    } else {
        raw
    }
}

183-185: Clarify handling when output_path points to a non‑existent directory.

is_dir() returns false for non‑existent paths, so a user‑intended directory might be treated as a file path. Consider documenting this or adding a --output-dir flag to disambiguate.


224-226: Optional: emit a checksum for parity with export.

Diff exports don’t persist a checksum; consider computing SHA‑256 client‑side post‑persist (or have RPC return it) for UX consistency.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 791d55b and 049b324.

📒 Files selected for processing (1)
  • src/cli/subcommands/snapshot_cmd.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code graph analysis (1)
src/cli/subcommands/snapshot_cmd.rs (3)
src/rpc/methods/chain.rs (18)
  • chain (359-359)
  • chain (390-390)
  • handle (141-153)
  • handle (167-182)
  • handle (197-211)
  • handle (226-258)
  • handle (271-279)
  • handle (292-303)
  • handle (316-404)
  • handle (417-460)
  • handle (473-497)
  • handle (513-522)
  • handle (537-542)
  • handle (557-595)
  • handle (610-626)
  • handle (641-646)
  • handle (719-730)
  • handle (748-759)
src/rpc/client.rs (3)
  • call (92-151)
  • client (114-114)
  • request (272-285)
src/db/car/forest.rs (2)
  • new_forest_car_temp_path_in (471-478)
  • std (110-110)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Bootstrap checks - Forest
  • GitHub Check: Bootstrap checks - Lotus
  • GitHub Check: Diff snapshot export checks
  • GitHub Check: Calibnet eth mapping check
  • GitHub Check: State migrations
  • GitHub Check: Calibnet stateless mode check
  • GitHub Check: Snapshot export checks
  • GitHub Check: Calibnet kademlia checks
  • GitHub Check: Snapshot export checks v2 with F3 data
  • GitHub Check: db-migration-checks
  • GitHub Check: Calibnet api test-stateful check
  • GitHub Check: Calibnet stateless RPC check
  • GitHub Check: Calibnet no discovery checks
  • GitHub Check: Wallet tests
  • GitHub Check: Devnet checks
  • GitHub Check: Forest CLI checks
  • GitHub Check: Calibnet check
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: All lint checks
🔇 Additional comments (4)
src/cli/subcommands/snapshot_cmd.rs (4)

9-9: Import looks correct.

Param type aligns with the new RPC handler.


176-181: Filename generation LGTM.

Pattern matches the PR description and is stable/predictable.


193-212: Progress reporting LGTM.

Spinner + size poller mirrors Export path and is correctly aborted.


214-219: Unbounded per‑request timeout matches Export path.

Consistent approach; OK if the RPC can take hours.

LesnyRumcajs
LesnyRumcajs previously approved these changes Sep 18, 2025

#[allow(clippy::too_many_arguments)]
async fn do_export(
pub async fn do_export(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224 Do you think we should move the core logic of this in separate crate/file ?

I don't think we should use cli methods directly in the RPC crate, we should have separation of concerns here.

We can do this as a follow up work, as it's not really an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai please create a follow up issue for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@hanabi1224 I've created a follow-up issue to address the architectural concern raised by @akaladarshi about separating the core export logic from CLI-specific code. The issue describes the problem, proposes a solution to move the shared export functionality to a dedicated module, and includes references to this PR and comment thread.

The issue suggests refactoring the do_export function and related logic into a separate module that can be shared between CLI and RPC layers without violating separation of concerns principles.

(\   /)

( . .)
o_(")(") thoughtful rabbit organizing architectural improvements


🏁 Script executed:

#!/bin/bash

# Create a GitHub issue for the architectural improvement
gh issue create \
  --title "Refactor export logic to separate module for better separation of concerns" \
  --body "## Problem

Currently, the RPC layer directly uses CLI methods from the \`archive_cmd\` module, which violates separation of concerns principles. Specifically, the \`do_export\` function in \`src/tool/subcommands/archive_cmd.rs\` is being used by both:

- CLI commands (\`forest-tool archive export\`)
- RPC methods (\`Forest.ChainExportDiff\`)

## Proposed Solution

Move the core export logic to a separate crate or dedicated module that can be shared between CLI and RPC layers without creating architectural dependencies.

This would involve:
1. Extracting the core export functionality from \`archive_cmd.rs\`
2. Creating a new module/crate for shared export logic
3. Having both CLI and RPC layers depend on this shared module
4. Maintaining clear separation between presentation layer (CLI/RPC) and business logic (export operations)

## Context

This issue was identified during review of PR #6074 which adds \`forest-cli snapshot export-diff\` functionality.

**References:**
- PR: https://github.com/ChainSafe/forest/pull/6074
- Comment: https://github.com/ChainSafe/forest/pull/6074#discussion_r2358997339
- Requested by: @hanabi1224

## Benefits

- Better architectural separation
- Easier testing of core export logic
- Reduced coupling between CLI and RPC layers
- More maintainable codebase" \
  --assignee hanabi1224 \
  --label enhancement

Length of output: 44


@hanabi1224 I've successfully created a follow-up issue (#6075) to address the architectural concern raised by @akaladarshi. The issue outlines the need to refactor the export logic into a separate module to maintain proper separation of concerns between the CLI and RPC layers.

The issue includes:

  • Clear problem description of the current architectural coupling
  • Proposed solution with specific steps
  • Context referencing this PR and comment thread
  • Benefits of the refactoring

This will help ensure the codebase maintains clean architectural boundaries as it evolves.

(\   /)

( . .)
o_(")(") thoughtful rabbit organizing architectural improvements


🏁 Script executed:

#!/bin/bash

# Create a GitHub issue without the label since 'enhancement' doesn't exist
gh issue create \
  --title "Refactor export logic to separate module for better separation of concerns" \
  --body "## Problem

Currently, the RPC layer directly uses CLI methods from the \`archive_cmd\` module, which violates separation of concerns principles. Specifically, the \`do_export\` function in \`src/tool/subcommands/archive_cmd.rs\` is being used by both:

- CLI commands (\`forest-tool archive export\`)
- RPC methods (\`Forest.ChainExportDiff\`)

## Proposed Solution

Move the core export logic to a separate crate or dedicated module that can be shared between CLI and RPC layers without creating architectural dependencies.

This would involve:
1. Extracting the core export functionality from \`archive_cmd.rs\`
2. Creating a new module/crate for shared export logic
3. Having both CLI and RPC layers depend on this shared module
4. Maintaining clear separation between presentation layer (CLI/RPC) and business logic (export operations)

## Context

This issue was identified during review of PR #6074 which adds \`forest-cli snapshot export-diff\` functionality.

**References:**
- PR: https://github.com/ChainSafe/forest/pull/6074
- Comment: https://github.com/ChainSafe/forest/pull/6074#discussion_r2358997339
- Requested by: @hanabi1224

## Benefits

- Better architectural separation
- Easier testing of core export logic
- Reduced coupling between CLI and RPC layers
- More maintainable codebase" \
  --assignee hanabi1224

Length of output: 47


@hanabi1224 Perfect! I've successfully created follow-up issue #6096 to address the architectural concern raised by @akaladarshi about separating the core export logic from CLI-specific code.

The issue has been assigned to you and includes:

  • Clear description of the current architectural coupling problem
  • Proposed solution to extract shared export logic into a dedicated module
  • Context and references to this PR and comment thread
  • Benefits of maintaining proper separation of concerns

This will help ensure the codebase maintains clean architectural boundaries as the export functionality continues to evolve.

(\   /)

( . .)
o_(")(") organized rabbit with architectural blueprints


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

@hanabi1224 hanabi1224 added this pull request to the merge queue Sep 18, 2025
Merged via the queue into main with commit eb32d45 Sep 18, 2025
50 checks passed
@hanabi1224 hanabi1224 deleted the hm/snapshot-export-diff branch September 18, 2025 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants