chore: remove unordered chain traversal #6014

hanabi1224 · 2025-08-28T14:26:15Z

Summary of changes

This PR removes unordered chain traversal for its performance gain not meeting expectation.

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes #6013

Change checklist

I have performed a self-review of my own code,
I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
I have added tests that prove my fix is effective or that my feature works (if possible),
I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

Refactor
- Removed unordered snapshot export option; export and benchmarking now use a deterministic DFS/ordered stream.
- Deleted unordered graph traversal benchmark variant.
Tests
- Removed the calibnet unordered snapshot end-to-end test.
Chores
- Removed CI job and its integration-status dependency for unordered export checks.
Documentation
- Removed related CLI benchmark command from docs and changelog entries updated.

coderabbitai · 2025-08-28T14:26:23Z

Walkthrough

Removes unordered export/traversal across the codebase: deletes CI job and calibnet unordered export test script; drops unordered flags and parameters from CLI and RPC; removes unordered stream types/functions; refactors IPLD streaming to a deterministic DFS-based ChainStream; updates tool commands and benchmarks to use stream_graph and remove unordered variants.

Changes

Cohort / File(s)	Summary
CI & test script `.github/workflows/forest.yml`, `scripts/tests/calibnet_export_unordered_check.sh`	Removed CI job and shell script that performed calibnet unordered snapshot export checks.
Chain export API `src/chain/mod.rs`	Dropped `unordered` option and import; export always uses `stream_chain(...).with_seen(...)`; `ExportOptions` no longer contains `unordered`.
CLI & RPC `src/cli/subcommands/snapshot_cmd.rs`, `src/rpc/methods/chain.rs`	Removed `--unordered`/`unordered` from Snapshot export CLI and `ForestChainExportParams`; RPC no longer accepts or forwards unordered.
IPLD traversal refactor `src/ipld/util.rs`	Replaced unordered multithreaded stream with deterministic DFS `ChainStream`; removed `UnorderedChainStream`, `unordered_stream_chain`, `unordered_stream_graph`; added DFS task queue, simplified concurrency and APIs.
Tooling & benchmarks `src/tool/subcommands/archive_cmd.rs`, `src/tool/subcommands/benchmark_cmd.rs`, `docs/docs/users/reference/cli.sh`, `CHANGELOG.md`	Switched to `stream_graph` for export diffs; removed unordered benchmark variant and related docs/changelog entries.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant CLI as forest / forest-tool
  participant RPC as RPC Server
  participant Chain as Chain Export
  participant IPLD as ipld::stream_chain
  participant Writer as CAR Writer

  User->>CLI: export snapshot (no unordered)
  CLI->>RPC: ForestChainExport(params)
  RPC->>Chain: build ExportOptions(skip_checksum, seen)
  Chain->>IPLD: stream_chain(tipsets).with_seen(seen)
  IPLD-->>Chain: DFS stream of blocks
  loop For each block
    Chain->>Writer: write block
  end
  Writer-->>CLI: export complete
  CLI-->>User: done

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Assessment against linked issues

Objective	Addressed	Explanation
Remove fn unordered_stream_chain and fn unordered_stream_graph (#6013)	✅
Deprecate unordered traversal across interfaces (CLI/RPC/benchmarks) (#6013)	✅

Assessment against linked issues: Out-of-scope changes

Code Change	Explanation
Remove calibnet unordered export CI job (`.github/workflows/forest.yml`)	CI job removal is operational/CI maintenance and not explicitly required by #6013.
Delete calibnet unordered export test script (`scripts/tests/calibnet_export_unordered_check.sh`)	Removing the specific test script is not requested by #6013, which targets code-level deprecation.

Possibly related PRs

fix: reduce db read ops in chain export #5868 — touches IPLD traversal/streaming code and helpers, closely related to the ChainStream refactor.
feat: add --unordered to forest-cli snapshot export #5867 — originally introduced unordered export features that this PR removes; direct inverse on same codepaths.
chore: refactor and simplify stream_graph and unordered_stream_graph #5893 — modifies IPLD stream construction logic overlapping with stream_graph/unordered changes.

Suggested labels

RPC

Suggested reviewers

LesnyRumcajs
elmattic
akaladarshi

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between abccbf2 and e3ebbf8.

📒 Files selected for processing (2)

CHANGELOG.md (1 hunks)
docs/docs/users/reference/cli.sh (0 hunks)

💤 Files with no reviewable changes (1)

docs/docs/users/reference/cli.sh

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Build Ubuntu
GitHub Check: cargo-publish-dry-run
GitHub Check: Build MacOS
GitHub Check: Build forest binaries on Linux AMD64
GitHub Check: tests-release
GitHub Check: tests
GitHub Check: All lint checks
GitHub Check: Analyze (rust)
GitHub Check: Analyze (go)

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch hm/remove-unordered-chain-traversal

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

hanabi1224 · 2025-08-28T14:27:03Z

@coderabbitai review

coderabbitai · 2025-08-28T14:27:10Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/chain/mod.rs (1)

34-38: Remove outdated --unordered references in docs and CHANGELOG
Delete or update all --unordered and unordered-graph-traversal sections in docs/docs/users/reference/cli.md and docs/docs/users/reference/cli.sh, and add a CHANGELOG entry noting the removal of the --unordered flag from forest-cli snapshot export.

🧹 Nitpick comments (1)

src/ipld/util.rs (1)

146-169: Constructor semantics clear; suggest clarifying docs on dead-link policy

stream_chain (errors on dead links) vs stream_graph (tolerates dead links) is a key semantic split. Consider a short rustdoc note on each to prevent misuse.

 pub fn stream_chain<DB: Blockstore, T: Borrow<Tipset>, ITER: Iterator<Item = T> + Unpin>(
@@
 ) -> ChainStream<DB, ITER> {
-    ChainStream {
+    // Errors on missing links (strict). Use `stream_graph` to ignore them.
+    ChainStream {
@@
 pub fn stream_graph<DB: Blockstore, T: Borrow<Tipset>, ITER: Iterator<Item = T> + Unpin>(
@@
 ) -> ChainStream<DB, ITER> {
-    stream_chain(db, tipset_iter, stateroot_limit).fail_on_dead_links(false)
+    // Tolerant traversal for inspection/diff pre-scan; missing links are ignored.
+    stream_chain(db, tipset_iter, stateroot_limit).fail_on_dead_links(false)
 }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 15cdbb3 and abccbf2.

📒 Files selected for processing (8)

.github/workflows/forest.yml (0 hunks)
scripts/tests/calibnet_export_unordered_check.sh (0 hunks)
src/chain/mod.rs (2 hunks)
src/cli/subcommands/snapshot_cmd.rs (0 hunks)
src/ipld/util.rs (1 hunks)
src/rpc/methods/chain.rs (0 hunks)
src/tool/subcommands/archive_cmd.rs (2 hunks)
src/tool/subcommands/benchmark_cmd.rs (1 hunks)

💤 Files with no reviewable changes (4)

scripts/tests/calibnet_export_unordered_check.sh
.github/workflows/forest.yml
src/cli/subcommands/snapshot_cmd.rs
src/rpc/methods/chain.rs

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

📚 Learning: 2025-08-08T12:11:55.266Z

Learnt from: hanabi1224
PR: ChainSafe/forest#5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

src/chain/mod.rs
src/ipld/util.rs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: All lint checks
GitHub Check: Build Ubuntu
GitHub Check: Build MacOS
GitHub Check: cargo-publish-dry-run
GitHub Check: Analyze (go)
GitHub Check: Analyze (rust)

🔇 Additional comments (9)

src/tool/subcommands/archive_cmd.rs (2)

39-39: Replace unordered traversal import with deterministic DFS — LGTM

Import switch to stream_graph aligns with the PR objective and new API.

551-558: Diff pre-scan now ignores dead links; verify downstream export won’t fail on missing keys

stream_graph(..) defaults to fail_on_dead_links(false). That’s desirable for building the seen set, but if any CIDs are missing in the diff base they won’t enter seen and could later cause export(..) (which uses stream_chain with failures enabled) to error. Please sanity-check with a diff where the base lacks some DAG nodes to ensure export still succeeds or fails intentionally.

Would you like me to add a focused test that simulates a missing message/state-root in the diff base and asserts export behavior?

src/tool/subcommands/benchmark_cmd.rs (1)

10-10: Import update to new streaming APIs — LGTM

Both stream_chain and stream_graph are used below; import set is correct.

src/chain/mod.rs (2)

16-16: Switch to stream_chain import — LGTM

Matches the unified deterministic traversal approach.

159-165: Always-deterministic chain streaming with seen-filter — LGTM

stream_chain(...).with_seen(seen) simplifies control flow and preserves strict missing-link errors for exports.

src/ipld/util.rs (4)

58-72: DFS iterator is simple and correct

Reversing pushes preserves intended DFS order while allowing extension.

171-189: Synchronous Blockstore::get inside poll_next may block the reactor

Blockstore::get is sync; on some stores (e.g., CAR-on-disk) this can perform IO in the poll path. If this stream is polled on a core Tokio runtime thread, it may reduce fairness. If practical, consider using a buffered prefetcher or documenting the expectation that callers wrap with par_buffer/compression pipeline (as you do) to keep throughput acceptable.

Do you want a micro-benchmark script comparing old unordered vs new DFS on large .forest.car.zst inputs to quantify any regression across store types?

206-217: CID expansion preserves DFS and avoids revisits — LGTM

extract_cids on DAG-CBOR plus seen.insert ensures deterministic, duplicate-free emission.

248-279: Genesis parent emission and depth gating mirror Lotus behavior

Special-casing epoch 0 and limiting deep walks by stateroot_limit is correct and matches expectations.

chore: remove unordered chain traversal

abccbf2

coderabbitai bot reviewed Aug 28, 2025

View reviewed changes

changelog

e3ebbf8

hanabi1224 marked this pull request as ready for review August 28, 2025 15:22

hanabi1224 requested a review from a team as a code owner August 28, 2025 15:22

hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team August 28, 2025 15:22

github-actions bot deployed to preview August 28, 2025 15:24 View deployment

LesnyRumcajs approved these changes Aug 28, 2025

View reviewed changes

akaladarshi approved these changes Sep 2, 2025

View reviewed changes

hanabi1224 added this pull request to the merge queue Sep 2, 2025

Merged via the queue into main with commit f1c5bd7 Sep 2, 2025
60 checks passed

hanabi1224 deleted the hm/remove-unordered-chain-traversal branch September 2, 2025 08:07

coderabbitai bot mentioned this pull request Sep 17, 2025

feat: forest-cli snapshot export-diff #6074

Merged

4 tasks

coderabbitai bot mentioned this pull request Oct 8, 2025

Add new subcommands for forest-cli snasphot #6128

Merged

4 tasks

coderabbitai bot mentioned this pull request Oct 16, 2025

feat: forest-tool db import #6161

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: remove unordered chain traversal #6014

chore: remove unordered chain traversal #6014

Uh oh!

hanabi1224 commented Aug 28, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Aug 28, 2025 •

edited

Loading

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Status, Documentation and Community

Uh oh!

hanabi1224 commented Aug 28, 2025

Uh oh!

coderabbitai bot commented Aug 28, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

chore: remove unordered chain traversal #6014

chore: remove unordered chain traversal #6014

Uh oh!

Conversation

hanabi1224 commented Aug 28, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary of changes

Reference issue to close (if applicable)

Other information and links

Change checklist

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Assessment against linked issues

Assessment against linked issues: Out-of-scope changes

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

Support

CodeRabbit Commands (Invoked using PR/Issue comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Status, Documentation and Community

Uh oh!

hanabi1224 commented Aug 28, 2025

Uh oh!

coderabbitai bot commented Aug 28, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

hanabi1224 commented Aug 28, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 28, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)