Skip to content

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Sep 12, 2025

Summary of changes

New error message:

➜  ~ forest-cli state compute --epoch 3001310
Error: ErrorObject { code: InternalError, message: "Failed to compute tipset state@3001310: No state tree found with root bafy2bzaceb53qtcl52nbjopf2qcs5oq2u5xri7jeflwy6s75oc5pkxsearnle.", data: None }

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes #6063

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • Bug Fixes
    • Clearer error when a state tree root is missing: returns an explicit "No state tree found" message instead of a generic version error.
    • Improved diagnostics for tipset state computation: errors now include the tipset epoch, making failures easier to trace.
    • No functional changes to successful behavior or public interfaces; only error messaging enhanced.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 12, 2025

Walkthrough

Adds an explicit “root not found” error path in StateTree::new_from_root when the root CID is absent from the store; preserves existing unsupported-version behavior when the root exists but cannot be constructed. Wraps errors from apply_block_messages in compute_tipset_state_blocking with the tipset epoch for clearer context. No public APIs changed.

Changes

Cohort / File(s) Summary
State tree construction error handling
src/shim/state_tree.rs
Adds an early store.has(root) check and returns a specific error when the root CID is missing (No state tree exists for the root {cid}.); retains prior code path that reads StateRoot to detect version and bails with unsupported-version if unrecognizable.
State manager error context
src/state_manager/mod.rs
Captures epoch = tipset.epoch() and maps errors from apply_block_messages to anyhow!("Failed to compute tipset state@{epoch}: {e}"), preserving success path and signatures while adding epoch-aware error context.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant StateTree
  participant Store

  Caller->>StateTree: new_from_root(root_cid)
  StateTree->>Store: try construct as V4/V3/V2/V0
  alt constructed
    StateTree-->>Caller: Ok(StateTree)
  else not constructed
    StateTree->>Store: has(root_cid)?
    alt not found
      Note over StateTree: New explicit missing-root error path
      StateTree-->>Caller: Err("No state tree exists for the root {cid}.")
    else found
      StateTree->>Store: read StateRoot & detect version
      alt unsupported
        StateTree-->>Caller: Err("Can't create a valid state tree... unsupported version")
      else recognized
        StateTree-->>Caller: Ok(StateTree)
      end
    end
  end
Loading
sequenceDiagram
  autonumber
  participant Caller
  participant StateManager
  participant Executor as apply_block_messages

  Caller->>StateManager: compute_tipset_state_blocking(tipset)
  StateManager->>StateManager: epoch = tipset.epoch()
  StateManager->>Executor: apply_block_messages(...)
  alt success
    Executor-->>StateManager: StateOutput
    StateManager-->>Caller: Ok(StateOutput)
  else error
    Executor-->>StateManager: Err(e)
    Note over StateManager: Wrap with "Failed to compute tipset state@{epoch}: {e}"
    StateManager-->>Caller: Err(anyhow!(...))
  end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • LesnyRumcajs
  • akaladarshi
  • elmattic

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Linked Issues Check ⚠️ Warning The changes add a clear "root not found" path in StateTree::new_from_root and attach epoch context to compute errors, producing messages like "Failed to compute tipset state@{epoch}: No state tree found with root {cid}." While this improves clarity, it does not implement the linked issue #6063's stated requirement to explicitly detect stateless-node mode and short-circuit with an actionable message that stateless nodes cannot perform state computation; the new message reports a missing root but does not identify or explain stateless mode. Therefore the PR partially addresses the symptom but does not fully satisfy the linked issue's explicit objectives. To satisfy #6063, add an explicit stateless-node detection path (e.g., based on node configuration or store capabilities) that short-circuits state compute and returns a clear error such as "stateless nodes cannot perform state compute," update tests to cover the stateless case, and adjust the CLI error surface so the message is actionable for infra operators.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "fix: improve error messages in forest-cli state compute" is concise, directly related to the primary change in the PR, and clearly communicates the developer's intent to improve CLI error messaging for the state compute command. It avoids noisy file lists or vague wording and is appropriate for changelogs and reviewer scanning.
Out of Scope Changes Check ✅ Passed The modifications are limited to src/shim/state_tree.rs and src/state_manager/mod.rs and focus on error handling and messaging; there are no unrelated file changes or public API surface changes indicated in the provided summary. The edits align with the PR's stated intent to improve error messages and do not introduce out-of-scope functionality.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/improve-err-msg-for-state-compute

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 737a6d2 and 727fe2b.

📒 Files selected for processing (1)
  • src/shim/state_tree.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/shim/state_tree.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 marked this pull request as ready for review September 12, 2025 08:56
@hanabi1224 hanabi1224 requested a review from a team as a code owner September 12, 2025 08:56
@hanabi1224 hanabi1224 requested review from LesnyRumcajs, akaladarshi and sudo-shashank and removed request for a team September 12, 2025 08:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/state_manager/mod.rs (1)

860-872: Preserve the 'replay_halt' sentinel — don’t wrap errors when a callback is present

Wrapping the error here changes "replay_halt" into "Failed to compute tipset state@{epoch}: replay_halt" and breaks the exact equality check used in the replay path.

File: src/state_manager/mod.rs — change compute_tipset_state_blocking (≈lines 854–872). Sentinel defined/checked near line ~730 and the replay call at ~753–756.

 pub fn compute_tipset_state_blocking(
     &self,
     tipset: Arc<Tipset>,
     callback: Option<impl FnMut(MessageCallbackCtx<'_>) -> anyhow::Result<()>>,
     enable_tracing: VMTrace,
 ) -> Result<StateOutput, Error> {
-        let epoch = tipset.epoch();
-        Ok(apply_block_messages(
+        let epoch = tipset.epoch();
+        let has_callback = callback.is_some();
+        Ok(apply_block_messages(
             self.chain_store().genesis_block_header().timestamp,
             Arc::clone(&self.chain_store().chain_index),
             Arc::clone(&self.chain_config),
             self.beacon_schedule().clone(),
             &self.engine,
             tipset,
             callback,
             enable_tracing,
-        )
-        .map_err(|e| anyhow::anyhow!("Failed to compute tipset state@{epoch}: {e}"))?)
+        )
+        .map_err(|e| {
+            if has_callback {
+                e
+            } else {
+                anyhow::anyhow!("Failed to compute tipset state@{epoch}: {e}")
+            }
+        })?)
 }
🧹 Nitpick comments (1)
src/shim/state_tree.rs (1)

174-176: Good early-exit for missing state roots; consider adding an actionable hint

The explicit has-check cleanly distinguishes “root missing” from “unsupported version,” which aligns with fail-fast preferences. To better guide stateless-node users running forest-cli state compute, consider appending a brief hint.

Apply this minimal tweak to the error string:

-            bail!("No state tree found with root {c}.")
+            bail!("No state tree found with root {c}. If running a stateless node, this operation is unsupported; use a recent snapshot or a stateful node.")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d08a2ff and 6d6f591.

📒 Files selected for processing (2)
  • src/shim/state_tree.rs (1 hunks)
  • src/state_manager/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5886
File: src/daemon/db_util.rs:236-259
Timestamp: 2025-08-11T14:00:47.060Z
Learning: In Forest's snapshot import (`src/daemon/db_util.rs`), when F3 CID is present in snapshot metadata but the actual F3 data is missing, this should cause a hard error as it indicates snapshot corruption. Only errors during the F3 import process itself (after data is successfully retrieved) should be non-fatal and logged.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-11T14:00:47.060Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5886
File: src/daemon/db_util.rs:236-259
Timestamp: 2025-08-11T14:00:47.060Z
Learning: In Forest's snapshot import (`src/daemon/db_util.rs`), when F3 CID is present in snapshot metadata but the actual F3 data is missing, this should cause a hard error as it indicates snapshot corruption. Only errors during the F3 import process itself (after data is successfully retrieved) should be non-fatal and logged.

Applied to files:

  • src/shim/state_tree.rs
📚 Learning: 2025-08-25T13:35:24.230Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5969
File: src/tool/subcommands/snapshot_cmd.rs:412-412
Timestamp: 2025-08-25T13:35:24.230Z
Learning: In src/tool/subcommands/snapshot_cmd.rs, the +1 in `last_epoch = ts.epoch() - epochs as i64 + 1` fixes an off-by-1 bug where specifying --check-stateroots=N would validate N+1 epochs instead of N epochs, causing out-of-bounds errors when the snapshot contains only N recent state roots.

Applied to files:

  • src/state_manager/mod.rs
🧬 Code graph analysis (2)
src/shim/state_tree.rs (1)
src/shim/state_tree_v0.rs (1)
  • store (56-58)
src/state_manager/mod.rs (1)
src/blocks/tipset.rs (2)
  • epoch (306-308)
  • epoch (543-545)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: All lint checks

sudo-shashank
sudo-shashank previously approved these changes Sep 12, 2025
} else if let Ok(st) = super::state_tree_v0::StateTreeV0::new_from_root(store.clone(), c) {
Ok(StateTree::V0(st))
} else if !store.has(c)? {
bail!("No state tree found with root {c}.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bail!("No state tree exist for the root {c}.") ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@hanabi1224 hanabi1224 added this pull request to the merge queue Sep 15, 2025
Merged via the queue into main with commit 1a5785b Sep 15, 2025
39 checks passed
@hanabi1224 hanabi1224 deleted the hm/improve-err-msg-for-state-compute branch September 15, 2025 10:34
@coderabbitai coderabbitai bot mentioned this pull request Oct 17, 2025
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve error messages for stateless nodes

5 participants