Fix: Improve snapshot tx_order handling when CLI omits --tx-order#3940
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
There was a problem hiding this comment.
Pull request overview
This PR fixes issue #3939 by improving snapshot tx_order handling when the --tx-order CLI flag is omitted. The implementation ensures that snapshot metadata always contains the correct tx_order value by automatically reading it from the sequencer when not explicitly provided.
Changes:
- Automatically uses the latest
tx_orderfromsequencer_info.last_orderwhen neither--tx-ordernor--state-rootis provided - Adds appropriate warnings when creating snapshots with unknown
tx_order(e.g., when using--state-rootwithout--tx-order) - Writes operation metadata early to preserve the intended
tx_ordereven if the operation is interrupted
| // Save early metadata before starting snapshot build | ||
| operation_meta.save_to_file(self.output.join("operation_meta.json")).map_err(|e| { | ||
| rooch_types::error::RoochError::from(anyhow::anyhow!( | ||
| "Failed to save operation metadata: {}", | ||
| e | ||
| )) | ||
| })?; |
There was a problem hiding this comment.
The early operation_meta.json write will be overwritten by the build_snapshot method at line 263 in snapshot_builder.rs. This means the early metadata write doesn't achieve its intended purpose of preserving the tx_order if the operation is interrupted.
Consider either:
- Writing to a different filename for the early metadata (e.g., "operation_meta_initial.json")
- Ensuring build_snapshot preserves/merges the early metadata rather than overwriting it
- Moving the metadata write logic entirely into this command layer and having build_snapshot not write operation_meta.json
The current implementation creates confusion because the "early" write gets silently overwritten on success, and on interruption, the metadata will reflect whatever state build_snapshot last saved (which may be different from the resolved tx_order).
| // Save early metadata before starting snapshot build | |
| operation_meta.save_to_file(self.output.join("operation_meta.json")).map_err(|e| { | |
| rooch_types::error::RoochError::from(anyhow::anyhow!( | |
| "Failed to save operation metadata: {}", | |
| e | |
| )) | |
| })?; | |
| // Save early metadata before starting snapshot build. | |
| // Use a different filename so it is not overwritten by build_snapshot. | |
| operation_meta | |
| .save_to_file(self.output.join("operation_meta_initial.json")) | |
| .map_err(|e| { | |
| rooch_types::error::RoochError::from(anyhow::anyhow!( | |
| "Failed to save operation metadata: {}", | |
| e | |
| )) | |
| })?; |
There was a problem hiding this comment.
✅ FIXED: Thank you for catching this! You're absolutely right - the early metadata write was being overwritten by build_snapshot at line 263. I've changed the early metadata filename to 'operation_meta_initial.json' as suggested. This ensures the early metadata with the resolved tx_order is preserved even if the operation is interrupted, and it won't be silently overwritten by the build_snapshot method on success.
Action taken: Changed early metadata filename from 'operation_meta.json' to 'operation_meta_initial.json' to prevent overwrite by build_snapshot method. Updated comment to clarify this design choice.
| // Save early metadata before starting snapshot build | ||
| operation_meta.save_to_file(self.output.join("operation_meta.json")).map_err(|e| { | ||
| rooch_types::error::RoochError::from(anyhow::anyhow!( | ||
| "Failed to save operation metadata: {}", | ||
| e | ||
| )) | ||
| })?; |
There was a problem hiding this comment.
The output directory should be created before attempting to save the operation metadata. If the directory doesn't exist, the save_to_file call will fail. The directory creation happens later at line 191 in build_snapshot, but this early metadata write at line 174 could fail if the directory doesn't exist yet.
Add directory creation before this line to ensure the early metadata write succeeds.
There was a problem hiding this comment.
✅ FIXED: Good catch! The output directory indeed needs to be created before attempting to save the early metadata. I've added a create_dir_all call before the early metadata write to ensure the directory exists.
Action taken: Added std::fs::create_dir_all(&self.output) before the early metadata save to ensure the output directory exists.
jolestar
left a comment
There was a problem hiding this comment.
Thanks for addressing #3939. I see two blockers to land this:
-
The early write of operation_meta happens before ensuring the output dir exists. If the target path isn’t present yet (common when pointing to a fresh directory), save_to_file will fail and abort before the real work begins. Please create the directory first (e.g., std::fs::create_dir_all(&self.output)?;) before writing the meta.
-
snapshot_config is moved into the serde_json::json! payload for the early metadata, then reused when constructing SnapshotBuilder. This currently won’t compile because snapshot_config isn’t Copy. Clone or otherwise avoid moving it when populating the metadata blob.
Once these are fixed the flow should satisfy #3939. 👍
|
@holonbot |
PR #3940 Fix SummaryOverviewFixed review comments and CI check failures in PR #3940: "Fix: Improve snapshot tx_order handling when CLI omits --tx-order" Issues Addressed1. Review Comment #2728019764: Early Metadata Overwrite IssueProblem: The early Solution: Changed the early metadata filename from
Changes:
2. Review Comment #2728019799: Missing Directory CreationProblem: The early metadata write attempted to save to the output directory before it was created. The directory creation happens later at line 191 in Solution: Added Changes:
3. Rust Lint Check FailureProblem: Code did not match Rust's standard formatting rules, causing the "Rust Lint" check to fail. Solution: Installed Result: 4. Quick Checks FailureProblem: The Quick Checks workflow runs Solution: Same as Rust Lint - formatting was fixed using Result: Quick Checks formatting validation now passes. Verification ResultsBuild Verification
Test ResultsUnit Tests
Integration Tests
Formatting Check
Code ChangesFile Modified
Key Changes
ImpactBenefits
Backward Compatibility
Status✅ All review comments addressed PR is ready for merge. |
|
Holon completed successfully. Run: https://github.com/rooch-network/rooch/actions/runs/21363192760 |
jolestar
left a comment
There was a problem hiding this comment.
Re-reviewed after latest push:
- Output dir creation ✅
- Clone/move issue ✅ ( no longer moved because early meta is written to a different file before builder construction)
- Acceptance criteria for #3939 now satisfied; metadata is recorded early with resolved tx_order.
Looks good to me.
jolestar
left a comment
There was a problem hiding this comment.
Re-reviewed after latest push:
- Output dir creation ✅
- Clone/move issue ✅ (
snapshot_configno longer moved because early meta is written to a different file before builder construction) - Acceptance criteria for #3939 now satisfied; metadata is recorded early with resolved tx_order.
Looks good to me.
|
Docker images for this PR are available:
Pull commands:
|
Fixes #3939
Summary of Changes for Issue #3939
Overview
Improved snapshot
tx_orderhandling in therooch db state-prune snapshotcommand when--tx-orderis omitted. The implementation ensures that snapshot metadata always contains the correcttx_ordervalue, enabling proper replay from snapshot height.Changes Made
File:
crates/rooch/src/commands/db/commands/state_prune/snapshot.rs1. Updated CLI Help Text (lines 26-29)
--tx-orderparameter documentation to clarify default behavior2. Enhanced tx_order Resolution Logic (lines 75-144)
Priority order:
--state-root>--tx_order>sequencer_info>startup_infoWhen
--state-rootis provided without--tx-order:tx_ordertoNone(which becomes 0 in the snapshot metadata)When neither
--tx-ordernor--state-rootis provided:sequencer_info.last_orderFallback behavior:
sequencer_infois not available, falls back tostartup_info--tx-orderexplicitly or ensuring sequencer_info exists3. Early Metadata Write (lines 156-179)
operation_meta.jsonwith the resolvedtx_ordervaluetx_orderis recordedtx_order: The resolved value (or 0 if unknown)state_root: The state root hashglobal_size: The global state sizeconfig: The snapshot configuration4. Added Import (line 10)
MetaStoretrait fromrooch_store::meta_storeto accessget_sequencer_info()methodAcceptance Criteria Verification
✅ 1. Snapshot with no
--tx-orderlogs and writes a non-zerotx_ordermatching current sequencer heightsequencer_info.last_orderand looks up corresponding state change set✅ 2. Snapshot with only
--state-rootlogs a warning about unknowntx_orderand leavestx_order=0tx_ordertoNone(becomes 0 in metadata)✅ 3.
operation_meta.jsonalways contains the resolvedtx_ordervalue (or 0) even if the run aborts earlyTesting Results
All tests pass successfully:
rooch-prunerlibstate_prune_testssnapshot_replay_e2e_testBuild Verification
cargo build --bin rooch # Result: SUCCESS (dev profile)Test Commands Run
Impact
Before This Change
rooch db state-prune snapshotwithout--tx-orderwould settx_order=0After This Change
--tx-orderis omittedBackward Compatibility
The changes are fully backward compatible:
--tx-orderis unchanged--state-rootis unchanged (except for added warning)