fix: OOM in `forest-tool archive diff` #6051

hanabi1224 · 2025-09-08T13:00:47Z

Summary of changes

This PR fixes OOM issue in running forest-tool archive diff --epoch 2819795 forest_snapshot_mainnet_2023-05-01_height_2820000.forest.car.zst

Output:

forest-tool archive diff --epoch 2819795 forest_snapshot_mainnet_2023-05-01_height_2820000.forest.car.zst

- Expected state hash: bafy2bzacedflaenehbgpwlzgeimnamtuuu45vy5zk3u6vp345h7zs4ybweerq
+ Computed state hash: bafy2bzacecyw2yay6jkkmyob53u4hobgsi57xiqdfla6uu437w2vriarb5nds
Address f04 changed: 
 ActorState(ActorState { code: Cid(bafk2bzaceaxgloxuzg35vu7l7tohdgaq2frsfp4ejmuo7tkoxjp5zqrze6sf4)
- state: Cid(bafy2bzacedabdkat4ikokbpmzkbmwjql63zqmj2mlqdbbuzscmo6jz7vhwpxi)
+ state: Cid(bafy2bzaceanjejlvqc5gwilc62xpmdvzkgfzcqgp4igcymywto2vm25wexnni)
  sequence: 0
  balance: TokenAmount(0.011)
  delegated_address: None })
V11(State { total_raw_byte_power: 14030677426019762176
  total_bytes_committed: 14030973057208680448
  total_quality_adj_power: 22832346608471441408
  total_qa_bytes_committed: 22833184005643042816
- total_pledge_collateral: TokenAmount(143269926.2652187083170961)
+ total_pledge_collateral: TokenAmount(143270003.413371965258963041)
  this_epoch_raw_byte_power: 14030677426019762176
  this_epoch_quality_adj_power: 22832346608471441408
- this_epoch_pledge_collateral: TokenAmount(143269926.2652187083170961)
+ this_epoch_pledge_collateral: TokenAmount(143270003.413371965258963041)
  this_epoch_qa_power_smoothed: FilterEstimate { position: 7788953586247045787797498918428003402915103487709779837094
  velocity: 30553465660375297517809812792028847344200527615729490 }
  miner_count: 597663
  miner_above_min_power_count: 3468
  cron_event_queue: Cid(bafy2bzaceblqrq7j5jz75xqtwkgm7h67chxyp443zy7agjzkf6dvcpthxbhmw)
  first_cron_epoch: 2819796
  claims: Cid(bafy2bzacecl7s52etztcf2aaj5ccilgrzovhr4zeitxcl4n35ti7eucxbfwdg)
  proof_validation_batch: None })
Address f099 changed: 
 ActorState(ActorState { code: Cid(bafk2bzacealnlr7st6lkwoh6wxpf2hnrlex5sknaopgmkr2tuhg7vmbfy45so)
  state: Cid(bafy2bzacedpuk5ggwoq3s2wixsyjjnexpsjstdlyntio76vs2lt2jvy3o6mau)
  sequence: 0
- balance: TokenAmount(36187969.358883641800996567)
+ balance: TokenAmount(36187896.068627543923907091)
  delegated_address: None })
V11(State { address: Address { payload: ID(99) } })
Address f094625 changed: 
 ActorState(ActorState { code: Cid(bafk2bzacec24okjqrp7c7rj3hbrs5ez5apvwah2ruka6haesgfngf37mhk6us)
- state: Cid(bafy2bzacebr6if6imx2olrhrb3gquzgfwd4zkn36cxinnikiz4lyngys3eeyk)
+ state: Cid(bafy2bzacedjrgrvkj4wopdyecsqviyzride2ae4itin3uzabxg5kqh4jrr5l6)
  sequence: 0
- balance: TokenAmount(49655.00241996624371334)
- delegated_address: None })
V11(State { info: Cid(bafy2bzacealxryywh3wxrpq2ydnv5p53mp64x5eniswhj5tdnd67lhee6tjua)
+ balance: TokenAmount(49732.150573223185580281)
+ delegated_address: None })
V11(State { info: Cid(bafy2bzaced5vr2epcy7ao63r6sbda6jberludtn77u2dg6xufeydtyazo3b4m)
  pre_commit_deposits: TokenAmount(0.0)
- locked_funds: TokenAmount(5036.123441842349323139)
- vesting_funds: Cid(bafy2bzacedo75tr7blox5i374yioa2fk3xpzyqes7tb4tmiuu7beuwounk4oa)
+ locked_funds: TokenAmount(5113.27159509929119008)
+ vesting_funds: Cid(bafy2bzaceayfqbja334jl3cjggp5jhdrnwb6k3l2ey7mz5qro56mptkwvprae)
  fee_debt: TokenAmount(0.0)
  initial_pledge: TokenAmount(35640.487189858427627691)
  pre_committed_sectors: Cid(bafy2bzaceamp42wmmgr2g2ymg46euououzfyck7szknvfacqscohrvaikwfay)
  pre_committed_sectors_cleanup: Cid(bafy2bzaceaa2jny7gkgdwnid4kuldau6bnvgyss5bszo4uy6uikrncvdu5mc2)
  allocated_sectors: Cid(bafy2bzaceagnnbswwvovpa4knbmqco7iweh62ffsm4jelgp5d2mmmopb4ix2e)
  sectors: Cid(bafy2bzaced4paoaja7y7inff55ua2jqanjsvjpyci2zpg4khy5zhhrdvyitzk)
  proving_period_start: 2817774
  current_deadline: 33
  deadlines: Cid(bafy2bzacebgwqhqu6momrtokexrgqgoyp7baswlne5jaqvlcxm6owygcbfdm2)
  early_terminations: BitField { ranges: []
  set: {}
  unset: {} }
  deadline_cron_active: true })
Address f01984106 changed: 
 ActorState(ActorState { code: Cid(bafk2bzacealnlr7st6lkwoh6wxpf2hnrlex5sknaopgmkr2tuhg7vmbfy45so)
  state: Cid(bafy2bzacebfeoet4rnxbdcsmnrmttjtylyf64zmt27tjoud52ettkvgscr3xo)
  sequence: 111
- balance: TokenAmount(51.096479311654039993)
+ balance: TokenAmount(47.238582152589262528)
  delegated_address: None })
V11(State { address: Address { payload: Secp256k1([78
  66
  50
  65
  69
  41
  115
  113
  158
  124
  128
  139
  119
  241
  247
  133
  197
  177
  154
  216]) } })

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Change checklist

I have performed a self-review of my own code,
I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
I have added tests that prove my fix is effective or that my feature works (if possible),
I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

New Features
- No user-facing feature changes in this release.
Performance
- Reduced upfront processing when displaying actor state differences; computes details only for changed or added items, improving responsiveness with large diffs.
Refactor
- Streamlined preparation of state diff data and clarified internal constants. No changes to output format or public APIs. Behavior and results remain the same; only efficiency and maintainability improved.

coderabbitai · 2025-09-08T13:00:55Z

Walkthrough

Moved pretty-print computation (calc_pp) inside conditional branches for changed/added actors, introduced a COMMA constant for splitting, and retained existing diff/print behavior. No public API changes.

Changes

Cohort / File(s)	Summary
State diff logic refactor `src/statediff/mod.rs`	Deferred calc_pp computation to only changed/added actors; introduced local COMMA constant for splitting expected/calculated pretty-printed strings; preserved existing diff construction and printing; no exported signature changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant StateDiff as try_print_actor_states
  participant Actors

  Caller->>StateDiff: invoke with actor states
  loop for each actor
    StateDiff->>Actors: determine status (unchanged/changed/added)
    alt changed or added
      note over StateDiff: Compute calc_pp lazily
      StateDiff->>StateDiff: Split expected_pp and calc_pp by COMMA
      StateDiff->>StateDiff: Build diffs
      StateDiff->>Caller: Print changes
    else unchanged
      StateDiff->>Caller: Skip calc_pp and continue
    end
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

LesnyRumcajs
elmattic
sudo-shashank

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch hm/fix-state-diff-oom

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/statediff/mod.rs (3)

94-105: Prefer a char separator and a single, reusable constant.

Minor cleanups to reduce tiny overhead and avoid block-local duplication:

Use a char separator (faster split path).
Hoist the separator to module scope so it’s defined once.

Apply:

-                const COMMA: &str = ",";
-                let calc_pp = pp_actor_state(bs, actor, depth)?;
+                let calc_pp = pp_actor_state(bs, actor, depth)?;
                 let expected_pp = pp_actor_state(bs, &other, depth)?;
-                let expected = expected_pp
-                    .split(COMMA)
+                let expected = expected_pp
+                    .split(SEP)
                     .map(|s| s.trim_start_matches('\n'))
                     .collect::<Vec<&str>>();
-                let calculated = calc_pp
-                    .split(COMMA)
+                let calculated = calc_pp
+                    .split(SEP)
                     .map(|s| s.trim_start_matches('\n'))
                     .collect::<Vec<&str>>();

Add at module scope (e.g., after use statements):

const SEP: char = ',';

105-110: Optional: fall back to line-based diff for huge tokens to further curb memory spikes.

When comma-splitting yields very large token vectors, a coarse line-diff avoids building massive [&str] slices and can trim peak RAM.

Apply:

-                let diffs = TextDiff::from_slices(&expected, &calculated);
+                let diffs = if expected.len() + calculated.len() > 20_000 {
+                    // Coarser but safer on memory
+                    TextDiff::from_lines(&expected_pp, &calc_pp)
+                } else {
+                    TextDiff::from_slices(&expected, &calculated)
+                };

112-115: Unify output through the same locked handle (consistency and slight perf).

You lock stdout for the “changed” branch but use println! for “added”. Writing both via the same handle avoids interleaving and extra locks.

Apply (within this branch):

-            // Added actor, print out the json format actor state.
-            println!("{}", format!("+ Address {addr}:\n{calc_pp}").green());
+            // Added actor, print out the json format actor state.
+            let stdout = stdout();
+            let mut handle = stdout.lock();
+            writeln!(handle, "{}", format!("+ Address {addr}:\n{calc_pp}").green())?;

Optionally, create one stdout lock before for_each and pass a mutable handle into the closure to avoid repeated locks.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f635581 and a1f9adf.

📒 Files selected for processing (1)

src/statediff/mod.rs (2 hunks)

🧰 Additional context used

🧠 Learnings (1)

📓 Common learnings

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: tests
GitHub Check: tests-release
GitHub Check: Build forest binaries on Linux AMD64
GitHub Check: All lint checks
GitHub Check: cargo-publish-dry-run
GitHub Check: Build Ubuntu

🔇 Additional comments (1)

src/statediff/mod.rs (1)

91-110: Good: defer pretty-printing to only changed actors.

Computing calc_pp inside the changed-actor branch eliminates unnecessary large String allocations for unchanged actors and directly addresses the OOM risk. Nice, targeted fix.

LesnyRumcajs

I don't quite understand. Could you please explain what was the issue and how this change fixes it?

hanabi1224 · 2025-09-09T00:18:41Z

I don't quite understand. Could you please explain what was the issue and how this change fixes it?

@LesnyRumcajs Calling pp_actor_state for every actor state causes OOM on fuzzy(with 64GiB RAM), even with cacheless hamt iteration. Calling pp_actor_state on demand fixes the issue.

LesnyRumcajs · 2025-09-09T08:29:00Z

I don't quite understand. Could you please explain what was the issue and how this change fixes it?

@LesnyRumcajs Calling pp_actor_state for every actor state causes OOM on fuzzy(with 64GiB RAM), even with cacheless hamt iteration. Calling pp_actor_state on demand fixes the issue.

But why is that? Is it due to the number of calls to pp_actor_state or is there a particular state that causes it? I don't see how the memory grows here, even if called on all actors; pp_actor_state should claim memory and then release it, no?

hanabi1224 · 2025-09-09T08:34:59Z

even if called on all actors; pp_actor_state should claim memory and then release it, no?

@LesnyRumcajs I don't see memory leak in it, my best guess is, resolving some large IPLD on a single actor causes OOM (which is confirmed by monitoring htop on Fuzzy)

LesnyRumcajs · 2025-09-09T08:39:56Z

even if called on all actors; pp_actor_state should claim memory and then release it, no?

@LesnyRumcajs I don't see memory leak in it, my best guess is, resolving some large IPLD on a single actor causes OOM (which is confirmed by monitoring htop on Fuzzy)

I see. So this will surface again if that actor has a diff. Something to keep in mind.

fix: OOM in forest-tool archive diff

a1f9adf

hanabi1224 marked this pull request as ready for review September 8, 2025 13:03

hanabi1224 requested a review from a team as a code owner September 8, 2025 13:03

hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team September 8, 2025 13:03

coderabbitai bot reviewed Sep 8, 2025

View reviewed changes

akaladarshi approved these changes Sep 8, 2025

View reviewed changes

LesnyRumcajs reviewed Sep 8, 2025

View reviewed changes

hanabi1224 enabled auto-merge September 8, 2025 14:13

LesnyRumcajs approved these changes Sep 9, 2025

View reviewed changes

hanabi1224 added this pull request to the merge queue Sep 9, 2025

Merged via the queue into main with commit ca01b7f Sep 9, 2025
44 checks passed

hanabi1224 deleted the hm/fix-state-diff-oom branch September 9, 2025 09:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: OOM in `forest-tool archive diff` #6051

fix: OOM in `forest-tool archive diff` #6051

Uh oh!

hanabi1224 commented Sep 8, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 8, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

LesnyRumcajs left a comment

Uh oh!

hanabi1224 commented Sep 9, 2025

Uh oh!

LesnyRumcajs commented Sep 9, 2025

Uh oh!

hanabi1224 commented Sep 9, 2025 •

edited

Loading

Uh oh!

LesnyRumcajs commented Sep 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: OOM in forest-tool archive diff #6051

fix: OOM in forest-tool archive diff #6051

Uh oh!

Conversation

hanabi1224 commented Sep 8, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary of changes

Reference issue to close (if applicable)

Other information and links

Change checklist

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

LesnyRumcajs left a comment

Choose a reason for hiding this comment

Uh oh!

hanabi1224 commented Sep 9, 2025

Uh oh!

LesnyRumcajs commented Sep 9, 2025

Uh oh!

hanabi1224 commented Sep 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

LesnyRumcajs commented Sep 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: OOM in `forest-tool archive diff` #6051

fix: OOM in `forest-tool archive diff` #6051

hanabi1224 commented Sep 8, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 8, 2025 •

edited

Loading

hanabi1224 commented Sep 9, 2025 •

edited

Loading