feat(pacquet): cover the gap in nodeLinker: 'hoisted'#12510
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughRefactors the ChangesPreferred-ident hoisting algorithm and tests
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
nodeLinler: 'hoisted'nodeLinker: 'hoisted'
Micro-Benchmark ResultsLinux |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #12510 +/- ##
==========================================
+ Coverage 88.13% 88.16% +0.02%
==========================================
Files 314 314
Lines 43435 43514 +79
==========================================
+ Hits 38281 38363 +82
+ Misses 5154 5151 -3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pacquet/crates/real-hoist/src/lib.rs (1)
743-746: ⚡ Quick winAvoid O(n²) front-removal in the ident-shift loop
Line 745 uses
idents.remove(0)inside the per-pass hoist loop. That is O(n) per shift and can turn this path quadratic when many versions compete for one package name. This code is on the install/link hot path; use a front-pop structure (VecDeque) or a head index cursor instead.⚙️ Suggested change
-use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet}; +use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet, VecDeque}; struct HoistCtx<'a> { root: &'a Rc<HoisterResult>, border_names: &'a BTreeSet<String>, - hoist_ident_map: &'a HashMap<String, Vec<String>>, + hoist_ident_map: &'a HashMap<String, VecDeque<String>>, } -fn build_hoist_ident_map(root: &Rc<HoisterResult>) -> HashMap<String, Vec<String>> { +fn build_hoist_ident_map(root: &Rc<HoisterResult>) -> HashMap<String, VecDeque<String>> { @@ - let mut ident_map: IndexMap<String, Vec<String>> = IndexMap::new(); + let mut ident_map: IndexMap<String, VecDeque<String>> = IndexMap::new(); @@ - ident_map.insert(root.name.clone(), vec![root_ident]); + ident_map.insert(root.name.clone(), VecDeque::from([root_ident])); @@ - if idents.len() > 1 && !root_index.contains_key(name) { - idents.remove(0); + if idents.len() > 1 && !root_index.contains_key(name) { + idents.pop_front(); shifted = true; }As per coding guidelines, performance regressions in pnpm/pacquet are a priority immediately after security issues.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pacquet/crates/real-hoist/src/lib.rs` around lines 743 - 746, The performance issue is in the hoist_ident_map loop where idents.remove(0) performs an O(n) front-removal operation that can cause quadratic behavior when many versions compete for the same package name. Replace the Vec structure used in hoist_ident_map with a VecDeque for efficient front-pop operations, or implement a head index cursor to track the current position without actually removing elements. Update the code that accesses idents to work with the new data structure approach to maintain the same logic while improving performance on the install/link hot path.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pacquet/crates/real-hoist/src/lib.rs`:
- Around line 743-746: The performance issue is in the hoist_ident_map loop
where idents.remove(0) performs an O(n) front-removal operation that can cause
quadratic behavior when many versions compete for the same package name. Replace
the Vec structure used in hoist_ident_map with a VecDeque for efficient
front-pop operations, or implement a head index cursor to track the current
position without actually removing elements. Update the code that accesses
idents to work with the new data structure approach to maintain the same logic
while improving performance on the install/link hot path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 364f19fe-eb74-486c-ba3c-a599b8d182d8
📒 Files selected for processing (5)
pacquet/crates/cli/tests/hoisted_node_linker.rspacquet/crates/package-manager/src/hoisted_dep_graph/tests.rspacquet/crates/real-hoist/src/lib.rspacquet/crates/real-hoist/src/tests.rspacquet/plans/TEST_PORTING.md
The per-pass ident-shift loop in `hoist_into_root` dropped the preferred candidate with `Vec::remove(0)`, an O(n) front-removal that turns the loop quadratic when many versions of one name compete — on the install/link hot path. Switch the per-name candidate lists to `VecDeque` and shift with `pop_front()` (O(1)). Also reattach the `hoist_into_root` doc comment, which the `HoistCtx` insertion in 0898df1 accidentally stranded onto the adjacent struct. Addresses a CodeRabbit review comment on #12510.
Integrated-Benchmark Report (Linux)Each scenario reports direct installs and pnpr installs. Bencher consumes pacquet@HEAD and pnpr@HEAD. Scenario: Isolated linker: fresh restore, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 4.116592239979999,
"stddev": 0.15335624398064104,
"median": 4.1022876964799995,
"user": 3.85254176,
"system": 3.4546504999999996,
"min": 3.8980246159800003,
"max": 4.37496952498,
"times": [
4.17232684698,
3.9508817089800004,
3.9990941649800003,
4.14349110498,
4.23452569998,
4.042855490979999,
4.37496952498,
4.288668953979999,
4.06108428798,
3.8980246159800003
]
},
{
"command": "pacquet@main",
"mean": 4.02497835768,
"stddev": 0.08317399666058918,
"median": 3.98915292798,
"user": 3.836720059999999,
"system": 3.4565067,
"min": 3.93522086298,
"max": 4.17851260698,
"times": [
4.1180427449799994,
4.01152198098,
4.17851260698,
4.11957879498,
3.94313742498,
3.98860269098,
3.98104077798,
3.93522086298,
3.9897031649800003,
3.98442252698
]
},
{
"command": "pnpr@HEAD",
"mean": 2.0419425063800003,
"stddev": 0.08381404350621083,
"median": 2.0341460054800002,
"user": 2.6596486599999998,
"system": 2.9622908,
"min": 1.93685125998,
"max": 2.17407586298,
"times": [
2.11005940998,
2.10793465998,
1.95706104998,
1.93685125998,
1.99735520598,
1.95269302098,
2.07093680498,
2.11671708898,
1.9957406999800003,
2.17407586298
]
},
{
"command": "pnpr@main",
"mean": 2.13506108688,
"stddev": 0.1429389915840566,
"median": 2.0530584109800003,
"user": 2.6312335599999996,
"system": 2.981212,
"min": 2.01094772298,
"max": 2.35849379598,
"times": [
2.04557438998,
2.06054243198,
2.03718445198,
2.33043992898,
2.30984490598,
2.15650507698,
2.01094772298,
2.02199878198,
2.35849379598,
2.01907938198
]
}
]
}Scenario: Isolated linker: fresh restore, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.6355314527200001,
"stddev": 0.011499349110555064,
"median": 0.63291516162,
"user": 0.3795541,
"system": 1.31172372,
"min": 0.61578022862,
"max": 0.6529032816200001,
"times": [
0.62906990362,
0.6529032816200001,
0.63921020062,
0.61578022862,
0.6329643316200001,
0.6286464716200001,
0.6463354856200001,
0.63286599162,
0.64981799862,
0.6277206336200001
]
},
{
"command": "pacquet@main",
"mean": 0.66559315292,
"stddev": 0.0984778601186495,
"median": 0.6348216036200001,
"user": 0.37091599999999997,
"system": 1.32061402,
"min": 0.6101964496200001,
"max": 0.9435955636200001,
"times": [
0.63770892162,
0.6575409666200001,
0.6301031916200001,
0.6290464036200001,
0.6101964496200001,
0.6289134456200001,
0.64731883162,
0.6319342856200001,
0.6395734696200001,
0.9435955636200001
]
},
{
"command": "pnpr@HEAD",
"mean": 0.71969590412,
"stddev": 0.07011621896172465,
"median": 0.7063156666200001,
"user": 0.3910643999999999,
"system": 1.3441831199999998,
"min": 0.6646684826200001,
"max": 0.9127402966200001,
"times": [
0.70997913962,
0.7150398306200001,
0.72209965062,
0.6646684826200001,
0.6956204716200001,
0.67764795862,
0.68653187762,
0.7098483706200001,
0.9127402966200001,
0.7027829626200001
]
},
{
"command": "pnpr@main",
"mean": 0.68664637192,
"stddev": 0.02075635358033934,
"median": 0.6894849171200002,
"user": 0.3864402,
"system": 1.3413567200000003,
"min": 0.6609419146200001,
"max": 0.71944123362,
"times": [
0.6609419146200001,
0.6635872436200001,
0.70487159562,
0.6895796486200001,
0.6693821096200001,
0.6893901856200001,
0.6653201766200001,
0.6975921166200001,
0.7063574946200001,
0.71944123362
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 4.185150179999999,
"stddev": 0.04384255672711061,
"median": 4.1830139312,
"user": 3.7178868799999996,
"system": 3.3406346199999994,
"min": 4.1298888407,
"max": 4.2689496796999995,
"times": [
4.1298888407,
4.2023873887,
4.1410991177,
4.1854427607,
4.2689496796999995,
4.2369380877,
4.176385306699999,
4.1385793537,
4.1805851017,
4.1912461627
]
},
{
"command": "pacquet@main",
"mean": 4.1683541271,
"stddev": 0.035755369668987236,
"median": 4.1725989447,
"user": 3.6803170799999996,
"system": 3.356527419999999,
"min": 4.1057591336999995,
"max": 4.2324687817,
"times": [
4.2324687817,
4.1057591336999995,
4.1893096237,
4.1872700147,
4.1328187516999995,
4.155210654699999,
4.1755587667,
4.1921505297,
4.1696391227,
4.1433558917
]
},
{
"command": "pnpr@HEAD",
"mean": 2.0877509290000003,
"stddev": 0.05240723189096225,
"median": 2.1018864542,
"user": 2.48987798,
"system": 2.8796811199999994,
"min": 2.0062952597000003,
"max": 2.1879866047000003,
"times": [
2.0401836107,
2.0062952597000003,
2.1020864207,
2.1093244097,
2.1016864877000003,
2.1042832077000004,
2.0250994307,
2.0862089857000004,
2.1143548727000003,
2.1879866047000003
]
},
{
"command": "pnpr@main",
"mean": 2.1751765093,
"stddev": 0.12201225172861403,
"median": 2.1322346937,
"user": 2.4823642799999996,
"system": 2.8797174199999995,
"min": 2.0393611827,
"max": 2.4146449967000003,
"times": [
2.3550808217,
2.0891420487000003,
2.2033090937,
2.4146449967000003,
2.0837052457,
2.1413832087,
2.1049601527000004,
2.1230861787000004,
2.1970921637000003,
2.0393611827
]
}
]
}Scenario: Isolated linker: fresh install, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.31462452872,
"stddev": 0.014221619486055326,
"median": 1.3128025509199999,
"user": 1.32700476,
"system": 1.6776753999999996,
"min": 1.30005856142,
"max": 1.33840926242,
"times": [
1.33840926242,
1.32104573442,
1.33831213642,
1.3141315194199998,
1.31147358242,
1.30240165842,
1.3054806694199999,
1.31433658242,
1.30059558042,
1.30005856142
]
},
{
"command": "pacquet@main",
"mean": 1.35656121842,
"stddev": 0.05552782696528185,
"median": 1.33743321242,
"user": 1.3301256599999998,
"system": 1.7189054,
"min": 1.31998305642,
"max": 1.49164214642,
"times": [
1.3223335054199998,
1.32743329342,
1.31998305642,
1.49164214642,
1.42007218242,
1.32661789242,
1.34439301642,
1.33827066642,
1.33686368142,
1.33800274342
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6553210991200001,
"stddev": 0.0319765696735816,
"median": 0.64231221742,
"user": 0.32660026000000003,
"system": 1.2764746999999999,
"min": 0.6377585744200001,
"max": 0.7433360504200001,
"times": [
0.65469291342,
0.63800514742,
0.6427674764200001,
0.64185695842,
0.6584300484200001,
0.6570599394200001,
0.6377585744200001,
0.6381373984200001,
0.64116648442,
0.7433360504200001
]
},
{
"command": "pnpr@main",
"mean": 0.6455579120200001,
"stddev": 0.005258829104616755,
"median": 0.6466878654200001,
"user": 0.34280315999999994,
"system": 1.2653083999999999,
"min": 0.6351728124200001,
"max": 0.6520035514200001,
"times": [
0.63898971642,
0.64923390542,
0.6474789904200001,
0.6520035514200001,
0.6458967404200001,
0.6494979404200001,
0.64421346442,
0.6351728124200001,
0.64968904942,
0.6434029494200001
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.9757910055000005,
"stddev": 0.0514791197329443,
"median": 2.9634855013,
"user": 1.7265983599999999,
"system": 1.9482943000000001,
"min": 2.9365894618,
"max": 3.1131256578,
"times": [
2.9365894618,
2.9473891807999997,
2.9794415198,
2.9688005868,
2.9496983528,
2.9725104728,
2.9924954908,
2.9396889158,
3.1131256578,
2.9581704157999997
]
},
{
"command": "pacquet@main",
"mean": 3.0089126219,
"stddev": 0.04336049652987134,
"median": 3.0004806927999996,
"user": 1.7486466600000004,
"system": 1.9809647,
"min": 2.9721753858,
"max": 3.1253284978,
"times": [
2.9721753858,
3.0108120948,
2.9839800737999997,
3.0148007698,
3.0000270307999997,
2.9963020628,
3.0009343547999996,
2.9763999858,
3.1253284978,
3.0083659627999997
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6519015228,
"stddev": 0.012121097854862812,
"median": 0.6494715353,
"user": 0.32750836000000005,
"system": 1.2804896000000001,
"min": 0.6316981908000001,
"max": 0.6688145168,
"times": [
0.6614577358000001,
0.6540251958000001,
0.6448900578000001,
0.6688145168,
0.6316981908000001,
0.6571288248,
0.6684755158000001,
0.6430522628,
0.6449178748000001,
0.6445550528
]
},
{
"command": "pnpr@main",
"mean": 0.6517483466,
"stddev": 0.0116272537529652,
"median": 0.6527111758,
"user": 0.3320571599999999,
"system": 1.2687332,
"min": 0.6343692308000001,
"max": 0.6693113348,
"times": [
0.6572346048000001,
0.6586662988,
0.6693113348,
0.6582317198000001,
0.6386878678000001,
0.6343692308000001,
0.6452010498,
0.6481877468,
0.6650159948000001,
0.6425776178000001
]
}
]
} |
|
| Branch | pr/12510 |
| Testbed | pacquet |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %) |
|---|---|---|---|
| isolated-linker.fresh-install.cold-cache.cold-store | 📈 view plot 🚷 view threshold | 4,185.15 ms(-0.73%)Baseline: 4,215.93 ms | 5,059.11 ms (82.72%) |
| isolated-linker.fresh-install.cold-cache.hot-store | 📈 view plot 🚷 view threshold | 2,975.79 ms(-0.81%)Baseline: 3,000.12 ms | 3,600.14 ms (82.66%) |
| isolated-linker.fresh-install.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 1,314.62 ms(-0.86%)Baseline: 1,326.07 ms | 1,591.28 ms (82.61%) |
| isolated-linker.fresh-restore.cold-cache.cold-store | 📈 view plot 🚷 view threshold | 4,116.59 ms(-0.78%)Baseline: 4,149.09 ms | 4,978.91 ms (82.68%) |
| isolated-linker.fresh-restore.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 635.53 ms(+3.32%)Baseline: 615.09 ms | 738.11 ms (86.10%) |
|
| Branch | pr/12510 |
| Testbed | pnpr |
⚠️ WARNING: No Threshold found!Without a Threshold, no Alerts will ever be generated.
Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the--ci-only-thresholdsflag.
Click to view all benchmark results
| Benchmark | Latency | milliseconds (ms) |
|---|---|---|
| isolated-linker.fresh-install.cold-cache.cold-store | 📈 view plot | 2,087.75 ms |
| isolated-linker.fresh-install.cold-cache.hot-store | 📈 view plot | 651.90 ms |
| isolated-linker.fresh-install.hot-cache.hot-store | 📈 view plot | 655.32 ms |
| isolated-linker.fresh-restore.cold-cache.cold-store | 📈 view plot | 2,041.94 ms |
| isolated-linker.fresh-restore.hot-cache.hot-store | 📈 view plot | 719.70 ms |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pacquet/crates/real-hoist/src/tests.rs`:
- Line 1098: Remove the leftover dbg! macro calls from the test file in
tests.rs. Specifically, remove the dbg!(&ident_map) call around line 1098 and
the other dbg! call around line 1125. These debugging macros print to stderr
during test execution and add unnecessary noise to CI logs without providing
value to the test assertions, so they should be completely removed from the test
code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 26d6340c-92fe-453e-888f-dc6031b28cda
📒 Files selected for processing (2)
pacquet/crates/real-hoist/src/lib.rspacquet/crates/real-hoist/src/tests.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- pacquet/crates/real-hoist/src/lib.rs
Port `@yarnpkg/nm`'s `buildPreferenceMap` / `getHoistIdentMap` into the real-directory hoister. When several versions of one name compete for the root `node_modules` slot, the most-depended-on version now wins, with the root's own direct dependencies always preferred. A per-pass ident shift promotes the next candidate when the preferred version cannot be placed (nested under a conflict, border, or peer). Previously the first version reached in the depth-first walk claimed the slot, which diverged from pnpm and yarn for the cases that exercise this tie-break (for example, a workspace where the root pins a different version than a nested project). Ported from yarn berry's hoist.ts at 4287909fa6.
…rence Port the headless `nodeLinker: hoisted` cases from `installing/deps-restorer/test/index.ts`: - `installing_with_hoisted_node_linker_frozen` (index.ts:859): a `--frozen-lockfile` replay materializes the hoisted layout (real directories plus version-conflict nesting) from a pre-seeded lockfile. - `installing_in_a_workspace_with_hoisted_node_linker_frozen` (index.ts:873): a workspace replay where the root importer's `ms@2.1.3` wins the top-level `node_modules` slot and a project's conflicting `ms@2.0.0` nests under the project. Add a matching walker unit test (`walker_workspace_root_version_wins_root_slot`) that pins the same root-wins invariant deterministically, and tick the two ported cases in `plans/TEST_PORTING.md`.
`cargo fmt` wrapped and collapsed two macro calls in ways perfectionist rejects, and a doc comment carried a bare numbered reference: - real-hoist `tests.rs`: add the trailing comma perfectionist requires on a multi-line `assert!` that fmt had wrapped. - package-manager `hoisted_dep_graph/tests.rs`: drop the trailing comma fmt left on a single-line `assert_eq!`, and rephrase a bare numbered reference in a doc comment that `bare_issue_reference` flagged. Verified with `RUSTFLAGS="-D warnings" cargo dylint --all -- --all-targets --workspace`.
The per-pass ident-shift loop in `hoist_into_root` dropped the preferred candidate with `Vec::remove(0)`, an O(n) front-removal that turns the loop quadratic when many versions of one name compete — on the install/link hot path. Switch the per-name candidate lists to `VecDeque` and shift with `pop_front()` (O(1)). Also reattach the `hoist_into_root` doc comment, which the `HoistCtx` insertion in 0898df1 accidentally stranded onto the adjacent struct. Addresses a CodeRabbit review comment on #12510.
…_map `build_hoist_ident_map`'s `if root.peer_names.contains(&name)` guard (yarn's `!rootNode.peerNames.has(name)` in `getHoistIdentMap`) was uncovered: `hoist` always builds the `.` root with empty `peer_names`, so no install or public-API path reaches it — the branch exists for parity and for the per-importer hoisting roots that will carry peers. Add a direct unit test that hands the private builder a synthetic root declaring a peer that is reachable through its non-peer descendants, exercising the skip branch. Verified by neutralizing the guard and watching the test fail.
Three branches in the preference machinery were unreachable from the public `hoist` entry (the `.` root is always built with empty `peer_names`, and `build_hoist_ident_map` never emits an empty candidate list), so cover them with direct unit tests of the private functions: - `add_dependent`'s peer branch (a node records a child it declares as its own peer without recursing into it): assert the peer's exclusive subtree is never walked. - `is_preferred_ident`'s two early returns: a name absent from the map, and a name mapping to an empty candidate list — both hoist freely. Each test was verified by neutralizing the guard and watching it fail.
The preferred-ident change carried a few comments that restated information already present elsewhere: - the `None if is_preferred_ident` match arm glossed what the `AbsorbDecision::Defer` variant doc already states; - the stay-at-parent arm re-enumerated each variant's meaning, which lives on the variants themselves; - two preference-map unit tests narrated their fixture in an inline comment that repeated their own doc comment; - the frozen-lockfile e2e test docs and `most_used_version_wins_root_slot` re-narrated the body's step comments and assert messages. Keep the load-bearing why (ancestor-path consequence, the discovered-first contrast, the upstream mirror) and drop the restatements, per the repo comment conventions.
677e4fa to
9dce517
Compare
nodeLinker: 'hoisted'nodeLinker: 'hoisted'
Code Review by Qodo
1. Hoist preference map overhead
|
| /// One entry of the preference map: the set of dependent idents | ||
| /// (and peer-dependent idents) that pull in a given `(name, | ||
| /// ident)` package. Usage count is the sum of the two, matching | ||
| /// yarn's `entry.dependents.size + entry.peerDependents.size`. | ||
| #[derive(Default)] | ||
| struct PreferenceEntry { | ||
| dependents: HashSet<String>, | ||
| peer_dependents: HashSet<String>, | ||
| } | ||
|
|
||
| impl PreferenceEntry { | ||
| fn usages(&self) -> usize { | ||
| self.dependents.len() + self.peer_dependents.len() | ||
| } | ||
| } | ||
|
|
||
| /// Port of yarn's `buildPreferenceMap` + `getHoistIdentMap`. For | ||
| /// each dependency name reachable from `root`, returns its | ||
| /// candidate idents (references) ordered most-preferred first: | ||
| /// | ||
| /// 1. The root's own direct deps are seeded first, so a version the | ||
| /// root depends on always wins its name slot. | ||
| /// 2. Every other ident follows, ordered by usage (the count of | ||
| /// distinct dependents + peer-dependents) descending, stable on | ||
| /// ties (preserving depth-first discovery order). | ||
| /// | ||
| /// [`hoist_into_root`] consults the front of each list as the | ||
| /// currently-preferred ident and shifts it as passes progress. | ||
| /// Ports | ||
| /// <https://github.com/yarnpkg/berry/blob/4287909fa6a0a1ec976a55776bff606864b31990/packages/yarnpkg-nm/sources/hoist.ts>. | ||
| fn build_hoist_ident_map(root: &Rc<HoisterResult>) -> HashMap<String, VecDeque<String>> { | ||
| let mut preference: IndexMap<(String, String), PreferenceEntry> = IndexMap::new(); | ||
| let mut seen: HashSet<*const HoisterResult> = HashSet::new(); | ||
| seen.insert(Rc::as_ptr(root)); | ||
|
|
||
| let root_ident = node_ident(root); | ||
| let root_children: Vec<Rc<HoisterResult>> = | ||
| root.dependencies.borrow().iter().map(|dep| Rc::clone(&dep.0)).collect(); | ||
| for dep in &root_children { | ||
| if !root.peer_names.contains(&dep.name) { | ||
| add_dependent(&root_ident, dep, &mut preference, &mut seen); | ||
| } | ||
| } | ||
|
|
||
| // Seed the result with the root and its direct deps so their | ||
| // idents always rank first. Mirrors `getHoistIdentMap`'s initial | ||
| // `identMap` construction before the sorted append loop. | ||
| let mut ident_map: IndexMap<String, VecDeque<String>> = IndexMap::new(); | ||
| ident_map.insert(root.name.clone(), VecDeque::from([root_ident])); | ||
| for dep in &root_children { | ||
| if !root.peer_names.contains(&dep.name) { | ||
| ident_map.insert(dep.name.clone(), VecDeque::from([node_ident(dep)])); | ||
| } | ||
| } | ||
|
|
||
| let mut keys: Vec<(String, String)> = preference.keys().cloned().collect(); | ||
| // `hoist_priority` is always 0 in pacquet, so the sort reduces to | ||
| // usage (descending). `sort_by` is stable, so equal-usage keys | ||
| // keep preference-map insertion order (depth-first discovery) — | ||
| // matching yarn's `keyList.sort`, which is likewise stable on | ||
| // equal usage. | ||
| keys.sort_by(|left, right| preference[right].usages().cmp(&preference[left].usages())); | ||
| for (name, ident) in keys { | ||
| if root.peer_names.contains(&name) { | ||
| continue; | ||
| } | ||
| let idents = ident_map.entry(name).or_default(); | ||
| if !idents.contains(&ident) { | ||
| idents.push_back(ident); | ||
| } | ||
| } | ||
|
|
||
| ident_map.into_iter().collect() | ||
| } | ||
|
|
||
| /// Recursive half of [`build_hoist_ident_map`]'s preference pass. | ||
| /// Records `dependent_ident` as a dependent of `node`, then (the | ||
| /// first time `node` is seen) recurses into its non-peer children | ||
| /// and records peer children as peer-dependents. Mirrors yarn's | ||
| /// `addDependent`. | ||
| fn add_dependent( | ||
| dependent_ident: &str, | ||
| node: &Rc<HoisterResult>, | ||
| preference: &mut IndexMap<(String, String), PreferenceEntry>, | ||
| seen: &mut HashSet<*const HoisterResult>, | ||
| ) { | ||
| let parent_ident = node_ident(node); | ||
| preference | ||
| .entry((node.name.clone(), parent_ident.clone())) | ||
| .or_default() | ||
| .dependents | ||
| .insert(dependent_ident.to_string()); | ||
|
|
||
| if seen.insert(Rc::as_ptr(node)) { | ||
| let children: Vec<Rc<HoisterResult>> = | ||
| node.dependencies.borrow().iter().map(|dep| Rc::clone(&dep.0)).collect(); | ||
| for child in children { | ||
| if node.peer_names.contains(&child.name) { | ||
| preference | ||
| .entry((child.name.clone(), node_ident(&child))) | ||
| .or_default() | ||
| .peer_dependents | ||
| .insert(parent_ident.clone()); | ||
| } else { | ||
| add_dependent(&parent_ident, &child, preference, seen); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
1. Hoist preference map overhead 🐞 Bug ➹ Performance
build_hoist_ident_map builds per-package dependent sets as HashSet<String> and repeatedly to_string()/clone()s identifiers while traversing the whole graph, which adds extra allocations/CPU to every nodeLinker: hoisted install. This runs inside hoist_into_root, and hoist() is invoked as part of building the hoisted dependency graph during installs, so the added work is on a critical path for large workspaces.
Agent Prompt
### Issue description
`build_hoist_ident_map` builds usage counts by storing *distinct dependents* in `HashSet<String>` and inserting cloned `String`s per edge. On large graphs this can be a meaningful regression in CPU time and memory during hoisted installs.
### Issue Context
- `hoist_into_root` calls `build_hoist_ident_map(root)` for each hoist run.
- The preference map uses `HashSet<String>` and frequently allocates/clones strings (`dependent_ident.to_string()`, `parent_ident.clone()`).
### Fix Focus Areas
- Consider tracking dependent identity without allocating new `String`s per edge (e.g., stable numeric IDs / pointer identity / interning `Rc<str>`), while preserving the “distinct dependents” semantics.
- Keep output deterministic and behavior-equivalent.
- pacquet/crates/real-hoist/src/lib.rs[764-872]
- pacquet/crates/real-hoist/src/lib.rs[707-748]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Summary
Closes a parity gap in pacquet's
nodeLinker: 'hoisted'implementation.When several versions of the same dependency name compete for the top-level
node_modulesslot, pnpm (via yarn berry's hoister) gives the slot to themost-used version, with the root's own direct dependencies always ranking
first. pacquet previously hoisted whichever version it visited first in its
depth-first walk, so a less-used version could win the root slot and produce a
layout that is more deeply nested than pnpm's.
This PR ports yarn's
buildPreferenceMap/getHoistIdentMapintoreal-hoist:(root deps first, then by usage count);
AbsorbDecision::Deferplus anis_preferred_identgate keeps anon-preferred version nested until a per-pass
VecDeque::pop_frontidentshift promotes the next candidate when the preferred one cannot reach the
root;
HoistCtxbundles the root, the border-name set, and the ident map so therecursive walker stays within the argument limit.
It also ports two upstream frozen-lockfile (headless) tests
(
installing/deps-restorer/test/index.ts:859and:873), adds unit anddep-graph coverage for the preference machinery, and marks both items as
ported in
pacquet/plans/TEST_PORTING.md.This is a pacquet-only port: the behavior already exists in the TypeScript pnpm
CLI (the source of truth), so no TypeScript-side change is required.
Squash Commit Body
Checklist
pacquet/port. This is a pacquet-only port of behavior that already existsin the TypeScript pnpm CLI (the reference being ported from), bringing the two
back into parity.
Written by an agent (Claude Code, claude-opus-4-8).
Summary by CodeRabbit
nodeLinker: hoisted, verifying real directory materialization and correct nesting for conflicting dependency versions.