Skip to content

fix(audit): prune path traversal#12087

Merged
zkochan merged 3 commits into
pnpm:mainfrom
aqeelat:fix/audit-path-traversal-performance
Jun 1, 2026
Merged

fix(audit): prune path traversal#12087
zkochan merged 3 commits into
pnpm:mainfrom
aqeelat:fix/audit-path-traversal-performance

Conversation

@aqeelat

@aqeelat aqeelat commented May 31, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Prune lockfile subtrees that cannot reach vulnerable packages while building audit findings paths.
  • Stop descending into subtrees once every reachable vulnerable package/version has hit the existing path cap.
  • Add focused coverage for non-vulnerable subtree pruning and saturated finding short-circuiting.

Benchmark

In our large monorepo (/Users/aqeelat/fathom/frontend/fathom-frontend), pnpm audit went from 4+ minutes to less than 2 seconds with this change:

Before: pnpm audit  251.46s user 2.22s system 93% cpu 4:30.94 total
After:  real 1.44s, user 0.59s, sys 0.04s

Validation

  • pnpm --filter @pnpm/deps.compliance.audit test
  • git diff --check
  • Push hook: typecheck, pnpm bundle compile, spellcheck, meta lint, TS lint

Fixes #12086


Written by an agent (OpenCode, gpt-5.5).

Summary by CodeRabbit

  • Performance
    • Improved audit performance: non-vulnerable lockfile subtrees are pruned and path enumeration stops once per‑vulnerability path limits are reached, reducing work and speeding up audits.
  • Tests
    • Added tests verifying traversal pruning, saturation/capping behavior, and correct handling of cycles to ensure audit accuracy and reliability.
  • Chores
    • Updated release metadata and added a note describing the audit performance improvements.

Copilot AI review requested due to automatic review settings May 31, 2026 02:18
@aqeelat aqeelat requested a review from zkochan as a code owner May 31, 2026 02:18
@coderabbitai

coderabbitai Bot commented May 31, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: aadc3018-9bf8-4b7d-b82b-4d2ed5b5d4df

📥 Commits

Reviewing files that changed from the base of the PR and between 733dae6 and 5a057d4.

📒 Files selected for processing (2)
  • deps/compliance/audit/src/lockfileToAuditIndex.ts
  • deps/compliance/audit/test/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • deps/compliance/audit/src/lockfileToAuditIndex.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Compile & Lint
  • GitHub Check: Analyze (javascript)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Follow Standard Style with trailing commas, preferring functions over classes, and declaring functions after they are used (relying on hoisting)
Use a single options object instead of multiple parameters when a function needs more than two or three arguments
Follow Import Order: Standard libraries first, then external dependencies (alphabetically), then relative imports
Write self-documenting code where function names, parameters, and types explain what a function does without requiring prose comments
Do not write comments that restate what the code already says; refactor via renaming, splitting helpers, or restructuring instead
Do not repeat documentation at call sites that already exists in JSDoc on the callee; update JSDoc once for all call sites to benefit
Use JSDoc only for a function's contract (preconditions, postconditions, edge cases, why the function exists), not for re-narrating the body
Do not record past implementation shape, refactor history, or 'the previous code did X' framing in code; use git log and git blame instead
Write comments only when: the reason for code is non-obvious (hidden invariant, workaround for known bug, deliberate exception), or the right name doesn't fit (temporary technical constraint)

Files:

  • deps/compliance/audit/test/index.ts
🧠 Learnings (1)
📚 Learning: 2026-05-14T09:04:00.133Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11622
File: resolving/npm-resolver/test/publishedBy.test.ts:350-354
Timestamp: 2026-05-14T09:04:00.133Z
Learning: In the pnpm/pnpm repository, ESLint is the authoritative style linter. Do not raise review findings for missing trailing commas in multiline function calls (e.g., `fs.writeFileSync(...)`) when this repo’s ESLint configuration does not report them and lint passes. Prefer deferring to the ESLint results for this specific trailing-comma rule rather than enforcing it manually in code review.

Applied to files:

  • deps/compliance/audit/test/index.ts
🔇 Additional comments (1)
deps/compliance/audit/test/index.ts (1)

280-309: LGTM!


📝 Walkthrough

Walkthrough

This PR precomputes reachable vulnerabilities per dependency path and adds saturation checks into audit path enumeration so DFS traversal prunes non-vulnerable subtrees and stops exploring once per-finding path caps are reached; tests validate pruning, saturation, and cyclic reachability.

Changes

Audit Path Traversal Performance

Layer / File(s) Summary
Release notes and changeset
.changeset/audit-path-traversal-performance.md
Changeset documents the performance improvements to pnpm audit related to pruning non-vulnerable lockfile subtrees and enforcing a path enumeration cap during vulnerability discovery.
Reachability computation and early-termination logic
deps/compliance/audit/src/lockfileToAuditIndex.ts
createReachableVulnerabilitiesGetter precomputes and memoizes reachable vulnerability keys per depPath with cycle-aware memoization; allReachableVulnerabilitiesSaturated checks per-finding saturation against paths/MAX_PATHS_PER_FINDING; walkForPaths uses these to short-circuit DFS when the reachable set is empty or saturated and re-checks saturation after recording a path to avoid needless child traversal.
Performance & correctness tests
deps/compliance/audit/test/index.ts
Adds Jest tests verifying lazy snapshot getter behavior and traversal changes: pruning of non-vulnerable subtrees (cold getters invoked once), path enumeration capped at saturation (vulnerable getters read only until cap), and cyclic-dependency tests asserting correct path recording and traversal-order robustness.

Sequence Diagram(s)

sequenceDiagram
  participant walkForPaths
  participant createReachableVulnerabilitiesGetter
  participant visit
  participant allReachableVulnerabilitiesSaturated
  walkForPaths->>createReachableVulnerabilitiesGetter: construct reachable-vulnerabilities getter
  walkForPaths->>visit: begin DFS from entry edges
  visit->>createReachableVulnerabilitiesGetter: query reachable keys for depPath
  visit->>allReachableVulnerabilitiesSaturated: check if reachable findings are saturated
  alt empty or saturated
    visit->>visit: return early (prune subtree)
  else
    visit->>visit: record path for vulnerable edge
    visit->>allReachableVulnerabilitiesSaturated: re-check saturation after record
    alt saturated
      visit->>visit: return early (stop exploring children)
    else
      visit->>visit: continue to children
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested reviewers

  • zkochan

Poem

🐇 I nibble through lockfile vines,

pruning branches, counting signs.
Paths to danger capped at the gate,
cold getters touched just once — celebrate!
Audit hops lighter, swift, and great.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(audit): prune path traversal' directly addresses the main optimization in the PR—pruning non-vulnerable subtrees and saturating path enumeration during traversal.
Linked Issues check ✅ Passed The PR implements two of the four proposed objectives from #12086: short-circuiting traversal when vulnerable packages reach MAX_PATHS [12086] and pruning non-vulnerable subtrees via precomputed reachability sets [12086].
Out of Scope Changes check ✅ Passed All changes are scoped to audit path traversal optimization (changeset metadata, lockfileToAuditIndex algorithm, and test coverage); event-loop yielding and AbortController wiring from issue #12086 are not addressed here.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed. For unrecoverable errors, disable the tool in CodeRabbit configuration.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-free-for-open-source-projects

Copy link
Copy Markdown

Review Summary by Qodo

Optimize pnpm audit performance via lockfile subtree pruning

🐞 Bug fix ✨ Enhancement

Grey Divider

Walkthroughs

Description
• Optimize audit performance by pruning non-vulnerable lockfile subtrees during path traversal
• Stop descending into dependency trees once all reachable vulnerabilities reach path cap
• Add helper functions to track reachable vulnerabilities and detect saturation
• Include comprehensive test coverage for pruning and saturation scenarios
Diagram
flowchart LR
  A["Audit Path Traversal"] --> B["Check Reachable Vulnerabilities"]
  B --> C["Prune Non-Vulnerable Subtrees"]
  C --> D["Check Saturation"]
  D --> E["Skip Traversal if Saturated"]
  E --> F["Improved Performance"]

Loading

Grey Divider

File Changes

1. deps/compliance/audit/src/lockfileToAuditIndex.ts ✨ Enhancement +72/-0

Add vulnerability reachability analysis and saturation detection

• Added createReachableVulnerabilitiesGetter() function to compute reachable vulnerabilities from
 each dependency node using memoization
• Added allReachableVulnerabilitiesSaturated() function to check if all reachable vulnerabilities
 have reached the path cap
• Integrated early exit checks in walkForPaths() visit function to skip traversal when no
 vulnerabilities are reachable or all are saturated
• Added helper functions vulnerabilityKey(), parseVulnerabilityKey(), and addAll() for
 vulnerability tracking

deps/compliance/audit/src/lockfileToAuditIndex.ts


2. deps/compliance/audit/test/index.ts 🧪 Tests +63/-0

Add tests for subtree pruning and saturation

• Added test for pruning non-vulnerable subtrees to verify cold dependency reads are minimized
• Added test for stopping at saturated vulnerable nodes to verify path enumeration stops at cap
• Both tests use property getters to track access counts and validate optimization behavior

deps/compliance/audit/test/index.ts


3. .changeset/audit-path-traversal-performance.md 📝 Documentation +6/-0

Document audit performance improvement

• Added changeset documenting performance improvement for pnpm audit command
• Describes pruning of non-vulnerable subtrees and path enumeration cap optimization

.changeset/audit-path-traversal-performance.md


Grey Divider

Qodo Logo

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deps/compliance/audit/src/lockfileToAuditIndex.ts`:
- Around line 246-271: The memoization currently caches an incomplete result
because get() returns new Set() for back-edges and only writes memo.set(...)
after recursion; change the algorithm to create and memoize an empty placeholder
set for edge.depPath immediately when starting get (e.g., const result = new
Set<string>(); memo.set(edge.depPath, result);), and on encountering a back-edge
(inTrail.has(edge.depPath)) return the already-memoized placeholder
(memo.get(edge.depPath)) instead of new Set(); then proceed to fill that
placeholder during recursion, remove from inTrail when done, and leave the
placeholder in memo for future calls; update the get function (and related
symbols nameVerFromPkgSnapshot, vulnerabilityKey, addAll) accordingly and add a
regression test that builds a cyclic lockfile graph to assert reachability is
preserved.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 880ccf00-3ff0-48e4-91cf-1f48b412f4e9

📥 Commits

Reviewing files that changed from the base of the PR and between d99b725 and 7152ec7.

📒 Files selected for processing (3)
  • .changeset/audit-path-traversal-performance.md
  • deps/compliance/audit/src/lockfileToAuditIndex.ts
  • deps/compliance/audit/test/index.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Agent
  • GitHub Check: Compile & Lint
  • GitHub Check: Analyze (javascript)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Follow Standard Style with trailing commas, preferring functions over classes, and declaring functions after they are used (relying on hoisting)
Use a single options object instead of multiple parameters when a function needs more than two or three arguments
Follow Import Order: Standard libraries first, then external dependencies (alphabetically), then relative imports
Write self-documenting code where function names, parameters, and types explain what a function does without requiring prose comments
Do not write comments that restate what the code already says; refactor via renaming, splitting helpers, or restructuring instead
Do not repeat documentation at call sites that already exists in JSDoc on the callee; update JSDoc once for all call sites to benefit
Use JSDoc only for a function's contract (preconditions, postconditions, edge cases, why the function exists), not for re-narrating the body
Do not record past implementation shape, refactor history, or 'the previous code did X' framing in code; use git log and git blame instead
Write comments only when: the reason for code is non-obvious (hidden invariant, workaround for known bug, deliberate exception), or the right name doesn't fit (temporary technical constraint)

Files:

  • deps/compliance/audit/test/index.ts
  • deps/compliance/audit/src/lockfileToAuditIndex.ts
🧠 Learnings (2)
📚 Learning: 2026-05-26T21:01:06.666Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11966
File: .changeset/require-tarball-integrity.md:6-6
Timestamp: 2026-05-26T21:01:06.666Z
Learning: In pnpm lockfile-related release notes/docs (especially changeset markdown), preserve URL hostnames exactly as they appear in pnpm-lock.yaml tarball resolution entries—keep hosts like `codeload.github.com`, `bitbucket.org`, and `gitlab.com` in lowercase. Do not “correct” them to title-case/preserve brand capitalization (e.g., LanguageTool rules like `GITHUB` capitalization) because these are literal URL fragments, not platform brand names.

Applied to files:

  • .changeset/audit-path-traversal-performance.md
📚 Learning: 2026-05-14T09:04:00.133Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11622
File: resolving/npm-resolver/test/publishedBy.test.ts:350-354
Timestamp: 2026-05-14T09:04:00.133Z
Learning: In the pnpm/pnpm repository, ESLint is the authoritative style linter. Do not raise review findings for missing trailing commas in multiline function calls (e.g., `fs.writeFileSync(...)`) when this repo’s ESLint configuration does not report them and lint passes. Prefer deferring to the ESLint results for this specific trailing-comma rule rather than enforcing it manually in code review.

Applied to files:

  • deps/compliance/audit/test/index.ts
  • deps/compliance/audit/src/lockfileToAuditIndex.ts
🔇 Additional comments (2)
.changeset/audit-path-traversal-performance.md (1)

1-6: LGTM!

deps/compliance/audit/test/index.ts (1)

5-5: LGTM!

Also applies to: 87-147

Comment thread deps/compliance/audit/src/lockfileToAuditIndex.ts Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes pnpm audit’s client-side path enumeration by pruning lockfile subtrees that cannot lead to vulnerable packages and by stopping traversal once all reachable vulnerable findings have reached the per-finding path cap, addressing the severe CPU-bound slowdown reported in #12086.

Changes:

  • Add a memoized “reachable vulnerabilities” precomputation to prune non-vulnerable subtrees during buildAuditPathIndex() traversal.
  • Short-circuit traversal when all reachable vulnerable findings are saturated at MAX_PATHS_PER_FINDING.
  • Add targeted tests covering subtree pruning and saturated-finding short-circuiting; add a changeset for patch releases.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
deps/compliance/audit/src/lockfileToAuditIndex.ts Adds reachable-vulnerability pruning + saturation-based traversal short-circuit for audit path building.
deps/compliance/audit/test/index.ts Adds tests to assert pruning behavior and saturated-finding traversal short-circuiting.
.changeset/audit-path-traversal-performance.md Declares patch bumps for @pnpm/deps.compliance.audit and pnpm with a performance note.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread deps/compliance/audit/src/lockfileToAuditIndex.ts Outdated
aqeelat added 2 commits June 1, 2026 14:20
…e reachability

The reachable-vulnerabilities getter returned a non-memoized empty Set for
back-edges, causing incomplete results for nodes in dependency cycles. Memoize
the result Set immediately so the same mutable placeholder is returned for
back-edges and filled as recursion unwinds.
Copilot AI review requested due to automatic review settings June 1, 2026 12:23
@zkochan zkochan force-pushed the fix/audit-path-traversal-performance branch from 204fbe4 to 733dae6 Compare June 1, 2026 12:23

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread deps/compliance/audit/src/lockfileToAuditIndex.ts
The placeholder-before-recursion approach only made the SCC entry node's
reachable set correct; non-entry cycle members were memoized with an
under-approximated set, dropping valid audit paths reached through them.
Cache a node's reachable vulnerabilities only when no descendant back-edges
to an ancestor; recompute cycle-touching nodes per query.
@zkochan zkochan merged commit 719cc21 into pnpm:main Jun 1, 2026
9 of 10 checks passed
@aqeelat aqeelat deleted the fix/audit-path-traversal-performance branch June 1, 2026 13:24
@zkochan zkochan mentioned this pull request Jun 2, 2026
zkochan pushed a commit to chentsulin/pnpm that referenced this pull request Jun 20, 2026
The reachable-vulnerability pruning added in 11.5.1 (pnpm#12087)
only memoized acyclic subtrees, so any node whose subtree touched a
dependency cycle, and all of its ancestors, was recomputed on every
query. That made the audit path walk O(N^2) on lockfiles that contain
cycles, which is the common case.

Reachability is now computed once per node with Tarjan's
strongly-connected-components algorithm. Every member of a cycle shares
one reachable set, so cyclic graphs are handled in linear time.

Closes pnpm#12212

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zkochan pushed a commit to chentsulin/pnpm that referenced this pull request Jun 20, 2026
The reachable-vulnerability pruning added in 11.5.1 (pnpm#12087)
only memoized acyclic subtrees, so any node whose subtree touched a
dependency cycle, and all of its ancestors, was recomputed on every
query. That made the audit path walk O(N^2) on lockfiles that contain
cycles, which is the common case.

Reachability is now computed once per node with Tarjan's
strongly-connected-components algorithm. Every member of a cycle shares
one reachable set, so cyclic graphs are handled in linear time.

Closes pnpm#12212

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
zkochan added a commit that referenced this pull request Jun 20, 2026
`pnpm audit` enumerates the install paths to every vulnerable package. The
reachability-based pruning added in 11.5.1 (#12087) lets the walker
skip subtrees that reach no unsaturated finding by precomputing, per node, the
set of vulnerabilities reachable from it.

That getter only memoised acyclic subtrees: a node whose subtree contained a
cycle was `complete === false`, and so was every ancestor up to the importer
roots. None of them were cached, so their reachable set was recomputed on every
query. Real dependency graphs commonly contain cycles, and a single cycle high
in the graph makes a large fraction of nodes non-memoisable, yielding an O(N^2)
walk. This matched the report in #12212 exactly (CPU-bound, identical
audit output across versions).

Reachability is now computed with Tarjan's strongly-connected-components
algorithm. Every node is scanned once; all members of an SCC reach the same set
of vulnerabilities and share one set, finalised in reverse-topological order.
Cyclic graphs are handled in O(N + E).

The reachable set is used only to prune, so it must never under-approximate
(that would hide a real finding). Tarjan yields the exact set for every node,
so no finding can be dropped, and the path-recording logic is unchanged. The
getter returns ReadonlySet<string> so the shared sets cannot be mutated by
callers, and a missing memo entry (an impossible-by-construction state) throws
rather than silently returning an empty set.

A regression test asserts the read-count growth ratio between two cycle sizes
(L=200 and L=400) is sub-quadratic: the fix scales ~2x (linear), the previous
code ~4x (quadratic). Asserting the ratio cancels the per-node constant, so the
test is not brittle to constant-factor changes.

Closes #12212.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Zoltan Kochan <z@kochan.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pnpm audit extremely slow (4+ min) and uncancellable after v11 rewrite

3 participants