fix(audit): compute reachable vulnerabilities with Tarjan SCC#12467
Conversation
|
💖 Thanks for opening this pull request! 💖 |
Code Review by Qodo
Context used 1. strongconnect() recursion stack overflow
|
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Improves pnpm audit performance on cyclic lockfiles by switching reachability computation to a linear-time SCC approach and adds a regression test to guard against quadratic behavior.
Changes:
- Compute reachable vulnerabilities using Tarjan’s strongly-connected-components algorithm to avoid quadratic recomputation on cycles.
- Add a performance/correctness regression test that constructs a large cyclic dependency graph.
- Add a changeset documenting the patch release fix.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| deps/compliance/audit/src/lockfileToAuditIndex.ts | Replaces cycle-prone reachability memoization with SCC-based reachability sets shared across cycles. |
| deps/compliance/audit/test/index.ts | Adds a regression test to ensure cyclic graphs are scanned a bounded number of times and paths remain correct. |
| .changeset/fix-audit-cyclic-path-traversal-perf.md | Documents the performance regression fix and release impact. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Code review by qodo was updated up to the latest commit e68ea52 |
|
Code review by qodo was updated up to the latest commit adbfaa8 |
|
Code review by qodo was updated up to the latest commit 7ab65ad |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughReplaces the reachability memoization logic inside ChangesTarjan SCC reachability fix for audit cycle performance
Sequence DiagramsequenceDiagram
participant Caller
participant createReachableVulnerabilitiesGetter
participant strongconnect
participant sccReachable as SCC shared set
Caller->>createReachableVulnerabilitiesGetter: getReachable(node)
createReachableVulnerabilitiesGetter->>strongconnect: visit(node, index, lowlink)
strongconnect->>strongconnect: iterate neighbors via explicit stack
strongconnect-->>sccReachable: fold member vulnerabilities into shared set
strongconnect-->>createReachableVulnerabilitiesGetter: memoize set for each SCC member
createReachableVulnerabilitiesGetter-->>Caller: ReadonlySet<string>
Caller->>allReachableVulnerabilitiesSaturated: check(ReadonlySet<string>)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Assessment against linked issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
7ab65ad to
490c73f
Compare
|
Code review by qodo was updated up to the latest commit 490c73f |
|
Code review by qodo was updated up to the latest commit 490c73f |
|
Code review by qodo was updated up to the latest commit 977230b |
1 similar comment
|
Code review by qodo was updated up to the latest commit 977230b |
|
Code review by qodo was updated up to the latest commit 07f8246 |
1 similar comment
|
Code review by qodo was updated up to the latest commit 07f8246 |
|
Code review by qodo was updated up to the latest commit f52aa6b |
1 similar comment
|
Code review by qodo was updated up to the latest commit f52aa6b |
The reachable-vulnerability pruning added in 11.5.1 (pnpm#12087) only memoized acyclic subtrees, so any node whose subtree touched a dependency cycle, and all of its ancestors, was recomputed on every query. That made the audit path walk O(N^2) on lockfiles that contain cycles, which is the common case. Reachability is now computed once per node with Tarjan's strongly-connected-components algorithm. Every member of a cycle shares one reachable set, so cyclic graphs are handled in linear time. Closes pnpm#12212 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`lowlink`, `strongconnect`, and `Tarjan` are used by the audit reachability fix and must be accepted by the spell checker. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make the read-only contract of the reachable-vulnerability sets explicit so callers cannot mutate the shared per-SCC sets or the empty singleton. Also narrow the cyclic-scan regression test name to what it asserts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Compare snapshot reads at two cycle sizes and assert the growth ratio is sub-quadratic, instead of a fixed read-count threshold. The ratio cancels the per-node constant, so the test stays robust to constant-factor changes in future linear-time refactors while still catching a quadratic regression. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closing a strongly-connected component allocated a fresh set and copied every member's contribution into it, even for the common singleton-SCC case in a mostly-acyclic graph. Reuse the first popped member's own set as the shared accumulator so singleton SCCs finalize with no extra allocation or copy, trimming constant-factor overhead on the audit path walk.
After strongconnect finalizes the queried node's SCC, its reachable set is always present in the memo. The previous `?? EMPTY_REACHABLE` fallback would, if that invariant were ever broken by a future refactor, silently return an empty set — pruning valid install paths and hiding a real vulnerability. Replace the shared empty-set fallback with an explicit throw so the bug surfaces instead of under-reporting, and drop the now-unused EMPTY_REACHABLE singleton.
A lockfile is untrusted input. Building the children/roots arrays with `target.push(...resolvedDepsToNamedDepPaths(deps))` spreads a lockfile-derived dependency list into push() arguments; a pathologically large list can exceed the JS engine's argument-count limit and throw, crashing pnpm audit. Append via a loop helper instead, which also drops the intermediate array allocation.
…flow A lockfile is untrusted input, so a deeply nested dependency chain could overflow the JS call stack and crash pnpm audit. All four DFS traversals on the audit path — the bulk-request visitor, the reachable-vulnerability Tarjan SCC builder, the path walker, and the optional-only reachability walk — now use explicit work stacks instead of recursion. The path walker also tracked the install trail by copying the whole path array into every node (O(depth) per node, O(depth^2) for a deep chain). It now links each trail node to its parent and materializes the path string only when a finding is recorded, keeping memory linear in the depth. Adds a 60k-deep-chain regression test that completes without overflowing the stack or exhausting memory.
f52aa6b to
e3aa3f6
Compare
|
Congrats on merging your first pull request! 🎉🎉🎉 |
Summary
pnpm auditbecame dramatically slower (CPU-bound, ~5–6×) starting in 11.5.1 on workspaces whose lockfiles contain dependency cycles. This PR restores linear-time behaviour while keeping the path-pruning optimisation introduced in #12087.The reachability pruning added in 11.5.1 (
createReachableVulnerabilitiesGetter) only memoised acyclic subtrees. A node whose subtree touched a cycle — and every one of its ancestors up to the importer roots — was therefore recomputed from scratch on every query, turning the path walk intoO(N²). Reachability is now computed once per node with Tarjan's strongly-connected-components algorithm: every node is scanned exactly once, and all members of a cycle share a single reachable-vulnerability set, so cyclic graphs are handled inO(N + E).The reachable set is used only to prune, so it must never be an under-approximation (that would hide a real finding). Tarjan yields the exact set for every node, so no finding can be dropped; the path-recording logic in
walkForPathsis unchanged.Closes #12212.
Squash Commit Body
Checklist
pacquet/port, or the description notes what still needs porting. —auditis a TypeScript-only command (outside pacquet's current
install/add/update/removesurface), so no pacquet-side port is needed.pnpm changeset) if this PR changes any publishedpackage. Keep it short and written for pnpm users — it becomes a release note.
@pnpm/deps.compliance.audit; all 23 tests pass.Summary by CodeRabbit
pnpm auditperformance on lockfiles with dependency cycles by computing reachability per cyclic group only once, reducing redundant work.