Skip to content

fix(gvs): engine-agnostic hashes and build failure recovery#10846

Merged
zkochan merged 6 commits intomainfrom
fix/10837
Mar 6, 2026
Merged

fix(gvs): engine-agnostic hashes and build failure recovery#10846
zkochan merged 6 commits intomainfrom
fix/10837

Conversation

@zkochan
Copy link
Copy Markdown
Member

@zkochan zkochan commented Mar 3, 2026

Summary

  • Use the allowBuilds config to determine which packages need ENGINE_NAME in their GVS hash. Packages that are not allowed to build now get engine-agnostic hashes, so they survive Node.js upgrades and architecture changes without re-import.
  • Persist allowBuilds in modules.yaml so that when it changes between installs, the headless installer detects the difference and re-processes all packages with updated GVS hashes.
  • Deduplicate computeBuiltDepPaths into iterateHashedGraphNodes, which now accepts an AllowBuild function instead of a precomputed Set.
  • Recover from failed or interrupted GVS builds using a .pnpm-needs-build marker file.

Build failure recovery

When a GVS package needs building, a .pnpm-needs-build marker file is added to the filesMap before calling importPackage(). The import pipeline treats it as a regular file, so it's atomically included in the staged directory and renamed with the package. On the next install, GVS fast paths detect the marker and force a re-fetch/re-import/re-build.

  • On build success: the marker is removed (before uploading side effects, so it doesn't pollute the side effects cache).
  • On build failure: the entire hash directory is removed so the next install starts fresh.

The marker is only checked for packages that are allowed to build (via allowBuild), minimizing filesystem operations. It is also skipped when cached side effects will be applied, since the package is already built.

Test plan

  • GVS successful build removes the marker and doesn't include it in side effects cache
  • GVS build failure removes the hash directory
  • GVS rebuild works correctly after hash directory cleanup
  • .pnpm-needs-build marker triggers re-import and rebuild on next install

Closes #10837

zkochan and others added 3 commits March 3, 2026 22:34
…hashes

Use the allowBuilds config to determine which packages need ENGINE_NAME
in their GVS hash. Packages that are not allowed to build (and don't
transitively depend on packages that are) now get engine-agnostic hashes,
so they survive Node.js upgrades and architecture changes without
re-import.

Closes #10837

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Store the allowBuilds config in modules.yaml so that when it changes
between installs, the headless installer detects the difference and
re-processes all packages with updated GVS hashes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move builtDepPaths computation inside iterateHashedGraphNodes, which now
accepts an AllowBuild function instead of a precomputed Set. This
eliminates duplicate logic in iteratePkgsForVirtualStore and
resolve-dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zkochan zkochan marked this pull request as ready for review March 6, 2026 12:32
Copilot AI review requested due to automatic review settings March 6, 2026 12:32
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates global virtual store (GVS) hashing so that packages that are not allowed to run build scripts (per allowBuilds) can use engine-agnostic hashes, reducing unnecessary GVS invalidations across Node.js/architecture changes.

Changes:

  • Thread allowBuilds through install/headless/modules state so GVS relinking can react to policy changes.
  • Update @pnpm/calc-dep-state graph hashing to omit ENGINE_NAME when a package (and its transitive deps) are not allowed to build.
  • Add tests covering engine-agnostic hashing behavior and relinking when allowBuilds changes.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg-manager/resolve-dependencies/src/index.ts Passes allowBuild into graph hashing; adjusts iteration used to extend the dependency graph.
pkg-manager/modules-yaml/src/index.ts Persists allowBuilds in .modules.yaml state type.
pkg-manager/headless/src/index.ts Triggers relink when allowBuilds changes (for GVS) and writes allowBuilds to modules manifest.
pkg-manager/core/test/install/globalVirtualStore.ts Adds integration tests for engine-agnostic hashes and relinking on allowBuilds changes.
pkg-manager/core/src/install/index.ts Writes allowBuilds into modules manifest.
packages/calc-dep-state/test/index.ts Adds unit tests for calcGraphNodeHash engine inclusion/omission.
packages/calc-dep-state/src/index.ts Implements allowBuilds-based engine-agnostic node hashing + transitive “requires build” walk.
deps/graph-builder/src/iteratePkgsForVirtualStore.ts Threads allowBuild into hash computation for GVS paths.
.changeset/engine-agnostic-gvs-hashes.md Declares minor releases and documents the behavior change.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 121 to 132
const { name, version, depPath } = pkgMeta
// When builtDepPaths is provided (derived from the allowBuilds config),
// we only include the engine name for packages that are allowed to build
// or transitively depend on a package that is allowed to build.
// This makes GVS hashes engine-agnostic for pure-JS packages,
// so they survive Node.js upgrades and architecture changes.
const includeEngine = builtDepPaths === undefined ||
transitivelyRequiresBuild(graph, builtDepPaths, buildRequiredCache ??= {}, depPath, new Set())
const state = {
// Unfortunately, we need to include the engine name in the hash,
// even though it's only required for packages that are built,
// or have dependencies that are built.
// We can't know for sure whether a package needs to be built
// before it's fetched from the registry.
// However, we fetch and write packages to node_modules in random order for performance,
// so we can't determine at this stage which dependencies will be built.
engine: ENGINE_NAME,
engine: includeEngine ? ENGINE_NAME : null,
deps: calcDepGraphHash(graph, cache, new Set(), depPath),
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calcGraphNodeHash() may produce engine-agnostic hashes for packages whose resolution.type === 'variations' (e.g. node@runtime:*, bun@runtime:*, deno@runtime:*). Those packages select different artifacts per OS/CPU/libc at fetch time, so omitting ENGINE_NAME can cause hash collisions across engines and reuse the wrong runtime contents. Consider forcing includeEngine to true when the current node has a VariationsResolution (or more generally when multiple platform variants exist), regardless of allowBuilds / builtDepPaths.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integrities of those packages will differ though. So, no collision is expected.

Comment on lines +91 to +104
export function * iterateHashedGraphNodes<T extends PkgMeta> (
graph: DepsGraph<DepPath>,
pkgMetaIterator: PkgMetaIterator<T>
pkgMetaIterator: PkgMetaIterator<T>,
allowBuild?: AllowBuild
): IterableIterator<HashedDepPath<T>> {
const _calcGraphNodeHash = calcGraphNodeHash.bind(null, { graph, cache: {} })
for (const pkgMeta of pkgMetaIterator) {
const pkgMetaList = Array.from(pkgMetaIterator)
const builtDepPaths = computeBuiltDepPaths(pkgMetaList, allowBuild)
const _calcGraphNodeHash = calcGraphNodeHash.bind(null, {
graph,
cache: {},
builtDepPaths,
buildRequiredCache: builtDepPaths !== undefined ? {} : undefined,
})
for (const pkgMeta of pkgMetaList) {
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iterateHashedGraphNodes() now materializes the full pkgMetaIterator via Array.from(). For large lockfiles/graphs this can significantly increase peak memory and GC pressure, especially since hashing already walks the dependency graph. If possible, avoid collecting all entries when not needed (e.g. only build builtDepPaths/buildRequiredCache when allowBuild is provided and can return true for something), or restructure to compute builtDepPaths/hashes in a single pass without retaining the whole list.

Copilot uses AI. Check for mistakes.
Comment on lines 485 to 488
for (const { pkgMeta: { depPath }, hash } of iterateHashedGraphNodes(graph, iterateGraphPkgMetaEntries(graph), opts.allowBuild)) {
if (!opts.enableGlobalVirtualStore && !isRuntimeDepPath(depPath)) continue
const modules = path.join(opts.globalVirtualStoreDir, hash, 'node_modules')
const node = graph[depPath]
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extendGraph() now runs iterateHashedGraphNodes() over all graph nodes even when the global virtual store is disabled, and only then skips non-runtime nodes. This makes non-GVS installs pay the full hashing cost (and any hashing preconditions) unnecessarily. Consider restoring the previous behavior by filtering the iterator up-front (or branching on enableGlobalVirtualStore) so only runtime dep paths are hashed when GVS is off, while keeping the full-graph hashing only for the GVS-enabled case.

Copilot uses AI. Check for mistakes.
…-build marker

When a GVS package needs building, a .pnpm-needs-build marker file is added
to the filesMap before import. The import pipeline treats it as a regular file,
so it's atomically included in the staged directory and renamed with the
package. On the next install, GVS fast paths detect the marker and force a
re-fetch/re-import/re-build.

On build success, the marker is removed. On build failure, the entire hash
directory is removed so the next install starts fresh.

The marker is only checked for packages that are allowed to build (via
allowBuild), minimizing filesystem operations. It is also skipped when cached
side effects will be applied, since the package is already built.

Closes #10837

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the marker removal before the side effects upload so the marker
file is not included in the side effects diff. Add a test assertion
that verifies the marker does not appear in the cached side effects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

allowBuild?: AllowBuild
): Set<DepPath> | undefined {
if (allowBuild == null) {
return new Set()
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeBuiltDepPaths() currently returns an empty Set when allowBuild is undefined (allowBuild == null). That makes iterateHashedGraphNodes() always pass a defined builtDepPaths, so calcGraphNodeHash() will omit ENGINE_NAME even when the user did not configure allowBuilds (i.e. no allowlist policy was provided). This contradicts the intended fallback behavior (keep engine-specific hashes unless allowBuilds is explicitly configured) and can make hashes inconsistent with other call sites that still use the default (engine-including) hashing.

Consider returning undefined when allowBuild is not provided, and only returning a Set when an allowBuild function exists (including the case where allowBuilds: {} yields a function that always returns undefined).

Suggested change
return new Set()
return undefined

Copilot uses AI. Check for mistakes.
Comment on lines 238 to +255
@@ -243,14 +250,19 @@ async function buildGraphFromPackages (
!opts.includeUnchangedDeps
) {
dirExists = await pathExists(dir)
if (dirExists) return
brokenModulesLogger.debug({ missing: dir })
if (dirExists) {
if (!(mightNeedBuild && fs.existsSync(path.join(dir, '.pnpm-needs-build')))) return
} else {
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .pnpm-needs-build marker is added during import for both packages that requiresBuild and packages that have patch != null (see headless import logic). Here, the fast-path skip checks the marker only when allowBuild?.(pkgName, pkgVersion) === true. This means a patched package that isn’t in allowBuilds (or when allowBuild is undefined) may incorrectly take the fast path even if a previous run crashed before patching/building, leaving an unpatched/partially processed directory that should be re-imported.

Consider including the “patched” condition in mightNeedBuild (e.g. using existing patch info), or simply checking for the marker file regardless of allowBuild when enableGlobalVirtualStore is on (ideally using an async existence check to avoid sync I/O in hot paths).

Copilot uses AI. Check for mistakes.
@zkochan zkochan changed the title feat(calc-dep-state): use allowBuilds to compute engine-agnostic GVS hashes fix(gvs): engine-agnostic hashes and build failure recovery Mar 6, 2026
…allowBuild is not configured

Previously, computeBuiltDepPaths returned an empty Set when allowBuild
was undefined, causing all GVS hashes to become engine-agnostic even
without allowBuilds configured. Now the function is only called when
allowBuild is provided, and iterateHashedGraphNodes avoids materializing
the iterator when it's not needed.

Also restore upfront filtering in extendGraph so non-GVS installs only
hash runtime dep paths, and only pass allowBuild when GVS is on.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zkochan zkochan merged commit cd743ef into main Mar 6, 2026
10 of 13 checks passed
@zkochan zkochan deleted the fix/10837 branch March 6, 2026 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Re-add requiresBuild to lockfile snapshots for engine-agnostic GVS hashes

2 participants