Skip to content

fix(lockfile): parse scalar pnpm platform fields#337

Merged
jdx merged 1 commit intomainfrom
codex/fix-pnpm-scalar-platform-fields
Apr 27, 2026
Merged

fix(lockfile): parse scalar pnpm platform fields#337
jdx merged 1 commit intomainfrom
codex/fix-pnpm-scalar-platform-fields

Conversation

@jdx
Copy link
Copy Markdown
Contributor

@jdx jdx commented Apr 27, 2026

What changed

Why

pnpm can emit package platform fields as either arrays or scalar strings. Aube already handled this shape for registry packuments, but the pnpm lockfile parser expected arrays and failed on libc: glibc.

Validation

  • cargo test -p aube-lockfile
  • cargo fmt --check
  • commit hook: cargo fmt --check and clippy for the staged Rust file

Note

Low Risk
Low risk: small deserialization change limited to normalizing os/cpu/libc metadata, with added regression coverage; potential impact is only on platform-constraint parsing for pnpm inputs.

Overview
pnpm lockfile parsing now accepts os, cpu, and libc platform fields as either a scalar string or an array, normalizing all forms to Vec<String>.

The shared string_or_seq deserializer was moved into aube-util (adding a serde dependency) and is now used by both the pnpm lockfile parser and registry packument parsing, with a new regression test covering the sass-embedded-linux-arm64 lockfile shape.

Reviewed by Cursor Bugbot for commit a4ce500. Bugbot is set up for automated code reviews on this repo. Configure here.

@jdx jdx marked this pull request as ready for review April 27, 2026 12:03
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 27, 2026

Greptile Summary

This PR extracts the string_or_seq serde helper into aube-util (resolving the duplication flagged in the previous review) and applies it to the pnpm lockfile os/cpu/libc fields, which previously expected only arrays and failed on scalar strings like libc: glibc. A regression test covering the sass-embedded-linux-arm64 shape is included. The new implementation is strictly more permissive than the old registry-local one — unexpected value types (booleans, numbers, objects) now silently normalize to an empty vec rather than returning a deserialize error, which is the right direction for lockfile tolerance.

Confidence Score: 5/5

Safe to merge — small, well-scoped fix with a regression test and no breaking behaviour changes.

No P0 or P1 issues found. The refactor correctly uses serde_json::Value as a generic serde intermediate (works with both JSON and YAML deserializers), the behaviour difference vs the old registry implementation is intentional and safer, and the regression test directly covers the reported failure shape.

No files require special attention.

Important Files Changed

Filename Overview
crates/aube-util/src/lib.rs Adds shared string_or_seq deserializer; handles scalar, array, null, and unexpected types — all paths look correct
crates/aube-lockfile/src/pnpm.rs Applies aube_util::string_or_seq to os, cpu, libc fields and adds a regression test for the scalar-string shape — correct and well-tested
crates/aube-registry/src/lib.rs Removes the now-duplicate local string_or_seq and switches callers to aube_util::string_or_seq — behaviour-preserving refactor
crates/aube-util/Cargo.toml Adds serde workspace dependency needed by the new deserializer; serde_json was already present
Cargo.lock Lock file updated to record serde as a direct dependency of aube-util

Reviews (2): Last reviewed commit: "fix(lockfile): parse scalar pnpm platfor..." | Re-trigger Greptile

Comment thread crates/aube-lockfile/src/pnpm.rs Outdated
@jdx jdx force-pushed the codex/fix-pnpm-scalar-platform-fields branch from 3436b63 to a4ce500 Compare April 27, 2026 12:11
@jdx jdx merged commit 31ed4c8 into main Apr 27, 2026
17 checks passed
@jdx jdx deleted the codex/fix-pnpm-scalar-platform-fields branch April 27, 2026 12:21
jdx added a commit that referenced this pull request Apr 27, 2026
## Summary

- Preserve existing top-level `package.json` key order when commands
rewrite dependency sections.
- Parse Bun lockfile local tarball entries whose package ident omits the
`file:` prefix.
- Accept abbreviated git commit SHAs from existing Bun GitHub-shorthand
lockfile entries during checkout verification.

## Details

`aube add` already preserved manifest order by editing the parsed raw
`package.json` object and syncing only dependency sections. This PR
moves that targeted raw-JSON dependency-section writer into the shared
command helpers so `add`, `remove`, and `update --latest` use the same
order-preserving path.

`remove` still prunes pnpm/aube sidecar metadata, but now applies those
removals directly to the parsed raw JSON object instead of replacing
unrelated fields from the typed `PackageJson.extra` map. That keeps
existing key positions and avoids churn in unrelated or nested manifest
data.

The Bun regression matrix also exposed two plain-install failures not
covered by the currently open lockfile PRs:

- Bun can write local tarball package entries as
`local-helper@tarballs/local-helper-1.0.0.tgz` while the workspace
dependency remains `file:tarballs/local-helper-1.0.0.tgz`. Aube now
recognizes that prefixless `.tgz` ident as `LocalSource::Tarball`
instead of treating it as a registry version.
- Bun records GitHub shorthand lock entries with abbreviated commit IDs.
Aube's clone verifier now accepts a checked-out full SHA when it starts
with the abbreviated lockfile SHA, while still rejecting non-hex refs
and mismatched SHAs.

I checked the other open lockfile PRs before adding these: #337 covers
pnpm scalar platform fields, and #338 covers package-lock git resolved
URLs plus pnpm/Bun scalar platform metadata. Those fixes are
intentionally not duplicated here.

## Validation

- `cargo fmt --check`
- `cargo build`
- `cargo test -p aube write_manifest_dep_sections`
- `cargo test -p aube-lockfile test_parse_prefixless_local_tarball`
- `cargo test -p aube-lockfile test_parse_github_dep`
- `cargo test -p aube-store git_commit_matches_abbreviated_sha`
- `mise run test:bats test/remove.bats`
- `mise run test:bats test/update.bats`
- Bun regression matrix from
`johnpyp/2026-04-23-aube-bun-lock-regression-matrix` against
`target/debug/aube`; `github-shorthand` and `local-tarball` now pass
plain install as `plain-aube-unchanged`
- pre-commit hook: cargo fmt and cargo clippy for staged Rust files

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Medium risk because it changes how `package.json` is rewritten across
multiple commands and relaxes git checkout verification to accept
abbreviated SHAs, which could affect correctness of dependency updates
and git-sourced installs.
> 
> **Overview**
> Improves compatibility and reduces churn when working with Bun and
when editing `package.json`.
> 
> Commands that mutate dependencies (`add`, `remove`, `update --latest`)
now update only the dependency sections in the existing parsed
`package.json` object (via shared helpers) so **top-level key order is
preserved** and empty dep sections are removed without reserializing
unrelated fields; `remove` also prunes pnpm/aube sidecar metadata
directly in raw JSON to avoid reordering.
> 
> Bun lockfile parsing now treats prefixless local tarball idents (e.g.
`pkg@tarballs/foo.tgz` without `file:`) as `LocalSource::Tarball`, and
git dependency checkouts now accept **abbreviated hex SHAs** by
verifying the full HEAD starts with the requested short SHA. Added
targeted unit tests and a bats test to lock in these behaviors.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
edcf920. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
jdx added a commit that referenced this pull request Apr 27, 2026
## Summary

- Parse git `resolved` URLs from `package-lock.json` entries as
`LocalSource::Git` instead of registry packages.
- Key those locked packages by the git source dep path so install
imports the checkout rather than fetching `<name>@<version>` from npm.
- Re-emit npm git `resolved` fields with the npm-compatible `git+`
prefix, while keeping unsupported non-git local sources out of
package-lock output.
- Accept scalar or array `os` / `cpu` / `libc` metadata in Bun lockfile
package entries.
- Add parser, writer, and BATS regressions covering these lockfile
compatibility cases.

## Root Cause

The npm lockfile reader only preserved `resolved` values that looked
like HTTP tarball URLs. Git resolved URLs such as `git+ssh://...#<sha>`
were dropped, leaving only `<pkg>@<version>`, which sent install down
the npm registry tarball path. The npm writer also needed to preserve
the external npm-compatible `git+` form when writing those pinned git
sources back.

The Bun reader assumed platform metadata was always serialized as
arrays. Real Bun lockfiles can carry single platform values as scalars,
for example `{ "os": "darwin", "cpu": "arm64" }`, which caused parse
failures or metadata loss.

Note: the related pnpm scalar `os` / `cpu` / `libc` fix is now on `main`
via #337, so this PR keeps only the remaining package-lock and Bun
fixes.

## Validation

- `cargo fmt --check`
- `cargo test -p aube-lockfile`
- `cargo clippy --all-targets -- -D warnings`
- `cargo build`
- `mise run test:bats test/git_deps.bats`
jdx added a commit that referenced this pull request Apr 30, 2026
## Summary

- Extends [#337](#337) to the npm
`package-lock.json` parser. The pnpm parser learned to accept scalar
`libc` / `os` / `cpu` (e.g. `"libc": "glibc"`) but the npm parser still
required arrays, so verbatim-roundtripped lockfiles containing
`sass-embedded-linux-*` and friends fail to parse.
- Switches `RawNpmPackage::{os,cpu,libc}` to `aube_util::string_or_seq`,
the same helper bun and pnpm already use.
- Adds a regression test with the exact shape reported in [discussion
#336](#336 (reply in thread)).

## Test plan

- [x] `cargo test -p aube-lockfile --lib npm::`
- [x] `cargo clippy -p aube-lockfile --all-targets -- -D warnings`
- [x] `cargo fmt --check`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: small, localized change to npm lockfile deserialization to
accept an additional JSON shape, with a focused regression test covering
the new behavior.
> 
> **Overview**
> Fixes npm `package-lock.json` parsing to accept platform fields `os`,
`cpu`, and `libc` when npm emits them as **scalar strings** instead of
arrays, by deserializing them via `aube_util::string_or_seq`.
> 
> Adds a regression test that parses a minimal v3 lockfile containing
scalar `os`/`cpu`/`libc` (e.g. `sass-embedded-linux-arm`) and asserts
the values are captured correctly.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
d164b37. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant