fix(lockfile): parse scalar pnpm platform fields#337
Conversation
Greptile SummaryThis PR extracts the Confidence Score: 5/5Safe to merge — small, well-scoped fix with a regression test and no breaking behaviour changes. No P0 or P1 issues found. The refactor correctly uses No files require special attention. Important Files Changed
Reviews (2): Last reviewed commit: "fix(lockfile): parse scalar pnpm platfor..." | Re-trigger Greptile |
3436b63 to
a4ce500
Compare
## Summary - Preserve existing top-level `package.json` key order when commands rewrite dependency sections. - Parse Bun lockfile local tarball entries whose package ident omits the `file:` prefix. - Accept abbreviated git commit SHAs from existing Bun GitHub-shorthand lockfile entries during checkout verification. ## Details `aube add` already preserved manifest order by editing the parsed raw `package.json` object and syncing only dependency sections. This PR moves that targeted raw-JSON dependency-section writer into the shared command helpers so `add`, `remove`, and `update --latest` use the same order-preserving path. `remove` still prunes pnpm/aube sidecar metadata, but now applies those removals directly to the parsed raw JSON object instead of replacing unrelated fields from the typed `PackageJson.extra` map. That keeps existing key positions and avoids churn in unrelated or nested manifest data. The Bun regression matrix also exposed two plain-install failures not covered by the currently open lockfile PRs: - Bun can write local tarball package entries as `local-helper@tarballs/local-helper-1.0.0.tgz` while the workspace dependency remains `file:tarballs/local-helper-1.0.0.tgz`. Aube now recognizes that prefixless `.tgz` ident as `LocalSource::Tarball` instead of treating it as a registry version. - Bun records GitHub shorthand lock entries with abbreviated commit IDs. Aube's clone verifier now accepts a checked-out full SHA when it starts with the abbreviated lockfile SHA, while still rejecting non-hex refs and mismatched SHAs. I checked the other open lockfile PRs before adding these: #337 covers pnpm scalar platform fields, and #338 covers package-lock git resolved URLs plus pnpm/Bun scalar platform metadata. Those fixes are intentionally not duplicated here. ## Validation - `cargo fmt --check` - `cargo build` - `cargo test -p aube write_manifest_dep_sections` - `cargo test -p aube-lockfile test_parse_prefixless_local_tarball` - `cargo test -p aube-lockfile test_parse_github_dep` - `cargo test -p aube-store git_commit_matches_abbreviated_sha` - `mise run test:bats test/remove.bats` - `mise run test:bats test/update.bats` - Bun regression matrix from `johnpyp/2026-04-23-aube-bun-lock-regression-matrix` against `target/debug/aube`; `github-shorthand` and `local-tarball` now pass plain install as `plain-aube-unchanged` - pre-commit hook: cargo fmt and cargo clippy for staged Rust files <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Medium risk because it changes how `package.json` is rewritten across multiple commands and relaxes git checkout verification to accept abbreviated SHAs, which could affect correctness of dependency updates and git-sourced installs. > > **Overview** > Improves compatibility and reduces churn when working with Bun and when editing `package.json`. > > Commands that mutate dependencies (`add`, `remove`, `update --latest`) now update only the dependency sections in the existing parsed `package.json` object (via shared helpers) so **top-level key order is preserved** and empty dep sections are removed without reserializing unrelated fields; `remove` also prunes pnpm/aube sidecar metadata directly in raw JSON to avoid reordering. > > Bun lockfile parsing now treats prefixless local tarball idents (e.g. `pkg@tarballs/foo.tgz` without `file:`) as `LocalSource::Tarball`, and git dependency checkouts now accept **abbreviated hex SHAs** by verifying the full HEAD starts with the requested short SHA. Added targeted unit tests and a bats test to lock in these behaviors. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit edcf920. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->
## Summary
- Parse git `resolved` URLs from `package-lock.json` entries as
`LocalSource::Git` instead of registry packages.
- Key those locked packages by the git source dep path so install
imports the checkout rather than fetching `<name>@<version>` from npm.
- Re-emit npm git `resolved` fields with the npm-compatible `git+`
prefix, while keeping unsupported non-git local sources out of
package-lock output.
- Accept scalar or array `os` / `cpu` / `libc` metadata in Bun lockfile
package entries.
- Add parser, writer, and BATS regressions covering these lockfile
compatibility cases.
## Root Cause
The npm lockfile reader only preserved `resolved` values that looked
like HTTP tarball URLs. Git resolved URLs such as `git+ssh://...#<sha>`
were dropped, leaving only `<pkg>@<version>`, which sent install down
the npm registry tarball path. The npm writer also needed to preserve
the external npm-compatible `git+` form when writing those pinned git
sources back.
The Bun reader assumed platform metadata was always serialized as
arrays. Real Bun lockfiles can carry single platform values as scalars,
for example `{ "os": "darwin", "cpu": "arm64" }`, which caused parse
failures or metadata loss.
Note: the related pnpm scalar `os` / `cpu` / `libc` fix is now on `main`
via #337, so this PR keeps only the remaining package-lock and Bun
fixes.
## Validation
- `cargo fmt --check`
- `cargo test -p aube-lockfile`
- `cargo clippy --all-targets -- -D warnings`
- `cargo build`
- `mise run test:bats test/git_deps.bats`
## Summary - Extends [#337](#337) to the npm `package-lock.json` parser. The pnpm parser learned to accept scalar `libc` / `os` / `cpu` (e.g. `"libc": "glibc"`) but the npm parser still required arrays, so verbatim-roundtripped lockfiles containing `sass-embedded-linux-*` and friends fail to parse. - Switches `RawNpmPackage::{os,cpu,libc}` to `aube_util::string_or_seq`, the same helper bun and pnpm already use. - Adds a regression test with the exact shape reported in [discussion #336](#336 (reply in thread)). ## Test plan - [x] `cargo test -p aube-lockfile --lib npm::` - [x] `cargo clippy -p aube-lockfile --all-targets -- -D warnings` - [x] `cargo fmt --check` 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > Low risk: small, localized change to npm lockfile deserialization to accept an additional JSON shape, with a focused regression test covering the new behavior. > > **Overview** > Fixes npm `package-lock.json` parsing to accept platform fields `os`, `cpu`, and `libc` when npm emits them as **scalar strings** instead of arrays, by deserializing them via `aube_util::string_or_seq`. > > Adds a regression test that parses a minimal v3 lockfile containing scalar `os`/`cpu`/`libc` (e.g. `sass-embedded-linux-arm`) and asserts the values are captured correctly. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit d164b37. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY --> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What changed
os,cpu, andlibcmetadata, normalizing them to the existing vector representation.sass-embedded-linux-arm64@1.99.0shape reported in Discussion failed to parse .../pnpm-lock.yaml: invalid type: string "glibc" #336.Why
pnpm can emit package platform fields as either arrays or scalar strings. Aube already handled this shape for registry packuments, but the pnpm lockfile parser expected arrays and failed on
libc: glibc.Validation
cargo test -p aube-lockfilecargo fmt --checkcargo fmt --checkand clippy for the staged Rust fileNote
Low Risk
Low risk: small deserialization change limited to normalizing
os/cpu/libcmetadata, with added regression coverage; potential impact is only on platform-constraint parsing for pnpm inputs.Overview
pnpm lockfile parsing now accepts
os,cpu, andlibcplatform fields as either a scalar string or an array, normalizing all forms toVec<String>.The shared
string_or_seqdeserializer was moved intoaube-util(adding aserdedependency) and is now used by both the pnpm lockfile parser and registry packument parsing, with a new regression test covering thesass-embedded-linux-arm64lockfile shape.Reviewed by Cursor Bugbot for commit a4ce500. Bugbot is set up for automated code reviews on this repo. Configure here.