feat(sbom): add pnpm sbom command #10592
new command that generates SBOMs from the lockfile + store metadata. supports CycloneDX 1.6 JSON and SPDX 2.3 JSON via `--sbom-format`.

two new packages following the existing `pnpm licenses` architecture:
- `@pnpm/sbom`: core library (lockfile walking, store reading, serializers)
- `@pnpm/plugin-commands-sbom`: CLI plugin wiring

uses the lockfile walker for dependency traversal and reads package.json from the CAFS store for license/author/description metadata. `--lockfile-only` skips the store entirely for faster CI runs where metadata isn't needed. validated against official CycloneDX 1.6 and SPDX 2.3 JSON schemas.
Implements all 5 items from the CycloneDX maintainer review: split scoped names into group/name, move hashes to externalReferences distribution, use license.id for known SPDX identifiers, switch to modern tools.components structure with pnpm version, and bump specVersion to 1.7. Also adds spdx-license-ids for proper license classification and improves SPDX serializer test coverage.
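To illustrate the "split scoped names into group/name" item, here is a minimal sketch. `splitScopedName` and `GroupAndName` are hypothetical names, not taken from the PR, and keeping the leading `@` in the group is an assumption.

```typescript
// Hypothetical helper, not the PR's actual code.
interface GroupAndName {
  group?: string
  name: string
}

function splitScopedName (pkgName: string): GroupAndName {
  // Scoped npm names look like "@scope/name"; anything else has no group.
  if (pkgName.startsWith('@')) {
    const slash = pkgName.indexOf('/')
    // Require a non-empty scope and a non-empty name around the slash.
    if (slash > 1 && slash < pkgName.length - 1) {
      // Assumption: the group keeps the leading "@".
      return { group: pkgName.slice(0, slash), name: pkgName.slice(slash + 1) }
    }
  }
  return { name: pkgName }
}
```

Under these assumptions, `@pnpm/sbom` would yield group `@pnpm` and name `sbom`, while an unscoped name such as `lodash` would have no group.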
createRequire doesn't work in the esbuild bundle since it's a runtime resolve, switched back to regular import which esbuild can inline.
Use actual tarball download URL instead of PURL for CycloneDX distribution externalReferences, per review feedback.
adds $schema, timestamp, lifecycles (build/pre-build) to CycloneDX output to match what npm does. also enriches both CycloneDX and SPDX with metadata.authors, metadata.supplier, component supplier from author, vcs externalReferences from repository, and root component details (purl, license, description, author, vcs). SPDX now uses tarball URL for downloadLocation instead of NOASSERTION. renames CycloneDxToolInfo to CycloneDxOptions, passes lockfileOnly through to the serializer for lifecycle phase selection. adds store-dir to accepted CLI options.
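A rough sketch of how `lockfileOnly` could drive the lifecycle phase. The mapping (lockfile-only gives `pre-build`, otherwise `build`) is inferred from the commit message rather than confirmed, and `buildMetadata` and `CycloneDxMetadataSketch` are hypothetical names.

```typescript
// Hypothetical shape covering only the fields discussed above.
interface CycloneDxMetadataSketch {
  timestamp: string
  lifecycles: Array<{ phase: 'build' | 'pre-build' }>
}

function buildMetadata (lockfileOnly: boolean, now: Date = new Date()): CycloneDxMetadataSketch {
  return {
    // ISO 8601 timestamp of SBOM generation.
    timestamp: now.toISOString(),
    // Assumption: without store access nothing was installed, so "pre-build" is reported.
    lifecycles: [{ phase: lockfileOnly ? 'pre-build' : 'build' }],
  }
}
```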
switches license classification from spdx-license-ids to @cyclonedx/cyclonedx-library (SPDX.isSupportedSpdxId) for accurate CycloneDX license ID validation per jkowalleck's feedback. removes hardcoded metadata.authors and metadata.supplier — these are not appropriate for a tool to set. adds --sbom-authors and --sbom-supplier CLI flags so the SBOM consumer (e.g. ACME Corp) can declare who they are. removes supplier from components — supplier is the registry/distributor, not the package author. also fixes distribution externalReference to only emit when a real tarball URL exists, no PURL fallback.
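The two rules in this commit (use `license.id` only for recognized SPDX identifiers, and emit a distribution reference only when a tarball URL exists) might look roughly like the sketch below. The helper names are hypothetical, and the `isKnownSpdxId` predicate is passed in as a stand-in for the library check mentioned above.

```typescript
// Hypothetical types and helpers, not the PR's actual code.
type LicenseEntry =
  | { license: { id: string } }
  | { license: { name: string } }

interface ExternalReference {
  type: string
  url: string
}

function classifyLicense (value: string, isKnownSpdxId: (id: string) => boolean): LicenseEntry {
  // Recognized SPDX identifiers go in `id`; anything else is free text in `name`.
  return isKnownSpdxId(value)
    ? { license: { id: value } }
    : { license: { name: value } }
}

function distributionReferences (tarballUrl?: string): ExternalReference[] {
  // No PURL fallback: without a real tarball URL, emit nothing.
  return tarballUrl ? [{ type: 'distribution', url: tarballUrl }] : []
}
```

Injecting the predicate keeps the sketch independent of any particular SPDX identifier list.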
|
I see
that |
The optional dependencies are not installed for a reason. That is good. The fix might be to mark them as external in the esbuild config, like this:
// build.ts
import esbuild from 'esbuild';
esbuild.build({
  entryPoints: [],
  bundle: true,
  external: ['your-optional-peer'],
});
top-level import from @cyclonedx/cyclonedx-library drags in validation/serialize layers with optional deps (ajv-formats, libxmljs2, xmlbuilder2) that esbuild can't resolve during pnpm CLI bundling. switch to @cyclonedx/cyclonedx-library/SPDX which only pulls in the SPDX module we actually use — pure JS, no optional deps.
or import as said earlier - ala this should support the tree-shaking process as expected.
reviewing/sbom/src/getPkgMetadata.ts
const isPackageWithIntegrity = 'integrity' in resolution

let pkgIndexFilePath: string
if (isPackageWithIntegrity) {
  const parsedId = parse(id)
  pkgIndexFilePath = getIndexFilePathInCafs(
    opts.storeDir,
    resolution.integrity as string,
    parsedId.nonSemverVersion ?? `${parsedId.name}@${parsedId.version}`
  )
} else if (!resolution.type && 'tarball' in resolution && resolution.tarball) {
  const packageDirInStore = depPathToFilename(parse(id).nonSemverVersion ?? id, opts.virtualStoreDirMaxLength)
  pkgIndexFilePath = path.join(
    opts.storeDir,
    packageDirInStore,
    'integrity.mpk'
  )
} else {
  return {}
}

try {
  const { files } = await readMsgpackFile<PackageFilesIndex>(pkgIndexFilePath)
  const pkgJsonInfo = files.get('package.json') as PackageFileInfo | undefined
  if (!pkgJsonInfo) return {}
  manifestPath = getFilePathByModeInCafs(opts.storeDir, pkgJsonInfo.digest, pkgJsonInfo.mode)
} catch {
  return {}
}
This code is probably duplicated from license-scanner. We need to refactor it. Although it might get simplified if I merge a variation of this PR: #10473
It's indeed copied from there, but it makes sense to split it out. Want me to wait for #10473, or just split it out and import it in my PR now?
What fields do you need from the package.json? If you need the license field, then you'll need this code, because even if I merge #10473, it won't contain all the fields from the package.json files. It'll contain: 'bin', 'cpu', 'directories', 'engines', 'libc', 'os', and maybe fields related to dependencies.
We need license, description, author, homepage, and repository. None of those are in the #10473 set, so we'll need to keep this code as-is, I guess :)
So maybe we keep the duplicated code for now, as the other place will have it removed. Or do you prefer I split it out now?
I'd go with the duplicate code for now. Who knows what the future brings and how both code blocks will evolve.
Prevent premature overgeneralization.
I don't think that reading the raw package.json file from the store using data from the lockfile is overgeneralization
Both @pnpm/license-scanner and @pnpm/sbom independently implemented nearly identical logic to read a package's file index from the content-addressable store. This extracts that into a new shared package that returns a uniform Map<string, string> (filename → absolute path), simplifying both consumers.
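As an illustration of how a consumer might use such a filename → absolute path map, here is a minimal sketch that pulls out the fields named earlier in the thread (license, description, author, homepage, repository). All function and type names are hypothetical and are not the shared package's API.

```typescript
import { readFileSync } from 'node:fs'

// Hypothetical result shape; field list taken from the review discussion.
interface SbomPkgMetadata {
  license?: string
  description?: string
  author?: string
  homepage?: string
  repository?: string
}

function asString (value: unknown): string | undefined {
  return typeof value === 'string' && value !== '' ? value : undefined
}

// Accepts either a plain string or an object form such as { name } or { url }.
function stringOrKey (value: unknown, key: string): string | undefined {
  if (typeof value === 'object' && value !== null) {
    return asString((value as Record<string, unknown>)[key])
  }
  return asString(value)
}

function normalizeManifest (manifest: Record<string, unknown>): SbomPkgMetadata {
  return {
    license: asString(manifest.license),
    description: asString(manifest.description),
    author: stringOrKey(manifest.author, 'name'),
    homepage: asString(manifest.homepage),
    repository: stringOrKey(manifest.repository, 'url'),
  }
}

function readPkgMetadata (files: Map<string, string>): SbomPkgMetadata {
  const manifestPath = files.get('package.json')
  if (manifestPath === undefined) return {}
  try {
    return normalizeManifest(JSON.parse(readFileSync(manifestPath, 'utf8')))
  } catch {
    // Missing or unparsable manifest: fall back to empty metadata.
    return {}
  }
}
```

Returning an empty object on any failure mirrors the `return {}` fallbacks in the reviewed snippet.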
Congrats on merging your first pull request! 🎉🎉🎉
Hello, I need to use this command, can someone point me to a version of
As far as I see it landed on Big thanks to @jkowalleck for pointing me here :)
It's just 3 hours ago :) But happy that people need this!
@Saturate we recently migrated to pnpm from npm, and were unable to find any tools that reliably generate a complete and accurate SBOM for monorepo-based projects. After almost a day of no luck, I was shocked to see that the https://github.com/CycloneDX/cyclonedx-node-pnpm repository, which I had visited just a couple of hours ago only to find it in early development, was now archived in this PR's favor. 😄 This actually made my day and would end our hunt for an SBOM generation solution, but I am just hoping that it is released soon 😄 Do you have an idea as to what the release timeline could be? PS. thanks for the PR 😊
For those interested: this command is out in
what is the official command to print this to a file? I just tried this:
pnpm sbom --sbom-format spdx --prod > foo.spdx.json
But it can lead to invalid JSON if there are any warnings printed out:
WARN Unsupported engine: wanted: {"node":"v24.10.x"} (current: {"node":"v24.11.0","pnpm":"11.0.0-alpha.13"})
{
"spdxVersion": "SPDX-2.3",
"dataLicense": "CC0-1.0",
"SPDXID": "SPDXRef-DOCUMENT",
So --silent fixes the issue:
pnpm sbom --silent --sbom-format spdx --prod > foo.spdx.json
But I was wondering if there is a better way. Other CLIs anticipate that you are interested in the output of the command and print all logs to stderr instead, so stdout only contains the output of the command you are running without other logs, but it still allows you to see the logs in case you are interested in them. If that is too much work, a quick fix could be that pnpm sbom implies --silent for now, or we add an output file to pnpm sbom to avoid the situation altogether. What do you think?
indeed. stdout should be for the SBOM result only. could you open a new bug report for this very issue?
Aw dang it. Can't believe that I missed that one, I'll do a fix :)
sbom always outputs JSON to stdout, but the pnpm log reporter could write warnings (e.g. engine mismatch) to stdout before the JSON, breaking parsers and piping to files. Refs: pnpm#10592, pnpm#10923
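The convention discussed above (diagnostics on stderr, the SBOM alone on stdout) can be sketched as follows. This is only an illustration of the idea, not pnpm's reporter code; `emitSbom` and `Sink` are hypothetical names.

```typescript
// Minimal writable interface so the streams can be swapped out in tests.
interface Sink {
  write: (chunk: string) => unknown
}

function emitSbom (
  sbom: unknown,
  warnings: string[],
  stdout: Sink = process.stdout,
  stderr: Sink = process.stderr
): void {
  // Warnings go to stderr so they never mix with the JSON document.
  for (const warning of warnings) {
    stderr.write(`WARN ${warning}\n`)
  }
  // stdout carries only the serialized SBOM, so redirecting it to a file stays valid JSON.
  stdout.write(JSON.stringify(sbom, null, 2) + '\n')
}
```

With this split, a shell redirect such as `> foo.spdx.json` captures only the JSON, while warnings remain visible in the terminal.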
Closes #9088
Adds `pnpm sbom` for generating Software Bill of Materials in CycloneDX 1.7 JSON or SPDX 2.3 JSON.

Follows the `pnpm licenses` architecture: a core library (`@pnpm/sbom`) and a thin CLI plugin (`@pnpm/plugin-commands-sbom`). Walks the lockfile via `lockfileWalkerGroupImporterSteps` and reads package.json from the CAFS store for license/author metadata. `--lockfile-only` skips the store entirely. Supports `--prod`/`--dev`/`--no-optional` filtering and workspace `--filter`.

One new dependency: `spdx-license-ids` for license classification. Output validated against official CycloneDX 1.7 and SPDX 2.3 JSON schemas.