Remove MetadataType from core package object and normalize JSON metadataType values#1983
Merged
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
…adata Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Benchmark Test Results
Benchmark results from the latest changes vs base branch
Contributor
Author
Semantic diff for reviewers between the v11 and v12 JSON schemas BEFORE the struct renames

Code

The Python code that generated this list:

import json
import difflib

original_schema = "schema/json/schema-11.0.1.json"
new_schema = "schema/json/schema-12.0.0.json"

# {old-type-name: new-type-name}
type_def_mapping = {
    "AlpmMetadata": "arch-alpm-db-record",
    "ApkMetadata": "alpine-apk-db-record",
    "BinaryMetadata": "binary-signature",
    "CocoapodsMetadata": "cocoa-podfile-lock",
    "ConanLockMetadata": "c-conan-lock",
    "ConanMetadata": "c-conan",
    "DartPubMetadata": "dart-pubspec-lock",
    "DotnetPortableExecutableMetadata": "dotnet-portable-executable",
    "DotnetDepsMetadata": "dotnet-deps",
    "DpkgMetadata": "debian-dpkg-db-record",
    "GemMetadata": "ruby-gemspec",
    "GolangBinMetadata": "go-module-binary-buildinfo",
    "GolangModMetadata": "go-module",
    "HackageMetadata": "haskell-hackage-stack",
    "JavaMetadata": "java-archive",
    "KbPackageMetadata": "microsoft-kb-patch",
    "LinuxKernelMetadata": "linux-kernel-archive",
    "LinuxKernelModuleMetadata": "linux-kernel-module",
    "MixLockMetadata": "elixir-mix-lock",
    "NixStoreMetadata": "nix-store",
    "NpmPackageJSONMetadata": "javascript-npm-package",
    "NpmPackageLockJSONMetadata": "javascript-npm-package-lock",
    "PhpComposerJSONMetadata": "php-composer-lock",
    "PortageMetadata": "gentoo-portage-db-record",
    "PythonPackageMetadata": "python-package",
    "PythonPipfileLockMetadata": "python-pipfile-lock",
    "PythonRequirementsMetadata": "python-pip-requirements",
    "RebarLockMetadata": "erlang-rebar-lock",
    "RDescriptionFileMetadata": "r-description",
    "RpmdbFileRecord": "rpm-file-record",
    "RpmMetadata": "redhat-rpm-db-record",
    "RpmdbMetadata": "redhat-rpm-db-record",
    "RpmDBMetadata": "redhat-rpm-db-record",
    "RpmArchiveMetadata": "redhat-rpm-archive",
    "SwiftPackageManagerMetadata": "swift-package-manager-lock",
    "CargoPackageMetadata": "rust-cargo-lock"
}


def main():
    original_type_definitions = extract_type_definitions(original_schema)
    new_type_definitions = extract_type_definitions(new_schema)

    new_names_diffed = set()
    names_with_same_content = set()

    for definition_name, old_def in original_type_definitions.items():
        # fall back to the original name when the definition was not renamed
        new_name = get_new_name(definition_name)
        if not new_name:
            new_name = definition_name

        new_def = new_type_definitions.get(new_name, "")
        if not new_def:
            print("Missing definition in new schema: {}".format(definition_name))
            continue

        new_names_diffed.add(new_name)

        # diff the definitions (lineterm="" because splitlines() drops the newlines)
        diff = difflib.unified_diff(
            old_def.splitlines(),
            new_def.splitlines(),
            fromfile=original_schema,
            tofile=new_schema,
            lineterm="",
        )
        diff = "\n".join(diff)
        if diff:
            print("Diff for {}".format(definition_name))
            print(diff)
            print()
        else:
            names_with_same_content.add(definition_name)

    # for all new names not processed, print a warning
    for definition_name, new_def in new_type_definitions.items():
        if definition_name not in new_names_diffed:
            print("Missing equivalent definition in original schema: {}".format(definition_name))

    print(f"Definitions with same content: {len(names_with_same_content)}")
    for name in sorted(names_with_same_content):
        print("  -", name)


def extract_type_definitions(schema_file_path) -> dict[str, str]:
    """Return {definition name: pretty-printed JSON} for every entry under $defs."""
    with open(schema_file_path, "r") as schema_file:
        schema = json.load(schema_file)

    definitions = schema.get("$defs", {})
    type_definitions = {}
    for definition_name, definition in definitions.items():
        # if definition_name in type_def_mapping:
        #     new_name = to_camel_case(type_def_mapping[definition_name])
        #     # print("Renaming {} to {}".format(definition_name, new_name))
        #     definition_name = new_name
        # # else:
        # #     print("No mapping for {}".format(definition_name))
        # sort keys so that key ordering alone never shows up as a difference
        type_definitions[definition_name] = json.dumps(definition, indent=2, sort_keys=True)
    return type_definitions


def get_new_name(name: str) -> str | None:
    """Return the expected v12 definition name, or None if the type was not renamed."""
    if name in type_def_mapping:
        return to_camel_case(type_def_mapping[name])
    return None


def to_camel_case(s: str) -> str:
    """Convert a kebab-case (or snake_case) name to CamelCase."""
    s = s.replace("-", "_")
    return ''.join(x.capitalize() or '_' for x in s.split('_'))


if __name__ == "__main__":
    main()
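To make the two ideas in the script above concrete, here is a minimal, self-contained sketch of the name translation and the per-definition diff. It runs on toy in-memory definitions rather than the real schema files. The `diff_definitions` helper and the `old`/`new` sample definitions are hypothetical and exist only for illustration; they are not taken from the syft schemas.

```python
import difflib
import json


def to_camel_case(s: str) -> str:
    # "arch-alpm-db-record" -> "ArchAlpmDbRecord"
    return "".join(part.capitalize() for part in s.replace("-", "_").split("_"))


def diff_definitions(old_def: dict, new_def: dict) -> str:
    # Hypothetical helper: serialize both definitions with sorted keys so that
    # only real content changes (not key order) appear in the diff.
    old_text = json.dumps(old_def, indent=2, sort_keys=True)
    new_text = json.dumps(new_def, indent=2, sort_keys=True)
    lines = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="old",
        tofile="new",
        lineterm="",
    )
    return "\n".join(lines)


# Toy definitions (assumptions, not real schema content).
old = {"type": "object", "properties": {"name": {"type": "string"}}}
new = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "epoch": {"type": "integer"}},
}

print(to_camel_case("arch-alpm-db-record"))
print(diff_definitions(old, new))
```

An empty string from `diff_definitions` corresponds to the "Definitions with same content" bucket in the script; anything else is printed as a diff for reviewers.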
kzantow
previously approved these changes
Aug 11, 2023
Contributor
kzantow
left a comment
I don't see any blocking issues, but left a suggestion about defining the type-to-name mappings for JSON.
Removing approval so this doesn't accidentally get merged until we're ready for it
spiffcs
previously approved these changes
Aug 17, 2023
Contributor
@wagoodman I read through everything here and have no notes or comments; the change makes sense IMO. Feel free to merge when you think syft is ready for the major schema bump.
there have been enough changes to warrant a review on the new change set
kzantow
reviewed
Oct 27, 2023
e4a4303 to
1d867ac
Compare
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request on Feb 19, 2024
…ataType values (anchore#1983)

* [wip]
* distinct the package metadata functions
* remove metadata type from package core model
* incorporate review feedback for names
* add RPM archive metadata and split parser helpers
* clarify the python package metadata type
* rename the KB metadata type
* break hackage and composer types by use case
* linting fix
* fix encoding and decoding for syft-json and cyclonedx
* bump json schema to 11
* update cyclonedx-json snapshots
* update cyclonedx-xml snapshots
* update spdx-json snapshots
* update spdx-tv snapshots
* update syft-json snapshots
* correct metadata type in stack yaml parser test
* fix bom-ref redactor for cyclonedx-xml
* add tests for legacy package metadata names
* regenerate json schema v11
* fix legacy HackageMetadataType reflect type value check
* fix linting
* packagemetadata discovery should account for type shadowing
* fix linting
* fix cli tests
* bump json schema version to v12
* update json schema to incorporate changes from main
* add syft-json legacy config option
* add tests around v11-v12 json decoding
* add docs for SYFT_JSON_LEGACY
* rename structs to be compliant with new naming scheme

---------

Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
This PR:
- Removes pkg.Package.MetadataType from the core package model, keeping it as a concern for the syftjson format.
- Adds SYFT_FORMAT_JSON_LEGACY=<bool> (defaulting to false) to the syft application config. This lets users fall back to the old JSON metadata type names (and other soon-to-be-breaking changes) to get a pre-1.0 state of the JSON output.
- Renames the SYFT_TEMPLATE configuration to SYFT_FORMAT_TEMPLATE to be consistent with future format-related configurations.
- Renames the pkg.*Metadata structs to be consistent with the metadata type names (they do not always match exactly).

Doing this necessarily breaks the JSON schema, so it has been rev'd to v12 in this PR.
The downstream grype PR has been drafted: anchore/grype#1423
For a semantic diff of the v11.0.1 vs v12 JSON schema see #1983 (comment).
Fixes #1844
Fixes #1735
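
For consumers of the JSON output, the practical effect of the normalized metadataType values can be sketched as a simple value rewrite. The example below is an illustration under stated assumptions, not syft's implementation. The mapping is a small excerpt of the old-to-new names listed in the reviewer script in this conversation, the `normalize_metadata_types` function is hypothetical, and the sample document is reduced to a top-level `artifacts` list whose entries carry a `metadataType` field (the layout assumed here for syft-json).

```python
import json

# Excerpt of the old -> new metadataType names from the reviewer script
# in this conversation; not the complete list.
LEGACY_TO_V12 = {
    "AlpmMetadata": "arch-alpm-db-record",
    "ApkMetadata": "alpine-apk-db-record",
    "DpkgMetadata": "debian-dpkg-db-record",
    "JavaMetadata": "java-archive",
    "RpmMetadata": "redhat-rpm-db-record",
}


def normalize_metadata_types(document: dict) -> dict:
    """Hypothetical helper: rewrite legacy metadataType values in place.

    Assumes packages live under a top-level "artifacts" list and that each
    entry may carry a "metadataType" string. Unknown values are left as-is.
    """
    for artifact in document.get("artifacts", []):
        old_value = artifact.get("metadataType")
        if old_value in LEGACY_TO_V12:
            artifact["metadataType"] = LEGACY_TO_V12[old_value]
    return document


# Toy document (assumption: heavily reduced compared to real output).
sample = {
    "artifacts": [
        {"name": "musl", "metadataType": "ApkMetadata"},
        {"name": "example", "metadataType": "something-unmapped"},
        {"name": "no-metadata"},
    ]
}

print(json.dumps(normalize_metadata_types(sample), indent=2))
```

Values without a known mapping are passed through untouched, so the rewrite is safe to run on documents that already use the new names.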