Fix: map license URLs to SPDX IDs for machine readable format #4244
Avadhut03 wants to merge 2 commits into anchore:main from
Signed-off-by: Avadhut03 <avadhutkul60@gmail.com>
Thanks for the PR @Avadhut03! I think this needs to live in a separate area, since I'm open to having two maps here: one generated from the official SPDX source, and one contributed by users who see places where we can map a URL and get better license answers. cc @wagoodman for when he gets back, to get a +1 on adding a maintainer map that we merge with the generated SPDX map on compile for one single lookup.
Thanks for the feedback @spiffcs. That makes sense. I can update the PR to add a separate map for maintainer/user-contributed URLs and merge it with the generated SPDX map at compile time. I will wait for @wagoodman's thoughts as well before making the changes.
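The two-map idea discussed above could look roughly like the sketch below. This is only an illustration, not code from syft: `generatedURLToLicense`, `maintainerURLToLicense`, and `mergeURLMaps` are hypothetical names, the entries are borrowed from the diff in this PR, and the merge happens at program start rather than in the code generator. It also assumes that generated SPDX entries should win on conflict, since the SPDX list is treated as the source of truth.

```go
package main

import "fmt"

// generatedURLToLicense stands in for the map generated from the SPDX
// license list (hypothetical name; entry borrowed from the diff).
var generatedURLToLicense = map[string]string{
	"http://apache.org/licenses/LICENSE-1.1": "Apache-1.1",
}

// maintainerURLToLicense is a hypothetical hand-maintained map for URLs
// that the SPDX list does not (yet) carry.
var maintainerURLToLicense = map[string]string{
	"http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html": "LGPL-2.1-only",
}

// mergeURLMaps returns a new map holding both sets of entries.
// Generated entries are written last, so they win on conflict (assumption).
func mergeURLMaps(generated, maintainer map[string]string) map[string]string {
	merged := make(map[string]string, len(generated)+len(maintainer))
	for url, id := range maintainer {
		merged[url] = id
	}
	for url, id := range generated {
		merged[url] = id
	}
	return merged
}

func main() {
	lookup := mergeURLMaps(generatedURLToLicense, maintainerURLToLicense)
	id, ok := lookup["http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html"]
	fmt.Println(id, ok)
}
```

Keeping the maintainer entries in their own map means the generated file is never edited by hand, while callers still see one lookup table.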
var urlToLicense = map[string]string{
	"ftp://ftp.tin.org/pub/news/utils/newsx/newsx-1.6.tar.gz": "Zeeff",
	"http://apache.org/licenses/LICENSE-1.1": "Apache-1.1",
	"http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html": "LGPL-2.1-only",
The SPDX license list is really the source of truth for these kinds of changes. If the SPDX maintainers don't want to accept a contribution for these URLs, then we can update this code to account for manual adjustments, but we couldn't take this as-is since it manually updates a file that is automatically generated.
We pull in the SPDX license list when generating this code. The list is maintained in a GitHub repo and gets regular releases. It looks like adding these URLs would be a small update to an XML file, and this kind of change looks to be regularly accepted.
That way, once your URL enhancements are accepted and released in the SPDX license list, they would flow downstream to us (and the many other users of this list get the benefit too).
if name == "" && url == "" {
	continue
}
if licInfo, ok := spdxlicense.LicenseByURL(url); ok {
It looks like this is already covered here, but I might be missing a nuance with the caller.
Hi @Avadhut03! Are you going to continue on this PR? Just asking as your last comment was some time ago. We could also offer some help if needed.
@whereIsMyDipp - let me update this PR so that we can get what I outlined here:
After we merge, I'll take some time to make some PRs against the repo that @wagoodman suggested and see how long that takes. Apologies for the staleness of this issue. It got lost in the pile of work 😢
When I get feedback on this PR, I'll pull the trigger on adding http://www.eclipse.org/org/documents/edl-v10.php as a PR too, and then we'll get this with the next SPDX update.
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
Alright! So with #4588 we should fix: Given the https link already exists in the upstream list:
spdx/license-list-XML#2935 was accepted, so we should have a fix for http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html when the next list comes out.
I'm going to close this PR since we're not doing this exact solution anymore. But thank you to the original author @Avadhut03 and to @whereIsMyDipp for contributing to the discussion here and helping find a solution to support these gaps in URL lookup. We should see both of these URLs supported in the next release of syft (upstream update and scheme lookup fix).
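The scheme lookup fix itself is not shown in this thread. As a rough sketch of the general idea only, assuming it amounts to treating the `http://` and `https://` forms of a URL as the same key, a lookup could be written as follows. The names `stripScheme`, `buildSchemelessIndex`, and `licenseByURL` are hypothetical, and the single map entry is an illustrative stand-in rather than a quote from the SPDX list or from #4588.

```go
package main

import (
	"fmt"
	"strings"
)

// stripScheme drops a leading "https://" or "http://" so that both
// forms of a URL map to the same key. Other schemes are left untouched.
func stripScheme(u string) string {
	for _, prefix := range []string{"https://", "http://"} {
		if strings.HasPrefix(u, prefix) {
			return strings.TrimPrefix(u, prefix)
		}
	}
	return u
}

// buildSchemelessIndex re-keys a URL-to-license map by scheme-less URL.
func buildSchemelessIndex(urlToLicense map[string]string) map[string]string {
	index := make(map[string]string, len(urlToLicense))
	for url, id := range urlToLicense {
		index[stripScheme(url)] = id
	}
	return index
}

// licenseByURL looks a URL up in the scheme-less index.
func licenseByURL(index map[string]string, url string) (string, bool) {
	id, ok := index[stripScheme(url)]
	return id, ok
}

func main() {
	// Illustrative entry only (assumption): the known URL uses https.
	index := buildSchemelessIndex(map[string]string{
		"https://www.eclipse.org/legal/epl-v10.html": "EPL-1.0",
	})
	// The http form of the same URL resolves to the same license ID.
	id, ok := licenseByURL(index, "http://www.eclipse.org/legal/epl-v10.html")
	fmt.Println(id, ok)
}
```

Building the index once keeps each lookup a single map access, at the cost of collapsing any entries that differ only by scheme.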
This PR fixes an issue in Syft where Java project licenses with URLs were not properly mapped to SPDX license IDs.
Currently, license URLs (whether a package declares one or several) are reported as LicenseRef-http---... instead of their proper SPDX identifiers, which makes the output hard for machines to consume.
With this change:
License URLs such as http://www.eclipse.org/legal/epl-v10.html are now correctly mapped to EPL-1.0.
Deprecated or older license URLs like http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html are mapped to LGPL-2.1-only.
This ensures the licenseDeclared and licenseConcluded fields in SPDX and CycloneDX outputs are properly machine-readable.
This addresses the issues reported when analyzing Java dependencies in projects such as spring-petclinic.
Fixes #4233
Type of change
Bug fix (non-breaking change which fixes an issue)
Checklist:
I have added unit tests for LicenseByURL covering the new URL mappings
I have tested the changes in common scenarios (Java Maven projects with single/multiple license URLs)