Add REUSE compliance for machine-readable licensing and SBOM generation #3968
The jemalloc section was stale: it still described custom source patches for active defragmentation that were removed in valkey-io#1266. The only remaining source modification is the VALKEY_VENDORED_JEMALLOC macro in jemalloc.sh. The Lua section was incomplete: it was missing readonly tables, globals protection, and CVE patches that have been applied over the years. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Replace the non-standard dual-license COPYING file with a single standard BSD-3-Clause text. The previous format (two full license blocks in one file) was not recognized by GitHub's license detection, OpenSSF Scorecard, or other automated compliance tools. Add REUSE structure (.reuse/dep5 and LICENSES/) to provide machine-readable per-file copyright and license annotations, covering both Valkey source code and all vendored dependencies. Fix invalid SPDX-License-Identifier headers in 6 source files that used 'BSD 3-Clause' (with space) instead of 'BSD-3-Clause'. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
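The SPDX header fix described above can be illustrated with a small check. This is only a sketch, not what `reuse lint` does internally: the function name `invalid_tokens` is hypothetical, the set of known IDs is simply the list of files under LICENSES/ in this PR, and `WITH` exceptions are not handled.

```python
import re

# License IDs taken from the LICENSES/ directory added in this PR.
KNOWN_IDS = {"Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "BSL-1.0",
             "CC0-1.0", "ISC", "MIT", "Zlib"}
OPERATORS = {"AND", "OR"}  # assumption: WITH exceptions are out of scope here
TAG = re.compile(r"SPDX-License-Identifier:\s*(.*)")


def invalid_tokens(line):
    """Return tokens in an SPDX tag line that are not known license IDs."""
    match = TAG.search(line)
    if match is None:
        return []  # no SPDX tag on this line
    # Drop a trailing C comment terminator and parentheses before tokenizing.
    expr = match.group(1)
    for junk in ("*/", "(", ")"):
        expr = expr.replace(junk, " ")
    return [t for t in expr.split()
            if t not in OPERATORS and t not in KNOWN_IDS]


if __name__ == "__main__":
    # The space splits the identifier into two tokens, neither of which is valid.
    print(invalid_tokens("/* SPDX-License-Identifier: BSD 3-Clause */"))
    print(invalid_tokens("/* SPDX-License-Identifier: BSD-3-Clause */"))
```

A header written with a space is split into two tokens that match no known ID, which is why automated tools reject it.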
Runs 'reuse lint' on push and pull requests using the official fsfe/reuse-action to catch invalid SPDX identifiers and missing license/copyright annotations. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
❌ Provenance Check Alert
Potential code similarities detected with upstream repository.
This check was performed automatically by the Provenance Guard Action.
📝 Walkthrough
This pull request establishes REUSE compliance infrastructure by adding a GitHub Actions workflow for automated checks, populating license text files, normalizing SPDX headers across source files to
Changes
REUSE Compliance and License Standardization
🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Actionable comments posted: 1
🧹 Nitpick comments (3)
.github/workflows/reuse.yml (1)
24-25: ⚡ Quick win
Consider adding `persist-credentials: false` to the checkout action.
The `actions/checkout` action defaults to persisting GitHub credentials in the local `.git/config`, which can be a security risk if subsequent steps are compromised. Since this workflow only performs a read-only compliance check, explicitly disabling credential persistence is a security best practice.
🔒 Proposed fix
       - name: Checkout code
         uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
    +    with:
    +      persist-credentials: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/reuse.yml around lines 24 - 25, Update the actions/checkout step to explicitly disable persisting GitHub credentials by adding persist-credentials: false to the checkout action configuration; locate the checkout step that uses actions/checkout@de0fac2e... (the "Checkout code" step) and add the persist-credentials: false key under it so the workflow does not store repo credentials in .git/config during this read-only compliance run.
deps/README.md (2)
31-31: ⚡ Quick win
Capitalize "GitHub" per proper noun convention.
The platform name should be capitalized as "GitHub" rather than "github".
📝 Proposed fix
    -The jemalloc directory is pulled as a subtree from the upstream jemalloc github repo. To update it you should run from the project root:
    +The jemalloc directory is pulled as a subtree from the upstream jemalloc GitHub repo. To update it you should run from the project root:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deps/README.md` at line 31, Change the lowercase platform name "github" in the sentence "The jemalloc directory is pulled as a subtree from the upstream jemalloc github repo." to the proper-cased "GitHub" (i.e., replace "github" with "GitHub") so the README uses the correct proper noun casing.
Source: Linters/SAST tools
29-29: ⚡ Quick win
Consider adjusting heading level for proper hierarchy.
The heading uses `####` (h4), but should use `###` (h3) since it follows the `---` (h2) "Jemalloc" section. Markdown heading levels should increment by one level at a time for proper document structure.
📝 Proposed fix
    -#### Updating/upgrading jemalloc
    +### Updating/upgrading jemalloc
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deps/README.md` at line 29, Change the heading "#### Updating/upgrading jemalloc" to one level higher ("### Updating/upgrading jemalloc") so it properly follows the H2 "Jemalloc" section; update the heading text in the README (look for the exact string "#### Updating/upgrading jemalloc") to maintain correct markdown hierarchy.
Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cluster_migrateslots.c`:
- Line 4: This change touches cluster_migrateslots.c which requires an
architectural review by `@core-team`; update the PR by adding an explicit request
for `@core-team` review (e.g., add the `@core-team` mention in the PR description
and/or add the “needs-architectural-review” label), and update the commit
message or PR checklist to state “Request `@core-team` architectural review for
changes to cluster_migrateslots.c” so the mandatory review is clearly flagged
before merging.
📒 Files selected for processing (18)
- .github/workflows/reuse.yml
- .reuse/dep5
- COPYING
- LICENSES/Apache-2.0.txt
- LICENSES/BSD-2-Clause.txt
- LICENSES/BSD-3-Clause.txt
- LICENSES/BSL-1.0.txt
- LICENSES/CC0-1.0.txt
- LICENSES/ISC.txt
- LICENSES/MIT.txt
- LICENSES/Zlib.txt
- deps/README.md
- src/cluster_migrateslots.c
- src/endianconv.h
- src/fmtargs.h
- src/hashtable.c
- src/valkey-benchmark-dataset.c
- src/valkey-benchmark-dataset.h
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## unstable #3968 +/- ##
============================================
+ Coverage 76.58% 76.69% +0.10%
============================================
Files 162 162
Lines 80753 80752 -1
============================================
+ Hits 61844 61932 +88
+ Misses 18909 18820 -89
REUSE is from the Free Software Foundation Europe (FSFE), https://reuse.software/
Why does it matter? So far mostly in Europe, but similar requirements may come to other continents.
Motivation: EU Cyber Resilience Act (CRA) and SBOMs
The EU Cyber Resilience Act (CRA) requires manufacturers of "products with digital elements" placed on the EU market to provide a Software Bill of Materials (SBOM) as part of their technical documentation.
What the CRA requires
Timeline
Open source exemption
Free and open source software developed outside the course of a commercial activity is exempt from CRA obligations. However, when a company takes Valkey and sells a product or service based on it, that company becomes the manufacturer and must produce the SBOM.
Why this matters for Valkey
Valkey itself has no obligation, but our downstream commercial users (cloud providers, appliance vendors, embedded integrators) will need SBOMs for their products containing Valkey. The REUSE structure introduced in this PR means they can generate a complete SPDX SBOM with a single command (
dvkashapov left a comment
Awesome, thank you! Posted some suggestions, questions.
Does every module related to Valkey need to have this, such as Lua, vectorsearch, and json?
Not really, and Valkey doesn't need it either, but it helps companies that want to include Valkey in a product to track the dependencies and their licenses. The company that sells a product or service is the one who needs to provide the SBOM stuff to their customers.
Add Mark Pulford (lua_cjson, strbuf) and Mike Pall (lua_bit) to the Lua stanza. Add Florian Loitsch to the fpconv stanza. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Use REUSE.toml instead of the deprecated .reuse/dep5 format. REUSE.toml is the current recommended format, supports richer metadata such as per-annotation comments, and propagates fields to SPDX output that dep5 did not. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Adds REUSE 3.3 structure (REUSE.toml, LICENSES/) covering all source files and vendored dependencies.
Replaces the non-standard dual-license COPYING file with a standard BSD-3-Clause text.
Updates the description of custom patches to Lua and Jemalloc in deps/README.md.
Benefits: