Description
Search before asking
- I had searched in the issues and found no similar issues.
Apache SkyWalking Component
License Tools (apache/skywalking-eyes)
What happened
TL;DR: if the pattern includes "Copyright", then the whole regex will be replaced by "" by the OneLineNormalizer due to a wrong (IMHO) text replacement.
My copyright header is supposed to be:
// Copyright 2022-2023 the Kubeapps contributors.
// SPDX-License-Identifier: Apache-2.0
In the .licenserc.yaml file, I have set pattern: 'Copyright: (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.'.
Then I run license-eye -c .\.licenserc.yaml header check -v debug and it returns OK, as expected.
However, when I play around with my header and change it to Copyright whatever... the "header check" result is still OK, when it clearly shouldn't be.
What you expected to happen
I'd expect the check to fail when the header is Copyright whatever and the config sets pattern: 'Copyright: (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.'.
How to reproduce
- Install the tool with go install github.com/apache/skywalking-eyes/cmd/license-eye@latest
- Run license-eye header check in a directory containing the files below:
.licenserc.yaml
header:
  license:
    spdx-id: Apache-2.0
    copyright-owner: the Kubeapps contributors
    pattern: |
      Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors.
      SPDX-License-Identifier: Apache-2.0
main.go
// Copyright whatever.
// SPDX-License-Identifier: Apache-2.0
package main

import (
	"fmt"
)

func main() {
	fmt.Println("Hello world")
}
Anything else
Inspecting the code, I guess this is caused by a wrong regex replacement when normalizing the regex. Let me explain:
All the patterns are normalized here (this is where the pattern gets wrapped as "(?i).*" + pattern + ".*")
Then, NormalizePattern applies all the normalizers:
One of them is the OneLineNormalizer, which replaces each regex match with the corresponding replacement string:
And... here comes the issue: there are two replacements that match until the end of the line, which erases the whole regex in the pattern (change added in apache/skywalking-eyes@3a6d309)
So the regex now becomes "" and the resulting normalized regex is therefore "(?i).*" + "" + ".*", that is (?i).*.*, which always returns a match :S
See how this regex is matching the whole line:
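Here is a minimal Go sketch of what I mean. It only assumes the two replacement regexes quoted in this issue and the "(?i).*" + pattern + ".*" wrapping; stripCopyright and wrap are hypothetical helper names of mine, not functions from skywalking-eyes, and the other normalizers are ignored.

```go
package main

import (
	"fmt"
	"regexp"
)

// The two replacement regexes quoted in this issue; both are replaced by "".
var copyrightLines = []*regexp.Regexp{
	regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$`),
	regexp.MustCompile(`(?m)^\s*Portions Copyright (\([cC©]\))?.+$`),
}

// stripCopyright is a hypothetical stand-in for the one-line normalization
// step: it removes every match of the regexes above.
func stripCopyright(s string) string {
	for _, re := range copyrightLines {
		s = re.ReplaceAllString(s, "")
	}
	return s
}

// wrap mimics the "(?i).*" + pattern + ".*" wrapping described in this issue.
func wrap(pattern string) *regexp.Regexp {
	return regexp.MustCompile("(?i).*" + pattern + ".*")
}

func main() {
	pattern := `Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.`

	normalized := stripCopyright(pattern)
	fmt.Printf("normalized pattern: %q\n", normalized) // normalized pattern: ""

	// The wrapped regex is now (?i).*.* and accepts any header.
	fmt.Println(wrap(normalized).MatchString("Copyright whatever.")) // true
}
```

Since the ".+$" tail swallows everything after "Copyright ", the user-supplied part of the pattern never reaches the final regex.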
Note that this OneLineNormalizer affects not only the pattern but also the files (see how the copyright line disappears):
DEBUG Checking file: main.rs
DEBUG After normalized by github.com/apache/skywalking-eyes/pkg/license.CommentIndicatorNormalizer:
DEBUG Copyright 2023 the Kubeapps contributors.
SPDX-License-Identifier: Apache-2.0
....
DEBUG After normalized by github.com/apache/skywalking-eyes/pkg/license.OneLineNormalizer:
DEBUG SPDX-License-Identifier: Apache-2.0 use clap::Parser; ....
So, if I just replace pattern: 'Copyright: foo' with pattern: 'foo' (removing the word "copyright"), I'd have yet another problem: the check will always fail, since foo would have been erased from the file content.
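A rough Go sketch of that second problem, under the same assumptions as before: only the copyright regex quoted in this issue is applied, and headerMatches is a hypothetical helper of mine, not part of license-eye.

```go
package main

import (
	"fmt"
	"regexp"
)

// One of the replacement regexes quoted in this issue (replaced by "").
var copyrightLine = regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$`)

// headerMatches is a hypothetical helper: it strips the copyright line from
// both the pattern and the file header, then matches the wrapped pattern.
func headerMatches(pattern, header string) bool {
	p := copyrightLine.ReplaceAllString(pattern, "")
	h := copyrightLine.ReplaceAllString(header, "")
	return regexp.MustCompile("(?i).*" + p + ".*").MatchString(h)
}

func main() {
	header := "Copyright foo\nSPDX-License-Identifier: Apache-2.0"

	// "foo" survives in the pattern but was erased from the header.
	fmt.Println(headerMatches("foo", header)) // false

	// With "Copyright" in the pattern, the pattern is erased too,
	// so the check passes for this header and for any other one.
	fmt.Println(headerMatches("Copyright foo", header))                       // true
	fmt.Println(headerMatches("Copyright foo", "Copyright something else")) // true
}
```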
In short, I think the .+$ part should be removed from the regexes (?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$ and (?m)^\s*Portions Copyright (\([cC©]\))?.+$ to avoid undesired matches. What do you think? Am I missing something?
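To illustrate the proposal, here is a hedged Go sketch with the ".+$" tail dropped from the first regex. Again, headerMatches is a hypothetical helper and the remaining normalizers are left out, so this only shows the intended effect, not the actual patch.

```go
package main

import (
	"fmt"
	"regexp"
)

// The copyright regex from this issue with the trailing ".+$" removed
// (the proposed change): only the "Copyright (c)" prefix is stripped.
var copyrightPrefix = regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?`)

// headerMatches is a hypothetical helper: it strips the prefix from both the
// pattern and the file header, then matches the wrapped pattern.
func headerMatches(pattern, header string) bool {
	p := copyrightPrefix.ReplaceAllString(pattern, "")
	h := copyrightPrefix.ReplaceAllString(header, "")
	return regexp.MustCompile("(?i).*" + p + ".*").MatchString(h)
}

func main() {
	pattern := `Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.`

	fmt.Println(headerMatches(pattern, "Copyright 2022-2023 the Kubeapps contributors.")) // true
	fmt.Println(headerMatches(pattern, "Copyright whatever."))                           // false
}
```

With only the prefix removed, the year range and owner stay in the pattern, so a header like Copyright whatever. is rejected.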
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
