[Bug] Using pattern with "copyright" always yields a (?i).*.* regex #11218

@antgamdia

Description

Search before asking

  • I had searched in the issues and found no similar issues.

Apache SkyWalking Component

License Tools (apache/skywalking-eyes)

What happened

TL;DR: if the pattern includes "Copyright", then the whole regex will be replaced by "" by the OneLineNormalizer due to a wrong (IMHO) text replacement.

My copyright header is supposed to be:

// Copyright 2022-2023 the Kubeapps contributors.
// SPDX-License-Identifier: Apache-2.0

In the .licenserc.yaml file, I have set pattern: Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.
Then I run license-eye -c .\.licenserc.yaml header check -v debug and it returns OK, as expected.

However, when I play around with my header and change it to Copyright whatever... the "header check" result is still OK, when it clearly shouldn't be.

What you expected to happen

I'd expect the check to fail when the header is Copyright whatever and the pattern is set to Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.

How to reproduce

  1. Install the tool with go install github.com/apache/skywalking-eyes/cmd/license-eye@latest
  2. Run license-eye header check in a directory containing the files below:
.licenserc.yaml:
header:
  license:
    spdx-id: Apache-2.0
    copyright-owner: the Kubeapps contributors
    pattern: |
      Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors.
      SPDX-License-Identifier: Apache-2.0
main.go:
// Copyright whatever.
// SPDX-License-Identifier: Apache-2.0

package main

import (
	"fmt"
)

func main() {

	fmt.Println("Hello world")
}

Anything else

Inspecting the code, I guess this is caused by a wrong regex replacement when normalizing the regex. Let me explain:

All the patterns are normalized here (this is where we build "(?i).*" + pattern + ".*"):

https://github.com/apache/skywalking-eyes/blob/16b9726be37536a05279e061f0da02d205a2af77/pkg/header/config.go#L113

Then, NormalizePattern applies all the normalizers:

https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L268

One of them is the OneLineNormalizer, which replaces each matching string with the corresponding replacement string:

https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L296-L301

And... here comes the issue: there are two replacements that match until the end of the line, which erases the whole regex in the pattern (change added in apache/skywalking-eyes@3a6d309):

https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L237C8-L247

So the regex now becomes "" and the resulting normalized regex is therefore "(?i).*" + "" + ".*", that is, (?i).*.*, which always returns a match :S
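To illustrate why an erased pattern makes every header pass, here is a minimal sketch in Go. It only mimics the wrapping described above; the helper name wrapPattern is hypothetical and not taken from skywalking-eyes.

```go
package main

import (
	"fmt"
	"regexp"
)

// wrapPattern is a hypothetical helper that mirrors the wrapping described
// in this issue: "(?i).*" + pattern + ".*".
func wrapPattern(normalized string) *regexp.Regexp {
	return regexp.MustCompile("(?i).*" + normalized + ".*")
}

func main() {
	// Assume the normalizer has already erased the pattern, leaving "".
	re := wrapPattern("")
	fmt.Println(re.String())

	// Every input matches, including an empty header.
	for _, header := range []string{"Copyright whatever.", "anything at all", ""} {
		fmt.Printf("%q matches: %v\n", header, re.MatchString(header))
	}
}
```

Since (?i).*.* can match the empty string, no header content can make the check fail.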

See how this regex is matching the whole line:

[Screenshot: the regex matching the whole line]

Note that the OneLineNormalizer affects not only the pattern but also the file contents (see how the copyright line disappears):

DEBUG Checking file: main.rs
DEBUG After normalized by github.com/apache/skywalking-eyes/pkg/license.CommentIndicatorNormalizer:
DEBUG  Copyright 2023 the Kubeapps contributors.
 SPDX-License-Identifier: Apache-2.0

....
DEBUG After normalized by github.com/apache/skywalking-eyes/pkg/license.OneLineNormalizer:
DEBUG   SPDX-License-Identifier: Apache-2.0 use clap::Parser; ....

So, if I just replace pattern: Copyright foo with pattern: foo (removing the word "Copyright"), I'd have yet another problem: the check would always fail, since foo would have been erased from the file contents.
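The opposite failure can be sketched the same way. The snippet below is a simplified stand-in for the header check, not the real implementation: checkHeader is a hypothetical function that erases copyright lines from the file content using the first regex quoted in this issue (assuming an empty replacement string), then applies the wrapped pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// copyrightLine is the first of the two regexes quoted in this issue.
var copyrightLine = regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$`)

// checkHeader is a hypothetical stand-in for the header check. It erases
// copyright lines from the content (assumed replacement: ""), then matches
// the wrapped pattern against what is left.
func checkHeader(content, pattern string) bool {
	normalized := copyrightLine.ReplaceAllString(content, "")
	return regexp.MustCompile("(?i).*" + pattern + ".*").MatchString(normalized)
}

func main() {
	content := " Copyright foo.\n SPDX-License-Identifier: Apache-2.0"

	// "foo" only appears on the copyright line, which has been erased.
	fmt.Println(checkHeader(content, "foo"))

	// Text on other lines survives normalization and still matches.
	fmt.Println(checkHeader(content, "SPDX-License-Identifier"))
}
```

Under these assumptions, anything the pattern expects on the copyright line can never be found in the normalized file.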

In short, I think the .+$ part should be removed from the regexes (?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$ and (?m)^\s*Portions Copyright (\([cC©]\))?.+$ to avoid undesired matches. What do you think? Am I missing something?
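As a rough comparison of the current regex and the suggested change, the following sketch applies both to the pattern from this report. It assumes the replacement string is empty; stripWith, currentRe and proposedRe are illustrative names, and the second regex is only my reading of the suggestion, not a tested fix.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// currentRe is the first regex quoted in this issue.
	currentRe = regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$`)
	// proposedRe is the same regex without the trailing ".+$" (the suggestion).
	proposedRe = regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?`)
)

// stripWith is a hypothetical helper: it removes every match of re from s
// and trims the surrounding whitespace.
func stripWith(re *regexp.Regexp, s string) string {
	return strings.TrimSpace(re.ReplaceAllString(s, ""))
}

func main() {
	pattern := `Copyright (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.`

	// With ".+$" the whole line is consumed and nothing is left.
	fmt.Printf("current:  %q\n", stripWith(currentRe, pattern))

	// Without ".+$" only the "Copyright " prefix is removed.
	fmt.Printf("proposed: %q\n", stripWith(proposedRe, pattern))
}
```

With the shortened regex, the year range and owner part of the pattern survive normalization, so the check has something to match against.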

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels

bug (Something isn't working and you are sure it's a bug!), license eye
