Support analyzing only modified files #6

@adangel

Description

With the new input parameter analyzeModifiedFilesOnly this action now determines the modified files of a push or pull_request and executes PMD only on these files. This new input parameter is enabled by default.

Instead of analyzing all files under "sourcePath", only the files that have been touched in a pull request or push are analyzed. This makes the analysis faster and especially helps bigger projects that want to introduce PMD gradually, because it enforces that no new code violations are introduced.

Depending on the analyzed language, the results might be less accurate. At the moment, this is not a problem, as PMD mostly analyzes each file individually, but that might change in the future.

If the change is very big, not all files might be analyzed. Currently the maximum number of modified files is 300.
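
The selection and the cap described above can be sketched roughly as follows. This is an illustration only, not the action's actual code: it assumes change entries shaped like the file objects returned by the GitHub REST API (with `filename` and `status` fields), and both the helper name `select_modified_files` and the decision to skip removed files are assumptions.

```python
from typing import Iterable, List, Mapping

MAX_MODIFIED_FILES = 300  # limit stated in the description above


def select_modified_files(changes: Iterable[Mapping[str, str]],
                          limit: int = MAX_MODIFIED_FILES) -> List[str]:
    """Return the paths to analyze, at most `limit` entries."""
    selected: List[str] = []
    for change in changes:
        if len(selected) >= limit:
            # Very large change: the remaining files are not analyzed.
            break
        # Assumption: deleted files no longer exist on disk, so skip them.
        if change.get("status") == "removed":
            continue
        selected.append(change["filename"])
    return selected
```

With this shape, a change touching 500 files yields only the first 300 paths, which matches the caveat that very big changes are not analyzed completely.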

Note: the touched files are analyzed completely, and all violations found in these files are reported, regardless of whether the specific lines have been modified.


For bigger projects, or projects that want to introduce PMD gradually, it would be beneficial if only the currently modified files were analyzed. This helps enforce that no new code violations are introduced.

Maybe a new input parameter needs to be introduced, similar to pmd-analyser-action's analyze-all-code.

When running the action in this mode, we would probably ignore the sourcePath input parameter and only look at the GitHub event's payload. For pull requests, we could look at the diff_url or patch_url. For pushes, we might be able to use the "compare" URL. The REST API also has a way to compare two commits, and for pull requests there is list-pull-requests-files.
If the files can be determined by calling the GitHub API, then no full checkout (fetch-depth) is needed.
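
As a rough sketch of the idea above, the REST API path could be chosen from the event type. The helper name `changed_files_endpoint` is hypothetical; the sketch assumes the push payload carries `before` and `after` commit SHAs and the pull request payload carries `pull_request.number`, and it does not perform the HTTP call itself.

```python
from typing import Any, Mapping


def changed_files_endpoint(event_name: str, repo: str,
                           payload: Mapping[str, Any]) -> str:
    """Build the REST API path that lists the files changed by an event.

    `repo` is given as "owner/name".
    """
    if event_name == "pull_request":
        # "List pull request files" endpoint.
        number = payload["pull_request"]["number"]
        return f"/repos/{repo}/pulls/{number}/files"
    if event_name == "push":
        # "Compare two commits" endpoint, using the base...head form.
        return f"/repos/{repo}/compare/{payload['before']}...{payload['after']}"
    raise ValueError(f"unsupported event: {event_name}")
```

A real implementation would also need to handle pagination and edge cases such as pushes that create a new branch, which this sketch leaves out.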

Instead of passing the sourcePath to PMD (-d CLI option), the --file-list option needs to be used, with a comma-delimited file list generated on the fly.
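
One way to picture the switch between -d and --file-list is the sketch below. It is an assumption-laden illustration: the helper name `pmd_source_args` and the temporary-file naming are made up, and only the part of the PMD command line that selects the sources is built.

```python
import os
import tempfile
from typing import List, Optional, Sequence


def pmd_source_args(source_path: str,
                    modified_files: Optional[Sequence[str]] = None) -> List[str]:
    """Return the PMD arguments that select what to analyze."""
    if modified_files is None:
        # Analyze everything below sourcePath.
        return ["-d", source_path]
    # Write the comma-delimited list to a temporary file for --file-list.
    fd, list_path = tempfile.mkstemp(prefix="pmd-file-list-", suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(",".join(modified_files))
    return ["--file-list", list_path]
```

Because the list is comma-delimited, file names that themselves contain a comma would need special handling, which this sketch does not attempt.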

A simple solution for the first version could be: let PMD analyze all files and just filter the found violations.
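
The filtering variant could look roughly like this. The violation records here are plain dictionaries with a `file` key holding a path relative to the repository root; that shape is an assumption for illustration and not PMD's report format, and `filter_violations` is a hypothetical helper name.

```python
import os
from typing import Dict, Iterable, List


def filter_violations(violations: Iterable[Dict[str, object]],
                      modified_files: Iterable[str]) -> List[Dict[str, object]]:
    """Keep only the violations reported for modified files."""
    # Normalize paths so "./src/A.java" and "src/A.java" compare equal.
    wanted = {os.path.normpath(path) for path in modified_files}
    return [v for v in violations
            if os.path.normpath(str(v["file"])) in wanted]
```

This keeps the PMD invocation unchanged, at the cost of still analyzing every file, so it would not bring the speed-up that analyzing only the modified files gives.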

Currently the action doesn't use an API token. This should probably change if the patches/diffs are retrieved via API calls. Then @actions-github could be used. In theory, the default token should be available via the github context. Since we only need read-only access, that should be sufficient, and no extra configuration would be required. At the very least, we can provide a reasonable default value, as the "checkout" action does: https://github.com/actions/checkout/blob/230611dbd0eb52da1e1f4f7bc8bb0c3a339fc8b7/action.yml#L12-L24

Which mode is active (analyzing all files or only modified files) should be logged in the build log, e.g. "Running PMD on path xyz" or "Running PMD on x files".
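
A minimal sketch of such a log message, using the two example wordings from the sentence above; the helper name `describe_mode` is hypothetical.

```python
from typing import Optional, Sequence


def describe_mode(source_path: str,
                  modified_files: Optional[Sequence[str]] = None) -> str:
    """Return the build-log line that states which mode is active."""
    if modified_files is None:
        return f"Running PMD on path {source_path}"
    return f"Running PMD on {len(modified_files)} files"
```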

Labels: enhancement (New feature or request)