feat(heuristics): add whitespace check to detect excessive spacing and invisible characters for malware check#1086
Conversation
db0e35c to
6978bd7
Compare
|
The CI test seems to be failing due to detecting excessive whitespace in |
Dismissing approval until integration test failure is resolved.
|
I have investigated this problem. in In Both of these examples are triggered by the excessive whitespace Semgrep rule as there are over 50 spaces before some of the indented lines. Both of these examples occur in docstrings, so my proposed solution (which I have tested does not trigger on I have used Something we may have to be wary of is benign code blocks that are excessively indented and will cause this to trigger. Many projects will not encounter this, as the indentation level will not reach more than 50 spaces and/or code linters will prevent this from happening, so I don't expect too many false positives with that, but it is a possibility. |
The |
e759fd3
a62851e to
c88ee73
Compare
…d invisible characters Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
…d invisible characters Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
…tion threshold Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Carl Flottmann <carl.flottmann@oracle.com>
c88ee73 to
667c4f0
Compare
Summary
This PR adds a new heuristic that analyzes code to detect suspicious use of excessive spaces and invisible characters. It checks whether the amount of spacing and invisible Unicode characters exceeds a defined threshold.
Description of changes
WhiteSpacesheuristic in a new Python module.heuristics.pyfile.WhiteSpacesAnalyzerheuristic.detect_malicious_metadata_check.pyto integrate and execute the new heuristic logic during analysis.Related issues
None
Checklist
verifiedlabel should appear next to all of your commits on GitHub.