Skip to content

Java: Add models for Commons-Lang's RegExUtils class#5339

Merged
aschackmull merged 4 commits intogithub:mainfrom
smowton:smowton/feature/commons-regex-utils
Mar 10, 2021
Merged

Java: Add models for Commons-Lang's RegExUtils class#5339
aschackmull merged 4 commits intogithub:mainfrom
smowton:smowton/feature/commons-regex-utils

Conversation

@smowton
Copy link
Contributor

@smowton smowton commented Mar 5, 2021

No description provided.

@smowton smowton requested a review from a team as a code owner March 5, 2021 14:40
Pattern cleanPattern = Pattern.compile("clean");
Pattern taintedPattern = Pattern.compile(taint());

sink(RegExUtils.removeAll(taint(), cleanPattern)); // $hasTaintFlow=y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=y could be omitted, see #5347

Copy link
Contributor Author

@smowton smowton Mar 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do this once that PR lands

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's merged now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,2 @@
lgtm,codescanning
* Added models for Apache Commons-Lang's `RegExUtils` class. This means that any query that tracks tainted data may return additional results in cases where a `RegExUtils` transformation is part of the path from source to sink.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor, but it looks like the project itself does not use a hyphen, see https://commons.apache.org/proper/commons-lang/

Suggested change
* Added models for Apache Commons-Lang's `RegExUtils` class. This means that any query that tracks tainted data may return additional results in cases where a `RegExUtils` transformation is part of the path from source to sink.
* Added models for Apache Commons Lang's `RegExUtils` class. This means that any query that tracks tainted data may return additional results in cases where a `RegExUtils` transformation is part of the path from source to sink.

@aschackmull aschackmull changed the title Add models for Commons-Lang's RegExUtils class Java: Add models for Commons-Lang's RegExUtils class Mar 8, 2021
@aschackmull
Copy link
Contributor

These steps seem like something that could plausibly also be sanitisers in the context of some queries (and depending on the exact pattern). Historically I don't think we've modelled these sort of steps as taint steps for that reason. @lcartey Looks like you requested these steps - any insights on this?

@lcartey
Copy link
Contributor

lcartey commented Mar 8, 2021

These steps seem like something that could plausibly also be sanitisers in the context of some queries (and depending on the exact pattern). Historically I don't think we've modelled these sort of steps as taint steps for that reason. @lcartey Looks like you requested these steps - any insights on this?

I would prefer to model these as taint steps, and add them as sanitizers for specific queries where that makes sense. I think this makes an appropriate trade-off between false positives and false negatives. For example, I don't think we should consider any regex approach as a sanitizer for SQL injection.

@aschackmull
Copy link
Contributor

I would prefer to model these as taint steps, and add them as sanitizers for specific queries where that makes sense. I think this makes an appropriate trade-off between false positives and false negatives. For example, I don't think we should consider any regex approach as a sanitizer for SQL injection.

SGTM. We should make sure to give similar treatment to the jdk equivalents + String.replace etc., which I believe currently aren't taint steps. I'm guessing that the main query that want things like that as sanitizers is XSS (correct me if I'm wrong and/or more queries should be considered, @lcartey )

@smowton
Copy link
Contributor Author

smowton commented Mar 9, 2021

@aschackmull what order do you want these things to happen?

@aschackmull
Copy link
Contributor

@aschackmull what order do you want these things to happen?

It's probably fine to merge this PR as-is first, but the other additions should probably be accompanied with suitable sanitisers to avoid an intermediate state where some queries yield a bunch of extra FPs.

@smowton smowton dismissed a stale review via 189b221 March 9, 2021 15:12
@smowton smowton force-pushed the smowton/feature/commons-regex-utils branch from c6e4cdd to 189b221 Compare March 9, 2021 15:12
@aschackmull aschackmull merged commit ed250d5 into github:main Mar 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants