fix: CONTAINSANY with method call on left-hand side (#3581)#3582
fix: CONTAINSANY with method call on left-hand side (#3581)#3582
Conversation
Set $current context variable before WHERE clause evaluation in
ScanWithFilterStep, and handle arrays in ContainsAnyCondition.execute()
so that expressions like txt.split(' ') CONTAINSANY 'te' work correctly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a bug where SQL queries using Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🧪 CI InsightsHere's what we observed from your CI run for e6dca65. 🟢 All jobs passed!But CI Insights is watching 👀 |
Code ReviewThe fix correctly identifies and addresses two distinct root causes for the reported issue. Here's my analysis: What's Working Well
Issues to Address1. The SELECT FROM doc WHERE txt.split(' ') CONTAINSALL 'te'would silently return wrong results with this PR. Consider applying the symmetric fix there, or at minimum add a test demonstrating the current (broken) behaviour as a TODO. 2. Multi-element right-side iterator bug exposed by this fix (pre-existing, but newly reachable) In the while (rightIterator.hasNext()) {
final Object leftItem = rightIterator.next(); // items from *right*, confusingly named
while (leftIterator.hasNext()) { // exhausted after first outer iteration!
final Object rightItem = leftIterator.next(); // items from *left*, confusingly named
if (leftItem != null && leftItem.equals(rightItem))
return true;
}
}This PR makes this path reachable for the first time via SELECT FROM doc WHERE txt.split(' ') CONTAINSANY ['te', 'hello']will silently fail to match 3. Missing null-field test case A document with no Minor ObservationsComment verbosity: The new comment in // must be set before WHERE evaluation so method calls (e.g. split()) resolve correctlywould be more consistent with the surrounding codebase style. Variable naming (pre-existing, not introduced here): In the SummaryThe core fix is sound and the regression test is a good start. The main ask before merging is:
|
There was a problem hiding this comment.
Code Review
This pull request addresses an issue where CONTAINSANY with method calls on the left-hand side failed due to the $current context variable not being set correctly and Java arrays not being handled. While the fix for the $current variable in ScanWithFilterStep is correct, the implementation of array support in ContainsAnyCondition introduces a logic bug. This bug, caused by improper handling of non-resettable iterators in a nested loop, can lead to incorrect query results, potentially having security implications if the operator is used for access control or data filtering. It is recommended to convert the right-hand side to a Set and iterate through the left-hand side once to ensure correctness and efficiency. The addition of a comprehensive regression test (Issue3581ContainsAnyMethodLhsTest) provides excellent validation for the $current fix.
| else if (left != null && left.getClass().isArray()) | ||
| left = MultiValue.getMultiValueIterator(left); |
There was a problem hiding this comment.
The implementation of CONTAINSANY for arrays (and other non-Collection Iterables) contains a logic error in the nested loop. When left is an Iterator (e.g., from an array returned by split()), the inner loop while (leftIterator.hasNext()) (lines 70-74) completely consumes the iterator during the first iteration of the outer loop while (rightIterator.hasNext()) (lines 68-75). If a match is not found in the first pass, subsequent iterations of the outer loop will find leftIterator exhausted, potentially leading to incorrect false results even when a match exists. This can lead to security filter bypasses if CONTAINSANY is used in a WHERE NOT clause to exclude sensitive records based on multiple tags or attributes.
…d test - ContainsAllCondition.execute() had the same missing array handling - Added test for null/missing txt field to verify graceful handling - Added CONTAINSALL with split() test - Trimmed comment in ScanWithFilterStep Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Code ReviewOverall this is a clean, well-scoped fix. The root cause analysis is correct and the implementation follows existing patterns. A few observations: What's Good
Tests - Five distinct test cases (single match, multiple matches, no match, null field, CONTAINSALL variant) provide solid regression coverage. The null-field test is especially valuable for verifying graceful handling of missing properties. Pre-existing issue now more reachableThe iterator-exhaustion bug in Example: Consider adding a test case to document the expected behavior. If it fails, it is worth opening a follow-up issue to fix the inner iterator exhaustion in There is also a cosmetic naming inconsistency in the existing iterator loop: Minor: Test isolationEach test method creates its own type (doc3581, doc3581b, ...). If SummaryThe fix correctly addresses both root causes: the missing |
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferences |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3582 +/- ##
==========================================
- Coverage 65.06% 64.95% -0.11%
==========================================
Files 1514 1514
Lines 103658 103659 +1
Branches 21454 21454
==========================================
- Hits 67443 67332 -111
- Misses 26958 27080 +122
+ Partials 9257 9247 -10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
ScanWithFilterStepto set$currentcontext variable before evaluating the WHERE clause filter, so method calls likesplit()can resolve their parametersContainsAnyCondition.execute()to handle Java arrays (e.g.,String[]returned bysplit()) usingMultiValue.getMultiValueIterator()Closes #3581
Test plan
Issue3581ContainsAnyMethodLhsTestpasses (3 tests)ContainsAnyConditionTestpassesFilterStepTestpasses