Skip to content

fix: CONTAINSANY with method call on left-hand side (#3581)#3582

Merged
robfrank merged 2 commits intomainfrom
fix/3581-containsany-method-result-lhs
Mar 5, 2026
Merged

fix: CONTAINSANY with method call on left-hand side (#3581)#3582
robfrank merged 2 commits intomainfrom
fix/3581-containsany-method-result-lhs

Conversation

@robfrank
Copy link
Copy Markdown
Collaborator

@robfrank robfrank commented Mar 5, 2026

Summary

  • Fix ScanWithFilterStep to set $current context variable before evaluating the WHERE clause filter, so method calls like split() can resolve their parameters
  • Fix ContainsAnyCondition.execute() to handle Java arrays (e.g., String[] returned by split()) using MultiValue.getMultiValueIterator()
  • Add regression test with 3 test cases covering single match, multiple matches, and no match scenarios

Closes #3581

Test plan

  • New regression test Issue3581ContainsAnyMethodLhsTest passes (3 tests)
  • Existing ContainsAnyConditionTest passes
  • Existing FilterStepTest passes
  • Broad SQL test suite passes (1378 tests, 0 failures)

Set $current context variable before WHERE clause evaluation in
ScanWithFilterStep, and handle arrays in ContainsAnyCondition.execute()
so that expressions like txt.split(' ') CONTAINSANY 'te' work correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug where SQL queries using CONTAINSANY with a method call on the left-hand side (e.g., txt.split(' ') CONTAINSANY 'value') would fail. The fix ensures that the necessary context is available during WHERE clause evaluation and that Java arrays resulting from such method calls are correctly processed, thereby enhancing the robustness of SQL query execution.

Highlights

  • Context Variable Handling: The $current context variable is now set in ScanWithFilterStep before evaluating the WHERE clause filter, resolving issues with method calls like split() in expressions.
  • Java Array Support in CONTAINSANY: The ContainsAnyCondition.execute() method has been updated to correctly handle Java arrays (e.g., String[]) returned by method calls on the left-hand side, utilizing MultiValue.getMultiValueIterator().
  • Regression Test Coverage: A new regression test, Issue3581ContainsAnyMethodLhsTest, has been added to cover single match, multiple matches, and no match scenarios for CONTAINSANY with method calls on the left-hand side.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • engine/src/main/java/com/arcadedb/query/sql/executor/ScanWithFilterStep.java
    • Ensured the $current context variable is set before WHERE clause evaluation.
  • engine/src/main/java/com/arcadedb/query/sql/parser/ContainsAnyCondition.java
    • Added logic to handle Java arrays as the left operand for CONTAINSANY.
  • engine/src/test/java/com/arcadedb/query/sql/Issue3581ContainsAnyMethodLhsTest.java
    • Introduced a new test class to validate CONTAINSANY behavior with method calls on the left-hand side, including tests for single, multiple, and no matches.
Activity
  • New regression test Issue3581ContainsAnyMethodLhsTest passed with 3 test cases.
  • Existing ContainsAnyConditionTest passed.
  • Existing FilterStepTest passed.
  • Broad SQL test suite passed with 1378 tests and 0 failures.
  • The pull request was generated with Claude Code.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@robfrank robfrank added this to the 26.4.1 milestone Mar 5, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Mar 5, 2026

🧪 CI Insights

Here's what we observed from your CI run for e6dca65.

🟢 All jobs passed!

But CI Insights is watching 👀

@claude
Copy link
Copy Markdown

claude Bot commented Mar 5, 2026

Code Review

The fix correctly identifies and addresses two distinct root causes for the reported issue. Here's my analysis:

What's Working Well

  • Root cause is correct: ScanWithFilterStep is a combined scan+filter optimization that, unlike FetchFromClusterExecutionStep and FetchFromTypeWithFilterStep, was not setting $current before evaluating the WHERE clause. The fix aligns it with the established pattern.
  • Array handling in ContainsAnyCondition: Using MultiValue.getMultiValueIterator() is the right tool here — it's already used in the same class for the rightBlock path. The null guard (left != null && left.getClass().isArray()) is correct.
  • Tests follow project conventions: JUnit 5 + AssertJ, and using distinct type names per test method (doc3581, doc3581b, doc3581c) is a reasonable workaround for test isolation without requiring schema teardown.

Issues to Address

1. ContainsAllCondition has the same array gap (missing fix)

The CONTAINSALL operator has the same String[] blind spot — its execute() method converts Iterable to an iterator but has no isArray() branch. A query like:

SELECT FROM doc WHERE txt.split(' ') CONTAINSALL 'te'

would silently return wrong results with this PR. Consider applying the symmetric fix there, or at minimum add a test demonstrating the current (broken) behaviour as a TODO.

2. Multi-element right-side iterator bug exposed by this fix (pre-existing, but newly reachable)

In the execute(Object left, Object right) iterator branch (now reachable for String[]), the leftIterator is exhausted after the first element of right is processed. If right has more than one element, subsequent elements will never match:

while (rightIterator.hasNext()) {
    final Object leftItem = rightIterator.next();   // items from *right*, confusingly named
    while (leftIterator.hasNext()) {                // exhausted after first outer iteration!
        final Object rightItem = leftIterator.next(); // items from *left*, confusingly named
        if (leftItem != null && leftItem.equals(rightItem))
            return true;
    }
}

This PR makes this path reachable for the first time via String[]. A query like:

SELECT FROM doc WHERE txt.split(' ') CONTAINSANY ['te', 'hello']

will silently fail to match 'hello' once 'te' exhausts the iterator without matching. Please add a test that covers a multi-value RHS to document or fix this behaviour.

3. Missing null-field test case

A document with no txt property (null value) should be tested. The null guard on the array branch looks correct, but a test would confirm graceful handling rather than an NPE or unexpected exception.


Minor Observations

Comment verbosity: The new comment in ScanWithFilterStep reads more like a commit message than an inline code note. Something terser like:

//  must be set before WHERE evaluation so method calls (e.g. split()) resolve correctly

would be more consistent with the surrounding codebase style.

Variable naming (pre-existing, not introduced here): In the Iterator branch of execute(), leftItem is drawn from rightIterator and rightItem from leftIterator — the names are inverted relative to their sources. This is confusing for future readers and worth cleaning up, even though it predates this PR.


Summary

The core fix is sound and the regression test is a good start. The main ask before merging is:

  1. Add a test for multi-value RHS (CONTAINSANY ['a', 'b']) to expose or verify the iterator exhaustion behaviour.
  2. Consider adding a null-field test case.
  3. Decide whether to fix ContainsAllCondition symmetrically or file a follow-up issue.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an issue where CONTAINSANY with method calls on the left-hand side failed due to the $current context variable not being set correctly and Java arrays not being handled. While the fix for the $current variable in ScanWithFilterStep is correct, the implementation of array support in ContainsAnyCondition introduces a logic bug. This bug, caused by improper handling of non-resettable iterators in a nested loop, can lead to incorrect query results, potentially having security implications if the operator is used for access control or data filtering. It is recommended to convert the right-hand side to a Set and iterate through the left-hand side once to ensure correctness and efficiency. The addition of a comprehensive regression test (Issue3581ContainsAnyMethodLhsTest) provides excellent validation for the $current fix.

Comment on lines +58 to +59
else if (left != null && left.getClass().isArray())
left = MultiValue.getMultiValueIterator(left);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The implementation of CONTAINSANY for arrays (and other non-Collection Iterables) contains a logic error in the nested loop. When left is an Iterator (e.g., from an array returned by split()), the inner loop while (leftIterator.hasNext()) (lines 70-74) completely consumes the iterator during the first iteration of the outer loop while (rightIterator.hasNext()) (lines 68-75). If a match is not found in the first pass, subsequent iterations of the outer loop will find leftIterator exhausted, potentially leading to incorrect false results even when a match exists. This can lead to security filter bypasses if CONTAINSANY is used in a WHERE NOT clause to exclude sensitive records based on multiple tags or attributes.

…d test

- ContainsAllCondition.execute() had the same missing array handling
- Added test for null/missing txt field to verify graceful handling
- Added CONTAINSALL with split() test
- Trimmed comment in ScanWithFilterStep

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Mar 5, 2026

Code Review

Overall this is a clean, well-scoped fix. The root cause analysis is correct and the implementation follows existing patterns. A few observations:


What's Good

ScanWithFilterStep.java - The fix is correct. ScanWithFilterStep is a combined fetch+filter step that was missing the $current assignment before calling matchesFilters(). The regular FilterStep works correctly because it delegates fetching to a prior step (like FetchFromClusterExecutionStep) that sets $current in its next() call before returning. ScanWithFilterStep bypasses that delegation and was therefore missing the assignment. The fix is properly placed before the filter, and the post-filter reassignment (line 114) is correctly retained for the projected-result path.

ContainsAnyCondition / ContainsAllCondition - Using MultiValue.getMultiValueIterator(left) for array handling is consistent with how MultiValue is used elsewhere and correctly handles String[] returned by split(). The symmetric fix across both condition types is the right call.

Tests - Five distinct test cases (single match, multiple matches, no match, null field, CONTAINSALL variant) provide solid regression coverage. The null-field test is especially valuable for verifying graceful handling of missing properties.


Pre-existing issue now more reachable

The iterator-exhaustion bug in ContainsAnyCondition.execute() (present before this PR) becomes more reachable via the new array code path. The inner loop exhausts leftIterator while checking against the first right-side value. If that first right value does not match, subsequent right values are never checked because leftIterator is already exhausted.

Example: txt.split(' ') CONTAINSANY ['nomatch', 'te'] with txt = 'te st' would return no results despite 'te' being in the split array, because 'nomatch' exhausts the left iterator before 'te' is checked. The single-value right-hand side used in all new tests works correctly, but this multi-value right-hand-side boundary case is now reachable in a way it was not before (arrays previously fell through without matching).

Consider adding a test case to document the expected behavior. If it fails, it is worth opening a follow-up issue to fix the inner iterator exhaustion in ContainsAnyCondition.execute().

There is also a cosmetic naming inconsistency in the existing iterator loop: leftItem is drawn from rightIterator and rightItem from leftIterator. Not introduced here, but worth a future cleanup.


Minor: Test isolation

Each test method creates its own type (doc3581, doc3581b, ...). If TestHelper creates a fresh database per test method this is redundant; if the database is shared across methods within the class it is correct. Either way it works, but the intent might be worth a brief comment.


Summary

The fix correctly addresses both root causes: the missing $current context variable in ScanWithFilterStep and the unhandled Java array type in ContainsAny/AllCondition.execute(). No concerns with the approach. The pre-existing iterator exhaustion issue is worth a follow-up to document or fix the multi-value right-hand side behavior.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Mar 5, 2026

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
-9.81% 100.00%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (9959c82) 103658 76728 74.02%
Head commit (e6dca65) 134218 (+30560) 86186 (+9458) 64.21% (-9.81%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#3582) 1 1 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.95%. Comparing base (9959c82) to head (e6dca65).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3582      +/-   ##
==========================================
- Coverage   65.06%   64.95%   -0.11%     
==========================================
  Files        1514     1514              
  Lines      103658   103659       +1     
  Branches    21454    21454              
==========================================
- Hits        67443    67332     -111     
- Misses      26958    27080     +122     
+ Partials     9257     9247      -10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@robfrank robfrank merged commit 64eb57e into main Mar 5, 2026
23 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SQL: Left-hand side of filter being method return value errors (Regression)

1 participant