Skip to content

fixes #3307 head and collect + list indexing#3310

Merged
lvca merged 8 commits intoArcadeData:mainfrom
ExtReMLapin:claude/fix-opencypher-engine-iuCb0
Feb 2, 2026
Merged

fixes #3307 head and collect + list indexing#3310
lvca merged 8 commits intoArcadeData:mainfrom
ExtReMLapin:claude/fix-opencypher-engine-iuCb0

Conversation

@ExtReMLapin
Copy link
Contributor

What does this PR do?

A brief description of the change being made with this pull request.

Motivation

What inspired you to submit this pull request?

Related issues

A list of issues either fixed, containing architectural discussions, otherwise relevant
for this Pull Request.

Additional Notes

Anything else we should know when reviewing?

Checklist

  • I have run the build using mvn clean package command
  • My unit tests cover both failure and success scenarios

Fixed head(collect()) and similar wrapped aggregation patterns returning
null in WITH clauses.

Root cause: WithClause.hasAggregations() only checked for direct
aggregation functions using isAggregation(), missing wrapped patterns
like head(collect(...)) where the outer function is not an aggregation.

Solution:
- Added Expression.containsAggregation() method that recursively checks
  for aggregations within nested expressions
- Updated WithClause and ReturnClause to use containsAggregation()
- Implemented containsAggregation() in expression types with nested
  expressions (FunctionCallExpression, ListExpression, CaseExpression,
  ListComprehensionExpression, ArithmeticExpression)
- Added comprehensive test suite (HeadCollectTest)

This ensures the query planner correctly identifies wrapped aggregations
and uses GroupByAggregationStep instead of WithStep, enabling proper
cross-row aggregation.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Extract property values into Object variables before passing to assertThat
to avoid ambiguity between Predicate/IntPredicate overloads.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Detailed documentation of the head(collect()) fix including:
- Problem description and root cause analysis
- Solution architecture and implementation details
- Complete list of modified files
- Before/after execution flow diagrams
- Testing strategy and backward compatibility notes

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
AggregationStep was only detecting direct aggregations via isAggregation(),
missing wrapped patterns like head(collect(...)). Added logic to:
- Detect wrapped aggregations using findInnerAggregation()
- Create separate aggregators for inner aggregation functions
- Apply wrapper functions to aggregated results after processing
- Update prettyPrint to show wrapped aggregations

This matches the approach used in GroupByAggregationStep.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @ExtReMLapin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the OpenCypher engine where queries using wrapped aggregation functions like 'head(collect())' were incorrectly evaluated, leading to null results. The fix involves a fundamental change to how aggregation functions are detected within expressions, ensuring that the query planner correctly identifies and processes these complex aggregation patterns, thereby enabling accurate query execution and results.

Highlights

  • Enhanced Aggregation Detection: Introduced a new 'containsAggregation()' method in the 'Expression' interface and its implementations to recursively detect aggregation functions, even when they are wrapped within other non-aggregation functions (e.g., 'head(collect(...))').
  • Corrected Query Planning: Updated 'WithClause' and 'ReturnClause' to utilize the new 'containsAggregation()' method, ensuring the query planner correctly identifies wrapped aggregations and selects the appropriate 'GroupByAggregationStep' for execution instead of 'WithStep'.
  • Improved Aggregation Execution: Modified 'AggregationStep' to explicitly handle wrapped aggregations by identifying the inner aggregation, executing it, and then applying the outer wrapper function to its result.
  • Comprehensive Test Coverage: Added a new 'HeadCollectTest.java' file with multiple test cases to validate the fix for various wrapped aggregation patterns, including 'head(collect())', 'last(collect())', and 'size(collect())'.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ExtReMLapin ExtReMLapin changed the title Claude/fix opencypher engine iu cb0 fixes #3307 head and collect Feb 2, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust way to detect wrapped aggregations like head(collect(...)) by adding a recursive containsAggregation() method to the Expression hierarchy. This is a solid approach and has been applied consistently across various expression types. The refactoring in ReturnClause to use this new method is clean, and the addition of a comprehensive test suite in HeadCollectTest.java is excellent for ensuring correctness.

My review includes a few suggestions to improve code conciseness using Java Streams.

However, I've identified a critical issue in AggregationStep where the logic for identifying aggregations only considers FunctionCallExpression, which can lead to incorrect results for queries involving aggregations within other expression types like ListExpression or ArithmeticExpression.

Comment on lines +127 to +143
// CASE contains aggregation if any of its expressions contain aggregations
if (caseExpression != null && caseExpression.containsAggregation()) {
return true;
}

for (final CaseAlternative alternative : alternatives) {
if (alternative.getWhenExpression().containsAggregation()
|| alternative.getThenExpression().containsAggregation()) {
return true;
}
}

if (elseExpression != null && elseExpression.containsAggregation()) {
return true;
}

return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This method can be written more concisely using Java Streams to check the alternatives.

Suggested change
// CASE contains aggregation if any of its expressions contain aggregations
if (caseExpression != null && caseExpression.containsAggregation()) {
return true;
}
for (final CaseAlternative alternative : alternatives) {
if (alternative.getWhenExpression().containsAggregation()
|| alternative.getThenExpression().containsAggregation()) {
return true;
}
}
if (elseExpression != null && elseExpression.containsAggregation()) {
return true;
}
return false;
// CASE contains aggregation if any of its expressions contain aggregations
return (caseExpression != null && caseExpression.containsAggregation())
|| alternatives.stream().anyMatch(alt -> alt.getWhenExpression().containsAggregation() || alt.getThenExpression().containsAggregation())
|| (elseExpression != null && elseExpression.containsAggregation());

Comment on lines +56 to +66
// If this function itself is an aggregation, return true
if (isAggregation()) {
return true;
}
// Otherwise, check if any argument contains an aggregation (wrapped aggregation)
for (final Expression arg : arguments) {
if (arg.containsAggregation()) {
return true;
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This method can be simplified to a single return statement using Java Streams.

    // If this function itself is an aggregation, return true
    if (isAggregation()) {
      return true;
    }
    // Otherwise, check if any argument contains an aggregation (wrapped aggregation)
    return arguments.stream().anyMatch(Expression::containsAggregation);

Comment on lines +154 to +160
if (listExpression.containsAggregation())
return true;
if (whereExpression != null && whereExpression.containsAggregation())
return true;
if (mapExpression != null && mapExpression.containsAggregation())
return true;
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This can be simplified into a single return statement.

Suggested change
if (listExpression.containsAggregation())
return true;
if (whereExpression != null && whereExpression.containsAggregation())
return true;
if (mapExpression != null && mapExpression.containsAggregation())
return true;
return false;
return listExpression.containsAggregation()
|| (whereExpression != null && whereExpression.containsAggregation())
|| (mapExpression != null && mapExpression.containsAggregation());

Comment on lines +62 to +68
// A list contains an aggregation if any of its elements contain aggregations
for (final Expression element : elements) {
if (element.containsAggregation()) {
return true;
}
}
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This method can be simplified to a single return statement using Java Streams.

    // A list contains an aggregation if any of its elements contain aggregations
    return elements.stream().anyMatch(Expression::containsAggregation);

@mergify
Copy link
Contributor

mergify bot commented Feb 2, 2026

🧪 CI Insights

Here's what we observed from your CI run for 53e9da8.

🟢 All jobs passed!

But CI Insights is watching 👀

Added support for list/array indexing syntax (list[index]) which was defined
in the grammar but not implemented in the AST builder and evaluator.

Changes:
- Created ListIndexExpression AST node for list[index] syntax
- Supports negative indices (Python-style): list[-1] gets last element
- Handles List, array, and String types
- Updated ExpressionEvaluator to evaluate ListIndexExpression
- Updated CypherExpressionBuilder to parse Expression0 with postFix operations
- Added findExpression0Recursive() to detect postfix expressions
- Added parseExpression0WithPostfix() to handle property access and indexing
- Created ChainedPropertyAccessExpression for chained access like list[0].property

This fixes queries where list[index] was returning null because it was being
treated as a literal property name instead of an indexing operation.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
…dling

Fixed incorrect reference to Expression0Context which doesn't exist in the
generated parser. The grammar rule is:
  expression2 : expression1 postFix*

So the correct context class is Expression2Context, not Expression0Context.

Changes:
- Expression0Context -> Expression2Context (all occurrences)
- findExpression0Recursive -> findExpression2Recursive
- parseExpression0WithPostfix -> parseExpression2WithPostfix
- expr0Ctx variable -> expr2Ctx
- Updated all comments and documentation

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Fixed compilation error where parseExpression() expected ExpressionContext
but ctx.expression1() returns Expression1Context.

Changed line 713:
- parseExpression(ctx.expression1())
+ parseExpressionFromText(ctx.expression1())

parseExpressionFromText() accepts any ParseTree node and handles the
conversion properly.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
@ExtReMLapin ExtReMLapin changed the title fixes #3307 head and collect fixes #3307 head and collect + list indexing Feb 2, 2026
@ExtReMLapin ExtReMLapin marked this pull request as ready for review February 2, 2026 14:34
@ExtReMLapin
Copy link
Contributor Author

mvn test -Dtest="com.arcadedb.query.opencypher.**.*" -> [INFO] ArcadeDB Engine .................................... SUCCESS [ 11.134 s]

Claude wrote all of this, again.

@codecov
Copy link

codecov bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 36.90476% with 106 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@c7cd467). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...dedb/query/opencypher/ast/ListIndexExpression.java 0.00% 43 Missing ⚠️
...ery/opencypher/parser/CypherExpressionBuilder.java 30.43% 26 Missing and 6 partials ⚠️
...ery/opencypher/executor/steps/AggregationStep.java 70.73% 7 Missing and 5 partials ⚠️
.../arcadedb/query/opencypher/ast/CaseExpression.java 30.00% 3 Missing and 4 partials ⚠️
...ry/opencypher/ast/ListComprehensionExpression.java 14.28% 3 Missing and 3 partials ⚠️
...query/opencypher/executor/ExpressionEvaluator.java 0.00% 2 Missing and 1 partial ⚠️
.../arcadedb/query/opencypher/ast/ListExpression.java 60.00% 1 Missing and 1 partial ⚠️
...edb/query/opencypher/ast/ArithmeticExpression.java 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3310   +/-   ##
=======================================
  Coverage        ?   55.18%           
=======================================
  Files           ?     1367           
  Lines           ?   100424           
  Branches        ?    20438           
=======================================
  Hits            ?    55420           
  Misses          ?    35814           
  Partials        ?     9190           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@lvca lvca merged commit 80c8ee9 into ArcadeData:main Feb 2, 2026
13 of 18 checks passed
@lvca lvca added this to the 26.2.1 milestone Feb 2, 2026
robfrank pushed a commit that referenced this pull request Feb 17, 2026
* fix: detect wrapped aggregations in OpenCypher WITH clauses

Fixed head(collect()) and similar wrapped aggregation patterns returning
null in WITH clauses.

Root cause: WithClause.hasAggregations() only checked for direct
aggregation functions using isAggregation(), missing wrapped patterns
like head(collect(...)) where the outer function is not an aggregation.

Solution:
- Added Expression.containsAggregation() method that recursively checks
  for aggregations within nested expressions
- Updated WithClause and ReturnClause to use containsAggregation()
- Implemented containsAggregation() in expression types with nested
  expressions (FunctionCallExpression, ListExpression, CaseExpression,
  ListComprehensionExpression, ArithmeticExpression)
- Added comprehensive test suite (HeadCollectTest)

This ensures the query planner correctly identifies wrapped aggregations
and uses GroupByAggregationStep instead of WithStep, enabling proper
cross-row aggregation.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* fix: resolve ambiguous assertThat calls in HeadCollectTest

Extract property values into Object variables before passing to assertThat
to avoid ambiguity between Predicate/IntPredicate overloads.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* docs: add comprehensive fix summary for wrapped aggregations issue

Detailed documentation of the head(collect()) fix including:
- Problem description and root cause analysis
- Solution architecture and implementation details
- Complete list of modified files
- Before/after execution flow diagrams
- Testing strategy and backward compatibility notes

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* fix: handle wrapped aggregations in AggregationStep

AggregationStep was only detecting direct aggregations via isAggregation(),
missing wrapped patterns like head(collect(...)). Added logic to:
- Detect wrapped aggregations using findInnerAggregation()
- Create separate aggregators for inner aggregation functions
- Apply wrapper functions to aggregated results after processing
- Update prettyPrint to show wrapped aggregations

This matches the approach used in GroupByAggregationStep.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* fix: implement list indexing (subscript) support in OpenCypher

Added support for list/array indexing syntax (list[index]) which was defined
in the grammar but not implemented in the AST builder and evaluator.

Changes:
- Created ListIndexExpression AST node for list[index] syntax
- Supports negative indices (Python-style): list[-1] gets last element
- Handles List, array, and String types
- Updated ExpressionEvaluator to evaluate ListIndexExpression
- Updated CypherExpressionBuilder to parse Expression0 with postFix operations
- Added findExpression0Recursive() to detect postfix expressions
- Added parseExpression0WithPostfix() to handle property access and indexing
- Created ChainedPropertyAccessExpression for chained access like list[0].property

This fixes queries where list[index] was returning null because it was being
treated as a literal property name instead of an indexing operation.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* fix: correct Expression0Context to Expression2Context for postfix handling

Fixed incorrect reference to Expression0Context which doesn't exist in the
generated parser. The grammar rule is:
  expression2 : expression1 postFix*

So the correct context class is Expression2Context, not Expression0Context.

Changes:
- Expression0Context -> Expression2Context (all occurrences)
- findExpression0Recursive -> findExpression2Recursive
- parseExpression0WithPostfix -> parseExpression2WithPostfix
- expr0Ctx variable -> expr2Ctx
- Updated all comments and documentation

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* fix: use parseExpressionFromText for Expression1Context parsing

Fixed compilation error where parseExpression() expected ExpressionContext
but ctx.expression1() returns Expression1Context.

Changed line 713:
- parseExpression(ctx.expression1())
+ parseExpressionFromText(ctx.expression1())

parseExpressionFromText() accepts any ParseTree node and handles the
conversion properly.

https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev

* removed markdown fix_summary

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
(cherry picked from commit 80c8ee9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants