fixes #3307 head and collect + list indexing#3310
Conversation
Fixed head(collect()) and similar wrapped aggregation patterns returning null in WITH clauses. Root cause: WithClause.hasAggregations() only checked for direct aggregation functions using isAggregation(), missing wrapped patterns like head(collect(...)) where the outer function is not an aggregation. Solution: - Added Expression.containsAggregation() method that recursively checks for aggregations within nested expressions - Updated WithClause and ReturnClause to use containsAggregation() - Implemented containsAggregation() in expression types with nested expressions (FunctionCallExpression, ListExpression, CaseExpression, ListComprehensionExpression, ArithmeticExpression) - Added comprehensive test suite (HeadCollectTest) This ensures the query planner correctly identifies wrapped aggregations and uses GroupByAggregationStep instead of WithStep, enabling proper cross-row aggregation. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Extract property values into Object variables before passing to assertThat to avoid ambiguity between Predicate/IntPredicate overloads. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Detailed documentation of the head(collect()) fix including: - Problem description and root cause analysis - Solution architecture and implementation details - Complete list of modified files - Before/after execution flow diagrams - Testing strategy and backward compatibility notes https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
AggregationStep was only detecting direct aggregations via isAggregation(), missing wrapped patterns like head(collect(...)). Added logic to: - Detect wrapped aggregations using findInnerAggregation() - Create separate aggregators for inner aggregation functions - Apply wrapper functions to aggregated results after processing - Update prettyPrint to show wrapped aggregations This matches the approach used in GroupByAggregationStep. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Summary of ChangesHello @ExtReMLapin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug in the OpenCypher engine where queries using wrapped aggregation functions like 'head(collect())' were incorrectly evaluated, leading to null results. The fix involves a fundamental change to how aggregation functions are detected within expressions, ensuring that the query planner correctly identifies and processes these complex aggregation patterns, thereby enabling accurate query execution and results. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a robust way to detect wrapped aggregations like head(collect(...)) by adding a recursive containsAggregation() method to the Expression hierarchy. This is a solid approach and has been applied consistently across various expression types. The refactoring in ReturnClause to use this new method is clean, and the addition of a comprehensive test suite in HeadCollectTest.java is excellent for ensuring correctness.
My review includes a few suggestions to improve code conciseness using Java Streams.
However, I've identified a critical issue in AggregationStep where the logic for identifying aggregations only considers FunctionCallExpression, which can lead to incorrect results for queries involving aggregations within other expression types like ListExpression or ArithmeticExpression.
| // CASE contains aggregation if any of its expressions contain aggregations | ||
| if (caseExpression != null && caseExpression.containsAggregation()) { | ||
| return true; | ||
| } | ||
|
|
||
| for (final CaseAlternative alternative : alternatives) { | ||
| if (alternative.getWhenExpression().containsAggregation() | ||
| || alternative.getThenExpression().containsAggregation()) { | ||
| return true; | ||
| } | ||
| } | ||
|
|
||
| if (elseExpression != null && elseExpression.containsAggregation()) { | ||
| return true; | ||
| } | ||
|
|
||
| return false; |
There was a problem hiding this comment.
This method can be written more concisely using Java Streams to check the alternatives.
| // CASE contains aggregation if any of its expressions contain aggregations | |
| if (caseExpression != null && caseExpression.containsAggregation()) { | |
| return true; | |
| } | |
| for (final CaseAlternative alternative : alternatives) { | |
| if (alternative.getWhenExpression().containsAggregation() | |
| || alternative.getThenExpression().containsAggregation()) { | |
| return true; | |
| } | |
| } | |
| if (elseExpression != null && elseExpression.containsAggregation()) { | |
| return true; | |
| } | |
| return false; | |
| // CASE contains aggregation if any of its expressions contain aggregations | |
| return (caseExpression != null && caseExpression.containsAggregation()) | |
| || alternatives.stream().anyMatch(alt -> alt.getWhenExpression().containsAggregation() || alt.getThenExpression().containsAggregation()) | |
| || (elseExpression != null && elseExpression.containsAggregation()); |
| // If this function itself is an aggregation, return true | ||
| if (isAggregation()) { | ||
| return true; | ||
| } | ||
| // Otherwise, check if any argument contains an aggregation (wrapped aggregation) | ||
| for (final Expression arg : arguments) { | ||
| if (arg.containsAggregation()) { | ||
| return true; | ||
| } | ||
| } | ||
| return false; |
There was a problem hiding this comment.
This method can be simplified to a single return statement using Java Streams.
// If this function itself is an aggregation, return true
if (isAggregation()) {
return true;
}
// Otherwise, check if any argument contains an aggregation (wrapped aggregation)
return arguments.stream().anyMatch(Expression::containsAggregation);| if (listExpression.containsAggregation()) | ||
| return true; | ||
| if (whereExpression != null && whereExpression.containsAggregation()) | ||
| return true; | ||
| if (mapExpression != null && mapExpression.containsAggregation()) | ||
| return true; | ||
| return false; |
There was a problem hiding this comment.
This can be simplified into a single return statement.
| if (listExpression.containsAggregation()) | |
| return true; | |
| if (whereExpression != null && whereExpression.containsAggregation()) | |
| return true; | |
| if (mapExpression != null && mapExpression.containsAggregation()) | |
| return true; | |
| return false; | |
| return listExpression.containsAggregation() | |
| || (whereExpression != null && whereExpression.containsAggregation()) | |
| || (mapExpression != null && mapExpression.containsAggregation()); |
| // A list contains an aggregation if any of its elements contain aggregations | ||
| for (final Expression element : elements) { | ||
| if (element.containsAggregation()) { | ||
| return true; | ||
| } | ||
| } | ||
| return false; |
🧪 CI InsightsHere's what we observed from your CI run for 53e9da8. 🟢 All jobs passed!But CI Insights is watching 👀 |
Added support for list/array indexing syntax (list[index]) which was defined in the grammar but not implemented in the AST builder and evaluator. Changes: - Created ListIndexExpression AST node for list[index] syntax - Supports negative indices (Python-style): list[-1] gets last element - Handles List, array, and String types - Updated ExpressionEvaluator to evaluate ListIndexExpression - Updated CypherExpressionBuilder to parse Expression0 with postFix operations - Added findExpression0Recursive() to detect postfix expressions - Added parseExpression0WithPostfix() to handle property access and indexing - Created ChainedPropertyAccessExpression for chained access like list[0].property This fixes queries where list[index] was returning null because it was being treated as a literal property name instead of an indexing operation. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
…dling Fixed incorrect reference to Expression0Context which doesn't exist in the generated parser. The grammar rule is: expression2 : expression1 postFix* So the correct context class is Expression2Context, not Expression0Context. Changes: - Expression0Context -> Expression2Context (all occurrences) - findExpression0Recursive -> findExpression2Recursive - parseExpression0WithPostfix -> parseExpression2WithPostfix - expr0Ctx variable -> expr2Ctx - Updated all comments and documentation https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
Fixed compilation error where parseExpression() expected ExpressionContext but ctx.expression1() returns Expression1Context. Changed line 713: - parseExpression(ctx.expression1()) + parseExpressionFromText(ctx.expression1()) parseExpressionFromText() accepts any ParseTree node and handles the conversion properly. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev
|
Claude wrote all of this, again. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3310 +/- ##
=======================================
Coverage ? 55.18%
=======================================
Files ? 1367
Lines ? 100424
Branches ? 20438
=======================================
Hits ? 55420
Misses ? 35814
Partials ? 9190 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
* fix: detect wrapped aggregations in OpenCypher WITH clauses Fixed head(collect()) and similar wrapped aggregation patterns returning null in WITH clauses. Root cause: WithClause.hasAggregations() only checked for direct aggregation functions using isAggregation(), missing wrapped patterns like head(collect(...)) where the outer function is not an aggregation. Solution: - Added Expression.containsAggregation() method that recursively checks for aggregations within nested expressions - Updated WithClause and ReturnClause to use containsAggregation() - Implemented containsAggregation() in expression types with nested expressions (FunctionCallExpression, ListExpression, CaseExpression, ListComprehensionExpression, ArithmeticExpression) - Added comprehensive test suite (HeadCollectTest) This ensures the query planner correctly identifies wrapped aggregations and uses GroupByAggregationStep instead of WithStep, enabling proper cross-row aggregation. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * fix: resolve ambiguous assertThat calls in HeadCollectTest Extract property values into Object variables before passing to assertThat to avoid ambiguity between Predicate/IntPredicate overloads. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * docs: add comprehensive fix summary for wrapped aggregations issue Detailed documentation of the head(collect()) fix including: - Problem description and root cause analysis - Solution architecture and implementation details - Complete list of modified files - Before/after execution flow diagrams - Testing strategy and backward compatibility notes https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * fix: handle wrapped aggregations in AggregationStep AggregationStep was only detecting direct aggregations via isAggregation(), missing wrapped patterns like head(collect(...)). Added logic to: - Detect wrapped aggregations using findInnerAggregation() - Create separate aggregators for inner aggregation functions - Apply wrapper functions to aggregated results after processing - Update prettyPrint to show wrapped aggregations This matches the approach used in GroupByAggregationStep. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * fix: implement list indexing (subscript) support in OpenCypher Added support for list/array indexing syntax (list[index]) which was defined in the grammar but not implemented in the AST builder and evaluator. Changes: - Created ListIndexExpression AST node for list[index] syntax - Supports negative indices (Python-style): list[-1] gets last element - Handles List, array, and String types - Updated ExpressionEvaluator to evaluate ListIndexExpression - Updated CypherExpressionBuilder to parse Expression0 with postFix operations - Added findExpression0Recursive() to detect postfix expressions - Added parseExpression0WithPostfix() to handle property access and indexing - Created ChainedPropertyAccessExpression for chained access like list[0].property This fixes queries where list[index] was returning null because it was being treated as a literal property name instead of an indexing operation. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * fix: correct Expression0Context to Expression2Context for postfix handling Fixed incorrect reference to Expression0Context which doesn't exist in the generated parser. The grammar rule is: expression2 : expression1 postFix* So the correct context class is Expression2Context, not Expression0Context. Changes: - Expression0Context -> Expression2Context (all occurrences) - findExpression0Recursive -> findExpression2Recursive - parseExpression0WithPostfix -> parseExpression2WithPostfix - expr0Ctx variable -> expr2Ctx - Updated all comments and documentation https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * fix: use parseExpressionFromText for Expression1Context parsing Fixed compilation error where parseExpression() expected ExpressionContext but ctx.expression1() returns Expression1Context. Changed line 713: - parseExpression(ctx.expression1()) + parseExpressionFromText(ctx.expression1()) parseExpressionFromText() accepts any ParseTree node and handles the conversion properly. https://claude.ai/code/session_019P6hHuHrJK5ZGKWoxWL3Ev * removed markdown fix_summary --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr> (cherry picked from commit 80c8ee9)
What does this PR do?
A brief description of the change being made with this pull request.
Motivation
What inspired you to submit this pull request?
Related issues
A list of issues either fixed, containing architectural discussions, otherwise relevant
for this Pull Request.
Additional Notes
Anything else we should know when reviewing?
Checklist
mvn clean packagecommand