Skip to content

fix(native): Preserve Java aggregation iteration order across protocol#27903

Merged
kaikalur merged 1 commit into
prestodb:masterfrom
kaikalur:fix/aggregation-outputs-order-v2
Jun 3, 2026
Merged

fix(native): Preserve Java aggregation iteration order across protocol#27903
kaikalur merged 1 commit into
prestodb:masterfrom
kaikalur:fix/aggregation-outputs-order-v2

Conversation

@kaikalur

@kaikalur kaikalur commented May 31, 2026

Copy link
Copy Markdown
Contributor

Summary

protocol::AggregationNode::aggregations is std::map<VRE, Aggregation> on the native side (sorted by variable name); the Java side iterates LinkedHashMap insertion order. When the two orders diverge — e.g. approx_distinct_* mixed with sum_*, alphabetical sort approx_distinct < sum but Java inserts sums first — the native AggregationNode output schema differs from the Java planner's. Channel positions shift and type mismatches surface at exchange operators:

type_->kindEquals(vector.type()) Type mismatch: BIGINT vs. DOUBLE
Operator: LocalPartition(...)

#27493 partially addressed this by switching three Java optimizer rules from HashMap to LinkedHashMap (deterministic insertion order), but did not align Java with the native side's sort. Queries whose variable names happened to sort consistently with their insertion order passed; others still crashed.

This adds an explicit aggregationOutputs: List<VariableReferenceExpression> field, populated on the Java side from aggregations.keySet() in LinkedHashMap order. The native converter uses this list when present and falls back to std::map iteration only for older coordinators that don't send the field (so rolling upgrades are safe).

Fixes #27902.

Changes

  • presto-spi/.../AggregationNode.java: @JsonProperty(READ_ONLY) getAggregationOutputs() — derived from aggregations, no constructor change.
  • presto_protocol_core.h/.cpp: List<VRE> aggregationOutputs on protocol::AggregationNode, with backward-compatible from_json.
  • PrestoToVeloxQueryPlan.cpp: converter uses aggregationOutputs when non-empty; falls back to map iteration otherwise.
  • Tests: protocol round-trip + backward-compat (AggregationNodeTest.cpp); Java getter ordering (TestPreAggregateBeforeGroupId).

Test plan

  • mvn test -pl presto-main-base -Dtest=TestPreAggregateBeforeGroupId#testAggregationOutputsPreservesInsertionOrder
  • presto-native-execution build + run presto_protocol_test --gtest_filter="AggregationNodeTest.*"
  • Existing TestPreAggregateBeforeGroupId suite stays green
  • Manually verify a reproducer query (approx_distinct(...) + sum(...) under GROUPING SETS with optimizer.pre_aggregate_before_grouping_sets=true) no longer crashes on a Prestissimo cluster

Release notes

```
== RELEASE NOTES ==

Native Execution Changes

  • Fix runtime type-mismatch crashes at exchange operators in Prestissimo when aggregation variable names sort differently from their Java allocation order. The protocol now carries an explicit aggregation output ordering so native workers build the AggregationNode output schema in the order the Java planner intended.
    ```

Summary by Sourcery

Ensure native AggregationNode output schemas follow Java planner aggregation ordering by explicitly propagating aggregation output variables through the protocol and converter.

Bug Fixes:

  • Prevent type-mismatch failures in native execution caused by differing aggregation iteration orders between Java and C++.

Enhancements:

  • Expose an explicit aggregationOutputs list on AggregationNode in the Java SPI and carry it through the native protocol for deterministic aggregation output ordering.

Tests:

  • Add Java and C++ regression tests to verify aggregation output ordering is preserved and that the protocol remains backward compatible when aggregationOutputs is absent.

@kaikalur kaikalur requested a review from a team as a code owner May 31, 2026 04:24
@kaikalur kaikalur requested a review from arhimondr May 31, 2026 04:24
@kaikalur kaikalur requested review from a team, feilong-liu and jaystarshot as code owners May 31, 2026 04:24
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label May 31, 2026
@sourcery-ai

sourcery-ai Bot commented May 31, 2026

Copy link
Copy Markdown
Contributor

Reviewer's Guide

Adds an explicit aggregation output ordering to the Presto protocol and native converter so Prestissimo workers build AggregationNode schemas in the same order as the Java planner, keeping it backward compatible, and introduces tests to lock in ordering and compatibility behavior.

Sequence diagram for aggregation output ordering across Java, protocol, and native converter

sequenceDiagram
  participant JavaAggregationNode
  participant ProtocolSerialization
  participant ProtocolAggregationNode
  participant VeloxQueryPlanConverterBase

  JavaAggregationNode->>JavaAggregationNode: getAggregations
  JavaAggregationNode->>JavaAggregationNode: getAggregationOutputs
  JavaAggregationNode->>ProtocolSerialization: to_json
  ProtocolSerialization->>ProtocolAggregationNode: to_json_key aggregations
  ProtocolSerialization->>ProtocolAggregationNode: to_json_key aggregationOutputs

  ProtocolAggregationNode->>ProtocolSerialization: from_json
  ProtocolSerialization->>ProtocolAggregationNode: from_json_key aggregations
  ProtocolSerialization->>ProtocolAggregationNode: from_json_key aggregationOutputs

  VeloxQueryPlanConverterBase->>ProtocolAggregationNode: toVeloxQueryPlan
  VeloxQueryPlanConverterBase->>ProtocolAggregationNode: read aggregationOutputs
  alt aggregationOutputs non_empty
    VeloxQueryPlanConverterBase->>VeloxQueryPlanConverterBase: use aggregationOutputs
  else aggregationOutputs empty
    VeloxQueryPlanConverterBase->>ProtocolAggregationNode: iterate aggregations
  end
  VeloxQueryPlanConverterBase->>VeloxQueryPlanConverterBase: toAggregations
Loading

File-Level Changes

Change Details Files
Expose deterministic aggregation output ordering from Java AggregationNode and serialize it over the wire
  • Add a read-only getAggregationOutputs() getter on AggregationNode that returns aggregations.keySet() as an unmodifiable List in LinkedHashMap insertion order
  • Annotate the getter for JSON serialization without changing the constructor, ensuring the field is derived from aggregations
presto-spi/src/main/java/com/facebook/presto/spi/plan/AggregationNode.java
Extend protocol AggregationNode to carry aggregationOutputs and ensure JSON (de)serialization is backward compatible
  • Add aggregationOutputs field to protocol::AggregationNode alongside aggregations
  • Serialize aggregationOutputs into JSON when present
  • Deserialize aggregationOutputs only when the key is present, leaving it empty for payloads from older coordinators
presto-native-execution/presto_cpp/presto_protocol/core/presto_protocol_core.h
presto-native-execution/presto_cpp/presto_protocol/core/presto_protocol_core.cpp
Use aggregationOutputs in the Presto-to-Velox converter to drive aggregation channel layout
  • Change converter to build outputVariables from aggregationOutputs when non-empty
  • Retain previous behavior by falling back to iterating the aggregations map when aggregationOutputs is empty for backward compatibility
presto-native-execution/presto_cpp/presto_protocol/main/types/PrestoToVeloxQueryPlan.cpp
Add tests to verify aggregation output ordering and protocol backward compatibility
  • Add Java regression test ensuring getAggregationOutputs() preserves insertion order and that rule does not fire
  • Add C++ AggregationNode protocol tests to verify aggregationOutputs preserves JSON list order and that omitting it remains backward compatible
  • Wire new C++ test file into presto_protocol_test CMake target
presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestPreAggregateBeforeGroupId.java
presto-native-execution/presto_cpp/presto_protocol/tests/AggregationNodeTest.cpp
presto-native-execution/presto_cpp/presto_protocol/tests/CMakeLists.txt

Assessment against linked issues

Issue Objective Addressed Explanation
#27902 Add an explicit, order-preserving aggregation output list to the protocol (e.g., aggregationOutputs: List) that reflects the Java planner’s LinkedHashMap insertion order.
#27902 Update the native (C++) converter to build the AggregationNode output schema using the explicit aggregation output list when present, falling back to map iteration only for backward compatibility with older coordinators.
#27902 Add tests to verify that aggregationOutputs preserves JSON / Java insertion order and that omitting aggregationOutputs remains backward compatible.

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • AggregationNode.getAggregationOutputs() allocates a new ArrayList and wrapper on every call; if this method is used in hot paths, consider caching the list or storing it as a field derived once from aggregations to avoid repeated allocations.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- AggregationNode.getAggregationOutputs() allocates a new ArrayList and wrapper on every call; if this method is used in hot paths, consider caching the list or storing it as a field derived once from aggregations to avoid repeated allocations.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@kaikalur kaikalur force-pushed the fix/aggregation-outputs-order-v2 branch from 050a7d4 to 653dbf9 Compare May 31, 2026 04:33
@kaikalur

Copy link
Copy Markdown
Contributor Author

Addressed sourcery-ai's feedback in the latest push: aggregationOutputs is now computed once in the constructor and stored in a final field (matching the existing pattern for outputs); getAggregationOutputs() just returns the cached list, no allocation per call.

@kaikalur kaikalur force-pushed the fix/aggregation-outputs-order-v2 branch from 653dbf9 to 386c644 Compare May 31, 2026 04:53
@kaikalur

Copy link
Copy Markdown
Contributor Author

Pushed a test fix: AggregationNodeTest JSON had an invalid valuesNodeLabel: {} (expected String) on the inline ValuesNode source. Dropped the source from the test JSON entirely — the test only exercises protocol round-trip for the AggregationNode fields, and protocol::AggregationNode::source is a std::shared_ptr<PlanNode> so an absent key is tolerated by from_json_key. CI should be green now.

@kaikalur kaikalur force-pushed the fix/aggregation-outputs-order-v2 branch from 386c644 to e4aca4f Compare May 31, 2026 13:53
@kaikalur

Copy link
Copy Markdown
Contributor Author

Pushed a follow-up fix for AggregationNodeTest. The previous attempt was failing because the inline test JSON for protocol::Aggregation was missing required top-level fields (arguments is an ExtraField per the yml, not nested inside call). Rewrote the first test to build the AggregationNode in C++ and round-trip it through json — narrower scope, no protocol dependency boilerplate. The other CI failures (dependency-check NVD feed, spark-integration maven wrapper download, test (17, :presto-hive) testLowStockAnalysis and testMultipleIndependentPersistentCtes) are unrelated flakes and should clear on the rerun.

@kaikalur kaikalur force-pushed the fix/aggregation-outputs-order-v2 branch from e4aca4f to ededdf5 Compare May 31, 2026 15:45
@kaikalur

Copy link
Copy Markdown
Contributor Author

Push 3 fixes the GPU build. prestocpp-linux-build-gpu-engine regenerates the protocol headers from the YAML, which dropped my manually-edited aggregationOutputs field (the Java getter is READ_ONLY so the codegen didn't include it). Declared the field in ExtraFields: in presto_protocol_core.yml as Optional<List<VariableReferenceExpression>> so the generator emits it unconditionally; converter dereferences the std::shared_ptr<> and falls back to map iteration when the field is absent (old coordinator → new worker rolling-upgrade case).

The Prestissimo protocol struct protocol::AggregationNode stores
aggregations as Map<VRE, Aggregation> = std::map<VRE>, which iterates
in sorted name order. The Java side iterates in LinkedHashMap insertion
order. When these orders diverge (e.g. approx_distinct_* mixed with
sum_*), the native AggregationNode output schema differs from the
Java planner's, shifting channel positions and causing type mismatches
at exchange operators:

  type_->kindEquals(vector.type()) Type mismatch: BIGINT vs. DOUBLE
  Operator: LocalPartition(...)

prestodb#27493 partially addressed this by switching Java rules
from HashMap to LinkedHashMap (deterministic insertion order), but did
not align Java with the native side's sort. Queries whose variable
names happened to sort consistently with insertion order passed; others
still crashed.

Add an explicit aggregationOutputs: List<VariableReferenceExpression>
field, populated on the Java side from aggregations.keySet() in
LinkedHashMap order. The native converter uses this list when present
and falls back to map iteration only for older coordinators that do
not send the field.

```
== RELEASE NOTES ==

Native Execution Changes
* Fix runtime type-mismatch crashes at exchange operators in Prestissimo when aggregation variable names sort differently from their Java allocation order. The protocol now carries an explicit aggregation output ordering so native workers build the AggregationNode output schema in the order the Java planner intended.
```

Fixes prestodb#27902
@kaikalur kaikalur force-pushed the fix/aggregation-outputs-order-v2 branch from ededdf5 to dfb76b2 Compare June 1, 2026 15:33
@kaikalur kaikalur requested a review from amitkdutta June 1, 2026 20:20
@kaikalur

kaikalur commented Jun 1, 2026

Copy link
Copy Markdown
Contributor Author

@feilong-liu when you get a chance, could you take a look? Thanks! Also adding @amitkdutta as a reviewer.

@kaikalur kaikalur merged commit 84e8a9e into prestodb:master Jun 3, 2026
119 of 121 checks passed
@kaikalur kaikalur deleted the fix/aggregation-outputs-order-v2 branch June 3, 2026 18:21
msmygit pushed a commit to msmygit/presto that referenced this pull request Jun 3, 2026
prestodb#27903)

## Summary

`protocol::AggregationNode::aggregations` is `std::map<VRE,
Aggregation>` on the native side (sorted by variable name); the Java
side iterates `LinkedHashMap` insertion order. When the two orders
diverge — e.g. `approx_distinct_*` mixed with `sum_*`, alphabetical sort
`approx_distinct < sum` but Java inserts `sum`s first — the native
`AggregationNode` output schema differs from the Java planner's. Channel
positions shift and type mismatches surface at exchange operators:

```
type_->kindEquals(vector.type()) Type mismatch: BIGINT vs. DOUBLE
Operator: LocalPartition(...)
```

prestodb#27493 partially addressed this by switching three Java optimizer rules
from `HashMap` to `LinkedHashMap` (deterministic insertion order), but
did not align Java with the native side's sort. Queries whose variable
names happened to sort consistently with their insertion order passed;
others still crashed.

This adds an explicit `aggregationOutputs:
List<VariableReferenceExpression>` field, populated on the Java side
from `aggregations.keySet()` in `LinkedHashMap` order. The native
converter uses this list when present and falls back to `std::map`
iteration only for older coordinators that don't send the field (so
rolling upgrades are safe).

Fixes prestodb#27902.

## Changes

- `presto-spi/.../AggregationNode.java`: `@JsonProperty(READ_ONLY)
getAggregationOutputs()` — derived from `aggregations`, no constructor
change.
- `presto_protocol_core.h`/`.cpp`: `List<VRE> aggregationOutputs` on
`protocol::AggregationNode`, with backward-compatible `from_json`.
- `PrestoToVeloxQueryPlan.cpp`: converter uses `aggregationOutputs` when
non-empty; falls back to map iteration otherwise.
- Tests: protocol round-trip + backward-compat
(`AggregationNodeTest.cpp`); Java getter ordering
(`TestPreAggregateBeforeGroupId`).

## Test plan

- [ ] `mvn test -pl presto-main-base
-Dtest=TestPreAggregateBeforeGroupId#testAggregationOutputsPreservesInsertionOrder`
- [ ] `presto-native-execution` build + run `presto_protocol_test
--gtest_filter="AggregationNodeTest.*"`
- [ ] Existing `TestPreAggregateBeforeGroupId` suite stays green
- [ ] Manually verify a reproducer query (`approx_distinct(...)` +
`sum(...)` under `GROUPING SETS` with
`optimizer.pre_aggregate_before_grouping_sets=true`) no longer crashes
on a Prestissimo cluster

## Release notes

\`\`\`
== RELEASE NOTES ==

Native Execution Changes
* Fix runtime type-mismatch crashes at exchange operators in Prestissimo
when aggregation variable names sort differently from their Java
allocation order. The protocol now carries an explicit aggregation
output ordering so native workers build the AggregationNode output
schema in the order the Java planner intended.
\`\`\`

## Summary by Sourcery

Ensure native AggregationNode output schemas follow Java planner
aggregation ordering by explicitly propagating aggregation output
variables through the protocol and converter.

Bug Fixes:
- Prevent type-mismatch failures in native execution caused by differing
aggregation iteration orders between Java and C++.

Enhancements:
- Expose an explicit aggregationOutputs list on AggregationNode in the
Java SPI and carry it through the native protocol for deterministic
aggregation output ordering.

Tests:
- Add Java and C++ regression tests to verify aggregation output
ordering is preserved and that the protocol remains backward compatible
when aggregationOutputs is absent.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:Meta PR from Meta

Projects

None yet

Development

Successfully merging this pull request may close these issues.

protocol::AggregationNode::aggregations order-dependence causes type mismatches at exchange boundaries in native execution

3 participants