Skip to content

[ESQL] Adds memory accounting to GroupedLimitOperator#143941

Merged
ncordon merged 3 commits intoelastic:mainfrom
ncordon:grouped-limit-op-status
Mar 11, 2026
Merged

[ESQL] Adds memory accounting to GroupedLimitOperator#143941
ncordon merged 3 commits intoelastic:mainfrom
ncordon:grouped-limit-op-status

Conversation

@ncordon
Copy link
Copy Markdown
Member

@ncordon ncordon commented Mar 10, 2026

Given GroupedLimitOperator stores seenKeys, counts and whatnot and it's not just a pass-through page operator, this PR makes it implement Accountable so we have some visibility into the memory consumption of the operator.

Part of #112918, https://github.com/elastic/esql-planning/issues/238

@ncordon ncordon added >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.4.0 labels Mar 10, 2026
@ncordon ncordon force-pushed the grouped-limit-op-status branch from 586c7d3 to eb368e2 Compare March 10, 2026 12:26
@ncordon ncordon marked this pull request as ready for review March 10, 2026 13:57
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Copy link
Copy Markdown
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds memory accounting visibility to GroupedLimitOperator by implementing Accountable and surfacing the operator’s retained memory usage in its status output (including an additional human-readable field).

Changes:

  • Make GroupedLimitOperator implement Accountable and report ramBytesUsed().
  • Extend GroupedLimitOperator.Status to include ram_bytes_used and ram_used in toXContent.
  • Update operator string/description outputs and corresponding unit tests.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/GroupedLimitOperator.java Adds Accountable implementation, updates describe()/toString(), and extends status with RAM usage fields.
x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/operator/GroupedLimitOperatorTests.java Updates expected describe()/toString() and asserts new RAM fields appear in status maps.
x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/operator/GroupedLimitOperatorStatusTests.java Updates wire/XContent expectations and mutation coverage for the new ramBytesUsed field.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 249 to 266
@@ -229,6 +252,7 @@ protected Status(StreamInput in) throws IOException {
pagesProcessed = in.readVInt();
rowsReceived = in.readVLong();
rowsEmitted = in.readVLong();
ramBytesUsed = in.readVLong();
}

@Override
@@ -238,6 +262,7 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeVInt(pagesProcessed);
out.writeVLong(rowsReceived);
out.writeVLong(rowsEmitted);
out.writeVLong(ramBytesUsed);
}
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GroupedLimitOperator.Status adds ramBytesUsed to the wire format unconditionally (readVLong/writeVLong). This breaks BWC for mixed-version clusters because older nodes will attempt to deserialize the status without this extra field. Gate the new field behind a TransportVersion feature flag (and default ramBytesUsed to 0 when unsupported), similar to HashAggregationOperator.Status / TopNOperatorStatus.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no need for TV, as this status is never used yet (Not linked with the planning). But it's maybe worth an extra opinion here

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not true because this operator is not in use yet

@ncordon ncordon merged commit e56e377 into elastic:main Mar 11, 2026
40 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 11, 2026
…elocations

* upstream/main: (54 commits)
  [ES|QL|DS] Wire parallel parsing into production for text formats (elastic#143997)
  ESQL: Allow EXTERNAL commands be run part of the CsvTests suite (elastic#143970)
  [ESQL] Push stats to external source via metadata (elastic#143940)
  Mute org.elasticsearch.xpack.esql.CsvIT test {csv-spec:approximation.Approximate stats with stats where} elastic#144051
  Refactored SortedNumericDocValuesSyntheticFieldLoader into a Layer (elastic#143912)
  Enable extended doc_values params feature flag in RandomizedRollingUpgradeIT (elastic#143918)
  Mute org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT test {csv-spec:approximation.Approximate stats with sample} elastic#144022
  Ensure we use float values for rolling upgrade float vectors (elastic#144032)
  Remove sensitive info from reindex task description (elastic#143635)
  Fix HistogramUnionState.equals (elastic#143990)
  Use dedicated IndexRouting API in ShardSplittingQuery (elastic#143776)
  Engine/Store DistributedArchitectureGuide doc (elastic#143818)
  Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testDeletesAreBatched elastic#144034
  Avoid serializing exceptions as JSON in remote write endpoint (elastic#143987)
  allow testLoadDocSequenceReturnsCorrectResultsText to circuit break, it happens in serverless occasionally (elastic#144023)
  [ESQL] Adds memory accounting to GroupedLimitOperator (elastic#143941)
  Adjust ESIntegTestCase.getLiveDocs method to account for pruned sequence numbers (elastic#143999)
  Support target bucket count in `TBUCKET` with explicit from/to date range (elastic#142747)
  TSDBDocValuesFormatSingleNodeTests with and without synthetic id (elastic#144002)
  Fix circuit breaker leak in BreakingTDigestHolder (elastic#143873)
  ...
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants