Enable Circuit Breaker tracking in more parts of the aggregations framework

### Description

This meta issue tracks the effort to extend circuit breaker memory tracking beyond the collect phase of aggregations.  There are several existing issues related to this, which do a good job of describing the problem (and are linked below), but we need a place to track the tasks for fixing this.  That's what this issue is for.

## Plan
Currently (8.4) aggregations create an object per collected bucket, called `InternalAggregation`.  These objects are stored in the `QuerySearchResult` which is responsible for serializing them from the data nodes back to the coordinator, and also for de-serializing them on coordinator side.  Managing these objects is quite tricky, and does not provide good places to inject the circuit breaker logic.

Instead, we want to move to a dense representation, which would create one object per aggregator.  These objects would be `Releasable`, and responsible for tracking both the post-collection data node side memory usage and the reduce time coordinating node memory usage.  Obviously this involves a (big) change to the wire format used for `QuerySearchResult`.  Doing this in a backwards compatible way is non trivial.

### Tasks
- [x] Improve testing for post collection memory access
    - In particular, we need to be able to validate that we don't access something that was "released" at the end of the collect phase.
    - https://github.com/elastic/elasticsearch/pull/90114
    - https://github.com/elastic/elasticsearch/pull/90149
    - https://github.com/elastic/elasticsearch/pull/90530
    - https://github.com/elastic/elasticsearch/pull/90540
    - https://github.com/elastic/elasticsearch/pull/90699
    - https://github.com/elastic/elasticsearch/pull/90775
    - https://github.com/elastic/elasticsearch/pull/91598
- [ ] Netty Buffer Backed `BigArrays`
    - [x] For `BigIntArray` (https://github.com/elastic/elasticsearch/pull/89668)
    - [x] For `BigLongArray` (#91641)
    - [x] For `BigByteArray` (#92706)
    - [ ] For `BigObjectArray`
    - [x] For `BigDoubleArray` (#90745)
    - [ ] For `BigFloatArray`
- [ ] Add in support for a new aggregations response format into `QuerySearchResult`.  We can lay the ground work for this without defining too much about the format itself
    - [x] Add support for releasing aggregation big arrays in `QuerySearchResult` (https://github.com/elastic/elasticsearch/pull/94023)
    - [ ] Write side mixed version support
- [ ] Data Node Local Reduce in Big Arrays
    - [ ] New reduction logic (prototype: https://github.com/elastic/elasticsearch/pull/95346)
        - Since we don't have the object hierarchy of the `InternalAggregation`s to work with, we need a pure ordinals based reduction path. 
        - [ ] Sorted Key Iterator for `LongKeyedBucketOrds` (https://github.com/elastic/elasticsearch/pull/95809)
        - [ ] Sorted Key Iterator for `BytesKeyedBucketOrds`
        - [ ] [Migrate those iterators to be backed by BigArrays](https://github.com/elastic/elasticsearch/issues/95960)
        - [ ] [Use Primitive Iterators](https://github.com/elastic/elasticsearch/issues/95961)
    - [ ] Wire up the reduce side circuit breakers, make sure new reduce time big arrays are correctly released
    - [ ] `AggregatorTestCase` coverage of the new reduce logic, some how
    - [ ] Reduce phase circuit breaker tests, verify that `MockBigArrays` in `AggregatorTestCase` is correctly catching leaks in the new reduce path
    - [ ] Reduce with Cranky Circuit Breaker tests
- [ ] Dense Wire Format
    - [ ] `LongKeyedBucketOrds` and friends need to become `Writable`, same as we did with `BigArrays` (and probably leaning on the buffer backed big arrays)
    - [ ] Reduce side version detection (i.e. downgrade some results if we got mixed old and new format
    - [ ] CCS version detection, re-writing as appropriate


***

### Vague Tasks
- [ ] Prototype a dense representation that addresses the memory concerns.  Initially this can be done with the non-recycling big arrays instance, but long term needs to be wired up to a real circuit breaker.
    - [ ] Works for `Max`
    - [ ] Works for `Range`
    - [ ] Works for `Cardinality`
    - [ ] Works for `Terms`
- [ ] Make sure the prototype dense representation also solves the normalization concerns within terms
- [ ] Validate prototype (many reviews)
- [ ] BWC, Mixed Mode, CCS testing, testing, testing
- [ ] convert the rest of the aggs



## Related Issues
- https://github.com/elastic/elasticsearch/issues/59892
- https://github.com/elastic/elasticsearch/issues/88373
- https://github.com/elastic/elasticsearch/issues/88128
- https://github.com/elastic/elasticsearch/issues/72619
- https://github.com/elastic/elasticsearch/issues/77449

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable Circuit Breaker tracking in more parts of the aggregations framework #89437

Description

Plan

Tasks

Vague Tasks

Related Issues

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Enable Circuit Breaker tracking in more parts of the aggregations framework #89437

Description

Description

Plan

Tasks

Vague Tasks

Related Issues

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions