Save a little space in agg tree (backport of #53730) #54213

Merged
nik9000 merged 8 commits into elastic:7.x from
nik9000:pipeline_drop_serialization_7_x
Mar 25, 2020

Conversation

@nik9000
Member

@nik9000 nik9000 commented Mar 25, 2020

This drops the "top level" pipeline aggregators from the aggregation
result tree, which should save a little memory and a few serialization
bytes. Perhaps more importantly, this provides a mechanism by which we
can remove all pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.

Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.8.0 need those pipelines. We provide them by passing
a Supplier<PipelineTree> into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.8.0.

This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. It's quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.8.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node will send the pipeline aggregations back
to the coordinating node. This is all managed with that
Supplier<PipelineTree>, but how it is managed is a bit tricky.
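
To make the idea concrete, here is a minimal sketch of the lazy supplier check. It is not the code in this PR: `PipelineBwcSketch`, `ReducedAggs`, `writeTo`, the integer version encoding, and the list-of-names `PipelineTree` are all hypothetical stand-ins. Only the `Supplier<PipelineTree>` idea and the 7.8.0 cut-over come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class PipelineBwcSketch {
    // Assumed numeric encoding of 7.8.0, the cut-over named in the description.
    static final int V_7_8_0 = 70800;

    /** Hypothetical stand-in for the real PipelineTree: just pipeline names. */
    static final class PipelineTree {
        final List<String> pipelineNames;

        PipelineTree(List<String> pipelineNames) {
            this.pipelineNames = pipelineNames;
        }
    }

    /** Hypothetical root of a reduced aggregation result tree. */
    static final class ReducedAggs {
        private final List<String> aggNames;
        // Only invoked when the receiver still expects pipelines on the wire.
        private final Supplier<PipelineTree> pipelineTreeForBwc;

        ReducedAggs(List<String> aggNames, Supplier<PipelineTree> pipelineTreeForBwc) {
            this.aggNames = aggNames;
            this.pipelineTreeForBwc = pipelineTreeForBwc;
        }

        /** Returns what would be serialized for a receiver on the given version. */
        List<String> writeTo(int receiverVersion) {
            List<String> wire = new ArrayList<>(aggNames);
            if (receiverVersion < V_7_8_0) {
                // Old receiver: build the pipeline tree on demand and include it.
                wire.addAll(pipelineTreeForBwc.get().pipelineNames);
            }
            return wire;
        }
    }
}
```

In this sketch, `writeTo(70800)` (a data node sending to a 7.8.0 gateway) never touches the supplier, while `writeTo(70100)` (the gateway sending to a 7.1.0 coordinating node) calls it and appends the pipelines.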

nik9000 added 6 commits March 25, 2020 08:59
This drops the "top level" pipeline aggregators from the aggregation
result tree, which should save a little memory and a few serialization
bytes. Perhaps more importantly, this provides a mechanism by which we
can remove *all* pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.

Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.7.0 *need* those pipelines. We provide them by passing
a `Supplier<PipelineTree>` into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.7.0.

This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. It's quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.7.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node *will* send the pipeline aggregations back
to the coordinating node. This is all managed with that
`Supplier<PipelineTree>`, but *how* it is managed is a bit tricky.
@nik9000 nik9000 marked this pull request as ready for review March 25, 2020 17:32
@nik9000
Member Author

nik9000 commented Mar 25, 2020

@elasticmachine run elasticsearch-ci/bwc

@nik9000 nik9000 merged commit 8f40f14 into elastic:7.x Mar 25, 2020