Remove deprecated wrapper from scripted_metric by nik9000 · Pull Request #57627 · elastic/elasticsearch

nik9000 · 2020-06-03T21:11:15Z

This removes the deprecated `asMultiBucketAggregator` wrapper from
`scripted_metric`. Unlike most other such removals, this isn't likely to
save much memory. But it does make the internals of the aggregator
slightly less twisted.

Relates to #56487

elasticmachine · 2020-06-03T21:11:18Z

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

nik9000 · 2020-06-03T21:54:57Z

@elasticmachine run elasticsearch-ci/1

talevy · 2020-06-04T22:14:09Z

...er/src/main/java/org/elasticsearch/search/aggregations/metrics/ScriptedMetricAggregator.java

+     * tracked by the circuit breakers properly. This is sad. So we pick a big
+     * number and estimate that each bucket costs that. It could be wildly
+     * inaccurate. We're sort of hoping that the real memory breaker saves
+     * us here. Or that folks just don't use the aggregation.


talevy · 2020-06-04T22:14:25Z

...er/src/main/java/org/elasticsearch/search/aggregations/metrics/ScriptedMetricAggregator.java

+     * inaccurate. We're sort of hoping that the real memory breaker saves
+     * us here. Or that folks just don't use the aggregation.
+     */
+    private static final long BUCKET_COST_ESTIMATE = 1024 * 5;


what made you choose this number beyond just "make it large"?

polyfractal · 2020-06-05

Drive-by comment: I wonder if it would make sense to expose this as a configurable option for advanced users? (perhaps in a follow-up PR)

Reasoning being that we can't figure it out easily, it's already an advanced agg, so a user might actually be able to give us a reasonable estimate. And if they don't set it, we fall back to Big Number.

Dunno, might be a terrible idea :)

nik9000 · 2020-06-05

The number is the original default "weight" of an aggregator. So it is basically what we used to have. I didn't want to just set it to the default weight because it doesn't really have anything to do with it, other than that it is what we were using.

Making it configurable is certainly interesting! I don't think folks are going to have a good idea of what to set it to without having done some Java hacking though. We try not to leak stuff like that to our users. OTOH `scripted_metric` is pretty bonkers, so if we are going to leak that sort of thing anywhere, here it is.
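
For readers following the thread, here is a minimal sketch of the flat-estimate accounting being discussed. It is not the code from this PR: `SimpleBreaker` and `FlatCostBuckets` are hypothetical names invented for illustration, and only `BUCKET_COST_ESTIMATE` and its value come from the diff above. The assumption is simply that every new bucket reserves the same fixed number of bytes, because the real size of script state can't be measured.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a request circuit breaker; not an Elasticsearch class. */
final class SimpleBreaker {
    private final long limitBytes;
    private long usedBytes;

    SimpleBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes, or throws if the reservation would exceed the limit. */
    void reserve(long bytes) {
        if (usedBytes + bytes > limitBytes) {
            throw new IllegalStateException("breaker tripped at " + (usedBytes + bytes) + " bytes");
        }
        usedBytes += bytes;
    }

    long usedBytes() {
        return usedBytes;
    }
}

/** Hypothetical bucket store that charges a flat estimate per bucket. */
final class FlatCostBuckets {
    // Value taken from the diff under review.
    static final long BUCKET_COST_ESTIMATE = 1024 * 5;

    private final SimpleBreaker breaker;
    private final List<Object> states = new ArrayList<>();

    FlatCostBuckets(SimpleBreaker breaker) {
        this.breaker = breaker;
    }

    /** Charges the estimate first so a tripped breaker leaves no half-added bucket. */
    void addBucket(Object scriptState) {
        breaker.reserve(BUCKET_COST_ESTIMATE);
        states.add(scriptState);
    }

    int bucketCount() {
        return states.size();
    }
}
```

With a limit of three estimates, a fourth `addBucket` call throws before the bucket is stored. The estimate can still be far from the real footprint, which is the concern raised in the code comment.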

polyfractal · 2020-06-05

Yeah I don't feel super great about the idea either. But then again, we're essentially allowing a user to write a custom map-reduce job that runs on ES, so exposing this kind of accounting isn't too crazy in that context.

There's probably a case for forcing the user to estimate it too, since I've definitely seen very bad scripts knock over clusters before (maps holding high cardinality IDs, etc). I think that's probably a bridge too far, alas :)

Scripts don't have permission to call `parallelStream` because it'll sometimes make a fork/join pool and scripts can't start threads. In this case I added it by accident in #57627 and should have just used `stream`.
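
To illustrate the difference in plain Java (a sketch only, not the script from the PR): a sequential `stream()` does its work on the calling thread, while `parallelStream()` may hand work to the common fork/join pool. `CombineStates` and `sumSequentially` are hypothetical names, and summing per-shard partial results is an assumed use case chosen for the example.

```java
import java.util.List;

/** Hypothetical helper that combines partial results without starting threads. */
final class CombineStates {
    /** Sums the partial results on the calling thread using a sequential stream. */
    static long sumSequentially(List<Long> partials) {
        // stream() is sequential; parallelStream() could use fork/join worker threads.
        return partials.stream().mapToLong(Long::longValue).sum();
    }
}
```

The result is the same either way for an associative operation like a sum, so nothing is lost by staying sequential here.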

Fixes two bugs introduced by elastic#57627:

1. We were not properly releasing memory back to the request breaker when the aggregation finished.
2. We no longer supported totally arbitrary stuff produced by the init script because we *assumed* that it'd be ok to run the script once and clone its results. Sadly, cloning can't handle *everything* that the init script can make, such as `String` arrays.

This runs the init script once for every new bucket so we don't need to clone.
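
A minimal sketch of the two fixes, under stated assumptions: `PerBucketStates` is a hypothetical class, the init script is modelled as a plain `Supplier<Object>`, and a simple counter stands in for the request breaker. It is not the actual aggregator code. The point is that each new bucket gets its own freshly built state (so nothing has to be cloned) and that `close()` gives back everything that was reserved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical per-bucket state holder; not the real ScriptedMetricAggregator. */
final class PerBucketStates implements AutoCloseable {
    // Value taken from the diff in the original PR.
    static final long BUCKET_COST_ESTIMATE = 1024 * 5;

    private final Supplier<Object> initScript;
    private final List<Object> states = new ArrayList<>();
    private long reservedBytes; // stand-in for bytes held on a request breaker

    PerBucketStates(Supplier<Object> initScript) {
        this.initScript = initScript;
    }

    /** Returns the state for a bucket, running the init script once per new bucket. */
    Object stateFor(int bucketOrd) {
        while (states.size() <= bucketOrd) {
            reservedBytes += BUCKET_COST_ESTIMATE;
            states.add(initScript.get()); // fresh state, so no cloning is needed
        }
        return states.get(bucketOrd);
    }

    long reservedBytes() {
        return reservedBytes;
    }

    /** Releases everything that was reserved once the aggregation is finished. */
    @Override
    public void close() {
        reservedBytes = 0;
        states.clear();
    }
}
```

Because the supplier runs for every bucket, a state such as a `String` array works without any special handling, and mutating one bucket's array leaves the others untouched.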

nik9000 added the >non-issue, :Analytics/Aggregations, v8.0.0, and v7.9.0 labels Jun 3, 2020

nik9000 requested a review from talevy June 3, 2020 21:11

elasticmachine added the Team:Analytics label Jun 3, 2020

nik9000 mentioned this pull request Jun 3, 2020

Multi-bucket aggregator wrapper is slow and uses a ton of memory #56487 (Closed, 16 tasks)

Merge branch 'master' into scripted_metric_mem

22828cc

talevy approved these changes Jun 4, 2020

nik9000 merged commit 2b82551 into elastic:master Jun 5, 2020

nik9000 added the backport pending label Jun 5, 2020

nik9000 removed the backport pending label Jun 5, 2020

nik9000 mentioned this pull request Jun 25, 2020

Fix two scripted_metric bugs #58547 (Merged)

nik9000 mentioned this pull request Jun 25, 2020

Fix two scripted_metric bugs (backport of #58547) #58565 (Merged)

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

nik9000 merged 2 commits into elastic:master from nik9000:scripted_metric_mem
