Skip to content

Fix several potential circuit breaker leaks in Aggregators#79676

Merged
iverase merged 10 commits intoelastic:masterfrom
iverase:AggregatorTestCase
Nov 23, 2021
Merged

Fix several potential circuit breaker leaks in Aggregators#79676
iverase merged 10 commits intoelastic:masterfrom
iverase:AggregatorTestCase

Conversation

@iverase
Copy link
Copy Markdown
Contributor

@iverase iverase commented Oct 22, 2021

This PR adds a new CircuitBreaker implementation in the test that throws CircuitBreakerExceptions randomly. This new circuit breaker helps uncover several places where we might leak if the circuit breaker throws such exception.

A typical case is an object that creates two big arrays in its constructor. This object itself is created in an aggregator constructor. If the circuit breaker throws an exception during the creation of the second array, we need especial code to handle closing the first array or we leak.

This leaks are probably rare and the effect is small but potentially can decrease the performance of a cluster over time.

@iverase iverase requested a review from nik9000 October 22, 2021 11:33
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 22, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

# Conflicts:
#	test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java
@iverase
Copy link
Copy Markdown
Contributor Author

iverase commented Oct 26, 2021

@elasticmachine run elasticsearch-ci/packaging-tests-unix-sample

# Conflicts:
#	test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java
@iverase iverase added the v8.1.0 label Nov 2, 2021
Copy link
Copy Markdown
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there was a cleaner looking way to do this. But it does seem right. I left a question about one case that looked different from the others.

/** Release the provided {@link Releasable}s. */
public static void close(Releasable... releasables) {
close(Arrays.asList(releasables));
if (releasables != null) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of rude to pass a null list here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the null check to the composite aggregator that was causing this issue.

if (success == false) {
Releasables.close(metricValues);
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's more clear to do this with one try block.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to one try block in a28511e

boolean success = false;
try {
long[] mergeMap = new long[Math.toIntExact(oldOrds.size())];
bucketOrds = new LongKeyedBucketOrds.FromMany(bigArrays());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem here is that we may close oldOrds twice, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed that's the problem. We should not close oldOrds until we successfully allocated bucketOrds andwe need to close it at the end.

@iverase
Copy link
Copy Markdown
Contributor Author

iverase commented Nov 11, 2021

Hey @nik9000,

Do you think we still want to pursue this change?

@nik9000
Copy link
Copy Markdown
Member

nik9000 commented Nov 22, 2021

Do you think we still want to pursue this change?

I thought I replied to this last week..... I liked it. let me double check.

Copy link
Copy Markdown
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@iverase
Copy link
Copy Markdown
Contributor Author

iverase commented Nov 23, 2021

@elasticmachine run elasticsearch-ci/part-2

@iverase iverase merged commit 6de5edd into elastic:master Nov 23, 2021
@iverase iverase deleted the AggregatorTestCase branch November 23, 2021 08:11
iverase added a commit to iverase/elasticsearch that referenced this pull request Nov 23, 2021
…9676)

This commit adds a new CircuitBreaker implementation in the test that throws CircuitBreaker
Exceptions randomly. This new circuit breaker helps uncover several places where we might
leak if the circuit breaker throws such exception.
iverase added a commit to iverase/elasticsearch that referenced this pull request Nov 23, 2021
…9676)

This commit adds a new CircuitBreaker implementation in the test that throws CircuitBreaker
Exceptions randomly. This new circuit breaker helps uncover several places where we might
leak if the circuit breaker throws such exception.
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💚 Backport successful

Status Branch Result
8.0
7.16

elasticsearchmachine pushed a commit that referenced this pull request Nov 23, 2021
…80928)

This commit adds a new CircuitBreaker implementation in the test that throws CircuitBreaker
Exceptions randomly. This new circuit breaker helps uncover several places where we might
leak if the circuit breaker throws such exception.
elasticsearchmachine pushed a commit that referenced this pull request Nov 23, 2021
…80929)

This commit adds a new CircuitBreaker implementation in the test that throws CircuitBreaker
Exceptions randomly. This new circuit breaker helps uncover several places where we might
leak if the circuit breaker throws such exception.
@nik9000
Copy link
Copy Markdown
Member

nik9000 commented Nov 23, 2021

@iverase iverase merged commit 6de5edd into elastic:master 6 hours ago

Thanks for dragging me along with this. I think it's a good change.

@iverase
Copy link
Copy Markdown
Contributor Author

iverase commented Nov 23, 2021

I do hope that no more leaks can escape our testing :)

weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 23, 2021
* upstream/master:
  Cleanup SLM History Item .equals (elastic#80938)
  Rework breaking changes for new structure (elastic#80907)
  [DOCS] Fix elasticsearch-reset-password typo (elastic#80919)
  [ML] No need to use parent task client when internal infer delegates (elastic#80905)
  Fix shadowed vars pt6 (elastic#80899)
  add ignore info (elastic#80924)
  Fix several potential circuit breaker leaks in Aggregators (elastic#79676)

# Conflicts:
#	server/src/internalClusterTest/java/org/elasticsearch/index/TimeSeriesModeIT.java
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 24, 2021
* upstream/master: (29 commits)
  Fix typo (elastic#80925)
  Increase docker compose timeouts for CI builds
  TSDB: fix error without feature flag (elastic#80945)
  [DOCS] Relocate `index.mapping.dimension_fields.limit` setting docs (elastic#80964)
  Explicit write methods for always-missing values (elastic#80958)
  TSDB: move TimeSeriesModeIT to yaml tests (elastic#80933)
  [ML] Removing temporary debug (elastic#80956)
  Remove unused ConnectTransportException#node (elastic#80944)
  Reinterpret dots in field names as object structure (elastic#79922)
  Remove obsolete typed legacy index templates (elastic#80937)
  Remove unnecessary shuffle in unassigned shards allocation. (elastic#65172)
  TSDB: Tests for nanosecond timeprecision timestamp just beyond the limit (elastic#80932)
  Cleanup SLM History Item .equals (elastic#80938)
  Rework breaking changes for new structure (elastic#80907)
  [DOCS] Fix elasticsearch-reset-password typo (elastic#80919)
  [ML] No need to use parent task client when internal infer delegates (elastic#80905)
  Fix shadowed vars pt6 (elastic#80899)
  add ignore info (elastic#80924)
  Fix several potential circuit breaker leaks in Aggregators (elastic#79676)
  Extract more standard metadata from binary files (elastic#78754)
  ...
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Nov 24, 2021
* upstream/master: (319 commits)
  Fix typo (elastic#80925)
  Increase docker compose timeouts for CI builds
  TSDB: fix error without feature flag (elastic#80945)
  [DOCS] Relocate `index.mapping.dimension_fields.limit` setting docs (elastic#80964)
  Explicit write methods for always-missing values (elastic#80958)
  TSDB: move TimeSeriesModeIT to yaml tests (elastic#80933)
  [ML] Removing temporary debug (elastic#80956)
  Remove unused ConnectTransportException#node (elastic#80944)
  Reinterpret dots in field names as object structure (elastic#79922)
  Remove obsolete typed legacy index templates (elastic#80937)
  Remove unnecessary shuffle in unassigned shards allocation. (elastic#65172)
  TSDB: Tests for nanosecond timeprecision timestamp just beyond the limit (elastic#80932)
  Cleanup SLM History Item .equals (elastic#80938)
  Rework breaking changes for new structure (elastic#80907)
  [DOCS] Fix elasticsearch-reset-password typo (elastic#80919)
  [ML] No need to use parent task client when internal infer delegates (elastic#80905)
  Fix shadowed vars pt6 (elastic#80899)
  add ignore info (elastic#80924)
  Fix several potential circuit breaker leaks in Aggregators (elastic#79676)
  Extract more standard metadata from binary files (elastic#78754)
  ...
@iverase iverase mentioned this pull request Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.16.0 v8.0.0-rc1 v8.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants