Significantly Lower Monitoring HttpExport Memory Footprint #48854
Merged
original-brownbear merged 3 commits into elastic:master from original-brownbear:improve-monitoring-perf on Nov 11, 2019
Conversation
The `HttpExportBulk` exporter is using a lot more memory than it needs to by allocating buffers for serialization and IO:
* Remove copying of all bytes when flushing; instead use the stream wrapper
* Remove the copying step turning the BAOS into a `byte[]`
* This also avoids allocating a single huge `byte[]` and instead makes use of the internal paging logic of the `BytesStreamOutput`
* Don't allocate a new BAOS for every document; just keep appending to a single BAOS
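The bulleted changes can be sketched in plain Java. This is a simplified illustration, not the actual `HttpExportBulk` code: `java.io.ByteArrayOutputStream` stands in for Elasticsearch's paging `BytesStreamOutput`, and the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Sketch of the single-buffer approach: instead of serializing each
// monitoring document into its own ByteArrayOutputStream and copying the
// result into a byte[], every document is appended to one shared buffer,
// and the buffer is streamed straight to the request body on flush.
public class SingleBufferBulk {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Append one document's bulk lines; no per-document buffer is created.
    public void add(String indexLine, String sourceLine) throws IOException {
        out.write(indexLine.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
        out.write(sourceLine.getBytes(StandardCharsets.UTF_8));
        out.write('\n');
    }

    // Flush by streaming the accumulated bytes to the HTTP body stream,
    // avoiding a final toByteArray() copy of the whole payload.
    public void flushTo(OutputStream body) throws IOException {
        out.writeTo(body);
        out.reset();
    }
}
```

The key point is that `flushTo` uses `writeTo(OutputStream)` rather than `toByteArray()`, so the accumulated payload is never duplicated into a second heap-resident `byte[]` just to hand it to the HTTP client.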
Collaborator
Pinging @elastic/es-core-features (:Core/Features/Monitoring)
```java
    return BytesReference.toBytes(out.bytes());
} catch (Exception e) {
    logger.warn((Supplier<?>) () -> new ParameterizedMessage("failed to render document [{}], skipping it [{}]", doc, name), e);
```
Contributor
Author
This is hard to model with streaming IO, but to me catching the exception and appending an empty byte array seems bogus here and not worth retaining (there's no test covering this case, and we should simply make sure that documents whose serialization we fully control actually serialize correctly, shouldn't we?).
jakelandis requested changes on Nov 11, 2019
Contributor
jakelandis
left a comment
A couple questions...
jakelandis approved these changes on Nov 11, 2019
Contributor
jakelandis
left a comment
LGTM, thanks for addressing this.
Contributor
Author
npnp @jakelandis thanks for reviewing!
original-brownbear added a commit that referenced this pull request on Nov 12, 2019
…48966) The `HttpExportBulk` exporter is using a lot more memory than it needs to by allocating buffers for serialization and IO:
* Remove copying of all bytes when flushing; instead use the stream wrapper
* Remove the copying step turning the BAOS into a `byte[]`
* This also avoids allocating a single huge `byte[]` and instead makes use of the internal paging logic of the `BytesStreamOutput`
* Don't allocate a new BAOS for every document; just keep appending to a single BAOS
This was referenced Feb 3, 2020
original-brownbear added a commit that referenced this pull request on May 8, 2020
Even with the changes from #48854 we're still seeing significant (as in tens and hundreds of MB) buffer usage for bulk exports in some cases, which destabilizes master nodes. Since we need to know the serialized length of the bulk body, we can't do the serialization in a streaming manner (it's also not easily doable with the HTTP client API we're using anyway). => Let's at least serialize on heap in compressed form and decompress as we're streaming to the HTTP connection. For small requests this adds negligible overhead, but for large requests it reduces the size of the payload field by about an order of magnitude (empirically determined), which is a massive reduction when considering O(100MB) bulk requests.
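The buffer-in-compressed-form idea in the commit message above can be sketched with the JDK's gzip streams. This is a minimal illustration under assumed names (`CompressedBulkBuffer`, `add`, `payloadStream` are invented for the example, not the actual Elasticsearch implementation):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// The bulk body is kept on heap only in gzip-compressed form; the
// uncompressed payload never exists in full on the heap.
public class CompressedBulkBuffer {
    private final ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    private final GZIPOutputStream gzip;

    public CompressedBulkBuffer() throws IOException {
        gzip = new GZIPOutputStream(compressed);
    }

    // Documents are written through the gzip stream, so only the
    // compressed bytes accumulate on heap.
    public void add(String docLine) throws IOException {
        gzip.write(docLine.getBytes(StandardCharsets.UTF_8));
        gzip.write('\n');
    }

    // When sending, hand the HTTP client an InputStream that inflates on
    // the fly, decompressing as the request body is streamed out.
    public InputStream payloadStream() throws IOException {
        gzip.finish();
        return new GZIPInputStream(new ByteArrayInputStream(compressed.toByteArray()));
    }
}
```

For highly repetitive NDJSON monitoring documents, gzip easily reaches the order-of-magnitude reduction the commit message cites, at the cost of one inflate pass while writing the request body.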