Skip to content

Optimize GCS Mock#51593

Merged
original-brownbear merged 1 commit intoelastic:masterfrom
original-brownbear:more-efficient-gcs-mock-master
Jan 29, 2020
Merged

Optimize GCS Mock#51593
original-brownbear merged 1 commit intoelastic:masterfrom
original-brownbear:more-efficient-gcs-mock-master

Conversation

@original-brownbear
Copy link
Copy Markdown
Contributor

This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) byte[] allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy testSnapshotWithLargeSegmentFiles
run.

Closes #51446
Closes #50754

This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.

Closes #51446
@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.7.0 v7.6.1 labels Jan 29, 2020
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@original-brownbear
Copy link
Copy Markdown
Contributor Author

@tlrx if this doesn't help I think we have to find a way to down-scale these tests a little maybe. They're handling multiple MB of data at once which is a little tricky with 512M heaps on Cloud. Though with this change at least on my local even Java 8 runs on 7.x never go above ~250MB heap usage so maybe this will work :)

final int start = getContentRangeStart(range);
final int end = getContentRangeEnd(range);

final ByteArrayOutputStream out = new ByteArrayOutputStream() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the BAOS here was pretty brutal for multi MB requests :) Much nicer now with the 16kb chunks Streams.readFully(wrappedRequest) produces

final byte[] buffer = new byte[len];
Streams.readFully(stream, buffer);
content = Tuple.tuple(name, new BytesArray(buffer));
content = Tuple.tuple(name, fullRequestBody.slice(startPos, len));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though slice wastes some memory here for the wrapping around the actual blob content for smaller blobs, it's a huge efficiency gain to stick with 16kb chunks from the BytesReference compared to allocating a potentially MB sized array.

Copy link
Copy Markdown
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for taking care of this Armin

}
System.arraycopy(out.toByteArray(), 0, blob, start, Math.toIntExact(bytesRead));
blobs.put(blobName, new BytesArray(blob));
blob = new CompositeBytesReference(blob, Streams.readFully(wrappedRequest));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about this one 👍

@original-brownbear
Copy link
Copy Markdown
Contributor Author

Thanks Tanguy!

@original-brownbear original-brownbear merged commit 0952c21 into elastic:master Jan 29, 2020
@original-brownbear original-brownbear deleted the more-efficient-gcs-mock-master branch January 29, 2020 09:04
original-brownbear added a commit that referenced this pull request Jan 29, 2020
This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.

Closes #51446
Closes #50754
original-brownbear added a commit that referenced this pull request Jan 29, 2020
This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.

Closes #51446
Closes #50754
original-brownbear added a commit that referenced this pull request Jan 29, 2020
For small uploads (that can still be up to 5MB!) we needlessly
reading the `InputStream` into a BAOS which entailed allocating
the `byte[]` for the stream contents twice (because to `toByteArray` on the BAOS copies).

Also, for resumeable uploads we were needlessly wrapping the output channel and running each individual write in its own privileged context when we could just wrap the whole upload in a single privileged context.

Relates #51593
original-brownbear added a commit that referenced this pull request Jan 29, 2020
For small uploads (that can still be up to 5MB!) we needlessly
reading the `InputStream` into a BAOS which entailed allocating
the `byte[]` for the stream contents twice (because to `toByteArray` on the BAOS copies).

Also, for resumeable uploads we were needlessly wrapping the output channel and running each individual write in its own privileged context when we could just wrap the whole upload in a single privileged context.

Relates #51593
@polyfractal polyfractal added v7.6.0 and removed v7.6.1 labels Feb 7, 2020
@original-brownbear original-brownbear restored the more-efficient-gcs-mock-master branch August 6, 2020 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test Issues or PRs that are addressing/adding tests v7.6.0 v7.7.0 v8.0.0-alpha1

Projects

None yet

5 participants