
Simplify and Speed up some Compression Usage #60953

Merged
original-brownbear merged 3 commits into elastic:master from original-brownbear:faster-compression
Aug 11, 2020

Conversation

@original-brownbear
Contributor

Use thread-local buffers and `Deflater`/`Inflater` instances to speed up
compressing and decompressing in-memory bytes.
Not manually invoking `end()` on these should be safe: their off-heap memory
will eventually be reclaimed by the finalizer thread, which is not an issue for thread-locals
that are not instantiated at a high frequency.
This significantly reduces byte copying and object creation relative to the previous approach,
which created a fresh temporary buffer (then resized it multiple times during operations), copied
bytes out of that buffer into a freshly allocated `byte[]`, needlessly used 4k stream buffers when working with
bytes that are already in arrays (`writeTo` now handles writing efficiently to the compression logic), and so on.

Relates #57284, which should be helped by this change to some degree.
I also expect this change to speed up mapping/template updates a little, as those make heavy use of these
code paths.
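The reuse pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Elasticsearch implementation: the class and method names are hypothetical, and it assumes raw deflate streams (`nowrap = true`), matching the `new Inflater(true)` usage that appears in the reviewed code.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ThreadLocalCompression {

    // One native-backed Deflater/Inflater per thread, reset() between uses
    // instead of allocating (and later finalizing) a fresh instance per call.
    private static final ThreadLocal<Deflater> DEFLATER =
        ThreadLocal.withInitial(() -> new Deflater(Deflater.DEFAULT_COMPRESSION, true));
    private static final ThreadLocal<Inflater> INFLATER =
        ThreadLocal.withInitial(() -> new Inflater(true));

    // Reused per-thread scratch buffer; avoids creating (and repeatedly
    // resizing) a fresh temporary buffer on every call.
    private static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[4096]);

    public static byte[] compress(byte[] input) {
        Deflater deflater = DEFLATER.get();
        deflater.reset();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = SCRATCH.get();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static byte[] decompress(byte[] input) {
        Inflater inflater = INFLATER.get();
        inflater.reset();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = SCRATCH.get();
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input: nothing more to inflate
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupted compressed data", e);
        }
        return out.toByteArray();
    }
}
```

Because the per-thread `Deflater`/`Inflater` instances are never `end()`ed explicitly, their native memory lives as long as the owning thread; as noted above, that is acceptable for thread-locals that are created rarely.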

@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@elasticmachine added the `Team:Core/Infra` (Meta label for core/infra team) label Aug 11, 2020
@original-brownbear
Contributor Author

Jenkins run elasticsearch-ci/packaging-sample-windows (known Jenkins+Windows issue)

@original-brownbear
Contributor Author

Jenkins run elasticsearch-ci/1 (unrelated test failure #60954)

```diff
 }

-return Arrays.equals(uncompressed(), that.uncompressed());
+return uncompressed().equals(uncompressed());
```
Member

Suggested change

```diff
-return uncompressed().equals(uncompressed());
+return uncompressed().equals(that.uncompressed());
```

also looking at the code above, I wonder why we don't compare crc32 first before comparing the compressed byte arrays

Contributor Author

🤦 thanks for spotting.

> I wonder why we don't compare crc32 first before comparing the compressed byte arrays

I guess you could argue that in the equal case (i.e. equal uncompressed bytes) it's really unlikely that the compressed bytes aren't also equal, so comparing the checksum first would just add an extra int comparison in the equal case. If we assume the equal case dominates, the current version is better, though I have no idea whether that's true. It probably doesn't matter much in practice anyway, since byte array comparison on JDK 9+ is blazing fast via `jdk.internal.util.ArraysSupport#vectorizedMismatch`. :)

Member

> Probably doesn't matter much in practice

I agree that it doesn't matter much. Just a random thought that popped into my head :)
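For illustration, a checksum-first `equals` along the lines floated in this thread could look like the sketch below. `CompressedBytes` and its fields are hypothetical stand-ins, not the actual class under review; the point is only that an unequal CRC32 proves inequality cheaply, while an equal CRC32 still requires the full array comparison.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

public final class CompressedBytes {
    private final byte[] compressed;
    private final int crc32;

    public CompressedBytes(byte[] compressed) {
        this.compressed = compressed;
        CRC32 crc = new CRC32();
        crc.update(compressed, 0, compressed.length);
        this.crc32 = (int) crc.getValue();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CompressedBytes)) return false;
        CompressedBytes that = (CompressedBytes) o;
        // Cheap int comparison first: unequal checksums prove inequality
        // without touching the arrays; equal checksums can still collide,
        // so fall through to the full byte comparison.
        if (crc32 != that.crc32) return false;
        return Arrays.equals(compressed, that.compressed);
    }

    @Override
    public int hashCode() {
        return crc32;
    }
}
```

As discussed above, whether this wins depends on how often compared instances are actually equal: in the equal case the checksum check is pure overhead (one extra int comparison), and vectorized array comparison on modern JDKs makes the saving in the unequal case small.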

```java
    };
}

private static final ThreadLocal<Inflater> inflaterRef = ThreadLocal.withInitial(() -> new Inflater(true));
```
Member

IMO it would be a good idea to add a comment about why these thread-locals are not used in the other methods of the class.

Contributor Author

++

@jaymode (Member) left a comment

LGTM

@original-brownbear
Contributor Author

Thanks Jay!

@original-brownbear original-brownbear merged commit 9059cfb into elastic:master Aug 11, 2020
@original-brownbear original-brownbear deleted the faster-compression branch August 11, 2020 20:47
original-brownbear added a commit that referenced this pull request Aug 12, 2020
@original-brownbear original-brownbear restored the faster-compression branch December 6, 2020 19:00

Labels

:Core/Infra/Core Core issues without another label >non-issue Team:Core/Infra Meta label for core/infra team v7.10.0 v8.0.0-alpha1



4 participants