Memory efficient xcontent filtering (backport of #77154) #77653
Merged
elasticsearchmachine merged 3 commits into elastic:7.x (Sep 13, 2021)
Conversation
I found myself needing support for something like `filter_path` on `XContentParser`. It was simple enough to plug it in, so I did. Then I realized that it might offer more memory efficient source filtering (elastic#25168), so I put together a quick benchmark comparing it with the source filtering that we do in `_search`. Filtering using the parser is about 33% faster than how we filter now when you select a single field from a 300-byte document:

```
Benchmark                                          (excludes)  (includes)  (source)  Mode  Cnt     Score    Error  Units
FetchSourcePhaseBenchmark.filterObjects                        message     short     avgt    5  2360.342 ±  4.715  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder              message     short     avgt    5  2010.278 ± 15.042  ns/op
FetchSourcePhaseBenchmark.filterXContentOnParser               message     short     avgt    5  1588.446 ± 18.593  ns/op
```

The top line is the way we filter now. The middle line is adding a filter to `XContentBuilder` - something we can do right now without any of my plumbing work. The bottom line is filtering on the parser, requiring all the new plumbing.

This isn't particularly impressive. 33% *sounds* great! But 700 nanoseconds per document isn't going to cut into anyone's search times. If you fetch a thousand documents, that's 0.7 milliseconds of savings. But we mostly advise folks to use source filtering on fetch when the source is large and you only want a small part of it. So I tried when the source is about 4.3kb and you want a single field:

```
Benchmark                                          (excludes)  (includes)  (source)      Mode  Cnt     Score     Error  Units
FetchSourcePhaseBenchmark.filterObjects                        message     one_4k_field  avgt    5  5957.128 ± 117.402  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder              message     one_4k_field  avgt    5  4999.073 ±  96.003  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser               message     one_4k_field  avgt    5  3261.478 ±  48.879  ns/op
```

That's 45% faster. Put another way, 2.7 microseconds a document. Not bad!
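As a language-agnostic sketch of what the `filterObjects` approach has to do, here is a toy `filter_path`-style include filter in Python (the names and shapes here are invented for illustration; the real implementation is Java and also handles wildcards, excludes, and arrays). The point to notice is that the whole source has to be parsed into objects before anything can be pruned:

```python
import json

def filter_source(obj, include):
    """Keep only the part of a parsed document matching a dotted,
    filter_path-style include such as "message" or "user.name".
    Simplified sketch: no wildcards, no excludes, no arrays."""
    head, _, rest = include.partition(".")
    if isinstance(obj, dict):
        if head not in obj:
            return {}
        child = obj[head] if not rest else filter_source(obj[head], rest)
        if child == {} and rest:
            return {}  # the nested path didn't match anything
        return {head: child}
    return {}

# The entire document is materialized before filtering drops anything.
doc = json.loads('{"message": "hi", "user": {"name": "a", "id": 7}, "big": [1, 2, 3]}')
print(filter_source(doc, "message"))    # {'message': 'hi'}
print(filter_source(doc, "user.name"))  # {'user': {'name': 'a'}}
```

With a 4 MB source, `json.loads` (like parsing into maps on the Java side) allocates the full object tree even though almost all of it is thrown away immediately, which is what the allocation numbers below make visible.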
But have a look at how things come out when you want a single field from a 4 *megabyte* document:

```
Benchmark                                          (excludes)  (includes)  (source)      Mode  Cnt        Score       Error  Units
FetchSourcePhaseBenchmark.filterObjects                        message     one_4m_field  avgt    5  8266343.036 ± 176197.077  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder              message     one_4m_field  avgt    5  6227560.013 ±  68306.318  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser               message     one_4m_field  avgt    5  1617153.472 ±  80164.547  ns/op
```

These documents are very large. I've encountered documents like them in real life, but they've always been the outlier for me. But a 6.5 millisecond per document savings ain't anything to sneeze at. Take a look at what you get when I turn on gc metrics:

```
FetchSourcePhaseBenchmark.filterObjects                           message  one_4m_field  avgt  5  7036097.561 ± 84721.312  ns/op
FetchSourcePhaseBenchmark.filterObjects:·gc.alloc.rate            message  one_4m_field  avgt  5     2166.613 ±    25.975  MB/sec
FetchSourcePhaseBenchmark.filterXContentOnBuilder                 message  one_4m_field  avgt  5  6104595.992 ± 55445.508  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder:·gc.alloc.rate  message  one_4m_field  avgt  5     2496.978 ±    22.650  MB/sec
FetchSourcePhaseBenchmark.filterXContentonParser                  message  one_4m_field  avgt  5  1614980.846 ± 31716.956  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser:·gc.alloc.rate   message  one_4m_field  avgt  5        1.755 ±     0.035  MB/sec
```

Filtering on the parser allocates about 1.8 MB/sec, against more than 2 GB/sec for either of the other two approaches.
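To illustrate why the parser-side approach allocates almost nothing, here is a toy sketch that filters a stream of parse events, skipping unmatched subtrees without ever materializing them. The event format and function names are invented for this illustration; the real code rides on Jackson's token stream inside `XContentParser`:

```python
def skip_subtree(events, i):
    """Advance past the value starting at events[i] without building it."""
    depth = 0
    while True:
        ev = events[i][0]
        if ev in ("start_object", "start_array"):
            depth += 1
        elif ev in ("end_object", "end_array"):
            depth -= 1
        i += 1
        if depth == 0:
            return i

def filter_stream(events, include):
    """Pull one top-level scalar field out of an event stream."""
    i = 1  # step past the outer start_object
    while events[i][0] != "end_object":
        name = events[i][1]          # ("field_name", name)
        if name == include:
            return events[i + 1][1]  # ("value", v) for scalar fields
        i = skip_subtree(events, i + 1)
    return None

# {"big": {"x": 1}, "message": "hi"} as an event stream. Imagine "big"
# holding the 4 MB payload: it is skipped token by token, never built.
events = [
    ("start_object", None),
    ("field_name", "big"),
    ("start_object", None),
    ("field_name", "x"), ("value", 1),
    ("end_object", None),
    ("field_name", "message"), ("value", "hi"),
    ("end_object", None),
]
print(filter_stream(events, "message"))  # hi
```

Nothing is allocated for the skipped subtree beyond the tokens themselves, which is why the gc.alloc.rate for the parser variant above is three orders of magnitude lower.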