Speed up PK lookups at index time. by jpountz · Pull Request #19856 · elastic/elasticsearch

jpountz · 2016-08-08T10:05:49Z

At index time Elasticsearch needs to look up the version associated with the
_uid of the document that is being indexed, which is often the bottleneck for
indexing.

While reviewing the output of the jfr telemetry from a Rally benchmark, I saw
that significant time was spent in ConcurrentHashMap#get and ThreadLocal#get.
The reason is that we cache lookup objects per thread and segment, and for every
indexed document, we first need to look up the cache associated with this
segment (ConcurrentHashMap#get) and then get a state that is local to the
current thread (ThreadLocal#get). So if you are indexing N documents per
second and have S segments, both these methods will be called N*S times per
second.

This commit changes version lookup to use a cache per index reader rather than
per segment. While this makes cache entries live for less long, we now only need
to do one call to ConcurrentHashMap#get and ThreadLocal#get per indexed
document.

NOTE: I had to remove IndexReader#getCombinedCoreAndDeletesKey from the
forbidden APIs in order to make it work.

Here are some screenshots of the hot methods as reported when indexing 100M empty documents with rally using 1 shard, 0 replicas and otherwise default settings.
Before:

After:

ThreadLocal#get is no longer among the hottest methods after the change, and while ConcurrentHashMap#get is still there, its main callers are now the live version map and the circuit breaker, not the state we maintain for version lookups in the index.

Here is the report from rally compare on two races that have been run without telemetry (so that it does not affect the benchmark results). The speedup looks real.

                             Metric    Baseline    Contender                Diff
-----------------------------------  ----------  -----------  ------------------
   Min Indexing Throughput [docs/s]       67415        73665  +6250.00000
Median Indexing Throughput [docs/s]       73608      80540.3  +6932.33333
   Max Indexing Throughput [docs/s]       75151        81799  +6648.00000
                Indexing time [min]     113.854      94.9646    -18.88890
                   Merge time [min]     15.0901      11.0337     -4.05640
                 Refresh time [min]     1.52247      1.57633     +0.05387
                   Flush time [min]    0.901483     0.874733     -0.02675
          Merge throttle time [min]     5.95868       2.5531     -3.40558
       Median CPU usage (index) [%]     566.462      568.981     +2.51969
             Total Young Gen GC [s]     113.929      116.784     +2.85500
               Total Old Gen GC [s]      18.633       19.979     +1.34600
                    Index size [GB]     1.88654      1.87796     -0.00858
               Totally written [GB]     20.4271      19.9763     -0.45073
                      Segment count          22           34    +12.00000

jpountz · 2016-08-09T15:57:52Z

This is basically a revert of #14070. I'm stalling this on making cache keys / closed listeners less trappy.

elasticmachine · 2017-02-23T18:14:36Z

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

rjernst · 2017-06-09T05:27:38Z

@jpountz Is this PR worth reviving?

jpountz · 2017-06-09T06:21:40Z

Absolutely!

At index time Elasticsearch needs to look up the version associated with the `_id` of the document that is being indexed, which is often the bottleneck for indexing. While reviewing the output of the `jfr` telemetry from a Rally benchmark, I saw that significant time was spent in `ConcurrentHashMap#get` and `ThreadLocal#get`. The reason is that we cache lookup objects per thread and segment, and for every indexed document, we first need to look up the cache associated with this segment (`ConcurrentHashMap#get`) and then get a state that is local to the current thread (`ThreadLocal#get`). So if you are indexing N documents per second and have S segments, both these methods will be called N*S times per second. This commit changes version lookup to use a cache per index reader rather than per segment. While this makes cache entries live for less long, we now only need to do one call to `ConcurrentHashMap#get` and `ThreadLocal#get` per indexed document.

jpountz · 2017-06-14T18:16:33Z

I rebased this PR!

s1monw

left a small comment LGTM otherwise

s1monw · 2017-06-15T06:19:19Z

core/src/main/java/org/elasticsearch/common/lucene/uid/VersionsAndSeqNoResolver.java

-        IndexReader.CacheHelper cacheHelper = reader.getCoreCacheHelper();
-        CloseableThreadLocal<PerThreadIDVersionAndSeqNoLookup> ctl = lookupStates.get(cacheHelper.getKey());
+    private static PerThreadIDVersionAndSeqNoLookup[] getLookupState(IndexReader reader, String uidField) throws IOException {
+        // We cache on the top level


can you leave a comment why we moved to this? you can also reference this PR for documentation purposes

+1 I had more comments initially but they apparently got lost with the rebase

* master: Upgrade icu4j for the ICU analysis plugin to 59.1 (elastic#25243) move assertBusy to use CheckException (elastic#25246) Use SPI in High Level Rest Client to load XContent parsers (elastic#25098) [TEST] test that low level REST client leaves path untouched (elastic#25193) Speed up PK lookups at index time. (elastic#19856) [Docs] Fix documentation for percentiles bucket aggregation (elastic#25229) Upgrade to lucene-7.0.0-snapshot-92b1783. (elastic#25222) Build: Add master flag for disabling bwc tests (elastic#25230) Scripting: Rename SearchScript.needsScores to needs_score (elastic#25235) Support script context stateful factory in Painless. (elastic#25233) FastVectorHighlighter should not cache the field query globally (elastic#25197) Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (elastic#25223) Add more missing AggregationBuilder getters (elastic#25198) Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (elastic#25204)

* master: (44 commits) Upgrade icu4j for the ICU analysis plugin to 59.1 (elastic#25243) move assertBusy to use CheckException (elastic#25246) Use SPI in High Level Rest Client to load XContent parsers (elastic#25098) [TEST] test that low level REST client leaves path untouched (elastic#25193) Speed up PK lookups at index time. (elastic#19856) [Docs] Fix documentation for percentiles bucket aggregation (elastic#25229) Upgrade to lucene-7.0.0-snapshot-92b1783. (elastic#25222) Build: Add master flag for disabling bwc tests (elastic#25230) Scripting: Rename SearchScript.needsScores to needs_score (elastic#25235) Support script context stateful factory in Painless. (elastic#25233) FastVectorHighlighter should not cache the field query globally (elastic#25197) Remove QUERY_AND_FETCH BWC for pre-5.3.0 nodes (elastic#25223) Add more missing AggregationBuilder getters (elastic#25198) Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (elastic#25204) Refactor TransportShardBulkAction.executeUpdateRequest and add tests Make sure range queries are correctly profiled. (elastic#25108) Test: allow setting socket timeout for rest client (elastic#25221) Migration docs for elastic#25080 (elastic#25218) Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins elastic#25080 When stopping via systemd only kill the JVM, not its control group (elastic#25195) ...

jpountz added >enhancement review labels Aug 8, 2016

jpountz added stalled and removed review labels Aug 9, 2016

clintongormley added the :Core/Infra/Core Core issues without another label label Aug 11, 2016

jpountz force-pushed the enhancement/faster_version_lookups branch from 896fa1f to fc1db9e Compare June 14, 2017 18:14

jpountz removed the stalled label Jun 14, 2017

jpountz requested a review from s1monw June 14, 2017 18:16

s1monw approved these changes Jun 15, 2017

View reviewed changes

More comments

2ff243d

jpountz merged commit 5a6fa62 into elastic:master Jun 15, 2017

jpountz deleted the enhancement/faster_version_lookups branch June 15, 2017 08:17

jpountz added the v6.0.0 label Jun 15, 2017

colings86 added v6.0.0-beta1 and removed v6.0.0 labels Jul 31, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up PK lookups at index time.#19856

Speed up PK lookups at index time.#19856
jpountz merged 2 commits intoelastic:masterfrom
jpountz:enhancement/faster_version_lookups

jpountz commented Aug 8, 2016 •

edited

Loading

Uh oh!

jpountz commented Aug 9, 2016

Uh oh!

elasticmachine commented Feb 23, 2017

Uh oh!

rjernst commented Jun 9, 2017

Uh oh!

jpountz commented Jun 9, 2017

Uh oh!

jpountz commented Jun 14, 2017

Uh oh!

s1monw left a comment

Uh oh!

s1monw Jun 15, 2017

Uh oh!

jpountz Jun 15, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

jpountz commented Aug 8, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jpountz commented Aug 9, 2016

Uh oh!

elasticmachine commented Feb 23, 2017

Uh oh!

rjernst commented Jun 9, 2017

Uh oh!

jpountz commented Jun 9, 2017

Uh oh!

jpountz commented Jun 14, 2017

Uh oh!

s1monw left a comment

Choose a reason for hiding this comment

Uh oh!

s1monw Jun 15, 2017

Choose a reason for hiding this comment

Uh oh!

jpountz Jun 15, 2017

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

jpountz commented Aug 8, 2016 •

edited

Loading