[Extension] oak-incremental-index: Low resource (RAM and CPU) incremental-index implementation using off-heap key/value map (OakMap) by liran-funaro · Pull Request #10001 · apache/druid

liran-funaro · 2020-06-08T14:07:26Z

Description

This PR introduces an extension that improves Druid’s ingestion memory and CPU efficiency. It uses 60% less memory and 50% less CPU-time to achieve the same performance. This translates to nearly double the system's ingestion throughput with the same memory budget, and a 75% increase in throughput with the same CPU-time budget. The experimental setup and the full results are available here.

To understand the motivation and rationale behind some of the proposed changes below, it is necessary to read the related issue: #9967.

Introduce `OakIncrementalIndex`

We add a new incremental index implementation: OakIncrementalIndex as a Druid extension. The implementation is mostly borrowed from OnheapIncrementalIndex and OffheapIncrementalIndex, but has a few notable differences:

- It stores both keys and values off-heap (as opposed to the off-heap implementation, which stores only the values off-heap).
- It is based on OakMap instead of Java’s ConcurrentSkipListMap (CSL).
- It does not need to keep a mapping from row index to an actual row.
- It is always ordered (as expected by FactsHolder.persistIterable()), even in plain mode.
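
As a rough illustration of the first, second, and fourth points above (not the extension's actual code), the sketch below keeps both keys and aggregated values in direct (off-heap) `ByteBuffer`s and orders them with a comparator that reads the serialized bytes. It uses only JDK classes as stand-ins: `ConcurrentSkipListMap` plays the role of the ordered map, and the class name, key layout (a single timestamp), and sum aggregator are all assumptions made for the example. OakMap manages its own off-heap memory and has a different API.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Illustrative only: JDK stand-ins, not the OakMap or Druid API. */
final class OffHeapOrderedIndexSketch {
    // Compare keys by the timestamp stored at offset 0, straight from the buffer,
    // without materializing an on-heap key object.
    static final Comparator<ByteBuffer> BY_TIMESTAMP =
        Comparator.comparingLong((ByteBuffer b) -> b.getLong(0));

    // Ordered map whose keys and values both live in direct (off-heap) buffers.
    private final ConcurrentSkipListMap<ByteBuffer, ByteBuffer> facts =
        new ConcurrentSkipListMap<>(BY_TIMESTAMP);

    /** Adds `metric` to the running sum for `timestamp`, creating the row if needed. */
    void add(long timestamp, long metric) {
        // Simplification: a real implementation would not allocate a buffer per call.
        ByteBuffer key = ByteBuffer.allocateDirect(Long.BYTES);
        key.putLong(0, timestamp);
        // Direct buffers are zero-initialized, so a new row starts with sum 0.
        ByteBuffer value =
            facts.computeIfAbsent(key, k -> ByteBuffer.allocateDirect(Long.BYTES));
        synchronized (value) {
            // Aggregate in place inside the off-heap value buffer.
            value.putLong(0, value.getLong(0) + metric);
        }
    }

    /** Returns "timestamp=sum" pairs in key order. */
    List<String> snapshot() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<ByteBuffer, ByteBuffer> e : facts.entrySet()) {
            out.add(e.getKey().getLong(0) + "=" + e.getValue().getLong(0));
        }
        return out;
    }

    public static void main(String[] args) {
        OffHeapOrderedIndexSketch index = new OffHeapOrderedIndexSketch();
        index.add(30L, 5L);
        index.add(10L, 1L);
        index.add(30L, 7L); // rolls up into the existing row for timestamp 30
        index.add(20L, 2L);
        System.out.println(index.snapshot()); // [10=1, 20=2, 30=12]
    }
}
```

Because iteration always follows key order, no separate row-index-to-row mapping is needed to produce sorted output in this sketch.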

To achieve the best performance of our implementation, we had to refactor some interfaces of IncrementalIndexRow and IncrementalIndex. This refactoring is explained in #9967.

Key changed/added classes in this commit

Added everything under druid/extensions-contrib/oak-incremental-index. Most notable additions:
- OakIncrementalIndex: follows the IncrementalIndex API
- OakIncrementalIndexRow: follows the IncrementalIndexRow API
- OakKey: handles the serialization, deserialization, and comparison of keys
Updated benchmarks to evaluate our implementation.
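
To make the role of a key class such as OakKey more concrete, here is a minimal sketch of serializing, deserializing, and comparing a key directly on its bytes. The layout (timestamp, dimension count, dictionary-encoded dimension ids) and every name in the block are assumptions for illustration, not the real OakKey format.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical key layout for illustration; not the actual OakKey format. */
final class KeyCodecSketch {
    // Layout: [timestamp: long][dimCount: int][dim ids: int * dimCount]
    static ByteBuffer serialize(long timestamp, int[] dims) {
        ByteBuffer buf = ByteBuffer.allocateDirect(
            Long.BYTES + Integer.BYTES + Integer.BYTES * dims.length);
        buf.putLong(timestamp).putInt(dims.length);
        for (int d : dims) {
            buf.putInt(d);
        }
        buf.flip();
        return buf;
    }

    /** Reads the timestamp field without copying the key on-heap. */
    static long timestamp(ByteBuffer key) {
        return key.getLong(0);
    }

    /** Deserializes the dimension ids into an on-heap array. */
    static int[] dims(ByteBuffer key) {
        int n = key.getInt(Long.BYTES);
        int[] dims = new int[n];
        for (int i = 0; i < n; i++) {
            dims[i] = key.getInt(Long.BYTES + Integer.BYTES * (i + 1));
        }
        return dims;
    }

    // Orders by timestamp first, then by dimension ids lexicographically;
    // on a common prefix, the key with fewer dimensions sorts first.
    static final Comparator<ByteBuffer> COMPARATOR = (a, b) -> {
        int c = Long.compare(a.getLong(0), b.getLong(0));
        if (c != 0) {
            return c;
        }
        int na = a.getInt(Long.BYTES);
        int nb = b.getInt(Long.BYTES);
        for (int i = 0; i < Math.min(na, nb); i++) {
            int off = Long.BYTES + Integer.BYTES * (i + 1);
            c = Integer.compare(a.getInt(off), b.getInt(off));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(na, nb);
    };

    public static void main(String[] args) {
        ByteBuffer a = serialize(10L, new int[] {1, 2});
        ByteBuffer b = serialize(10L, new int[] {1, 3});
        System.out.println(timestamp(a));                // 10
        System.out.println(Arrays.toString(dims(a)));    // [1, 2]
        System.out.println(COMPARATOR.compare(a, b) < 0); // true
    }
}
```

Comparing on the serialized form is what lets an ordered off-heap map sort rows without deserializing each key during a lookup or insert.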

This PR has:

- been self-reviewed.
- added documentation for new or modified features or behaviors.
- added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- added or updated version, license, or notice information in licenses.yaml
- added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
- added unit tests or modified existing tests to cover new code paths.
- added integration tests.
- been tested in a test Druid cluster.

lgtm-com · 2020-06-08T15:23:48Z

This pull request introduces 3 alerts when merging eed0606a67f89b653e8cc402eb0fe36af511de30 into 45b699f - view on LGTM.com

new alerts:

3 for Result of multiplication cast to wider type
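
For readers unfamiliar with this alert class, the generic sketch below shows the pattern it usually flags: two `int` operands are multiplied in 32-bit arithmetic and only the (possibly overflowed) result is widened to `long`. The method names and numbers are made up for the example; the thread does not show which lines in this PR triggered the three alerts.

```java
/** Generic illustration of the alert class; not code taken from this PR. */
final class WideningMultiplySketch {
    /** Flagged pattern: the product is computed as an int, then widened to long. */
    static long bytesNeededBuggy(int rows, int bytesPerRow) {
        return rows * bytesPerRow;
    }

    /** Usual fix: widen one operand first so the multiplication runs in 64 bits. */
    static long bytesNeeded(int rows, int bytesPerRow) {
        return (long) rows * bytesPerRow;
    }

    public static void main(String[] args) {
        int rows = 1_000_000;
        int bytesPerRow = 4_096;
        System.out.println(bytesNeededBuggy(rows, bytesPerRow)); // -198967296 (overflowed)
        System.out.println(bytesNeeded(rows, bytesPerRow));      // 4096000000
    }
}
```

When overflow should be treated as an error rather than avoided by widening, `Math.multiplyExact` throws an `ArithmeticException` instead of wrapping silently.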

liran-funaro · 2020-06-09T21:53:41Z

Hi @jihoonson, @leventov, @gianm, @ebortnik, @b-slim, and @jon-wei.
Since you had an interest in our previous discussions regarding OakIncrementalIndex, we wanted to update you that our new and improved implementation (presented in this PR) reduces Druid's CPU and RAM consumption by over 50% during the ingestion process. This translates to nearly double the system's ingestion-throughput with the same CPU and RAM budget. Please check out this PR and its related issue (#9967), as well as our detailed system-level experiments.

jihoonson · 2020-06-09T22:04:02Z

Glad you guys are still working on this! I will take a look soon.

liran-funaro · 2020-06-12T13:15:55Z

@jihoonson Thank you. We are eager to hear your valuable feedback.

liran-funaro · 2020-06-12T13:35:50Z

@clintropolis We noticed that your recent commits modified the same benchmarks as we did in this PR (commit #a607c95). Since you are familiar with this part of Druid, we would appreciate it if you could take the time to review the benchmark part of this PR. This commit alone can contribute to Druid, since it resolves issues we had with the benchmarks. You can find a summary of the modifications here.

liran-funaro · 2020-06-15T14:51:18Z

We noticed a lot of Druid users run their workload on Amazon EC2. We want to point out that this PR will not only improve performance but will also reduce operational costs by allowing the users to choose more affordable EC2 instances without sacrificing performance.

The figure below shows the operational cost of different required ingestion throughput on Amazon EC2.


liran-funaro · 2020-06-19T11:39:29Z

Thanks to @yuanlihan for helping us find bugs on the realtime query side. Our system experiments focused on batch ingestion, so his contribution is highly appreciated.

liran-funaro · 2020-06-25T12:40:10Z

Hi @jihoonson, have you had a chance to check out our issue/PR? We will be happy to answer any questions you might have.

liran-funaro · 2020-07-20T11:01:57Z

Updates

We are working with our production teams at Verizon Media toward testing our incremental-index implementation on actual production data. As part of this effort, we discovered some issues with: (1) sketches, and (2) scenarios where multiple indexes are used during ingestion in a single Peon.
We just updated the PR to solve these issues. Please let us know if you encounter similar issues (or others) and whether this update resolves them.


stale · 2022-05-02T19:09:07Z

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

didip · 2022-05-02T19:13:35Z

Please don't close this PR.

stale · 2022-05-02T19:13:37Z

This issue is no longer marked as stale.

exherb · 2022-06-13T08:21:37Z

Index performance is critical for us. Too many tasks mean too many segments.

liran-funaro · 2022-07-10T12:42:34Z

> Index performance is very critical for us. Too many tasks mean too many segments.

@exherb For batch ingestion, each task can ingest different time periods. Those can be merged when the task is done, similar to how we merge the results from periodic flushes. So we can benefit from better throughput without sacrificing the number of segments.

exherb · 2022-07-20T23:20:24Z

> > Index performance is very critical for us. Too many tasks mean too many segments.

> @exherb For batch ingestion, each task can ingest different time periods. Those can be merged when the task is done, similar to how we merge the results from periodic flushes. So we can benefit from better throughput without sacrificing the number of segments.

It takes about 3 hours to merge for us.
Is this pull request ready?

liran-funaro · 2022-07-21T06:53:03Z

> It takes about 3 hours to merge for us.

@exherb In my experiments, merging fewer but larger segments takes about the same amount of time as merging many small segments. Even so, this extension enables more data to be ingested into memory before it is flushed to disk, resulting in larger segments.

> Is this pull request ready?

@exherb Yes, except for some conflicts that I can resolve on request.

github-actions · 2023-09-05T00:15:06Z

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

github-actions · 2023-10-03T00:15:33Z

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

liran-funaro force-pushed the pr-oak-ii branch 3 times, most recently from 4156fee to 7495ac9 on June 8, 2020 17:51

clintropolis added the Area - Batch Ingestion, Area - Streaming Ingestion, and Performance labels on Jun 9, 2020

liran-funaro force-pushed the pr-oak-ii branch from 7495ac9 to 990edfc on June 10, 2020 13:50

yuanlihan reviewed Jun 18, 2020

server/src/main/java/org/apache/druid/segment/realtime/plumber/Sink.java

liran-funaro force-pushed the pr-oak-ii branch 2 times, most recently from 2aac78d to 0e313e8 on June 18, 2020 13:19

liran-funaro force-pushed the pr-oak-ii branch 2 times, most recently from 95d95db to 663f155 on June 21, 2020 15:35

liran-funaro force-pushed the pr-oak-ii branch from 663f155 to 84c2922 on June 28, 2020 11:20

liran-funaro force-pushed the pr-oak-ii branch 2 times, most recently from 29e8453 to 9c83a01 on July 9, 2020 15:29

liran-funaro force-pushed the pr-oak-ii branch from 9c83a01 to 28603e0 on July 20, 2020 10:44

liran-funaro force-pushed the pr-oak-ii branch from 28603e0 to d6bff58 on July 20, 2020 11:44

liran-funaro added 6 commits January 12, 2022 14:40

Fix doc

14247ce

Merge branch 'master' into pr-oak-ii

7875f95

Update the latest release of OAk

530a9a6

Merge remote-tracking branch 'upstream/master' into oak-ii-new-oak-ve…

437a0dd

…rsion # Conflicts: # pom.xml

Merge remote-tracking branch 'upstream/master' into pr-oak-ii

ccf1187

Merge branch 'oak-ii-new-oak-version' into pr-oak-ii

8178c26

pjain1 reviewed Jan 28, 2022

extensions-contrib/oak-incremental-index/pom.xml (outdated)

benchmarks/pom.xml (outdated)

...al-index/src/main/java/org/apache/druid/segment/incremental/oak/OakIncrementalIndexSpec.java (outdated)

liran-funaro and others added 2 commits January 30, 2022 13:59

Fix groupId

998cca3

Add extension docs

b6ee2b0

liran-funaro force-pushed the pr-oak-ii branch from ebef5f6 to b6ee2b0 on January 31, 2022 13:11

liran-funaro added 7 commits January 31, 2022 15:25

Fix Oak license

d209564

Change default maxBytesInMemory and change docs accordingly.

5bd946c

Merge remote-tracking branch 'upstream/master' into pr-oak-ii

ffe32c3

# Conflicts: # processing/src/main/java/org/apache/druid/segment/incremental/IncrementalIndex.java # processing/src/main/java/org/apache/druid/segment/incremental/OnheapIncrementalIndex.java

Merge remote-tracking branch 'upstream/master' into pr-oak-ii

9bbfdd8

Upgrade Oak version

9c247d9

Merge remote-tracking branch 'upstream/master' into pr-oak-ii

289a9e6

Upgrade Oak version license

30f9219

stale bot added the stale label May 2, 2022

stale bot removed the stale label May 2, 2022

github-actions bot added the stale label Sep 5, 2023

github-actions bot closed this Oct 3, 2023

10 participants
