Skip to content

[Extension] oak-incremental-index: Low resource (RAM and CPU) incremental-index implementation using off-heap key/value map (OakMap)#10001

Closed
liran-funaro wants to merge 46 commits intoapache:masterfrom
liran-funaro:pr-oak-ii
Closed

[Extension] oak-incremental-index: Low resource (RAM and CPU) incremental-index implementation using off-heap key/value map (OakMap)#10001
liran-funaro wants to merge 46 commits intoapache:masterfrom
liran-funaro:pr-oak-ii

Conversation

@liran-funaro
Copy link
Contributor

@liran-funaro liran-funaro commented Jun 8, 2020

Fixes #9967.

Description

This PR introduces an extension that improves Druid’s ingestion memory and CPU efficiency. It uses 60% less memory and 50% less CPU-time to achieve the same performance. This translated to nearly double the system's ingestion-throughput with the same memory budget, and a 75% increase in throughput with the same CPU-time budget. The experimental setup and the full results are available here.

To understand the motivation and rationale behind some of the proposed changes below, it is necessary to read the related issue: #9967.

Introduce OakIncrementalIndex

We add a new incremental index implementation: OakIncrementalIndex as a Druid extension. The implementation is mostly borrowed from OnheapIncrementalIndex and OffheapIncrementalIndex, but has a few notable differences:

  1. It stores both keys and values off-heap (as opposed to the off-heap implementation that stores only the values off-heap)
  2. It is based on OakMap instead of Java’s ConcurrentSkipList (CSL)
  3. It does not need to keep a mapping from row index to an actual row
  4. It is always ordered (as expected by FactsHolder.persistIterable()), even in plain mode

To achieve the best performance of our implementation, we had to refactor some interfaces of IncrementalIndexRow and IncrementalIndex. This refactoring is explained in #12122.

Key changed/added classes in this commit
  • Added everything under druid/extensions-contrib/oak-incremental-index. Most notable additions:
    • OakIncrementalIndex: follows the IncrementalIndex API
    • OakIncrementalIndexRow: follows the IncrementalIndexRow API
    • OakKey: handles the serialization, deserialization, and comparison of keys
  • Updated benchmarks to evaluate our implementation.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

@lgtm-com
Copy link

lgtm-com bot commented Jun 8, 2020

This pull request introduces 3 alerts when merging eed0606a67f89b653e8cc402eb0fe36af511de30 into 45b699f - view on LGTM.com

new alerts:

  • 3 for Result of multiplication cast to wider type

@liran-funaro
Copy link
Contributor Author

Hi @jihoonson, @leventov, @gianm, @ebortnik, @b-slim, and @jon-wei.
Since you had an interest in our previous discussions regarding OakIncrementalIndex, we wanted to update you that our new and improved implementation (presented in this PR) reduces Druid's CPU and RAM consumption by over 50% during the ingestion process. This translates to nearly double the system's ingestion-throughput with the same CPU and RAM budget. Please check out this PR and its related issue (#9967), as well as our detailed system-level experiments.

@jihoonson
Copy link
Contributor

Glad you guys are still working on this! I will take a look soon.

@liran-funaro
Copy link
Contributor Author

@jihoonson Thank you. We are eager to hear your valuable feedback.

@liran-funaro
Copy link
Contributor Author

@clintropolis We noticed your recent commits modified the same benchmarks as we did in this PR (commit #a607c95). Since you are familiar with this part of Druid, we will appreciate it if you can take the time to review the benchmark part of this PR. This commit alone can contribute to Druid since it resolves issues we had with the benchmarks. You can find a summary of the modifications here.

@liran-funaro
Copy link
Contributor Author

We noticed a lot of Druid users run their workload on Amazon EC2. We want to point out that this PR will not only improve performance but will also reduce operational costs by allowing the users to choose more affordable EC2 instances without sacrificing performance.

The figure below shows the operational cost of different required ingestion throughput on Amazon EC2.
image

@liran-funaro liran-funaro force-pushed the pr-oak-ii branch 2 times, most recently from 2aac78d to 0e313e8 Compare June 18, 2020 13:19
@liran-funaro
Copy link
Contributor Author

Thanks to @yuanlihan for helping us find bugs on the realtime query side. Our system experiments focused on batch ingestion, so his contribution is highly appreciated.

@liran-funaro liran-funaro force-pushed the pr-oak-ii branch 2 times, most recently from 95d95db to 663f155 Compare June 21, 2020 15:35
@liran-funaro
Copy link
Contributor Author

Hi @jihoonson, have you had a chance to check out our issue/PR? We will be happy to answer any questions you might have.

@liran-funaro
Copy link
Contributor Author

Updates

We are working with our production teams at Verizon Media toward testing our incremental-index implementation on actual production data. As part of this effort, we discovered some issues with: (1) sketches, and (2) scenarios where multiple indexes are used during ingestion in a single Peon.
We just updated the PR to solves these issues. Please let us know if you encounter similar issues (or others) and if this update solves these issues.

@stale
Copy link

stale bot commented May 2, 2022

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label May 2, 2022
@didip
Copy link
Contributor

didip commented May 2, 2022

Please don't close this PR.

@stale
Copy link

stale bot commented May 2, 2022

This issue is no longer marked as stale.

@stale stale bot removed the stale label May 2, 2022
@exherb
Copy link
Contributor

exherb commented Jun 13, 2022

Index performance is very critical for us.Too many tasks means too many segments.

@liran-funaro
Copy link
Contributor Author

Index performance is very critical for us. Too many tasks mean too many segments.

@exherb For batch ingestion, each task can ingest different time periods. Those can be merged when the task is done, similar to how we merge the results from periodic flushes. So we can benefit from better throughput without sacrificing the number of segments.

@exherb
Copy link
Contributor

exherb commented Jul 20, 2022

Index performance is very critical for us. Too many tasks mean too many segments.

@exherb For batch ingestion, each task can ingest different time periods. Those can be merged when the task is done, similar to how we merge the results from periodic flushes. So we can benefit from better throughput without sacrificing the number of segments.

It’s take about 3 hours to merge for us.
Is this pull request ready?

@liran-funaro
Copy link
Contributor Author

It takes about 3 hours to merge for us.

@exherb From my experiment, merging fewer, but larger segments take about the same amount of time as merging many small segments. Even so, this extension enables more data to be ingested into memory before it is flushed to disk, resulting in larger segments.

Is this pull request ready?

@exherb Yes. Except for some conflicts that I can resolve (upon demand).

@github-actions
Copy link

github-actions bot commented Sep 5, 2023

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 5, 2023
@github-actions
Copy link

github-actions bot commented Oct 3, 2023

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Extension: oak-incremental-index: Low resource (RAM and CPU) incremental-index implementation using off-heap key/value map (OakMap)

10 participants