-
Notifications
You must be signed in to change notification settings - Fork 962
Make RocksDB delete entries batch size configurable #3646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make RocksDB delete entries batch size configurable #3646
Conversation
| if (deletedEntriesInBatch > deleteEntriesBatchSize) { | ||
| batch.flush(); | ||
| batch.clear(); | ||
| deletedEntriesInBatch = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nov 11, 2022 @ 05:49:39.336 05:49:39.336 [db-storage-cleanup-10-1] INFO org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex - Deleted indexes for 0 entries from 16074 ledgers in 625.242 seconds
deletedEntries=0, does it mean that the condition of deletedEntriesInBatch > DELETE_ENTRIES_BATCH_SIZE has not entered yet, will adjusting the size of DELETE_ENTRIES_BATCH_SIZE affect this delay? It doesn't look like the read latency to a few seconds is caused by this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deletedEntries=0 is a bug in the log, we have fixed it in the master branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I made a mistake. I have see it #3283
|
@hangc0276 I'm not very clear about the current bookkeeper configuration management, including whether we need to add instructions for this new config in the docs? Personally, I think this configuration is helpful for performance. Should it be explained to users in the docs? |
eolivelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ance (#3653) ### Motivation The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: #3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. #1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. ### Changes Bring `deleteRange` API back for entry log location deletion.
…ance (#3653) The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: #3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. #1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. Bring `deleteRange` API back for entry log location deletion. (cherry picked from commit 696919c)
…ance (apache#3653) ### Motivation The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: apache#3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. apache#1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. ### Changes Bring `deleteRange` API back for entry log location deletion.
…ance (apache#3653) The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: apache#3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. apache#1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. Bring `deleteRange` API back for entry log location deletion. (cherry picked from commit 696919c) (cherry picked from commit e159510)
…ance (apache#3653) ### Motivation The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: apache#3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. apache#1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. ### Changes Bring `deleteRange` API back for entry log location deletion.
### Motivations This PR is to resolve the issue #3734 in branch-4.14 by following this suggestion. #3734 (comment) ### Modifications 1. Revert #3653 2. Bring #3646 to branch-4.14 to make delete entries batch size configurable.
…e#3768) ### Motivations This PR is to resolve the issue apache#3734 in branch-4.14 by following this suggestion. apache#3734 (comment) ### Modifications 1. Revert apache#3653 2. Bring apache#3646 to branch-4.14 to make delete entries batch size configurable. (cherry picked from commit e56d6d6)
…e#3768) ### Motivations This PR is to resolve the issue apache#3734 in branch-4.14 by following this suggestion. apache#3734 (comment) ### Modifications 1. Revert apache#3653 2. Bring apache#3646 to branch-4.14 to make delete entries batch size configurable. (cherry picked from commit e56d6d6)
…ance (apache#3653) ### Motivation The entry log location index deletion is deleted in batches one by one currently, and it will have low performance. Refer to: apache#3646 Matteo has introduced deleteRange API a few years ago, but rollback due to RocksDB delete ranges bug. apache#1620. The RocksDB bug has been addressed since 5.18.0 https://github.com/facebook/rocksdb/blob/main/HISTORY.md#5180-2018-11-30. We can bring the `deleteRange` API back to improve the entry log location deletion performance. ### Changes Bring `deleteRange` API back for entry log location deletion.
Motivation
When there are a lot of ledgers to be deleted, the EntyLocationIndex in RocksDB will delete a large number of entries at one time.
The RocksDB location index deletes operation will cause RocksDB busy, and impact the read latency to a few seconds.
Currently, we delete RocksDB entries in batch, and the batch size is hard-coded. We can not change it.
Changes
Make the RocksDB delete entries in batch configurable.