Storages: Optimize vector search in scenarios with updates #9597
ti-chi-bot[bot] merged 6 commits into pingcap:master from
Conversation
|
Can you briefly describe what changes you've done in this PR? |
done |
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
fix
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
fix
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
f0ddea4 to 83c44a1
| // Each ColumnFileTiny has its own vector index; rowid_start_offset is the offset of this file within the ColumnFilePersistedSet. | ||
| vec_index->get(rowid - rowid_start_offset, value); |
|
It seems that what changed is the order of these operations (the yellow and green lines swapped their order). Could you explain more about why changing the order can improve performance? I guess it can somehow reduce how much of the vector index is loaded from either the ColumnFileTiny or the Stable layer, but I am not so clear. |
Add an example in PR description. |
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
|
Flamegraph |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: breezewish, JinheLin. The full list of commands accepted by this bot can be found here. The pull request process is described here. Details: Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier] Timeline:
|
|
/hold |
|
/unhold |
|
/cherry-pick release-8.5 |
|
@Lloyd-Pottiger: new pull request created to branch. Details: In response to this: |
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
| auto * vector_index_stream = dynamic_cast<ConcatVectorIndexBlockInputStream *>(skippable_stream.get()); | ||
| auto stream = std::make_shared<BitmapFilterBlockInputStream>(columns_to_read, skippable_stream, bitmap_filter); | ||
| if (vector_index_stream) | ||
| { | ||
| // Squash blocks to reduce the number of blocks. | ||
| return std::make_shared<SquashingBlockInputStream>( |
| auto * vector_index_stream = dynamic_cast<ConcatVectorIndexBlockInputStream *>(skippable_stream.get()); | |
| auto stream = std::make_shared<BitmapFilterBlockInputStream>(columns_to_read, skippable_stream, bitmap_filter); | |
| if (vector_index_stream) | |
| { | |
| // Squash blocks to reduce the number of blocks. | |
| return std::make_shared<SquashingBlockInputStream>( | |
| auto stream = std::make_shared<BitmapFilterBlockInputStream>(columns_to_read, skippable_stream, bitmap_filter); | |
| if (auto * vector_index_stream = dynamic_cast<ConcatVectorIndexBlockInputStream *>(skippable_stream.get()); vector_index_stream) | |
| { | |
| // For vector search, the sub-streams are more likely to return small blocks. Squash blocks to reduce the number of blocks and thus improve the performance of the upper layer. | |
| return std::make_shared<SquashingBlockInputStream>( |
| auto select_size = search_results.size() > topk ? topk : search_results.size(); | ||
| auto top_k_end = search_results.begin() + select_size; | ||
| std::nth_element(search_results.begin(), top_k_end, search_results.end(), [](const auto & lhs, const auto & rhs) { | ||
| return lhs.distance < rhs.distance; | ||
| }); | ||
| search_results.resize(select_size); | ||
| std::vector<UInt32> selected_rows; | ||
| selected_rows.reserve(search_results.size()); | ||
| for (const auto & row : search_results) | ||
| selected_rows.push_back(row.key); |
| auto select_size = search_results.size() > topk ? topk : search_results.size(); | |
| auto top_k_end = search_results.begin() + select_size; | |
| std::nth_element(search_results.begin(), top_k_end, search_results.end(), [](const auto & lhs, const auto & rhs) { | |
| return lhs.distance < rhs.distance; | |
| }); | |
| search_results.resize(select_size); | |
| std::vector<UInt32> selected_rows; | |
| selected_rows.reserve(search_results.size()); | |
| for (const auto & row : search_results) | |
| selected_rows.push_back(row.key); | |
| const auto select_size = std::min(search_results.size(), topk); | |
| auto top_k_end = search_results.begin() + select_size; | |
| std::nth_element(search_results.begin(), top_k_end, search_results.end(), [](const auto & lhs, const auto & rhs) { | |
| return lhs.distance < rhs.distance; | |
| }); | |
| std::vector<UInt32> selected_rows(select_size); | |
| for (size_t i = 0; i < select_size; ++i) | |
| selected_rows[i] = search_results[i].key; | |
This could avoid re-allocating the vector.
Better to add a check for select_size != 0, because we cannot ensure that the length of index_stream->load() is always larger than 0 in this class. |
|
|
| UInt32 precedes_rows = 0; | ||
| std::vector<VectorIndexViewer::SearchResult> search_results; | ||
| for (size_t i = 0; i < stream->children.size(); ++i) |
| for (size_t i = 0; i < stream->children.size(); ++i) | |
| assert(stream->children.size() == index_streams.size()); // otherwise the `row.key` of the search result is not correct | |
| for (size_t i = 0; i < stream->children.size(); ++i) |


What problem does this PR solve?
Issue Number: ref #9600 , close #9599
Problem Summary:
What is changed and how it works?
Before, the read path of vector search is:
A simple example:
We have cf_0 (indexed, offset: 100-110), cf_1 (indexed, offset: 110-120), cf_3 (indexed, offset: 120-130), cf_4 (memtable, offset: 130-140), and dmfile_0 (indexed, offset: 0-100, with each 10 rows as a pack). To find the top 5:
So, in this example, we need to read 3 packs and 3 ColumnFileTinys.
This PR changes the read path to:
Still using the above example, now to find top5:
Now, we only need to read 2 packs and 2 ColumnFileTinys.
Benchmark
Run VectorDBBench with about 750k rows in stable, 250k rows in delta with a vector index, and 4k rows in delta without a vector index.
Note: compact tiflash replica should be disabled during the benchmark.
Without this PR (after fix #9599)
With this PR
All data are in stable
Check List
Tests
Side effects
Documentation
Release note