storage: handle IngestSST using segment split #6378
ti-chi-bot merged 26 commits into pingcap:master
Conversation
Signed-off-by: Wish <breezewish@outlook.com>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review. The full list of commands accepted by this bot can be found here. Reviewers can indicate their review by submitting an approval review.
-    // Protects the operations in this instance.
-    mutable std::mutex mutex;
+    // It is a recursive_mutex because the lock may be also used by the parent segment as its update lock.
+    mutable std::recursive_mutex mutex;
Note: This recursive locking happens when:
- DeltaMergeStore::segmentIngestData grabs a segment update lock.
- DeltaMergeStore::segmentIngestData discovers the segment is not empty and then ingests files into the delta.
- DeltaValueSpace::ingestColumnFiles grabs the same lock to write to the delta VS.
The real problem is that we do not distinguish the segment update lock from the delta lock; when both locks are needed, they conflict. Also, I did not change the ingestColumnFiles API to allow passing an external lock, because other APIs in the DeltaVS do not follow this pattern and I think it may not be a good idea.
Usually, in DeltaMergeStore::segmentApplyXXX we acquire the segment update lock and do not change the DeltaVS inside the "apply".
segmentIngestData brings a special pattern that may change the DeltaVS inside its "apply", but I didn't figure out a better solution.
dbms/src/Storages/DeltaMerge/DeltaMergeStore_InternalSegment.cpp
/run-all-tests
/run-all-tests
/merge
@flowbehappy: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests
If you have any questions about the PR merge process, please refer to pr process. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
This pull request has been accepted and is ready to merge. Commit hash: 742f389
What problem does this PR solve?
Issue Number: close #5237
This is the final PR for #5237. After this one, everything should be finished.
Problem Summary:
What is changed and how it works?

Some changes are extracted:

- Add `Segment::prepareIngestData` and `Segment::applyIngestData`. They check whether the segment is empty, and will call either `replaceData` or `ingestIntoDelta`.
- Add the `DeltaMergeStore::segmentIngestData` API. It calls `Segment::prepareIngestData` and `Segment::applyIngestData`.
- Change the workflow of `DeltaMergeStore::ingestFiles`. Changes as below:
- Add tests for concurrent ingest data.
- Change the existing `StoreIngestTest` to test both `ingestBySplit` and `ingestByDelta`.

Check List
Tests
Workload 1: Import data using BR on IDC. When all TiFlash replicas are ready, invoke 3 full table scans with 30s interval.
Workload 2: Import CHBenchmark (warehouse=1000) data using BR on AWS EC2. When all TiFlash replicas are ready, invoke 3 full table scans with 30s interval.
Side effects
Documentation
Release note