storage: Add more checks for ingestFiles#6375
Conversation
Signed-off-by: Wish <breezewish@outlook.com>
|
/run-all-tests |
|
/run-unit-test |
JaySon-Huang
left a comment
There was a problem hiding this comment.
LGTM with minor comment
dbms/src/Storages/DeltaMerge/tests/gtest_dm_simple_pk_test_basic.cpp
Outdated
Show resolved
Hide resolved
dbms/src/Storages/DeltaMerge/tests/gtest_dm_simple_pk_test_basic.cpp
Outdated
Show resolved
Hide resolved
|
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. DetailsReviewer can indicate their review by submitting an approval review. |
Signed-off-by: Wish <breezewish@outlook.com>
|
Will tikv's multi-rocksdb instances feature break the assumption added here (in the future)? |
Currently no, the MultiRaft's data must be strictly matched with the Raft group's boundary. |
|
/run-all-tests |
|
/merge |
|
@breezewish: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests You only need to trigger If you have any questions about the PR merge process, please refer to pr process. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
This pull request has been accepted and is ready to merge. DetailsCommit hash: 8514195 |
What problem does this PR solve?
Issue Number: ref #5237
Problem Summary:
What is changed and how it works?
Add check: Check files passed in are ordered and not overlapped.
Add check: Check files' range is contained by the ingest range (the ingest range is the region's range).
Reason: The content of a ColumnFileBig can be only stripped by the segment range (at the read time), not any other range. If the ingest range is smaller than the files' range, for example: ingest_range = [0, 100) and file_data_range = [50, 200), [50, 200) will be still ingested, while [0, 100) will be cleared (if
clear_data_in_range= true).To avoid surprising results later, we simply forbid this case now.
Note: Although we specified a stripped range to
ingestColumnFileshttps://github.com/pingcap/tiflash/blob/09e3a9e2413189070740af58430a9c1da7fe9b72/dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp#LL849 , the stripped range is only used for delete data, instead of stripping the output.This check is risky. We need some time for E2E tests to verify whether there are some unintentional edge cases.
Currently TiKV has similar checks:
Introduce
ingestFilesto the dm test framework, so that we can write related tests easier.As a result, test
DeltaMergeStoreRWTest/IngestEmptyFileListsis migrated toStoreIngestTest/EmptyFileLists, simplified using new test framework.Add tests to verify check 1 and check 2, based on the test framework above.
Check List
Tests
Side effects
Documentation
Release note