Storage: Adapt to ingesting snapshot with irregular region range#10151
Storage: Adapt to ingesting snapshot with irregular region range#10151ti-chi-bot[bot] merged 5 commits intopingcap:masterfrom
Conversation
Signed-off-by: JaySon-Huang <tshent@qq.com>
b36f3d4 to
dc991fe
Compare
27775b4 to
1f6f775
Compare
|
/test pull-integration-test |
90c29e8 to
5a114c5
Compare
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
5a114c5 to
e608e0c
Compare
|
|
||
| assert(!is_common_handle_); | ||
| // According to the tidb encoding rule, the int handle must be larger than 8 bytes. | ||
| RUNTIME_CHECK_MSG( |
There was a problem hiding this comment.
So we allow(I mean RUNTIME_CHECK_MSG) an encoded handle longer than sizeof(int64) here, but will still panic if we meet an encoded handle shorter than sizeof(int64)
There was a problem hiding this comment.
Yes, according to @tangenta this case should not happen. If it happens, then tidb-server would panic.
There was a problem hiding this comment.
And if the value_size is less than 8 bytes, then the DB::DecodeInt64 will likely access random invalid address, which will cause random panic.
Adding a check here so that we can get explicit error rather than random panic.
There was a problem hiding this comment.
I've try to generate a split key like
t{tableID}_r\x30t{tableID}_r\x00\x00\x00\x00\x00\x00\x01
TiKV will refuse to split, with logging like
[2025/05/09 17:41:36.242 +08:00] [INFO] [peer.rs:6460] ["invalid split request"] [source=pd] [peer_id=2078] [region_id=2077] [err="KeyNotInRegion([116, 128, 0, 0, 0, 0, 0, 1, 255, 255, 215, 95, 114, 0, 0, 0, 0, 255, 0, 250, 95, 114, 0, 0, 0, 0, 255, 0, 0, 1, 0, 0, 0, 0, 0, 250], id: 2077 start_key: 7480000000000001FFD75F730000000000FA end_key: 7480000000000001FFD800000000000000F8 region_epoch { conf_ver: 26 version: 217 } peers { id: 2078 store_id: 1 } peers { id: 2166 store_id: 132 role: Learner })"] [thread_id=377]
Signed-off-by: JaySon-Huang <tshent@qq.com>
|
/hold |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CalvinNeo, Lloyd-Pottiger The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
/unhold |
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
In response to a cherrypick label: new pull request created to branch |
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
In response to a cherrypick label: new pull request created to branch |
|
In response to a cherrypick label: new pull request created to branch |
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
) (release-8.5) (#10157) close #10147 Storage: Adapt to ingesting snapshot with irregular region range If the key format is "t${tableID}_r${handleID1}{AnySuffix}", and `AnySuffix` is not empty, `"t${tableID}_r${handleID1}{AnySuffix}" > "t${tableID}_r${handleID1}"` according to the the comparison semantics of "Key" on TiDB/TiKV. For example, If a table is non-clustered, that is with Int64 as the HandleID type. And the range of a Region is `["xxx_r{handleId1}{AnySuffix}", "xxx_r{handleId2}{AnySuffix}")`, where handleId1 = 100, handleId2 = 200. If `AnySuffix` is empty, then Region contains the left-closed-right-open range of `[handleID1, handleID2)` If `AnySuffix` is not empty, then Region contains the range `(handleID1, handleID2]`, which is actually the left-closed-right-open range of `[handleID1+1, handleID2+1)`. Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io> Signed-off-by: JaySon-Huang <tshent@qq.com> Co-authored-by: JaySon <tshent@qq.com> Co-authored-by: JaySon-Huang <tshent@qq.com>
) cherry-pick of #10151 ### What problem does this PR solve? Issue Number: close #10147 Problem Summary: As the issue describe ### What is changed and how it works? * Adapt to irregular region range with any suffix, not only "\x00" * Refine the irregular region range parsing warning that we can get more information for better diagnosing * Turn `RowKeyValue(bool is_common_handle_, HandleValuePtr value_)` to `RowKeyValue::fromHandleWithSuffix(bool is_common_handle_, HandleValuePtr value_)` * Move the parse warning from parsing `RowKeyValue` to parsing `RowKeyRange::fromRegionRange` ```commit-message Storage: Adapt to ingesting snapshot with irregular region range If the key format is "t${tableID}_r${handleID1}{AnySuffix}", and `AnySuffix` is not empty, `"t${tableID}_r${handleID1}{AnySuffix}" > "t${tableID}_r${handleID1}"` according to the the comparison semantics of "Key" on TiDB/TiKV. For example, If a table is non-clustered, that is with Int64 as the HandleID type. And the range of a Region is `["xxx_r{handleId1}{AnySuffix}", "xxx_r{handleId2}{AnySuffix}")`, where handleId1 = 100, handleId2 = 200. If `AnySuffix` is empty, then Region contains the left-closed-right-open range of `[handleID1, handleID2)` If `AnySuffix` is not empty, then Region contains the range `(handleID1, handleID2]`, which is actually the left-closed-right-open range of `[handleID1+1, handleID2+1)`. ``` ### Check List Tests <!-- At least one of them must be included. --> - [x] Unit test - [ ] Integration test - [x] Manual test (add detailed scripts or steps below) ``` create table if not exists t (a bigint primary key, b int); insert into t values (1,2),(2,3),(3,4),(4,5),(20, 50), (21, 51),(200002,2),(200003,3),(200004,4), (9223372036854775807, 101),(9223372036854775806, 100),(-9223372036854775808, -101),(-9223372036854775807, -100); // The keys is related to the table_id. For these verification cases, the table_id=471 tiup ctl pd -u {pd_ip}:{pd_port} -i # split with table_id=471, row_id=20, suffix='\x00' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F728000000000FF0000140000000000FB # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=20, suffix='\x01' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F728000000000FF0000140100000000FB # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=20, suffix='\x0130FF' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F728000000000FF0000140130FF0000FD # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=-9223372036854775808, suffix='' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F720000000000FF0000000000000000FA # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=-9223372036854775808, suffix='\x01' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F720000000000FF0000000100000000FB # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=9223372036854775807, suffix='' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F72FFFFFFFFFFFFFFFFFF0000000000FA # then add tiflash replica alter table set tiflash replica 1; # split with table_id=471, row_id=9223372036854775807, suffix='\x01' operator add split-region 2077 --policy usekey --keys 7480000000000001FFD75F72FFFFFFFFFFFFFFFFFF0100000000FB # then add tiflash replica alter table set tiflash replica 1; ``` - [ ] No code Side effects - [ ] Performance regression: Consumes more CPU - [ ] Performance regression: Consumes more Memory - [ ] Breaking backward compatibility Documentation - [ ] Affects user behaviors - [ ] Contains syntax changes - [ ] Contains variable changes - [ ] Contains experimental features - [ ] Changes MySQL compatibility ### Release note <!-- bugfix or new feature needs a release note --> ```release-note Fix the issue that TiFlash might panic when handling snapshot with irregular region range ``` Signed-off-by: JaySon-Huang <tshent@qq.com>
What problem does this PR solve?
Issue Number: close #10147
Problem Summary: As the issue describe
What is changed and how it works?
RowKeyValue(bool is_common_handle_, HandleValuePtr value_)toRowKeyValue::fromHandleWithSuffix(bool is_common_handle_, HandleValuePtr value_)RowKeyValueto parsingRowKeyRange::fromRegionRangeCheck List
Tests
Side effects
Documentation
Release note