-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update
#40272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update
#40272
Conversation
|
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
5c95c0b to
4d42e7a
Compare
|
run buildall |
4d42e7a to
e68de4d
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
e68de4d to
bc37aa8
Compare
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
1 similar comment
|
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 38332 ms |
TPC-DS: Total hot run time: 192200 ms |
ClickBench: Total hot run time: 32.06 s |
bc37aa8 to
2ca8923
Compare
|
clang-tidy review says "All clean, LGTM! 👍" |
|
run buildall |
TPC-H: Total hot run time: 37998 ms |
TPC-DS: Total hot run time: 192237 ms |
ClickBench: Total hot run time: 33.21 s |
|
clang-tidy review says "All clean, LGTM! 👍" |
|
run buildall |
TPC-H: Total hot run time: 38168 ms |
TPC-DS: Total hot run time: 187520 ms |
ClickBench: Total hot run time: 31.67 s |
a091bc7 to
afa077a
Compare
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
|
run buildall |
|
clang-tidy review says "All clean, LGTM! 👍" |
TPC-H: Total hot run time: 38326 ms |
zhannngchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
PR approved by at least one committer and no changes requested. |
|
run buildall |
TPC-H: Total hot run time: 41884 ms |
TPC-DS: Total hot run time: 193940 ms |
ClickBench: Total hot run time: 32.07 s |
|
TeamCity be ut coverage result: |
dataroaring
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ly inserted rows in partial update (#40272) ## Proposed changes ### 1. Fix `__DORIS_SEQUENCE_COL__` is not set for newly inserted rows in partial update before: ```sql MySQL root@127.1:d1> CREATE TABLE IF NOT EXISTS t3 ( -> `k` BIGINT NOT NULL, -> `c1` int, -> `c2` datetime default current_timestamp, -> ) UNIQUE KEY(`k`) -> DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES ( -> "replication_allocation" = "tag.location.default: 1", -> "enable_unique_key_merge_on_write" = "true", -> "function_column.sequence_col" = "c2" -> ); Query OK, 0 rows affected MySQL root@127.1:d1> set enable_insert_strict=false; MySQL root@127.1:d1> set enable_unique_key_partial_update=true; MySQL root@127.1:d1> insert into t3(k,c1) values(99,99); MySQL root@127.1:d1> set show_hidden_columns=true; MySQL root@127.1:d1> select * from t3; +----+----+---------------------+-----------------------+-----------------------+------------------------+ | k | c1 | c2 | __DORIS_DELETE_SIGN__ | __DORIS_VERSION_COL__ | __DORIS_SEQUENCE_COL__ | +----+----+---------------------+-----------------------+-----------------------+------------------------+ | 99 | 99 | 2024-09-02 11:03:09 | 0 | 2 | <null> | +----+----+---------------------+-----------------------+-----------------------+------------------------+ ``` after: ```sql MySQL root@127.1:d1> CREATE TABLE IF NOT EXISTS t3 ( -> `k` BIGINT NOT NULL, -> `c1` int, -> `c2` datetime default current_timestamp, -> ) UNIQUE KEY(`k`) -> DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES ( -> "replication_allocation" = "tag.location.default: 1", -> "enable_unique_key_merge_on_write" = "true", -> "function_column.sequence_col" = "c2" -> ); Query OK, 0 rows affected MySQL root@127.1:d1> set enable_insert_strict=false; MySQL root@127.1:d1> set enable_unique_key_partial_update=true; MySQL root@127.1:d1> insert into t3(k,c1) values(1,10); MySQL root@127.1:d1> set show_hidden_columns=true; MySQL root@127.1:d1> select * from t3; +---+----+---------------------+-----------------------+-----------------------+------------------------+ | k | c1 | c2 | __DORIS_DELETE_SIGN__ | __DORIS_VERSION_COL__ | __DORIS_SEQUENCE_COL__ | +---+----+---------------------+-----------------------+-----------------------+------------------------+ | 1 | 10 | 2024-09-02 16:49:50 | 0 | 2 | 2024-09-02 16:49:50 | +---+----+---------------------+-----------------------+-----------------------+------------------------+ ``` ### 2. Fix `current_timestamp()` precision loss for newly inserted rows in partial update before: ```sql MySQL root@127.1:d1> CREATE TABLE IF NOT EXISTS t3 ( -> `k` BIGINT NOT NULL, -> `c1` int, -> `c2` datetime(6) default current_timestamp(6), -> ) UNIQUE KEY(`k`) -> DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES ( -> "replication_allocation" = "tag.location.default: 1", -> "enable_unique_key_merge_on_write" = "true", -> "function_column.sequence_col" = "c2" -> ); Query OK, 0 rows affected MySQL root@127.1:d1> set enable_unique_key_partial_update=true; MySQL root@127.1:d1> set enable_insert_strict=false; MySQL root@127.1:d1> insert into t3(k,c1) values(3,10); MySQL root@127.1:d1> set show_hidden_columns=true; MySQL root@127.1:d1> select * from t3; +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ | k | c1 | c2 | __DORIS_DELETE_SIGN__ | __DORIS_VERSION_COL__ | __DORIS_SEQUENCE_COL__ | +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ | 3 | 10 | 2024-09-02 19:04:55 | 0 | 2 | <null> | +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ ``` after: ```sql MySQL root@127.1:d1> CREATE TABLE IF NOT EXISTS t3 ( -> `k` BIGINT NOT NULL, -> `c1` int, -> `c2` datetime(6) default current_timestamp(6), -> ) UNIQUE KEY(`k`) -> DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES ( -> "replication_allocation" = "tag.location.default: 1", -> "enable_unique_key_merge_on_write" = "true", -> "function_column.sequence_col" = "c2" -> ); Query OK, 0 rows affected MySQL root@127.1:d1> set enable_unique_key_partial_update=true; MySQL root@127.1:d1> set enable_insert_strict=false; MySQL root@127.1:d1> insert into t3(k,c1) values(3,10); MySQL root@127.1:d1> set show_hidden_columns=true; MySQL root@127.1:d1> select * from t3; +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ | k | c1 | c2 | __DORIS_DELETE_SIGN__ | __DORIS_VERSION_COL__ | __DORIS_SEQUENCE_COL__ | +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ | 3 | 10 | 2024-09-02 19:04:55.464438 | 0 | 2 | 2024-09-02 19:04:55.464438 | +---+----+----------------------------+-----------------------+-----------------------+----------------------------+ ```
…newly inserted rows in partial update apache#40272 picks apache#40272
…update apache#39619 pick [opt](partial update) Remove unnecessary lock and refactor some code for partial update (apache#40062) 1. apache#34112 let partial update fetch rowsets in the initialization of RowsetBuilder rather than flush phase. So we can remove that tablet header lock. 2. refactor some partial update code fix compile pick [Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update apache#40272 picks apache#40272 pick [Cherry-pick](branch-2.1) Pick "[Featrue](default value) Support bitmap_empty default value (apache#40364)" (apache#40487) Pick apache#40364 <!--Describe your changes.--> pick [Feature](partial update) Support flexible partial update in stream load with json files (apache#39756) This PR add the ability to update different columns for each row in one stream load Doc: apache/doris-website#1140 ```sql MySQL root@127.1:d1> CREATE TABLE t1 ( -> `k` int(11) NULL, -> `v1` BIGINT NULL, -> `v2` BIGINT NULL DEFAULT "9876", -> `v3` BIGINT NOT NULL, -> `v4` BIGINT NOT NULL DEFAULT "1234", -> `v5` BIGINT NULL -> ) UNIQUE KEY(`k`) DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES( -> "replication_num" = "1", -> "enable_unique_key_merge_on_write" = "true"); Query OK, 0 rows affected Time: 0.013s MySQL root@127.1:d1> insert into t1 select number, number, number, number, number, number from numbers("number" = "6"); Query OK, 6 rows affected Time: 0.107s MySQL root@127.1:d1> select * from t1; +---+----+----+----+----+----+ | k | v1 | v2 | v3 | v4 | v5 | +---+----+----+----+----+----+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 1 | 1 | 1 | 1 | 1 | | 2 | 2 | 2 | 2 | 2 | 2 | | 3 | 3 | 3 | 3 | 3 | 3 | | 4 | 4 | 4 | 4 | 4 | 4 | | 5 | 5 | 5 | 5 | 5 | 5 | +---+----+----+----+----+----+ ``` test1.json: ```json {"k": 1, "v1": 10} {"k": 2, "v2": 20, "v5": 25} {"k": 3, "v3": 30} {"k": 4, "v4": 20, "v1": 43, "v3": 99} {"k": 5, "v5": null} {"k": 6, "v1": 999, "v3": 777} {"k": 2, "v4": 222} {"k": 1, "v2": 111, "v3": 111} ``` ```bash curl --location-trusted -u root: \ -H "strict_mode:false" \ -H "format:json" \ -H "read_json_by_line:true" \ -H "unique_key_update_mode:UPDATE_FLEXIBLE_COLUMNS" \ -T test1.json \ -XPUT http://<host>:<http_port>/api/d1/t1/_stream_load ``` ```sql MySQL root@127.1:d1> select * from t1; +---+-----+------+-----+------+--------+ | k | v1 | v2 | v3 | v4 | v5 | +---+-----+------+-----+------+--------+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 10 | 111 | 111 | 1 | 1 | | 2 | 2 | 20 | 2 | 222 | 25 | | 3 | 3 | 3 | 30 | 3 | 3 | | 4 | 43 | 4 | 99 | 20 | 4 | | 5 | 5 | 5 | 5 | 5 | <null> | | 6 | 999 | 9876 | 777 | 1234 | <null> | +---+-----+------+-----+------+--------+ ``` fix compile pick [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update apache#40736" (apache#40863) picks apache#40736 fix
…update apache#39619 pick [opt](partial update) Remove unnecessary lock and refactor some code for partial update (apache#40062) 1. apache#34112 let partial update fetch rowsets in the initialization of RowsetBuilder rather than flush phase. So we can remove that tablet header lock. 2. refactor some partial update code fix compile pick [Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update apache#40272 picks apache#40272 pick [Cherry-pick](branch-2.1) Pick "[Featrue](default value) Support bitmap_empty default value (apache#40364)" (apache#40487) Pick apache#40364 <!--Describe your changes.--> pick [Feature](partial update) Support flexible partial update in stream load with json files (apache#39756) This PR add the ability to update different columns for each row in one stream load Doc: apache/doris-website#1140 ```sql MySQL root@127.1:d1> CREATE TABLE t1 ( -> `k` int(11) NULL, -> `v1` BIGINT NULL, -> `v2` BIGINT NULL DEFAULT "9876", -> `v3` BIGINT NOT NULL, -> `v4` BIGINT NOT NULL DEFAULT "1234", -> `v5` BIGINT NULL -> ) UNIQUE KEY(`k`) DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES( -> "replication_num" = "1", -> "enable_unique_key_merge_on_write" = "true"); Query OK, 0 rows affected Time: 0.013s MySQL root@127.1:d1> insert into t1 select number, number, number, number, number, number from numbers("number" = "6"); Query OK, 6 rows affected Time: 0.107s MySQL root@127.1:d1> select * from t1; +---+----+----+----+----+----+ | k | v1 | v2 | v3 | v4 | v5 | +---+----+----+----+----+----+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 1 | 1 | 1 | 1 | 1 | | 2 | 2 | 2 | 2 | 2 | 2 | | 3 | 3 | 3 | 3 | 3 | 3 | | 4 | 4 | 4 | 4 | 4 | 4 | | 5 | 5 | 5 | 5 | 5 | 5 | +---+----+----+----+----+----+ ``` test1.json: ```json {"k": 1, "v1": 10} {"k": 2, "v2": 20, "v5": 25} {"k": 3, "v3": 30} {"k": 4, "v4": 20, "v1": 43, "v3": 99} {"k": 5, "v5": null} {"k": 6, "v1": 999, "v3": 777} {"k": 2, "v4": 222} {"k": 1, "v2": 111, "v3": 111} ``` ```bash curl --location-trusted -u root: \ -H "strict_mode:false" \ -H "format:json" \ -H "read_json_by_line:true" \ -H "unique_key_update_mode:UPDATE_FLEXIBLE_COLUMNS" \ -T test1.json \ -XPUT http://<host>:<http_port>/api/d1/t1/_stream_load ``` ```sql MySQL root@127.1:d1> select * from t1; +---+-----+------+-----+------+--------+ | k | v1 | v2 | v3 | v4 | v5 | +---+-----+------+-----+------+--------+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 10 | 111 | 111 | 1 | 1 | | 2 | 2 | 20 | 2 | 222 | 25 | | 3 | 3 | 3 | 30 | 3 | 3 | | 4 | 43 | 4 | 99 | 20 | 4 | | 5 | 5 | 5 | 5 | 5 | <null> | | 6 | 999 | 9876 | 777 | 1234 | <null> | +---+-----+------+-----+------+--------+ ``` fix compile pick [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update apache#40736" (apache#40863) picks apache#40736 fix
…update apache#39619 pick [opt](partial update) Remove unnecessary lock and refactor some code for partial update (apache#40062) 1. apache#34112 let partial update fetch rowsets in the initialization of RowsetBuilder rather than flush phase. So we can remove that tablet header lock. 2. refactor some partial update code fix compile pick [Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update apache#40272 picks apache#40272 pick [Cherry-pick](branch-2.1) Pick "[Featrue](default value) Support bitmap_empty default value (apache#40364)" (apache#40487) Pick apache#40364 <!--Describe your changes.--> pick [Feature](partial update) Support flexible partial update in stream load with json files (apache#39756) This PR add the ability to update different columns for each row in one stream load Doc: apache/doris-website#1140 ```sql MySQL root@127.1:d1> CREATE TABLE t1 ( -> `k` int(11) NULL, -> `v1` BIGINT NULL, -> `v2` BIGINT NULL DEFAULT "9876", -> `v3` BIGINT NOT NULL, -> `v4` BIGINT NOT NULL DEFAULT "1234", -> `v5` BIGINT NULL -> ) UNIQUE KEY(`k`) DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES( -> "replication_num" = "1", -> "enable_unique_key_merge_on_write" = "true"); Query OK, 0 rows affected Time: 0.013s MySQL root@127.1:d1> insert into t1 select number, number, number, number, number, number from numbers("number" = "6"); Query OK, 6 rows affected Time: 0.107s MySQL root@127.1:d1> select * from t1; +---+----+----+----+----+----+ | k | v1 | v2 | v3 | v4 | v5 | +---+----+----+----+----+----+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 1 | 1 | 1 | 1 | 1 | | 2 | 2 | 2 | 2 | 2 | 2 | | 3 | 3 | 3 | 3 | 3 | 3 | | 4 | 4 | 4 | 4 | 4 | 4 | | 5 | 5 | 5 | 5 | 5 | 5 | +---+----+----+----+----+----+ ``` test1.json: ```json {"k": 1, "v1": 10} {"k": 2, "v2": 20, "v5": 25} {"k": 3, "v3": 30} {"k": 4, "v4": 20, "v1": 43, "v3": 99} {"k": 5, "v5": null} {"k": 6, "v1": 999, "v3": 777} {"k": 2, "v4": 222} {"k": 1, "v2": 111, "v3": 111} ``` ```bash curl --location-trusted -u root: \ -H "strict_mode:false" \ -H "format:json" \ -H "read_json_by_line:true" \ -H "unique_key_update_mode:UPDATE_FLEXIBLE_COLUMNS" \ -T test1.json \ -XPUT http://<host>:<http_port>/api/d1/t1/_stream_load ``` ```sql MySQL root@127.1:d1> select * from t1; +---+-----+------+-----+------+--------+ | k | v1 | v2 | v3 | v4 | v5 | +---+-----+------+-----+------+--------+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 10 | 111 | 111 | 1 | 1 | | 2 | 2 | 20 | 2 | 222 | 25 | | 3 | 3 | 3 | 30 | 3 | 3 | | 4 | 43 | 4 | 99 | 20 | 4 | | 5 | 5 | 5 | 5 | 5 | <null> | | 6 | 999 | 9876 | 777 | 1234 | <null> | +---+-----+------+-----+------+--------+ ``` fix compile pick [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update apache#40736" (apache#40863) picks apache#40736 fix
…update apache#39619 pick [opt](partial update) Remove unnecessary lock and refactor some code for partial update (apache#40062) 1. apache#34112 let partial update fetch rowsets in the initialization of RowsetBuilder rather than flush phase. So we can remove that tablet header lock. 2. refactor some partial update code fix compile pick [Fix](partial update) Fix __DORIS_SEQUENCE_COL__ is not set for newly inserted rows in partial update apache#40272 picks apache#40272 pick [Cherry-pick](branch-2.1) Pick "[Featrue](default value) Support bitmap_empty default value (apache#40364)" (apache#40487) Pick apache#40364 <!--Describe your changes.--> pick [Feature](partial update) Support flexible partial update in stream load with json files (apache#39756) This PR add the ability to update different columns for each row in one stream load Doc: apache/doris-website#1140 ```sql MySQL root@127.1:d1> CREATE TABLE t1 ( -> `k` int(11) NULL, -> `v1` BIGINT NULL, -> `v2` BIGINT NULL DEFAULT "9876", -> `v3` BIGINT NOT NULL, -> `v4` BIGINT NOT NULL DEFAULT "1234", -> `v5` BIGINT NULL -> ) UNIQUE KEY(`k`) DISTRIBUTED BY HASH(`k`) BUCKETS 1 -> PROPERTIES( -> "replication_num" = "1", -> "enable_unique_key_merge_on_write" = "true"); Query OK, 0 rows affected Time: 0.013s MySQL root@127.1:d1> insert into t1 select number, number, number, number, number, number from numbers("number" = "6"); Query OK, 6 rows affected Time: 0.107s MySQL root@127.1:d1> select * from t1; +---+----+----+----+----+----+ | k | v1 | v2 | v3 | v4 | v5 | +---+----+----+----+----+----+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 1 | 1 | 1 | 1 | 1 | | 2 | 2 | 2 | 2 | 2 | 2 | | 3 | 3 | 3 | 3 | 3 | 3 | | 4 | 4 | 4 | 4 | 4 | 4 | | 5 | 5 | 5 | 5 | 5 | 5 | +---+----+----+----+----+----+ ``` test1.json: ```json {"k": 1, "v1": 10} {"k": 2, "v2": 20, "v5": 25} {"k": 3, "v3": 30} {"k": 4, "v4": 20, "v1": 43, "v3": 99} {"k": 5, "v5": null} {"k": 6, "v1": 999, "v3": 777} {"k": 2, "v4": 222} {"k": 1, "v2": 111, "v3": 111} ``` ```bash curl --location-trusted -u root: \ -H "strict_mode:false" \ -H "format:json" \ -H "read_json_by_line:true" \ -H "unique_key_update_mode:UPDATE_FLEXIBLE_COLUMNS" \ -T test1.json \ -XPUT http://<host>:<http_port>/api/d1/t1/_stream_load ``` ```sql MySQL root@127.1:d1> select * from t1; +---+-----+------+-----+------+--------+ | k | v1 | v2 | v3 | v4 | v5 | +---+-----+------+-----+------+--------+ | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 10 | 111 | 111 | 1 | 1 | | 2 | 2 | 20 | 2 | 222 | 25 | | 3 | 3 | 3 | 30 | 3 | 3 | | 4 | 43 | 4 | 99 | 20 | 4 | | 5 | 5 | 5 | 5 | 5 | <null> | | 6 | 999 | 9876 | 777 | 1234 | <null> | +---+-----+------+-----+------+--------+ ``` fix compile pick [branch-2.1] Picks "[opt](partial update) Allow to only specify key columns in partial update apache#40736" (apache#40863) picks apache#40736 fix
pick the following PRs: other PR: - apache#39619 - apache#40062 - apache#40272 - apache#40364 - apache#40736 - apache#41439 main PR: - apache#39756 - apache#41950 - apache#41701
… inserted rows in partial update apache#40272 (apache#40966)
Proposed changes
1. Fix
__DORIS_SEQUENCE_COL__is not set for newly inserted rows in partial updatebefore:
after:
2. Fix
current_timestamp()precision loss for newly inserted rows in partial updatebefore:
after:
branch-2.1-pick: #40964
branch-2.0-pick: #40966