@bobhan1 (Contributor) commented Mar 19, 2025

What problem does this PR solve?

In cloud mode, a heavy schema change (SC) job retries the whole alter task when it encounters a KV_TXN_CONFLICT_RETRY_EXCEEDED_MAX_TIMES error in commit_tablet_job (#46748). If the SC job fails in commit_tablet_job, we should remove its stop token (#48399) from MS. Otherwise, later retries may fail to register a stop token, because the first stop token does not expire until config::lease_compaction_interval_seconds * 4 = 80s have elapsed, and the schema change job will fail.

I20250318 15:40:15.851157  7677 task_worker_pool.cpp:423] successfully submit task|type=ALTER|signature=1742283174829
I20250318 15:40:31.346628  6496 task_worker_pool.cpp:1999] get alter table task, signature: 1742283174829
I20250318 15:40:31.346635  6496 task_worker_pool.cpp:281] start alter tablet|signature=1742283174829|base_tablet_id=1742283170711|new_tablet_id=1742283174829|mem_limit=10682972209
I20250318 15:40:31.350860  6496 cloud_schema_change_job.cpp:132] Begin to alter tablet. base_tablet_id=1742283170711, new_tablet_id=1742283174829, alter_version=2, job_id=1742283173457
I20250318 15:40:31.350906  6496 cloud_schema_change_job.cpp:226] Begin to convert historical rowsets for new_tablet from base_tablet. base_tablet=1742283170711, new_tablet=1742283174829, job_id=1742283173457
I20250318 15:40:31.350916  6496 cloud_schema_change_job.cpp:247] schema change type, sc_sorting: 0, sc_directly: 1, base_tablet=1742283170711, new_tablet=1742283174829
I20250318 15:40:31.382493  6496 segment_creator.cpp:308] tablet_id:1742283174829, flushing rowset_dir: , rowset_id:020000000000038f6644fd8945079d22209de0cad6c7e5b8, data size:73808, index size:3289
I20250318 15:40:31.385416  6496 cloud_schema_change_job.cpp:416] process mow table|new_tablet_id=1742283174829|out_rowset_size=1|start_calc_delete_bitmap_version=3|alter_version=2
I20250318 15:40:31.387535  6496 cloud_storage_engine.cpp:894] successfully register compaction stop token for tablet_id=1742283174829, delete_bitmap_lock_initiator=6632031443518271970
I20250318 15:40:31.388285  6496 cloud_schema_change_job.cpp:439] alter table for mow table, calculate delete bitmap of incremental rowsets without lock, version: 3-2 new_table_id: 1742283174829
I20250318 15:40:31.391326  6496 cloud_schema_change_job.cpp:460] alter table for mow table, calculate delete bitmap of incremental rowsets with lock, version: 3-2 new_tablet_id: 1742283174829
I20250318 15:40:31.392035  6496 cloud_storage_engine.cpp:915] successfully unregister compaction stop token for tablet_id=1742283174829, delete_bitmap_lock_initiator=6632031443518271970
W20250318 15:40:39.947554  6496 task_worker_pool.cpp:306] failed to alter tablet|signature=1742283174829|base_tablet_id=1742283170711|new_tablet_id=1742283174829|error=[DELETE_BITMAP_LOCK_ERROR]txn conflict when commit tablet job idx { table_id: 1742283165243 index_id: 1742283165244 partition_id: 1742283165242 tablet_id: 1742283170711 } schema_change { initiator: "172.20.56.12:9050" id: "1742283173457" new_tablet_idx { table_id: 1742283165243 index_id: 1742283173458 partition_id: 1742283165242 tablet_id: 1742283174829 } txn_ids: 610474243072 alter_version: 2 num_output_rowsets: 1 num_output_segments: 1 size_output_rowsets: 77097 num_output_rows: 611 output_versions: 2 output_cumulative_point: 2 delete_bitmap_lock_initiator: 6632031443518271970 index_size_output_rowsets: 3289 segment_size_output_rowsets: 73808 }
I20250318 15:40:46.204162  7677 task_worker_pool.cpp:423] successfully submit task|type=ALTER|signature=1742283174829
I20250318 15:41:07.487172  6496 task_worker_pool.cpp:1999] get alter table task, signature: 1742283174829
I20250318 15:41:07.487183  6496 task_worker_pool.cpp:281] start alter tablet|signature=1742283174829|base_tablet_id=1742283170711|new_tablet_id=1742283174829|mem_limit=10682972209
I20250318 15:41:07.489440  6496 cloud_schema_change_job.cpp:132] Begin to alter tablet. base_tablet_id=1742283170711, new_tablet_id=1742283174829, alter_version=2, job_id=1742283173457
I20250318 15:41:07.489511  6496 cloud_schema_change_job.cpp:226] Begin to convert historical rowsets for new_tablet from base_tablet. base_tablet=1742283170711, new_tablet=1742283174829, job_id=1742283173457
I20250318 15:41:07.489523  6496 cloud_schema_change_job.cpp:247] schema change type, sc_sorting: 0, sc_directly: 1, base_tablet=1742283170711, new_tablet=1742283174829
I20250318 15:41:07.490249  6496 cloud_schema_change_job.cpp:285] Rowset [2-2] has already existed in tablet 1742283174829
I20250318 15:41:07.490275  6496 cloud_schema_change_job.cpp:416] process mow table|new_tablet_id=1742283174829|out_rowset_size=1|start_calc_delete_bitmap_version=2|alter_version=2
W20250318 15:41:07.490864  6496 cloud_compaction_stop_token.cpp:89] failed to register compaction stop token|job_id=a018587a-c12f-4926-9d7e-514ff9d88457|delete_bitmap_lock_initiator=1847151139249560285|tablet_id=1742283174829|error=[INTERNAL_ERROR]failed to start tablet job: compactions are not allowed on tablet_id=1742283174829 currently, blocked by schema change job delete_bitmap_initiator=6632031443518271970
W20250318 15:41:07.490897  6496 task_worker_pool.cpp:306] failed to alter tablet|signature=1742283174829|base_tablet_id=1742283170711|new_tablet_id=1742283174829|error=[INTERNAL_ERROR]failed to start tablet job: compactions are not allowed on tablet_id=1742283174829 currently, blocked by schema change job delete_bitmap_initiator=6632031443518271970
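
The failure mode in the log can be sketched with a small model. This is only an illustration under stated assumptions: `FakeMetaService`, `register_stop_token`, `remove_stop_token`, and `retry_can_register` are hypothetical names, not the real MS interface, and the 20s lease interval is inferred from the 80s figure in the description. The retry in the log arrives about 36s after the first token was registered, well inside the 80s window.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the meta service (MS); not the real Doris API.
class FakeMetaService {
public:
    explicit FakeMetaService(int64_t lease_interval_s)
            : _token_ttl_s(lease_interval_s * 4) {}

    // Registers a stop token for `initiator` at time `now_s` (seconds).
    // Fails while a token held by a different initiator has not expired.
    bool register_stop_token(int64_t initiator, int64_t now_s) {
        if (_holder && *_holder != initiator && now_s < _expire_at_s) {
            return false;
        }
        _holder = initiator;
        _expire_at_s = now_s + _token_ttl_s;
        return true;
    }

    // Removes the token if it is held by `initiator`.
    void remove_stop_token(int64_t initiator) {
        if (_holder && *_holder == initiator) {
            _holder.reset();
        }
    }

private:
    int64_t _token_ttl_s;
    std::optional<int64_t> _holder;
    int64_t _expire_at_s = 0;
};

// Simulates a first attempt whose commit fails, then a retry that starts
// `retry_delay_s` seconds later with a new initiator id. Returns whether
// the retry manages to register its own stop token.
bool retry_can_register(bool clear_on_commit_failure, int64_t retry_delay_s) {
    FakeMetaService ms(/*lease_interval_s=*/20);  // 20 * 4 = 80s (assumed)
    const int64_t first_initiator = 1;
    const int64_t second_initiator = 2;

    bool registered = ms.register_stop_token(first_initiator, /*now_s=*/0);
    assert(registered);
    (void)registered;

    // The commit fails with a txn conflict at this point.
    if (clear_on_commit_failure) {
        ms.remove_stop_token(first_initiator);
    }
    return ms.register_stop_token(second_initiator, retry_delay_s);
}
```

In this model, `retry_can_register(false, 36)` fails because the stale token is still live, while `retry_can_register(true, 36)` succeeds because the failed attempt cleaned up after itself.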

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas (Contributor) commented Mar 19, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1 bobhan1 force-pushed the fix-sc-self-retry-fail-by-stop-token branch from 1b67e73 to f25917a Compare March 19, 2025 13:36
@bobhan1 bobhan1 changed the title from "[Fix](cloud-sc) Clear stop token when commit_tablet_job fails" to "[Opt](cloud-sc) Clear stop token when commit_tablet_job fails" Mar 19, 2025
@bobhan1 (Contributor, Author) commented Mar 20, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32647 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f25917a96ba9edc01323fe80a7a45e230f86274b, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	24179	5111	5041	5041
q2	2036	304	175	175
q3	10380	1269	694	694
q4	10222	1022	553	553
q5	7584	2440	2363	2363
q6	189	165	138	138
q7	954	752	632	632
q8	9325	1307	1273	1273
q9	5242	4556	4838	4556
q10	6901	2316	1921	1921
q11	498	280	270	270
q12	365	361	229	229
q13	17781	3740	3093	3093
q14	237	230	216	216
q15	541	473	495	473
q16	650	606	594	594
q17	610	861	356	356
q18	6896	6477	6385	6385
q19	1954	954	571	571
q20	344	327	200	200
q21	2975	2264	1943	1943
q22	1110	1030	971	971
Total cold run time: 110973 ms
Total hot run time: 32647 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5256	5172	5195	5172
q2	234	329	234	234
q3	2158	2634	2279	2279
q4	1425	1806	1421	1421
q5	4234	4234	4404	4234
q6	228	169	130	130
q7	1998	1939	1798	1798
q8	2661	2548	2593	2548
q9	7286	7187	7216	7187
q10	3088	3128	2706	2706
q11	597	533	506	506
q12	693	777	595	595
q13	3533	3990	3340	3340
q14	289	300	287	287
q15	524	483	486	483
q16	643	694	658	658
q17	1156	1640	1332	1332
q18	7886	7668	7503	7503
q19	899	1024	1178	1024
q20	1996	2019	1894	1894
q21	5410	4744	4815	4744
q22	1111	1061	1047	1047
Total cold run time: 53305 ms
Total hot run time: 51122 ms

@doris-robot

TPC-DS: Total hot run time: 192526 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f25917a96ba9edc01323fe80a7a45e230f86274b, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1393	1099	1049	1049
query2	6316	1966	1966	1966
query3	11198	4653	4566	4566
query4	25539	23769	23046	23046
query5	4367	685	482	482
query6	308	194	186	186
query7	3998	506	300	300
query8	298	249	225	225
query9	8477	2622	2612	2612
query10	494	314	261	261
query11	15310	15148	14896	14896
query12	164	111	106	106
query13	1579	533	390	390
query14	9415	6383	6304	6304
query15	212	185	170	170
query16	7443	658	460	460
query17	1203	788	628	628
query18	2135	428	332	332
query19	233	201	195	195
query20	124	123	118	118
query21	209	133	114	114
query22	4512	4557	4337	4337
query23	34236	33548	33550	33548
query24	7722	2528	2470	2470
query25	559	460	409	409
query26	1085	283	166	166
query27	2668	483	345	345
query28	4558	2449	2417	2417
query29	696	594	455	455
query30	281	229	196	196
query31	935	875	816	816
query32	76	69	63	63
query33	527	391	294	294
query34	803	862	541	541
query35	832	855	757	757
query36	985	1016	923	923
query37	128	100	80	80
query38	4298	4231	4308	4231
query39	1496	1433	1467	1433
query40	210	120	117	117
query41	55	56	52	52
query42	123	106	114	106
query43	529	516	491	491
query44	1384	826	813	813
query45	189	206	168	168
query46	894	1050	669	669
query47	1862	1861	1791	1791
query48	396	439	336	336
query49	759	526	464	464
query50	762	788	454	454
query51	4411	4298	4251	4251
query52	107	103	99	99
query53	247	279	196	196
query54	524	521	426	426
query55	90	88	80	80
query56	280	269	256	256
query57	1172	1187	1150	1150
query58	258	238	239	238
query59	2757	3082	2860	2860
query60	295	291	271	271
query61	127	126	128	126
query62	780	779	676	676
query63	246	198	196	196
query64	4047	1066	692	692
query65	4489	4546	4467	4467
query66	927	414	310	310
query67	16405	15433	15354	15354
query68	9703	891	499	499
query69	484	306	267	267
query70	1239	1123	1149	1123
query71	479	311	262	262
query72	5113	3581	3774	3581
query73	799	741	357	357
query74	9026	9162	8774	8774
query75	4243	3149	2666	2666
query76	5097	1198	775	775
query77	973	383	286	286
query78	9927	10143	9175	9175
query79	4803	851	570	570
query80	612	544	474	474
query81	472	254	221	221
query82	196	122	97	97
query83	164	173	158	158
query84	288	96	74	74
query85	753	364	320	320
query86	324	304	302	302
query87	4666	4470	4390	4390
query88	2999	2293	2259	2259
query89	430	319	283	283
query90	2202	226	229	226
query91	141	141	108	108
query92	72	59	58	58
query93	3248	1036	573	573
query94	680	424	309	309
query95	373	278	269	269
query96	511	566	278	278
query97	3306	3338	3287	3287
query98	231	207	211	207
query99	1374	1383	1302	1302
Total cold run time: 286065 ms
Total hot run time: 192526 ms

@doris-robot

ClickBench: Total hot run time: 31.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f25917a96ba9edc01323fe80a7a45e230f86274b, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.03	0.03
query2	0.12	0.10	0.10
query3	0.26	0.19	0.19
query4	1.59	0.20	0.18
query5	0.59	0.60	0.58
query6	1.22	0.72	0.73
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.59	0.53	0.53
query10	0.57	0.59	0.56
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.61	0.60	0.61
query14	2.68	2.69	2.68
query15	0.94	0.86	0.85
query16	0.39	0.38	0.38
query17	1.02	1.04	1.06
query18	0.21	0.20	0.20
query19	1.92	1.94	1.84
query20	0.01	0.01	0.02
query21	15.35	0.94	0.57
query22	0.77	1.16	0.76
query23	14.84	1.39	0.61
query24	7.03	1.56	0.98
query25	0.61	0.21	0.10
query26	0.54	0.16	0.13
query27	0.05	0.06	0.05
query28	10.15	0.90	0.42
query29	12.53	3.93	3.27
query30	0.25	0.09	0.06
query31	2.82	0.58	0.39
query32	3.24	0.55	0.45
query33	2.92	2.99	3.13
query34	15.80	5.16	4.55
query35	4.52	4.57	4.57
query36	0.68	0.50	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.03	0.02
query40	0.18	0.14	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.03
query43	0.03	0.03	0.03
Total cold run time: 105.71 s
Total hot run time: 31.44 s

@bobhan1 (Contributor, Author) commented Mar 20, 2025

run p0

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/25) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 48.79% (13068/26784)
Line Coverage 38.37% (112712/293729)
Region Coverage 37.16% (57281/154158)
Branch Coverage 32.27% (28798/89250)

}
}};
if (_new_tablet->enable_unique_key_merge_on_write()) {
has_stop_token = true;

Should register_compaction_stop_token() be moved here?
Shouldn't the register and unregister operations be requested in the same scope?
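
One way to read this suggestion is a scope guard that pairs the two calls. The sketch below is a hedged illustration, not the code in this PR: `ScopeGuard` and `run_job` are hypothetical, the calls are recorded in an event list instead of going to MS, and the guard here unregisters on both success and failure, which may differ from what the final patch does.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Minimal scope guard: runs `fn` when the guard goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {}
    ~ScopeGuard() {
        if (_fn) {
            _fn();
        }
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _fn;
};

// Hypothetical job body. `is_mow` stands in for
// enable_unique_key_merge_on_write(); `commit_ok` is the commit result.
bool run_job(bool is_mow, bool commit_ok, std::vector<std::string>& events) {
    if (is_mow) {
        // Register and unregister live in the same scope, so every exit
        // path from this block releases the token.
        events.push_back("register");
        ScopeGuard guard([&events] { events.push_back("unregister"); });
        events.push_back("commit");
        return commit_ok;
    }
    // Tables without merge-on-write never take a stop token in this sketch.
    events.push_back("commit");
    return commit_ok;
}
```

With this shape, a failed commit on a merge-on-write tablet records `register`, `commit`, `unregister` in that order, so no separate `has_stop_token` flag is needed to remember whether cleanup is due.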

@github-actions (Contributor)

PR approved by anyone and no changes requested.

@dataroaring (Contributor) left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Mar 26, 2025
@github-actions (Contributor)

PR approved by at least one committer and no changes requested.

@dataroaring dataroaring merged commit 5834fae into apache:master Mar 26, 2025
37 of 40 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 26, 2025
dataroaring pushed a commit that referenced this pull request Mar 27, 2025
… fails #49275 (#49494)

Cherry-picked from #49275

Co-authored-by: bobhan1 <baohan@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…he#49275)
Labels

approved Indicates a PR has been approved by one committer. dev/3.0.5-merged p0_b reviewed