@github-actions
Contributor

Cherry-picked from #56301

### What problem does this PR solve?

Problem Summary:

Adds the bvar `file_cache_event_driven_warm_up_skipped_rowset_num` to
the source cluster Backend to improve observability of the cache warmup
process.

During event-driven warmup, rowsets can be skipped if the Backend fails
to find the tablet's replica location information from a
`TGetTabletReplicaInfosRequest`.

This new metric makes it possible to monitor and alert on these events,
which helps in diagnosing incomplete cache warmups.
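
The skip-and-count behavior described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the real metric is a bvar, whereas this sketch uses a `std::atomic` counter as a stand-in so it compiles without brpc. `ReplicaInfo`, `ReplicaInfoMap`, and `find_replicas_or_skip` are hypothetical names, and the map stands in for whatever the Backend receives in answer to a `TGetTabletReplicaInfosRequest`.

```cpp
#include <atomic>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the bvar added by this PR (assumption: std::atomic replaces
// the real bvar type so the sketch has no brpc dependency).
std::atomic<uint64_t> g_file_cache_event_driven_warm_up_skipped_rowset_num{0};

// Hypothetical, simplified replica location (not the Thrift-generated type).
struct ReplicaInfo {
    std::string host;
    int32_t brpc_port;
};

// Hypothetical lookup result keyed by tablet id.
using ReplicaInfoMap = std::unordered_map<int64_t, std::vector<ReplicaInfo>>;

// Copies the tablet's replicas into *replicas and returns true, or returns
// false when no location info exists and the rowset has to be skipped.
bool find_replicas_or_skip(const ReplicaInfoMap& replica_infos, int64_t tablet_id,
                           std::vector<ReplicaInfo>* replicas) {
    auto it = replica_infos.find(tablet_id);
    if (it == replica_infos.end() || it->second.empty()) {
        // No replica location info: record the skipped rowset in the metric.
        g_file_cache_event_driven_warm_up_skipped_rowset_num.fetch_add(
                1, std::memory_order_relaxed);
        return false;
    }
    *replicas = it->second;
    return true;
}
```

In this sketch the counter only ever increases, so an alert would typically watch for growth over a time window rather than compare against a fixed value.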
@github-actions github-actions bot requested a review from morrySnow as a code owner September 24, 2025 03:42
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Sep 24, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 32705 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c70e62c709ad3eb09b0ee33db4d30d26dbe37879, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17585	5648	5371	5371
q2	2037	392	288	288
q3	12472	1241	745	745
q4	10520	868	474	474
q5	9434	2363	2149	2149
q6	183	167	132	132
q7	913	761	609	609
q8	9356	1448	1193	1193
q9	5325	5016	4957	4957
q10	6778	2290	1815	1815
q11	475	278	266	266
q12	333	361	213	213
q13	17765	3621	3013	3013
q14	215	225	209	209
q15	521	469	468	468
q16	411	418	376	376
q17	617	861	371	371
q18	6836	6351	6399	6351
q19	1210	963	546	546
q20	339	346	210	210
q21	2793	2152	1960	1960
q22	1044	1029	989	989
Total cold run time: 107162 ms
Total hot run time: 32705 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5514	5475	5502	5475
q2	235	332	234	234
q3	2254	2667	2308	2308
q4	1390	1872	1451	1451
q5	4432	5046	5078	5046
q6	179	179	139	139
q7	2159	2059	1844	1844
q8	2693	2879	2801	2801
q9	7346	7424	7311	7311
q10	3020	3298	2716	2716
q11	579	522	491	491
q12	682	745	617	617
q13	3354	3789	3130	3130
q14	272	315	279	279
q15	527	464	482	464
q16	425	494	430	430
q17	1213	1754	1257	1257
q18	7599	7401	7373	7373
q19	799	1142	1121	1121
q20	1986	2044	1869	1869
q21	5408	4960	4660	4660
q22	1085	1062	1037	1037
Total cold run time: 53151 ms
Total hot run time: 52053 ms

@doris-robot

TPC-DS: Total hot run time: 193131 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c70e62c709ad3eb09b0ee33db4d30d26dbe37879, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	964	408	414	408
query2	6144	1933	1832	1832
query3	8706	196	191	191
query4	33945	23830	23419	23419
query5	3681	601	445	445
query6	291	201	173	173
query7	4208	485	312	312
query8	300	235	238	235
query9	9198	2596	2609	2596
query10	465	327	258	258
query11	17944	15793	15776	15776
query12	164	109	107	107
query13	1566	544	408	408
query14	9347	7207	7143	7143
query15	244	189	177	177
query16	7985	643	492	492
query17	1538	778	612	612
query18	2141	434	320	320
query19	245	189	168	168
query20	136	132	124	124
query21	210	125	110	110
query22	4521	4610	4410	4410
query23	35436	34527	33889	33889
query24	7328	2690	2739	2690
query25	548	500	436	436
query26	1147	290	173	173
query27	2101	477	348	348
query28	5659	2265	2232	2232
query29	791	625	461	461
query30	237	203	162	162
query31	977	925	851	851
query32	90	64	61	61
query33	507	390	344	344
query34	756	881	518	518
query35	788	810	754	754
query36	997	1095	968	968
query37	109	90	70	70
query38	3974	4019	4050	4019
query39	1545	1490	1529	1490
query40	209	122	105	105
query41	49	48	46	46
query42	129	108	102	102
query43	500	524	489	489
query44	1388	822	837	822
query45	187	176	173	173
query46	888	1046	689	689
query47	1993	2022	1967	1967
query48	406	428	360	360
query49	795	498	399	399
query50	671	709	441	441
query51	7360	7426	7379	7379
query52	105	101	98	98
query53	231	277	191	191
query54	571	559	464	464
query55	80	74	76	74
query56	267	265	245	245
query57	1287	1277	1216	1216
query58	241	214	215	214
query59	3064	3155	3048	3048
query60	289	290	266	266
query61	117	118	138	118
query62	789	746	714	714
query63	235	197	204	197
query64	4497	1021	658	658
query65	3400	3301	3282	3282
query66	1024	414	306	306
query67	16261	15861	15755	15755
query68	7636	841	557	557
query69	503	297	264	264
query70	1183	1166	1132	1132
query71	379	292	266	266
query72	5732	3728	3919	3728
query73	629	744	352	352
query74	10418	9125	8992	8992
query75	3224	3169	2652	2652
query76	2996	1158	777	777
query77	503	372	284	284
query78	10264	10365	9597	9597
query79	3820	848	589	589
query80	781	530	435	435
query81	496	255	218	218
query82	558	120	86	86
query83	185	162	140	140
query84	283	94	86	86
query85	767	368	309	309
query86	343	319	304	304
query87	4326	4311	4194	4194
query88	5189	2405	2404	2404
query89	400	337	303	303
query90	1840	188	188	188
query91	135	144	111	111
query92	62	55	50	50
query93	2015	892	557	557
query94	700	402	309	309
query95	335	276	270	270
query96	486	601	287	287
query97	3191	3237	3164	3164
query98	231	210	195	195
query99	1564	1412	1331	1331
Total cold run time: 295140 ms
Total hot run time: 193131 ms

@doris-robot

ClickBench: Total hot run time: 29.3 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c70e62c709ad3eb09b0ee33db4d30d26dbe37879, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.02
query2	0.07	0.03	0.03
query3	0.24	0.07	0.07
query4	1.62	0.11	0.10
query5	0.52	0.52	0.53
query6	1.12	0.74	0.73
query7	0.02	0.01	0.02
query8	0.05	0.03	0.04
query9	0.60	0.50	0.50
query10	0.56	0.54	0.54
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.62	0.62	0.60
query14	0.79	0.78	0.81
query15	0.85	0.84	0.83
query16	0.37	0.41	0.39
query17	1.08	1.05	1.10
query18	0.24	0.23	0.22
query19	1.91	1.89	1.81
query20	0.02	0.01	0.01
query21	15.40	0.91	0.58
query22	0.73	0.90	0.70
query23	14.95	1.43	0.54
query24	3.25	1.22	1.30
query25	0.20	0.11	0.11
query26	0.30	0.16	0.13
query27	0.07	0.07	0.04
query28	13.53	0.98	0.43
query29	12.59	3.92	3.28
query30	0.25	0.09	0.06
query31	2.84	0.60	0.40
query32	3.23	0.54	0.47
query33	3.03	2.99	3.07
query34	16.56	5.19	4.56
query35	4.59	4.54	4.55
query36	0.63	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.03	0.03
query39	0.03	0.03	0.02
query40	0.17	0.13	0.12
query41	0.08	0.03	0.03
query42	0.03	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 103.63 s
Total hot run time: 29.3 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.57% (12785/28053)
Line Coverage 36.41% (114049/313248)
Region Coverage 34.02% (65197/191640)
Branch Coverage 31.05% (34216/110202)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 66.15% (18250/27590)
Line Coverage 57.78% (180436/312282)
Region Coverage 55.49% (106770/192417)
Branch Coverage 49.87% (55212/110716)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 66.15% (18250/27590)
Line Coverage 57.76% (180359/312282)
Region Coverage 55.46% (106721/192417)
Branch Coverage 49.85% (55191/110716)

@morrySnow morrySnow merged commit da836fa into branch-3.1 Sep 25, 2025
22 of 24 checks passed
@morrySnow morrySnow deleted the auto-pick-56301-branch-3.1 branch September 25, 2025 09:36
@morrySnow morrySnow mentioned this pull request Oct 23, 2025