Skip to content

TiFlash slow query cause by read index timeout #10495

@JaySon-Huang

Description

@JaySon-Huang

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

tiflash read index meet 10 seconds timeout at 20:35:14, region_id=4418357

[2025/10/18 20:35:14.065 +00:00] [INFO] [LearnerReadWorker.cpp:348] ["[Learner Read] Batch read index, num_regions=8883 num_requests=8883 num_stale_read=0 num_cached_index=0 num_unavailable=1 cost=10010ms, start_ts=461588320446317315"] [source="MPP<gather_id:1, query_ts:1760819703882031990, local_query_id:1519, server_id:1353, start_ts:461588320446317315,task_id:2>"] [thread_id=1113]
[2025/10/18 20:35:14.131 +00:00] [INFO] [RemoteRequest.cpp:35] ["Start to build remote request for 1 regions (4418357) for table 8138"] [source="MPP<gather_id:1, query_ts:1760819703882031990, local_query_id:1519, server_id:1353, start_ts:461588320446317315,task_id:2>"] [thread_id=1113]

check the tikv logging, we can see that 20:35:04, tikv happen to begin transfer leader

[2025/10/18 20:35:04.087 +00:00] [INFO] [pd.rs:1639] ["try to transfer leader"] [to_peers="[]"] [to_peer="id: 317724740 store_id: 317723969"] [from_peer="id: 317835203 store_id: 17"] [region_id=4418357] [thread_id=52]
[2025/10/18 20:35:04.089 +00:00] [INFO] [peer.rs:3982] ["transfer leader"] [peer="id: 317724740 store_id: 317723969"] [peer_id=317835203] [region_id=4418357] [thread_id=230]
[2025/10/18 20:35:04.089 +00:00] [INFO] [raft.rs:1957] ["[term 102] starts to transfer leadership to 317724740"] [lead_transferee=317724740] [term=102] [raft_id=317835203] [region_id=4418357] [thread_id=230]
[2025/10/18 20:35:04.089 +00:00] [INFO] [raft.rs:1970] ["sends MsgTimeoutNow to 317724740 immediately as 317724740 already has up-to-date log"] [lead_transferee=317724740] [raft_id=317835203] [region_id=4418357] [thread_id=230]
[2025/10/18 20:35:04.091 +00:00] [INFO] [raft.rs:1399] ["received a message with higher term from 317724740"] ["msg type"=MsgRequestVote] [message_term=103] [term=102] [from=317724740] [raft_id=317835203] [region_id=4418357] [thread_id=230]Show context
[2025/10/18 20:35:04.091 +00:00] [INFO] [raft.rs:1162] ["became follower at term 103"] [from_role=Leader] [term=103] [raft_id=317835203] [region_id=4418357] [thread_id=230]
[2025/10/18 20:35:04.091 +00:00] [INFO] [raft.rs:1618] ["[logterm: 102, index: 1509868, vote: 0] cast vote for 317724740 [logterm: 102, index: 1509868] at term 103"] ["msg type"=MsgRequestVote] [term=103] [msg_index=1509868] [msg_term=102] [from=317724740] [vote=0] [log_index=1509868] [log_term=102] [raft_id=317835203] [region_id=4418357] [thread_id=230]

Another similar logging

tiflash read index timeout at 07:37:46

[2025/10/21 07:37:46.389 +00:00] [INFO] [LearnerReadWorker.cpp:348] ["[Learner Read] Batch read index, num_regions=10232 num_requests=10232 num_stale_read=0 num_cached_index=0 num_unavailable=1 cost=10012ms, start_ts=461644039756448615"] [source="MPP<gather_id:1, query_ts:1761032256196950457, local_query_id:3059, server_id:2006, start_ts:461644039756448615,task_id:7>"] [thread_id=1650]
[2025/10/21 07:37:46.398 +00:00] [INFO] [LearnerReadWorker.cpp:450] ["[Learner Read] Finish wait index and resolve locks, wait_cost=8ms n_regions=10232 n_unavailable=1, start_ts=461644039756448615"] [source="MPP<gather_id:1, query_ts:1761032256196950457, local_query_id:3059, server_id:2006, start_ts:461644039756448615,task_id:7>"] [thread_id=1650]
[2025/10/21 07:37:46.460 +00:00] [INFO] [RemoteRequest.cpp:35] ["Start to build remote request for 1 regions (4453796) for table 8138"] [source="MPP<gather_id:1, query_ts:1761032256196950457, local_query_id:3059, server_id:2006, start_ts:461644039756448615,task_id:7>"] [thread_id=1650]

tikv happen to begin transfer leader at 07:37:36

[2025/10/21 07:37:36.406 +00:00] [INFO] [pd.rs:1639] ["try to transfer leader"] [to_peers="[id: 4465009 store_id: 2, id: 4465008 store_id: 26]"] [to_peer="id: 4465009 store_id: 2"] [from_peer="id: 317820770 store_id: 317723963"] [region_id=4453796] [thread_id=37]
[2025/10/21 07:37:36.407 +00:00] [INFO] [peer.rs:3982] ["transfer leader"] [peer="id: 4465009 store_id: 2"] [peer_id=317820770] [region_id=4453796] [thread_id=215]
[2025/10/21 07:37:36.407 +00:00] [INFO] [raft.rs:1957] ["[term 12] starts to transfer leadership to 4465009"] [lead_transferee=4465009] [term=12] [raft_id=317820770] [region_id=4453796] [thread_id=215]
[2025/10/21 07:37:36.407 +00:00] [INFO] [raft.rs:1970] ["sends MsgTimeoutNow to 4465009 immediately as 4465009 already has up-to-date log"] [lead_transferee=4465009] [raft_id=317820770] [region_id=4453796] [thread_id=215]
[2025/10/21 07:37:36.408 +00:00] [INFO] [raft.rs:1399] ["received a message with higher term from 4465009"] ["msg type"=MsgRequestVote] [message_term=13] [term=12] [from=4465009] [raft_id=317820770] [region_id=4453796] [thread_id=215]
[2025/10/21 07:37:36.408 +00:00] [INFO] [raft.rs:1162] ["became follower at term 13"] [from_role=Leader] [term=13] [raft_id=317820770] [region_id=4453796] [thread_id=215]
[2025/10/21 07:37:36.409 +00:00] [INFO] [raft.rs:1618] ["[logterm: 12, index: 32, vote: 0] cast vote for 4465009 [logterm: 12, index: 32] at term 13"] ["msg type"=MsgRequestVote] [term=13] [msg_index=32] [msg_term=12] [from=4465009] [vote=0] [log_index=32] [log_term=12] [raft_id=317820770] [region_id=4453796] [thread_id=215]Show context

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

v8.5.3

Metadata

Metadata

Assignees

No one assigned

    Labels

    affects-7.5This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.component/storageseverity/moderatetype/bugThe issue is confirmed as a bug.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions