Skip to content

raftstore: incorrect and misleading index logging in StoreMsg when slow logs exist #18561

@LykxSassinator

Description

@LykxSassinator

Bug Report

When slow logs exist in the processing of StoreFsm::handle_msgs, the relevant raftstore
will print the corresponding StoreMsg for the developer to confirm which is the slowest
StoreMsg.

However, the current implementation prints the StoreMsg with incorrect index, generating
misleading debugging log.

      // for `StoreFsm::handle_msgs`
      let mut distribution = [0; StoreMsg::<EK>::COUNT];
      for m in msgs.drain(..) {
            distribution[m.discriminant()] += 1;
            ...
      }
       slow_log!(
            T timer,
            "[store {}] handle {} store messages {:?}",
            self.fsm.store.id,
            count,
            StoreMsg::<EK>::VARIANTS.iter().zip(distribution).filter(|(_, c)| *c > 0).format(", "),
        );

The root cause is that the m.discriminant() mismatches with the definition of StoreMsg:

Image

Image

For example, the slow StoreMsg should be StoreMsg::CompactEvent rather than StoreMsg::Unreachable,
but the logs are:

[2025/06/15 18:16:38.328 +00:00] [WARN] [store.rs:902] ["[store 22] handle 1 store messages (\"StoreUnreachable\", 1)"] [takes=1330] [thread_id=204]

What version of TiKV are you using?

v7.5.6

What operating system and CPU are you using?

Centos 7

Steps to reproduce

What did you expect?

The slow log should be StoreMsg::CompactEvent rather than StoreMsg::Unreachable.

What did happened?

The slow log is StoreMsg::Unreachable.

Metadata

Metadata

Labels

affects-7.5This bug affects the 7.5.x(LTS) versions.affects-8.1This bug affects the 8.1.x(LTS) versions.affects-8.5This bug affects the 8.5.x(LTS) versions.report/customerCustomers have encountered this bug.severity/minortype/bugThe issue is confirmed as a bug.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions