
TiFlash panic with "Detected illegal region boundary" #10147

@JaySon-Huang

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

No easy way to reproduce this has been found yet.

2. What did you expect to see? (Required)

3. What did you see instead (Required)

There is a Region with the following range:

{"id":472129417,"start_key":"7480000000000B01FFE75F720000000000FA","end_key":"7480000000000B01FFE75F72F800000000FF05613A0100000000FB", ...}

The end_key cannot be decoded as "t${tableID}_r${rowID}": after the 8-byte row ID there is an extra "\x01" suffix.

# start_key
>  mok 7480000000000B01FFE75F720000000000FA
"7480000000000B01FFE75F720000000000FA"
└─## decode hex key
  └─"t\200\000\000\000\000\013\001\377\347_r\000\000\000\000\000\372"
    ├─## decode mvcc key
    │ └─"t\200\000\000\000\000\013\001\347_r"
    │   └─## table prefix
    │     └─table: 721383
# end_key
>  mok 7480000000000B01FFE75F72F800000000FF05613A0100000000FB
"7480000000000B01FFE75F72F800000000FF05613A0100000000FB"
└─## decode hex key
  └─"t\200\000\000\000\000\013\001\377\347_r\370\000\000\000\000\377\005a:\001\000\000\000\000\373"
    ├─## decode mvcc key
    │ └─"t\200\000\000\000\000\013\001\347_r\370\000\000\000\000\005a:\001"
    │   ├─## table prefix
    │   │ └─table: 721383
    │   └─## table row key
    │     ├─table: 721383
    │     └─"\370\000\000\000\000\005a:\001"
    │       └─## decode index values
    │         └─"\370\000\000\000\000\005a:\001"
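The mok output above can be cross-checked with a short Python sketch. It assumes the usual TiKV key layout: memcomparable groups of 8 bytes followed by a marker byte, then `t` + 8-byte table ID + `_r` + 8-byte row ID, with the sign bit flipped on both integers. The helper names `decode_memcomparable` and `split_record_key` are made up for this illustration; they are not part of mok, TiKV, or TiFlash.

```python
from typing import Optional, Tuple

GROUP = 8          # payload bytes per memcomparable group
MARKER = 0xFF      # marker byte of a full (unpadded) group
SIGN_BIT = 1 << 63


def decode_memcomparable(encoded: bytes) -> bytes:
    """Undo the group-of-8 padding (assumed layout, see lead-in)."""
    out = bytearray()
    for i in range(0, len(encoded), GROUP + 1):
        chunk = encoded[i:i + GROUP + 1]
        if len(chunk) != GROUP + 1:
            raise ValueError("truncated group")
        pad = MARKER - chunk[GROUP]  # number of zero padding bytes in this group
        if pad > GROUP or any(chunk[GROUP - pad:GROUP]):
            raise ValueError("invalid padding")
        out += chunk[:GROUP - pad]
        if pad > 0:  # a padded group is the last one
            break
    return bytes(out)


def split_record_key(raw: bytes) -> Tuple[int, Optional[int], bytes]:
    """Return (table_id, row_id or None, bytes left over after the row ID)."""
    if raw[:1] != b"t" or raw[9:11] != b"_r":
        raise ValueError("not a table record key")
    table_id = int.from_bytes(raw[1:9], "big") - SIGN_BIT
    rest = raw[11:]
    if len(rest) < 8:
        return table_id, None, rest
    row_id = int.from_bytes(rest[:8], "big") - SIGN_BIT
    return table_id, row_id, rest[8:]


start = decode_memcomparable(bytes.fromhex("7480000000000B01FFE75F720000000000FA"))
end = decode_memcomparable(
    bytes.fromhex("7480000000000B01FFE75F72F800000000FF05613A0100000000FB")
)
print(split_record_key(start))
print(split_record_key(end))
```

Under these assumptions the start_key decodes to table 721383 with no row ID, while the end_key decodes to the same table, an 8-byte row ID, and one leftover byte `b"\x01"`, which is the suffix described above.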

When TiFlash handles the Region snapshot from TiKV, it detects that the file_range contains "more rows" than the region_range covers, so TiFlash panics.

[2025/05/01 08:13:23.231 +00:00] [FATAL] [Exception.cpp:106] ["Code: 49, e.displayText() = DB::Exception: Check compare(range.getStart(), ext_file.range.getStart()) <= 0 && compare(range.getEnd(), ext_file.range.getEnd()) >= 0 failed: Detected illegal region boundary: range=4294967295 file_range=721383 keyspace=[?,?) table_id=[?,?). TiFlash will exit to prevent data inconsistency. If you accept data inconsistency and want to continue the service, set profiles.default.dt_enable_ingest_check=false .: (while applyPreHandledSnapshot region_id=472129417), e.what() = DB::Exception, Stack trace:
       0x1b76450    DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) [tiflash+28795984]
                    dbms/src/Common/Exception.h:46
       0x692c664    DB::DM::DeltaMergeStore::ingestFiles(std::__1::shared_ptr<DB::DM::DMContext> const&, DB::DM::RowKeyRange const&, std::__1::vector<DB::DM::ExternalDTFileInfo, std::__1::allocator<DB::DM::ExternalDTFileInfo> > const&, bool) [tiflash+110282340]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore_Ingest.cpp:612
       0x72d4d90    DB::DM::DeltaMergeStore::ingestFiles(DB::Context const&, DB::Settings const&, DB::DM::RowKeyRange const&, std::__1::vector<DB::DM::ExternalDTFileInfo, std::__1::allocator<DB::DM::ExternalDTFileInfo> > const&, bool) [tiflash+120409488]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.h:333
       0x7bf0f30    void DB::KVStore::checkAndApplyPreHandledSnapshot<DB::RegionPtrWithSnapshotFiles>(DB::RegionPtrWithSnapshotFiles const&, DB::TMTContext&) [tiflash+129961776]
                    dbms/src/Storages/KVStore/MultiRaft/ApplySnapshot.cpp:117
       0x7bef94c    void DB::KVStore::applyPreHandledSnapshot<DB::RegionPtrWithSnapshotFiles>(DB::RegionPtrWithSnapshotFiles const&, DB::TMTContext&) [tiflash+129956172]
                    dbms/src/Storages/KVStore/MultiRaft/ApplySnapshot.cpp:307
       0x7be4cb0    ApplyPreHandledSnapshot [tiflash+129911984]
                    dbms/src/Storages/KVStore/FFI/ProxyFFI.cpp:693
  0xffffb6cbe4d0    _$LT$engine_store_ffi..observer..TiFlashObserver$LT$T$C$ER$GT$$u20$as$u20$raftstore..coprocessor..ApplySnapshotObserver$GT$::post_apply_snapshot::h97e72c0ec4c7039b [libtiflash_proxy.so+25818320]
  0xffffb7b2c6e0    raftstore::store::worker::region::Runner$LT$EK$C$R$C$T$GT$::handle_pending_applies::h05bf95f1570f6c67 [libtiflash_proxy.so+40949472]
  0xffffb71db6ac    yatp::task::future::RawTask$LT$F$GT$::poll::h15383bccb4d932ed [libtiflash_proxy.so+31180460]
  0xffffb88c3214    _$LT$yatp..task..future..Runner$u20$as$u20$yatp..pool..runner..Runner$GT$::handle::h5adbfadd82cb614e [libtiflash_proxy.so+55198228]
  0xffffb6db7724    std::sys_common::backtrace::__rust_begin_short_backtrace::hcea1723f02df148f [libtiflash_proxy.so+26838820]
  0xffffb6deebc4    core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h3aa4e6cbbdc368f4 [libtiflash_proxy.so+27065284]
  0xffffb812a3c0    std::sys::unix::thread::Thread::new::thread_start::hcc916efc5918f503 [libtiflash_proxy.so+47231936]
  0xffffb50de6b8    start_thread [libc.so.6+526008]
  0xffffb5148bdc    thread_start [libc.so.6+961500]"] [source="void DB::ApplyPreHandledSnapshot(DB::EngineStoreServerWrap *, DB::RawVoidPtr, DB::RawCppPtrType)"] [thread_id=5922]
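The condition in the FATAL message, `compare(range.getStart(), ext_file.range.getStart()) <= 0 && compare(range.getEnd(), ext_file.range.getEnd()) >= 0`, is a range-containment check. Below is a minimal sketch of that idea in Python. It assumes half-open ranges over raw byte keys compared lexicographically; `KeyRange` and `covers` are hypothetical names and the keys are invented, so this does not reproduce TiFlash's `RowKeyRange` logic or the actual root cause.

```python
from typing import NamedTuple


class KeyRange(NamedTuple):
    start: bytes  # inclusive
    end: bytes    # exclusive


def covers(outer: KeyRange, inner: KeyRange) -> bool:
    """True if `outer` fully contains `inner` (bytes compare lexicographically)."""
    return outer.start <= inner.start and outer.end >= inner.end


# Hypothetical keys: the file range ends on a key one byte longer than the
# region's end, so it extends past the region and the check fails.
region_range = KeyRange(b"r\x00", b"r\x05")
file_range = KeyRange(b"r\x01", b"r\x05\x01")
print(covers(region_range, file_range))
```

In this toy example the check fails because `b"r\x05\x01"` sorts after `b"r\x05"`. It only shows how a one-byte difference at a range end can trip such a check.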

4. What is your TiFlash version? (Required)

v7.5.6

Metadata

    Labels

    affects-6.1: This bug affects the 6.1.x(LTS) versions.
    affects-6.5: This bug affects the 6.5.x(LTS) versions.
    affects-7.1: This bug affects the 7.1.x(LTS) versions.
    affects-7.5: This bug affects the 7.5.x(LTS) versions.
    affects-8.1: This bug affects the 8.1.x(LTS) versions.
    affects-8.5: This bug affects the 8.5.x(LTS) versions.
    component/storage
    impact/panic
    impact/upgrade
    report/customer: Customers have encountered this bug.
    severity/critical
    type/bug: The issue is confirmed as a bug.
