Closed
Description
While running schrodinger-test/bank on clusters deployed on k8s or by TiUP, we found that TiFlash throws the exception shown in the log below.
The log contains the invalid versions 0 and 1000 (rows 8138, 8139, 8141, and 8142, all with pk=654307), and the column whose size does not match the filter is the _INTERNAL_VERSION column:
2021.06.07 11:24:42.057736 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8130/8143] [pk=654302] [ver=425476671480791155]
2021.06.07 11:24:42.057750 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8131/8143] [pk=654302] [ver=425476677287281071]
2021.06.07 11:24:42.057761 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8132/8143] [pk=654303] [ver=425476671480791155]
2021.06.07 11:24:42.057779 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8133/8143] [pk=654304] [ver=425476671480791155]
2021.06.07 11:24:42.057791 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8134/8143] [pk=654304] [ver=425476677247959890]
2021.06.07 11:24:42.057801 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8135/8143] [pk=654305] [ver=425476671480791155]
2021.06.07 11:24:42.057812 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8136/8143] [pk=654306] [ver=425476671480791155]
2021.06.07 11:24:42.057822 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8137/8143] [pk=654306] [ver=425476672254115893]
2021.06.07 11:24:42.057835 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8138/8143] [pk=654307] [ver=0]
2021.06.07 11:24:42.057847 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8139/8143] [pk=654307] [ver=0]
2021.06.07 11:24:42.057857 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8140/8143] [pk=654307] [ver=425476671480791155]
2021.06.07 11:24:42.057868 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8141/8143] [pk=654307] [ver=1000]
2021.06.07 11:24:42.057879 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8142/8143] [pk=654307] [ver=1000]
2021.06.07 11:24:42.066495 [ 7 ] <Error> DB::RawCppPtr DB::PreHandleSnapshot(DB::EngineStoreServerWrap*, DB::BaseBuffView, uint64_t, DB::SSTViewVec, uint64_t, uint64_t): Code: 0, e.displayText() = DB::Exception: The block decoded from SSTFile is not sorted by primary key and version [region 115, applied: term 6 index 49453], e.what() = DB::Exception, Stack trace:
DB::Exception: Size of filter doesn't match size of column.: (while filtering column [name=_INTERNAL_VERSION] [filter_rows=8143] [column_rows=8141] [raw_block_rows=8141])
0. /tiflash/tiflash(StackTrace::StackTrace()+0x15) [0x365ecc5]
1. /tiflash/tiflash(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x25) [0x3655855]
2. /tiflash/tiflash(DB::ColumnVector<unsigned long>::filter(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul> const&, long) const+0x6a2) [0x705f682]
3. /tiflash/tiflash(DB::DM::DMVersionFilterBlockInputStream<1>::read(DB::PODArray<unsigned char, 4096ul, Allocator<false>, 15ul, 16ul>*&, bool)+0x176c) [0x765970c]
4. /tiflash/tiflash(DB::DM::BoundedSSTFilesToBlockInputStream::read()+0x28) [0x77ca358]
5. /tiflash/tiflash(DB::DM::SSTFilesToDTFilesOutputStream::write()+0x71) [0x77cf631]
6. /tiflash/tiflash(DB::KVStore::preHandleSSTsToDTFiles(std::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, DB::DM::FileConvertJobType, DB::TMTContext&)+0x373) [0x7682c63]
7. /tiflash/tiflash(DB::KVStore::preHandleSnapshotToFiles(std::shared_ptr<DB::Region>, DB::SSTViewVec, unsigned long, unsigned long, DB::TMTContext&)+0x45) [0x76833b5]
8. /tiflash/tiflash(DB::PreHandleSnapshot(DB::EngineStoreServerWrap*, DB::BaseBuffView, unsigned long, DB::SSTViewVec, unsigned long, unsigned long)+0xf3) [0x72ce043]
9. /tiflash/libtiflash_proxy.so(+0x12ce7fc) [0x7f14721677fc]
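To make the "not sorted by primary key and version" complaint easier to see, here is a minimal Python sketch that parses the `[Row=…] [pk=…] [ver=…]` lines and reports adjacent rows where the (pk, ver) pair goes backwards. The helper names `parse_rows` and `find_unsorted` are hypothetical and not part of TiFlash. The comparison is an assumption about what the check means: TiFlash's real check may be stricter, for example by also rejecting duplicate (pk, ver) pairs.

```python
import re

# A few of the rows from the log above, copied verbatim.
SAMPLE_LOG = """\
2021.06.07 11:24:42.057822 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8137/8143] [pk=654306] [ver=425476672254115893]
2021.06.07 11:24:42.057835 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8138/8143] [pk=654307] [ver=0]
2021.06.07 11:24:42.057847 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8139/8143] [pk=654307] [ver=0]
2021.06.07 11:24:42.057857 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8140/8143] [pk=654307] [ver=425476671480791155]
2021.06.07 11:24:42.057868 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8141/8143] [pk=654307] [ver=1000]
2021.06.07 11:24:42.057879 [ 7 ] <Error> SSTFilesToDTFilesOutputStream: [Row=8142/8143] [pk=654307] [ver=1000]
"""

ROW_RE = re.compile(r"\[Row=(\d+)/\d+\] \[pk=(-?\d+)\] \[ver=(\d+)\]")


def parse_rows(log_text):
    """Extract (row, pk, ver) tuples from the logged rows, in log order."""
    return [
        (int(m.group(1)), int(m.group(2)), int(m.group(3)))
        for m in ROW_RE.finditer(log_text)
    ]


def find_unsorted(rows):
    """Return (prev_row, row) pairs where (pk, ver) is smaller than in the previous row."""
    bad = []
    for prev, cur in zip(rows, rows[1:]):
        if (cur[1], cur[2]) < (prev[1], prev[2]):
            bad.append((prev[0], cur[0]))
    return bad


if __name__ == "__main__":
    for prev_row, row in find_unsorted(parse_rows(SAMPLE_LOG)):
        print(f"order violated between row {prev_row} and row {row}")
```

On the rows above, the only descending step is from row 8140 (a normal version) to row 8141 (ver=1000), within pk=654307. Rows 8138/8139 and 8141/8142 are exact (pk, ver) duplicates, which this sketch does not flag.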
Dumping the keys from proxy/snap with tools/sst_dump (from https://github.com/tikv/rocksdb/blob/6.4.tikv/tools/sst_dump.cc) shows that some SST files of the lock CF are broken:
./tools/sst_dump --file=tmp/snap/snap1/rev_115_6_49453_lock.sst --command=scan --output_hex
sst_dump: rocksdb/table/block_based/data_block_hash_index.cc:80: void rocksdb::DataBlockHashIndex::Initialize(const char*, uint16_t, uint16_t*): Assertion `size > num_buckets_ * sizeof(uint8_t)' failed.
b.sh: line 16: 67605 Aborted ./tools/sst_dump --file=${file} --command=scan --output_hex > ${kv_file}
./tools/sst_dump --file=tmp/snap/snap2/rev_140_6_466928_lock.sst --command=scan --output_hex
tmp/snap/snap2/rev_140_6_466928_lock.sst: Corruption: Bad table magic number: expected 9863518390377041911, found 1127499697199335 in tmp/snap/snap2/rev_140_6_466928_lock.sst
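As a quick triage step before running sst_dump on every snapshot file, the magic number can be checked directly. The following Python sketch assumes that the table magic number is stored in the last 8 bytes of the file as a little-endian 64-bit integer, and takes the expected value from the sst_dump error message above. The function names are hypothetical. A matching magic number does not prove the file is intact (the first file above fails inside a data block, not at the footer), so this only catches the "Bad table magic number" kind of corruption.

```python
import struct

# Expected value taken from the sst_dump error message above.
EXPECTED_MAGIC = 9863518390377041911


def read_table_magic(path):
    """Return the trailing 8 bytes of the file as an unsigned little-endian
    integer, or None if the file is shorter than 8 bytes.

    Assumption: the magic number is the last field of the SST footer.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)          # seek to end of file
        size = f.tell()
        if size < 8:
            return None
        f.seek(size - 8)
        return struct.unpack("<Q", f.read(8))[0]


def has_expected_magic(path):
    """True if the file ends with the expected table magic number."""
    return read_table_magic(path) == EXPECTED_MAGIC
```

For example, `has_expected_magic("tmp/snap/snap2/rev_140_6_466928_lock.sst")` would be expected to return False for the second file above, given the value sst_dump reported for it.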