-
Notifications
You must be signed in to change notification settings - Fork 8.3k
LOGICAL_ERROR: Part {} remains in ZooKeeper after DROP_RANGE {} #56037
Copy link
Copy link
Closed as duplicate of#92850
Copy link
Labels
fuzzProblem found by one of the fuzzersProblem found by one of the fuzzers
Description
Logical error:
2023.10.25 21:56:53.451524 [ 1387 ] {} <Error> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(LogEntryPtr &) const: Code: 49. DB::Exception: Part all_15_15_0 remains in ZooKeeper after DROP_RANGE all_0_15_999999999_999999999. (LOGICAL_ERROR), Stack trace (when copying this message, always include the lines below):
0. ./build_docker/./src/Common/Exception.cpp:98: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c741d97 in /usr/lib/debug/usr/bin/clickhouse.debug
1. DB::Exception::Exception<String const&, String const&>(int, FormatStringHelperImpl<std::type_identity<String const&>::type, std::type_identity<String const&>::type>, String const&, String const&) @ 0x000000000765b5c7 in /usr/lib/debug/usr/bin/clickhouse.debug
2. ./build_docker/./src/Storages/StorageReplicatedMergeTree.cpp:2058: DB::paranoidCheckForCoveredPartsInZooKeeper(std::shared_ptr<zkutil::ZooKeeper> const&, String const&, StrongTypedef<unsigned int, DB::MergeTreeDataFormatVersionTag>, String const&) @ 0x000000001297afcd in /usr/lib/debug/usr/bin/clickhouse.debug
3. ./build_docker/./contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:701: DB::StorageReplicatedMergeTree::executeDropRange(DB::ReplicatedMergeTreeLogEntry const&) @ 0x0000000012952a72 in /usr/lib/debug/usr/bin/clickhouse.debug
4. ./build_docker/./src/Storages/StorageReplicatedMergeTree.cpp:0: DB::StorageReplicatedMergeTree::executeLogEntry(DB::ReplicatedMergeTreeLogEntry&) @ 0x0000000012949fa6 in /usr/lib/debug/usr/bin/clickhouse.debug
5. ./build_docker/./contrib/llvm-project/libcxx/include/__functional/function.h:716: bool std::__function::__policy_invoker<bool (std::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&)>::__call_impl<std::__function::__default_alloc_func<DB::StorageReplicatedMergeTree::processQueueEntry(std::shared_ptr<DB::ReplicatedMergeTreeQueue::SelectedEntry>)::$_1, bool (std::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&)>>(std::__function::__policy_storage const*, std::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&) @ 0x0000000012a5285f in /usr/lib/debug/usr/bin/clickhouse.debug
6. ./build_docker/./contrib/llvm-project/libcxx/include/__functional/function.h:0: DB::ReplicatedMergeTreeQueue::processEntry(std::function<std::shared_ptr<zkutil::ZooKeeper> ()>, std::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&, std::function<bool (std::shared_ptr<DB::ReplicatedMergeTreeLogEntry>&)>) @ 0x00000000130b6e46 in /usr/lib/debug/usr/bin/clickhouse.debug
7. ./build_docker/./src/Storages/StorageReplicatedMergeTree.cpp:3275: DB::StorageReplicatedMergeTree::processQueueEntry(std::shared_ptr<DB::ReplicatedMergeTreeQueue::SelectedEntry>) @ 0x0000000012993efc in /usr/lib/debug/usr/bin/clickhouse.debug
8. ./build_docker/./src/Storages/MergeTree/IExecutableTask.h:61: DB::ExecutableLambdaAdapter::executeStep() @ 0x0000000012a533b2 in /usr/lib/debug/usr/bin/clickhouse.debug
9. ./build_docker/./src/Storages/MergeTree/MergeTreeBackgroundExecutor.cpp:284: DB::MergeTreeBackgroundExecutor<DB::RoundRobinRuntimeQueue>::threadFunction() @ 0x0000000012da88d7 in /usr/lib/debug/usr/bin/clickhouse.debug
10. ./build_docker/./base/base/../base/wide_integer_impl.h:809: ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000c828e7f in /usr/lib/debug/usr/bin/clickhouse.debug
11. ./build_docker/./src/Common/ThreadPool.cpp:0: void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000c82c99c in /usr/lib/debug/usr/bin/clickhouse.debug
12. ./build_docker/./base/base/../base/wide_integer_impl.h:809: void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000c82b1c7 in /usr/lib/debug/usr/bin/clickhouse.debug
13. ? @ 0x00007f0806d14ac3 in ?
14. ? @ 0x00007f0806da6a40 in ?
(version 23.9.2.56 (official build))
It seems the parts were just attached:
2023.10.25 21:56:52.366189 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> executeQuery: (from [::1]:32968) (comment: 02443_detach_attach_partition.sh) ALTER TABLE alter_table1 ATTACH PARTITION ID 'all'; (stage: Complete)
2023.10.25 21:56:52.366267 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> ContextAccess (default): Access granted: INSERT ON test_bcyivuz0.alter_table1
2023.10.25 21:56:52.366295 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Will wait for outdated data parts to be loaded
2023.10.25 21:56:52.366303 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Finished waiting for outdated data parts to be loaded
2023.10.25 21:56:52.366310 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Looking for parts for partition all in detached/
2023.10.25 21:56:52.366364 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Found part all_4_4_0
2023.10.25 21:56:52.366370 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Found part all_5_5_0
2023.10.25 21:56:52.366376 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): 2 of them are active
2023.10.25 21:56:52.366381 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Found containing part all_4_4_0 for part all_4_4_0
2023.10.25 21:56:52.366387 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Found containing part all_5_5_0 for part all_5_5_0
2023.10.25 21:56:52.366460 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Checking 2 parts
2023.10.25 21:56:52.366465 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Checking part attaching_all_4_4_0
2023.10.25 21:56:52.477685 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Checking part attaching_all_5_5_0
2023.10.25 21:56:52.596769 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Renaming temporary part detached/attaching_all_4_4_0 to all_13_13_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
2023.10.25 21:56:52.596902 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Trying to create zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_13_13_0/test_ofe_wlmhgofcxrmgjuiohqolobwqxheqn/r2 with hardlinks []
2023.10.25 21:56:52.616335 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_13_13_0/test_ofe_wlmhgofcxrmgjuiohqolobwqxheqn/r2 created
2023.10.25 21:56:52.656722 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Information> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094) (Replicated OutputStream): Block with ID all_B74190DCFE9D27A082DDACADAF0D7F1E already exists (it was just appeared). Renaming part all_13_13_0 back to detached/attaching_all_4_4_0. Will retry write.
2023.10.25 21:56:52.656743 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Undoing transaction. Rollbacking parts state to temporary and removing from working set: all_13_13_0.
2023.10.25 21:56:52.693374 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> createEphemeralLockInZooKeeper: Deduplication path already exists: deduplication_path=/clickhouse/tables/02443_detach_attach_partition_test_bcyivuz0/alter_table/blocks/all_B74190DCFE9D27A082DDACADAF0D7F1E
2023.10.25 21:56:52.696336 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Information> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094) (Replicated OutputStream): Block with ID all_B74190DCFE9D27A082DDACADAF0D7F1E already exists on other replicas as part all_12_12_0; will write it locally with that name.
2023.10.25 21:56:52.709639 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094) (Replicated OutputStream): Will check part all_12_12_0 checksums
2023.10.25 21:56:52.745089 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Renaming temporary part attaching_all_4_4_0 to all_12_12_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
2023.10.25 21:56:52.756281 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Trying to create zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_12_12_0/test_ofe_wlmhgofcxrmgjuiohqolobwqxheqn/r2 with hardlinks []
2023.10.25 21:56:52.773628 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_12_12_0/test_ofe_wlmhgofcxrmgjuiohqolobwqxheqn/r2 created
2023.10.25 21:56:52.809090 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Attached part all_4_4_0 as all_12_12_0
2023.10.25 21:56:52.812004 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> createEphemeralLockInZooKeeper: Deduplication path already exists: deduplication_path=/clickhouse/tables/02443_detach_attach_partition_test_bcyivuz0/alter_table/blocks/all_CFE4743C4144A7CD2129FEF615DD9959
2023.10.25 21:56:52.815775 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Information> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094) (Replicated OutputStream): Block with ID all_CFE4743C4144A7CD2129FEF615DD9959 already exists on other replicas as part all_15_15_0; will write it locally with that name.
2023.10.25 21:56:52.819235 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094) (Replicated OutputStream): Will check part all_15_15_0 checksums
2023.10.25 21:56:52.824870 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Renaming temporary part detached/attaching_all_5_5_0 to all_15_15_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
2023.10.25 21:56:52.861429 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Trying to create zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_15_15_0/test_ame_thjtvjnjrsnkwljaeeoryakwmahfe/r2 with hardlinks []
2023.10.25 21:56:52.880415 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Trace> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Zookeeper persistent lock /clickhouse/zero_copy/zero_copy_s3/afa3f230-c69d-4f75-945b-f6cbf07decc0/all_15_15_0/test_ame_thjtvjnjrsnkwljaeeoryakwmahfe/r2 created
2023.10.25 21:56:52.907594 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> test_bcyivuz0.alter_table1 (a8336085-c547-4dac-8e13-cbea2e4e0094): Attached part all_5_5_0 as all_15_15_0
2023.10.25 21:56:52.922436 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> MemoryTracker: Peak memory usage (for query): 3.06 MiB.
2023.10.25 21:56:52.922449 [ 3022 ] {7e1a0781-182b-4473-912e-a30f11ce6584} <Debug> TCPHandler: Processed in 0.571604992 sec.
It seems like a "race" condition between the background thread doing the drop (and doing the paranoid checks) and a query reattaching partitions.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
fuzzProblem found by one of the fuzzersProblem found by one of the fuzzers