
DatabaseReplicated: settings that make query synchronous may not work for not initial replicas #34818

@tavplubix

We have some settings, such as database_atomic_wait_for_drop_and_detach_synchronously, replication_alter_partitions_sync and mutations_sync, that allow waiting until a query has actually finished. However, they may not work as expected if the database engine is Replicated. When a query finishes on the initiator, we wait for it to finish on the other database replicas. More precisely, we wait for the replicas to create the log/query-xxxxx/finished/full_replica_name node. The problem is that DatabaseReplicated may create this node earlier (when the "commit point" is passed, but the query is not fully completed).

Example:
https://s3.amazonaws.com/clickhouse-test-reports/34749/9b753c84f099eca8fe6271f006ef7d8ca47ae00a/stateless_tests__release__databasereplicated__actions__[1/2].html
(the second replica created the "finished" node when the table was marked as dropped but not actually dropped, so the initiator returned "Ok" before the table's metadata was removed from ZK)
A similar scenario is possible for some ALTERs.

I'm not sure what the best way to fix it is. We could simply create another node in ZK when query execution is fully completed and make DDLQueryStatusSource wait for this node to appear instead of the "finished" node. On the other hand, we should either not allow running "ON CLUSTER" queries with "sync" settings enabled at all, or provide another way to wait for such queries, because waiting in the main thread of DDLWorker is awful.
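
To make the first option concrete, here is a toy in-memory model in Python (not ClickHouse code). Everything in it is an assumption for illustration: the `FakeZooKeeper` and `Replica` classes, the log entry number, the replica name, and especially the `completed` node name, which stands in for the "another node" mentioned above. It only shows why waiting on "finished" can return while the table still exists, and how waiting on a second node would avoid that.

```python
# Toy model only: names below are hypothetical and do not mirror ClickHouse internals.

class FakeZooKeeper:
    """Minimal stand-in for ZK: a set of node paths."""

    def __init__(self):
        self.nodes = set()

    def create(self, path):
        self.nodes.add(path)

    def exists(self, path):
        return path in self.nodes


class Replica:
    def __init__(self, name, zk, entry):
        self.name = name
        self.zk = zk
        self.entry = entry          # e.g. "log/query-0000000001" (made-up number)
        self.table_exists = True

    def pass_commit_point(self):
        # The table is only marked as dropped here; it still exists,
        # yet the "finished" node is already created.
        self.zk.create(f"{self.entry}/finished/{self.name}")

    def complete(self):
        # The table is actually dropped; only now is the hypothetical
        # "completed" node created.
        self.table_exists = False
        self.zk.create(f"{self.entry}/completed/{self.name}")


def initiator_may_return(zk, entry, replicas, node):
    """True if every replica has created <entry>/<node>/<replica_name>."""
    return all(zk.exists(f"{entry}/{node}/{r.name}") for r in replicas)


def simulate():
    zk = FakeZooKeeper()
    entry = "log/query-0000000001"
    replica = Replica("replica2", zk, entry)

    replica.pass_commit_point()
    returns_on_finished = initiator_may_return(zk, entry, [replica], "finished")
    returns_on_completed = initiator_may_return(zk, entry, [replica], "completed")
    table_still_exists = replica.table_exists

    replica.complete()
    returns_on_completed_later = initiator_may_return(zk, entry, [replica], "completed")

    return (returns_on_finished, returns_on_completed,
            table_still_exists, returns_on_completed_later)
```

In this model, `simulate()` shows that right after the commit point the wait on "finished" is already satisfied while the table still exists, whereas the wait on the hypothetical "completed" node is satisfied only after `complete()` runs. It says nothing about where the waiting should happen, which is the DDLWorker main-thread concern raised above.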

This issue is slightly related to #23513.

Labels

unexpected behaviour: Result is unexpected, but not entirely wrong at the same time.
