Database replicated stateless tests fail due to incorrect waiting for the server in praktika #84028
Description
CI: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=9fa0c5a1eee3735049950c68d57aca3bdaf5dbdf&name_0=MasterCI
CIDB: https://play.clickhouse.com/play?user=play#U0VMRUNUIGNoZWNrX3N0YXJ0X3RpbWUsIGNoZWNrX2R1cmF0aW9uX21zLCB0ZXN0X25hbWUsIHJlcG9ydF91cmwKRlJPTSBjaGVja3MKV0hFUkUgY2hlY2tfc3RhcnRfdGltZSA+PSBub3coKSAtIElOVEVSVkFMIDI0IGRheQotLSAgICBBTkQgKGhlYWRfcmVmID0gJ21hc3RlcicgQU5EIHN0YXJ0c1dpdGgoaGVhZF9yZXBvLCAnQ2xpY2tIb3VzZS8nKSkKLS0gICAgQU5EIHRlc3Rfc3RhdHVzICE9ICdTS0lQUEVEJwotLSAgICBhbmQgcHVsbF9yZXF1ZXN0X251bWJlciA9IDgzOTgxCi0tICAgIEFORCAodGVzdF9zdGF0dXMgTElLRSAnRiUnIE9SIHRlc3Rfc3RhdHVzIExJS0UgJ0UlJykgCi0tICAgIEFORCBjaGVja19zdGF0dXMgIT0gJ3N1Y2Nlc3MnCi0tICAgIGFuZCByZXBvcnRfdXJsIGxpa2UgJyVTdGF0ZWxlc3MlJwogICAgYW5kIGNoZWNrX2R1cmF0aW9uX21zIDwgMjEwZTMKICAgIGFuZCB0ZXN0X25hbWUgPSAnJwogICAgQU5EIGNoZWNrX25hbWUgPSAnU3RhdGVsZXNzIHRlc3RzIChhbWRfYmluYXJ5LCBvbGQgYW5hbHl6ZXIsIHMzIHN0b3JhZ2UsIERhdGFiYXNlUmVwbGljYXRlZCwgMi8yKScKLS0gICAgYW5kIHRlc3RfbmFtZSA9ICdTdGFydCBDbGlja0hvdXNlIFNlcnZlcicKT1JERVIgQlkgY2hlY2tfc3RhcnRfdGltZQ==
2025.06.27 23:11:24.861685 [ 4037 ] {} <Error> ZooKeeperClient: Code: 999. Coordination::Exception: Received error in heartbeat response: Operation timeout. (KEEPER_EXCEPTION), Stack trace (when copying this message, always include the lines below):
2025.06.27 23:11:24.863811 [ 3015 ] {} <Warning> KeeperTCPHandler: Ignoring user request, because the server is not active yet
...
2025.06.27 23:11:25.894502 [ 4010 ] {} <Error> virtual bool DB::DDLWorker::initializeMainThread(): Code: 999. Coordination::Exception: All connection tries failed while connecting to ZooKeeper. nodes: [::1]:19181, [::1]:29181, [::1]:9181
...
2025.06.27 23:11:24.873521 [ 4010 ] {} <Error> DDLWorker: An error occurred when markReplicasActive: : Code: 999. Coordination::Exception: All connection tries failed while connecting to ZooKeeper. nodes: [::1]:9181, [::1]:19181, [::1]:29181
2025.06.27 23:11:26.214827 [ 3677 ] {} <Warning> RaftInstance: Election timeout, initiate leader election
...
We may need more retries or a longer timeout in DDLWorker, but the more general question is: why is Keeper not ready within the timeout?
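To make the "wait until Keeper is ready" idea concrete, here is a minimal Python sketch of a deadline-based readiness wait. It is not the praktika implementation. The helper names (`wait_until`, `keeper_four_letter`, `keeper_is_ready`) are hypothetical, the port 9181 is taken from the log above, and it assumes that the Keeper `mntr` four-letter command reports `zk_server_state` as `leader` or `follower` once the node is active, which should be verified against the Keeper version in use.

```python
import socket
import time
from typing import Callable


def keeper_four_letter(host: str, port: int, command: str = "mntr",
                       timeout: float = 2.0) -> str:
    """Send a four-letter-word command to Keeper and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # Keeper closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def keeper_is_ready(host: str = "::1", port: int = 9181) -> bool:
    """Hypothetical readiness probe.

    Assumption: 'mntr' prints a 'zk_server_state' line whose value is
    'leader' or 'follower' once the node is active in the quorum.
    """
    reply = keeper_four_letter(host, port, "mntr")
    for line in reply.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "zk_server_state":
            return parts[1] in ("leader", "follower")
    return False


def wait_until(probe: Callable[[], bool], timeout: float, interval: float = 0.5,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() until it returns True or the deadline passes.

    Connection errors (OSError) count as "not ready yet" and are retried.
    """
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # server not listening yet, or the reply timed out
        if clock() >= deadline:
            return False
        sleep(interval)


# Example (not executed here): wait up to 60 s before starting the tests.
# ready = wait_until(keeper_is_ready, timeout=60.0)
```

The probe and the retry loop are kept separate so that the same loop can wait on each of the three Keeper ports seen in the log (9181, 19181, 29181), and so that the timeout can be tuned independently of whatever retry policy DDLWorker uses.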
Cc @antonio2368