[ClickHouse-58380] Implement Retries in distributed queries when server stopped during queries#66791
[ClickHouse-58380] Implement Retries in distributed queries when server stopped during queries#66791Zawa-ll wants to merge 147 commits intoClickHouse:masterfrom
Conversation
|
This ticket is involve multiple files, I will continue reviewing and working on my changes to ensure that it works as expected, feel free to review and point out mistakes I made! |
|
This is an automated comment for commit 92ad740 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
86cd55f to
5a762ed
Compare
6ccf304 to
2aeb470
Compare
|
@alexey-milovidov Can be seen, can you please take a look and provide me with some help on how to fix this one? Appreciate it! |
|
It is a problem with your changes. Read the logs here: https://s3.amazonaws.com/clickhouse-test-reports/66791/b4511ab45e7c3e306052d5eeabdf15749b0b67d1/fast_test.html |
Hi I read through the log you shared with me, and I noticed that it is the same as the link I checked previously: https://github.com/ClickHouse/ClickHouse/actions/runs/10101883167/job/27936478262?pr=66791 Can you please share with me where the error is, or tell me which part of the log should I check? I will be really appreciate it! |
@alexey-milovidov |
|
Logs show segmentation fault at startup: https://s3.amazonaws.com/clickhouse-test-reports/66791/32e22049ce4967a220ae832a9d0cba889a3522e6/fast_test/server.log which is a null pointer dereference (for reading). |
21b0d1f to
32e2204
Compare
|
Hi @alexey-milovidov, Here’s a quick summary of what I’ve done:
Despite refactoring my code based on the log you shared with me, I still encounter the following error: I wasn't sure if I am changing the correct file, or did I over complicate the task... It would be incredibly helpful if you could provide me with any help or point out any mistakes so I don't go too far down the wrong path. update: Some new changes were implemented and it seems to work well, I will apply testing on it |
…er stopped during queries
Removed unnecessary try-catch Add Comments back Style check
…lientSession.cpp and SessionImpl.cpp
Use Existing logic from other files Added missing right bracket Added Missing param Fixed Cluster.cpp Fixed Cluster.h Fixed Cluster.cpp Fixed Cluster.h again Create an public getter for getEntry() handled unused e modify head file of ConnectPoolwithfailover handled header
…mat and SinkToStorage
f92401b to
552b4b0
Compare
…stributed-queries
439a1cd to
0bad5ba
Compare
|
Hi @alexey-milovidov @devcrafter! Thanks for taking care of this PR, I just create a new rebased PR from with the same changes Ive made from this one, could you please check that one (#68581) instead? Appreciate it!! |
There is no need to create new PR. |
devcrafter
left a comment
There was a problem hiding this comment.
AFAIS, the PR is changing code related to distributed INSERT which unrelated to distributed SELECT as should be per initial task #58380 . What are you trying to implement?
…ELECT instead of SELECT
|
@devcrafter Thank you for taking care of this issue, I will fix the code based on the comments, thank you!! |
92ad740 to
c188700
Compare
|
Dear @devcrafter, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add support for dynamic replica reconnection in the
Clusterclass, enabling automatic reconnection to available replicas if some replicas become unavailable during query execution.Documentation entry for user-facing changes
Documentation Information:
Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/
CI Settings (Only check the boxes if you know what you are doing):
This addresses issue #58380.