-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[improve][pip] PIP-425: Support connecting with next available endpoint for multi-endpoint serviceUrls #24394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…failed when many nodes in PulsarClient serviceUrl become unavailable
|
@AuroraTwinkle Please add the following content to your PR description and select a checkbox: |
lhotari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial feedback
@lhotari All good suggestions, I will fix them and update later. Thanks for your help! |
Updated! |
lhotari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good progress! added some follow up comments
7f314b9 to
817a231
Compare
@lhotari Modified and updated as suggested. Thanks! |
codelipenghui
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposal looks good to me.
I’ve added a few comments to help clarify the problem and the proposed solution, making it easier to understand.
Ok, I will fix them later, Thanks! |
Co-authored-by: Penghui Li <penghui@apache.org>
Co-authored-by: Penghui Li <penghui@apache.org>
Co-authored-by: Penghui Li <penghui@apache.org>
Co-authored-by: Penghui Li <penghui@apache.org>
Co-authored-by: Penghui Li <penghui@apache.org>
Co-authored-by: Penghui Li <penghui@apache.org>
@codelipenghui Very interesting and detailed suggestions, I have fixed them, thank you again! |
lhotari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good work @AuroraTwinkle
|
@lhotari @codelipenghui @315157973 We need more votes on the mailing list to close this PIP. Could you please help vote when you have a moment? |
…nt for multi-endpoint serviceUrls (apache#24394) Fixes apache#22934 (comment) Main Issue: apache#22934 (comment) Implementation: apache#24387 ### Motivation As apache#22934 and apache#22933 mentioned, when most of the nodes in serviceurl are down (but there is at least one available node), creating consumers and producers through PulsarClient will most likely fail. I think this is not as expected. If the code is robust enough, as long as there is one available node, it should be accessible normally. Therefore, this pip is going to optimize the logic, remove unavailable nodes through the feedback mechanism, and improve the success rate of PulsarClient requests. By the way, apache#22935 removes faulty nodes through a regular health check mechanism, but this brings new problems (frequent creation of connections and increased system load), so this solution is abandoned. See apache#22934 (comment) for more details!
…nt for multi-endpoint serviceUrls (apache#24394) Fixes apache#22934 (comment) Main Issue: apache#22934 (comment) Implementation: apache#24387 ### Motivation As apache#22934 and apache#22933 mentioned, when most of the nodes in serviceurl are down (but there is at least one available node), creating consumers and producers through PulsarClient will most likely fail. I think this is not as expected. If the code is robust enough, as long as there is one available node, it should be accessible normally. Therefore, this pip is going to optimize the logic, remove unavailable nodes through the feedback mechanism, and improve the success rate of PulsarClient requests. By the way, apache#22935 removes faulty nodes through a regular health check mechanism, but this brings new problems (frequent creation of connections and increased system load), so this solution is abandoned. See apache#22934 (comment) for more details!
Fixes #22934 (comment)
Main Issue: #22934 (comment)
Implementation: #24387
Motivation
As #22934 and #22933 mentioned, when most of the nodes in serviceurl are down (but there is at least one available node), creating consumers and producers through PulsarClient will most likely fail. I think this is not as expected. If the code is robust enough, as long as there is one available node, it should be accessible normally. Therefore, this pip is going to optimize the logic, remove unavailable nodes through the feedback mechanism, and improve the success rate of PulsarClient requests.
By the way, #22935 removes faulty nodes through a regular health check mechanism, but this brings new problems (frequent creation of connections and increased system load), so this solution is abandoned. See #22934 (comment) for more details!
Modifications
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
docdoc-requireddoc-not-neededdoc-completeMatching PR in forked repository
PR in forked repository: