[pick_first] fix bug that caused us to stop attempting to connect #40162

Closed
markdroth wants to merge 1 commit into grpc:master from markdroth:pf_fix

@markdroth
Member

The bug was triggered by a fairly rare sequence of events: a subchannel had to report CONNECTING as its initial state, then transition to TRANSIENT_FAILURE and back to IDLE before the Happy Eyeballs pass reached it. Because we had already recorded that the subchannel had seen TRANSIENT_FAILURE before Happy Eyeballs got to it, its failure was not considered a new failure. That broke the logic that triggers the exit from the initial Happy Eyeballs pass and lets us continue connecting. The fix is to not record that a subchannel has seen TRANSIENT_FAILURE until the Happy Eyeballs pass reaches it.

b/428689461
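
To make the sequence above easier to follow, here is a minimal toy model of it. This is a sketch under stated assumptions, not the actual pick_first code: `HappyEyeballsModel`, `OnTransientFailure`, `RunScenario`, and the `record_early` flag are all hypothetical names, and the model leaves out the initial CONNECTING report, timers, and real subchannel state. It only shows how remembering a failure too early can stop a later failure from counting as new.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only. Every name here is hypothetical and does not mirror
// the real pick_first implementation.
class HappyEyeballsModel {
 public:
  // record_early == true models the pre-fix behavior.
  HappyEyeballsModel(std::size_t num_subchannels, bool record_early)
      : seen_failure_(num_subchannels, false), record_early_(record_early) {}

  // Called when subchannel `index` reports TRANSIENT_FAILURE.
  void OnTransientFailure(std::size_t index) {
    assert(index < seen_failure_.size());
    const bool reached = index <= attempting_index_;
    if (!reached) {
      // Pre-fix: the failure is remembered before the pass gets here.
      if (record_early_) seen_failure_[index] = true;
      return;
    }
    const bool was_seen = seen_failure_[index];
    seen_failure_[index] = true;
    if (was_seen) return;  // Not a new failure, so nothing is triggered.
    // A new failure either moves the pass forward or finishes it.
    if (attempting_index_ + 1 < seen_failure_.size()) {
      ++attempting_index_;
    } else {
      pass_finished_ = true;
    }
  }

  bool pass_finished() const { return pass_finished_; }

 private:
  std::vector<bool> seen_failure_;
  bool record_early_;
  std::size_t attempting_index_ = 0;
  bool pass_finished_ = false;
};

// Replays the rare sequence with two subchannels and reports whether
// the initial pass finished.
bool RunScenario(bool record_early) {
  HappyEyeballsModel model(2, record_early);
  model.OnTransientFailure(1);  // Fails (then returns to IDLE) before being reached.
  model.OnTransientFailure(0);  // First subchannel fails; the pass moves to index 1.
  model.OnTransientFailure(1);  // Fails again, now that the pass has reached it.
  return model.pass_finished();
}
```

In this model, `RunScenario(true)` leaves the pass unfinished because the last failure is not treated as new, while `RunScenario(false)` finishes the pass.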

Contributor

@apolcyn apolcyn left a comment

LGTM!

@markdroth markdroth deleted the pf_fix branch July 10, 2025 23:58
eugeneo pushed a commit to eugeneo/grpc that referenced this pull request Jul 11, 2025
…pc#40162)

Closes grpc#40162

COPYBARA_INTEGRATE_REVIEW=grpc#40162 from markdroth:pf_fix e37fa72
PiperOrigin-RevId: 781727929
asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request Jul 12, 2025
paulosjca pushed a commit to paulosjca/grpc that referenced this pull request Aug 23, 2025

2 participants