conn_pool_grid: try TCP immediately with h3 state FailedRecently#20722
conn_pool_grid: try TCP immediately with h3 state FailedRecently#20722RyanTheOptimist merged 4 commits intoenvoyproxy:mainfrom
Conversation
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
|
failed thrift_proxy:integration_test on Windows seems unrelated. |
|
/assign @RyanTheOptimist |
|
/retest |
|
Retrying Azure Pipelines: |
source/common/http/conn_pool_grid.cc
Outdated
| auto attempt = std::make_unique<ConnectionAttemptCallbacks>(*this, current_); | ||
| LinkedList::moveIntoList(std::move(attempt), connection_attempts_); | ||
| if (!next_attempt_timer_->enabled()) { | ||
| if (next_attempt_timer_ != nullptr && !next_attempt_timer_->enabled()) { |
There was a problem hiding this comment.
I thought that next_attempt_timer_ was initialized in the constructor and could never be null. Is this check needed?
There was a problem hiding this comment.
You're right. It's not needed
| EXPECT_FALSE(tracker_.isHttp3Confirmed()); | ||
|
|
||
| EXPECT_CALL(*timer_, enabled()).WillOnce(Return(false)); | ||
| EXPECT_CALL(*timer_, enableTimer(std::chrono::milliseconds(5 * 60 * 1000), nullptr)); |
There was a problem hiding this comment.
Should we also test what happens when the timer fires? Or, maybe we tested that earlier?
There was a problem hiding this comment.
We tested in MarkBrokenThenExpires that when timer fires, isHttp3Broken() returns false and hasHttp3FailedRecently() returns true
source/common/http/conn_pool_grid.cc
Outdated
| PoolIterator pool = pools_.begin(); | ||
| if (!shouldAttemptHttp3() || !options.can_use_http3_) { | ||
| Instance::StreamOptions overriding_options(options); | ||
| bool delay_tcp_attempt{true}; |
There was a problem hiding this comment.
nit: we have two initializations, one uses {} and the other (). Could they both use {} (or even = to be more idiomatic)?
| EXPECT_NE(grid_->first(), nullptr); | ||
| // The 2nd pool should be TCP pool and it should have been created together with h3 pool. | ||
| EXPECT_NE(grid_->second(), nullptr); | ||
| EXPECT_EQ(2u, grid_->callbacks_.size()); |
There was a problem hiding this comment.
I'm not quite sure I understand how this test that the TCP attempt is not delayed?
There was a problem hiding this comment.
By checking EXPECT_NE(grid_->second(), nullptr). The 2nd pool (TCP) is created immediately with the quic pool without waiting for the alarm to fire.
There was a problem hiding this comment.
Ah, thanks. I see.
| } | ||
| if (!delay_tcp_attempt) { | ||
| // Immediately start TCP attempt if HTTP/3 failed recently. | ||
| wrapped_callbacks_.front()->tryAnotherConnection(); |
There was a problem hiding this comment.
Should we stop the failover timer at this point since there is nothing else to try?
There was a problem hiding this comment.
next_attempt_timer_ is not accessible from the grid, so I left as is. The alarm should trigger another call to tryAnotherConnection() and early returns if there is no next pool. We definitely can cancel it explicitly by adding a getter interface, but I'm wondering if it's worth it.
There was a problem hiding this comment.
Yeah, fair enough!
| EXPECT_NE(grid_->first(), nullptr); | ||
| // The 2nd pool should be TCP pool and it should have been created together with h3 pool. | ||
| EXPECT_NE(grid_->second(), nullptr); | ||
| EXPECT_EQ(2u, grid_->callbacks_.size()); |
There was a problem hiding this comment.
Ah, thanks. I see.
| } | ||
| if (!delay_tcp_attempt) { | ||
| // Immediately start TCP attempt if HTTP/3 failed recently. | ||
| wrapped_callbacks_.front()->tryAnotherConnection(); |
There was a problem hiding this comment.
Yeah, fair enough!
|
FYI: I've enabled auto-merge so once CI passes it should land automatically. |
|
/retest |
|
Retrying Azure Pipelines: |
…oyproxy#20722) Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing. Risk Level: low, grid only Testing: added unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Part of envoyproxy#18715 Signed-off-by: Dan Zhang danzh@google.com Signed-off-by: Andre Vehreschild <vehre@x41-dsec.de>
…oyproxy#20722) Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing. Risk Level: low, grid only Testing: added unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Part of envoyproxy#18715 Signed-off-by: Dan Zhang danzh@google.com
Signed-off-by: Dan Zhang danzh@google.com
Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing.
Risk Level: low, grid only
Testing: added unit tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A
Part of #18715