config:connected state across hot restarts#6153
config:connected state across hot restarts#6153mattklein123 merged 1 commit intoenvoyproxy:masterfrom
Conversation
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, makes sense.
/wait
| // Some times during hot restarts this stat's value becomes inconsistent and will continue to | ||
| // have 0 till it is reconnected. Setting here ensures that it is consistent with the state of | ||
| // management server connection. | ||
| control_plane_stats_.connected_state_.set(1); |
There was a problem hiding this comment.
Where was it set previously? Should we remove it from there? Also is it possible to somehow test this change? NBD if not.
There was a problem hiding this comment.
It is set in establishNewStream. I think testing it may be difficult simulating hot restart if that is what you mean.
There was a problem hiding this comment.
Sorry. I missed your other question - I think it is OK to leave it there because if there are no messages exchanged /delay in exchanges after stream establishment (which does not happen today) we will still have technically correct state of this stat that it is connected.
|
Having read the Slack thread, I understand better what's going on here. Whenever this line you're adding actually overwrites a "false", that means that the rest of the logic has broken in some way. Definitely good to patch up that inconsistency when it's found, but it should be possible to never let it happen in the first place, which #5910 ought to achieve. Maybe once it's in, this should be replaced with a |
I think once it is in and properly validated, we do not expect inconsistency and these kind of errors may not happen right? However, for this specific stat, I think having this here even after the fix does not harm. |
|
Right, it certainly doesn't hurt. Either it helps, or if the rest of the code is correct, is a no-op. If there is such an error logged, I agree that the error being "this state is wrong but I'm not going to fix it even though I could" is not optimally helpful! |
|
@mattklein123 are we good with this or do I have to make any changes? |
Signed-off-by: Rama Chavali <rama.rao@salesforce.com> Signed-off-by: Fred Douglas <fredlas@google.com>
Description: Some times during hot restart connected state stat has inconsistent value (probably because of how this gauge is updated in shared memory during hot restarts) and is not set to correct value till Envoy is disconnected and reconnected to management server. This PR ensures this is set on receiveMessage so that it stays consistent.
Risk Level: Low
Testing: Existing automated tests
Docs Changes: N/A
Release Notes: N/A