overload: add scaling support to the disable-keepalive and reject-request overload actions#13120
overload: add scaling support to the disable-keepalive and reject-request overload actions#13120mattklein123 merged 10 commits intoenvoyproxy:masterfrom
Conversation
…request overload actions Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
|
@mattklein123 can you or another senior maintainer review this? |
| * Converts the overload action state to a boolean. The return value will be true with probability | ||
| * of the level of saturation. | ||
| */ | ||
| bool isRandomizedActive(Random::RandomGenerator& random_generator) const { |
There was a problem hiding this comment.
nit: I'm unsure about the name of this method but not sure what would be a better name, possibly isPartiallySaturated
There was a problem hiding this comment.
From a public interface perspective, why isn't all of this just hidden by isSaturated()? Should the caller care about scaling if the result ends up being a bool? Also, instead of passing the random generator to this method can it just be passed to the constructor of the state or gotten from the Api after the recent @akonradi refactor?
There was a problem hiding this comment.
I can see some callers wanting to react only on the fully saturated state, but it would make sense to force those cases to access value() directly. The definition of isSaturated could be as you describe and use an internal random generation to turn value() into a bool
There was a problem hiding this comment.
OverloadActionState is a value type right now (basically just a wrapper around the float); I'd be hesitant to start threading references into it.
There was a problem hiding this comment.
I will admit that I don't know the code structure and would have to research it, but it seems like conceptually we should be able to query the state via one function. If that returns an enum (not saturated, partially saturated, saturated) that's fine, but it seems like all of the randnomness, etc. should be abstracted away. How this is called from the HCM feels not right to me. Can we refactor in some way to better abstract it?
There was a problem hiding this comment.
Yeah, I think this is a utility function masquerading as a method. The state is available via value() and isSaturated() provides a convenient boolean accessor. What about making this a separate free function, or a method on RandomGenerator, that takes a float (result of .value()) and has the same logic?
There was a problem hiding this comment.
I would need to spend more time with the code to have a more formed opinion, so will defer to all of you on how to better name/expose this.
There was a problem hiding this comment.
I removed this method and added a RandomGenerator::bernoulli helper method inspired by absl::Bernoulli to use instead.
| if (connection_manager_.drain_state_ == DrainState::NotDraining && | ||
| connection_manager_.overload_disable_keepalive_ref_.isSaturated()) { | ||
| connection_manager_.overload_disable_keepalive_ref_.isRandomizedActive( | ||
| connection_manager_.random_generator_)) { |
There was a problem hiding this comment.
nit: No tests would fail if you reverted the changes to this file. I find that concerning.
There was a problem hiding this comment.
Updated the existing tests so they would fail with the old implementation.
mattklein123
left a comment
There was a problem hiding this comment.
Thanks for working on this. Flushing out some comments.
/wait
| * Converts the overload action state to a boolean. The return value will be true with probability | ||
| * of the level of saturation. | ||
| */ | ||
| bool isRandomizedActive(Random::RandomGenerator& random_generator) const { |
There was a problem hiding this comment.
From a public interface perspective, why isn't all of this just hidden by isSaturated()? Should the caller care about scaling if the result ends up being a bool? Also, instead of passing the random generator to this method can it just be passed to the constructor of the state or gotten from the Api after the recent @akonradi refactor?
| if (connection_manager_.drain_state_ == DrainState::NotDraining && | ||
| connection_manager_.overload_disable_keepalive_ref_.isSaturated()) { | ||
| connection_manager_.overload_disable_keepalive_ref_.isRandomizedActive( | ||
| connection_manager_.random_generator_)) { |
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Merge remote-tracking branch 'upstream/master' into overload Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
Thanks for iterating. This looks much better to me!
Commit Message: This change adds support to the disable-keepalive and reject-request HCM overload actions for the overload scaling state so that they can be triggered for a fraction of traffic (proportional to how overloaded Envoy is). Partially addresses issue #13110
Risk Level: low
Testing: unit tests
Signed-off-by: Elisha Ziskind eziskind@google.com