Skip to content

overload: add scaling support to the disable-keepalive and reject-request overload actions#13120

Merged
mattklein123 merged 10 commits intoenvoyproxy:masterfrom
eziskind:overload
Oct 7, 2020
Merged

overload: add scaling support to the disable-keepalive and reject-request overload actions#13120
mattklein123 merged 10 commits intoenvoyproxy:masterfrom
eziskind:overload

Conversation

@eziskind
Copy link
Copy Markdown
Contributor

Commit Message: This change adds support to the disable-keepalive and reject-request HCM overload actions for the overload scaling state so that they can be triggered for a fraction of traffic (proportional to how overloaded Envoy is). Partially addresses issue #13110
Risk Level: low
Testing: unit tests

Signed-off-by: Elisha Ziskind eziskind@google.com

…request overload actions

Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
akonradi
akonradi previously approved these changes Sep 16, 2020
Copy link
Copy Markdown
Contributor

@akonradi akonradi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clean, nice!

Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
akonradi
akonradi previously approved these changes Sep 22, 2020
@akonradi
Copy link
Copy Markdown
Contributor

@mattklein123 can you or another senior maintainer review this?

@antoniovicente antoniovicente self-assigned this Sep 29, 2020
@mattklein123 mattklein123 self-assigned this Sep 29, 2020
* Converts the overload action state to a boolean. The return value will be true with probability
* of the level of saturation.
*/
bool isRandomizedActive(Random::RandomGenerator& random_generator) const {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm unsure about the name of this method but not sure what would be a better name, possibly isPartiallySaturated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a public interface perspective, why isn't all of this just hidden by isSaturated()? Should the caller care about scaling if the result ends up being a bool? Also, instead of passing the random generator to this method can it just be passed to the constructor of the state or gotten from the Api after the recent @akonradi refactor?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see some callers wanting to react only on the fully saturated state, but it would make sense to force those cases to access value() directly. The definition of isSaturated could be as you describe and use an internal random generation to turn value() into a bool

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OverloadActionState is a value type right now (basically just a wrapper around the float); I'd be hesitant to start threading references into it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will admit that I don't know the code structure and would have to research it, but it seems like conceptually we should be able to query the state via one function. If that returns an enum (not saturated, partially saturated, saturated) that's fine, but it seems like all of the randnomness, etc. should be abstracted away. How this is called from the HCM feels not right to me. Can we refactor in some way to better abstract it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is a utility function masquerading as a method. The state is available via value() and isSaturated() provides a convenient boolean accessor. What about making this a separate free function, or a method on RandomGenerator, that takes a float (result of .value()) and has the same logic?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to spend more time with the code to have a more formed opinion, so will defer to all of you on how to better name/expose this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this method and added a RandomGenerator::bernoulli helper method inspired by absl::Bernoulli to use instead.

if (connection_manager_.drain_state_ == DrainState::NotDraining &&
connection_manager_.overload_disable_keepalive_ref_.isSaturated()) {
connection_manager_.overload_disable_keepalive_ref_.isRandomizedActive(
connection_manager_.random_generator_)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: No tests would fail if you reverted the changes to this file. I find that concerning.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 please add HCM tests.

Copy link
Copy Markdown
Contributor Author

@eziskind eziskind Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the existing tests so they would fail with the old implementation.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. Flushing out some comments.

/wait

* Converts the overload action state to a boolean. The return value will be true with probability
* of the level of saturation.
*/
bool isRandomizedActive(Random::RandomGenerator& random_generator) const {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a public interface perspective, why isn't all of this just hidden by isSaturated()? Should the caller care about scaling if the result ends up being a bool? Also, instead of passing the random generator to this method can it just be passed to the constructor of the state or gotten from the Api after the recent @akonradi refactor?

if (connection_manager_.drain_state_ == DrainState::NotDraining &&
connection_manager_.overload_disable_keepalive_ref_.isSaturated()) {
connection_manager_.overload_disable_keepalive_ref_.isRandomizedActive(
connection_manager_.random_generator_)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 please add HCM tests.

Signed-off-by: Elisha Ziskind <eziskind@google.com>
Merge remote-tracking branch 'upstream/master' into overload

Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating. This looks much better to me!

@mattklein123 mattklein123 merged commit f288c80 into envoyproxy:master Oct 7, 2020
@eziskind eziskind deleted the overload branch October 8, 2020 01:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants