Conversation
Add an extension which implements dynamic delay injection based on the delta between target and current guage values. State: manually verified to work. needs tests, a bit of cleanup and docs. Fixes envoyproxy#389 and also envoyproxy#82, we may want to remove the direct fault filter configs from our docs as well as emit a log warning when detecting usage to deprecate. Per-request config example, this gets merged with any static configuratio to determine the active configuration: ``` curl -H "x-nighthawk-test-server-config:{static_delay:\"1s\"}" -vv 127.0.0.1:10000 ``` Full yaml config for the http_filters section: ```yaml http_filters: - name: dynamic-delay config: stats_based_delay: guage_name: "http.ingress_http.downstream_rq_active" guage_target_value: 10 delay_delta: 0.1s - name: envoy.fault config: max_active_faults: 100 delay: header_delay: {} percentage: numerator: 100 - name: test-server config: response_body_size: 10 response_headers: - { header: { key: "x-nh", value: "1" } } ``` Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
@mum4k @jmarantz opening this up as a draft to allow for early review. (See the issue description for state), but prevent merging (or the impression of this being in that shape yet). I would like to invite you to assess the high level approach as well as the algorithm/parameterization to achieve the goal we're after here: being able to induce a sustained/fixed level of parallelism on the downstream side (e.g. a proxy running in front of the test server). While working on this PR I was mostly focussed on wiring in the overall mechanics of the desired type of functionality, |
jmarantz
left a comment
There was a problem hiding this comment.
haven't been through all of it yet, but using arbitrary gauges brings a bit of complexity.
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
@jmarantz @jc1e100 I pushed changes to partially address comments in 277c786, and updated the PR description. |
jc1e100
left a comment
There was a problem hiding this comment.
Thanks, the new simplified config format LGTM.
| std::atomic<uint64_t> HttpDynamicDelayDecoderFilterConfig::instances_(0); | ||
|
|
||
| HttpDynamicDelayDecoderFilterConfig::HttpDynamicDelayDecoderFilterConfig( | ||
| nighthawk::server::ResponseOptions proto_config) |
There was a problem hiding this comment.
nighthawk::server::ResponseOptions&& ?
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
Linking the CI coverage report for convenience, this seems ok to me (combined with the issue that tracks that we need covering actual end-to-end functionality). |
|
@dubious90 please review and assign back to me once done. |
Remove the python-based error flow test in favor of a c++ one. Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
jmarantz
left a comment
There was a problem hiding this comment.
looks good but someone more expert in the NH code should also review.
| - `response_headers` - list of headers to add to response. If `append` is set to | ||
| `true`, then the header is appended. | ||
| * `echo_request_headers` - if set to `true`, then append the dump of request headers to the response | ||
| - `echo_request_headers` - if set to `true`, then append the dump of request headers to the response |
There was a problem hiding this comment.
Should add a bullet here for oneof_delay_options, which is your new field in ResponseOptions.
There was a problem hiding this comment.
I think you missed this, unless this is dependent on the result of the larger thread about the use of the fault filter.
There was a problem hiding this comment.
The later revisions have the new options documented in a new section right below this one, does that work for you?
| if (base_config_.has_static_delay()) { | ||
| delay_ms = Envoy::Protobuf::util::TimeUtil::DurationToMilliseconds(base_config_.static_delay()); | ||
| } else if (base_config_.has_concurrency_based_delay()) { | ||
| const nighthawk::server::ConcurrencyBasedDelay& concurrency_config = | ||
| base_config_.concurrency_based_delay(); | ||
| const uint64_t current_value = config_->approximateInstances(); | ||
| delay_ms = computeDelayMilliseconds(current_value, concurrency_config.minimal_delay(), | ||
| concurrency_config.concurrency_delay_factor()); | ||
| } | ||
| if (delay_ms.has_value() && delay_ms > 0) { | ||
| // Emit header to communicate the delay we desire to the fault filter extension. | ||
| const Envoy::Http::LowerCaseString key("x-envoy-fault-delay-request"); | ||
| headers.setCopy(key, absl::StrCat(*delay_ms)); | ||
| } |
There was a problem hiding this comment.
I think it'd help the readability here if you abstracted this section into its own function, as this is the part of the function that is actually setting the delays, rather than just decoding headers.
There was a problem hiding this comment.
Done, let me know if that looks like what you had in mind (see 8ee78db)
There was a problem hiding this comment.
I was actually thinking grab lines 49 - 62 here, so including the part where it determines what the delay is. But I defer to you on what makes the most sense.
There was a problem hiding this comment.
Done, let me know if that looks better.
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
| - `response_headers` - list of headers to add to response. If `append` is set to | ||
| `true`, then the header is appended. | ||
| * `echo_request_headers` - if set to `true`, then append the dump of request headers to the response | ||
| - `echo_request_headers` - if set to `true`, then append the dump of request headers to the response |
There was a problem hiding this comment.
I think you missed this, unless this is dependent on the result of the larger thread about the use of the fault filter.
|
LGTM, after the two quick comments are resolved. @mum4k to approve. |
…ver-dynamic-delay Partially moves to using the fault filter extension as a base class Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
oschaaf
left a comment
There was a problem hiding this comment.
After some discussion via Slack with @htuch and @dubious90 I changed this to take the fault filter as a base class, so there are some new changes in here.
| - `response_headers` - list of headers to add to response. If `append` is set to | ||
| `true`, then the header is appended. | ||
| * `echo_request_headers` - if set to `true`, then append the dump of request headers to the response | ||
| - `echo_request_headers` - if set to `true`, then append the dump of request headers to the response |
There was a problem hiding this comment.
The later revisions have the new options documented in a new section right below this one, does that work for you?
mum4k
left a comment
There was a problem hiding this comment.
Note - I started reviewing the implementation only after I commented my preference on the inherit / copy / ... the fault injection filter, so I only now noticed that we already invested into the inheritance approach. I would like us to minimize the churn on this PR, since it is already quite large, so let's leave the current approach in.
If you do feel this will add enough maintenance overhead for us to feel it, I think we should consider filing an issue to follow-up on this decision and address it later if we reach a different conclusion.
Since this is a fairly large PR, stopping the review mid stream and providing partial set of comments to avoid blocking you.
source/server/common.h
Outdated
|
|
||
| } // namespace TestServer | ||
|
|
||
| class Utility { |
There was a problem hiding this comment.
I am wondering if we can improve the naming and structure of this new library. Both "common" and "Utility" are very generic names that a) don't tell the reader much and b) attract pretty much any helper / utility function. After all everything is an utility.
Secondly it doesn't seem necessary for this to be a class at all, since the two methods on it don't really need any state. How would you feel about changing these into regular functions and renaming the library into something more specific.
E.g. response_options.h
Feel free to choose any other name that is specific enough not to attract unrelated functions.
There was a problem hiding this comment.
Done in 9be5334, let me know if that looks better.
source/server/common.h
Outdated
|
|
||
| namespace TestServer { | ||
|
|
||
| class HeaderNameValues { |
There was a problem hiding this comment.
As per the comment below (ideally, please read that one first) - are we moving this definition here just because we don't have a better place for it and this file happens to have a generic enough name? And a similar question - does this have to be a class?
Alternative approach - this currently only seems to be used in a single location - how about we just inline the literal there, is it worth having a this defined as a constant?
If yes, can we define a new header specifically designed for this? And if we are doing that - is attaching these onto a class an idiom used in Envoy, or can we can try to figure out an approach that doesn't require a class? Even an ordinary function returning static constexpr might be preferred.
inline absl::string_view MyString() {
static constexpr char kHello[] = "Hello";
return kHello;
}There was a problem hiding this comment.
Like for above, see 9be5334. However, I kept the Envoy construct, because this is shared/reused across the two extensions, and having the LowerCaseString is convenient -- that is the form that consumers are probably going to end up needing when they want access to this.
source/server/common.h
Outdated
| @@ -0,0 +1,50 @@ | |||
| #pragma once | |||
There was a problem hiding this comment.
Can we add unit tests for this library?
There was a problem hiding this comment.
There are pre-existing unit tests for this. For example, see
source/server/common.h
Outdated
| @@ -0,0 +1,50 @@ | |||
| #pragma once | |||
There was a problem hiding this comment.
(for the future) Looks like the changes in this library may have been just indirectly related to the work in this PR. If reasonable, can we try to split such changes into their own PR in the future to minimize the size of PRs and the review complexity?
There was a problem hiding this comment.
Yeah, you're right, sorry about that. Will extract these mechanical refactors out into a separate PR next time.
| namespace Server { | ||
|
|
||
| /** | ||
| * Filter configuration container class for the dynamic delay extension. |
There was a problem hiding this comment.
Can we expand the class doc comment with a few more information, ideally at least an example of use to help the reader put this in context?
The public API of this class has a few functions that aren't obvious (or their use case isn't), so this can help with that too.
| class HttpDynamicDelayDecoderFilterConfig { | ||
|
|
||
| public: | ||
| HttpDynamicDelayDecoderFilterConfig(nighthawk::server::ResponseOptions proto_config, |
There was a problem hiding this comment.
Can we document the constructor and the arguments it takes?
| namespace Server { | ||
|
|
||
| /** | ||
| * Filter configuration container class for the dynamic delay extension. |
There was a problem hiding this comment.
Can we indicate the thread-safety of this class in the comment?
| /** | ||
| * Increments the number of active instances. | ||
| */ | ||
| void incrementInstanceCount() { instances()++; } |
There was a problem hiding this comment.
Can you help me understand what these are for and why is the getter getting an approximate amount rather than exact?
| */ | ||
| uint64_t approximateInstances() const { return instances(); } | ||
|
|
||
| Envoy::Runtime::Loader& runtime() { return runtime_; } |
There was a problem hiding this comment.
Can we document all public methods?
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
mum4k
left a comment
There was a problem hiding this comment.
Looks good, just waiting for the CI checks to finish.
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Add an extension which implements dynamic delay injection based on the number
of concurrent requests being handled, as well as static delays.
Fixes #389 and also #82. We may consider removing the direct fault filter configs
from our docs as well as emit a log warning when detecting usage to deprecate.
It would be nice if we could add the fault filter extension in code instead of imposing
configuration of that upon the end user, making that an implementation detail.
The following shows a per-request config example.
This header-provided configuration gets merged with any provided static configuration to
determine a final configuration for each request:
Full yaml config for the http_filters section:
Signed-off-by: Otto van der Schaaf oschaaf@we-amp.com
State: first stab, manually verified to work. needs