Fix jitter behavior for large values.#552
Conversation
By chance weird behavior was observed for when specifying large jitter values. This fixes behavior for that case by allowing DistributionSamplingRateLimiterTest to schedule multiple jittered values into the future. This also improves testing around this rate limiter behavior. Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
@eric846 please review and assign back to me once done. |
test/rate_limiter_test.cc
Outdated
| class DistributionSamplingRateLimiterTest : public RateLimiterTest { | ||
| public: | ||
| DistributionSamplingRateLimiterTest() | ||
| : mock_rate_limiter_(std::make_unique<NiceMock<MockRateLimiter>>()), |
There was a problem hiding this comment.
It seems like this should be called something like inner_mock_rate_limiter_.
test/rate_limiter_test.cc
Outdated
| public: | ||
| DistributionSamplingRateLimiterTest() | ||
| : mock_rate_limiter_(std::make_unique<NiceMock<MockRateLimiter>>()), | ||
| unsafe_mock_rate_limiter_(*mock_rate_limiter_), |
There was a problem hiding this comment.
I was confused by how many rate limiters there are -- can we get by without this field and just refer to *mock_rate_limiter_ in the test code?
test/rate_limiter_test.cc
Outdated
| .WillOnce(Return(true)) | ||
| .WillOnce(Return(false)) | ||
| .WillOnce(Return(false)); | ||
| EXPECT_CALL(*unsafe_discrete_numeric_distribution_sampler_, getValue).WillOnce(Return(1)); |
There was a problem hiding this comment.
Could you comment on all these Return(1) to clarify that they mean delay releasing the request for 1 ns?
| sanity_check_pending_release_ = false; | ||
| if (rate_limiter_->tryAcquireOne()) { | ||
| const Envoy::MonotonicTime adjusted = now + random_distribution_generator_(); | ||
| distributed_timings_.insert( |
There was a problem hiding this comment.
Could you add a comment about the invariant we maintain on distributed_timings_? It looks like it's simply ascending order. Also, is it easy to add unit tests that insert multiple distribution values other than 1ns in sorted and nonsorted order to make sure each acquisition always happens at the expected nanosecond?
Also, is distributed_timings_ ever anticipated to become large? If so, would a tree be better than a vector?
There was a problem hiding this comment.
Also, is
distributed_timings_ever anticipated to become large? If so, would a tree be better than a vector?
Yes I think you are right, but I feel it would become more prudent to optimise this if we find use cases that need to support specifying large ranges of values to distribute timings over. In that case, this could list could grow large. Then we should optimise, but for now I'm happy with correctness, and I'd rather add a documentation note recommending to not configure large values at this point in time.
(For the known use case where people add white noise for desynchronisation purposes, I wouldn't expect this to be a factor, as the distribution range should usually be smaller then the request-release interval).
There was a problem hiding this comment.
Also, is it easy to add unit tests that insert multiple distribution values other than 1ns in sorted and nonsorted order to make sure each acquisition always happens at the expected nanosecond?
Done.
test/rate_limiter_test.cc
Outdated
| std::unique_ptr<NiceMock<MockRateLimiter>> mock_rate_limiter_; | ||
| MockRateLimiter& unsafe_mock_rate_limiter_; | ||
| Envoy::Event::SimulatedTimeSystem time_system_; | ||
| MockDiscreteNumericDistributionSampler* unsafe_discrete_numeric_distribution_sampler_; |
There was a problem hiding this comment.
Could you comment what operations are unsafe on this field?
test/rate_limiter_test.cc
Outdated
| DistributionSamplingRateLimiterTest() | ||
| : mock_rate_limiter_(std::make_unique<NiceMock<MockRateLimiter>>()), | ||
| unsafe_mock_rate_limiter_(*mock_rate_limiter_), | ||
| unsafe_discrete_numeric_distribution_sampler_(new MockDiscreteNumericDistributionSampler()), |
There was a problem hiding this comment.
Is there a special need for bare new, or would making the field std::unique_ptr<MockDiscreteNumericDistributionSampler> discrete_numeric_distribution_sampler_ and referring to *discrete_numeric_distribution_sampler_ in the tests work just the same? This seems like it would get flagged by asan.
There was a problem hiding this comment.
This ends up being owned by the rate limiter, so there shouldn't be an ASAN report about it.
Annoyingly, we need temporary member variables to grab references to the mock before we transfer ownership, as the constructor requires a unique_ptr&&. This works nice in prod code by making the path of least resistance a good one, but it is annoying here as we need to go through this hoop. I renamed some things, and made this code use std::move like we do with the rate limiter mocks, to hopefully clean this up and make it more easy to grok.
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
@eric846 one last thing -- while I didn't implement a tree, the buffered timings are now tracked via a |
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
|
LGTM, thanks! |
By chance weird behaviour was observed for when specifying large jitter
values. This warranted analysis, and upon investigation the root problem seems
to be that when configuring
--uniform-jitterto a value that would allow theadjusted value(s) to overlap with the pacing of the underlying rate limiter,
the new overlapping value could overwrite the old one that was scheduled.
This modifies behaviour and tests to fix and consolidate, by modifying
DistributionSamplingRateLimiterso that it can queue adjusted releasetimings instead of storing a single one. This will allow it to accumulate multiple
release timings that are adjusted for jitter.
This ensures that the pacing of the rate limiter that modifies timings to add jitter
gets properly disconnected from the pacing of the underlying rate limiter.
NOTE: There are no known consumers running into problems / inaccuracies
because of this. To tease the problem out, either one needs to:
--jitter-uniformin a way that it would adjust timings to overlapwith the frequency imposed by
--rps, or--jitter-uniform > 0sand--burst-size > 0.Signed-off-by: Otto van der Schaaf oschaaf@we-amp.com