http fault: add response rate limit injection #6267

Merged
mattklein123 merged 10 commits into master from fault_rl on Mar 15, 2019

@mattklein123
Member

Part of #5942

Risk Level: Low, new feature, opt-in.
Testing: New UT/integration tests.
Docs Changes: Added
Release Notes: Added

Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

@mpuncel see the integration tests in here for an example of how to use simulated time. cc @jmarantz

Signed-off-by: Matt Klein <mklein@lyft.com>
Contributor

@alyssawilk alyssawilk left a comment

Cool, I suspect this will be super useful for latency testing. I owe the token logic a long look (maybe we can assign a secondary reviewer?), but here's what I've got for you today.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

@alyssawilk updated per first comment pass.

Signed-off-by: Matt Klein <mklein@lyft.com>
Contributor

@alyssawilk alyssawilk left a comment

LGTM modulo (mostly optional) nits

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

@alyssawilk @soya3129 updated

continue_cb_(continue_cb),
// The token bucket is configured with a max token count of the number of ticks per second,
// and refills at the same rate, so that we have a per second limit which refills gradually in
// ~63ms intervals.
Member

I think I just remembered something about the backing issue for this PR: this is bandwidth rate limiting in essence, but at the application layer. There is no guarantee that the OS isn't buffering the data up to the MSS and then pushing it (granted, there is a timer in the OS that will cap the wait time). I used to do this with the netem module in the kernel, which has a bewildering array of such algorithms, but applied at the netdev level, after TCP packaging. It would be interesting to see whether the OS masks the user-space bandwidth throttling or not.

Member

Also, how did you come up with the 63ms figure?

Member Author

Re: your first comment, if there is a concern, can you restate it?

Re: 63ms, it is 1000ms / 16 = 62.5ms, rounded up (see other comments).
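
For readers following the thread, here is a minimal sketch of the arithmetic behind the code comment under review: a bucket whose max token count equals the ticks per second and which refills at that same rate. This is not the code from this PR or Envoy's token bucket. `TickTokenBucket`, `bytesForTick`, the 16 ticks per second constant, and the 1 KiB/s limit are all hypothetical names and values chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical illustration only: this is NOT Envoy's token bucket or the
// fault filter's limiter, just the arithmetic described in the code comment.
class TickTokenBucket {
public:
  // The bucket starts full, so a fresh stream may send one second's worth
  // of data immediately.
  TickTokenBucket(uint64_t max_tokens, double fill_rate_per_sec)
      : max_tokens_(max_tokens), fill_rate_per_sec_(fill_rate_per_sec),
        tokens_(static_cast<double>(max_tokens)) {
    assert(max_tokens > 0 && fill_rate_per_sec > 0);
  }

  // Credit tokens for the elapsed (simulated) time, capped at the bucket size.
  void advance(std::chrono::milliseconds elapsed) {
    const double refill =
        fill_rate_per_sec_ * static_cast<double>(elapsed.count()) / 1000.0;
    tokens_ = std::min(static_cast<double>(max_tokens_), tokens_ + refill);
  }

  // Take up to `wanted` whole tokens and return how many were granted.
  // Any fractional remainder stays in the bucket for the next tick.
  uint64_t consume(uint64_t wanted) {
    const uint64_t granted = std::min(wanted, static_cast<uint64_t>(tokens_));
    tokens_ -= static_cast<double>(granted);
    return granted;
  }

private:
  const uint64_t max_tokens_;
  const double fill_rate_per_sec_;
  double tokens_;
};

// Assumed values for the example: 16 ticks per second and a 1 KiB/s limit.
constexpr uint64_t kTicksPerSecond = 16;
constexpr uint64_t kLimitBytesPerSecond = 1024;
constexpr uint64_t kBytesPerToken = kLimitBytesPerSecond / kTicksPerSecond; // 64

// Bytes released on one timer tick after `elapsed` time has passed.
uint64_t bytesForTick(TickTokenBucket& bucket, std::chrono::milliseconds elapsed) {
  bucket.advance(elapsed);
  return bucket.consume(kTicksPerSecond) * kBytesPerToken;
}
```

Under these assumed values, a full bucket releases 1024 bytes at once, and each 63ms tick afterwards credits 16 * 0.063 = 1.008 tokens, so one more 64-byte token is released per tick. That lines up with the 1024 and 1088 byte waits in the integration test hunks quoted later in this conversation.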

Member

@rshriram rshriram left a comment

..

alyssawilk previously approved these changes Mar 14, 2019
Contributor

@alyssawilk alyssawilk left a comment

LGTM modulo the other comments being addressed

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
@mattklein123
Member Author

@rshriram @alyssawilk @soya3129 updated

upstream_request_->encodeData(data, false);
decoder->waitForBodyData(1024);

// Advance time and wait for a ticks worth of data.
Contributor

ticks->tick

simTime().sleep(std::chrono::milliseconds(63));
decoder->waitForBodyData(1088);

// Advance time and wait for a ticks worth of data.
Contributor

ticks->tick

simTime().sleep(std::chrono::milliseconds(63));
decoder->waitForBodyData(1088);

// Advance time and wait for a ticks worth of data and end stream.
Contributor

ticks->tick

@mattklein123
Member Author

@soya3129 I will fix the comment issues in my next change.

Contributor

@dnoe dnoe left a comment

Re-adding approval bit for alyssa.

@mattklein123 mattklein123 merged commit 628d166 into master Mar 15, 2019
@mattklein123 mattklein123 deleted the fault_rl branch March 15, 2019 20:20
spenceral added a commit to spenceral/envoy that referenced this pull request Mar 20, 2019