
multitenant: implement a fallback rate #70163

Merged
craig[bot] merged 2 commits into cockroachdb:master from RaduBerinde:backup-rate
Sep 14, 2021
Conversation

@RaduBerinde
Member

tenantcostclient: maintain a "buffer" of RUs

This change adjusts the tenant cost controller logic to try to
maintain a "buffer" of 5000 RUs. This is useful to prevent waiting for
more RUs if an otherwise lightly loaded pod suddenly gets a spike of
traffic.

Release note: None
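The buffering behavior described above amounts to sizing token bucket requests so that, after a grant, roughly 5,000 RUs remain on hand. A minimal sketch of the idea, with hypothetical names (`targetBuffer`, `requestSize`) rather than the actual CockroachDB implementation:

```go
package main

import "fmt"

// targetBuffer is the RU cushion a pod tries to keep on hand so a sudden
// traffic spike does not immediately block on a token bucket request.
// (Illustrative constant matching the value in the commit message.)
const targetBuffer = 5000.0

// requestSize computes how many RUs to request from the central token
// bucket, given the RUs currently available locally and the amount the
// pod expects to consume before its next request.
func requestSize(available, expectedUsage float64) float64 {
	need := expectedUsage + targetBuffer - available
	if need < 0 {
		return 0 // the buffer is already full; no request necessary
	}
	return need
}

func main() {
	// Lightly loaded pod with a healthy buffer: no request needed.
	fmt.Println(requestSize(6000, 500))
	// Pod about to run low: top back up to expected usage plus buffer.
	fmt.Println(requestSize(1000, 2000))
}
```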

multitenant: implement a "backup" rate

This change implements a "backup" (fallback) throttling rate that a
SQL pod can use if it stops being able to complete token bucket
requests.

The goal is to keep tenants without burst RUs throttled and tenants with
lots of RUs unthrottled (or throttled at a high rate). To achieve
this, we calculate a rate at which the tenant would burn through all
their available RUs within 1 hour. The premise here is that if we have
some kind of infrastructure problem, 1 hour is a reasonable time frame
to address it. Beyond 1 hour, the tenant will continue at the same
rate, consuming more RUs than they had available.

Informs #68479.

Release note: None

Release justification: Necessary fix for the distributed rate limiting
functionality, which is vital for the upcoming Serverless MVP release.
It allows CRDB to throttle clusters that have run out of free or paid
request units (which measure CPU and I/O usage). This functionality is
only enabled in multi-tenant scenarios and should have no impact on
our dedicated customers.
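The fallback-rate computation described in the commit message (a rate that burns through the tenant's available RUs in about one hour) can be sketched roughly as follows. The function name, signature, and exact formula are illustrative assumptions, not the code from this PR:

```go
package main

import "fmt"

// fallbackRate returns an RU-per-second rate a SQL pod can fall back to
// when it can no longer complete token bucket requests. Per the commit
// message: pick a rate that would exhaust the tenant's currently
// available RUs in about one hour, so tenants with no burst RUs stay
// throttled while tenants with many RUs run nearly unthrottled.
func fallbackRate(availableRUs, refillRatePerSec float64) float64 {
	const fallbackWindowSecs = 3600 // one hour to address an infrastructure problem
	// The tenant keeps accruing RUs at its refill rate, so the fallback
	// rate is the refill rate plus whatever drains the current balance
	// over the window. (Assumed formula, for illustration only.)
	return refillRatePerSec + availableRUs/fallbackWindowSecs
}

func main() {
	// Tenant with no burst RUs: limited to its base refill rate.
	fmt.Println(fallbackRate(0, 100))
	// Tenant with 3.6M RUs banked: gets a substantial extra allowance.
	fmt.Println(fallbackRate(3600000, 100))
}
```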

@cockroach-teamcity
Member

This change is Reviewable

Contributor

@andy-kimball left a comment


:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

Contributor

@ajwerner left a comment


If I had a nit, it's that I don't like the word "backup" as much as I like "fallback". How come you chose backup, which very much is a term in cockroach, and probably something users can initiate?

Reviewed 1 of 5 files at r1, 9 of 14 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)


pkg/ccl/multitenantccl/tenantcostclient/tenant_side.go, line 366 at r2 (raw file):

	ctx context.Context, resp *roachpb.TokenBucketResponse,
) {
	if log.V(1) {

drive-by: this should be log.ExpensiveLogEnabled(ctx, 1)

@RaduBerinde
Member Author

Yeah I agree. I thought of "fallback" after I wrote most of the code. I will rename.

@RaduBerinde force-pushed the backup-rate branch 2 times, most recently from e135a8b to 52e88c8 on September 14, 2021 16:32
@RaduBerinde
Member Author

I fell back to the backup word fallback.

Contributor

@ajwerner left a comment


:lgtm_strong:

Reviewed 1 of 5 files at r1, 12 of 12 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)

@RaduBerinde RaduBerinde changed the title multitenant: implement a "backup" rate multitenant: implement a fallback rate Sep 14, 2021
@RaduBerinde
Member Author

bors r+

@craig
Contributor

craig bot commented Sep 14, 2021

Build succeeded.

@craig craig bot merged commit a59ea3e into cockroachdb:master Sep 14, 2021
@blathers-crl

blathers-crl bot commented Sep 14, 2021

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 56f94dd to blathers/backport-release-21.2-70163: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
