kvserver: limit the number of kv spans configurable by a tenant #70555

@irfansharif

Description

#66348 outlined a scheme to support zone configs for secondary tenants. Since the unit of what we can configure in KV is a Range, tenants now being able to set zone configs grants them the ability to induce an arbitrary number of splits in KV -- something we want to put guardrails against (see RFC for more details).

There's some discussion elsewhere about how we could achieve this; copying over one proposal in particular:

We could have each pod lease out a local quota, counting against the tenant's global limit. For N pods of a tenant and a limit L, we could either lease out partial quotas of L/N each, or have a single pod lease out L entirely and require all DDLs/zcfgs to go through that one pod. The scheme below describes the general case of each pod getting some fractional share of the total quota.

  • When each pod considers a DDL/zcfg txn, before committing it will attempt to acquire a quota from its local leased limit. The quota acquired is an over-estimate of how much the txn will cost (in terms of additional splits induced); we cannot under-estimate as KV could then later reject the committed zcfg.
    • If there's insufficient quota, we'll abort the txn and error out accordingly.
    • If there's sufficient quota, we'll commit the txn and rely on the eventual reconciliation to reliably succeed. We'll queue up a work item for the reconciliation job, marking alongside it the lease expiration of the quota it was accepted with.
  • When the local pool runs out of quota and has to request/lease more, it needs to ensure that all committed txns have been reconciled. Otherwise a quota could be leased, allocated to an operation in the work queue, sit in that queue past its lease expiration, and then the "same" quota could be leased again to queue up a second work item that would now get rejected. Put differently: when we lease a limit at time t, there must be no unprocessed work items whose leased tokens expire before t.
  • Whenever the work item is processed, the amount we over-estimated by could either be returned to the local quota pool (with the same expiration ts), or we could simply drop it entirely and rely on the local quota pool to eventually re-acquire it from the global pool.

The RFC also captures a possible schema we could use on the host tenant to store these limits:

CREATE TABLE system.spans_configurations_per_tenant (
    tenant_id               INT,
    max_span_configurations INT
);

The actual RPCs/interfaces are TBD. We probably also want the ability to configure these limits on a per-tenant basis.

Epic CRDB-10563

Jira issue: CRDB-10116

Metadata

Labels

  • A-kv: Anything in KV that doesn't belong in a more specific category.
  • A-multitenancy: Related to multi-tenancy
  • A-zone-configs
  • C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
  • GA-blocker
  • T-kv: KV Team
  • branch-release-22.1: Used to mark GA and release blockers, technical advisories, and bugs for 22.1
