Skip to content

release-23.1: sql: refrain from performing invalid splits during tenant creation#104974

Merged
arulajmani merged 1 commit intocockroachdb:release-23.1from
arulajmani:backport23.1-104920
Jun 16, 2023
Merged

release-23.1: sql: refrain from performing invalid splits during tenant creation#104974
arulajmani merged 1 commit intocockroachdb:release-23.1from
arulajmani:backport23.1-104920

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani commented Jun 15, 2023

Normally, MakeTenantPrefix(ID).PrefixEnd() is the same as MakeTenantPrefix(ID + 1). However, this isn't true if the ID corresponds to one of the boundary points in EncodeUvarintAscending. Prior to this patch, various places in the code did not appreciate this subtlety. This proved problematic when we started creating split points at MakeTenantPrefix(ID).PrefixEnd(). Now, if the tenant ID was equal to one of these boundary points (e.g. 109), decoding the key MakeTenantPrefix(ID).PrefixEnd() does not return the expected tenant ID (in our case, 110). Instead, it results in an error.

Worse yet, various places in KV assume doing such tenant decoding at a range's start key is kosher. We've since disallowed KV from accepting such splits in #104802. This patch fixes the root cause of why these splits were being issued in the first place -- over in
ac54eba, we started creating spilt points at the end of a tenant's keyspace (in addition to the start of it). We did so using PrefixEnd(), which we have since learned is not what we wanted here.

In addition to the initial split point, we also fix the span config records we seed during tenant creation. We've also added a test to ensure all split points created during cluster creation are kosher. The test uses randomized tenant IDs -- I confirmed that it fails with tenant ID = 109 (without my changes).

Lastly, there's a bit more auditing work that needs to be done here about these assumptions. That's captured in a followup issue #104928.

Prevents #104606 from happening.

Release note (bug fix): Fixes a bug where tenant creation for certain IDs would always fail because of invalid split points. Additionally, such tenant creation failure could leave the host cluster's span config state entirely busted -- we prevent that as well.
Release justification: high priority fix.

Normally, `MakeTenantPrefix(ID).PrefixEnd()` is the same as
`MakeTenantPrefix(ID + 1)`. However, this isn't true if the ID
corresponds to one of the boundary points in `EncodeUvarintAscending`.
Prior to this patch, various places in the code did not appreciate this
subtlety. This proved problematic when we started creating split points
at `MakeTenantPrefix(ID).PrefixEnd()`. Now, if the tenant ID was equal
to one of these boundary points (e.g. 109), decoding the key
`MakeTenantPrefix(ID).PrefixEnd()` does not return the expected tenant
ID (in our case, 110). Instead, it results in an error.

Worse yet, various places in KV assume doing such tenant decoding at a
range's start key is kosher. We've since disallowed KV from accepting
such splits in cockroachdb#104802.
This patch fixes the root cause of why these splits were being issued
in the first place -- over in
cockroachdb@ac54eba,
we started creating spilt points at the end of a tenant's keyspace
(in addition to the start of it). We did so using `PrefixEnd()`, which
we have since learned is not what we wanted here.

In addition to the initial split point, we also fix the span config
records we seed during tenant creation. We've also added a test to
ensure all split points created during cluster creation are kosher.
The test uses randomized tenant IDs -- I confirmed that it fails with
tenant ID = 109 (without my changes).

Lastly, there's a bit more auditing work that needs to be done here
about these assumptions. That's captured in a followup issue
cockroachdb#104928.

Prevents cockroachdb#104606 from
happening.

Release note (bug fix): Fixes a bug where tenant creation for certain
IDs would always fail because of invalid split points. Additionally,
such tenant creation failure could leave the host cluster's span
config state entirely busted -- we prevent that as well.
@arulajmani arulajmani requested a review from nvb June 15, 2023 14:07
@arulajmani arulajmani requested review from a team as code owners June 15, 2023 14:07
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@arulajmani
Copy link
Copy Markdown
Collaborator Author

Had to create this manually because the test porting wasn't a clean cherry-pick.

@arulajmani arulajmani merged commit c93080b into cockroachdb:release-23.1 Jun 16, 2023
knz added a commit to knz/cockroach that referenced this pull request Sep 8, 2023
Ever since cockroachdb#104974 we have decided that the exclusive end key of a
virtual keyspace is at "current tenant ID + 1".

For this to be possible without special cases, it's important that
this key is always computable. For this, we must ensure that no tenant
can ever use an ID equal to MaxUint64.

This patch achieves it.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Sep 8, 2023
Prior to this patch, the RPC authz code was restricting
requests from secondary tenants to the range

   /Tenant/ID ... /Tenant/ID/PrefixEnd

However, meanwhile, ever since cockroachdb#104974 was merged,
the tenant keyspace is actually defined as the range

   /Tenant/ID ... /Tenant/<ID+1>

Under the hood, ".PrefixEnd()" computes the "prefix end" key by
incrementing the last byte. Therefore, for most integer values of ID,
/Tenant/ID/PrefixEnd and /Tenant/<ID+1> are equivalent. So the
mismatch above goes unnoticed until ... we run into an ID value for
which they are not equivalent.

An example such value is ID 511. This ID encodes to a sequence of
bytes such that the .PrefixEnd() key is larger than the /Tenant/511
prefix, but strictly smaller than /Tenant/512, which is the actual
exclusive end key for that virtual cluster.

Because of this mismatch, the RPC code would reject any request
from tenant 511 where the end key was /Tenant/512, even though it
should have accepted them.

This patch fixes it.

Release note: None
craig bot pushed a commit that referenced this pull request Sep 8, 2023
108064: acceptance, ccl, cli, cmd, config, gossip, internal, jobs: cleanup lock usages r=Santamaura a=Santamaura

This PR updates unsafe/unnecessary manual unlocks to defer unlocks.
The criteria for leaving some unlocks as is:
 - will not cause a leak
 - releases the resources much earlier
 - deferring will cause a deadlock without significant refactor

Part of #107946

Release note: None

Signoffs
- [x] acceptance
- [x] backupccl
- [x] changefeedccl
- [x] multitenantccl
- [x] sqlproxyccl
- [x] cli
- [x] cmd
- [x] config
- [x] gossip
- [x] internal
- [x] jobs

110241: *: fix bugs related to invalid use of PrefixEnd to compute tenant end key r=stevendanna a=knz

Ever since #104974 we have decided that the exclusive end key of a virtual keyspace is at "current tenant ID + 1", not "start key / PrefixEnd".

This PR completes the investigation requested in #104928 and fixes some bugs in the process.

**See individual commits for details.**

Fixes #104928.
Fixes #110051.
Epic: CRDB-26091

Co-authored-by: Alex Santamaura <alexsantamaura@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
knz added a commit to knz/cockroach that referenced this pull request Sep 12, 2023
Ever since cockroachdb#104974 we have decided that the exclusive end key of a
virtual keyspace is at "current tenant ID + 1".

For this to be possible without special cases, it's important that
this key is always computable. For this, we must ensure that no tenant
can ever use an ID equal to MaxUint64.

This patch achieves it.

Release note: None
knz added a commit to knz/cockroach that referenced this pull request Sep 12, 2023
Prior to this patch, the RPC authz code was restricting
requests from secondary tenants to the range

   /Tenant/ID ... /Tenant/ID/PrefixEnd

However, meanwhile, ever since cockroachdb#104974 was merged,
the tenant keyspace is actually defined as the range

   /Tenant/ID ... /Tenant/<ID+1>

Under the hood, ".PrefixEnd()" computes the "prefix end" key by
incrementing the last byte. Therefore, for most integer values of ID,
/Tenant/ID/PrefixEnd and /Tenant/<ID+1> are equivalent. So the
mismatch above goes unnoticed until ... we run into an ID value for
which they are not equivalent.

An example such value is ID 511. This ID encodes to a sequence of
bytes such that the .PrefixEnd() key is larger than the /Tenant/511
prefix, but strictly smaller than /Tenant/512, which is the actual
exclusive end key for that virtual cluster.

Because of this mismatch, the RPC code would reject any request
from tenant 511 where the end key was /Tenant/512, even though it
should have accepted them.

This patch fixes it.

Release note: None
@yuzefovich yuzefovich changed the title sql: refrain from performing invalid splits during tenant creation release-23.1: sql: refrain from performing invalid splits during tenant creation Sep 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants