release-23.1: sql: refrain from performing invalid splits during tenant creation#104974
Merged
arulajmani merged 1 commit intocockroachdb:release-23.1from Jun 16, 2023
Merged
Conversation
Normally, `MakeTenantPrefix(ID).PrefixEnd()` is the same as `MakeTenantPrefix(ID + 1)`. However, this isn't true if the ID corresponds to one of the boundary points in `EncodeUvarintAscending`. Prior to this patch, various places in the code did not appreciate this subtlety. This proved problematic when we started creating split points at `MakeTenantPrefix(ID).PrefixEnd()`. Now, if the tenant ID was equal to one of these boundary points (e.g. 109), decoding the key `MakeTenantPrefix(ID).PrefixEnd()` does not return the expected tenant ID (in our case, 110). Instead, it results in an error. Worse yet, various places in KV assume doing such tenant decoding at a range's start key is kosher. We've since disallowed KV from accepting such splits in cockroachdb#104802. This patch fixes the root cause of why these splits were being issued in the first place -- over in cockroachdb@ac54eba, we started creating spilt points at the end of a tenant's keyspace (in addition to the start of it). We did so using `PrefixEnd()`, which we have since learned is not what we wanted here. In addition to the initial split point, we also fix the span config records we seed during tenant creation. We've also added a test to ensure all split points created during cluster creation are kosher. The test uses randomized tenant IDs -- I confirmed that it fails with tenant ID = 109 (without my changes). Lastly, there's a bit more auditing work that needs to be done here about these assumptions. That's captured in a followup issue cockroachdb#104928. Prevents cockroachdb#104606 from happening. Release note (bug fix): Fixes a bug where tenant creation for certain IDs would always fail because of invalid split points. Additionally, such tenant creation failure could leave the host cluster's span config state entirely busted -- we prevent that as well.
Member
Collaborator
Author
|
Had to create this manually because the test porting wasn't a clean cherry-pick. |
nvb
approved these changes
Jun 16, 2023
knz
added a commit
to knz/cockroach
that referenced
this pull request
Sep 8, 2023
Ever since cockroachdb#104974 we have decided that the exclusive end key of a virtual keyspace is at "current tenant ID + 1". For this to be possible without special cases, it's important that this key is always computable. For this, we must ensure that no tenant can ever use an ID equal to MaxUint64. This patch achieves it. Release note: None
knz
added a commit
to knz/cockroach
that referenced
this pull request
Sep 8, 2023
Prior to this patch, the RPC authz code was restricting requests from secondary tenants to the range /Tenant/ID ... /Tenant/ID/PrefixEnd However, meanwhile, ever since cockroachdb#104974 was merged, the tenant keyspace is actually defined as the range /Tenant/ID ... /Tenant/<ID+1> Under the hood, ".PrefixEnd()" computes the "prefix end" key by incrementing the last byte. Therefore, for most integer values of ID, /Tenant/ID/PrefixEnd and /Tenant/<ID+1> are equivalent. So the mismatch above goes unnoticed until ... we run into an ID value for which they are not equivalent. An example such value is ID 511. This ID encodes to a sequence of bytes such that the .PrefixEnd() key is larger than the /Tenant/511 prefix, but strictly smaller than /Tenant/512, which is the actual exclusive end key for that virtual cluster. Because of this mismatch, the RPC code would reject any request from tenant 511 where the end key was /Tenant/512, even though it should have accepted them. This patch fixes it. Release note: None
craig bot
pushed a commit
that referenced
this pull request
Sep 8, 2023
108064: acceptance, ccl, cli, cmd, config, gossip, internal, jobs: cleanup lock usages r=Santamaura a=Santamaura This PR updates unsafe/unnecessary manual unlocks to defer unlocks. The criteria for leaving some unlocks as is: - will not cause a leak - releases the resources much earlier - deferring will cause a deadlock without significant refactor Part of #107946 Release note: None Signoffs - [x] acceptance - [x] backupccl - [x] changefeedccl - [x] multitenantccl - [x] sqlproxyccl - [x] cli - [x] cmd - [x] config - [x] gossip - [x] internal - [x] jobs 110241: *: fix bugs related to invalid use of PrefixEnd to compute tenant end key r=stevendanna a=knz Ever since #104974 we have decided that the exclusive end key of a virtual keyspace is at "current tenant ID + 1", not "start key / PrefixEnd". This PR completes the investigation requested in #104928 and fixes some bugs in the process. **See individual commits for details.** Fixes #104928. Fixes #110051. Epic: CRDB-26091 Co-authored-by: Alex Santamaura <alexsantamaura@gmail.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
knz
added a commit
to knz/cockroach
that referenced
this pull request
Sep 12, 2023
Ever since cockroachdb#104974 we have decided that the exclusive end key of a virtual keyspace is at "current tenant ID + 1". For this to be possible without special cases, it's important that this key is always computable. For this, we must ensure that no tenant can ever use an ID equal to MaxUint64. This patch achieves it. Release note: None
knz
added a commit
to knz/cockroach
that referenced
this pull request
Sep 12, 2023
Prior to this patch, the RPC authz code was restricting requests from secondary tenants to the range /Tenant/ID ... /Tenant/ID/PrefixEnd However, meanwhile, ever since cockroachdb#104974 was merged, the tenant keyspace is actually defined as the range /Tenant/ID ... /Tenant/<ID+1> Under the hood, ".PrefixEnd()" computes the "prefix end" key by incrementing the last byte. Therefore, for most integer values of ID, /Tenant/ID/PrefixEnd and /Tenant/<ID+1> are equivalent. So the mismatch above goes unnoticed until ... we run into an ID value for which they are not equivalent. An example such value is ID 511. This ID encodes to a sequence of bytes such that the .PrefixEnd() key is larger than the /Tenant/511 prefix, but strictly smaller than /Tenant/512, which is the actual exclusive end key for that virtual cluster. Because of this mismatch, the RPC code would reject any request from tenant 511 where the end key was /Tenant/512, even though it should have accepted them. This patch fixes it. Release note: None
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Normally,
MakeTenantPrefix(ID).PrefixEnd()is the same asMakeTenantPrefix(ID + 1). However, this isn't true if the ID corresponds to one of the boundary points inEncodeUvarintAscending. Prior to this patch, various places in the code did not appreciate this subtlety. This proved problematic when we started creating split points atMakeTenantPrefix(ID).PrefixEnd(). Now, if the tenant ID was equal to one of these boundary points (e.g. 109), decoding the keyMakeTenantPrefix(ID).PrefixEnd()does not return the expected tenant ID (in our case, 110). Instead, it results in an error.Worse yet, various places in KV assume doing such tenant decoding at a range's start key is kosher. We've since disallowed KV from accepting such splits in #104802. This patch fixes the root cause of why these splits were being issued in the first place -- over in
ac54eba, we started creating spilt points at the end of a tenant's keyspace (in addition to the start of it). We did so using
PrefixEnd(), which we have since learned is not what we wanted here.In addition to the initial split point, we also fix the span config records we seed during tenant creation. We've also added a test to ensure all split points created during cluster creation are kosher. The test uses randomized tenant IDs -- I confirmed that it fails with tenant ID = 109 (without my changes).
Lastly, there's a bit more auditing work that needs to be done here about these assumptions. That's captured in a followup issue #104928.
Prevents #104606 from happening.
Release note (bug fix): Fixes a bug where tenant creation for certain IDs would always fail because of invalid split points. Additionally, such tenant creation failure could leave the host cluster's span config state entirely busted -- we prevent that as well.
Release justification: high priority fix.