Skip to content

sql: refrain from performing invalid splits during tenant creation #104920

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:tenant-span-patch
Jun 15, 2023
Merged

sql: refrain from performing invalid splits during tenant creation #104920
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:tenant-span-patch

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani commented Jun 14, 2023

Normally, MakeTenantPrefix(ID).PrefixEnd() is the same as
MakeTenantPrefix(ID + 1). However, this isn't true if the ID
corresponds to one of the boundary points in EncodeUvarintAscending.
Prior to this patch, various places in the code did not appreciate this
subtlety. This proved problematic when we started creating split points
at MakeTenantPrefix(ID).PrefixEnd(). Now, if the tenant ID was equal
to one of these boundary points (e.g. 109), decoding the key
MakeTenantPrefix(ID).PrefixEnd() does not return the expected tenant
ID (in our case, 110). Instead, it results in an error.

Worse yet, various places in KV assume doing such tenant decoding at a
range's start key is kosher. We've since disallowed KV from accepting
such splits in #104802.
This patch fixes the root cause of why these splits were being issued
in the first place -- over in
ac54eba,
we started creating spilt points at the end of a tenant's keyspace
(in addition to the start of it). We did so using PrefixEnd(), which
we have since learned is not what we wanted here.

In addition to the initial split point, we also fix the span config
records we seed during tenant creation. We've also added a test to
ensure all split points created during cluster creation are kosher.
The test uses randomized tenant IDs -- I confirmed that it fails with
tenant ID = 109 (without my changes).

Lastly, there's a bit more auditing work that needs to be done here
about these assumptions. That's captured in a followup issue
#104928.

Prevents #104606 from
happening.

Release note (bug fix): Fixes a bug where tenant creation for certain
IDs would always fail because of invalid split points. Additionally,
such tenant creation failure could leave the host cluster's span
config state entirely busted -- we prevent that as well.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 14, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@arulajmani arulajmani changed the title sql: correctly split during tenant creation sql: refrain from performing invalid splits during tenant creation Jun 14, 2023
@arulajmani arulajmani marked this pull request as ready for review June 14, 2023 23:24
@arulajmani arulajmani requested a review from a team as a code owner June 14, 2023 23:24
@arulajmani arulajmani requested a review from a team June 14, 2023 23:24
@arulajmani arulajmani requested review from a team as code owners June 14, 2023 23:24
@arulajmani arulajmani requested a review from nvb June 14, 2023 23:24
@arulajmani arulajmani added the backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only label Jun 14, 2023
@arulajmani arulajmani requested a review from irfansharif June 14, 2023 23:34
Normally, `MakeTenantPrefix(ID).PrefixEnd()` is the same as
`MakeTenantPrefix(ID + 1)`. However, this isn't true if the ID
corresponds to one of the boundary points in `EncodeUvarintAscending`.
Prior to this patch, various places in the code did not appreciate this
subtlety. This proved problematic when we started creating split points
at `MakeTenantPrefix(ID).PrefixEnd()`. Now, if the tenant ID was equal
to one of these boundary points (e.g. 109), decoding the key
`MakeTenantPrefix(ID).PrefixEnd()` does not return the expected tenant
ID (in our case, 110). Instead, it results in an error.

Worse yet, various places in KV assume doing such tenant decoding at a
range's start key is kosher. We've since disallowed KV from accepting
such splits in cockroachdb#104802.
This patch fixes the root cause of why these splits were being issued
in the first place -- over in
cockroachdb@ac54eba,
we started creating spilt points at the end of a tenant's keyspace
(in addition to the start of it). We did so using `PrefixEnd()`, which
we have since learned is not what we wanted here.

In addition to the initial split point, we also fix the span config
records we seed during tenant creation. We've also added a test to
ensure all split points created during cluster creation are kosher.
The test uses randomized tenant IDs -- I confirmed that it fails with
tenant ID = 109 (without my changes).

Lastly, there's a bit more auditing work that needs to be done here
about these assumptions. That's captured in a followup issue
cockroachdb#104928.

Prevents cockroachdb#104606 from
happening.

Release note (bug fix): Fixes a bug where tenant creation for certain
IDs would always fail because of invalid split points. Additionally,
such tenant creation failure could leave the host cluster's span
config state entirely busted -- we prevent that as well.
@arulajmani
Copy link
Copy Markdown
Collaborator Author

The datarace failure here seems to be #104838

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif)

@arulajmani
Copy link
Copy Markdown
Collaborator Author

TFTRs!

bors r=nvanbenschoten,ecwall

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 15, 2023

Build succeeded:

@craig craig bot merged commit 3c55b81 into cockroachdb:master Jun 15, 2023
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 15, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 42fc585 to blathers/backport-release-23.1-104920: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants