
server: hash tenantID into ClusterID used in SQL exec #74005

Merged

craig[bot] merged 1 commit into cockroachdb:master from dt:cluster-id on Apr 6, 2022

Conversation

@dt
Contributor

@dt dt commented Dec 17, 2021

Fixes #74480.

Prior to this patch, in a multi-tenant deployment the ClusterID value
visible to tenants was the same as the ClusterID for the storage
cluster.

This original design was defective because it prevented the
backup/restore subsystem from distinguishing backups made for
different tenants.

In this commit, we change the ClusterID for tenants to become a new
UUID that is the storage cluster's ClusterID with a hash of the
tenant's ID added to the lower 64 bits. This makes it so that two
tenants on the same storage cluster will observe different values when
calling ClusterID().

Release note (bug fix): Separate tenants now use unique values for the
internal field ClusterID. Previously separate tenants that
were hosted on the same storage cluster would be assigned the same
ClusterID.

Jira issue: CRDB-14750
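
For illustration, a minimal standalone Go sketch of the derivation described above: keep the upper 64 bits of the storage cluster's UUID and replace the lower 64 bits with a hash of the tenant ID. The FNV hash, the github.com/google/uuid package, and the function names are assumptions made for the sketch, not the PR's actual in-tree implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"

	"github.com/google/uuid"
)

// tenantClusterID derives a tenant-scoped cluster ID: the upper 64 bits of
// the storage cluster's UUID are kept as-is, and the lower 64 bits are
// replaced with a hash of the tenant's ID. (Hypothetical helper; hash choice
// and naming are assumptions, not the exact code from this PR.)
func tenantClusterID(storageClusterID uuid.UUID, tenantID uint64) uuid.UUID {
	h := fnv.New64a()
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], tenantID)
	h.Write(buf[:])

	out := storageClusterID
	binary.BigEndian.PutUint64(out[8:], h.Sum64()) // overwrite the lower 64 bits
	return out
}

func main() {
	host := uuid.MustParse("9c7dc4b2-3f2a-4e1c-8d5e-0a1b2c3d4e5f")
	// Two tenants on the same storage cluster observe different ClusterIDs.
	fmt.Println(tenantClusterID(host, 2))
	fmt.Println(tenantClusterID(host, 3))
}
```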

@dt dt added the do-not-merge label (bors won't merge a PR with this label) Dec 17, 2021
@dt dt requested a review from knz December 17, 2021 23:52
@dt dt marked this pull request as draft December 17, 2021 23:52
@cockroach-teamcity
Member

This change is Reviewable

@dt
Contributor Author

dt commented Dec 18, 2021

I looked at who all uses ClusterID in pkg/sql and it looks like there are really just two classes of use:
a) returning it in crdb_internal debugging/introspection builtins/tables, and
b) passing it to license checks for ccl features (partitioning, MR, etc) since license can be cluster-id scoped.

Case (a) is indifferent to what the value is -- it just shows it to you.
Case (b) would potentially be interesting if tenants were checking license keys against specific cluster IDs during SQL feature usage calls, but they don't -- they toggle a licensed status once at startup using a separate mechanism, and then the checks take a fast path when they see the toggle on (we should maybe do this everywhere, instead of needing to plumb a cluster ID around to every check site).

Looking at pkg/ccl, it has a bunch more cases of (b), and then there's backup's case, where we really do need the semantics described here, i.e. no two tenants ever have the same value.

@knz knz requested review from a team and RaduBerinde December 21, 2021 17:59
@dt dt marked this pull request as ready for review January 4, 2022 14:28
@dt dt removed the do-not-merge label (bors won't merge a PR with this label) Jan 4, 2022
Contributor

@knz knz left a comment


Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)


pkg/server/server_sql.go, line 528 at r1 (raw file):

	// hash of tenant ID achieves this.
	if !codec.ForSystemTenant() {
		clusterIDForSQL = &base.ClusterIDContainer{}

Can you change this mechanism to switch to writing to cfg.ClusterIDContainer, thanks

Contributor

@knz knz left a comment


It also might be worth adding a unit test that checks that for a single host cluster with two tenants, all 3 things (system tenant plus each of the two non-system tenants) see different cluster IDs.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)
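
A rough sketch of what such a test could assert, kept independent of the in-tree test harness by going through plain SQL: connect to the system tenant and to two secondary tenants and check that crdb_internal.cluster_id() returns three different values. The connection URLs and the test name here are placeholders; a real test would start the tenants via the usual test server helpers instead.

```go
package clusterid_test

import (
	"database/sql"
	"testing"

	_ "github.com/lib/pq" // any Postgres-wire-compatible driver works
)

// Hypothetical connection URLs for the system tenant and two secondary
// tenants; in a real test these would come from the test harness.
var conns = map[string]string{
	"system":  "postgresql://root@localhost:26257/?sslmode=disable",
	"tenant2": "postgresql://root@localhost:26258/?sslmode=disable",
	"tenant3": "postgresql://root@localhost:26259/?sslmode=disable",
}

func TestTenantsSeeDistinctClusterIDs(t *testing.T) {
	seen := map[string]string{} // cluster ID -> which endpoint reported it
	for name, url := range conns {
		db, err := sql.Open("postgres", url)
		if err != nil {
			t.Fatal(err)
		}
		defer db.Close()

		var id string
		// crdb_internal.cluster_id() reports the ClusterID the SQL layer uses.
		if err := db.QueryRow("SELECT crdb_internal.cluster_id()").Scan(&id); err != nil {
			t.Fatal(err)
		}
		if prev, ok := seen[id]; ok {
			t.Fatalf("%s and %s report the same ClusterID %s", prev, name, id)
		}
		seen[id] = name
	}
}
```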

Contributor

@knz knz left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)


pkg/server/server_sql.go, line 528 at r1 (raw file):

Previously, knz (kena) wrote…

Can you change this mechanism to switch to writing to cfg.ClusterIDContainer, thanks

Hm actually, can we talk about this, I don't exactly understand what is going on here.

Contributor Author

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)


pkg/server/server_sql.go, line 528 at r1 (raw file):

Previously, knz (kena) wrote…

Hm actually, can we talk about this, I don't exactly understand what is going on here.

I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?

Contributor

@knz knz left a comment


Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)


pkg/server/server_sql.go, line 528 at r1 (raw file):

Previously, dt (David Taylor) wrote…

I expanded the comment a bit as a starting point, since if it isn't clear to someone who talked it through previously from the diff, then I clearly needed to do more for future readers too. Does the expanded comment help at all?

yep that is good, thanks


pkg/server/server_sql.go, line 525 at r2 (raw file):

	// clusterIDForSQL is the cluster ID that SQL processes use.
	//
	// Not that this is separate from, but sometimes pointed at, the ClusterID in

nit: Note


pkg/server/server_sql.go, line 539 at r2 (raw file):

	// interacting with processes on the same cluster or not. For example, when a
	// BACKUP job resumes writing to some external files, it is vital that be able
	// to verify that the data in those files what wriitten by the same cluster,

nit: written


pkg/server/server_sql.go, line 545 at r2 (raw file):

	// Thus for the non-system tenant, this ClusterID is derived from the unique
	// host cluster's ID, but with the TenantID hashed into its lower bits to make
	// an ID that is unique to this tenant on this host cluster.

Can you also add:

Note: we are careful not to populate this per-tenant cluster ID into the
ClusterIDContainer referenced by cfg.rpcContext, which is also the same
one referenced by cfg.ClusterIDContainer, cfg.idProvider, and several other places.
There are two reasons for this. The main reason is that these references to the
host cluster's cluster ID are used in logging and telemetry to ensure that
log/telemetry entries from tenants connected to the same host cluster
are identified as such, using a common cluster ID.

And I'd like to see these various assertions exercised in unit tests:

  • check that the SQL idea of cluster ID indeed ends up different
  • check that cluster IDs included in the log entries (esp with output format JSON) remain the same

@knz
Contributor

knz commented Jan 5, 2022

@dt: can you file an issue and link it from the PR, then ping @mwang1026 on it? Michael and Jon need it for accounting purposes.

Contributor Author

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)


pkg/server/server_sql.go, line 545 at r2 (raw file):

main reason is that these references to the host cluster's cluster ID are used in logging and telemetry

Er, I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster, so it needs to have a matching cluster ID in it. Any logging/telemetry usage is secondary.

Contributor Author

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @knz, and @RaduBerinde)


pkg/server/server_sql.go, line 528 at r1 (raw file):

Previously, knz (kena) wrote…

yep that is good, thanks

Done.


pkg/server/server_sql.go, line 545 at r2 (raw file):

check that cluster IDs included in the log entries (esp with output format JSON) remain the same

Hm, I think we should log them both, separately. The tenant cluster -- that is, the SQL pods -- has its own cluster ID, and it is hosted by/connected to some KV host cluster that provides its storage. We should log both of those, not just the latter.

@dt dt requested review from a team as code owners January 5, 2022 23:24
Contributor

@knz knz left a comment


Reviewed 1 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @RaduBerinde)


pkg/server/server_sql.go, line 545 at r2 (raw file):

I think the main reason is that that rpcContext is what we use to make RPCs to the host cluster

Yes you're right. Note I wrote "there are two reasons" but I only wrote one. You can insert the other first :)

I think we should log them both, separately

What you did for the messages at server startup is a-ok 💯

However, the point of this comment (and the reason I insist this be documented) is that, separately, we also log the host cluster ID on every log entry, all the time, for the purpose of routing log data to the right collection point. That value needs to be correct and not tweaked. And that's also what deserves to be validated in a unit test.

Maybe you can phrase this better than I did in the comment?

@knz
Contributor

knz commented Jan 18, 2022

Do you need help with moving this forward?

@dt
Contributor Author

dt commented Jan 18, 2022

Yeah, I had to push this to the back burner for some more urgent bug fixes and backports over the last couple weeks so I haven't gotten around to writing the relevant tests.

One thing to note: this currently does change the cluster ID reported by a pod when it sends its usage telemetry, to the new tenant-unique cluster ID. I'm of two minds on this.

On the one hand, I think this is Good/Ideal/Correct from an ideological point of view. Tenant A and Tenant B are two distinct clusters from the point of view of using SQL features, making tables, running types of queries, etc., and being on the same host cluster or not is irrelevant to that, so when reporting and collecting telemetry on that SQL usage, reporting and tracking it under their separate cluster IDs, the same as if they were two separate dedicated clusters, seems most correct to me. If you want to know how many clusters have more than 1000 tables, counting up the distinct cluster IDs that reported >1k table count should tell you the right answer regardless of how many were dedicated, serverless on separate hosts, or serverless but on the same host. Now of course we might also want to look for trends correlated to host clusters -- does a certain host cluster skew some reported metric or something -- so we'd also want to include a new, separate hostClusterID field in reports from serverless clusters, but from a principled view of what a unique cluster is, I think I like each tenant sending its telemetry using its individual cluster ID as its "cluster".

On the other hand, the existing reporting has already been extended to handle distinct serverless clusters not reporting under distinct cluster IDs, and instead has started treating the pair of <cluster id, tenant id> as a unique identifier for a logical cluster (perhaps the fact we had to go do that was a signal we actually should have been doing something like this PR, to give individual clusters their own IDs?). Thus, a more practical answer is to just have the telemetry reporter grab the host cluster ID when generating reports to continue to conform to the existing expectation. This certainly seems expedient, if maybe less ideologically pure?

@knz
Contributor

knz commented Jan 18, 2022

let's ask @thtruo what a good design would be here.

@knz
Contributor

knz commented Jan 20, 2022

Discussed with @thtruo .
For telemetry, we're moving away from the crdb-known ClusterID to identify clusters.
Instead, (my understanding is that) the log collection pipeline is adding the control plane's idea of the cluster's identity in the envelope of telemetry messages. We also populate the tenant ID, and there may even be an "org ID" in the future.

If that is correct, then we're in the clear with making the cluster ID work the way you've defined here.

@dt
Contributor Author

dt commented Jan 20, 2022

Thanks for resolving that!

I feel like there's a similar argument to the one I made above to be made about the cluster ID in structured logging messages as well -- that each serverless tenant cluster should emit log messages tagged with its own unique cluster ID as the cluster, rather than the cluster ID of the host cluster it is attached to (which could go in a separate field, or be determined by looking at the high bits), but I see you implied otherwise in earlier comments here. I don't think I feel strongly enough to argue the point though, so I think I'd like to also leave that determination to server folks.

@pransudash
Contributor

Quick question here for my clarity: Is the proposed solution then to have clusterId be the same as the unique id used across the control plane and in crdb_internal tables? Or would this hash of clusterId and tenantId result in a new id? If it's the latter, then telemetry logging will not be able to link events to specific clusters.

@knz
Contributor

knz commented Jan 28, 2022

I feel like there's a similar argument to the one I made above to be made about the cluster ID in structured logging messages as well -- that each serverless tenant cluster should emit log messages tagged with its own unique cluster ID as the cluster, rather than the cluster ID of the host cluster it is attached to (which could go in a separate field, or be determined by looking at the high bits), but I see you implied otherwise in earlier comments here.

Yes, this flows from the discussion with Tommy, and invalidates my earlier arguments.

This makes me wonder how much we need to preserve the host storage cluster ID in SQL instances at all.

Is the proposed solution then to have clusterId be the same as the unique id used across the control plane and in crdb_internal tables?

no, the opposite, however...

If it's the latter, then telemetry logging will not be able to link events to specific clusters.

This is mistaken, if my understanding of the logging pipeline is right. @thtruo or @logston should be able to confirm this: I believe that the cluster ID of the underlying storage cluster is added to telemetry messages alongside whichever value crdb has chosen. (This addition is a feature I requested a few months ago, due to a bug in crdb: it's possible for some logging messages to fail to include a cluster ID when they originate in crdb, so we really appreciate that the log collector re-adds it regardless of what crdb does.)

@knz
Contributor

knz commented Jan 28, 2022

@dt in light of these findings, how much remains on this PR?

@dt dt requested review from a team and stevendanna and removed request for a team April 5, 2022 16:38
@knz knz force-pushed the cluster-id branch 2 times, most recently from 9a3db0c to 49598ac on April 5, 2022 17:49
Contributor

@knz knz left a comment


Ok so as per my tweaks to the code:

  • the fields "storage cluster ID" and "logical cluster ID" are now properly named throughout the code, especially in the locations where both are important side-by-side.

  • the diagnostic upload via HTTP (not telemetry logging) now uses the storage-level cluster ID in the uuid query parameter, and the per-tenant cluster ID in the (new) logical_uuid query parameter.

    I am not sure the PM team actually cares about this (I don't think we do HTTP diagnostics reporting for multi-tenant), so I won't make more noise on this topic.

Reviewed 1 of 3 files at r4, 40 of 48 files at r5, 17 of 17 files at r6, 5 of 5 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @stevendanna)
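
As a small illustration of the second bullet, here is a sketch of how both IDs might be attached to the diagnostics URL. The uuid and logical_uuid parameter names come from the comment above; the endpoint, values, and surrounding code are placeholders rather than the actual reporting code.

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Placeholder values; the real reporter takes these from the server config.
	storageClusterID := "9c7dc4b2-3f2a-4e1c-8d5e-0a1b2c3d4e5f"
	logicalClusterID := "9c7dc4b2-3f2a-4e1c-1111-222233334444"

	q := url.Values{}
	q.Set("uuid", storageClusterID)         // storage-level cluster ID
	q.Set("logical_uuid", logicalClusterID) // per-tenant (logical) cluster ID

	report := url.URL{
		Scheme:   "https",
		Host:     "register.example.com", // placeholder diagnostics endpoint
		Path:     "/api/report",
		RawQuery: q.Encode(),
	}
	fmt.Println(report.String())
}
```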

@otan otan removed the request for review from a team April 6, 2022 02:09
@knz knz force-pushed the cluster-id branch 4 times, most recently from 00bf2e2 to 6a1853f on April 6, 2022 17:48
Prior to this patch, in a multi-tenant deployment the ClusterID value
visible to tenants was the same as the ClusterID for the storage
cluster.

This original design was defective because it prevented the
backup/restore subsystem from distinguishing backups made for
different tenants.

In this commit, we define a new *logical* ClusterID for tenants: a new
UUID that is the storage cluster's ClusterID with a hash of the
tenant's ID added to the lower 64 bits. This makes it so that two
tenants on the same storage cluster will observe different values when
calling ClusterID().

Release note (bug fix): Separate tenants now use unique values for the
internal field ClusterID. Previously separate tenants that
were hosted on the same storage cluster would be assigned the same
ClusterID.
@knz
Contributor

knz commented Apr 6, 2022

all right, this seems good to go now.

bors r+

@knz
Contributor

knz commented Apr 6, 2022

@dt I assume we want this backported to 22.1?

@craig
Contributor

craig bot commented Apr 6, 2022

Build succeeded:

@dt
Contributor Author

dt commented Apr 6, 2022

Yep, that'd be my vote (the linked issue is marked GA blocker)

@craig craig bot merged commit d4daa99 into cockroachdb:master Apr 6, 2022
@dt
Contributor Author

dt commented Apr 6, 2022

Also, thank you for taking this over!!!

@knz
Contributor

knz commented Apr 6, 2022

blathers backport 22.1

@blathers-crl

blathers-crl bot commented Apr 6, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 0dc2929 to blathers/backport-release-22.1-74005: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@dt dt deleted the cluster-id branch April 7, 2022 12:26
craig bot pushed a commit that referenced this pull request Apr 8, 2022
79622: roachtest: update libpq test r=otan a=rafiss

fixes #79570

This allows the query cancel tests to pass.

Release note: None

79632: cli: cosmetic change in the output of `start` r=dt a=knz

This fixes a cosmetic bug that made its way in the previous change
to `reportServerInfo`. (#74005)

Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>

Successfully merging this pull request may close these issues.

server: ClusterID non-unique for tenants