streamingccl: switch to CREATE TENANT ... FROM REPLICATION#92070
streamingccl: switch to CREATE TENANT ... FROM REPLICATION#92070craig[bot] merged 3 commits intocockroachdb:masterfrom
Conversation
|
|
6c71ef0 to
29d4e0e
Compare
9ba1dae to
4df6ece
Compare
lidorcarmel
left a comment
There was a problem hiding this comment.
I did a quick pass.. just a few questions.
Reviewed 5 of 5 files at r1, 36 of 36 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @baoalvin1, @dt, and @stevendanna)
-- commits line 12 at r1:
did we say no release notes? so that we will not have docs for now?
Code quote:
Release note (sql change): add syntax to create a tenant whose data will be
streamed from a tenant on the source cluster
pkg/ccl/streamingccl/streamclient/client.go line 119 at r2 (raw file):
SrcAddr streamingccl.PartitionAddress SrcLocality roachpb.Locality SrcTenantID roachpb.TenantID
can/should we use TenantName here instead of ID?
Code quote:
SrcTenantID roachpb.TenantIDpkg/sql/sem/tree/create.go line 2190 at r1 (raw file):
// ReplicationSourceAddress is the address of the source cluster that we are // replicating data from. ReplicationSourceAddress Expr
those probably are not needed here?
can we drop CreateTenant completely and have only CreateTenantFromReplication? (and rename CreateTenantFromReplication to CreateTenant... I think you see where I'm going..)
Code quote:
// ReplicationSourceTenantName is the name of the tenant that we are
// replicating into the newly created tenant.
ReplicationSourceTenantName Name
// ReplicationSourceAddress is the address of the source cluster that we are
// replicating data from.
ReplicationSourceAddress Expr
stevendanna
left a comment
There was a problem hiding this comment.
Nice. Overall looks reasonable.
I think there is at least 1 more use of the old syntax over in the roachtest.
In a follow up PR, we probably want to rip out the RESTORE TENANT ... FROM REPLICATION syntax from sql.y and friends as well.
| sourceTenantID roachpb.TenantID, | ||
| destinationTenantID roachpb.TenantID, |
| { | ||
| $$.val = &tree.CreateTenant{Name: tree.Name($3)} | ||
| } | ||
| | CREATE TENANT name FROM REPLICATION OF name ON string_or_placeholder |
There was a problem hiding this comment.
I think this statement might be the one where we really end up wanting to use placeholders for these names. But, no need to change it now, we can do a follow-up PR where we just change them.
adityamaru
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @baoalvin1, @dt, @lidorcarmel, and @stevendanna)
Previously, lidorcarmel (Lidor Carmel) wrote…
did we say no release notes? so that we will not have docs for now?
Old habits die hard, removing!
pkg/ccl/streamingccl/streamclient/client.go line 119 at r2 (raw file):
Previously, lidorcarmel (Lidor Carmel) wrote…
can/should we use TenantName here instead of ID?
nice catch, its now an unused field. Deleted.
pkg/ccl/streamingccl/streamproducer/stream_lifetime.go line 35 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Looking at this file, it looks like this is done? We create the PTS record in this function, advanced in updateReplicationStreamProgress, and then released in the producer job's OnFailOrCancel routine.
indeed, removed!
pkg/sql/parser/sql.y line 4166 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I think this statement might be the one where we really end up wanting to use placeholders for these names. But, no need to change it now, we can do a follow-up PR where we just change them.
👍 makes sense
4df6ece to
24bdc3a
Compare
This change adds the following syntax: `CREATE TENANT <dest-name> FROM REPLICATION OF <source-name> ON <source-cluster>` The syntax will be hooked up in a future commit. Informs: cockroachdb#91240 Release note:None
This change is a refactor of the replication logic to support the new syntax that addresses tenants by name instead of ID. When setting up the stream ingestion job on the destination cluster we do not know the source tenant ID. This information is necessary for us to setup our rekeyer that is used to rekey every KV streamed from the source before ingestion. The `SourceTenantID` is obtained from the source cluster as part of the `Topology` that governs how we setup our streaming partitions. The rest of the streaming logic is largely unchanged. Fixes: cockroachdb#91240 Release note: None
24bdc3a to
cd00c95
Compare
@stevendanna i tacked on one more commit with changes to the roachtest. I'm just running it to confirm I didn't break anything but it looks to be running smoothly as of now. |
adityamaru
left a comment
There was a problem hiding this comment.
Yup, I'll fast follow with the deletion.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @baoalvin1, @dt, @lidorcarmel, and @stevendanna)
pkg/sql/sem/tree/create.go line 2190 at r1 (raw file):
Previously, lidorcarmel (Lidor Carmel) wrote…
those probably are not needed here?
can we drop
CreateTenantcompletely and have onlyCreateTenantFromReplication? (and rename CreateTenantFromReplication to CreateTenant... I think you see where I'm going..)
my bad, this is detritus from when I was debating between introducing a new tree.Node or re-using CreateTenant.
The reason we need two distinct nodes is that CreateTenantFromReplication is a CCL node which means it relies on having a planHook that is used to plan the statement (see stream_ingestion_planning.go). This is the pattern we follow for calling CCL code from a non-CCL package like pkg/sqlsince having a non-CCL package import a CCL package is not allowed except for testing purposes.CreateTenant` OTOH is a non-ccl node and doesn't require a planhook for execution.
lidorcarmel
left a comment
There was a problem hiding this comment.
Reviewed 38 of 38 files at r3, 37 of 37 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @baoalvin1, @dt, and @stevendanna)
pkg/cmd/roachtest/tests/cluster_to_cluster.go line 109 at r5 (raw file):
t.Status("starting replication stream") streamReplStmt := fmt.Sprintf("CREATE TENANT \"%s\" FROM REPLICATION OF \"%s\" ON '%s'",
tiny nit: maybe use %q to avoid the \"? or quote everything with ' instead of "?
Code quote:
\"%s\" FROM REPLICATION OF \"%s\" pkg/sql/sem/tree/create.go line 2190 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
my bad, this is detritus from when I was debating between introducing a new tree.Node or re-using
CreateTenant.The reason we need two distinct nodes is that
CreateTenantFromReplicationis a CCL node which means it relies on having aplanHookthat is used to plan the statement (see stream_ingestion_planning.go). This is the pattern we follow for calling CCL code from a non-CCL package likepkg/sqlsince having a non-CCL package import a CCL package is not allowed except for testing purposes.CreateTenant` OTOH is a non-ccl node and doesn't require a planhook for execution.
good to know, thanks for explaining!
Release note: None
cd00c95 to
1b42eb4
Compare
adityamaru
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @baoalvin1, @dt, @lidorcarmel, and @stevendanna)
pkg/cmd/roachtest/tests/cluster_to_cluster.go line 109 at r5 (raw file):
Previously, lidorcarmel (Lidor Carmel) wrote…
tiny nit: maybe use
%qto avoid the\"? or quote everything with'instead of"?
Done.
|
TFTRs! bors r=lidorcarmel,stevendanna |
|
Build succeeded: |
This change is a refactor of the replication logic
to support the new syntax that addresses tenants by name
instead of ID.
When setting up the stream ingestion job on the destination
cluster we do not know the source tenant ID. This information is necessary
for us to setup our rekeyer that is used to rekey every KV streamed
from the source before ingestion. The
SourceTenantIDis obtained fromthe source cluster as part of the
Topologythat governs how wesetup our streaming partitions.
The rest of the streaming logic is largely unchanged.
Fixes: #91240
Release note: None