Skip to content

sql: add support for CREATE TABLE AS ... AS OF SYSTEM TIME#142147

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:ctas-aost
Mar 3, 2025
Merged

sql: add support for CREATE TABLE AS ... AS OF SYSTEM TIME#142147
craig[bot] merged 1 commit intocockroachdb:masterfrom
rafiss:ctas-aost

Conversation

@rafiss
Copy link
Copy Markdown
Collaborator

@rafiss rafiss commented Feb 28, 2025

Previously, running a statement of the form

CREATE TABLE t AS SELECT ... FROM ... AS OF SYSTEM TIME x

was not supported.

Now, it is supported. The semantics are that the table creation happens at the transaction timestamp, but the backfill that's performed to fetch the data from the SELECT is performed at the user-specified timestamp x.

This is useful for copying data from tables that are experiencing write traffic. Reading the contended table's data at a historical timestamp avoids contention on the CREATE TABLE AS.

informs #39123
Epic: CRDB-43310
Release note (sql change): CREATE TABLE AS SELECT ... FROM ... AS OF SYSTEM TIME x is now supported. It cannot be executed within an explicit transaction.

@rafiss rafiss requested a review from fqazi February 28, 2025 20:29
@rafiss rafiss requested review from a team as code owners February 28, 2025 20:29
@rafiss rafiss requested review from rytaft and removed request for a team February 28, 2025 20:29
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Just a few minor questions.

:lgtm:

Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)


pkg/sql/conn_executor_exec.go line 2788 at r1 (raw file):

			return nil
		}
		originalTxn := planner.txn

As a future thing we may want a protected timestamp


pkg/sql/conn_executor_exec.go line 2798 at r1 (raw file):

		planner.extendedEvalCtx.Descs = historicalCollection
		restoreOriginalPlanner = func() {
			historicalCollection.ReleaseAll(ctx)

Does the historicalTxn itself need a clean up? Or will the planner commit / rollback the txn on its own.

Copy link
Copy Markdown
Contributor

@spilchen spilchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @fqazi, @rafiss, and @rytaft)


pkg/sql/conn_executor_exec.go line 2788 at r1 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

As a future thing we may want a protected timestamp

Would the protected timestamp prevent GC from removing deleted rows during backfill? If so, it seems kind of important.


pkg/sql/opt/optbuilder/select.go line 1489 at r1 (raw file):

	switch b.stmt.(type) {
	case *tree.CreateTable:
		asOf.ForBackfill = true

Why do we need to set this here? It appears that the asOf struct is generated, validated, and then discarded. Or is this for the *b.evalCtx.AsOfSystemTime != asOf check below?

Copy link
Copy Markdown
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @fqazi, @rytaft, and @spilchen)


pkg/sql/conn_executor_exec.go line 2788 at r1 (raw file):

Previously, spilchen wrote…

Would the protected timestamp prevent GC from removing deleted rows during backfill? If so, it seems kind of important.

we already add a protected timestamp during the backfill here:

ptsCleaners = append(ptsCleaners,
sc.execCfg.ProtectedTimestampManager.TryToProtectBeforeGC(ctx, sc.job, tbl, ts))

@fqazi were you suggesting that we should add one during planning also? i don't think we should necessarily. planning should be very quick, and it is expected to fail if the chosen timestamp is from before the GC threshold.


pkg/sql/conn_executor_exec.go line 2798 at r1 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

Does the historicalTxn itself need a clean up? Or will the planner commit / rollback the txn on its own.

good point; i added a commit


pkg/sql/opt/optbuilder/select.go line 1489 at r1 (raw file):

Previously, spilchen wrote…

Why do we need to set this here? It appears that the asOf struct is generated, validated, and then discarded. Or is this for the *b.evalCtx.AsOfSystemTime != asOf check below?

yeah, this is for the *b.evalCtx.AsOfSystemTime != asOf check below. i can mention this in the comment.

Copy link
Copy Markdown
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft and @spilchen)


pkg/sql/conn_executor_exec.go line 2788 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

we already add a protected timestamp during the backfill here:

ptsCleaners = append(ptsCleaners,
sc.execCfg.ProtectedTimestampManager.TryToProtectBeforeGC(ctx, sc.job, tbl, ts))

@fqazi were you suggesting that we should add one during planning also? i don't think we should necessarily. planning should be very quick, and it is expected to fail if the chosen timestamp is from before the GC threshold.

Done.

Ah, so this is already covered. We shouldn't need one for planning

Previously, running a statement of the form

    CREATE TABLE t AS SELECT ... FROM ... AS OF SYSTEM TIME x

was not supported.

Now, it is supported. The semantics are that the table creation happens
at the transaction timestamp, but the backfill that's performed to fetch
the data from the `SELECT` is performed at the user-specified timestamp
x.

This is useful for copying data from tables that are experiencing write
traffic. Reading the contended table's data at a historical timestamp
avoids contention on the CREATE TABLE AS.

Release note (sql change): CREATE TABLE AS SELECT ... FROM ... AS OF
SYSTEM TIME x is now supported. It cannot be executed within an explicit
transaction.
Copy link
Copy Markdown
Contributor

@spilchen spilchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @fqazi and @rytaft)

@rafiss
Copy link
Copy Markdown
Collaborator Author

rafiss commented Mar 3, 2025

tftrs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 3, 2025

@craig craig bot merged commit 36aa616 into cockroachdb:master Mar 3, 2025
23 of 24 checks passed
@BramGruneir
Copy link
Copy Markdown
Member

This is amazing! 🥳

@rafiss rafiss deleted the ctas-aost branch March 4, 2025 23:24
@drewdeally
Copy link
Copy Markdown

Will this support DML operations such as insert into A select * from B asot ?

@rafiss
Copy link
Copy Markdown
Collaborator Author

rafiss commented Mar 5, 2025

No, that's a more general issue tracked in #35712. There aren't plans to work on that currently, but if there's demand for it, a TREQ is the way to go.

We also have merged #142259, which allows AS OF SYSTEM TIME to be used with CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants