Skip to content

cli: support SQL disk spilling in tenants#71040

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
jaylim-crl:jay/disk-temp-storage
Oct 6, 2021
Merged

cli: support SQL disk spilling in tenants#71040
craig[bot] merged 1 commit intocockroachdb:masterfrom
jaylim-crl:jay/disk-temp-storage

Conversation

@jaylim-crl
Copy link
Copy Markdown
Contributor

@jaylim-crl jaylim-crl commented Oct 3, 2021

Previously, temp storage for SQL tenants was configured to point to memory
with a limit of 100MB. As a result, all ephemeral data when processing large
queries end up going to memory, and there was no way to configure this to
point to disk. This commit changes that behavior, and the default temp storage
for SQL tenants is now the same as SQL for dedicated, i.e. disk, with a limit
of 32GB. The operator can configure this through the --max-disk-temp-storage
flag.

Release note (cli change): cockroach mt start-sql will now support the
following flags to configure ephemeral storage for SQL when processing large
queries: --store, --temp-dir, and --max-disk-temp-storage.

Release note (sql change): SQL tenants will now spill to disk by default
when processing large queries, instead of memory.

Release justification: The upcoming Serverless MVP release plans to allow
spilling to ephemeral disk for SQL operations (See CC-4983), but the
existing start-sql for tenants doesn't support that. This commit changes
the default spilling behavior in multi-tenant scenarios, and allows an operator
to configure ephemeral SQL storage through the CLI, and should have no
impact on dedicated customers.

@jaylim-crl jaylim-crl requested review from a team as code owners October 3, 2021 05:28
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @darinpp)


pkg/cli/mt_start_sql.go, line 71 at r1 (raw file):

	// at the time of writing stateless and may not be provisioned with
	// suitable storage.
	serverCfg.Stores.Specs = nil

We can remove this safely now since we already default to logging to stderr:

if !isServerCmd || defaultLogDir == nil {
// Client commands, as well as servers without an on-disk store directory,
// default to output to stderr, with no capture of stray errors.
h.Config = logconfig.DefaultStderrConfig()
}


pkg/cli/mt_start_sql.go, line 91 at r1 (raw file):

	}

	// Initialize the target directory for temporary storage. If encryption at

Taken from

cockroach/pkg/cli/start.go

Lines 442 to 472 in bc42b66

// Next we initialize the target directory for temporary storage.
// If encryption at rest is enabled in any fashion, we'll want temp
// storage to be encrypted too. To achieve this, we use
// the first encrypted store as temp dir target, if any.
// If we can't find one, we use the first StoreSpec in the list.
//
// While we look, we also clean up any abandoned temporary directories. We
// don't know which store spec was used previously—and it may change if
// encryption gets enabled after the fact—so we check each store.
var specIdx = 0
for i, spec := range serverCfg.Stores.Specs {
if spec.IsEncrypted() {
// TODO(jackson): One store's EncryptionOptions may say to encrypt
// with a real key, while another store's say to use key=plain.
// This provides no guarantee that we'll use the encrypted one's.
specIdx = i
}
if spec.InMemory {
continue
}
recordPath := filepath.Join(spec.Path, server.TempDirsRecordFilename)
if err := storage.CleanupTempDirs(recordPath); err != nil {
return errors.Wrap(err, "could not cleanup temporary directories from record file")
}
}
if serverCfg.TempStorageConfig, err = initTempStorageConfig(
ctx, serverCfg.Settings, stopper, serverCfg.Stores.Specs[specIdx],
); err != nil {
return err
}

Copy link
Copy Markdown
Contributor Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also tested this on CC with --max-disk-temp-storage=50MiB, and it seems to work as intended:

$ cockroach sql --url '...'  -e 'select trunc(random()*1000000000)::int as name from generate_series(1, 100000000) order by name;'
ERROR: temp disk storage: disk budget exceeded: 1048576 bytes requested, 52428800 currently allocated, 52428800 bytes in budget
SQLSTATE: 53100
Failed running "sql"

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @darinpp)

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @darinpp)


pkg/cli/mt_start_sql.go, line 91 at r1 (raw file):

Previously, jaylim-crl (Jay Lim) wrote…

Taken from

cockroach/pkg/cli/start.go

Lines 442 to 472 in bc42b66

// Next we initialize the target directory for temporary storage.
// If encryption at rest is enabled in any fashion, we'll want temp
// storage to be encrypted too. To achieve this, we use
// the first encrypted store as temp dir target, if any.
// If we can't find one, we use the first StoreSpec in the list.
//
// While we look, we also clean up any abandoned temporary directories. We
// don't know which store spec was used previously—and it may change if
// encryption gets enabled after the fact—so we check each store.
var specIdx = 0
for i, spec := range serverCfg.Stores.Specs {
if spec.IsEncrypted() {
// TODO(jackson): One store's EncryptionOptions may say to encrypt
// with a real key, while another store's say to use key=plain.
// This provides no guarantee that we'll use the encrypted one's.
specIdx = i
}
if spec.InMemory {
continue
}
recordPath := filepath.Join(spec.Path, server.TempDirsRecordFilename)
if err := storage.CleanupTempDirs(recordPath); err != nil {
return errors.Wrap(err, "could not cleanup temporary directories from record file")
}
}
if serverCfg.TempStorageConfig, err = initTempStorageConfig(
ctx, serverCfg.Settings, stopper, serverCfg.Stores.Specs[specIdx],
); err != nil {
return err
}

Could you extract this code into a common function and call it from both places.

@jaylim-crl jaylim-crl force-pushed the jay/disk-temp-storage branch from 3858392 to aaf5203 Compare October 3, 2021 20:08
@jaylim-crl jaylim-crl requested a review from knz October 3, 2021 20:08
Copy link
Copy Markdown
Contributor Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @darinpp, and @knz)


pkg/cli/mt_start_sql.go, line 91 at r1 (raw file):

Previously, knz (kena) wrote…

Could you extract this code into a common function and call it from both places.

Done.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @darinpp)

@jaylim-crl
Copy link
Copy Markdown
Contributor Author

TFTR!

bors r=knz

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 6, 2021

Build failed:

@jaylim-crl
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 6, 2021

Build failed:

Previously, temp storage for SQL tenants was configured to point to memory
with a limit of 100MB. As a result, all ephemeral data when processing large
queries end up going to memory, and there was no way to configure this to
point to disk. This commit changes that behavior, and the default temp storage
for SQL tenants is now the same as SQL for dedicated, i.e. disk, with a limit
of 32GB. The operator can configure this through the `--max-disk-temp-storage`
flag.

Release note (cli change): `cockroach mt start-sql` will now support the
following flags to configure ephemeral storage for SQL when processing large
queries: `--store`, `--temp-dir`, and `--max-disk-temp-storage`.

Release note (sql change): SQL tenants will now spill to disk by default
when processing large queries, instead of memory.
@jaylim-crl jaylim-crl force-pushed the jay/disk-temp-storage branch from aaf5203 to af5a5a5 Compare October 6, 2021 12:34
@jaylim-crl
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 6, 2021

Build succeeded:

@craig craig bot merged commit 5175a31 into cockroachdb:master Oct 6, 2021
tbg added a commit to tbg/cockroach that referenced this pull request Oct 14, 2021
In cockroachdb#71040 we added disk spilling which in particular added the following
call to the `mt start-sql` code path:

https://github.com/cockroachdb/cockroach/blob/af5a5a5065ce80c5e6568b4b422bf5c3a179e173/pkg/cli/mt_start_sql.go#L90-L89

The tenant doesn't support the `--store` flag, and so this will always
be the default of `cockroach-data`.

This has the unfortunate effect of trying to clean up the temp dirs for
that directory, even if `--temp-dir` is supplied:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L223-L227

In the `multitenant-upgrade` roachtest, as it happens there was actually
a cockroach host instance running under `cockroach-data`, and so the
tenant would fail to try to remove its (locked) temp dirs.

This commit fixes that issue by making start-sql use an in-memory store
spec. This fixes the test flake, but I wonder if the temp storage
feature for tenants is working properly. I worry about this because
the concept of a "temp engine" always seems to require a store:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L274-L280

and I am not sure how deep this goes. Frankly I don't understand why
if you are providing a Path you also need to provide a StoreSpec. I
will leave untangling this to @knz and @jaylim-crl, who know this
code better than I do.

Fixes cockroachdb#69920.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Oct 15, 2021
In cockroachdb#71040 we added disk spilling for tenants which added the following
call to the `mt start-sql` code path:

https://github.com/cockroachdb/cockroach/blob/af5a5a5065ce80c5e6568b4b422bf5c3a179e173/pkg/cli/mt_start_sql.go#L90-L89

The defaults for the store match that of a regular CockroachDB node, and
the tenant will thus attempt to clean up temp dirs for `cockroach-data`
if no store is specified:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L223-L227

In the `multitenant-upgrade` roachtest, as it happens there was actually
a cockroach host instance running under `cockroach-data`, and so the
tenant would fail to try to remove its (locked) temp dirs.

Fix that by passing the `--store` flag to the tenants in this test. This
is mildly annoying since the predecessor version doesn't understand it
and so the test has to figure out when it is legal to pass it. Anyway,
it is done now.

I will point out that it isn't the greatest choice to have tenants
default to `cockroach-data` as the resulting interaction with a CRDB
server results in an unfortunate UX. I filed cockroachdb#71603 to that effect.

Release note: None
craig bot pushed a commit that referenced this pull request Oct 19, 2021
71604: roachtest: fix multitenant-upgrade r=jaylim-crl a=tbg

In #71040 we added disk spilling for tenants which added the following
call to the `mt start-sql` code path:

https://github.com/cockroachdb/cockroach/blob/af5a5a5065ce80c5e6568b4b422bf5c3a179e173/pkg/cli/mt_start_sql.go#L90-L89

The defaults for the store match that of a regular CockroachDB node, and
the tenant will thus attempt to clean up temp dirs for `cockroach-data`
if no store is specified:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L223-L227

In the `multitenant-upgrade` roachtest, as it happens there was actually
a cockroach host instance running under `cockroach-data`, and so the
tenant would fail to try to remove its (locked) temp dirs.

Fix that by passing the `--store` flag to the tenants in this test. This
is mildly annoying since the predecessor version doesn't understand it
and so the test has to figure out when it is legal to pass it. Anyway,
it is done now.

I will point out that it isn't the greatest choice to have tenants
default to `cockroach-data` as the resulting interaction with a CRDB
server results in an unfortunate UX. I filed #71603 to that effect.

Release note: None


Co-authored-by: Tobias Grieger <tobias.schottdorf@gmail.com>
blathers-crl bot pushed a commit that referenced this pull request Oct 19, 2021
In #71040 we added disk spilling for tenants which added the following
call to the `mt start-sql` code path:

https://github.com/cockroachdb/cockroach/blob/af5a5a5065ce80c5e6568b4b422bf5c3a179e173/pkg/cli/mt_start_sql.go#L90-L89

The defaults for the store match that of a regular CockroachDB node, and
the tenant will thus attempt to clean up temp dirs for `cockroach-data`
if no store is specified:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L223-L227

In the `multitenant-upgrade` roachtest, as it happens there was actually
a cockroach host instance running under `cockroach-data`, and so the
tenant would fail to try to remove its (locked) temp dirs.

Fix that by passing the `--store` flag to the tenants in this test. This
is mildly annoying since the predecessor version doesn't understand it
and so the test has to figure out when it is legal to pass it. Anyway,
it is done now.

I will point out that it isn't the greatest choice to have tenants
default to `cockroach-data` as the resulting interaction with a CRDB
server results in an unfortunate UX. I filed #71603 to that effect.

Release note: None
cameronnunez pushed a commit that referenced this pull request Oct 27, 2021
In #71040 we added disk spilling for tenants which added the following
call to the `mt start-sql` code path:

https://github.com/cockroachdb/cockroach/blob/af5a5a5065ce80c5e6568b4b422bf5c3a179e173/pkg/cli/mt_start_sql.go#L90-L89

The defaults for the store match that of a regular CockroachDB node, and
the tenant will thus attempt to clean up temp dirs for `cockroach-data`
if no store is specified:

https://github.com/cockroachdb/cockroach/blob/6999e5fded43f59eb5839dc8b943fd1e2a33a3fd/pkg/cli/start.go#L223-L227

In the `multitenant-upgrade` roachtest, as it happens there was actually
a cockroach host instance running under `cockroach-data`, and so the
tenant would fail to try to remove its (locked) temp dirs.

Fix that by passing the `--store` flag to the tenants in this test. This
is mildly annoying since the predecessor version doesn't understand it
and so the test has to figure out when it is legal to pass it. Anyway,
it is done now.

I will point out that it isn't the greatest choice to have tenants
default to `cockroach-data` as the resulting interaction with a CRDB
server results in an unfortunate UX. I filed #71603 to that effect.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Nov 25, 2021
Previously, the `multitenant-upgrade` test had several workarounds and
branches for dealing with predecessor versions less than v21.2,
particularly the fact that these versions did not support the `--store`
flag passed on the `mt start-sql` CLI command.  This was introduced
in cockroachdb#71040, and subsequently incorporated into the roachtest (including
workarounds) in cockroachdb#71604.  Now that the current version is v22.1, and thus
the predecessor version supports the `--store` flag, it is necessary to
remove the workaround in order to ensure proper functionality of the
test.  Additionally, this change cleans up an even earlier workaround
for validating the cluster version for predecessor versions less than
v21.1.2.

Fixes cockroachdb#72971.

Release note: None
craig bot pushed a commit that referenced this pull request Nov 29, 2021
72948: ui: add index stats to table details page r=lindseyjin a=lindseyjin

Resolves #67647, #72842

Previously, there was no way to view and clear index usage stats from
the frontend db console. This commit adds Index Stats tables for each
table on the Table Detail pages, allowing users to view index names,
total reads, and last used statistics. This commit also adds the
functionality of clearing all index stats as a button on the Index Stats
tables.

![image](https://user-images.githubusercontent.com/29153209/142700094-cad60f81-8c83-48cc-b1f4-970e80d951ee.png)

Release note (ui change): Add index stats table and button to clear
index usage stats on the Table Details page for each table.

73145: roachtest: fix flags used in multitenant-upgrade for versions >= v21.2 r=AlexTalks a=AlexTalks

Previously, the `multitenant-upgrade` test had several workarounds and
branches for dealing with predecessor versions less than v21.2,
particularly the fact that these versions did not support the `--store`
flag passed on the `mt start-sql` CLI command.  This was introduced
in #71040, and subsequently incorporated into the roachtest (including
workarounds) in #71604.  Now that the current version is v22.1, and thus
the predecessor version supports the `--store` flag, it is necessary to
remove the workaround in order to ensure proper functionality of the
test.  Additionally, this change cleans up an even earlier workaround
for validating the cluster version for predecessor versions less than
v21.1.2.

Fixes #72971.

Release note: None

Co-authored-by: Lindsey Jin <lindsey.jin@cockroachlabs.com>
Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants