Skip to content

sql: Replace WITH BUCKET_COUNT with new bucket_count storage param.#76112

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
chengxiong-ruan:refactor-bucket-count-to-use-new-syntax
Feb 9, 2022
Merged

sql: Replace WITH BUCKET_COUNT with new bucket_count storage param.#76112
craig[bot] merged 1 commit intocockroachdb:masterfrom
chengxiong-ruan:refactor-bucket-count-to-use-new-syntax

Conversation

@chengxiong-ruan
Copy link
Copy Markdown
Contributor

a follow up of pr #76068

Release note (sql change): We have add support for the new bucket_count
storage param syntax. We prefer it over the old WITH BUCKET_COUNT=xxx
syntax. With this change, crdb outputs the new syntax for SHOW CREATE.
Though for the AST tree formatting, we still respect the old syntax if
user used it.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@chengxiong-ruan chengxiong-ruan changed the title sql: Respect new bucket_count storage param for formatting sql: Replace WITH BUCKET_COUNT with new bucket_count storage param. Feb 5, 2022
@chengxiong-ruan chengxiong-ruan marked this pull request as ready for review February 7, 2022 14:26
@chengxiong-ruan chengxiong-ruan requested a review from a team February 7, 2022 14:26
@chengxiong-ruan chengxiong-ruan requested a review from a team as a code owner February 7, 2022 14:26
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 22 of 22 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)


pkg/sql/sem/tree/create.go, line 723 at r1 (raw file):

					ctx.WriteString(" USING HASH WITH BUCKET_COUNT=")
					ctx.FormatNode(node.PrimaryKey.ShardBuckets)
				}

This seems dangerous. Couldn't you end up with USING HASH WITH BUCKET_COUNT=123 WITH (bucket_count=123) somehow? As I'm reviewing this, I can only feel that this doesn't happen because the tests are exercised with the default value for the most part. I may be wrong.

It feels like the logic for determining whether USING HASH is there should be separate from the logic determining whether to use the legacy WITH BUCKET_COUNT vs the new WITH (bucket_count.

This comment applies to other places that changed in the tree package.

@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Feb 7, 2022


pkg/sql/sem/tree/create.go, line 723 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

This seems dangerous. Couldn't you end up with USING HASH WITH BUCKET_COUNT=123 WITH (bucket_count=123) somehow? As I'm reviewing this, I can only feel that this doesn't happen because the tests are exercised with the default value for the most part. I may be wrong.

It feels like the logic for determining whether USING HASH is there should be separate from the logic determining whether to use the legacy WITH BUCKET_COUNT vs the new WITH (bucket_count.

This comment applies to other places that changed in the tree package.

You may already be aware of this, but wanted to mention that there are multiple formatting functions. This one formats the parsed AST, while another code path in catformat formats an index descriptor. I agree this feels a bit dangerous, but maybe less so given that the use cases of formatting AST nodes are less common (or less important) that descriptors?

A possible alternative is to always output the preferred full length format, whether the default was used or not: USING HASH WITH (bucket_count=x).

@chengxiong-ruan
Copy link
Copy Markdown
Contributor Author


pkg/sql/sem/tree/create.go, line 723 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

You may already be aware of this, but wanted to mention that there are multiple formatting functions. This one formats the parsed AST, while another code path in catformat formats an index descriptor. I agree this feels a bit dangerous, but maybe less so given that the use cases of formatting AST nodes are less common (or less important) that descriptors?

A possible alternative is to always output the preferred full length format, whether the default was used or not: USING HASH WITH (bucket_count=x).

Thanks for raising this!
Yeah, I agree this would be dangerous somehow. I originally thought we should print out whatever users type int. But I don't have strong standing here and I'm open to @mgartner 's suggestion that always printing USING HASH WITH (bucket_count=x) if @postamar is ok with that.

For descriptors, it should be safe since we have validations on application level to prevent that. Is there a way we can add such kind of validations during parsing?

@postamar
Copy link
Copy Markdown

postamar commented Feb 7, 2022

Assuming it's feasible and doesn't break anything, I completely agree with @mgartner 's suggestion.

@chengxiong-ruan chengxiong-ruan force-pushed the refactor-bucket-count-to-use-new-syntax branch from c11824d to b84a3a9 Compare February 7, 2022 20:03
a follow up of pr cockroachdb#76068

Release note (sql change): We have add support for the new `bucket_count`
storage param syntax. We prefer it over the old `WITH BUCKET_COUNT=xxx`
syntax. With this change, crdb outputs the new syntax for `SHOW CREATE`.
Though for the AST tree formatting, we still respect the old syntax if
user used it.
@chengxiong-ruan chengxiong-ruan force-pushed the refactor-bucket-count-to-use-new-syntax branch from b84a3a9 to 9c2434a Compare February 8, 2022 16:52
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

@chengxiong-ruan
Copy link
Copy Markdown
Contributor Author

thanks for the review!
bors r+

craig bot pushed a commit that referenced this pull request Feb 8, 2022
75560: sql: allow non-ALL privileges to be granted WITH GRANT OPTION r=rafiss a=ecwall



75762: sql,roachpb: add plan hash correct value to persisted stats r=maryliag a=maryliag

Previously, we didn't have the plan hash/gist values, so
a dummy value was being used instead. Now that we have the value,
this commit uses those values to be corrected stored.
The Plan Hash is saved on its own column and is part of a
statement key. A plan gist is a string saved in the metadata
and can later on converted back into a logical plan.

Partially addresses #72129

Release note (sql change): Saving plan hash/gist to the Statements
persisted stats.

75880: server: better error message for tsdump initialization r=liamgillies a=liamgillies

Beforehand, the error message for tsdump initialization was
unclear and didn't provide enough information to support
engineers on how to fix it. To address this, the error
message has been revamped with instructions and commands
to get tsdump working.

Release note (cli change): Added instructions to an error
message when initializing tsdump.

76112: sql: Replace `WITH BUCKET_COUNT` with new `bucket_count` storage param. r=chengxiong-ruan a=chengxiong-ruan

a follow up of pr #76068

Release note (sql change): We have add support for the new `bucket_count`
storage param syntax. We prefer it over the old `WITH BUCKET_COUNT=xxx`
syntax. With this change, crdb outputs the new syntax for `SHOW CREATE`.
Though for the AST tree formatting, we still respect the old syntax if
user used it.

76129: geo: add pgerror codes for errors.Newf r=rafiss a=otan

Release note (sql change): Added PG error codes to the majority of
spatial related functions.

76242: sql/logictest: fix flakey new_schema_changer r=ajwerner a=ajwerner

IDs are not deterministic due to the non-transactional nature of the descriptor
ID allocator sequence. The ID wasn't really adding value to the test anyway
but rather the name was interesting.

Fixes #76237

Release note: None

Co-authored-by: Evan Wall <wall@cockroachlabs.com>
Co-authored-by: Marylia Gutierrez <marylia@cockroachlabs.com>
Co-authored-by: Liam Gillies <liam.gillies@cockroachlabs.com>
Co-authored-by: Chengxiong Ruan <chengxiongruan@gmail.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 8, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 8, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 9, 2022

Build succeeded:

@craig craig bot merged commit 25f081c into cockroachdb:master Feb 9, 2022
@chengxiong-ruan chengxiong-ruan deleted the refactor-bucket-count-to-use-new-syntax branch February 10, 2022 15:33
rafiss added a commit to rafiss/cockroach that referenced this pull request Jan 27, 2026
Previously, when displaying hash sharded indexes with STORING columns in
pg_indexes, the STORING clause appeared after the WITH (bucket_count=...)
clause. This produced invalid SQL that could not be re-executed:

  CREATE INDEX idx ON tbl USING btree (c1, c2) USING HASH WITH (bucket_count=16) STORING (c3)

The SQL grammar requires STORING to come before storage parameters.

This commit fixes the output order so that STORING appears before
WITH (bucket_count=...):

  CREATE INDEX idx ON tbl USING btree (c1, c2) USING HASH STORING (c3) WITH (bucket_count=16)

This matches the grammar and allows the statement to be re-executed.

This bug was introduced in cockroachdb#76112.

Fixes cockroachdb#161516

Release note (bug fix): Fixed a bug where the index definition shown in
pg_indexes for hash sharded indexes with STORING columns was not valid
SQL. The STORING clause now appears in the correct position.
craig bot pushed a commit that referenced this pull request Jan 27, 2026
161882: sql: fix STORING clause position for hash sharded indexes in pg_indexes r=rafiss a=rafiss

Previously, when displaying hash sharded indexes with STORING columns in pg_indexes, the STORING clause appeared after the WITH (bucket_count=...) clause. This produced invalid SQL that could not be re-executed:

    CREATE INDEX idx ON tbl USING btree (c1, c2) USING HASH WITH (bucket_count=16) STORING (c3)

The SQL grammar requires STORING to come before storage parameters.

This commit fixes the output order so that STORING appears before WITH (bucket_count=...):

    CREATE INDEX idx ON tbl USING btree (c1, c2) USING HASH STORING (c3) WITH (bucket_count=16)

This matches the grammar and allows the statement to be re-executed.

This bug was introduced in #76112.

Fixes #161516

Release note (bug fix): Fixed a bug where the index definition shown in pg_indexes for hash sharded indexes with STORING columns was not valid SQL. The STORING clause now appears in the correct position.

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants