Skip to content

sql: enable storage parameter syntax for primary key#75971

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
chengxiong-ruan:enable-with-storage-param-syntax-for-pk
Feb 4, 2022
Merged

sql: enable storage parameter syntax for primary key#75971
craig[bot] merged 1 commit intocockroachdb:masterfrom
chengxiong-ruan:enable-with-storage-param-syntax-for-pk

Conversation

@chengxiong-ruan
Copy link
Copy Markdown
Contributor

@chengxiong-ruan chengxiong-ruan commented Feb 3, 2022

This is a prerequisite to support WITH (bucket_count=...) syntax
for hash sharded index.

Release note (sql change): Previously, WITH (param=value) syntax
is not allowed for primary key. However, that does not align with
postgresql. Also, to support WITH (bucket_count=...) syntax for hash
sharded index, we need to enable it as well. This pr also adds
extra validation to prevent existing storage params from being
applied to primary keys. Which means, even the syntax is accepted
by the parser, none of existing storage param for inverted index
is allowed on primary key. We don't actually set those params
on top of PKs internally, but we need to tell user which param is
supprted or not. Eventually, when we add support for bucket_count
param, a param white list will be maintained for PKs.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@chengxiong-ruan
Copy link
Copy Markdown
Contributor Author


pkg/sql/create_table.go, line 2691 at r1 (raw file):

// validateUniqueConstraintParamsForCreateTable validate storage params of
// unique constraints passed in through `CREATE TABLE` statement.
func validateUniqueConstraintParamsForCreateTable(n *tree.CreateTable) error {

I didn't put these validations into the param observer pattern because right now we don't call paramparse.SetStorageParameters() for primary index and index of unique constraints since inverted index can't be PK or Unique. We could call it anyway and do validations there, but I don't see a clean way to create hash shard index within the param observer, and I think it's kinda bad to create index within the param observer. We may do a refactor if we want to support other storage params on pk and unique constraint in the future, but right now I think this is sufficient defense and good error hints for users.

@chengxiong-ruan chengxiong-ruan force-pushed the enable-with-storage-param-syntax-for-pk branch 2 times, most recently from 2adb407 to 686ab41 Compare February 3, 2022 21:22
@chengxiong-ruan chengxiong-ruan marked this pull request as ready for review February 3, 2022 21:23
@chengxiong-ruan chengxiong-ruan requested review from a team and ajwerner February 3, 2022 21:23
Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a superficial review and couldn't find anything obviously wrong, this looks good enough to me.

Reviewed 15 of 18 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)


pkg/sql/paramparse/validation.go, line 23 at r2 (raw file):

func ValidateUniqueConstraintParams(params tree.StorageParams, isPK bool) error {
	// TODO (issue 75243): add `bucket_count` as a valid param. Current dummy
	// implementation is just for a proof of concept and make golint happy.

Shouldn't bucket_count only be valid if USING HASH is present, though? Perhaps I'm wrong but the validation of the params seems like it will be more context-dependent than this function lets on. You already need an isPK bool arg to describe the context, perhaps it's worth wrapping it in a struct which describes this context.

Perhaps I'm getting ahead of myself here.


pkg/sql/paramparse/validation.go, line 30 at r2 (raw file):

		return pgerror.Newf(pgcode.InvalidParameterValue, "invalid storage param %q on Primary Key", params[0].Key)
	}
	return pgerror.Newf(pgcode.InvalidParameterValue, "invalid storage param %q on Unique Index", params[0].Key)

nit: no need to capitalize "unique index".

Copy link
Copy Markdown
Contributor Author

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)


pkg/sql/paramparse/validation.go, line 23 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

Shouldn't bucket_count only be valid if USING HASH is present, though? Perhaps I'm wrong but the validation of the params seems like it will be more context-dependent than this function lets on. You already need an isPK bool arg to describe the context, perhaps it's worth wrapping it in a struct which describes this context.

Perhaps I'm getting ahead of myself here.

You're definitely another man ahead of time! Yes, bucket_count should only exists if USING HASH is present. I'm having a follow up pr to actually add this validation. I planned to have this validation outside of here, but I think you brought a good point to wrap more (just one more) context in, will add it in my next pr if you are ok with that.


pkg/sql/paramparse/validation.go, line 30 at r2 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: no need to capitalize "unique index".

sg, will fix this

@chengxiong-ruan chengxiong-ruan force-pushed the enable-with-storage-param-syntax-for-pk branch from 686ab41 to 814d219 Compare February 3, 2022 22:38
@chengxiong-ruan
Copy link
Copy Markdown
Contributor Author


pkg/sql/paramparse/validation.go, line 23 at r2 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

You're definitely another man ahead of time! Yes, bucket_count should only exists if USING HASH is present. I'm having a follow up pr to actually add this validation. I planned to have this validation outside of here, but I think you brought a good point to wrap more (just one more) context in, will add it in my next pr if you are ok with that.

Actually, validation of bucket_count and USING_HASH coexisting should be held for normal indexes as well. So need to do it outside of this context :(

This is a prerequisite to support `WITH (bucket_count=...)` syntax
for hash sharded index.

Release note (sql change): Previously, `WITH (param=value)` syntax
is not allowed for primary key. However, that does not align with
postgresql. Also, to support `WITH (bucket_count=...)` syntax for hash
sharded index, we need to enable it as well. This pr also adds
extra validation to prevent existing storage params from being
applied to primary keys. Which means, even the syntax is accepted
by the parser, none of existing storage param for inverted index
is allowed on primary key. We don't actually set those params
on top of PKs internally, but we need to tell user which param is
supprted or not. Eventually, when we add support for `bucket_count`
param, a param white list will be maintained for PKs.
@chengxiong-ruan chengxiong-ruan force-pushed the enable-with-storage-param-syntax-for-pk branch from 814d219 to 350354f Compare February 4, 2022 02:11
@chengxiong-ruan
Copy link
Copy Markdown
Contributor Author

thanks for the review!
bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 4, 2022

Build succeeded:

@craig craig bot merged commit b51c9b4 into cockroachdb:master Feb 4, 2022
chengxiong-ruan added a commit to chengxiong-ruan/cockroach that referenced this pull request Feb 4, 2022
A follow up of pr cockroachdb#75971 .
Will send another pr to update `SHOW CREATE` output.

Release note (sql change): `bucket_count` storage parameter is added.
To create hash sharded index, now we may use the new syntax:
`USING HASH WITH (bucket_count=xxx)`. `bucket_count` storage param can
only be use together with `USING HASH`. The old `WITH BUCKET_COUNT=xxx`
syntax is still supported for backward compatibility. However, only
either the old or new syntax can be used, but not both. Err is returned
for clause like `USING HASH WITH BUCKET_COUNT=5 WITH (bucket_count=5).
@chengxiong-ruan chengxiong-ruan deleted the enable-with-storage-param-syntax-for-pk branch February 4, 2022 17:54
chengxiong-ruan added a commit to chengxiong-ruan/cockroach that referenced this pull request Feb 4, 2022
A follow up of pr cockroachdb#75971 .
Will send another pr to update `SHOW CREATE` output.

Release note (sql change): `bucket_count` storage parameter is added.
To create hash sharded index, now we may use the new syntax:
`USING HASH WITH (bucket_count=xxx)`. `bucket_count` storage param can
only be use together with `USING HASH`. The old `WITH BUCKET_COUNT=xxx`
syntax is still supported for backward compatibility. However, only
either the old or new syntax can be used, but not both. Err is returned
for clause like `USING HASH WITH BUCKET_COUNT=5 WITH (bucket_count=5).
chengxiong-ruan added a commit to chengxiong-ruan/cockroach that referenced this pull request Feb 5, 2022
A follow up of pr cockroachdb#75971 .
Will send another pr to update `SHOW CREATE` output.

Release note (sql change): `bucket_count` storage parameter is added.
To create hash sharded index, now we may use the new syntax:
`USING HASH WITH (bucket_count=xxx)`. `bucket_count` storage param can
only be use together with `USING HASH`. The old `WITH BUCKET_COUNT=xxx`
syntax is still supported for backward compatibility. However, only
either the old or new syntax can be used, but not both. Err is returned
for clause like `USING HASH WITH BUCKET_COUNT=5 WITH (bucket_count=5).
craig bot pushed a commit that referenced this pull request Feb 5, 2022
76068: sql: support bucket_count storage parameter r=chengxiong-ruan a=chengxiong-ruan

A follow up of pr #75971 .
Will send another pr to update `SHOW CREATE` output.

Release note (sql change): `bucket_count` storage parameter is added.
To create hash sharded index, now we may use the new syntax:
`USING HASH WITH (bucket_count=xxx)`. `bucket_count` storage param can
only be use together with `USING HASH`. The old `WITH BUCKET_COUNT=xxx`
syntax is still supported for backward compatibility. However, only
either the old or new syntax can be used, but not both. Err is returned
for clause like `USING HASH WITH BUCKET_COUNT=5 WITH (bucket_count=5).

Co-authored-by: Chengxiong Ruan <chengxiongruan@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants