Skip to content

Use faster set for column IDs in schemaexpr#49819

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
mgartner:use-fast-int-set-in-schemaexpr
Jun 4, 2020
Merged

Use faster set for column IDs in schemaexpr#49819
craig[bot] merged 2 commits intocockroachdb:masterfrom
mgartner:use-fast-int-set-in-schemaexpr

Conversation

@mgartner
Copy link
Copy Markdown
Contributor

@mgartner mgartner commented Jun 3, 2020

sqlbase: add ColSet wrapper for util.FastIntSet of ColumnID

There are numerous places where a map[sqlbase.ColumnID]struct{} or a
util.FastIntSet is used to represent a set of sqlbase.ColumnID. This
commit adds a typed wrapper around util.FastIntSet which is an
efficient and ergonimic replacement for maps and util.FastIntSet.

Release note: None

schemaexpr: use sqlbase.ColSet instead of maps

This commit replaces maps used as sets of integers with sqlbase.ColSet
because it is a more efficient set implementation.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @mjibson, @RaduBerinde, and @rytaft)


pkg/sql/schemaexpr/column.go, line 154 at r1 (raw file):

func replaceVars(
	desc *sqlbase.MutableTableDescriptor, rootExpr tree.Expr,
) (tree.Expr, util.FastIntSet, error) {

I'd name this return arg so it's more evident what it is, e.g. _ tree.Expr, columnIDs util.FastIntSet, _ error


pkg/sql/schemaexpr/computed_column.go, line 61 at r1 (raw file):

	}

	var dependencies util.FastIntSet

[nit] Unfortunately it's not easy to tell what the numbers in the set are, maybe depColumnIDs. Or, if there is or we expect wider-spread usage of sets of sqlbase.ColumnID, we may want to make a nicely typed wrapper like opt.ColSet.

@mgartner mgartner force-pushed the use-fast-int-set-in-schemaexpr branch 3 times, most recently from 65ccc0a to ae9c179 Compare June 3, 2020 15:16
Copy link
Copy Markdown
Contributor Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mjibson, @RaduBerinde, and @rytaft)


pkg/sql/schemaexpr/column.go, line 154 at r1 (raw file):

Previously, RaduBerinde wrote…

I'd name this return arg so it's more evident what it is, e.g. _ tree.Expr, columnIDs util.FastIntSet, _ error

Now that I've added a typed wrapper, I'm thinking this is OK to be unnamed. Do you agree?


pkg/sql/schemaexpr/computed_column.go, line 61 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] Unfortunately it's not easy to tell what the numbers in the set are, maybe depColumnIDs. Or, if there is or we expect wider-spread usage of sets of sqlbase.ColumnID, we may want to make a nicely typed wrapper like opt.ColSet.

Good point. I've updated the name, and I've also created the sqlbase.ColSet typed wrapper, since it looks like there's quite a few places that could use it.

Is that name good, or would sqlbase.ColIdSet or sqlbase.ColumnIDSet be better?

My quick searching of some places this could be used:

image.png

image copy 1.png

@mgartner mgartner changed the title sql: use util.FastIntSet in schemaexpr Use faster set for column IDs in schemaexpr Jun 3, 2020
@mgartner mgartner force-pushed the use-fast-int-set-in-schemaexpr branch from ae9c179 to c63ded6 Compare June 3, 2020 15:29
@RaduBerinde
Copy link
Copy Markdown
Member


pkg/sql/schemaexpr/computed_column.go, line 61 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Good point. I've updated the name, and I've also created the sqlbase.ColSet typed wrapper, since it looks like there's quite a few places that could use it.

Is that name good, or would sqlbase.ColIdSet or sqlbase.ColumnIDSet be better?

My quick searching of some places this could be used:

image.png

image copy 1.png

Nice! Maybe TableColSet? because you can only reasonably use it for columns of a single table (ids are unique only within each table)

@mgartner mgartner force-pushed the use-fast-int-set-in-schemaexpr branch from c63ded6 to 166d492 Compare June 3, 2020 17:49
There are numerous places where a `map[sqlbase.ColumnID]struct{}` or a
`util.FastIntSet` is used to represent a set of `sqlbase.ColumnID`. This
commit adds a typed wrapper around `util.FastIntSet` which is an
efficient and ergonomic replacement for maps and `util.FastIntSet`.

Release note: None
import "github.com/cockroachdb/cockroach/pkg/util"

// TableColSet efficiently stores an unordered set of column ids.
type TableColSet struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for wrapping util.FastIntSet here? As far as I can tell all the methods just call into util.FastIntSet.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's so it's strongly typed, which makes it evident what it stores and you don't have to cast ColumnIDs to ints and back.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, but it also feels like a good amount of duplication.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After working without a wrapper for opt.ColSet for a few releases (until Go implemented mid-stack inlining) I can tell you that the duplication is worth it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good -- i haven't had to use this extensively to really have an opinion.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the bit of duplication is worth it for the clarity and conciseness it provides when using TableColSet.

@mgartner mgartner force-pushed the use-fast-int-set-in-schemaexpr branch 2 times, most recently from b44a672 to 0a54fd8 Compare June 3, 2020 22:38
Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r2, 5 of 6 files at r3, 2 of 2 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @mjibson, @RaduBerinde, and @rohany)

This commit replaces maps used as sets of integers with
sqlbase.TableColSet because it is a more efficient set implementation.

Release note: None
@mgartner mgartner force-pushed the use-fast-int-set-in-schemaexpr branch from 0a54fd8 to f61f13e Compare June 4, 2020 03:15
@mgartner
Copy link
Copy Markdown
Contributor Author

mgartner commented Jun 4, 2020

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 4, 2020

Merge conflict (retrying...)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 4, 2020

Build succeeded

@craig craig bot merged commit 994d306 into cockroachdb:master Jun 4, 2020
@mgartner mgartner deleted the use-fast-int-set-in-schemaexpr branch June 4, 2020 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants