
sql: Add column family support for secondary indexes#42073

Merged
craig[bot] merged 1 commit into cockroachdb:master from rohany:col-fam-secondary
Dec 4, 2019

Conversation

@rohany
Contributor

@rohany rohany commented Oct 31, 2019

This PR adds support for stored columns in secondary indexes to respect
column families.

Secondary indexes now respect the family definitions applied
to tables, breaking secondary index k/v pairs into multiple pairs
depending on the family and stored column configurations.

This PR adds and details an extension of the primary index
column family encoding for secondary indexes, and implements it.

This encoding was implemented by updating how secondary indexes are
encoded and decoded in `EncodeSecondaryIndex` and the fetchers.

This change will not be respected until all nodes in the cluster are
at least running version 20.1.

Release note (sql change): Allow stored columns in secondary indexes
to respect the column family definitions of the table they are defined on.
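As a rough illustration of the splitting described above (this is not CockroachDB's actual encoder -- the `familyID` and `column` types and the `splitByFamily` helper are invented stand-ins for the real descriptor and KV types), grouping an index's stored columns by family yields one group per k/v pair the index would write:

```go
package main

import "fmt"

// familyID and column are simplified stand-ins for CockroachDB's
// descriptor types; the real encoder produces roachpb key/value pairs.
type familyID int

type column struct {
	name   string
	family familyID
}

// splitByFamily groups an index's stored columns by the column family
// they belong to, yielding one group per k/v pair the index would write.
func splitByFamily(stored []column) map[familyID][]string {
	groups := make(map[familyID][]string)
	for _, c := range stored {
		groups[c.family] = append(groups[c.family], c.name)
	}
	return groups
}

func main() {
	cols := []column{{"a", 0}, {"b", 0}, {"c", 1}}
	fmt.Println(len(splitByFamily(cols))) // prints 2: two families -> two k/v pairs
}
```

A table with a single column family degenerates to a single group, matching the pre-change behavior of one k/v pair per secondary index row.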

@rohany rohany requested a review from a team as a code owner October 31, 2019 18:52
@cockroach-teamcity
Member

This change is Reviewable

@rohany rohany force-pushed the col-fam-secondary branch from 4a884a9 to a1114bd on November 4, 2019 18:40
Member

@jordanlewis jordanlewis left a comment


Great work so far! Really excited about the PR.

My main comment is that this needs way more tests. Encoding is very subtle (as I'm sure you have found out already :). The only way to make sure that we've gotten it correct is to write tests.

Does @mjibson's column family randomizer tool help here? Or maybe it's already doing its magic?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz, @jordanlewis, @RaduBerinde, @rohany, and @solongordon)


docs/tech-notes/encoding.md, line 286 at r1 (raw file):

for each indexed column, the data therein are non-null and equal.

Secondary indexes have been extended to include support for column families.

Be more clear about when this change is happening. Imagine someone reading this doc in its entirety without knowledge of this migration.


docs/tech-notes/encoding.md, line 321 at r1 (raw file):

### Value encoding

If the value corresponds to column family 0:

This is still for secondary indexes right? Leave that in.


docs/tech-notes/encoding.md, line 323 at r1 (raw file):

If the value corresponds to column family 0:

The KV value will have value type bytes, and consists of

nit: grammar - tense doesn't agree (will have vs consists of)


docs/tech-notes/encoding.md, line 339 at r1 (raw file):

KV key and the KV value. (Ditto for stored column data in the old format.)

Else the KV Value will have value type `TUPLE` and consists of all stored

Be more explicit about this else - this is a text document not a program. Say "For indexes with more than 1 column family, the non-0th column families KV values will..."
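The rule under discussion -- family 0 keeps the old BYTES encoding, every other family gets a TUPLE value -- can be sketched as a tiny branch (the `valueType` constants here are placeholders, not CockroachDB's actual value-type tags):

```go
package main

import "fmt"

type valueType string

const (
	typeBytes valueType = "BYTES"
	typeTuple valueType = "TUPLE"
)

// valueTypeForFamily mirrors the rule described above: the family-0
// value keeps the old BYTES encoding for backwards compatibility, while
// values for all other families use the TUPLE encoding of their stored
// columns.
func valueTypeForFamily(family int) valueType {
	if family == 0 {
		return typeBytes
	}
	return typeTuple
}

func main() {
	fmt.Println(valueTypeForFamily(0), valueTypeForFamily(3)) // prints BYTES TUPLE
}
```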


docs/tech-notes/encoding.md, line 342 at r1 (raw file):

columns in that family in the `TUPLE` encoded format.

### Backwards Compatibility

For what? Be more explicit here so it's clear to future readers.


pkg/sql/colencoding/key_encoding.go, line 238 at r1 (raw file):

// TODO(jordan): each type could be optimized here.
// TODO(jordan): should use this approach in the normal row fetcher.
// TODO(rohany): what is the approach in the normal row fetcher?

I think the normal row fetcher doesn't try to skip table key values at all if they're not necessary. It just decodes everything. Look at DecodeKeyVals.


pkg/sql/colexec/cfetcher.go, line 374 at r1 (raw file):

			// Primary indexes only contain ascendingly-encoded
			// values. If this ever changes, we'll probably have to
			// figure out the directions here too.

Hmm... do we have to worry about this? Can you test your new code with descending index columns too? Same comment for fetcher.go.


pkg/sql/colexec/cfetcher.go, line 661 at r1 (raw file):

			// is that the extra columns are not always there, and are used to unique-ify
			// the index key, rather than provide the primary key column values.
			if rf.table.isSecondaryIndex && rf.table.index.Unique && len(rf.table.desc.Families) != 1 {

We should only do this logic if the secondary index has nullable columns. Also, if this condition gets much more complex, you shouldn't recheck it for every single key. It's statically known, so you can figure it out at the start of the run of the fetcher and cache it as a single boolean.
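The suggestion -- evaluate the statically-known condition once at fetcher init and cache it as a boolean -- might look like this in sketch form (the struct and field names are invented for illustration, not the real cfetcher):

```go
package main

import "fmt"

// fetcher is a toy stand-in for the cfetcher's table state.
type fetcher struct {
	isSecondaryIndex bool
	isUnique         bool
	hasNullableCols  bool
	numFamilies      int

	// mustCheckUniqueKeySuffix is computed once at init instead of
	// re-evaluating the whole condition for every key.
	mustCheckUniqueKeySuffix bool
}

func (f *fetcher) init() {
	f.mustCheckUniqueKeySuffix = f.isSecondaryIndex &&
		f.isUnique && f.hasNullableCols && f.numFamilies != 1
}

func main() {
	f := &fetcher{isSecondaryIndex: true, isUnique: true, hasNullableCols: true, numFamilies: 2}
	f.init()
	fmt.Println(f.mustCheckUniqueKeySuffix) // prints true
}
```

Per-key work then reduces to a single boolean load rather than re-checking descriptor state.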


pkg/sql/colexec/cfetcher.go, line 664 at r1 (raw file):

				// TODO (rohany): we don't want to have to look at each indexed value twice (once
				//  while decoding and once here). A possible optimization is to pass whether a
				//  null was encountered while decoding and use that information here instead.

Yeah, I'm pretty sure this is going to make scans significantly slower. I think this optimization is very worthwhile. The fetchers are very performance sensitive and it's worth checking the impact here.
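The TODO's proposed optimization -- have decoding report whether a NULL was seen instead of re-inspecting every indexed value afterwards -- might be sketched like this (the types are toy stand-ins for the real key decoder, with `nil` standing in for SQL NULL):

```go
package main

import "fmt"

// decodeIndexKey is a toy stand-in for the fetcher's key decoder: it
// decodes each indexed value (nil meaning SQL NULL) and reports whether
// any NULL was encountered, so the caller need not scan the key twice.
func decodeIndexKey(encoded []*int) (decoded []int, sawNull bool) {
	for _, v := range encoded {
		if v == nil {
			sawNull = true
			decoded = append(decoded, 0) // placeholder slot for NULL
			continue
		}
		decoded = append(decoded, *v)
	}
	return decoded, sawNull
}

func main() {
	one := 1
	_, sawNull := decodeIndexKey([]*int{&one, nil})
	fmt.Println(sawNull) // prints true: the caller learns about the NULL for free
}
```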


pkg/sql/sqlbase/index_encoding.go, line 813 at r1 (raw file):

		}

		// TODO (rohany): do I need to skip this logic for an inverted index?

Good question... do you? I don't think inverted indexes should have column families at all. And you definitely won't be able to make them primary later - they're fundamentally different beasts (a single database row can have more than one corresponding entry in a single inverted index).


pkg/sql/sqlbase/index_encoding.go, line 820 at r1 (raw file):

			// Copy the key, as we will modify it and move it into the resulting IndexEntry.
			entryKey := make([]byte, len(key))
			copy(entryKey, key)

This isn't necessary on the first iteration (which is the most common case). Can we avoid?


pkg/sql/sqlbase/index_encoding.go, line 825 at r1 (raw file):

			// TODO (rohany): make this faster by computing this up front with
			//  fastIntSets? I wonder if there was a way to reuse this computed
			//  information across calls to EncodeSecondaryIndex.

You should definitely precompute this. Otherwise it's like a duplicate N^2 loop on every key we're encoding. Also, it's pointless in the common case of 1 column family. Should we just have a special case for that vs more than 1?
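The precomputation being asked for -- build the per-family stored-column sets once per index rather than re-deriving them for every key, and special-case the single-family table -- could be sketched like this (column IDs are plain ints here; the real code would use descriptor types and fast int sets):

```go
package main

import "fmt"

// precomputeFamilyColumns builds, once per index, the stored column IDs
// that fall in each family, avoiding a nested loop on every key encode.
// The common single-family case skips the per-family split entirely.
func precomputeFamilyColumns(families map[int][]int, stored []int) map[int][]int {
	if len(families) == 1 {
		return nil // single family: no per-family split needed
	}
	storedSet := make(map[int]bool, len(stored))
	for _, id := range stored {
		storedSet[id] = true
	}
	out := make(map[int][]int, len(families))
	for fam, cols := range families {
		for _, id := range cols {
			if storedSet[id] {
				out[fam] = append(out[fam], id)
			}
		}
	}
	return out
}

func main() {
	fams := map[int][]int{0: {1, 2}, 1: {3}}
	fmt.Println(precomputeFamilyColumns(fams, []int{2, 3}))
}
```

The per-key encoding loop then only consults the precomputed sets, which is constant work per family instead of a scan over all stored columns.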

@rohany rohany force-pushed the col-fam-secondary branch from a1114bd to d603a3d on November 8, 2019 00:13
Contributor Author

@rohany rohany left a comment


Yeah, Matt's column family randomizer is already running -- it caught a bunch of bugs!

I updated the PR with a lot of the TODO optimizations implemented, and added a bunch of more detailed k/v trace tests that helped find some more bugs.

Working on this more opened up some more TODOs, so PTAL!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz, @jordanlewis, @RaduBerinde, and @solongordon)


docs/tech-notes/encoding.md, line 286 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Be more clear about when this change is happening. Imagine someone reading this doc in its entirety without knowledge of this migration.

Done.


docs/tech-notes/encoding.md, line 321 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This is still for secondary indexes right? Leave that in.

Done.


docs/tech-notes/encoding.md, line 339 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Be more explicit about this else - this is a text document not a program. Say "For indexes with more than 1 column family, the non-0th column families KV values will..."

Done.


docs/tech-notes/encoding.md, line 342 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

For what? Be more explicit here so it's clear to future readers.

Done.


pkg/sql/colencoding/key_encoding.go, line 238 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I think the normal row fetcher doesn't try to skip table key values at all if they're not necessary. It just decodes everything. Look at DecodeKeyVals.

Oh OK -- I think I originally read this as "should use the approach in the normal row fetcher" and was confused about why we didn't want to skip columns.


pkg/sql/colexec/cfetcher.go, line 374 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Hmm... do we have to worry about this? Can you test your new code with descending index columns too? Same comment for fetcher.go.

I don't think we have to worry about this for this PR -- will have to double check this once the primary key compatibility changes in the fetcher go through. Either way, we can ensure that the new candidate primary key has ascending columns as well (as a first step).


pkg/sql/colexec/cfetcher.go, line 661 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

We should only do this logic if the secondary index has nullable columns. Also, if this condition gets much more complex, you shouldn't recheck it for every single key. It's statically known, so you can figure it out at the start of the run of the fetcher and cache it as a single boolean.

I added the optimization below, so the check here can remain simple -- just these cached booleans plus whether or not a null was actually decoded.


pkg/sql/colexec/cfetcher.go, line 664 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Yeah, this is going to make scans signinficantly slower I'm pretty sure. I think this optimization is very worthwhile. The fetchers are very performance sensitive and it's worth checking the impact here.

Done.


pkg/sql/sqlbase/index_encoding.go, line 813 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Good question... do you? Inverted indexes I don't think should have column families at all. And you definitely won't be able to make them primary later - they're fundamentally different beasts (a single database row can have more than one corresponding entries in a single inverted index).

Makes sense -- I'll move back to the existing logic for inverted indexes just to be safe.


pkg/sql/sqlbase/index_encoding.go, line 820 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This isn't necessary on the first iteration (which is the most common case). Can we avoid?

Done.


pkg/sql/sqlbase/index_encoding.go, line 825 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

You should definitely precompute this. Otherwise it's like a duplicate N^2 loop on every key we're encoding. Also, it's pointless in the common case of 1 column family. Should we just have a special case for that vs more than 1?

Done -- I have to look into how to precompute this information across calls to `EncodeSecondaryIndex`. That might have to be the work of a future PR.

@rohany rohany force-pushed the col-fam-secondary branch 3 times, most recently from b9dd83f to 9b9ee41, on November 8, 2019 02:46
@rohany rohany requested a review from jordanlewis November 8, 2019 02:46
@rohany rohany force-pushed the col-fam-secondary branch 5 times, most recently from 6869797 to 308a389, on November 8, 2019 16:49
@rohany
Contributor Author

rohany commented Nov 14, 2019

Friendly ping

@danhhz
Contributor

danhhz commented Nov 14, 2019

Dunno if you're waiting on me, but I'm going to remove myself as one of the reviewers. Happy to hash out high level tradeoffs around column families but my knowledge of the details has rotted too much to be of any use.

@danhhz danhhz removed their request for review November 14, 2019 18:11
@rohany
Contributor Author

rohany commented Nov 14, 2019

Dunno if you're waiting on me, but I'm going to remove myself as one of the reviewers. Happy to hash out high level tradeoffs around column families but my knowledge of the details has rotted too much to be of any use.

Oh, my bad -- I thought you were still familiar with this area. Thanks for your help earlier with going through the high-level details!

Member

@RaduBerinde RaduBerinde left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, @rohany, and @solongordon)


docs/tech-notes/encoding.md, line 347 at r2 (raw file):

Because the key and value encodings for column family 0 have not changed,
we can read indexes encoded without column family support without any changes.

How does this work exactly? If I have an index that has the old encoding on-disk and I want to read a specific family (being able to do that is kind of the point of families), I won't scan family 0. I would need to know that I can't scan specific families with this index.

@rohany
Contributor Author

rohany commented Nov 15, 2019

How does this work exactly? If I have an index that has the old encoding on-disk and I want to read a specific family (being able to do that is kind of the point of families), I won't scan family 0. I would need to know that I can't scan specific families with this index.

That's a good point -- there are definitely cases like this where we won't want to use the encoding with families, so a versioning mark on the index descriptor is needed. I think the core of what I was trying to get at is that no special casing is needed for reading old encodings at the fetcher level, because the old encoding is compatible with the new one.
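The versioning mark being discussed here could be sketched roughly as follows (the version names and the helper are invented for illustration; they are not the descriptor fields CockroachDB actually uses):

```go
package main

import "fmt"

// indexVersion marks which encoding an index's rows were written with.
// Tracking this on the index descriptor lets readers know whether
// per-family scans are safe for that index.
type indexVersion int

const (
	baseVersion        indexVersion = iota // all stored columns in one k/v pair
	familyAwareVersion                     // one k/v pair per column family
)

// canScanSingleFamily reports whether a scan may fetch just one
// family's k/v pair: only indexes written entirely with the
// family-aware encoding allow it; old-format indexes must be read
// in full, even though their family-0 data decodes unchanged.
func canScanSingleFamily(v indexVersion) bool {
	return v >= familyAwareVersion
}

func main() {
	fmt.Println(canScanSingleFamily(baseVersion), canScanSingleFamily(familyAwareVersion)) // prints false true
}
```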

@RaduBerinde
Member

Yeah, that makes sense - a mark on the descriptor that tells us that the index has all rows encoded using the current families.

@rohany
Contributor Author

rohany commented Nov 15, 2019

Having a mark on the descriptor will also let me avoid plumbing a `cluster.Settings` object all the way down to the key encoding, which is something I left as a TODO :)

@rohany rohany force-pushed the col-fam-secondary branch 2 times, most recently from 0613566 to b3d7133, on November 16, 2019 17:09
Member

@jordanlewis jordanlewis left a comment


:lgtm: besides the unknown about the thing @dt is tagged on.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt, @jordanlewis, and @solongordon)


pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Hmm... this is a great question. It will not have impact on tables in earlier versions. I am not sure if this is a valid change. @dt would you weigh in?

Ping @dt

@rohany rohany force-pushed the col-fam-secondary branch 2 times, most recently from 241f194 to 26bbcb1, on December 2, 2019 15:44
Contributor

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rohany, and @solongordon)


pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Ping @dt

Uhh. I'm pretty out of the loop on descriptors and bootstrapping but I'm pretty sure this will only affect what is written on initial cluster creation and then what is used in a handful of unit tests. I don't know if you need to change this since you need a migration for existing descriptors anyway but I don't think it hurts anything either.

@rohany
Contributor Author

rohany commented Dec 3, 2019

I had to change these to stop a bunch of unit tests from failing -- do you know who we can ask to confirm that this only affects initial cluster creation?

Contributor

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rohany, and @solongordon)


pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):

Previously, dt (David Taylor) wrote…

Uhh. I'm pretty out of the loop on descriptors and bootstrapping but I'm pretty sure this will only affect what is written on initial cluster creation and then what is used in a handful of unit tests. I don't know if you need to change this since you need a migration for existing descriptors anyway but I don't think it hurts anything either.

to clarify, changing this would be expected to make the unit test that the descriptor matches the result of CREATE TABLE pass, but my point above was that this only affects new clusters -- on existing clusters, these tables already exist, so depending on what tests are failing, that could be an indication that it is actually important that existing descriptors get migrated too?

@rohany
Contributor Author

rohany commented Dec 4, 2019

Only the tests that check that executing the corresponding CREATE TABLE statements for these special tables produces the hardcoded table descriptors were failing.

@rohany
Contributor Author

rohany commented Dec 4, 2019

I'm looking at some earlier commits that bump versions on table descriptors / index descriptors and they change these without including migrations.

This PR adds support for stored columns in secondary indexes to respect
column families.

Secondary indexes now respect the family definitions applied
to tables, breaking secondary index k/v pairs into multiple pairs
depending on the family and stored column configurations.

This PR adds and details an extension of the primary index
column family encoding for secondary indexes, and implements it.

This encoding was implemented by updating how secondary indexes are
encoded and decoded in `EncodeSecondaryIndex` and the fetchers.

This change will not be respected until all nodes in the cluster are
at least running version 20.1.

Release note (sql change): Allow stored columns in secondary indexes
to respect the column family definitions of the table they are defined on.
@rohany
Contributor Author

rohany commented Dec 4, 2019

I don't think we'd want a migration for this in the first place -- the existing indexes are encoded with the old format, and we wouldn't want to bump them up to the new format (without rewriting all of their data). Only when a new cluster is created should the new version be used.

@rohany
Contributor Author

rohany commented Dec 4, 2019

I'm going to go ahead and try merging this!

bors r=jordanlewis

craig bot pushed a commit that referenced this pull request Dec 4, 2019
42073: sql: Add column family support for secondary indexes r=jordanlewis a=rohany

This PR adds support for stored columns in secondary indexes to respect
column families.

Secondary indexes now respect the family definitions applied
to tables, breaking secondary index k/v pairs into multiple pairs
depending on the family and stored column configurations.

This PR adds and details an extension of the primary index
column family encoding for secondary indexes, and implements it.

This encoding was implemented by updating how secondary indexes are
encoded and decoded in `EncodeSecondaryIndex` and the fetchers.

This change will not be respected until all nodes in the cluster are
at least running version 20.1.

Release note (sql change): Allow stored columns in secondary indexes
to respect the column family definitions of the table they are defined on.

Co-authored-by: rohany <rohany@alumni.cmu.edu>
@craig
Contributor

craig bot commented Dec 4, 2019

Build succeeded

@craig craig bot merged commit 1bcc7b8 into cockroachdb:master Dec 4, 2019
rohany added a commit to rohany/cockroach that referenced this pull request Dec 5, 2019
Fixes cockroachdb#42992.

PR cockroachdb#42073 introduced column families for secondary indexes, but also introduced
a bug where if the extra columns of a unique index were included because that
row had a null value, then those extra columns needed to be skipped when
decoding keys. If the type of that extra column was not supported by the
vectorized engine, an internal error was thrown. This PR fixes the bug by
updating the key skipping logic to handle all valid types.

There is no release note because this was not a change visible
through major versions.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 17, 2019
43002: sql: fix decoding error in fetcher r=jordanlewis a=rohany

Fixes #42992.

PR #42073 introduced column families for secondary indexes, but also introduced
a bug where if the extra columns of a unique index were included because that
row had a null value, then those extra columns needed to be skipped when
decoding keys. If the type of that extra column was not supported by the
vectorized engine, an internal error was thrown. This PR fixes the bug by
updating the key skipping logic to handle all valid types.

There is no release note because this was not a change visible
through major versions.

Release note: None

Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
rohany added a commit to rohany/cockroach that referenced this pull request Jan 13, 2020
Fixes cockroachdb#42992.

PR cockroachdb#42073 introduced column families for secondary indexes, but also introduced
a bug where if the extra columns of a unique index were included because that
row had a null value, then those extra columns needed to be skipped when
decoding keys. If the type of that extra column was not supported by the
vectorized engine, an internal error was thrown. This PR fixes the bug by
updating the key skipping logic to handle all valid types.

There is no release note because this was not a change visible
through major versions.

Release note: None