sql: Add column family support for secondary indexes#42073
craig[bot] merged 1 commit into cockroachdb:master
Conversation
4a884a9 to a1114bd
jordanlewis
left a comment
Great work so far! Really excited about the PR.
My main comment is that this needs way more tests. Encoding is very subtle (as I'm sure you have found out already :). The only way to make sure that we've gotten it correct is to write tests.
Does @mjibson's column family randomizer tool help here? Or maybe it's already doing its magic?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @danhhz, @jordanlewis, @RaduBerinde, @rohany, and @solongordon)
docs/tech-notes/encoding.md, line 286 at r1 (raw file):
for each indexed column, the data therein are non-null and equal. Secondary indexes have been extended to include support for column families.
Be more clear about when this change is happening. Imagine someone reading this doc in its entirety without knowledge of this migration.
docs/tech-notes/encoding.md, line 321 at r1 (raw file):
### Value encoding
If the value corresponds to column family 0:
This is still for secondary indexes right? Leave that in.
docs/tech-notes/encoding.md, line 323 at r1 (raw file):
If the value corresponds to column family 0: The KV value will have value type bytes, and consists of
nit: grammar - tense doesn't agree (will have vs consists of)
docs/tech-notes/encoding.md, line 339 at r1 (raw file):
KV key and the KV value. (Ditto for stored column data in the old format.) Else the KV Value will have value type `TUPLE` and consists of all stored
Be more explicit about this else - this is a text document not a program. Say "For indexes with more than 1 column family, the non-0th column families KV values will..."
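The rule being discussed here can be illustrated with a small sketch. This is hypothetical Go, not CockroachDB's real encoder: it only shows the split that family 0 keeps the legacy `BYTES` value while every other family's KV value is `TUPLE`-encoded, with one KV pair per family.

```go
package main

import "fmt"

// kvValue is a stand-in for a secondary-index KV value: its value type and
// the stored columns encoded into it. Names here are illustrative only.
type kvValue struct {
	typ  string   // "BYTES" for family 0, "TUPLE" otherwise
	cols []string // stored columns encoded into this value
}

// splitByFamily groups a row's stored columns by family ID and assigns each
// family's KV value the type the tech note describes: the 0th family keeps
// the legacy BYTES encoding, non-0 families are TUPLE-encoded.
func splitByFamily(storedByFamily map[int][]string) map[int]kvValue {
	out := make(map[int]kvValue)
	for famID, cols := range storedByFamily {
		typ := "TUPLE"
		if famID == 0 {
			typ = "BYTES"
		}
		out[famID] = kvValue{typ: typ, cols: cols}
	}
	return out
}

func main() {
	kvs := splitByFamily(map[int][]string{0: {"a"}, 1: {"b", "c"}})
	fmt.Println(kvs[0].typ, kvs[1].typ) // BYTES TUPLE
}
```

The actual encoding of each value is of course far more involved; this only captures the family-0 vs non-0 distinction the doc comment asks to spell out.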
docs/tech-notes/encoding.md, line 342 at r1 (raw file):
columns in that family in the `TUPLE` encoded format.
### Backwards Compatibility
For what? Be more explicit here so it's clear to future readers.
pkg/sql/colencoding/key_encoding.go, line 238 at r1 (raw file):
// TODO(jordan): each type could be optimized here.
// TODO(jordan): should use this approach in the normal row fetcher.
// TODO(rohany): what is the approach in the normal row fetcher?
I think the normal row fetcher doesn't try to skip table key values at all if they're not necessary. It just decodes everything. Look at DecodeKeyVals.
pkg/sql/colexec/cfetcher.go, line 374 at r1 (raw file):
// Primary indexes only contain ascendingly-encoded
// values. If this ever changes, we'll probably have to
// figure out the directions here too.
Hmm... do we have to worry about this? Can you test your new code with descending index columns too? Same comment for fetcher.go.
pkg/sql/colexec/cfetcher.go, line 661 at r1 (raw file):
// is that the extra columns are not always there, and are used to unique-ify
// the index key, rather than provide the primary key column values.
if rf.table.isSecondaryIndex && rf.table.index.Unique && len(rf.table.desc.Families) != 1 {
We should only do this logic if the secondary index has nullable columns. Also, if this condition gets much more complex, you shouldn't recheck it for every single key. It's statically known, so you can figure it out at the start of the run of the fetcher and cache it as a single boolean.
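Jordan's suggestion can be sketched roughly as follows. The field and struct names are invented for illustration, not the real cfetcher fields: the point is that the condition depends only on static table/index metadata, so it can be computed once at fetcher startup and consulted per key as a single boolean.

```go
package main

import "fmt"

// tableInfo is a stand-in for the static index metadata the condition uses.
type tableInfo struct {
	isSecondaryIndex bool
	indexIsUnique    bool
	numFamilies      int
}

type fetcher struct {
	table tableInfo
	// mustCheckExtraCols is computed once in init and then consulted for
	// every key, rather than re-evaluating the compound condition per key.
	mustCheckExtraCols bool
}

func (f *fetcher) init() {
	f.mustCheckExtraCols = f.table.isSecondaryIndex &&
		f.table.indexIsUnique &&
		f.table.numFamilies != 1
}

func main() {
	f := &fetcher{table: tableInfo{isSecondaryIndex: true, indexIsUnique: true, numFamilies: 3}}
	f.init()
	fmt.Println(f.mustCheckExtraCols) // true
}
```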
pkg/sql/colexec/cfetcher.go, line 664 at r1 (raw file):
// TODO (rohany): we don't want to have to look at each indexed value twice (once
// while decoding and once here). A possible optimization is to pass whether a
// null was encountered while decoding and use that information here instead.
Yeah, I'm pretty sure this is going to make scans significantly slower. I think this optimization is very worthwhile. The fetchers are very performance sensitive, and it's worth checking the impact here.
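The shape of the optimization in the TODO can be sketched like this. The encoding is a stand-in (a nil element represents a NULL datum), and the function name is hypothetical: the idea is just that the decoder reports whether it saw a NULL as a side result of the single pass it already makes, so the caller never rescans the values.

```go
package main

import "fmt"

// decodeIndexKey walks the decoded datums once and reports both the values
// and whether any of them was NULL, so callers don't need a second pass.
// A *int stands in for a real datum; nil stands in for NULL.
func decodeIndexKey(datums []*int) (decoded []*int, sawNull bool) {
	for _, d := range datums {
		if d == nil {
			sawNull = true
		}
	}
	return datums, sawNull
}

func main() {
	one := 1
	_, sawNull := decodeIndexKey([]*int{&one, nil})
	fmt.Println(sawNull) // true
}
```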
pkg/sql/sqlbase/index_encoding.go, line 813 at r1 (raw file):
}
// TODO (rohany): do I need to skip this logic for an inverted index?
Good question... do you? I don't think inverted indexes should have column families at all. And you definitely won't be able to make them primary later -- they're fundamentally different beasts (a single database row can have more than one corresponding entry in a single inverted index).
pkg/sql/sqlbase/index_encoding.go, line 820 at r1 (raw file):
// Copy the key, as we will modify it and move it into the resulting IndexEntry.
entryKey := make([]byte, len(key))
copy(entryKey, key)
This isn't necessary on the first iteration (which is the most common case). Can we avoid?
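One way to structure what Jordan is asking for, sketched with a hypothetical helper (not the real index_encoding.go code): let the first iteration take ownership of the caller's buffer and only clone it for the later, rarer iterations.

```go
package main

import "fmt"

// buildEntryKeys produces one key per family. The first (and most common)
// iteration reuses the caller's buffer directly, since it is moved into the
// resulting entry anyway; only later iterations pay for a copy so they don't
// alias the original slice.
func buildEntryKeys(key []byte, numFamilies int) [][]byte {
	keys := make([][]byte, 0, numFamilies)
	for i := 0; i < numFamilies; i++ {
		entryKey := key
		if i > 0 {
			// Later iterations must not alias the caller's buffer.
			entryKey = append([]byte(nil), key...)
		}
		keys = append(keys, entryKey)
	}
	return keys
}

func main() {
	key := []byte{0x01, 0x02}
	keys := buildEntryKeys(key, 2)
	// The first entry shares key's backing array; the second does not.
	fmt.Println(&keys[0][0] == &key[0], &keys[1][0] == &key[0]) // true false
}
```

In the common single-family case this eliminates the allocation entirely.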
pkg/sql/sqlbase/index_encoding.go, line 825 at r1 (raw file):
// TODO (rohany): make this faster by computing this up front with
// fastIntSets? I wonder if there was a way to reuse this computed
// information across calls to EncodeSecondaryIndex.
You should definitely precompute this. Otherwise it's like a duplicate N^2 loop on every key we're encoding. Also, it's pointless in the common case of 1 column family. Should we just have a special case for that vs more than 1?
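A minimal sketch of the precomputation, with a plain `uint64` standing in for `util.FastIntSet` and invented names: build the family-to-stored-columns mapping once per index, so encoding each key becomes a lookup instead of a nested loop over families and columns.

```go
package main

import (
	"fmt"
	"math/bits"
)

// familyColSets maps a family ID to a bitset of the stored-column ordinals
// that live in that family. A uint64 stands in for util.FastIntSet, so this
// sketch assumes column ordinals below 64.
type familyColSets map[int]uint64

// precompute builds the per-family column sets once, up front. colFamily maps
// each column ordinal to its family ID; storedCols lists the index's stored
// column ordinals.
func precompute(colFamily map[int]int, storedCols []int) familyColSets {
	sets := make(familyColSets)
	for _, col := range storedCols {
		sets[colFamily[col]] |= 1 << uint(col)
	}
	return sets
}

func main() {
	// Columns 2 and 3 are stored; 2 lives in family 0, 3 in family 1.
	sets := precompute(map[int]int{2: 0, 3: 1}, []int{2, 3})
	fmt.Println(bits.OnesCount64(sets[0]), bits.OnesCount64(sets[1])) // 1 1
}
```

As the comment above notes, the single-family case can bypass this entirely with a special case.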
a1114bd to d603a3d
rohany
left a comment
Yeah, Matt's column family randomizer is already running -- it has caught a bunch of bugs!
I updated the PR with a lot of the TODO optimizations implemented, and added a bunch of more detailed k/v trace tests that helped find some more bugs.
Working on this more opened up some more TODOs, so PTAL!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @danhhz, @jordanlewis, @RaduBerinde, and @solongordon)
docs/tech-notes/encoding.md, line 286 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Be more clear about when this change is happening. Imagine someone reading this doc in its entirety without knowledge of this migration.
Done.
docs/tech-notes/encoding.md, line 321 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This is still for secondary indexes right? Leave that in.
Done.
docs/tech-notes/encoding.md, line 339 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Be more explicit about this else - this is a text document not a program. Say "For indexes with more than 1 column family, the non-0th column families KV values will..."
Done.
docs/tech-notes/encoding.md, line 342 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
For what? Be more explicit here so it's clear to future readers.
Done.
pkg/sql/colencoding/key_encoding.go, line 238 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
I think the normal row fetcher doesn't try to skip table key values at all if they're not necessary. It just decodes everything. Look at DecodeKeyVals.
Oh OK -- I think I originally read this as "should use the approach in the normal row fetcher" and was confused about why we didn't want to skip columns.
pkg/sql/colexec/cfetcher.go, line 374 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Hmm... do we have to worry about this? Can you test your new code with descending index columns too? Same comment for fetcher.go.
I don't think we have to worry about this for this PR -- we'll have to double-check it once the primary key compatibility changes in the fetcher go through. Either way, we can ensure that the new candidate primary key has ascending columns as well (as a first step).
pkg/sql/colexec/cfetcher.go, line 661 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
We should only do this logic if the secondary index has nullable columns. Also, if this condition gets much more complex, you shouldn't recheck it for every single key. It's statically known, so you can figure it out at the start of the run of the fetcher and cache it as a single boolean.
I added the optimization below, so the check here can remain simple -- just these booleans plus whether or not a null was actually decoded.
pkg/sql/colexec/cfetcher.go, line 664 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Yeah, this is going to make scans signinficantly slower I'm pretty sure. I think this optimization is very worthwhile. The fetchers are very performance sensitive and it's worth checking the impact here.
Done.
pkg/sql/sqlbase/index_encoding.go, line 813 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Good question... do you? Inverted indexes I don't think should have column families at all. And you definitely won't be able to make them primary later - they're fundamentally different beasts (a single database row can have more than one corresponding entries in a single inverted index).
Makes sense -- I'll move back to the existing logic for inverted indexes just to be safe.
pkg/sql/sqlbase/index_encoding.go, line 820 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This isn't necessary on the first iteration (which is the most common case). Can we avoid?
Done.
pkg/sql/sqlbase/index_encoding.go, line 825 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
You should definitely precompute this. Otherwise it's like a duplicate N^2 loop on every key we're encoding. Also, it's pointless in the common case of 1 column family. Should we just have a special case for that vs more than 1?
Done -- I have to look into how to precompute this information across calls to EncodeSecondaryIndex. That might have to be the work of a future PR.
b9dd83f to 9b9ee41
6869797 to 308a389
Friendly ping

Dunno if you're waiting on me, but I'm going to remove myself as one of the reviewers. Happy to hash out high-level tradeoffs around column families, but my knowledge of the details has rotted too much to be of any use.

Oh, my bad -- I thought you were still familiar with this area. Thanks for your help earlier with going through the high-level details!
RaduBerinde
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @RaduBerinde, @rohany, and @solongordon)
docs/tech-notes/encoding.md, line 347 at r2 (raw file):
Because the key and value encodings for column family 0 have not changed, we can read indexes encoded without column family support without any changes.
How does this work exactly? If I have an index that has the old encoding on-disk and I want to read a specific family (being able to do that is kind of the point of families), I won't scan family 0. I would need to know that I can't scan specific families with this index.
That's a good point -- there are definitely cases like this where we won't want to use the encoding with families, so a versioning mark on the index descriptor is needed. I think the core of what I was trying to get at is that there doesn't need to be any special casing for reading old encodings at the fetcher level, because the old encoding is compatible with the new one.

Yeah, that makes sense -- a mark on the descriptor that tells us that the index has all rows encoded using the current families.

Having a mark on the descriptor will also allow me to not have to plumb a cluster.settings object all the way down to the key encoder, which is something I left as a TODO :)
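The versioning idea from this thread can be sketched as follows. All names here (the version constants, the descriptor field, the helper) are invented for illustration, not CockroachDB's real descriptor protos: the point is that readers branch on the descriptor alone, with no cluster settings plumbed into the encoder.

```go
package main

import "fmt"

// indexVersion records which encoding an index's rows are guaranteed to use.
type indexVersion int

const (
	// baseIndexVersion: all rows use the pre-family encoding, so only
	// whole-index reads are safe.
	baseIndexVersion indexVersion = iota
	// secondaryIndexFamilyVersion: all rows are encoded per column family.
	secondaryIndexFamilyVersion
)

// indexDescriptor is a stand-in for the index descriptor carrying the mark.
type indexDescriptor struct {
	version indexVersion
}

// canScanSingleFamily reports whether a scan may safely target one specific
// family: with the old encoding everything lives under family 0, so a scan
// of a non-0 family would silently find nothing.
func canScanSingleFamily(desc indexDescriptor) bool {
	return desc.version >= secondaryIndexFamilyVersion
}

func main() {
	fmt.Println(canScanSingleFamily(indexDescriptor{version: baseIndexVersion}))
	fmt.Println(canScanSingleFamily(indexDescriptor{version: secondaryIndexFamilyVersion}))
}
```

This addresses Radu's scenario above: an index marked with the old version simply never offers per-family scans.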
0613566 to b3d7133
jordanlewis
left a comment
LGTM besides the unknown about the thing @dt is tagged on.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @dt, @jordanlewis, and @solongordon)
pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Hmm... this is a great question. It will not have impact on tables in earlier versions. I am not sure if this is a valid change. @dt would you weigh in?
Ping @dt
241f194 to 26bbcb1
dt
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rohany, and @solongordon)
pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
Ping @dt
Uhh, I'm pretty out of the loop on descriptors and bootstrapping, but I'm pretty sure this will only affect what is written on initial cluster creation, and then what is used in a handful of unit tests. I don't know if you need to change this, since you need a migration for existing descriptors anyway, but I don't think it hurts anything either.
I had to change these to stop a bunch of unit tests from failing -- do you know how/who to ask so we can confirm this only affects initial cluster creation?
dt
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @rohany, and @solongordon)
pkg/sql/sqlbase/system.go, line 324 at r4 (raw file):
Previously, dt (David Taylor) wrote…
Uhh, I'm pretty out of the loop on descriptors and bootstrapping, but I'm pretty sure this will only affect what is written on initial cluster creation, and then what is used in a handful of unit tests. I don't know if you need to change this, since you need a migration for existing descriptors anyway, but I don't think it hurts anything either.
To clarify, changing this would be expected to make the unit test that checks the descriptor matches the result of CREATE TABLE pass. But my point above was that this only affects new clusters -- on existing clusters, these tables already exist. So depending on which tests are failing, that could be an indication that it is actually important that existing descriptors get migrated too?
Only the tests that were checking that executing the corresponding

I'm looking at some earlier commits that bump versions on table descriptors / index descriptors, and they change these without including migrations.
This PR adds support for stored columns in secondary indexes to respect column families. Secondary indexes respect the family definitions applied to tables, and break secondary index k/v pairs into multiple pairs depending on the family and stored column configurations. This PR adds and details an extension of the primary index column family encoding for secondary indexes, and implements it. This encoding was implemented by updating how secondary indexes are encoded and decoded in `EncodeSecondaryIndex` and the fetchers. This change will not be respected until all nodes in the cluster are running at least version 20.1.

Release note (sql change): Allow stored columns in secondary indexes to respect the column family definitions of the table they are on.
26bbcb1 to 1bcc7b8
I don't think we'd want a migration for this in the first place -- the existing indexes are encoded with the old format, and we wouldn't want to bump them up to the new format (without rewriting all of their data). Only when a new cluster is created should the new version be used.

I'm going to go ahead and try merging this!

bors r=jordanlewis
42073: sql: Add column family support for secondary indexes r=jordanlewis a=rohany

This PR adds support for stored columns in secondary indexes to respect column families. Secondary indexes respect the family definitions applied to tables, and break secondary index k/v pairs into multiple pairs depending on the family and stored column configurations. This PR adds and details an extension of the primary index column family encoding for secondary indexes, and implements it. This encoding was implemented by updating how secondary indexes are encoded and decoded in `EncodeSecondaryIndex` and the fetchers. This change will not be respected until all nodes in the cluster are running at least version 20.1.

Release note (sql change): Allow stored columns in secondary indexes to respect the column family definitions of the table they are on.

Co-authored-by: rohany <rohany@alumni.cmu.edu>
Build succeeded
Fixes cockroachdb#42992.

PR cockroachdb#42073 introduced column families for secondary indexes, but also introduced a bug: if the extra columns of a unique index were included because that row had a null value, then those extra columns needed to be skipped when decoding keys. If the type of such an extra column was not supported by the vectorized engine, an internal error was thrown. This PR fixes the bug by updating the key-skipping logic to handle all valid types.

There is no release note because this change was not visible across released versions.

Release note: None
43002: sql: fix decoding error in fetcher r=jordanlewis a=rohany

Fixes #42992.

PR #42073 introduced column families for secondary indexes, but also introduced a bug: if the extra columns of a unique index were included because that row had a null value, then those extra columns needed to be skipped when decoding keys. If the type of such an extra column was not supported by the vectorized engine, an internal error was thrown. This PR fixes the bug by updating the key-skipping logic to handle all valid types.

There is no release note because this change was not visible across released versions.

Release note: None

Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>