sql: optimize NeededColumnFamilyIDs#44239
Conversation
|
This needs tests but is ready for an initial look if you're interested. |
nvb
left a comment
There was a problem hiding this comment.
Very nice! Thanks for putting this together so fast.
Reviewed 2 of 2 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @rohany and @solongordon)
pkg/sql/sqlbase/index_encoding.go, line 243 at r1 (raw file):
// NeededColumnFamilyIDs returns the minimal set of column families required to // retrieve neededCols for the specified table and index.
Make a note that the returned column families are in sorted order.
pkg/sql/sqlbase/index_encoding.go, line 266 at r1 (raw file):
// Iterate over the column families to find which ones contain needed // columns. We also keep track of whether all of the needed family columns are // nullable, since this means we need column family 0 as a sentinel.
"as a sentinel, regardless of whether any of the columns in it are needed"
pkg/sql/sqlbase/index_encoding.go, line 292 at r1 (raw file):
if needed { neededFamilyIDs = append(neededFamilyIDs, family.ID) if !nullable || family.ID == 0 {
nit: I'd set nullable == false in the if family.ID == 0 { branch. You'll need to move up the two bool declarations.
pkg/sql/sqlbase/index_encoding.go, line 301 at r1 (raw file):
} // If all the needed families are nullable, we need family 0 as a sentinel.
Add a note that we'll never enter this branch if family 0 is already in neededFamilyIDs.
pkg/sql/sqlbase/index_encoding.go, line 303 at r1 (raw file):
// If all the needed families are nullable, we need family 0 as a sentinel. if allFamiliesNullable { return append([]FamilyID{0}, neededFamilyIDs...)
Let's avoid the extra allocation, since this one is easy to avoid:
if allFamiliesNullable {
// Prepend family 0.
neededFamilyIDs = append(neededFamilyIDs, 0)
copy(neededFamilyIDs[1:], neededFamilyIDs)
neededFamilyIDs[0] = family0.ID
}
rohany
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @solongordon)
pkg/sql/sqlbase/index_encoding.go, line 279 at r1 (raw file):
for _, columnID := range family.ColumnIDs { columnOrdinal := colIdxMap[columnID] // We need this column family if it includes a needed column, unless that
This is a bit awkward, due to where we place the composite columns in primary vs secondary indexes (sorry):
If an index is the primary index, or is a secondary index encoded like a primary index (i.e. using the primary index encoding), then the composite value is included in the family that the column is in.
If an index is using the secondary index encoding, then composite values are always included in family 0.
40cc474 to
367426e
Compare
solongordon
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/sqlbase/index_encoding.go, line 243 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Make a note that the returned column families are in sorted order.
Does that mean it's guaranteed that table.Families is in sorted order? I don't see any reason it shouldn't be, but I was trying to avoid that assumption just in case.
pkg/sql/sqlbase/index_encoding.go, line 266 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"as a sentinel, regardless of whether any of the columns in it are needed"
Done.
pkg/sql/sqlbase/index_encoding.go, line 279 at r1 (raw file):
Previously, rohany (Rohan Yadav) wrote…
This is a bit awkward, due to where we place the composite columns in primary vs secondary indexes (sorry):
If an index is the primary index, or is a secondary index encoded like a primary index (i.e. using the primary index encoding), then the composite value is included in the family that the column is in.
If an index is using the secondary index encoding, then composite values are always included in family 0.
Thank you, I suspected I didn't get this quite right. (As usual I couldn't find this in encoding.md but now I see that it's there.)
pkg/sql/sqlbase/index_encoding.go, line 292 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: I'd set
nullable == falsein theif family.ID == 0 {branch. You'll need to move up the two bool declarations.
Done.
pkg/sql/sqlbase/index_encoding.go, line 301 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Add a note that we'll never enter this branch if family 0 is already in
neededFamilyIDs.
Done.
pkg/sql/sqlbase/index_encoding.go, line 303 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's avoid the extra allocation, since this one is easy to avoid:
if allFamiliesNullable { // Prepend family 0. neededFamilyIDs = append(neededFamilyIDs, 0) copy(neededFamilyIDs[1:], neededFamilyIDs) neededFamilyIDs[0] = family0.ID }
👍 Done.
solongordon
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/sqlbase/index_encoding.go, line 279 at r1 (raw file):
Previously, solongordon (Solon) wrote…
Thank you, I suspected I didn't get this quite right. (As usual I couldn't find this in
encoding.mdbut now I see that it's there.)
Done.
rohany
left a comment
There was a problem hiding this comment.
This logic is p tricky, so lets get some tests on it
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @rohany, and @solongordon)
pkg/sql/sqlbase/index_encoding.go, line 284 at r2 (raw file):
// Set column family 0 aside in case we need it as a sentinel. family0 = family if index.EncodingType == SecondaryIndexEncoding && compositeNeeded {
You actually need to check if the index is not the primary index here and that the encoding type is SecondaryIndexEncoding. That field is only populated on secondary indexes (is there a way to make this more clear? I'm worried this will lead to some bugs in the future)
pkg/sql/sqlbase/index_encoding.go, line 291 at r2 (raw file):
} for _, columnID := range family.ColumnIDs { if needed && !nullable {
could we move this outside of the loop?
pkg/sql/sqlbase/index_encoding.go, line 300 at r2 (raw file):
// composite. if neededCols.Contains(columnOrdinal) && (!indexedIDs.Contains(int(columnID)) || compositeIDs.Contains(int(columnID))) {
don't we need to only check the composite here if the index is a primary index? Otherwise we might erroneously generate spans for secondary indexes with composite columns (even though we already included family 0)
367426e to
43476ba
Compare
solongordon
left a comment
There was a problem hiding this comment.
Still iterating on getting existing logic tests to pass, then will add more of my own. I'll comment when this is ready for another look.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/sqlbase/index_encoding.go, line 284 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
You actually need to check if the index is not the primary index here and that the encoding type is SecondaryIndexEncoding. That field is only populated on secondary indexes (is there a way to make this more clear? I'm worried this will lead to some bugs in the future)
Ooh that is tricky. I made a bad assumption that PrimaryIndexEncoding was the zero value.
pkg/sql/sqlbase/index_encoding.go, line 291 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
could we move this outside of the loop?
We could, but it allows exiting the loop early in some cases.
pkg/sql/sqlbase/index_encoding.go, line 300 at r2 (raw file):
Previously, rohany (Rohan Yadav) wrote…
don't we need to only check the composite here if the index is a primary index? Otherwise we might erroneously generate spans for secondary indexes with composite columns (even though we already included family 0)
Yes. Done.
43476ba to
b2a0141
Compare
3e506d9 to
55368bb
Compare
|
Getting closer to making existing tests pass. I was previously not handling mutation columns, which got caught by |
5ba62cd to
c17dcd0
Compare
|
This should be ready for another look. This function has turned into a bit of a monster and I'd welcome suggestions for making it more efficient. I've mostly been focusing on correctness. |
solongordon
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/sqlbase/index_encoding.go, line 284 at r2 (raw file):
Previously, solongordon (Solon) wrote…
Ooh that is tricky. I made a bad assumption that
PrimaryIndexEncodingwas the zero value.
Done.
| neededCols = neededCols.Copy() | ||
|
|
||
| // Build some necessary data structures for column metadata. | ||
| columns := table.ColumnsWithMutations(true) |
There was a problem hiding this comment.
The SpanBuilder could cache these structures on creation of the object and pass them through to here.
pkg/sql/sqlbase/index_encoding.go
Outdated
| if indexedCols.Contains(columnOrdinal) && !compositeCols.Contains(columnOrdinal) { | ||
| // We can decode this column from the index key, so no particular family | ||
| // is needed. | ||
| neededCols.Remove(columnOrdinal) |
There was a problem hiding this comment.
are you allowedto remove stuff while iterating over the object?
pkg/sql/sqlbase/index_encoding.go
Outdated
| // is needed. | ||
| neededCols.Remove(columnOrdinal) | ||
| } | ||
| if compositesInFamily0 && compositeCols.Contains(columnOrdinal) || |
There was a problem hiding this comment.
can you add a comment here about this case?
nvb
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r2, 11 of 12 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten, @rohany, and @solongordon)
pkg/sql/logictest/testdata/logic_test/family, line 337 at r3 (raw file):
query TT SELECT field, description FROM [EXPLAIN SELECT c FROM t1 WHERE a = 10] WHERE field IN ('table', 'spans')
Does SELECT d from t1 WHERE a = 10 result in two distinct scans instead of a single scan over two families? If so, could you add that case as well?
pkg/sql/sqlbase/index_encoding.go, line 243 at r1 (raw file):
Previously, solongordon (Solon) wrote…
Does that mean it's guaranteed that
table.Familiesis in sorted order? I don't see any reason it shouldn't be, but I was trying to avoid that assumption just in case.
That's a good question. I'm not sure, but we seem to rely on this assumption in callers of this function, at least in order to merge spans. We shouldn't say it if it's not true, but I suspect you're right about table.Families, so maybe we're missing documentation on that slice as well.
pkg/sql/sqlbase/structured.go, line 3230 at r3 (raw file):
// mutation columns. func (desc *TableDescriptor) ColumnsWithMutations(mutations bool) []*ColumnDescriptor { nCols := len(desc.Columns)
If we're willing to switch the return type to []ColumnDesciptor then we can avoid allocating a new slice in what I expect is the overwhelmingly common case where !mutations || len(desc.Mutations) == 0 with something like:
nCols := len(desc.Columns)
columns := desc.Columns[:nCols:nCols]
if mutations {
for i := range desc.Mutations {
if col := desc.Mutations[i].GetColumn(); col != nil {
columns = append(columns, col)
}
}
}
return columnsc17dcd0 to
b3bd5db
Compare
solongordon
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/logictest/testdata/logic_test/family, line 337 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does
SELECT d from t1 WHERE a = 10result in two distinct scans instead of a single scan over two families? If so, could you add that case as well?
That particular query only needs to scan column family 0, but I know what you mean. Added a test case.
pkg/sql/sqlbase/index_encoding.go, line 252 at r3 (raw file):
Previously, rohany (Rohan Yadav) wrote…
The
SpanBuildercould cache these structures on creation of the object and pass them through to here.
Good idea. I'm going to save this for a follow-up PR if that's ok with you.
pkg/sql/sqlbase/index_encoding.go, line 288 at r3 (raw file):
Previously, rohany (Rohan Yadav) wrote…
are you allowedto remove stuff while iterating over the object?
Good point. It seems like yes but I agree it's dangerous to rely on that since ForEach doesn't make it explicit. Done.
pkg/sql/sqlbase/index_encoding.go, line 290 at r3 (raw file):
Previously, rohany (Rohan Yadav) wrote…
can you add a comment here about this case?
Done.
pkg/sql/sqlbase/structured.go, line 3230 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If we're willing to switch the return type to
[]ColumnDesciptorthen we can avoid allocating a new slice in what I expect is the overwhelmingly common case where!mutations || len(desc.Mutations) == 0with something like:nCols := len(desc.Columns) columns := desc.Columns[:nCols:nCols] if mutations { for i := range desc.Mutations { if col := desc.Mutations[i].GetColumn(); col != nil { columns = append(columns, col) } } } return columns
Good point, done.
rohany
left a comment
There was a problem hiding this comment.
LGTM aside from nit
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @solongordon)
pkg/sql/sqlbase/index_encoding.go, line 252 at r3 (raw file):
Previously, solongordon (Solon) wrote…
Good idea. I'm going to save this for a follow-up PR if that's ok with you.
Sounds good
pkg/sql/sqlbase/index_encoding.go, line 274 at r4 (raw file):
// columns which are not indexed.) var family0 *ColumnFamilyDescriptor hasSecondaryEncoding := index.ID != table.PrimaryIndex.ID &&
nit -- maybe we could move this into a function on the index descriptor? I've had to do this in some places as well, it would be a good idea to centralize it.
b3bd5db to
e7aa821
Compare
solongordon
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rohany)
pkg/sql/sqlbase/index_encoding.go, line 274 at r4 (raw file):
Previously, rohany (Rohan Yadav) wrote…
nit -- maybe we could move this into a function on the index descriptor? I've had to do this in some places as well, it would be a good idea to centralize it.
Ah, yeah I had the same thought. Done.
I made the logic for determining which column families need to be scanned more clever in two ways: - Previously we were always including column family 0 as a sentinel, since other column families have no KV entry if all their columns are null. This is not necessary if any of the column families being scanned have a NOT NULL column. - If a needed column is indexed and not composite, it can be decoded from the key, so we don't need to take it into account when determining the needed column families. Release note: None
e7aa821 to
2ba11ea
Compare
|
bors r+ |
44239: sql: optimize NeededColumnFamilyIDs r=solongordon a=solongordon I made the logic for determining which column families need to be scanned more clever in two ways: - Previously we were always including column family 0 as a sentinel, since other column families have no KV entry if all their columns are null. This is not necessary if any of the column families being scanned have a NOT NULL column. - If a needed column is indexed and not composite, it can be decoded from the key, so we don't need to take it into account when determining the needed column families. Release note: None Co-authored-by: Solon Gordon <solon@cockroachlabs.com>
Build succeeded |
This change marks all columns in the YCSB "usertable" as NOT NULL. Doing so allows the load generator to take advantage of cockroachdb#44239, which avoids the KV lookup on the primary column family entirely when querying one of the other column families. The query plans before and after demonstrate this: ``` --- Before root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/0-/"key"/1 /"key"/6/1-/"key"/6/2 | parallel | --- After root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/6/1-/"key"/6/2 ``` This becomes very important when running YCSB with a column family per field and with implicit SELECT FOR UPDATE (see cockroachdb#45159). Now that (as of cockroachdb#45701) UPDATE statements acquire upgrade locks during their initial row fetch, we don't want them acquiring upgrade locks on the primary column family of the row they are intending to update a single column in. This re-introduces the contention between writes to different columns in the same row that column families helped avoid (see cockroachdb#32704). By marking each column as NOT NULL, we can continue to avoid this contention.
This change marks all columns in the YCSB "usertable" as NOT NULL. Doing so allows the load generator to take advantage of cockroachdb#44239, which avoids the KV lookup on the primary column family entirely when querying one of the other column families. The query plans before and after demonstrate this: ``` --- Before root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/0-/"key"/1 /"key"/6/1-/"key"/6/2 | parallel | --- After root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/6/1-/"key"/6/2 ``` This becomes very important when running YCSB with a column family per field and with implicit SELECT FOR UPDATE (see cockroachdb#45159). Now that (as of cockroachdb#45701) UPDATE statements acquire upgrade locks during their initial row fetch, we don't want them acquiring upgrade locks on the primary column family of the row they are intending to update a single column in. This re-introduces the contention between writes to different columns in the same row that column families helped avoid (see cockroachdb#32704). By marking each column as NOT NULL, we can continue to avoid this contention.
This change marks all columns in the YCSB "usertable" as NOT NULL. Doing so allows the load generator to take advantage of cockroachdb#44239, which avoids the KV lookup on the primary column family entirely when querying one of the other column families. The query plans before and after demonstrate this: ``` --- Before root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/0-/"key"/1 /"key"/6/1-/"key"/6/2 | parallel | --- After root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key'; tree | field | description ------------+-------------+------------------------ | distributed | false | vectorized | false render | | └── scan | | | table | usertable@primary | spans | /"key"/6/1-/"key"/6/2 ``` This becomes very important when running YCSB with a column family per field and with implicit SELECT FOR UPDATE (see cockroachdb#45159). Now that (as of cockroachdb#45701) UPDATE statements acquire upgrade locks during their initial row fetch, we don't want them acquiring upgrade locks on the primary column family of the row they are intending to update a single column in. This re-introduces the contention between writes to different columns in the same row that column families helped avoid (see cockroachdb#32704). By marking each column as NOT NULL, we can continue to avoid this contention.
I made the logic for determining which column families need to be
scanned more clever in two ways:
since other column families have no KV entry if all their columns are
null. This is not necessary if any of the column families being
scanned have a NOT NULL column.
from the key, so we don't need to take it into account when
determining the needed column families.
Release note: None