sql: add delete preserving index encoding, use for primary and second…#71813
sql: add delete preserving index encoding, use for primary and second…#71813craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
pkg/roachpb/data.proto
Outdated
| RangeClosedTimestampPolicy closed_timestamp_policy = 3; | ||
| } | ||
|
|
||
| message IndexValueWrapper { |
There was a problem hiding this comment.
Can you move this out to another file that sql-schema then owns? I know it is a random point in time to start applying hygiene to the roachpb package but I am just cleaning up the ownership notifications for roachpb over here: #71919
Thank you!
stevendanna
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt and @rhu713)
pkg/sql/delete_preserving_index_test.go, line 30 at r1 (raw file):
"github.com/cockroachdb/cockroach/pkg/sql/tests" "github.com/cockroachdb/cockroach/pkg/startupmigrations" "github.com/cockroachdb/cockroach/pkg/testutils/lint/passes/fmtsafe/testdata/src/github.com/cockroachdb/errors"
Looks like goimports picked up the wrong errors package. you'll want github.com/cockroachdb/errors
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
prefixEnd := append(prefix, []byte("\xff")...) revisions, err := storageccl.GetAllRevisions(context.Background(), kvDB, prefix, prefixEnd, now, end)
I believe we typically want to avoid importing ccl code in non-ccl packages. It's test code, but we might still want to check on this and perhaps just move where this function lives.
pkg/sql/catalog/descpb/structured.proto, line 566 at r1 (raw file):
optional string predicate = 23 [(gogoproto.nullable) = false]; optional bool use_delete_preserving_encoding = 24 [(gogoproto.nullable) = false];
Perhaps add a comment on why we have a separate bool for this instead of using the encode_type field.
pkg/sql/row/helper.go, line 94 at r1 (raw file):
// Secondary indexes. Indexes []catalog.Index indexEntries map[catalog.Index][]rowenc.IndexEntry
It looks like before this was stored on the struct to save on allocations in encodeSecondaryIndexes and encodeIndexes. Now, it looks like we are always allocating a new map. If we don't think that allocation savings is important anymore, we can probably remove this from the struct.
If we do still care about the allocation savings, perhaps we could so something where we store all of the entries in one slice and then have the maps that we return contain subslices of that single backing slice or something.
pkg/sql/row/helper.go, line 314 at r1 (raw file):
if err != nil { return err }
I wonder if it would be worth it to cache deleteEncoding somewhere, since we are always writing the same value. If for some reason we do want to encode it every time, I believe you can pass a protoutil.Message directly to (*batch).Put which has some code to avoid unnecessary allocation and copying in this case.
pkg/sql/row/helper.go, line 317 at r1 (raw file):
if traceKV { log.VEventf(ctx, 2, "Put %s -> %s", entry.Key, entry.Value.PrettyPrint())
Wonder if it would be useful to add something like (delete) to this entry to remind us this is a delete.
pkg/sql/rowenc/index_encoding.go, line 1199 at r1 (raw file):
indexEntries[i].Value.SetBytes(encodedEntry) } }
We could avoid iterating over indexEntries again by doing this when we are constructing entry. Although, I suppose it is nice that this is parallel with how we are doing it in EncodeSecondaryIndex.
23c0261 to
9515a3d
Compare
rhu713
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @stevendanna, and @tbg)
pkg/sql/delete_preserving_index_test.go, line 30 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Looks like goimports picked up the wrong errors package. you'll want
github.com/cockroachdb/errors
Done.
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I believe we typically want to avoid importing ccl code in non-ccl packages. It's test code, but we might still want to check on this and perhaps just move where this function lives.
Done. I moved the helper function to the sql/backfill package.
pkg/sql/catalog/descpb/structured.proto, line 566 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Perhaps add a comment on why we have a separate bool for this instead of using the encode_type field.
Done. Added to the message-level comment.
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
It looks like before this was stored on the struct to save on allocations in
encodeSecondaryIndexesandencodeIndexes. Now, it looks like we are always allocating a new map. If we don't think that allocation savings is important anymore, we can probably remove this from the struct.If we do still care about the allocation savings, perhaps we could so something where we store all of the entries in one slice and then have the maps that we return contain subslices of that single backing slice or something.
Done. Removed this stored field for this PR.
pkg/sql/row/helper.go, line 314 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I wonder if it would be worth it to cache deleteEncoding somewhere, since we are always writing the same value. If for some reason we do want to encode it every time, I believe you can pass a
protoutil.Messagedirectly to(*batch).Putwhich has some code to avoid unnecessary allocation and copying in this case.
Done. Cached the deleteEncoding as a var in the file.
pkg/sql/row/helper.go, line 317 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Wonder if it would be useful to add something like
(delete)to this entry to remind us this is a delete.
Done.
pkg/sql/rowenc/index_encoding.go, line 1199 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
We could avoid iterating over indexEntries again by doing this when we are constructing entry. Although, I suppose it is nice that this is parallel with how we are doing it in EncodeSecondaryIndex.
How expensive is it to iterate through the loop this way? I want to make sure to catch all of the places that an entry might be built and this seemed like a reasonable way to do it.
pkg/roachpb/data.proto, line 757 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Can you move this out to another file that sql-schema then owns? I know it is a random point in time to start applying hygiene to the roachpb package but I am just cleaning up the ownership notifications for
roachpbover here: #71919
Thank you!
Done. I moved this to catalog/descpb/structured.proto
dt
left a comment
There was a problem hiding this comment.
Reviewed 9 of 18 files at r1, 5 of 9 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @rhu713, @stevendanna, and @tbg)
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Done. I moved the helper function to the sql/backfill package.
+1 to not depending on ccl, but I'm not sure backfill is the right place for it either. maybe kv/kvclient (since it is a client utility helper for using the KV API) or kv/bulk ?
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Done. Removed this stored field for this PR.
hm, we'll want to work with sql folks to make sure we benchmark and profile some write-heavy workloads after this merges since allocating on every row might show up. commented below with a potential other idea.
pkg/sql/row/helper.go, line 198 at r2 (raw file):
) (secondaryIndexEntries map[catalog.Index][]rowenc.IndexEntry, err error) { indexEntries := make(map[catalog.Index][]rowenc.IndexEntry, len(rh.Indexes))
I wonder if we should keep rh.indexEntries around and just change its type to map instead to avoid re-allocating every time? we could do for i := range rh.indexEntries { rh.indexEntries[i] = rh.indexEntries[i][:0]} or something to reset it?
pkg/sql/backfill/revision_reader.go, line 3 at r2 (raw file):
// Copyright 2016 The Cockroach Authors. // // Licensed as a CockroachDB Enterprise file under the Cockroach Community
I think we also need to switch this to the BSL header instead of the CCL header.
pkg/sql/backfill/revision_reader.go, line 9 at r2 (raw file):
// https://github.com/cockroachdb/cockroach/blob/master/licenses/CCL.txt package backfill
mentioned below, think this should probably live elsewhere, maybe pkg/kv/kvclient
0f4f882 to
39c206e
Compare
rhu713
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @stevendanna, and @tbg)
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
Previously, dt (David Taylor) wrote…
+1 to not depending on ccl, but I'm not sure
backfillis the right place for it either. maybe kv/kvclient (since it is a client utility helper for using the KV API) or kv/bulk ?
Done. Moved to kv/client.
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, dt (David Taylor) wrote…
hm, we'll want to work with sql folks to make sure we benchmark and profile some write-heavy workloads after this merges since allocating on every row might show up. commented below with a potential other idea.
Okay, sounds good. That'll be after this PR is merged right? or before?
pkg/sql/row/helper.go, line 198 at r2 (raw file):
Previously, dt (David Taylor) wrote…
I wonder if we should keep rh.indexEntries around and just change its type to
mapinstead to avoid re-allocating every time? we could dofor i := range rh.indexEntries { rh.indexEntries[i] = rh.indexEntries[i][:0]}or something to reset it?
Okay. I changed this logic to make() the map when its len() is too short as an initialization check. And then I'm resetting it as you've suggested afterwards. Let me know what you think about it.
pkg/sql/backfill/revision_reader.go, line 3 at r2 (raw file):
Previously, dt (David Taylor) wrote…
I think we also need to switch this to the BSL header instead of the CCL header.
Done.
pkg/sql/backfill/revision_reader.go, line 9 at r2 (raw file):
Previously, dt (David Taylor) wrote…
mentioned below, think this should probably live elsewhere, maybe pkg/kv/kvclient
Done. Moved to kv/client.
dt
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @rhu713, @stevendanna, and @tbg)
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Done. Moved to kv/client.
I think they call the package kvclient vs just client unless you meant to make a new package?
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Okay, sounds good. That'll be after this PR is merged right? or before?
I think after is probably OK -- we can get this working to unblock the schema change work, then check the perf for any stay allocs or whatever in parallel after
pkg/sql/row/helper.go, line 198 at r2 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Okay. I changed this logic to make() the map when its len() is too short as an initialization check. And then I'm resetting it as you've suggested afterwards. Let me know what you think about it.
I doubt this matters, since I think it'll be the same len for all rows, but I think you could just compare to nil for the initialization.
39c206e to
74bc56c
Compare
rhu713
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @stevendanna, and @tbg)
pkg/sql/delete_preserving_index_test.go, line 132 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I think they call the package
kvclientvs justclientunless you meant to make a new package?
Oops typoed the refactor, fixed.
pkg/sql/row/helper.go, line 198 at r2 (raw file):
Previously, dt (David Taylor) wrote…
I doubt this matters, since I think it'll be the same len for all rows, but I think you could just compare to nil for the initialization.
Okay, used nil for the initialization.
|
Looking good, I'm going to go ahead and tag a SQL reviewer here to bring some SQL execution expertise. @rytaft do you know who might be best to give this a look? |
|
Tagging @mgartner for a review, since I think he has some familiarity with this code. |
mgartner
left a comment
There was a problem hiding this comment.
Very cool!
I think we should add some kvtrace logic tests in pkg/sql/exec/execbuilder/testdata/delete_nonmetamorphic (a new file) and .../update_nonmetamorphic and .../upsert_nonmetamorphic that show these index deletes working on complicated indexes like partial indexes, expression indexes, inverted indexes, multi-column inverted indexes. See pkg/sql/exec/execbuilder/testdata/partial_index_nonmetamorphic for some examples. I think you'll have to do some "descriptor surgery" on the index descriptor to get it into a state where UseDeletePreservingEncoding is true. For this you should be able to use crdb_internal.pb_to_json, crdb_internal.json_to_pb, and crdb_internal.unsafe_upsert_descriptor. See pkg/sql/logictest/testdata/logic_test/schema_repair for some examples. Let me know if you run into issues.
Reviewed 11 of 18 files at r1, 3 of 9 files at r2, 3 of 4 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, @stevendanna, and @tbg)
-- commits, line 8 at r5:
nit: give a short explanation of why delete-preserving index entries are needed, or at least link to the RFC.
pkg/sql/catalog/descpb/structured.proto, line 566 at r5 (raw file):
optional string predicate = 23 [(gogoproto.nullable) = false]; optional bool use_delete_preserving_encoding = 24 [(gogoproto.nullable) = false];
A comment describing what it means when this field is true might be helpful. Consider referencing the RFC
pkg/sql/catalog/descpb/structured.proto, line 1488 at r5 (raw file):
optional bytes value = 1; optional bool deleted = 2 [(gogoproto.nullable) = false];
nit: explain that when deleted is true, value will be nil.
pkg/sql/catalog/tabledesc/index.go, line 337 at r5 (raw file):
} func (w index) UseDeletePreservingEncoding() bool {
nit: add a comment to explain what this means, maybe referencing the RFC
pkg/sql/row/deleter.go, line 187 at r5 (raw file):
for _, entry := range secondaryIndexEntry { if err := rd.Helper.deleteIndexEntry(ctx, b, idx, nil, &entry, traceKV); err != nil {
No need to change this, but I'll note that DeleteIndexRow is only called when truncating an entire index, so I don't think you need to use the helper here. More specifically, this code path is only exercised for truncating interleaved indexes, so we can likely delete it all now anyway, given that we no longer support interleaved indexes. I'll make a note to myself to delete this all soon, so don't worry about it.
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I think after is probably OK -- we can get this working to unblock the schema change work, then check the perf for any stay allocs or whatever in parallel after
I don't see the need to make this a map. The i-th element of secondaryIndexEntries should correspond to the i-th secondary index in rh.Indexes. So when you need to call rh.deleteIndexEntry for index entry i, you can pass rh.Indexes[i] as the index.
Perhaps an even better option is to simplify the interface of rh.deleteIndexEntry to:
func (rh *rowHelper) deleteIndexEntry(
ctx context.Context,
batch *kv.Batch,
indexOrd int,
valDirs []encoding.Direction,
traceKV bool,
) errordeleteIndexEntry can get the indexOrd-th index from rh.Indexes and the indexOrd-th entry from rh.indexEntries. But I do see that the update code must pass in entries not in rh.indexEntries, so you'd need a separate version of deleteIndexEntry for that. I'm not sure if that's worth it.
Regardless of the approach, I think we can remove this map, unless I'm missing something.
pkg/sql/row/helper.go, line 318 at r5 (raw file):
if index.UseDeletePreservingEncoding() { if traceKV { log.VEventf(ctx, 2, "Put (delete) %s -> %s", entry.Key, entry.Value.PrettyPrint())
We aren't actually putting entry.Value, so should we remove that from the log?
pkg/sql/row/inserter.go, line 185 at r5 (raw file):
// For determinism, add the entries for the secondary indexes in the same // order as they appear in the helper.
If we remove the map, I think we can revert this change.
pkg/sql/row/updater.go, line 508 at r5 (raw file):
// For determinism, add the entries for the secondary indexes in the same // order as they appear in the helper.
ditto
pkg/sql/rowenc/index_encoding.go, line 1191 at r5 (raw file):
if index.UseDeletePreservingEncoding() { for i := range indexEntries {
nit: This pattern is the same as for secondary indexes. You could make a helper function like wrapIndexEntries to de-duplicate some code.
|
pkg/sql/row/helper.go, line 94 at r1 (raw file):
If there is a partial index that should not be written to, then the number of entries will be less than the number of indexes. One possibility to avoid the map would be to return the list of indexes that entries were encoded for in the same order as the entries. You could reuse this slice to avoid allocations. |
dt
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, @stevendanna, and @tbg)
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I realized that my statement above is false:
The i-th element of secondaryIndexEntries should correspond to the i-th secondary index in rh.Indexes
If there is a partial index that should not be written to, then the number of entries will be less than the number of indexes.
One possibility to avoid the map would be to return the list of indexes that entries were encoded for in the same order as the entries. You could reuse this slice to avoid allocations.
Is the map a problem? we're still reusing it across rows, like we did the slice before, and zero-slicing and resting the slices in it, so do we think it is a substantial increase in allocs?
dt
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, @stevendanna, and @tbg)
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, dt (David Taylor) wrote…
Is the map a problem? we're still reusing it across rows, like we did the slice before, and zero-slicing and resting the slices in it, so do we think it is a substantial increase in allocs?
- s/resting/reusing/
|
pkg/sql/row/helper.go, line 94 at r1 (raw file): Previously, dt (David Taylor) wrote…
My main concern is point updates to a single row, where the reuse is irrelevant. I'd expect allocation of a map and a slice for each index to be more costly than allocation of a single slice, but I don't know how significant that would be. Maybe I'm unnecessarily paranoid? Running this benchmark might give us an idea of the impact. |
74bc56c to
86912ee
Compare
86912ee to
a1eef2b
Compare
rhu713
left a comment
There was a problem hiding this comment.
Okay, I'll work on these tests now.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, @stevendanna, and @tbg)
Previously, mgartner (Marcus Gartner) wrote…
nit: give a short explanation of why delete-preserving index entries are needed, or at least link to the RFC.
Done.
pkg/sql/catalog/descpb/structured.proto, line 566 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
A comment describing what it means when this field is true might be helpful. Consider referencing the RFC
Done.
pkg/sql/catalog/descpb/structured.proto, line 1488 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: explain that when
deletedis true,valuewill benil.
Done.
pkg/sql/catalog/tabledesc/index.go, line 337 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: add a comment to explain what this means, maybe referencing the RFC
Done.
pkg/sql/row/deleter.go, line 187 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
No need to change this, but I'll note that
DeleteIndexRowis only called when truncating an entire index, so I don't think you need to use the helper here. More specifically, this code path is only exercised for truncating interleaved indexes, so we can likely delete it all now anyway, given that we no longer support interleaved indexes. I'll make a note to myself to delete this all soon, so don't worry about it.
Gotcha.
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
My main concern is point updates to a single row, where the reuse is irrelevant. I'd expect allocation of a map and a slice for each index to be more costly than allocation of a single slice, but I don't know how significant that would be. Maybe I'm unnecessarily paranoid? Running this benchmark might give us an idea of the impact.
Here's a comparison between the new changes and master with 10 runs, at first glance there doesn't seem to be too much difference?
name old time/op new time/op delta
SQL/Cockroach/Update/count=1-16 488µs ± 3% 493µs ± 2% ~ (p=0.094 n=9+9)
SQL/Cockroach/Update/count=10-16 667µs ± 3% 673µs ± 5% ~ (p=0.720 n=9+10)
SQL/Cockroach/Update/count=100-16 2.20ms ± 7% 2.13ms ± 4% -3.07% (p=0.019 n=10+10)
SQL/Cockroach/Update/count=1000-16 19.1ms ± 5% 19.1ms ± 4% ~ (p=0.965 n=10+8)
SQL/MultinodeCockroach/Update/count=1-16 942µs ± 3% 958µs ± 3% ~ (p=0.063 n=9+9)
SQL/MultinodeCockroach/Update/count=10-16 1.16ms ± 4% 1.20ms ± 3% +3.87% (p=0.011 n=9+9)
SQL/MultinodeCockroach/Update/count=100-16 2.84ms ± 4% 2.84ms ± 4% ~ (p=0.959 n=8+8)
SQL/MultinodeCockroach/Update/count=1000-16 19.6ms ±11% 19.0ms ±11% ~ (p=0.222 n=9+9) ```
pkg/sql/row/helper.go, line 318 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
We aren't actually putting
entry.Value, so should we remove that from the log?
Done. Removed.
pkg/sql/rowenc/index_encoding.go, line 1191 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: This pattern is the same as for secondary indexes. You could make a helper function like
wrapIndexEntriesto de-duplicate some code.
Done.
mgartner
left a comment
There was a problem hiding this comment.
Looks great! I'll take another look once you add those tests.
Reviewed 1 of 4 files at r5, 1 of 1 files at r6, 4 of 4 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, and @stevendanna)
pkg/sql/delete_preserving_index_test.go, line 102 at r7 (raw file):
if deletePreservingEncoding { // Mutate index descriptor //to use the delete-preserving encoding.
nit: space after // and wrap at the 80-character mark (this probably fits on one line)
pkg/sql/catalog/descpb/structured.proto, line 570 at r7 (raw file):
// This is necessary to preserve the delete history for the MVCC-compatible // index backfiller // `docs/RFCS/20211004_incremental_index_backfiller.md#new-index-encoding-for-deletions-vs-mvcc`
nit: make it clear that if true, entries are not deleted from the index, but remain with the deleted bit set to true.
nit: remove the back ticks `
pkg/sql/catalog/tabledesc/index.go, line 341 at r7 (raw file):
// This is necessary to preserve the delete history for the MVCC-compatible // index backfiller // `docs/RFCS/20211004_incremental_index_backfiller.md#new-index-encoding-for-deletions-vs-mvcc`
ditto: explain delete behavior and remove backticks
pkg/sql/row/helper.go, line 94 at r1 (raw file):
Previously, rhu713 (Rui Hu) wrote…
Here's a comparison between the new changes and master with 10 runs, at first glance there doesn't seem to be too much difference?
name old time/op new time/op delta SQL/Cockroach/Update/count=1-16 488µs ± 3% 493µs ± 2% ~ (p=0.094 n=9+9) SQL/Cockroach/Update/count=10-16 667µs ± 3% 673µs ± 5% ~ (p=0.720 n=9+10) SQL/Cockroach/Update/count=100-16 2.20ms ± 7% 2.13ms ± 4% -3.07% (p=0.019 n=10+10) SQL/Cockroach/Update/count=1000-16 19.1ms ± 5% 19.1ms ± 4% ~ (p=0.965 n=10+8) SQL/MultinodeCockroach/Update/count=1-16 942µs ± 3% 958µs ± 3% ~ (p=0.063 n=9+9) SQL/MultinodeCockroach/Update/count=10-16 1.16ms ± 4% 1.20ms ± 3% +3.87% (p=0.011 n=9+9) SQL/MultinodeCockroach/Update/count=100-16 2.84ms ± 4% 2.84ms ± 4% ~ (p=0.959 n=8+8) SQL/MultinodeCockroach/Update/count=1000-16 19.6ms ±11% 19.0ms ±11% ~ (p=0.222 n=9+9) ```
I agree, this looks fine. Thanks for running the benchmarks!
postamar
left a comment
There was a problem hiding this comment.
I have mostly comments and questions, the only change I'm requesting pertains to the added field in descpb.IndexDescriptor.
| // This is necessary to preserve the delete history for the MVCC-compatible | ||
| // index backfiller | ||
| // `docs/RFCS/20211004_incremental_index_backfiller.md#new-index-encoding-for-deletions-vs-mvcc` | ||
| optional bool use_delete_preserving_encoding = 24 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
Is there a reason why this field needs to be added, vs extending the descpb.IndexDescriptorVersion enum? We already have two fields in this message that specify how indexes are encoded, I'm quite reluctant to add a third unless there's a very good reason. This proto schema is messy enough as it is I don't want to make it worse.
Currently it seems to me that you could achieve the same results by adding:
- an enum value
PrimaryIndexWithDeletePreservingVersionto supersedePrimaryIndexWithStoredColumnsVersion, - an enum value
SecondaryIndexWithDeletePreservingVersionto supersedeStrictIndexColumnIDGuaranteesVersion.
There was a problem hiding this comment.
In the RFC discussion we debated adding a new entry in the format enum vs a separate flag. I argued for new flag, since it seemed independent of the format -- whatever the format it, we can then box that and store it (or the fact it was deleted), so if there are 10 new formats, we don't need 10 new delete-preserving versions of them.
There was a problem hiding this comment.
That's a good-enough reason for me, thanks for clarifying.
There was a problem hiding this comment.
I feel this is worth mentioning in the field's comments. I know there's a link to the RFC but I'm guessing most people will be just as lazy as I was and won't go there.
| // because there would have to be a separate type for each encoding that we | ||
| // already have for indexes. The alternative would get harder to maintain if we | ||
| // added more index encodings in the future. | ||
| message IndexValueWrapper { |
There was a problem hiding this comment.
Why is this in the descpb package? Rather, my real question is, why do we need a protobuf message to wrap an encoded index value at all?
It seems that the same results could be achieved by, say, pre-pending a 0 or a 1 byte to the encoded value bytes, with 0 being a tombstone and 1 signalling an actual value. Conceptually, it's the same thing as what you're proposing here, unless I'm missing something.
Perhaps I'm way off base here but in fact it seems to me that there might be a way to achieve what you want to do without needing to introduce a new encoding format at all, by going deeper into the workings of rowenc into the messy innards of EncodePrimaryIndex and EncodeSecondaryIndex. These all use the same writeColumnValues function to encode non-key column values:
cockroach/pkg/sql/rowenc/index_encoding.go
Lines 1438 to 1461 in 48a8dd8
This admittedly is not my area of expertise but couldn't we pretend that every table has a secret "deletion tombstone" nullable integer column that has a very large, out-of-bounds ID? Marking a row as deleted would be equivalent to having a non-NULL value in that column.
Encoding a deletion tombstone would involve an extra iteration of the for-loop in that function with colIDDiff set to math.MaxUint: column IDs start at 1 and end at math.MaxUint so colIDDiff can in practice never exceed math.MaxUint -1, making this a legal value (from the point of view of the encoding) but impossible for a real column to have.
Similarly, when decoding in DecodeRow we check colIDDiff == math.MaxUint in which case the row is considered deleted.
Is this crazy? If not, implementing this properly would involve a bit of plumbing, however there's no need to modify the index descriptor at all, existing rows all remain valid, etc.
There was a problem hiding this comment.
Hm.
pre-pending a 0 or a 1 byte to the encoded value bytes, with 0 being a tombstone and 1 signalling an actual value
Eh, I don't love rolling out own "pack fields into bytes" encoding when that's exactly what proto the proto will do for us -- prepend the field ID, and then its value, but using a mature, well tested library.
I'm a little wary of the magic extra column, and how closely it starts to tie this to the internals of SQL value encoding. I think I sorta like that we just implement the Put/Del API but take whatever was put, or del'ed, and record it according to our own container, with that fact separated from the details of what was put or del'ed.
There was a problem hiding this comment.
I'm not saying we should pre-pend with magic numbers, I'm saying that we're essentially doing the same thing with the protos. But then I didn't realize there was a desire to keep this wrapper at a different abstraction level. Again, the rationale seems like it's worth explaining in comments.
I trust your judgment about this and defer to you. That being said I'd prefer if this wrapper proto lived somewhere else; as I understand things, it doesn't belong in descpb.
There was a problem hiding this comment.
Yeah, fair, descpb is a little odd since this is row data not descriptor metadata. I think it moved here since @tbg asked for it -- as a sql-layer-specific type -- to not be in roachpb. We might be able sell @tbg on the idea that a message type with more generic name, like "DeleteableValue" that has a generic bytes value field and deleted bit belongs in roachpb, or maybe we could just move the message with its current index-specific name over to execinfrapb?
There was a problem hiding this comment.
we could just move the message with its current index-specific name over to execinfrapb?
Assuming that's where it makes most sense to be, sure. I don't really have an opinion other than descpb is not the right place.
a1eef2b to
c75395d
Compare
rhu713
left a comment
There was a problem hiding this comment.
@mgartner Just added upsert, update, and delete tests for partial, expression, inverted, and multicolumn inverted indices. Lmk if I've missed anything else.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @stevendanna, and @tbg)
pkg/sql/catalog/descpb/structured.proto, line 570 at r7 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: make it clear that if true, entries are not deleted from the index, but remain with the deleted bit set to true.
nit: remove the back ticks `
Done.
pkg/sql/catalog/tabledesc/index.go, line 341 at r7 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
ditto: explain delete behavior and remove backticks
Done.
pkg/sql/row/inserter.go, line 185 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
If we remove the map, I think we can revert this change.
Resolved after running benchmarks.
pkg/sql/row/updater.go, line 508 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
ditto
Resolved after running benchmarks.
b046035 to
07db836
Compare
mgartner
left a comment
There was a problem hiding this comment.
Nice work! The tests look great. I think it's worth adding a test for delete, update, and upsert with a "regular" index too - see my inline comments.
Reviewed 6 of 7 files at r8, 1 of 1 files at r9, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @rhu713, @stevendanna, and @tbg)
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 1 at r9 (raw file):
# LogicTest: local !metamorphic
I think it's worth adding a similar test for just a regular index like CREATE TABLE t (a INT PRIMARY KEY, b INT, c INT, INDEX (b, c)).
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 12 at r9 (raw file):
c STRING, FAMILY (a, b, c), CHECK (b > 0),
nit: remove this check -- I don't think it's necessary.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 17 at r9 (raw file):
let $t_id select id from system.namespace where name='tpi'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 71 at r9 (raw file):
statement ok CREATE INDEX t_a_plus_b_idx ON tei ((a + b))
nit: you should be able to include this index in CREATE TABLE.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 74 at r9 (raw file):
let $t_id select id from system.namespace where name='tei'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 127 at r9 (raw file):
let $t_id select id from system.namespace where name='tii'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 183 at r9 (raw file):
let $t_id select id from system.namespace where name='tmi'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 100 at r9 (raw file):
statement ok SET enable_implicit_select_for_update = true
Some tests with a regular index would be good here too.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 110 at r9 (raw file):
c STRING, FAMILY (a, b, c), CHECK (b > 0),
nit: remove check
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 115 at r9 (raw file):
let $t_id select id from system.namespace where name='tpi'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 195 at r9 (raw file):
statement ok CREATE INDEX t_a_plus_b_idx ON tei ((a + b))
nit: you should be able to move into CREATE TABLE
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 198 at r9 (raw file):
let $t_id select id from system.namespace where name='tei'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 259 at r9 (raw file):
let $t_id select id from system.namespace where name='tii'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 314 at r9 (raw file):
let $t_id select id from system.namespace where name='tmi'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 74 at r9 (raw file):
batch flow coordinator CPut /Table/55/2/2/0 -> /BYTES/0x8a (expecting does not exist) exec stmt execution failed after 0 rows: duplicate key value violates unique constraint "woo"
ditto: add regular index case
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 85 at r9 (raw file):
c STRING, FAMILY (a, b, c), CHECK (b > 0),
nit: remove check
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 90 at r9 (raw file):
let $t_id select id from system.namespace where name='tpi'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 173 at r9 (raw file):
let $t_id select id from system.namespace where name='tei'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 243 at r9 (raw file):
let $t_id select id from system.namespace where name='tii'
nit: capitalize keywords SELECT FROM and WHERE
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 298 at r9 (raw file):
let $t_id select id from system.namespace where name='tmi'
nit: capitalize keywords SELECT FROM and WHERE
7fa72e3 to
3d0a5aa
Compare
rhu713
left a comment
There was a problem hiding this comment.
Done. Added the regular index tests.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt, @mgartner, @postamar, @stevendanna, and @tbg)
pkg/sql/catalog/descpb/structured.proto, line 571 at r7 (raw file):
Previously, postamar (Marius Posta) wrote…
I feel this is worth mentioning in the field's comments. I know there's a link to the RFC but I'm guessing most people will be just as lazy as I was and won't go there.
Done. Added decision reasoning in the comment.
pkg/sql/catalog/descpb/structured.proto, line 1490 at r7 (raw file):
Previously, postamar (Marius Posta) wrote…
we could just move the message with its current index-specific name over to execinfrapb?
Assuming that's where it makes most sense to be, sure. I don't really have an opinion other than
descpbis not the right place.
@dt @postamar I added a new pb directory called rowencpb to the rowenc package, and moved this proto to that. I tried moving it to execinfrapb but this resulted in an import cycle. Please take a look and let me know if this is the right way to resolve this.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 1 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think it's worth adding a similar test for just a regular index like
CREATE TABLE t (a INT PRIMARY KEY, b INT, c INT, INDEX (b, c)).
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 12 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: remove this check -- I don't think it's necessary.
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 17 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 71 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: you should be able to include this index in
CREATE TABLE.
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 74 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 127 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/delete_nonmetamorphic, line 183 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 100 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Some tests with a regular index would be good here too.
Done. Added regular index test.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 110 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: remove check
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 115 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 195 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: you should be able to move into CREATE TABLE
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 198 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 259 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/update_nonmetamorphic, line 314 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 74 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
ditto: add regular index case
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 85 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: remove check
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 90 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 173 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 243 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
pkg/sql/opt/exec/execbuilder/testdata/upsert_nonmetamorphic, line 298 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: capitalize keywords
SELECTFROMandWHERE
Done.
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 10 of 10 files at r10, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @dt, @mgartner, @postamar, @stevendanna, and @tbg)
36f4dc7 to
e2106cb
Compare
…ary index As part of the transition for Bulk Ops to use only MVCC operations, the index backfiller must be rewritten. Part of the rewrite is to introduce a temporary index at the beginning of the backfill encoded with a new delete-preserving index encoding. The delete-preserving encoding results in index values being encoded with an additional bit that indicates whether or not the value has been deleted, which is necessary to preserve the delete history during the backfill. The entire RFC is here: https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md#new-index-encoding-for-deletions-vs-mvcc This patch adds this encoding and uses the encoding for writes to indices configured with this encoding. Release note: None
e2106cb to
26a054c
Compare
|
bors r+ |
|
Build succeeded: |
…ary index
As part of the transition for Bulk Ops to use only MVCC operations, the index
backfiller must be rewritten. Part of the rewrite is to introduce a temporary
index at the beginning of the backfill encoded with a new delete-preserving
index encoding. This patch adds this encoding and uses the encoding for writes
to indices configured with this encoding.
Release note: None