execinfra: prefer leased table descriptors#74722
execinfra: prefer leased table descriptors#74722craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
| } | ||
| if ctx != nil && ctx.Descriptors != nil { | ||
| leased, _ := ctx.Descriptors.GetLeasedTableByID(ctx.EvalCtx.Ctx(), ctx.Txn, desc.ID) | ||
| if leased != nil && leased.GetVersion() == desc.Version { |
There was a problem hiding this comment.
Is checking the version even necessary at all?
There was a problem hiding this comment.
I believe that it is, yes. Or, at least, I certainly don't want to have to convince myself it's not. The alternative is that the descriptor is either newer or older.
It's worth noting that much hand-wringing was done to convince ourselves that it's safe to use enums that are out of sync because the values are always there if they're going to be used. That made me uneasy at the time. I worry about more complex cross-descriptor interactions.
My preference in the longer term is to encode id and version pairs for all needed descriptors and then make it cheap to retrieve them in one go. Generally they'll be cached. That way we'd have nice invariants that all of the exact descriptors matched across machines.
The edge case is if the descriptor hasn't been committed yet. In that case we'll still need to serialize it all. That fact has always made me sort of sad.
There was a problem hiding this comment.
Thanks for this information, I was also wondering about the enums. I agree with your general assessment.
|
1cd6d1e to
cefc87c
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Awesome! 🚀
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)
pkg/sql/colfetcher/colbatch_scan.go, line 203 at r2 (raw file):
limitHint := rowinfra.RowLimit(execinfra.LimitHint(spec.LimitHint, post)) // TODO(ajwerner): The need to construct an immutable here
This TODO can now be removed, right?
pkg/sql/colfetcher/index_join.go, line 409 at r2 (raw file):
} // TODO(ajwerner): The need to construct an immutable here
ditto
|
Yes, good catch on those TODOs, they flew under my nose. I'm still messing around with this, btw, and I'll keep the microbenchmark results updated. |
This commit makes better use of the table descriptor protos in the
DistSQL specs by first checking if they've already been leased. If so,
then we use those instead of re-generating catalog.TableDescriptor
instances.
This has a statistically-significant impact on memory allocations, as
illustrated with this microbenchmark which I ran on my development
machine:
name old time/op new time/op delta
KV/Scan/SQL/rows=1-16 190µs ± 7% 184µs ± 4% -3.60% (p=0.016 n=10+8)
KV/Scan/SQL/rows=10-16 196µs ± 4% 198µs ±12% ~ (p=0.762 n=8+10)
name old alloc/op new alloc/op delta
KV/Scan/SQL/rows=1-16 19.5kB ± 1% 17.4kB ± 1% -11.12% (p=0.000 n=9+10)
KV/Scan/SQL/rows=10-16 21.0kB ± 1% 18.9kB ± 1% -10.20% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
KV/Scan/SQL/rows=1-16 222 ± 0% 210 ± 1% -5.59% (p=0.000 n=7+10)
KV/Scan/SQL/rows=10-16 256 ± 0% 244 ± 0% -4.84% (p=0.000 n=10+7)
This change opens us to the possibility of no longer shipping the whole
table descriptor proto in the DistSQL spec.
Release note: None
cefc87c to
b4e85a5
Compare
|
I made only very minor changes since the reviews: removed the TODOs and removed an unnecessary cast that the linter was complaining about. I'll merge as soon as this passes CI. This change now opens the door to replacing the whole table descriptor proto in the specs with an id+version pair. This ought to have quite an effect in reducing the footprint of proto de/serialization! This requires some version gating and improvements to the catalog API. The latter (batching + retrieval by version in addition to by ID) is stuff that Schema needs to do anyway at some point. |
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)
|
Thanks for the reviews! bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
|
Very nice! Does this resolve #74352 on its own, or would you also like to address the "replacing the whole table descriptor proto in the specs with an id+version pair" point first? |
|
Thanks! No, I think it only resolves part of it. Indeed, the other part would be not shipping around the whole table descriptor proto in the first place, which is a lot more work than I'm prepared to do right now. |
This commit makes better use of the table descriptor protos in the
DistSQL specs by first checking if they've already been leased. If so,
then we use those instead of re-generating catalog.TableDescriptor
instances.
Release note: None