colexec: improve sort operator a bit#67638
Conversation
Previously, we were missing memory accounting around some slices allocated internally by the sort operator which is now added. Additionally, the references to those slices are now kept by the operator which will be useful when the sorter is used by the external sorter. Additionally, this commit eliminates some bounds checks. Release note: None
|
@yuzefovich where are the assertion comments omitted? I probably have enough context to fix it fairly easily. |
|
I think the best way to see them is through the web browser on github.com, if you look at |
|
As Drew correctly pointed out, the gcassert comments are always added into the generated code wherever appropriate, but sometimes there are useless I also ran the microbenchmarks for the external sort, and they didn't show much of change (we're only using int64 though). |
There was a problem hiding this comment.
I think we're missing a BCE here - colBCE should be true.
DrewKimball
left a comment
There was a problem hiding this comment.
After my two comments are resolved, LGTM.
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/sort.go, line 419 at r1 (raw file):
// that were true in the distinct vector. sizeBefore := memsize.Int * int64(cap(p.scratch.partitions)) p.scratch.partitions = boolVecToSel64(partitionsCol, p.scratch.partitions[:0])
Not related to your change, but why is it called boolVecToSel64 if []int is returned. I'd expect []int64 to be returned.
This is achieved by templating the inlined execgen functions. Note that in some cases this templating leaves redundant `_ = true` lines, but figuring out what's up with that is left as a TODO. Additionally, this commit removes some redundant attempts at achieving BCE for non-sliceable vectors (which had a negative impact in case of JSONs) and clarifies the name of a utility function. Release note: None
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)
pkg/sql/colexec/sort.go, line 419 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Not related to your change, but why is it called
boolVecToSel64if[]intis returned. I'd expect[]int64to be returned.
The selection vector used to be []int64 but it was refactored a couple of releases ago. Fixed.
pkg/sql/colexec/colexecbase/distinct_tmpl.go, line 273 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I think we're missing a BCE here -
colBCEshould be true.
No, I don't think so - actually, we're accessing col with checkIdx == order[outputIdx], and the compiler cannot prove that values in order slice are in bounds. Removed the redundant attempt at BCE for it.
pkg/sql/colexec/colexecbase/distinct_tmpl.go, line 279 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
ditto
Done.
|
TFTRs! bors r+ |
|
Build succeeded: |
colexec: improve sort operator a bit
Previously, we were missing memory accounting around some slices
allocated internally by the sort operator which is now added.
Additionally, the references to those slices are now kept by the
operator which will be useful when the sorter is used by the external
sorter.
Additionally, this commit eliminates some bounds checks.
Release note: None
colexec: add some BCE assertions for distinct and sort
This is achieved by templating the inlined execgen functions. Note that
in some cases this templating leaves redundant
_ = truelines, butfiguring out what's up with that is left as a TODO.
Additionally, this commit removes some redundant attempts at achieving
BCE for non-sliceable vectors (which had a negative impact in case of
JSONs) and clarifies the name of a utility function.
Release note: None