
colexec: poor performance of TopK sort when BYTES-like vectors are present #75536

@yuzefovich

Description


TopK sort works by maintaining a max-heap of K elements, which requires Set operations at arbitrary positions on the vectors. Our flat-bytes implementation of BYTES-like types doesn't support arbitrary Sets, so the TopK sort has to call CopySlice, which involves copying and memory movement. In other words, Set on coldata.Bytes is not a constant-time operation but linear in the length of the data. In the worst case, the top K sort performs an arbitrary Set for each input tuple, so the complexity of the TopK algorithm actually becomes worse than that of the general sort.
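To make the cost concrete, here is a minimal Go sketch of a flat-bytes layout. This is a simplified, hypothetical model (the names and the real coldata.Bytes API differ): all elements share one contiguous buffer, and a Set that changes an element's length has to shift every byte after it and fix up the offsets, which is what makes Set linear rather than constant-time.

```go
package main

import "fmt"

// flatBytes is a simplified model of a flat-bytes vector: all elements live
// contiguously in one buffer, with offsets[i] marking where element i starts.
// (Hypothetical sketch, not the actual coldata.Bytes implementation.)
type flatBytes struct {
	buf     []byte
	offsets []int // len(offsets) == number of elements + 1
}

func (b *flatBytes) get(i int) []byte {
	return b.buf[b.offsets[i]:b.offsets[i+1]]
}

// set replaces element i. If the new value has a different length, every
// byte after element i must be shifted and every later offset adjusted,
// so the cost is linear in the size of the tail, not constant.
func (b *flatBytes) set(i int, v []byte) {
	old := b.offsets[i+1] - b.offsets[i]
	delta := len(v) - old
	tail := append([]byte(nil), b.buf[b.offsets[i+1]:]...) // copy the tail
	b.buf = append(b.buf[:b.offsets[i]], v...)
	b.buf = append(b.buf, tail...) // O(bytes after element i)
	for j := i + 1; j < len(b.offsets); j++ {
		b.offsets[j] += delta
	}
}

func main() {
	b := &flatBytes{buf: []byte("foobarbaz"), offsets: []int{0, 3, 6, 9}}
	b.set(1, []byte("hello")) // shifts "baz" and rewrites the offsets
	fmt.Printf("%s %s %s\n", b.get(0), b.get(1), b.get(2))
}
```

A heap-based TopK does exactly this kind of arbitrary-position Set on every heap replacement, which is why the per-tuple cost blows up on flat bytes.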

Here is a concrete example. First, a TopK sort that takes over 5 minutes on my laptop:

[screenshot: Screen Shot 2022-01-25 at 9.40.07 PM — TopK sort taking over 5 minutes]

Next, I modified execplan.go to use the general sort for the same query; it now takes only 1.4 seconds, and the sort actually spilled to disk:

[screenshot: Screen Shot 2022-01-25 at 9.40.14 PM — general sort finishing in 1.4 seconds]

Some ideas on how to address this issue:

  • We could refactor the way coldata.Bytes is implemented. For example, we could take inspiration from https://github.com/facebookincubator/velox. There, string elements are stored out of order: each element records its length and a pointer to its start in the shared string buffer. Additionally, a 4-byte prefix of the string is stored inline, and strings up to 12 bytes are stored fully inline in this “metadata” object. Such a representation easily supports Sets at arbitrary indices. The disadvantage of this approach is that serialization/deserialization in the Arrow format is no longer cheap.
  • We could have a second implementation of the Bytes vector backed by [][]byte. This representation would only be used by the TopK sort internally during its execution, and then we would copy the result into coldata.Bytes. It might be a bit annoying because we'll need to special-case the behavior in the TopK, but overall it should be tolerable. This is my preferred approach.
  • We could change the planning to not use the TopK sort when the input types include a BYTES-like vector AND the limit is relatively large (say, over 100), falling back to the general sort. We might have to make a corresponding change in the optimizer as well to better cost the plans. This approach seems like a step back, so I'd only consider it for a backport if needed (I think on 21.2 and prior we didn't have support for TopK in the optimizer).
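For the second idea above, here is a rough Go sketch of what a [][]byte-backed vector could look like (the names here are hypothetical, not the actual coldata API): Set becomes a constant-time slice-header swap, and the result is flattened back into a contiguous buffer once the sort finishes.

```go
package main

import "fmt"

// sliceBytes is a hypothetical alternative representation: one Go slice per
// element. Set at an arbitrary index is a constant-time pointer swap, which
// is exactly what a heap-based TopK needs; the trade-off is extra allocations
// plus one copy back into the flat representation at the end.
type sliceBytes struct {
	elems [][]byte
}

func (b *sliceBytes) set(i int, v []byte) { b.elems[i] = v } // O(1), no shifting
func (b *sliceBytes) get(i int) []byte    { return b.elems[i] }

// flatten copies the elements into a single contiguous buffer with offsets,
// mimicking the final conversion back into a flat-bytes vector.
func (b *sliceBytes) flatten() ([]byte, []int) {
	offsets := make([]int, 1, len(b.elems)+1)
	var buf []byte
	for _, e := range b.elems {
		buf = append(buf, e...)
		offsets = append(offsets, len(buf))
	}
	return buf, offsets
}

func main() {
	b := &sliceBytes{elems: [][]byte{[]byte("foo"), []byte("bar")}}
	b.set(0, []byte("quux")) // constant-time, regardless of length change
	buf, offsets := b.flatten()
	fmt.Println(string(buf), offsets)
}
```

The one-time O(n) flatten at the end is cheap compared to performing a linear-cost Set per input tuple during the heap phase.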

Jira issue: CRDB-12714

Metadata

Labels

  • C-cleanup: Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
  • C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
  • T-sql-queries: SQL Queries Team

Status

Done
