
colexec: poor performance of TopK sort when BYTES-like vectors are present #75536

@yuzefovich

Description


TopK sort works by maintaining a max-heap of K elements, which requires Set operations at arbitrary positions on the vectors. Our flat-bytes implementation of BYTES-like types doesn't support arbitrary Sets, so the TopK sort has to call CopySlice, which involves copying and memory movement. In other words, Set on coldata.Bytes is not a constant-time operation but linear in the length of the data. In the worst case, the top K sort performs an arbitrary Set for each input tuple, so the complexity of the TopK algorithm actually becomes worse than that of the general sort.
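To make the cost concrete, here is a minimal Go sketch of a flat-bytes layout. This is a simplified, hypothetical model (the names and the real coldata.Bytes API differ): all elements share one contiguous buffer, and a Set that changes an element's length has to shift every byte after it and fix up the offsets, which is what makes Set linear rather than constant-time.

```go
package main

import "fmt"

// flatBytes is a simplified model of a flat-bytes vector: all elements live
// contiguously in one buffer, with offsets[i] marking where element i starts.
// (Hypothetical sketch, not the actual coldata.Bytes implementation.)
type flatBytes struct {
	buf     []byte
	offsets []int // len(offsets) == number of elements + 1
}

func (b *flatBytes) get(i int) []byte {
	return b.buf[b.offsets[i]:b.offsets[i+1]]
}

// set replaces element i. If the new value has a different length, every
// byte after element i must be shifted and every later offset adjusted,
// so the cost is linear in the size of the tail, not constant.
func (b *flatBytes) set(i int, v []byte) {
	old := b.offsets[i+1] - b.offsets[i]
	delta := len(v) - old
	tail := append([]byte(nil), b.buf[b.offsets[i+1]:]...) // copy the tail
	b.buf = append(b.buf[:b.offsets[i]], v...)
	b.buf = append(b.buf, tail...) // O(bytes after element i)
	for j := i + 1; j < len(b.offsets); j++ {
		b.offsets[j] += delta
	}
}

func main() {
	b := &flatBytes{buf: []byte("foobarbaz"), offsets: []int{0, 3, 6, 9}}
	b.set(1, []byte("hello")) // shifts "baz" and rewrites the offsets
	fmt.Printf("%s %s %s\n", b.get(0), b.get(1), b.get(2))
}
```

A heap-based TopK does exactly this kind of arbitrary-position Set on every heap replacement, which is why the per-tuple cost blows up on flat bytes.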

Here is a concrete example. First, a TopK sort that takes over 5 minutes on my laptop:

[screenshot: Screen Shot 2022-01-25 at 9.40.07 PM — TopK sort taking over 5 minutes]

Next, I modified execplan.go to use the general sort for the same query; it now takes only 1.4 seconds, and the sort actually spilled to disk:

[screenshot: Screen Shot 2022-01-25 at 9.40.14 PM — general sort finishing in 1.4 seconds]

Some ideas on how to address this issue:

  • We could refactor the way coldata.Bytes is implemented. For example, we could take inspiration from https://github.com/facebookincubator/velox. There, string elements are stored out of order: each element records its length and a pointer to its start in the shared string buffer. Additionally, a 4-byte prefix of the string is stored inline, and strings up to 12 bytes are stored fully inline in this “metadata” object. Such a representation easily supports Sets at arbitrary indices. The disadvantage of this approach is that serialization/deserialization in the Arrow format is no longer cheap.
  • We could have a second implementation of the Bytes vector backed by [][]byte. This representation would only be used by the TopK sort internally during its execution, and then we would copy the result into coldata.Bytes. It might be a bit annoying because we'll need to special-case the behavior in the TopK, but overall it should be tolerable. This is my preferred approach.
  • We could change the planning to not use the TopK sort when the input types include a BYTES-like vector AND the limit is relatively large (say, over 100), falling back to the general sort. We might have to make a corresponding change in the optimizer as well to better cost the plans. This approach seems like a step back, so I'd only consider it for a backport if needed (I think on 21.2 and prior we didn't have support for TopK in the optimizer).
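For the second idea above, here is a rough Go sketch of what a [][]byte-backed vector could look like (the names here are hypothetical, not the actual coldata API): Set becomes a constant-time slice-header swap, and the result is flattened back into a contiguous buffer once the sort finishes.

```go
package main

import "fmt"

// sliceBytes is a hypothetical alternative representation: one Go slice per
// element. Set at an arbitrary index is a constant-time pointer swap, which
// is exactly what a heap-based TopK needs; the trade-off is extra allocations
// plus one copy back into the flat representation at the end.
type sliceBytes struct {
	elems [][]byte
}

func (b *sliceBytes) set(i int, v []byte) { b.elems[i] = v } // O(1), no shifting
func (b *sliceBytes) get(i int) []byte    { return b.elems[i] }

// flatten copies the elements into a single contiguous buffer with offsets,
// mimicking the final conversion back into a flat-bytes vector.
func (b *sliceBytes) flatten() ([]byte, []int) {
	offsets := make([]int, 1, len(b.elems)+1)
	var buf []byte
	for _, e := range b.elems {
		buf = append(buf, e...)
		offsets = append(offsets, len(buf))
	}
	return buf, offsets
}

func main() {
	b := &sliceBytes{elems: [][]byte{[]byte("foo"), []byte("bar")}}
	b.set(0, []byte("quux")) // constant-time, regardless of length change
	buf, offsets := b.flatten()
	fmt.Println(string(buf), offsets)
}
```

The one-time O(n) flatten at the end is cheap compared to performing a linear-cost Set per input tuple during the heap phase.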

Jira issue: CRDB-12714

Metadata

Labels

  • C-cleanup: Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
  • C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
  • T-sql-queries: SQL Queries Team

Status

Done
