kv: Scans with limit acquire excessively large latches #9521

A query like `SELECT * FROM t WHERE indexed_column > $1 LIMIT 1` declares a very large key Span (from key `$1` to the end of the table or the end of the range, whichever comes first), but really only depends on a small amount of data (from `$1` to the first key greater than that value). The command queue only sees the declared span, so this query must wait behind updates to any other rows in the table, not just the rows that it will eventually return.

We could minimize this contention at the expense of throughput as follows. For read-only commands with (small) limits, execute the command first, before putting it in the command queue. If it reaches its limit, narrow the span based on the keys that were actually touched. Put it in the command queue under the narrowed span. After waiting on the command queue, execute it again. If it doesn't hit the limit while staying inside the narrowed span, something has changed out from under us and we have to re-queue with a broader span. 
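The steps above can be sketched roughly as follows. This is a single-threaded toy model, not the actual kv code: `Span`, `Store`, `scan`, `narrow`, `limitedScan`, and the `wait` hook (standing in for "block in the command queue until overlapping commands finish") are all hypothetical names. Two details are assumptions rather than part of the proposal: the narrowed span ends just past the last returned key (the key with `\x00` appended), and a first execution that does not reach its limit falls back to queueing under the full span.

```go
package main

import (
	"fmt"
	"sort"
)

// Span is a half-open key interval [Start, End). Hypothetical stand-in
// for the real key span type.
type Span struct{ Start, End string }

// Store is a toy key space: just a sorted list of keys.
type Store struct{ Keys []string }

// scan returns up to limit keys of the store that fall inside span.
func scan(s *Store, span Span, limit int) []string {
	var out []string
	i := sort.SearchStrings(s.Keys, span.Start)
	for ; i < len(s.Keys) && s.Keys[i] < span.End && len(out) < limit; i++ {
		out = append(out, s.Keys[i])
	}
	return out
}

// narrow shrinks span so that it ends just past the last key returned.
// Appending "\x00" as the "next key" is an assumption of this sketch.
func narrow(span Span, result []string) Span {
	return Span{Start: span.Start, End: result[len(result)-1] + "\x00"}
}

// limitedScan follows the proposed flow. wait is a hypothetical hook that
// blocks until no conflicting command overlaps the given span.
func limitedScan(s *Store, full Span, limit int, wait func(Span)) []string {
	// 1. Execute optimistically, before entering the command queue.
	first := scan(s, full, limit)
	if limit <= 0 || len(first) < limit {
		// Assumed fallback: the limit was not reached, so the scan
		// really depends on the whole span.
		wait(full)
		return scan(s, full, limit)
	}
	// 2. Queue under the span that was actually touched.
	narrowed := narrow(full, first)
	wait(narrowed)
	// 3. Execute again, staying inside the narrowed span.
	second := scan(s, narrowed, limit)
	if len(second) == limit {
		return second
	}
	// 4. The data changed underneath us: re-queue with the broad span.
	wait(full)
	return scan(s, full, limit)
}

func main() {
	store := &Store{Keys: []string{"a", "c", "e", "g"}}
	var waits []Span
	got := limitedScan(store, Span{Start: "b", End: "z"}, 1, func(sp Span) {
		waits = append(waits, sp) // record what we would have queued under
	})
	fmt.Printf("result=%q waits=%d\n", got, len(waits))
}
```

In the common case the scan executes twice, which is the throughput cost mentioned above; in exchange it only ever waits on the narrowed span.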

There is probably a way to be clever and avoid the double execution in the common case, e.g. if the command queue allows the narrowed span to execute immediately we can use the results of the first execution. 
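One way this shortcut might look, again as a toy model with hypothetical names (`CommandQueue`, `Blocked`, `tryFastPath`): if nothing in the queue overlaps the narrowed span, return the results of the first execution directly. The sketch is single-threaded and deliberately ignores the window in which a conflicting command could start or finish between the first execution and the queue check; a real implementation would have to close that gap.

```go
package main

import (
	"fmt"
	"sort"
)

// Span is a half-open key interval [Start, End). Hypothetical stand-in
// for the real key span type.
type Span struct{ Start, End string }

// Overlaps reports whether two half-open spans intersect.
func (s Span) Overlaps(o Span) bool { return s.Start < o.End && o.Start < s.End }

// CommandQueue is a toy model holding the spans of in-flight writes.
type CommandQueue struct{ Pending []Span }

// Blocked reports whether a read over span would have to wait.
func (q *CommandQueue) Blocked(span Span) bool {
	for _, p := range q.Pending {
		if p.Overlaps(span) {
			return true
		}
	}
	return false
}

// scan returns up to limit keys from the sorted slice that fall in span.
func scan(keys []string, span Span, limit int) []string {
	var out []string
	i := sort.SearchStrings(keys, span.Start)
	for ; i < len(keys) && keys[i] < span.End && len(out) < limit; i++ {
		out = append(out, keys[i])
	}
	return out
}

// tryFastPath executes once and reuses that result if the narrowed span
// could run immediately. ok == false means the caller must take the slow
// path (wait in the queue, then execute again).
func tryFastPath(keys []string, q *CommandQueue, full Span, limit int) (result []string, ok bool) {
	first := scan(keys, full, limit)
	if limit <= 0 || len(first) < limit {
		return nil, false // limit not reached, so there is nothing to narrow
	}
	// Assumption: the narrowed span ends just past the last returned key.
	narrowed := Span{Start: full.Start, End: first[len(first)-1] + "\x00"}
	if q.Blocked(narrowed) {
		return nil, false
	}
	return first, true
}

func main() {
	keys := []string{"a", "c", "e", "g"}
	// An in-flight write to "g" overlaps the full span [b, z) but not
	// the narrowed span that ends just after "c".
	q := &CommandQueue{Pending: []Span{{Start: "f", End: "h"}}}
	res, ok := tryFastPath(keys, q, Span{Start: "b", End: "z"}, 1)
	fmt.Println(res, ok) // prints: [c] true
}
```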

This is a major cause of slowness for `photos` (#9247). For example, [this trace](https://app.lightstep.com/gamma/trace?span_guid=391b115dc203c1c7&at_micros=1474689136990626#span-391b115dc203c1c7) spends 100ms in the command queue on its first scan. There is probably a related problem with the timestamp cache, but I haven't confirmed its existence or impact yet. 