storage: track iteration stats per engine.Iterator

For better observability into the behavior of a query, we should track detailed stats on iterator behavior and log them to the query trace. This will help when diagnosing slow queries. Using node profiles, as we did when diagnosing slow queries caused by garbage in the LSM (https://github.com/cockroachlabs/support/issues/710#issuecomment-737607669), is not an effective technique in general.
Also, as we increase the minimum size of a replica, different parts of the key space within a replica may vary more, so replica-level `MVCCStats` may not be a good enough indicator of the MVCC garbage encountered by a query.

These stats can also be aggregated per replica and serve as an indicator of replica read load.

This will require changes in Pebble too, since we don't currently track stats in the iterator stack, which consists of `Iterator`, `mergingIter`, and `levelIter`.
Some examples of counts that would be useful:
- Obsolete MVCC versions encountered (measured in the `storage.pebbleIterator`).
- Failed nexts before a seek (in `pebbleMVCCScanner`).
- Obsolete Pebble versions.
- Key-value pairs iterated over by Pebble, and the total bytes (key+value) iterated over, even if the value was not read.
- sstable blocks loaded.
- `pebble.Iterator` seeks and file level seeks.
- Number of nexts done by `mergingIter`. This can help find instances of https://github.com/cockroachdb/pebble/issues/1070
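
To make the list above concrete, here is a minimal sketch of what a per-iterator stats struct could look like. Everything in it is an assumption for illustration: `IterStats`, its field names, `RecordKV`, and `Add` are hypothetical and do not correspond to an existing Pebble or CockroachDB API.

```go
package main

import "fmt"

// IterStats is a hypothetical set of per-iterator counters mirroring the
// list above. The names are illustrative, not an existing API.
type IterStats struct {
	ObsoleteMVCCVersions   uint64 // obsolete MVCC versions encountered
	FailedNextsBeforeSeek  uint64 // nexts attempted before falling back to a seek
	ObsoletePebbleVersions uint64 // obsolete Pebble versions encountered
	KeyValuePairs          uint64 // key-value pairs iterated over by Pebble
	KeyValueBytes          uint64 // key+value bytes iterated over
	BlocksLoaded           uint64 // sstable blocks loaded
	IteratorSeeks          uint64 // top-level iterator seeks
	LevelSeeks             uint64 // file-level seeks
	MergingIterNexts       uint64 // nexts done by the merging iterator
}

// RecordKV counts one key-value pair that the iterator stepped over.
// The bytes are counted even if the caller never reads the value.
func (s *IterStats) RecordKV(keyLen, valueLen int, obsolete bool) {
	s.KeyValuePairs++
	s.KeyValueBytes += uint64(keyLen + valueLen)
	if obsolete {
		s.ObsoletePebbleVersions++
	}
}

// Add folds other into s, for example to aggregate the stats of many
// iterators into a per-replica total.
func (s *IterStats) Add(other IterStats) {
	s.ObsoleteMVCCVersions += other.ObsoleteMVCCVersions
	s.FailedNextsBeforeSeek += other.FailedNextsBeforeSeek
	s.ObsoletePebbleVersions += other.ObsoletePebbleVersions
	s.KeyValuePairs += other.KeyValuePairs
	s.KeyValueBytes += other.KeyValueBytes
	s.BlocksLoaded += other.BlocksLoaded
	s.IteratorSeeks += other.IteratorSeeks
	s.LevelSeeks += other.LevelSeeks
	s.MergingIterNexts += other.MergingIterNexts
}

// String renders a compact summary suitable for a query trace message.
func (s IterStats) String() string {
	return fmt.Sprintf(
		"kvs=%d bytes=%d obsolete(mvcc=%d pebble=%d) blocks=%d seeks(iter=%d level=%d) nexts(merging=%d failed=%d)",
		s.KeyValuePairs, s.KeyValueBytes,
		s.ObsoleteMVCCVersions, s.ObsoletePebbleVersions,
		s.BlocksLoaded, s.IteratorSeeks, s.LevelSeeks,
		s.MergingIterNexts, s.FailedNextsBeforeSeek,
	)
}
```

In this sketch each iterator would own one `IterStats` value, log its `String()` form to the trace when the iterator is closed, and pass the value to a per-replica accumulator via `Add`. Plain value types keep the hot path free of allocations. A real implementation would also need to decide how to handle concurrent aggregation.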


Jira issue: CRDB-3336
