storage: benchmark and optimize ComputeStatsForRange #84544
Description
As with other reads, ComputeStatsForRange now has to take MVCC range tombstones into account, which comes with a performance penalty. Many of the optimizations in #82559 and #83049 will also apply here, but we should see whether there are additional optimizations we can make.
One optimization that comes to mind is to construct the iterator internally, with appropriate bounds and options. Setting precise bounds would avoid a key comparison as described in #41899, yielding a 30% improvement. However, we sometimes need to compute stats for SSTs (using e.g. NewPebbleMemSSTIterator), which might preclude constructing the iterator internally.
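To make the bounds idea concrete, here is a minimal toy sketch in Go. It does not use the CockroachDB or Pebble APIs: `kv`, `stats`, `boundedIter`, `statsUnbounded`, and `statsBounded` are hypothetical names invented for illustration, and a sorted slice stands in for the engine. The first function compares every key against the end key in the caller's loop; the second constructs the iterator internally with both bounds, so the loop only checks validity.

```go
package main

import (
	"bytes"
	"sort"
)

// kv is a toy key/value pair standing in for an engine entry (hypothetical).
type kv struct {
	key, value []byte
}

// stats is a toy stand-in for MVCC stats: entry count and total bytes.
type stats struct {
	count int
	bytes int
}

// statsUnbounded seeks to start and then compares every key against end
// inside the caller's loop. This is the per-key comparison to avoid.
func statsUnbounded(sorted []kv, start, end []byte) stats {
	var s stats
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i].key, start) >= 0
	})
	for ; i < len(sorted); i++ {
		if bytes.Compare(sorted[i].key, end) >= 0 { // per-key bounds check
			break
		}
		s.count++
		s.bytes += len(sorted[i].key) + len(sorted[i].value)
	}
	return s
}

// boundedIter is a toy iterator that only ever sees keys in [lower, upper).
type boundedIter struct {
	data []kv
	pos  int
}

// newBoundedIter resolves both bounds once, at construction time.
func newBoundedIter(sorted []kv, lower, upper []byte) *boundedIter {
	lo := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i].key, lower) >= 0
	})
	hi := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i].key, upper) >= 0
	})
	if hi < lo { // inverted bounds yield an empty iterator
		hi = lo
	}
	return &boundedIter{data: sorted[lo:hi]}
}

func (it *boundedIter) Valid() bool { return it.pos < len(it.data) }
func (it *boundedIter) Next()       { it.pos++ }
func (it *boundedIter) Entry() kv   { return it.data[it.pos] }

// statsBounded builds the iterator internally with precise bounds, so the
// loop body needs no key comparison at all.
func statsBounded(sorted []kv, start, end []byte) stats {
	var s stats
	for it := newBoundedIter(sorted, start, end); it.Valid(); it.Next() {
		e := it.Entry()
		s.count++
		s.bytes += len(e.key) + len(e.value)
	}
	return s
}
```

Both functions return the same result for the same range; the difference is only where the bounds are enforced. A real implementation would also have to handle the SST case mentioned above, where the caller supplies the iterator.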
We should also get rid of Engine.ComputeStats() while we're at it, and migrate all callers to use ComputeStatsForRange. However, one benefit of Engine.ComputeStats() is that it respects spanset assertions. We should instead consider forcing MVCC iterators to specify both a lower and upper bound, and do spanset assertions on NewMVCCIterator instead.
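A rough sketch of what asserting spans at iterator construction could look like, again as a self-contained toy in Go. None of these names come from the CockroachDB codebase: `span`, `spanSet`, `iterOptions`, `reader`, and `checkIterBounds` are hypothetical, and the sketch assumes a nil bound means "unset".

```go
package main

import (
	"bytes"
	"errors"
)

// span is a toy half-open key span [key, endKey) (hypothetical type).
type span struct {
	key, endKey []byte
}

// spanSet is the set of spans a reader has declared it will access.
type spanSet []span

// contains reports whether [lower, upper) lies within a single declared span.
func (ss spanSet) contains(lower, upper []byte) bool {
	for _, s := range ss {
		if bytes.Compare(s.key, lower) <= 0 && bytes.Compare(upper, s.endKey) <= 0 {
			return true
		}
	}
	return false
}

// iterOptions mirrors the idea of iterator options with mandatory bounds.
type iterOptions struct {
	lowerBound, upperBound []byte
}

// reader is a toy wrapper that remembers the declared spans.
type reader struct {
	declared spanSet
}

var (
	errMissingBounds = errors.New("iterator must set both lower and upper bound")
	errOutsideSpans  = errors.New("iterator bounds fall outside declared spans")
)

// checkIterBounds is the assertion that would run once, when the iterator is
// created, instead of on every key access.
func (r reader) checkIterBounds(opts iterOptions) error {
	if opts.lowerBound == nil || opts.upperBound == nil {
		return errMissingBounds
	}
	if !r.declared.contains(opts.lowerBound, opts.upperBound) {
		return errOutsideSpans
	}
	return nil
}
```

With both bounds mandatory, the check covers everything the iterator can ever return, so callers migrating from Engine.ComputeStats() would keep the same span protection.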
This needs to be benchmarked against release-22.1.
Jira issue: CRDB-17722
Epic CRDB-2624