sql: implement contention registry and expose through StatusServer and virtual table #57114
Description
One of SQL Execution's 21.1 goals is to surface global KV contention information. To do this, each gateway node will have to keep track of contention events for its queries.
The exact view is still pending design, but we can start work on the backend given the general idea. This "global" view should also be queryable via a virtual table. This kind of API is reminiscent of other features like SHOW QUERIES, where each server has a local view of its queries, and there is a cluster-level API to query the global state through both the DB Console and the SQL shell.
The proposal in this case is to keep an in-memory contention registry keyed by table ID/index ID pair that is updated by the DistSQL receiver whenever it receives contention events. When a StatusServer receives a request for a global contention view, it will broadcast a contention request to every node for its local contention view and merge the responses with its own local view.
Diving a bit deeper, what should this contention registry look like? It depends on the questions we want to answer. At a high level, we want to be able to answer which tables are experiencing contention and allow the user to dive deeper into a table to understand what key/row range is experiencing contention and which transactions are responsible for this contention. Therefore, the proposal is a top-level map keyed by a table ID/index ID with a value struct that looks something like:
```go
type tableContentionInfo struct {
	// contentionEvents counts contention events observed for this index.
	contentionEvents uint64
	// cumulativeContentionTime sums the duration of those events.
	cumulativeContentionTime time.Duration
	// orderedKeyContentionInfo breaks contention down by key.
	orderedKeyContentionInfo []keyContentionInfo
}

type keyContentionInfo struct {
	key roachpb.Key
	// contendingTransactions lists the transactions responsible for
	// contention on this key.
	contendingTransactions []roachpb.Transaction
}
```
This map cannot grow unboundedly, so an eviction policy needs to be put into place. As a simple start, we can keep track of the timestamp of the last contention event for a given table and use an LRU policy with a maximum size for the map. Within a given map entry, we could similarly apply an LRU policy to keys.
Merging these structs should be relatively straightforward.