Skip to content

sql: Statement statistics should be persisted #56219

@andy-kimball

Description

@andy-kimball

Today, Statement statistics (e.g. count, retries, timings grouped by statement fingerprint) are accumulated in memory. When the AdminUI calls the Statement API on the gateway CRDB node, it will fan-out the call to all other CRDB nodes. Every node will return its in-memory statement statistics, and the gateway CRDB node will merge together the results and return them. At no time are these statement statistics persisted to a table.

There are several problems with this behavior:

  1. When CRDB nodes are upgraded or restart, they lose their accumulated statement statistics. This can cause unexpected results to be shown to the user that are missing key bits of information.
  2. Statement statistics are periodically cleared on the server, meaning that users can only get a limited history of statement statistics. Customers have repeatedly asked for a longer history, so they can examine trends going back hours or days.
  3. This design does not work well with multi-tenant clusters, since pure-SQL nodes do not know about one another, and will be brought up/down on a more frequent basis than single-tenant CRDB nodes.

One idea to address these problems is to periodically persist statement statistics to system tables that can later be queried by the user using SQL (e.g. similar to the "replication reports" system tables). This gives a rich query-ability experience and should scale up to large amounts of data. We could implement a configurable retention period / data volume to prevent runaway storage usage. Once we had this, the Statements API would simply query this table using SQL. Or the user could query this table directly using SQL.

One downside to the suggested approach is that data would not be as "fresh", since it would only be persisted periodically (say once every 30 or 60 seconds). If we need fresher data in the AdminUI, we could still do a fan-out that triggers persistence of the latest in-memory data (i.e. a "flush" operation).

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-sql-cli-observabilityIssues related to surfacing SQL observability in SHOW, CRDB_INTERNAL, SYSTEM, etc.A-sql-executionRelating to SQL execution.A-sql-observabilityRelated to observability of the SQL layerC-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions