reporting: capture scrubbed information about queries and schemas

To gather information to guide work on optimizations, and to track which features are used, we want to receive more information about SQL queries from production clusters.

Here's what it looks like:

- reporting is enabled by default; users can opt out
- each server pushes information to our registration server
- what information is pushed: SQL query distribution & database schemas.
  - from every node in the cluster
  - at a predefined interval (say, once per day) this info is encoded, compressed, sent to our reg server, then cleared on the node
- regarding "SQL query distribution": during execution, each node maintains internally (in memory) a map from "query structure" (see below for definition) to a tuple
    - tuple: **(count, first/last latency quantiles, number of rows, error status)**
    - count = how many times that structure was encountered
    - first latency = from the point the query is received on pgwire to the point the first result is returned
    - last latency = from the point the query is received on pgwire to the point the last result is returned
- **SQL query structure:** either the SQL query syntax or the logical plan (or both) is used as the key in the distribution map defined above, with a twist:
   - all SQL Datums are replaced by placeholders
   - all db/table/column names are replaced by their numeric IDs
   (We work under the assumption that applications always reuse a small set of query structures, with large variability in the constant parameters)
- **database schemas** are also reported, alongside the distribution map defined above
  - reports table IDs together with types, constraints, index definitions etc (names scrubbed away)
  - default/check expressions also scrubbed from datums & names; all names replaced by equivalent IDs
- a CLI utility or web UI lets users inspect the data that is being collected.

Proposed approach by Peter:

- [x] make application naming functional #14085 #14089
- [x] equip the executor with per-app statistics #14092 
- [x] store some rudimentary stats (count, error code, number of result rows) in a per-node struct with proper mutex #14181
- [x] expose the data via a virtual table #14181
- [x] make a function that scrubs a query #14845
- [x] make a function that scrubs the table/db names from the collected statistics.
- [x] add settings: how frequently it's reported, how frequently it's cleared
- [ ] add a cmdline flag to identify the operator/user
  - [ ] add a setting for the user
  - [ ] link the registration form to editing this user
- [x] update the reg server schema to add a table 
- [x] update the reg server program/binary to accept the new queries
- [x] reporting loop - upload the collected stats to reg server with user, clear them out
- [x] document what is being reported - either in docs or admin UIs

Bonus features:
- [x] make a function that scrubs a schema
- [ ] make a json endpoint in the debug interface
- [ ] make a CLI utility that dumps it
- [x] submit data to reporting server + UI to opt-out




reporting: capture scrubbed information about queries and schemas #13968
