Aggregate RawRecords before picking topN records

## Development Task
Currently, RawRecords are aggregated by "sql+plan+table+region+key_range". We pick the topN RawRecords ordered by CPU time, and all RawRecords outside the topN are merged into a single "Other" record. Because N is a fixed limit, when queries access many different regions the number of RawRecords can be much larger than N, so many records that belong to one of the topN "sql+plan" pairs end up merged into the "Other" record. This can distort the reported data in some cases.  
For example:
A cluster has only one TiKV node and runs QueryA and QueryB.
QueryA accesses 100 regions concurrently in one second; each request takes 100.5 ms of TiKV CPU time.
QueryB accesses 100 regions concurrently in one second; the requests take [150, 149, 148, ..., 51] ms respectively.
We get 200 RawRecords for these 200 region requests; 49 records of QueryA and 50 of QueryB are picked, and the rest are merged into "Other". From the picked records alone, we would conclude that QueryB takes much more CPU time than QueryA. However, the two queries actually take almost the same CPU time.
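
The arithmetic of this example can be checked with a small Python sketch. It is only a model of the scenario, not TiKV code: the tuple layout `(query, region_id, cpu_ms)` and the helper `total_by_query` are hypothetical, and `N = 99` is an assumption chosen because it reproduces the 49/50 split described above (the issue does not state N).

```python
# Hypothetical model: one raw record per region request, as (query, region_id, cpu_ms).
N = 99  # assumed limit; not stated in the issue

records = [("QueryA", region, 100.5) for region in range(100)]
# QueryB costs 150, 149, ..., 51 ms across its 100 regions.
records += [("QueryB", region, float(ms)) for region, ms in enumerate(range(150, 50, -1))]


def total_by_query(recs):
    """Sum cpu_ms per query over the given records."""
    totals = {}
    for query, _region, cpu_ms in recs:
        totals[query] = totals.get(query, 0.0) + cpu_ms
    return totals


# Pick the top N per-region records by CPU time; everything else would go to "Other".
top = sorted(records, key=lambda rec: rec[2], reverse=True)[:N]

actual = total_by_query(records)   # true per-query totals
reported = total_by_query(top)     # totals visible in the picked records

print("actual:  ", actual)
print("reported:", reported)
```

Under these assumptions both queries have the same true total, while the picked records attribute noticeably more CPU time to QueryB, because only its 50 most expensive requests outrank QueryA's uniform 100.5 ms requests.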

To reduce this kind of distortion, we can aggregate RawRecords by "sql+plan+table" before picking the topN records.
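
A minimal sketch of the proposed ordering, in Python for illustration only. The record layout, the function names `aggregate` and `pick_top_n`, the placeholder plan/table/key-range values, and `n = 99` are all assumptions made for this sketch; they are not TiKV's actual types or API.

```python
from collections import defaultdict


def aggregate(records):
    """Sum cpu_ms per (sql, plan, table), dropping region and key_range."""
    totals = defaultdict(float)
    for sql, plan, table, _region, _key_range, cpu_ms in records:
        totals[(sql, plan, table)] += cpu_ms
    return dict(totals)


def pick_top_n(totals, n):
    """Keep the n largest aggregated entries; fold the rest into one 'Other' sum."""
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    top = dict(ranked[:n])
    other = sum(cpu_ms for _key, cpu_ms in ranked[n:])
    return top, other


# Same scenario as the example: 100 region requests per query.
# Plan, table, and key_range values are placeholders.
raw = [("QueryA", "planA", "t", region, ("k0", "k1"), 100.5) for region in range(100)]
raw += [
    ("QueryB", "planB", "t", region, ("k0", "k1"), float(ms))
    for region, ms in enumerate(range(150, 50, -1))
]

aggregated = aggregate(raw)  # 200 raw records collapse to 2 entries
top, other = pick_top_n(aggregated, n=99)

print(top)
print("Other:", other)
```

In this sketch the 200 per-region records collapse into two aggregated entries before the limit is applied, so both queries keep their full CPU time and nothing is pushed into "Other".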

Aggregate RawRecords before picking topN records #18844

Development Task

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Aggregate RawRecords before picking topN records #18844

Description

Development Task

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions