server: add more info to the combined statement endpoint#109040
server: add more info to the combined statement endpoint#109040craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
koorosh
left a comment
There was a problem hiding this comment.
Reviewed 1 of 4 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go line 342 at r1 (raw file):
fmt.Sprintf(` SELECT COALESCE(min(aggregated_ts), now()) FROM %s %s`, table, whereClause), args...)
I have concern about this statement, I ran EXPLAIN ANALYZE on this query and it used full scan on it.
EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now();result:
planning time: 9ms
execution time: 9ms
distribution: local
vectorized: true
• render
│
└── • group (scalar)
│ nodes: n1
...
│
└── • scan
....
table: statement_statistics@fingerprint_stats_idx ---------------------- WARNING: the row count estimate is inaccurate, consider running 'ANALYZE statement_statistics'
spans: FULL SCAN (SOFT LIMIT)
Most of the indexes on system.statement\_statistics table defined with condition WHERE app\_name NOT LIKE '$ internal%' so I suggest to include this clause in WHERE condition of the statement:
EXPLAIN ANALYZE
SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics
WHERE true
AND aggregated_ts >= now()
AND aggregated_ts <= now()
AND app_name NOT LIKE '$ internal%' -- <- add thiswhich gives us:
└── • scan
nodes: n1
...
table: statement_statistics@execution_count_idx (partial index)
spans: [/'2023-08-18 17:31:07.875656+00' - /'2023-08-19 17:40:35.478088+00']
pkg/server/serverpb/status.proto line 1644 at r1 (raw file):
// OlderAggregatedTsReturned is the timestamp of the oldest entry returned, // or NOW() if there is no data returned. google.protobuf.Timestamp older_aggregated_ts_returned = 8 [(gogoproto.nullable) = false, (gogoproto.stdtime) = true];
The naming of field is a bit confusing.
koorosh
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go line 345 at r1 (raw file):
if err != nil { return timeutil.Now(), err
What is the motivation to return now() instead of NULL in case of error or empty results?
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
pkg/server/combined_statement_stats.go line 342 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
I have concern about this statement, I ran
EXPLAIN ANALYZEon this query and it used full scan on it.EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now();result:
planning time: 9ms execution time: 9ms distribution: local vectorized: true • render │ └── • group (scalar) │ nodes: n1 ... │ └── • scan .... table: statement_statistics@fingerprint_stats_idx ---------------------- WARNING: the row count estimate is inaccurate, consider running 'ANALYZE statement_statistics' spans: FULL SCAN (SOFT LIMIT)Most of the indexes on
system.statement\_statisticstable defined with conditionWHERE app\_name NOT LIKE '$ internal%'so I suggest to include this clause in WHERE condition of the statement:EXPLAIN ANALYZE SELECT COALESCE(min(aggregated_ts), now()) FROM crdb_internal.statement_statistics WHERE true AND aggregated_ts >= now() AND aggregated_ts <= now() AND app_name NOT LIKE '$ internal%' -- <- add thiswhich gives us:
└── • scan nodes: n1 ... table: statement_statistics@execution_count_idx (partial index) spans: [/'2023-08-18 17:31:07.875656+00' - /'2023-08-19 17:40:35.478088+00']
We actually have a cluster setting to decide if we're returning internal statements on this endpoint or not, so I added the changes to take that into consideration. The default value is to filter out internal, so now is doing that correctly! Nice callout!
pkg/server/combined_statement_stats.go line 345 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
What is the motivation to return
now()instead of NULL in case of error or empty results?
I get compile error if I try to just return null, since the function expects a Time. I changed to just the empty Time component, which ends up also returning NOW, but at east shows that we don't really care the value itself
pkg/server/serverpb/status.proto line 1644 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
The naming of field is a bit confusing.
Made a slight change, let me know if is more clear now. We do select a time period, but we might not have data for that full period, so let's say we request from august 10 to august 19, but we only have data starting on aug 15, so that is the value returned here. A following PR will use this value to show a warning in the UI.
Let me know if the name is better now.
koorosh
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go line 342 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
We actually have a cluster setting to decide if we're returning internal statements on this endpoint or not, so I added the changes to take that into consideration. The default value is to filter out internal, so now is doing that correctly! Nice callout!
Great!
pkg/server/combined_statement_stats.go line 345 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I get compile error if I try to just return null, since the function expects a Time. I changed to just the empty Time component, which ends up also returning NOW, but at east shows that we don't really care the value itself
It requires following changes to make it work with null:
- Remove
(gogoproto.nullable) = falseattribute fromoldest_aggregated_ts_returnedfield in ptoto message
google.protobuf.Timestamp oldest\_aggregated\_ts\_returned = 8 [(gogoproto.stdtime) = true];- Return pointer to time.Date in
getOldestDateand change returned values to eithernilor address of value
getOldestDate := func(table string) (\*time.Time, error) {My concern about returning current time is to avoid confusion when users reset stats so there's no records in tables but oldest aggregated ts would return current time which in turn would be problematic.
pkg/server/combined_statement_stats.go line 375 at r2 (raw file):
}() return tree.MustBeDTimestampTZ(row[0]).Time, nil
tree.MustBeDTimestampTZ panics if it cannot retrieve the value. Should we be more tolerant and use AsDTimestampTZ that indicates where the value parsed or not?
pkg/server/serverpb/status.proto line 1644 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
Made a slight change, let me know if is more clear now. We do select a time period, but we might not have data for that full period, so let's say we request from august 10 to august 19, but we only have data starting on aug 15, so that is the value returned here. A following PR will use this value to show a warning in the UI.
Let me know if the name is better now.
Got it! Thanks for detailed explanation it is clear now! Thank you!
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
pkg/server/combined_statement_stats.go line 345 at r1 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
It requires following changes to make it work with
null:
- Remove
(gogoproto.nullable) = falseattribute fromoldest_aggregated_ts_returnedfield in ptoto messagegoogle.protobuf.Timestamp oldest\_aggregated\_ts\_returned = 8 [(gogoproto.stdtime) = true];
- Return pointer to time.Date in
getOldestDateand change returned values to eithernilor address of valuegetOldestDate := func(table string) (\*time.Time, error) {My concern about returning current time is to avoid confusion when users reset stats so there's no records in tables but
oldest aggregated tswould return current time which in turn would be problematic.
Nice! I made the changes and now returns null when no data is returned :)
thank you for the detailed steps
pkg/server/combined_statement_stats.go line 375 at r2 (raw file):
Previously, koorosh (Andrii Vorobiov) wrote…
tree.MustBeDTimestampTZpanics if it cannot retrieve the value. Should we be more tolerant and useAsDTimestampTZthat indicates where the value parsed or not?
the value should ne null or timestamp, so I added a check for DNull first, then make the call the parse the timestamp, now it will probably return null or the timestamp.
koorosh
left a comment
There was a problem hiding this comment.
LGTM, great!
Also want to clarify if we need to do the same check for transaction statistics?
Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/server/combined_statement_stats.go line 375 at r2 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
the value should ne null or timestamp, so I added a check for DNull first, then make the call the parse the timestamp, now it will probably return null or the timestamp.
Nice! 🎉
Previously, it was hard to know which tables were used to populate the SQL Activity, making debug for it complicated. This commit adds extra information to the `combinedstmts` to help: - olderAggregatedTsReturned - stmtsSourceTable - txnsSourceTable Returning a value for `olderAggregatedTsReturned` will also allow us to show proper warning when the older timestamp doesn't match the start date of the selected time period on the Search Criteria. Part Of cockroachdb#108989 Release note: None
maryliag
left a comment
There was a problem hiding this comment.
On transaction statistics we also return statement statistics, so it will go over the same parts of the code, I just had to made an adjustement.
Examples of responses when:
Statement page with results:
"stmtsTotalRuntimeSecs": 89.46363,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": "2023-08-20T14:00:00Z",
"stmtsSourceTable": "crdb\_internal.statement\_statistics\_persisted",
"txnsSourceTable": ""
Statement page with no results:
"stmtsTotalRuntimeSecs": 0,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": null,
"stmtsSourceTable": "crdb\_internal.statement\_statistics",
"txnsSourceTable": ""
Transactions page with results:
"stmtsTotalRuntimeSecs": 90.2161,
"txnsTotalRuntimeSecs": 47.55528,
"oldestAggregatedTsReturned": "2023-08-20T14:00:00Z",
"stmtsSourceTable": "crdb\_internal.statement\_statistics\_persisted",
"txnsSourceTable": "crdb\_internal.transaction\_statistics\_persisted"
Transactions page with no results:
"stmtsTotalRuntimeSecs": 0,
"txnsTotalRuntimeSecs": 0,
"oldestAggregatedTsReturned": null,
"stmtsSourceTable": "crdb\_internal.statement\_statistics",
"txnsSourceTable": "crdb\_internal.transaction\_statistics"
```
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
koorosh
left a comment
There was a problem hiding this comment.
Sorry if my question was misleading.
getOldestDate function is only called to get oldest timestamp for statement_activity tables but not transaction_activity tables. I'm not sure if it is necessary or not.
Reviewed all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
maryliag
left a comment
There was a problem hiding this comment.
That was on purpose, because the statements data would reach the limit and be deleted first and that is used by both statements and transactions, so having just one of them is enough.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @koorosh)
koorosh
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @maryliag)
koorosh
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @maryliag)
|
TFTR! bors r+ |
|
Build succeeded: |
|
blathers backport 23.1 |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error setting reviewers, but backport branch blathers/backport-release-23.1-109040 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/109132/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. [] Backport to branch 23.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, it was hard to know which tables were
used to populate the SQL Activity, making debug for it complicated.
This commit adds extra information to the
combinedstmtsto help:Returning a value for
oldestAggregatedTsReturnedwill also allow us to show proper warning when the older timestamp doesn't match the start date of the selected time period on the Search Criteria.Part Of #108989
Release note: None