Add queries for collecting query store runtime and wait statistics#8465
IgorKuchmienko wants to merge 9 commits into influxdata:master
Conversation
All the comments put for SQLDB apply to MI as well. I have looked closer at SQLDB for now but haven't tested on a running workload. For all the timestamp calculations, please add comments for maintainability. I also feel we should have some "check", given we now have a cache, to NOT run this query if it has been run, say, < 15 mins ago. Think of a person who had an old config file, who hasn't added this new addition to their "exclude" list, so this would for them run by default at a 10 second interval. Min collection should be 15 mins. Also, have you done any perf testing on collecting this on a SQL with workload? And for MI, on an MI with multiple databases running a workload?
The other thing I noticed: if someone changes the interval below the collection time, as is we are not "merging" or deduping right, so queries will get counted twice.
const sqlAzureDBQueryStoreRuntimeStatistics = sqlAzureDBPartQueryPeriod + `
You may want to check Updateability (until we have Query Store on read replicas). Otherwise you will be collecting Query Store data (duplicate or the same) from two replicas if a read replica in BC/Hyperscale is configured for monitoring. Collect only if it is READ_WRITE.
https://docs.microsoft.com/en-us/sql/t-sql/functions/databasepropertyex-transact-sql?view=sql-server-ver15
We still need this check here, i.e. collect only if it is a read/write database and not a read-only replica.
…ime_stats_interval_id. Need this aggregation so that we do not get multiple rows for the same query in the same time interval. See here https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-query-store-runtime-stats-transact-sql?view=sql-server-ver15
Yes, this is exactly what the cache is used for. Whatever the interval in telegraf is set to, by using the cached time we guarantee that nothing is returned from telegraf while there is no new data in query store, and no duplicates are returned. I edited the comment in the description of this PR to elaborate on this, and I added comments in the implementation.
These tests were run for the queries.
In both cases, a CPU load was generated with a T-SQL script on the DB.
The plug-in was configured to output the data into a socket, and it was then streamed into an Azure Log Analytics workspace by the Log Analytics agent installed on the VM. The tests showed that the queries produce the data as expected: once per aggregation interval and with no duplicate records. There are no gaps in the data, and the telegraf log file tgf.log contains no errors. Test results are here: Tests.zip.
denzilribeiro left a comment:
Need some changes: error handling in the MI part, checks that databases are online and are not a read replica, and some minor changes of some columns to tags.
DB_NAME() AS [database_name],
CONVERT(nvarchar(30), MAX(rsi.start_time), 126) AS interval_start_time,
CONVERT(nvarchar(30), MAX(rsi.end_time), 126) AS interval_end_time,
MAX(q.query_id) AS query_id,

query_id should be a TAG and it isn't (i.e. convert it to a string), as it isn't a measurement, just like plan_id was converted.
DB_NAME() AS [database_name],
CONVERT(nvarchar(30), MAX(rsi.start_time), 126) AS interval_start_time,
CONVERT(nvarchar(30), MAX(rsi.end_time), 126) AS interval_end_time,
MAX(q.query_id) AS query_id,

query_id should be a TAG (convert to string); it isn't a measurement, same as plan_id.
[database_name] nvarchar(128),
interval_start_time nvarchar(30),
interval_end_time nvarchar(30),
query_id bigint,

query_id should be a string/tag, same as plan_id, as it isn't a measurement.
SET NOCOUNT ON;

DECLARE dbCursor CURSOR LOCAL FOR
SELECT

Need other conditions, i.e. that the database is online and read/write (not a read replica); check the state column in sys.databases.
This also needs error handling; check whether Query Store is enabled, as there is no TRY...CATCH block here and it is iterating through multiple databases.
FROM sys.dm_os_schedulers AS s
`

const sqlAzureMIPartQueryPeriod = `

This is a sizable query; have we thought of limiting what we retrieve? Take, for example, an MI with 20 databases: creating a temp table with everything in Query Store could be expensive. Should you get, say, the top 50 from each?
[database_name] nvarchar(128),
interval_start_time nvarchar(30),
interval_end_time nvarchar(30),
query_id bigint,

query_id should be a TAG (string), as it is not a measurement.
/* Convert start and end times to datetime */
DECLARE @currIntervalEndDate datetime = DATEADD(second, @currIntervalEndTimestamp % @secondsInDay, DATEADD(day, @currIntervalEndTimestamp / @secondsInDay, 0));
DECLARE @currIntervalStartDate datetime = DATEADD(second, @currIntervalStartTimestamp % @secondsInDay, DATEADD(day, @currIntervalStartTimestamp / @secondsInDay, 0));

@currIntervalStartDate and @currIntervalEndDate are declared as datetime, but the start_time and end_time columns are datetimeoffset, which has higher precision, so rounding can cause unexpected results. More importantly, on MI the time zone may be something other than UTC and may have daylight savings, which won't be handled correctly with datetime.

@dimitri-furman Is the recommendation to use datetime2 in all places? datetime2 seems to have higher precision than datetime; however, it is not aware of the time-zone offset.

Query Store uses datetimeoffset in sys.query_store_runtime_stats_interval. We should stay consistent and use the same data type to avoid type-conversion problems. But you should check whether the current time math that uses datetime still works correctly with datetimeoffset.
'sqlserver_azuredb_querystore_runtime_stats' AS [measurement],
REPLACE(@@SERVERNAME, '\', ':') AS [sql_instance],
DB_NAME() AS [database_name],
CONVERT(nvarchar(30), MAX(rsi.start_time), 126) AS interval_start_time,

This partially truncates the time-zone offset. Need to use nvarchar(33) or longer.
RETURN
END;

IF OBJECT_ID ('tempdb.dbo.#ExecForeachDb') IS NOT NULL DROP PROCEDURE #ExecForeachDb;
Simplify: DROP PROCEDURE IF EXISTS #ExecForeachDb;
BEGIN
SET NOCOUNT ON;

DECLARE dbCursor CURSOR LOCAL FOR

Declaring the cursor as FAST_FORWARD will be more efficient.
FETCH NEXT FROM dbCursor INTO @dbName;

WHILE @@Fetch_Status=0 BEGIN
SET @CurrSqlText = N''USE ''+ QUOTENAME(@dbName ,''"'') + @queryText

A database may be dropped between the time the cursor is opened and the time this USE executes.
Hi @IgorKuchmienko, are you still able to work on this PR? Please let me know if there's anything we can do to help.
Yes, we'll be working on this PR. Unfortunately, the PR got delayed and the ETA is around July.
Splitting up this PR into 2 for easier review. New PR: #9150 (drafted WIP to address review comments)
@denzilribeiro: Re "Think of a person who had an old config file, who hasn't added this new addition to their 'exclude' list so this would for them run by default at a 10 second interval. Min collection should be 15 mins.": for a user with an old config and the latest telegraf bits, yes, it would start to collect QDS data at the interval specified. One possible way to avoid this is to introduce another config parameter, "query_store_collection", set to false by default. If a user wants to enable the QDS queries, they will need to set query_store_collection = true and also remove the QDS query list from exclude. If they just change query_store_collection to true without modifying the exclude list, data will still not be collected. Let me know your thoughts.
Closing in favor of #9355 |
Add queries for collecting query store runtime and wait statistics, for Azure SQL DB and MI.
Add ability to store in a hashtable a custom value associated with the query result, and use it when the query is run next time.
For query store queries, we store the time when the query was run last, and in the next run we only return the data between the last run time and the current time, rounded down to a multiple of the reporting interval.
The implementation takes care of two cases:
The telegraf collection interval is shorter than the Query Store aggregation interval. In this case, the query will return data only once per aggregation interval, and will return an empty set for each subsequent telegraf call within the same aggregation interval.
The telegraf collection interval is longer than the Query Store aggregation interval. In this case, the query will return records from several aggregation intervals, i.e. all the new data since the previous telegraf call.
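The two cases above can be sketched in Go. This is a simplified model of the caching scheme the description explains, not the PR's actual code: identifiers like `nextWindow` and `queryCache` are illustrative, and timestamps are plain Unix seconds.

```go
package main

import "fmt"

// queryCache models the hashtable the PR describes: a custom value (here the
// end of the last reported window) stored per query and reused on the next run.
type queryCache map[string]int64

// nextWindow returns the half-open window [start, end) to collect, where end
// is the current time rounded down to a multiple of the aggregation interval.
// ok=false means no full new interval has elapsed, so the query should return
// an empty set (case 1); a window spanning several intervals covers case 2.
func nextWindow(cache queryCache, query string, now, intervalSec int64) (start, end int64, ok bool) {
	end = now - now%intervalSec // round down to the interval boundary
	start = cache[query]
	if end <= start {
		return 0, 0, false // still inside the interval reported last time
	}
	cache[query] = end // remember how far we have reported
	return start, end, true
}

func main() {
	cache := queryCache{}
	const interval = int64(900) // e.g. a 15-minute aggregation interval

	s, e, ok := nextWindow(cache, "runtime_stats", 1000, interval)
	fmt.Println(s, e, ok) // 0 900 true: first call reports up to the boundary

	_, _, ok = nextWindow(cache, "runtime_stats", 1010, interval)
	fmt.Println(ok) // false: same interval, empty set, no duplicates

	s, e, ok = nextWindow(cache, "runtime_stats", 2800, interval)
	fmt.Println(s, e, ok) // 900 2700 true: one window covering two intervals
}
```

This makes the dedup guarantee concrete: whatever interval telegraf runs at, consecutive windows never overlap and never leave a gap.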