Description
Problem
Dashboard's collectors use incremental collection — each cycle only fetches data since the last collection time. Lite's collectors re-scan fixed time windows every cycle, doing redundant work.
This is the single biggest performance difference between Dashboard and Lite collection.
Query Store collector (avg 2,286ms vs Dashboard's 126ms)
Root cause: Lite scans a fixed 60-minute window with GROUP BY + TOP 100 every cycle. Dashboard tracks MAX(utc_last_execution_time) and only fetches rows newer than that — typically 1-5 minutes of data, no aggregation needed.
Fix: Track last collection time in DuckDB, pass as cutoff to SQL query. Store raw Query Store intervals (like Dashboard does) instead of re-aggregating every cycle. Aggregation moves to the display/trend queries.
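A minimal sketch of the incremental fetch. The DuckDB table and column names (`query_store_runtime_stats`, `last_execution_time`, `server_id`) are assumptions; the SQL Server side uses documented `sys.query_store_runtime_stats` columns:

```sql
-- DuckDB side: newest interval already stored (table/column names are assumptions)
SELECT max(last_execution_time) FROM query_store_runtime_stats WHERE server_id = ?;

-- SQL Server side: fetch only intervals newer than that cutoff,
-- raw rows with no GROUP BY / TOP 100 -- aggregation moves to display queries
SELECT rs.plan_id,
       rs.runtime_stats_interval_id,
       rs.count_executions,
       rs.avg_duration,
       rs.last_execution_time
FROM sys.query_store_runtime_stats AS rs
WHERE rs.last_execution_time > @cutoff_time;
```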
Deadlock collector (avg 1,194ms, spiking to 3,837ms)
Two issues:
- Ring buffer full parse: The XE ring_buffer query reads the entire buffer (up to 4 MB of XML) every cycle, then filters by cutoff time inside the `.nodes()` XQuery. SQL Server must parse the full XML before it can filter.
- XML parsing inside the DuckDB timing: `ExtractVictimSqlText()` calls `XElement.Parse()` on every deadlock graph XML inside the DuckDB stopwatch block. This is C# XML parsing being misattributed as DuckDB time, and more importantly it is O(n) full XML parses per collection. It should run once, outside the appender loop, or be deferred to display time.
Fix: The collector already derives @cutoff_time from DuckDB's MAX(deadlock_time), so the SQL side is incremental. The real fix is moving XML parsing out of the insert loop (parse victim text at display time, not collection time) and making sure the ring buffer doesn't accumulate unbounded events.
All other collectors
Every collector that uses a fixed time window should switch to incremental:
| Collector | Current Window | Fix |
|---|---|---|
| query_store | Fixed 60 min | Track last collection time, fetch only new runtime_stats intervals |
| query_snapshots | Fixed window | Track last collection_time |
| blocked_process_report | Fixed window | Track last event_time |
| cpu_utilization | Fixed 60 min (ring buffer) | Track last sample_time |
| perfmon_stats | Delta-based (already incremental via cumulative counters) | OK |
| wait_stats | Delta-based (already incremental via cumulative counters) | OK |
| memory_stats | Point-in-time snapshot | OK (no window) |
| file_io_stats | Delta-based | OK |
| tempdb_stats | Point-in-time snapshot | OK |
| memory_clerks | Point-in-time snapshot | OK |
Pattern
Dashboard uses this pattern in every collector:
```sql
-- Get last collection time
SELECT @cutoff_time = MAX(last_execution_time) FROM collect.table;

-- Only fetch new data
WHERE rs.last_execution_time >= @cutoff_time
```

Lite equivalent:
```csharp
// Query DuckDB for last collection time
var lastTime = await GetLastCollectionTimeAsync(serverId, "table_name");

// Pass to SQL query
command.Parameters.Add(new SqlParameter("@cutoff_time", SqlDbType.DateTime2)
{ Value = lastTime ?? DateTime.UtcNow.AddMinutes(-10) });
```

The deadlock collector already does this (lines 361-376 in RemoteCollectorService.Deadlocks.cs); extend the same pattern to all collectors that scan time windows.
Additional: Deadlock XML parsing
Move ExtractVictimSqlText() out of the collection hot path:
- Option A: Store raw XML, extract victim text at display time (lazy)
- Option B: Parse once per batch outside the appender loop, build a dictionary, then look up during append
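Option B could look like the following sketch. The method name and event shape are assumptions; the XML element names (`victim-list`, `victimProcess`, `process-list`, `process`, `inputbuf`) are the standard deadlock-graph structure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeadlockParsing
{
    // Option B sketch: parse each distinct deadlock graph exactly once,
    // before the appender loop, and build a lookup from graph XML to the
    // victim's input buffer text.
    public static Dictionary<string, string?> BuildVictimTextLookup(
        IEnumerable<string> deadlockXmls)
    {
        var lookup = new Dictionary<string, string?>();
        foreach (var xml in deadlockXmls)
        {
            if (lookup.ContainsKey(xml)) continue; // parse each distinct graph once

            string? victimText = null;
            try
            {
                var graph = XElement.Parse(xml);
                // <deadlock><victim-list><victimProcess id="..."/></victim-list>
                //           <process-list><process id="..."><inputbuf>...</inputbuf>
                var victimId = graph.Element("victim-list")?
                                    .Element("victimProcess")?
                                    .Attribute("id")?.Value;
                victimText = graph.Element("process-list")?
                                  .Elements("process")
                                  .FirstOrDefault(p => (string?)p.Attribute("id") == victimId)?
                                  .Element("inputbuf")?.Value;
            }
            catch (System.Xml.XmlException)
            {
                // Malformed graph: record null rather than failing the whole batch.
            }
            lookup[xml] = victimText;
        }
        return lookup;
    }
}
```

The appender loop then only does a dictionary lookup per row, so the DuckDB stopwatch no longer measures C# XML parsing.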
Files involved
- Lite/Services/RemoteCollectorService.QueryStore.cs — biggest win
- Lite/Services/RemoteCollectorService.Deadlocks.cs — XML parsing
- Lite/Services/RemoteCollectorService.cs — add shared GetLastCollectionTimeAsync() helper
- All other RemoteCollectorService.*.cs files with fixed time windows