Description
Problem
The Full Dashboard deadlock pipeline has significant latency compared to Lite. Lite reads directly from the XE ring buffer on each refresh and shows deadlocks instantly. The Full Dashboard requires 3 sequential scheduled job steps before data appears:
1. deadlock_xml_collector → collect.deadlock_xml
2. process_deadlock_xml (sp_BlitzLock) → collect.deadlocks
3. blocking_deadlock_analyzer → collect.blocking_deadlock_stats
Each step depends on the previous one completing, and each waits for its own job schedule, so a deadlock can take up to 3x the collection interval to appear on charts.
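The worst case is easy to quantify: with independently scheduled steps, each step can miss freshly written data by almost a full interval, and the delays add up. A minimal sketch (the 60-second interval is illustrative, not from the project):

```python
def worst_case_latency(steps: int, interval_seconds: float) -> float:
    """Worst-case end-to-end delay of a pipeline where each of
    `steps` stages runs on its own fixed schedule: every stage can
    miss new data by almost one full interval, so delays sum."""
    return steps * interval_seconds

# 3 scheduled steps on a 60-second cadence: up to 180 seconds
# before a deadlock reaches the chart.
print(worst_case_latency(3, 60))
```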
Observed Impact
Lite shows a deadlock spike at 5:00 PM. Full Dashboard shows nothing in the same time range because the pipeline hasn't completed all 3 steps yet.
Proposed Improvements
1. Chain-trigger dependent collectors
When deadlock_xml_collector finds new data, it should immediately trigger process_deadlock_xml and then blocking_deadlock_analyzer in the same job step, rather than waiting for separate scheduled runs: if a run finds data, it should parse that data right away.
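The chain-trigger idea can be sketched as a collector that calls its downstream steps directly when it finds new events, instead of returning and letting each step wait for its own schedule. All function names here are hypothetical, not the project's actual job steps:

```python
def deadlock_xml_collector(fetch_new_xml, process_xml, analyze_blocking):
    """Hypothetical chain-triggering collector sketch.

    fetch_new_xml: reads new deadlock XML from the XE target.
    process_xml:   stands in for the process_deadlock_xml step.
    analyze_blocking: stands in for blocking_deadlock_analyzer.
    """
    new_events = fetch_new_xml()
    if not new_events:
        return 0              # nothing new: downstream steps are skipped
    process_xml(new_events)   # run immediately, same job step
    analyze_blocking()        # then the analyzer, same job step
    return len(new_events)
```

The key property is that downstream work is driven by data arrival, not by a clock, so end-to-end latency collapses to a single collection interval.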
2. Increase collection frequency for key metrics
The Full Dashboard should match or exceed Lite's collection cadence for critical real-time metrics. The only difference between Full and Lite should be persistence (SQL Server vs DuckDB), not freshness. Consider:
- More frequent collection intervals for deadlocks, blocking, and other event-driven data
- Separating fast collectors (XE event reads) from slow collectors (DMV aggregations) so fast ones can run more often
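Splitting fast and slow collectors amounts to scheduling them on independent intervals. A minimal sketch using a simulated clock (collector names and the 15s/60s intervals are illustrative assumptions):

```python
import heapq

def schedule(collectors, horizon):
    """collectors: {name: interval_seconds}. Returns the (time, name)
    run order up to `horizon`, with each collector on its own cadence."""
    heap = [(0, name) for name in collectors]
    heapq.heapify(heap)
    runs = []
    while heap:
        t, name = heapq.heappop(heap)
        if t > horizon:
            continue  # past the horizon: drop without rescheduling
        runs.append((t, name))
        heapq.heappush(heap, (t + collectors[name], name))
    return runs

# Fast XE event read every 15s, slow DMV aggregation every 60s:
order = schedule({"deadlock_xe_read": 15, "dmv_aggregate": 60}, horizon=60)
```

Over one minute the fast collector runs five times (t = 0, 15, 30, 45, 60) while the slow one runs twice, so event-driven data stays fresh without paying the DMV aggregation cost on every tick.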
3. Review other pipelines for the same pattern
Any multi-hop aggregation pipeline in the Full Dashboard likely has the same lag issue. Audit all collector → analyzer chains.
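One way to audit for this pattern is to treat "job writes table X" / "job reads table X" pairs as edges and flag any chain longer than one hop. A hypothetical sketch (a real audit would pull these edges from the job definitions; the data below is just the deadlock pipeline from this issue):

```python
def find_chains(writes, reads):
    """writes: {job: table it writes}; reads: {job: table it reads}.
    Returns every maximal job chain with more than one hop."""
    next_job = {}
    for job, table in writes.items():
        for reader, src in reads.items():
            if src == table:
                next_job[job] = reader
    # chain heads are jobs that nothing feeds into
    heads = set(next_job) - set(next_job.values())
    chains = []
    for head in heads:
        chain = [head]
        while chain[-1] in next_job:
            chain.append(next_job[chain[-1]])
        if len(chain) > 2:  # 3+ jobs = multi-hop, so latency compounds
            chains.append(chain)
    return chains

chains = find_chains(
    {"deadlock_xml_collector": "collect.deadlock_xml",
     "process_deadlock_xml": "collect.deadlocks"},
    {"process_deadlock_xml": "collect.deadlock_xml",
     "blocking_deadlock_analyzer": "collect.deadlocks"},
)
# the 3-step deadlock pipeline is flagged as a multi-hop chain
```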