Alert on poison wait type spikes (THREADPOOL, RESOURCE_SEMAPHORE, etc.) #56

@erikdarlingdata

Problem

The Dashboard and Lite apps alert on blocking, deadlocks, and high CPU, but don't alert on dangerous wait type spikes that signal imminent trouble — THREADPOOL exhaustion, memory grant starvation, etc. These "poison waits" are leading indicators of severe problems.

Proposed Solution

Monitor deltas of specific dangerous wait types and alert when they increase significantly.

Poison wait types to monitor:

  • THREADPOOL — worker thread exhaustion, server stops accepting requests
  • RESOURCE_SEMAPHORE — memory grant starvation, queries can't get memory to execute
  • RESOURCE_SEMAPHORE_QUERY_COMPILE — can't compile queries due to memory pressure

Potentially also:

  • CMEMTHREAD — memory object contention
  • SOS_SCHEDULER_YIELD (sustained high delta) — CPU scheduler saturation

Alert trigger: A configurable delta threshold per collection interval (e.g., delta > N ms in a single interval), or a significant increase over baseline.
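As a rough illustration of the trigger described above, here is a minimal Python sketch. It is not the apps' actual implementation: the function name `check_poison_wait`, the threshold values, and the `baseline_multiplier` parameter are all hypothetical, chosen only to show the two trigger modes (absolute delta threshold, or a multiple of a baseline) side by side.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical per-interval thresholds in milliseconds. The issue leaves the
# real values ("N ms") open, so these numbers are placeholders only.
DEFAULT_THRESHOLDS_MS: Dict[str, int] = {
    "THREADPOOL": 1_000,
    "RESOURCE_SEMAPHORE": 5_000,
    "RESOURCE_SEMAPHORE_QUERY_COMPILE": 5_000,
}


@dataclass
class PoisonWaitAlert:
    wait_type: str
    delta_ms: int
    reason: str


def check_poison_wait(
    wait_type: str,
    delta_ms: int,
    baseline_ms: Optional[float] = None,
    thresholds: Dict[str, int] = DEFAULT_THRESHOLDS_MS,
    baseline_multiplier: float = 5.0,  # assumed meaning of "significant increase"
) -> Optional[PoisonWaitAlert]:
    """Return an alert if this interval's wait delta looks dangerous."""
    threshold = thresholds.get(wait_type)
    if threshold is None:
        return None  # not one of the monitored poison waits

    # Mode 1: absolute delta threshold for a single collection interval.
    if delta_ms > threshold:
        return PoisonWaitAlert(
            wait_type, delta_ms,
            f"delta {delta_ms} ms exceeds threshold {threshold} ms",
        )

    # Mode 2: significant increase over a known, non-zero baseline.
    if baseline_ms and delta_ms > baseline_ms * baseline_multiplier:
        return PoisonWaitAlert(
            wait_type, delta_ms,
            f"delta {delta_ms} ms exceeds {baseline_multiplier}x baseline",
        )

    return None
```

For example, `check_poison_wait("THREADPOOL", 1500)` returns an alert under these placeholder thresholds, while `check_poison_wait("THREADPOOL", 500)` returns `None` unless a low enough baseline is supplied. Keeping thresholds in a per-wait-type mapping is one way to make them configurable, since a tolerable delta for RESOURCE_SEMAPHORE is not necessarily tolerable for THREADPOOL.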

Note on THREADPOOL: Alerting on this wait is partly self-defeating: if THREADPOOL is saturated, collections may stop running. The valuable signal is the last successful collection showing THREADPOOL spiking before things go dark. That's the canary in the coal mine.
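One possible way to act on that "last successful collection" signal is to pair a staleness check with the final recorded THREADPOOL delta. The sketch below is an assumption about how that could look, not something the issue specifies: `collection_went_dark`, `missed_intervals`, and the default threshold are hypothetical names and values.

```python
from datetime import datetime, timedelta


def collection_went_dark(
    last_collection_time: datetime,
    last_threadpool_delta_ms: int,
    now: datetime,
    expected_interval: timedelta,
    threadpool_threshold_ms: int = 1_000,  # placeholder value
    missed_intervals: int = 2,             # assumed tolerance before "dark"
) -> bool:
    """True if collections have stopped and the last one showed THREADPOOL spiking."""
    # Collections count as overdue once more than `missed_intervals`
    # expected intervals have passed since the last successful one.
    overdue = (now - last_collection_time) > expected_interval * missed_intervals
    # The last successful sample is the canary: was THREADPOOL already elevated?
    spiking = last_threadpool_delta_ms > threadpool_threshold_ms
    return overdue and spiking
```

This check would have to run in the monitoring app rather than on the monitored server, since a server that has run out of worker threads cannot be relied on to report its own condition.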

Context

  • Wait stats deltas are already collected in both Dashboard (SQL Server) and Lite (DuckDB)
  • Should be implemented before #52 (Alerts History view for centralized alert assessment) so the new alert type is included in the history UI
  • CPU alerts currently have minimal context; this gives a richer picture of server distress

Scope

  • Full Dashboard
  • Lite
  • SQL collection scripts (already collected)
  • Installer
  • Documentation

Labels: enhancement
