docs: add diagnostics and metrics guide #537

Merged
kcenon merged 2 commits into main from docs/issue-532-document-diagnostics-and-metrics-backends on Feb 8, 2026

kcenon (Owner) commented Feb 8, 2026

Summary

  • Add a comprehensive docs/DIAGNOSTICS_METRICS_GUIDE.md covering the thread pool diagnostics and metrics subsystems
  • Document all 6 diagnostics types (thread_pool_diagnostics, health_status, bottleneck_report, execution_event, job_info, thread_info)
  • Document all 7 metrics types (metrics_service, metrics_base, thread_pool_metrics, enhanced_metrics, latency_histogram, sliding_window_counter, metrics_backend)
  • Include MetricsBackend interface documentation with a custom implementation guide
  • Document the 3 built-in backends (Prometheus, JSON, Logging) with setup examples
  • Provide a bottleneck report interpretation guide covering detection logic, severity levels, and response recommendations
  • Add a complete end-to-end usage example combining diagnostics, metrics, and export backends

Closes #532

Test Plan

  • All code examples use correct API signatures matching source headers
  • All default values match source code exactly
  • All enum values (bottleneck_type, health_state, event_type, job_status, worker_state) match source
  • Cross-references to source files and other docs are valid

github-actions (Bot, Contributor) commented Feb 8, 2026

📊 Performance Benchmark Results

Performance Benchmark Report

No benchmark data available.

ℹ️ No baseline reference available

This is the first benchmark run or baseline file is missing.

kcenon (Owner, Author) commented Feb 8, 2026

Test Plan Verification Results

1. Code examples use correct API signatures — PASS

Verified 100+ API calls across all 13 source headers:

  • Diagnostics API: thread_pool_diagnostics constructor, dump_thread_states(), format_thread_dump(), get_active_jobs(), get_pending_jobs(limit=100), get_recent_jobs(limit=100), detect_bottlenecks(), health_check(), is_healthy(), enable_tracing(bool, history_size=1000), add_event_listener(), remove_event_listener(), get_recent_events(limit=100), to_json(), to_string(), to_prometheus()
  • Metrics API: MetricsBase, ThreadPoolMetrics, EnhancedThreadPoolMetrics, LatencyHistogram, SlidingWindowCounter, metrics_service — all signatures match source
  • Backend API: PrometheusBackend, JsonBackend, LoggingBackend, BackendRegistry — all signatures match source

2. All default values match source code — PASS

Verified all default values:

| Item | Guide Value | Source Value | Status |
|------|-------------|--------------|--------|
| diagnostics_config (7 fields) | Documented | Match | PASS |
| health_thresholds (8 fields) | Documented | Match | PASS |
| BUCKET_COUNT | 64 | 64 | PASS |
| DEFAULT_BUCKETS_PER_SECOND | 10 | 10 | PASS |
| Backend default prefix | thread_pool | thread_pool | PASS |
| JsonBackend pretty default | true | true | PASS |
| Method default params | All documented | All match | PASS |
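
To make the "Backend default prefix" row concrete, here is a minimal sketch of how a `thread_pool` prefix could appear in Prometheus text exposition output. The function `format_gauge` and the metric name `queue_depth` are hypothetical and are not taken from the library's headers; only the default prefix value comes from the table above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Render one gauge sample in the Prometheus text exposition format.
// Hypothetical helper for illustration: the real PrometheusBackend may
// format its output differently. Only the "thread_pool" default prefix
// is taken from the verification table.
inline std::string format_gauge(const std::string& name, std::int64_t value,
                                const std::string& prefix = "thread_pool") {
    assert(!name.empty());
    // Join prefix and metric name with an underscore when a prefix is set.
    const std::string full = prefix.empty() ? name : prefix + "_" + name;
    return "# TYPE " + full + " gauge\n" + full + " " + std::to_string(value) + "\n";
}
```

With the default prefix, `format_gauge("queue_depth", 3)` yields a `# TYPE thread_pool_queue_depth gauge` line followed by the sample line `thread_pool_queue_depth 3`.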

3. All enum values match source code — PASS

Verified 28 enum values across 5 enums:

| Enum | Count | Values | Status |
|------|-------|--------|--------|
| bottleneck_type | 7 | none, queue_full, slow_consumer, worker_starvation, lock_contention, uneven_distribution, memory_pressure | PASS |
| health_state | 4 | healthy(200), degraded(200), unhealthy(503), unknown(503) | PASS |
| event_type | 7 | job_submitted, job_started, job_completed, job_failed, job_cancelled, worker_started, worker_stopped | PASS |
| job_status | 6 | pending, running, completed, failed, cancelled, timed_out | PASS |
| worker_state | 4 | idle, busy, starting, stopping | PASS |
| Severity levels | 4 | 0=none, 1=low, 2=medium, 3=critical | PASS |
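
The health_state and severity rows can be sketched as two small lookup functions. This is an illustration only: the enum declaration and the names `to_http_status` and `severity_label` are assumptions, not the library's API. The state names, the numbers in parentheses (read here as HTTP status codes), and the severity labels are the values listed in the table.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for the library's health_state enum. Only the four
// state names are taken from the table; the declaration itself is assumed.
enum class health_state { healthy, degraded, unhealthy, unknown };

// Map a health state to the status code listed next to it in the table.
constexpr int to_http_status(health_state s) {
    switch (s) {
        case health_state::healthy:
        case health_state::degraded:
            return 200;
        case health_state::unhealthy:
        case health_state::unknown:
            return 503;
    }
    return 503;  // defensive default for out-of-range values
}

// Map a numeric severity (0..3) to its label from the table.
// "invalid" is an assumption for values outside that range.
constexpr std::string_view severity_label(int severity) {
    switch (severity) {
        case 0: return "none";
        case 1: return "low";
        case 2: return "medium";
        case 3: return "critical";
        default: return "invalid";
    }
}
```

Note that degraded maps to 200 in the table, so a health endpoint built this way would keep answering successfully while the pool is degraded and only report 503 for unhealthy or unknown.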

4. Cross-references to source files and docs are valid — 1 BUG FOUND, FIXED

Source file links (13/13): All valid
Documentation links: 3/4 valid

Bug found and fixed: Line 1518 referenced DAG_GUIDE.md, which does not exist in the repository. The link now points to POLICY_QUEUE_GUIDE.md (commit be78587).
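
A dead link of this kind can be caught mechanically. The sketch below is one possible approach, not the check that was actually run for this PR: `find_dead_links` is a hypothetical helper that extracts relative `.md` link targets from Markdown text and reports those missing from a caller-supplied set of known files.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Return every relative .md link target in `markdown` that is not listed in
// `existing`. Hypothetical helper: the name and approach are illustrative.
inline std::vector<std::string> find_dead_links(
        const std::string& markdown, const std::set<std::string>& existing) {
    // Matches "](target.md" and captures the target; a trailing "#anchor"
    // after the file name is left out of the capture.
    static const std::regex link_re(R"re(\]\(([^)#]+\.md))re");
    std::vector<std::string> dead;
    for (std::sregex_iterator it(markdown.begin(), markdown.end(), link_re), end;
         it != end; ++it) {
        const std::string target = (*it)[1].str();
        if (target.find("://") != std::string::npos) continue;  // skip external URLs
        if (existing.count(target) == 0) dead.push_back(target);
    }
    return dead;
}
```

Given text that links to both DAG_GUIDE.md and POLICY_QUEUE_GUIDE.md, and a file set containing only the latter, the function reports DAG_GUIDE.md. A real check would build the file set from the repository's docs directory.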

Summary

| Test Plan Item | Result |
|----------------|--------|
| API signatures correct | PASS |
| Default values match | PASS |
| Enum values match | PASS |
| Cross-references valid | PASS (1 bug fixed) |

Overall: ALL PASS (1 dead link bug found and fixed)

github-actions (Bot, Contributor) commented Feb 8, 2026

📊 Performance Benchmark Results

Performance Benchmark Report

No benchmark data available.

ℹ️ No baseline reference available

This is the first benchmark run or baseline file is missing.

kcenon merged commit 8d12f85 into main on Feb 8, 2026
26 checks passed
kcenon deleted the docs/issue-532-document-diagnostics-and-metrics-backends branch on February 8, 2026 16:45
kcenon added a commit that referenced this pull request Apr 13, 2026
* docs: add diagnostics and metrics guide (#532)

* docs: fix dead link to DAG_GUIDE.md in diagnostics metrics guide (#532)

Development

Successfully merging this pull request may close these issues.

[Task] docs: Document diagnostics and metrics backends
