Add Claude skill for CI build troubleshooting #8172

Merged
lucaspimentel merged 5 commits into master from lpimentel/ai-skills/troubleshoot-azure-devops
Feb 9, 2026
@lucaspimentel lucaspimentel commented Feb 6, 2026

Summary of changes

Adds a new Claude Code skill (/troubleshoot-ci-build) that automates CI failure analysis for Azure DevOps builds in the dd-trace-dotnet repository.

Reason for change

Manual CI failure investigation is time-consuming and requires multiple steps:

  • Navigating to Azure DevOps
  • Finding the failed build
  • Checking test results
  • Comparing with recent master builds to determine if failures are new or pre-existing
  • Categorizing failures (infrastructure, flaky, or real bugs)

This skill streamlines the process by automating these steps and providing actionable insights directly in the Claude Code interface.

Implementation details

The skill provides three usage modes:

/troubleshoot-ci-build pr <PR_NUMBER>                          # Analyze the CI build for a specific PR
/troubleshoot-ci-build build <BUILD_ID>                        # Analyze a specific build by ID
/troubleshoot-ci-build compare <BUILD_ID> <BASELINE_BUILD_ID>  # Compare two builds
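
As an illustration of how the three modes map onto arguments, here is a minimal sketch of an argument parser. It is not the skill's actual logic (which lives in SKILL.md); the `Request` type and `parse_args` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional

USAGE = "expected: pr <PR_NUMBER> | build <BUILD_ID> | compare <BUILD_ID> <BASELINE_BUILD_ID>"


@dataclass
class Request:
    """Hypothetical container for a parsed /troubleshoot-ci-build invocation."""
    mode: str                           # "pr", "build", or "compare"
    target_id: int                      # PR number (pr) or build ID (build, compare)
    baseline_id: Optional[int] = None   # only set in "compare" mode


def parse_args(argv: list[str]) -> Request:
    """Parse the words that follow /troubleshoot-ci-build."""
    if not argv:
        raise ValueError(USAGE)
    mode, *rest = argv
    expected_counts = {"pr": 1, "build": 1, "compare": 2}
    if mode not in expected_counts:
        raise ValueError(f"unknown mode {mode!r}; {USAGE}")
    if len(rest) != expected_counts[mode]:
        raise ValueError(f"wrong number of arguments for {mode!r}; {USAGE}")
    ids = [int(value) for value in rest]  # int() raises ValueError on non-numeric input
    return Request(mode, ids[0], ids[1] if mode == "compare" else None)
```

For example, `parse_args(["pr", "7628"])` yields a request in `pr` mode with `target_id` 7628.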

Key features:

  • Fetches build and test results from the Azure DevOps API
  • Compares against recent master builds to identify new failures
  • Categorizes failures using known failure patterns (infrastructure, flaky, real)
  • Provides summary-first output with expandable details
  • Includes actionable recommendations for each failure type
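
The fetch-and-compare steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the skill's implementation: it uses the Timeline endpoint of the Azure DevOps Build REST API, assumes the project allows anonymous read access (otherwise a personal access token would be needed), and assumes timeline records carry `type`, `name`, and `result` fields. All function names are hypothetical.

```python
import json
import urllib.request

ORG = "datadoghq"           # taken from the build URL in the example output
PROJECT = "dd-trace-dotnet"


def timeline_url(build_id: int) -> str:
    """Build the REST URL for a build's timeline (stages, jobs, tasks)."""
    return (
        f"https://dev.azure.com/{ORG}/{PROJECT}"
        f"/_apis/build/builds/{build_id}/timeline?api-version=7.1"
    )


def fetch_timeline(build_id: int) -> dict:
    """Download the timeline as JSON. Assumes anonymous access is permitted."""
    with urllib.request.urlopen(timeline_url(build_id), timeout=30) as resp:
        return json.load(resp)


def failed_records(timeline: dict, record_type: str = "Task") -> set[str]:
    """Names of records of the given type whose result is 'failed'."""
    return {
        record["name"]
        for record in timeline.get("records", [])
        if record.get("type") == record_type and record.get("result") == "failed"
    }


def new_failures(build: dict, baseline: dict, record_type: str = "Task") -> set[str]:
    """Failures present in the build but absent from the baseline (e.g. master)."""
    return failed_records(build, record_type) - failed_records(baseline, record_type)
```

A failure that also appears in the baseline timeline is treated as pre-existing; anything left in the difference is a candidate new failure that still needs categorizing.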

The skill includes:

  • Main skill logic in SKILL.md with comprehensive Azure DevOps API integration
  • Usage documentation in README.md
  • Known failure patterns database in failure-patterns.md (460 lines of categorized test failures)
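
To illustrate how a known-patterns file could drive categorization, here is a minimal sketch. The patterns below are invented placeholders, not entries from failure-patterns.md, and treating unmatched failures as "real" is an assumption made for this example.

```python
import re

# Hypothetical patterns for illustration only; the actual entries live in
# failure-patterns.md and are not reproduced here.
PATTERNS = [
    ("infrastructure", re.compile(r"no space left on device|failed to pull image", re.IGNORECASE)),
    ("flaky", re.compile(r"timed out waiting|connection reset", re.IGNORECASE)),
]


def categorize(error_message: str) -> str:
    """Return the first matching category, or 'real' if nothing matches."""
    for category, pattern in PATTERNS:
        if pattern.search(error_message):
            return category
    # Assumption: anything not matching a known pattern needs human attention.
    return "real"
```

Ordering matters in this sketch: the first matching pattern wins, so broader patterns belong later in the list.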

Test coverage

N/A - This is a Claude Code skill (automation tool), not production code.

Other details

Example output:

---
CI Failure Analysis for Build 195393

Build: https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=195393
Status: ❌ Failed
Branch: refs/pull/7628/merge (PR #7628)
Commit: 88d5a6057f3f20c72b028c698e05a4dd851b3911
Finished: 2026-02-06 17:52:32 UTC

Quick Overview

Failed Stages (2):
- unit_tests_linux
- integration_tests_azure_functions

Failed Jobs (6 platforms):
- managed glibc_net5.0 (Linux unit tests)
- windows net6.0 (Azure Functions)
- windows net7.0 (Azure Functions)
- windows net8.0 (Azure Functions)
- windows net9.0 (Azure Functions)
- windows net10.0 (Azure Functions)

Failed Tasks (2 distinct):
- Run 'RunManagedUnitTests --framework net5.0 --code-coverage-enabled False' in Docker
- Run Azure Functions tests (5 occurrences across different .NET versions)

Failed Tests (5 Azure Functions tests, all SubmitsTraces variants):
- Datadog.Trace.ClrProfiler.IntegrationTests.AzureFunctionsTests+IsolatedRuntimeV4.SubmitsTraces
- Datadog.Trace.ClrProfiler.IntegrationTests.AzureFunctionsTests+IsolatedRuntimeV4AspNetCore.SubmitsTraces
- Datadog.Trace.ClrProfiler.IntegrationTests.AzureFunctionsTests+IsolatedRuntimeV4AspNetCoreV1.SubmitsTraces
- Datadog.Trace.ClrProfiler.IntegrationTests.AzureFunctionsTests+IsolatedRuntimeV4HostLogsDisabled.SubmitsTraces
- Datadog.Trace.ClrProfiler.IntegrationTests.AzureFunctionsTests+IsolatedRuntimeV4SdkV1.SubmitsTraces

Pattern Analysis:
- All Azure Functions integration tests failing across multiple .NET versions (net6.0-net10.0)
- All failures are in the SubmitsTraces test for isolated runtime V4 variants
- One Linux unit test failure on net5.0 (exit code 255, specific test not identified from error messages)

---
🔍 What would you like to investigate?

1. Compare with master - Check if these failures exist in recent master builds
2. View specific logs - Attempt to download logs for failed tasks (for net5.0 or Azure Functions tests)
3. Categorize failures - Analyze failure types (infrastructure/flaky/real)
4. Show full analysis - Run complete analysis with all details

https://dev.azure.com/datadoghq/dd-trace-dotnet/_build/results?buildId=195393&view=logs

🤖 Co-Authored-By: Claude Code

@lucaspimentel lucaspimentel requested a review from a team as a code owner February 6, 2026 22:33
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ec5c5ea1f5

@lucaspimentel lucaspimentel added the "AI Generated" label (largely based on code generated by an AI or LLM; this label is the same across all dd-trace-* repos) Feb 6, 2026
@pr-commenter

This comment was marked as off-topic.

@lucaspimentel lucaspimentel force-pushed the lpimentel/ai-skills/troubleshoot-azure-devops branch from ff545a2 to 6caab27 Compare February 6, 2026 23:49
@bouwkast bouwkast left a comment

Could / should we add a snippet to say something like retry once and if it fails again to reach out in our slack channel?

Just a general statement at top or something?

lucaspimentel and others added 4 commits February 9, 2026 10:30
Co-authored-by: Steven Bouwkamp <steven.bouwkamp@datadoghq.com>
Co-authored-by: Steven Bouwkamp <steven.bouwkamp@datadoghq.com>
@lucaspimentel
@bouwkast

should we add a snippet to say something like retry once and if it fails again to reach out in our slack channel?

good idea, done!

@lucaspimentel lucaspimentel enabled auto-merge (squash) February 9, 2026 15:38
@pr-commenter

This comment was marked as off-topic.

@lucaspimentel lucaspimentel merged commit 168868d into master Feb 9, 2026
98 of 101 checks passed
@lucaspimentel lucaspimentel deleted the lpimentel/ai-skills/troubleshoot-azure-devops branch February 9, 2026 16:20
@github-actions github-actions bot added this to the vNext-v3 milestone Feb 9, 2026
Labels

AI Generated (largely based on code generated by an AI or LLM; this label is the same across all dd-trace-* repos), area:docs
