
Add VMR codeflow health check to pipelines skill #13326

Merged
JanProvaznik merged 4 commits into dotnet:main from JanProvaznik:dev/janpro/vmr-codeflow-skill on Mar 5, 2026



@JanProvaznik
Member

Summary

Extends the pipelines-health-check skill to monitor VMR (dotnet/dotnet) codeflow PRs and the `dotnet-unified-build` pipeline.

Changes

*New: `check-vmr-codeflow.ps1`*

  • Finds open codeflow PRs from dotnet/msbuild → dotnet/dotnet via GitHub API
  • Gets `dotnet-unified-build` pipeline runs from `dnceng-public/public` via Azure DevOps REST API
  • Extracts failed jobs/tasks with error messages from build timelines
  • Categorizes failures (TASK_HOST, COMPILATION, BUILD_COMMAND, SIGNING, NUGET_AUTH, etc.)
  • Parses PR comments to extract included upstream msbuild PR numbers
  • Correlates failures with upstream PRs by matching error categories to PR title keywords
  • Outputs structured JSON with health summary per codeflow PR
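
The comment parsing and keyword correlation steps listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the PowerShell in `check-vmr-codeflow.ps1`: the regex patterns, the `CATEGORY_KEYWORDS` mapping, and the function names are all assumptions made for the example.

```python
import re

# Assumed reference styles: a full PR URL or a short "msbuild#NNN" mention.
UPSTREAM_PR_PATTERNS = [
    re.compile(r"github\.com/dotnet/msbuild/pull/(\d+)"),
    re.compile(r"\bmsbuild#(\d+)"),
]

# Hypothetical mapping from failure category to PR-title keywords.
CATEGORY_KEYWORDS = {
    "TASK_HOST": ["task host", "taskhost", "app host"],
    "SIGNING": ["sign"],
    "NUGET_AUTH": ["nuget", "feed"],
}


def extract_upstream_prs(comment_body: str) -> list[int]:
    """Return the sorted, de-duplicated PR numbers referenced in a comment."""
    numbers = set()
    for pattern in UPSTREAM_PR_PATTERNS:
        numbers.update(int(n) for n in pattern.findall(comment_body))
    return sorted(numbers)


def correlate(categories: list[str], upstream_prs: list[dict]) -> list[dict]:
    """Pair each failure category with PRs whose title contains a keyword."""
    suspects = []
    for category in categories:
        keywords = CATEGORY_KEYWORDS.get(category, [])
        for pr in upstream_prs:
            title = pr["title"].lower()
            if any(keyword in title for keyword in keywords):
                suspects.append({"category": category, "pr": pr["number"]})
    return suspects
```

Keyword matching on titles is only a heuristic: it can suggest which upstream PRs to look at first, but it cannot confirm that a PR caused a failure.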

*Updated: `SKILL.md`*

  • Added VMR triggers in 'When to Use' and frontmatter description
  • Added `dotnet-unified-build` pipeline reference and key URLs
  • Added a data collection step
  • Added VMR Codeflow PRs table rendering with failure details and upstream PR listing
  • Added VMR codeflow failure/stale as new problem types
  • Added subagent prompt template for VMR codeflow failure investigation with change correlation
  • Added VMR-specific troubleshooting entries

Motivation

When MSBuild source changes are codeflowed into the VMR, the `dotnet-unified-build` pipeline validates them. Failures there (like the recent MSB4216 task host errors from the app host changes) weren't covered by the existing skill. This addition lets the agent automatically detect VMR failures and correlate them back to the specific msbuild PRs that likely caused them.

Testing

Tested locally: the script found 2 open codeflow PRs (#5183, #5204), identified TASK_HOST failures, and correlated them with upstream PRs.

Add check-vmr-codeflow.ps1 that finds open codeflow PRs from
dotnet/msbuild to dotnet/dotnet, gets dotnet-unified-build pipeline
runs from dnceng-public/public, extracts failure details, and
correlates them with included upstream msbuild PRs.

Update SKILL.md with VMR pipeline reference info, data collection
step, table rendering instructions, subagent template for VMR
codeflow failure investigation, and troubleshooting section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment


Pull request overview

This PR extends the pipelines-health-check skill to also monitor VMR (dotnet/dotnet) codeflow PRs and the dotnet-unified-build pipeline. When MSBuild source changes are flowed into the VMR, the new tooling detects pipeline failures and correlates them back to the specific upstream MSBuild PRs that likely caused them.

Changes:

  • New check-vmr-codeflow.ps1 script that discovers open codeflow PRs via GitHub API, retrieves Azure DevOps pipeline runs for each PR branch, extracts and categorizes failures from build timelines, and correlates failures with included upstream MSBuild PRs.
  • Updated SKILL.md to document the new data collection step, add a VMR Codeflow PRs table section, define new problem types (VMR codeflow failure/stale), and add a subagent prompt template for VMR codeflow investigation.
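
For readers unfamiliar with the Azure DevOps side, the run lookup and timeline extraction described above might look roughly like this. It is a Python sketch under stated assumptions: the `refs/pull/<n>/merge` branch filter, the `api-version` value, and the function names are illustrative, and the pipeline's numeric definition ID is left as a parameter because it is not given here.

```python
from urllib.parse import urlencode

AZDO_BASE = "https://dev.azure.com/dnceng-public/public/_apis/build"


def builds_url(definition_id: int, pr_number: int, top: int = 5) -> str:
    """URL listing recent runs of one pipeline for a GitHub PR merge ref."""
    query = urlencode({
        "definitions": definition_id,
        "branchName": f"refs/pull/{pr_number}/merge",  # assumed branch filter
        "$top": top,
        "api-version": "7.1",
    })
    return f"{AZDO_BASE}/builds?{query}"


def timeline_url(build_id: int) -> str:
    """URL of the timeline (stages, jobs, tasks) for a single build."""
    return f"{AZDO_BASE}/builds/{build_id}/timeline?api-version=7.1"


def failed_records(timeline: dict) -> list[dict]:
    """Collect failed Job/Task records together with their error messages."""
    failures = []
    for record in timeline.get("records", []):
        if record.get("result") != "failed":
            continue
        if record.get("type") not in ("Job", "Task"):
            continue
        errors = [
            issue.get("message", "")
            for issue in record.get("issues") or []
            if issue.get("type") == "error"
        ]
        failures.append(
            {"type": record["type"], "name": record.get("name"), "errors": errors}
        )
    return failures
```

The functions only build URLs and filter an already-downloaded timeline, so they can be exercised without network access.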

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| .github/skills/pipelines-health-check/check-vmr-codeflow.ps1 | New script: searches for open codeflow PRs, retrieves pipeline runs and timelines, categorizes failures, and emits structured JSON |
| .github/skills/pipelines-health-check/SKILL.md | Updated skill documentation: adds VMR triggers, pipeline table row, key URLs, data collection step, table rendering instructions, problem types, and subagent prompt for VMR codeflow failures |


JanProvaznik and others added 3 commits March 4, 2026 15:00
- Rename $matches to $regexMatches to avoid shadowing PS automatic variable
- Add short reference regex (msbuild#NNN) and fix misleading comment
- Reorder failure category checks: specific BinaryToolTask before general MSB4216
- Add DefinitionId parameter to Get-PipelineRunsForPR for pipeline filtering
- Add GITHUB_TOKEN support via shared $githubHeaders for rate limit mitigation
- Fix broken dotnet-unified-build URL in SKILL.md (was using non-numeric definitionId)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
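
The reordering of category checks in this commit matters because the first matching rule wins. A minimal Python sketch of that idea, assuming an ordered rule list: the `BINARY_TOOL_TASK` and `UNKNOWN` names and the `error CS\d{4}` pattern are hypothetical, and only `TASK_HOST`, `COMPILATION`, `BinaryToolTask`, and `MSB4216` come from this PR.

```python
import re

# Ordered rules: the first match wins, so specific rules precede general ones.
CATEGORY_RULES = [
    ("BINARY_TOOL_TASK", re.compile(r"BinaryToolTask")),  # hypothetical name
    ("TASK_HOST", re.compile(r"MSB4216")),
    ("COMPILATION", re.compile(r"error CS\d{4}")),        # assumed pattern
]


def categorize(message: str) -> str:
    """Return the first category whose pattern appears in the message."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(message):
            return category
    return "UNKNOWN"  # hypothetical fallback
```

With the two top rules swapped, a message that mentions both `BinaryToolTask` and `MSB4216` would fall into the general `TASK_HOST` bucket and the more specific signal would be lost.
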
Replace all unauthenticated Invoke-RestMethod GitHub API calls with
gh CLI which handles authentication automatically. This eliminates
rate limiting issues (60 req/hr unauthenticated vs 5000 req/hr
authenticated) and properly surfaces failures instead of silently
degrading.

Add Invoke-GitHubApi helper that wraps gh api with error handling.
Add gh CLI availability and auth check at script startup.
Update SKILL.md to document gh CLI dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
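
The shape of a helper like `Invoke-GitHubApi`, plus the startup check, can be sketched in Python as below. This is an assumption-laden illustration rather than the script's code: the function and exception names are invented for the example, and the `run` and `which` parameters exist only so the logic can be exercised without the real `gh` binary. The `gh api` and `gh auth status` commands themselves are real GitHub CLI commands.

```python
import json
import shutil
import subprocess


class GitHubApiError(RuntimeError):
    """Raised when the gh CLI is missing, unauthenticated, or a call fails."""


def invoke_github_api(endpoint: str, run=subprocess.run):
    """Call `gh api <endpoint>` and return the parsed JSON response."""
    result = run(["gh", "api", endpoint], capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the failure instead of silently returning partial data.
        raise GitHubApiError(f"gh api {endpoint} failed: {result.stderr.strip()}")
    return json.loads(result.stdout)


def ensure_gh_ready(which=shutil.which, run=subprocess.run) -> None:
    """Fail fast if gh is not installed or not authenticated."""
    if which("gh") is None:
        raise GitHubApiError("gh CLI not found on PATH")
    status = run(["gh", "auth", "status"], capture_output=True, text=True)
    if status.returncode != 0:
        raise GitHubApiError("gh CLI is not authenticated; run `gh auth login`")
```

Raising on a non-zero exit code is what turns a rate-limit or auth problem into a visible error rather than an empty result.
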
- Add Sanitize-ErrorString to both check-pipeline-health.ps1 and
  check-vmr-codeflow.ps1 to strip control characters and truncate
  error messages to 500 chars, preventing broken JSON output
- Replace verbose stages/allJobs arrays with compact jobSummary
  counts in VMR script, reducing output from 173KB to ~84KB
- Cap upstream PR detail fetching at 30 per codeflow PR; remaining
  PR numbers are listed in additionalUpstreamPRNumbers array
- Increase ConvertTo-Json depth from 8 to 10 to avoid truncation
- Update SKILL.md with guidance on script timing (use initial_wait
  of 120s+) and note that JSON is now safe to parse directly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
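
The sanitizing and capping behavior described in this commit can be illustrated with a short Python sketch. The 500-character limit, the cap of 30, and the `additionalUpstreamPRNumbers` key come from the commit message; replacing control characters with spaces (rather than dropping them), the `detailed` key, and the function names are assumptions.

```python
import re

MAX_ERROR_LENGTH = 500      # from the commit message
MAX_UPSTREAM_DETAILS = 30   # from the commit message

# ASCII control characters, including ESC, CR, LF, and DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_error_string(message: str, limit: int = MAX_ERROR_LENGTH) -> str:
    """Replace control characters with spaces, trim, and truncate."""
    cleaned = _CONTROL_CHARS.sub(" ", message).strip()
    return cleaned[:limit]


def split_upstream_prs(numbers: list[int], cap: int = MAX_UPSTREAM_DETAILS) -> dict:
    """Keep the first `cap` PRs for detail fetching; list the rest by number."""
    return {
        "detailed": numbers[:cap],  # hypothetical key name
        "additionalUpstreamPRNumbers": numbers[cap:],
    }
```

Both helpers bound the size of the output, which is the point of the change: a single noisy log line or a very large codeflow PR can no longer inflate or break the JSON.
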
Member

@JanKrivanek JanKrivanek left a comment


:shipit:

@JanProvaznik JanProvaznik enabled auto-merge (squash) March 5, 2026 12:13
@JanProvaznik JanProvaznik merged commit 399d597 into dotnet:main Mar 5, 2026
10 checks passed