Add pr-triage skill for merge-readiness analysis #125005

Closed
danmoseley wants to merge 14 commits into dotnet:main from danmoseley:pr-triage-skill

danmoseley (Member) commented Feb 28, 2026

Add pr-triage skill for merge-readiness analysis

A Copilot skill that analyzes open PRs and ranks them by how close they are to merging. It identifies the next action for each PR and who owns that action.

How it works

  1. A PowerShell script (Get-PrTriageData.ps1) does all the mechanical work: batched GraphQL queries, Build Analysis extraction, review/thread parsing, 12-dimension scoring, next-action determination, and "who" identification
  2. The LLM's job is just: parse user request → build script flags → run script → format the JSON output as a readable table with a few observations
  3. Results are cached locally in SQL so follow-up questions ("which community PRs need review?", "who has the most open PRs?") are answered instantly without re-running queries

Performance: ~80 seconds for the full repo (300+ PRs). Single areas take 10-20 seconds. Once loaded into SQL, follow-up queries are near-instant.

Example outputs

These gists show what the skill produces for different queries (all generated from a single ~80-second scan):

Repo-wide:

Per-person ("what should X look at next?"):

Per-area:

Cross-repo:

Scoring dimensions (0–10 composite)

| Weight | Dimension | What it measures |
|---|---|---|
| 3.0 | CI (Build Analysis) | Hard blocker — can't merge if BA is red |
| 3.0 | Conflicts | Hard blocker — unmergeable |
| 3.0 | Maintainer Review | Hard blocker — requires owner/triager approval |
| 2.0 | Feedback | Unresolved review threads |
| 2.0 | Approval Strength | Who approved: area owner > triager > contributor |
| 1.5 | Staleness | Days since last update |
| 1.5 | Discussion Complexity | Thread count and distinct commenters |
| 1.0 | Alignment | Has area label, not untriaged |
| 1.0 | Freshness | Recent activity |
| 1.0 | Size | Smaller = easier to review |
| 0.5 | Community | Flags community PRs for visibility |
| 0.5 | Velocity | Review momentum |
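
A minimal sketch of how the weights in the table could combine into the 0-10 composite, assuming each dimension is first scored between 0.0 and 1.0 and the weighted sum is normalized by the total weight. The function name `Get-CompositeScore` and the hashtable keys are hypothetical and not taken from the script.

```powershell
# Weights copied from the table above; the key names are illustrative only.
$weights = [ordered]@{
    CI = 3.0; Conflicts = 3.0; MaintainerReview = 3.0
    Feedback = 2.0; ApprovalStrength = 2.0
    Staleness = 1.5; DiscussionComplexity = 1.5
    Alignment = 1.0; Freshness = 1.0; Size = 1.0
    Community = 0.5; Velocity = 0.5
}

# Hypothetical helper: weighted sum of per-dimension scores, scaled to 0-10.
function Get-CompositeScore {
    param(
        [hashtable]$Scores,
        [System.Collections.IDictionary]$Weights
    )
    $raw = 0.0
    $max = 0.0
    foreach ($name in $Weights.Keys) {
        $w = [double]$Weights[$name]
        $max += $w
        # A dimension missing from $Scores contributes 0.
        if ($Scores.ContainsKey($name)) { $raw += [double]$Scores[$name] * $w }
    }
    [Math]::Round(($raw / $max) * 10, 1)
}

# Usage: a PR that is perfect on every dimension, and the same PR with red CI.
$perfect = @{}
foreach ($name in $weights.Keys) { $perfect[$name] = 1.0 }
$redCi = $perfect.Clone()
$redCi['CI'] = 0.0

Get-CompositeScore -Scores $perfect -Weights $weights   # 10
Get-CompositeScore -Scores $redCi -Weights $weights     # 8.5
```

Normalizing by the sum of the weights, rather than a hardcoded maximum, keeps the scale at 0-10 if a dimension is added or reweighted later.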

18 filter parameters

What's in this PR

Commit 1: Sync ci-analysis skill from lewing/agent-plugins (no modifications)

Commit 2: New pr-triage skill

  • SKILL.md — skill definition with parameter reference, workflow steps, formatting guidance
  • scripts/Get-PrTriageData.ps1 — the scoring script (~520 lines)
  • references/merge-readiness-rubric.md — per-dimension scoring details
  • references/runtime-signals.md — labels, area owners, approval authority levels
  • references/batch-analysis-workflow.md — batching and rate limiting guidance

Limitations

  • Area owners are parsed from docs/area-owners.md at runtime (138 areas). If the file is unavailable, the script still runs but shows generic "area owner" instead of specific names
  • For bot-authored PRs (copilot-swe-agent), the "who" field shows the human who triggered the bot (from assignees). If no assignee is found, it falls back to the area owner
  • The script targets dotnet/runtime specifically (Build Analysis step, area labels, reviewer lists). It works on other repos for basic scoring but won't identify area owners
  • GraphQL fetches up to 100 check contexts per PR — PRs with >100 checks may miss Build Analysis
  • The "who" field doesn't currently surface PRs where someone is a requested reviewer but not yet the next-action owner

Copilot AI review requested due to automatic review settings February 28, 2026 08:23
@github-actions github-actions bot added the area-skills Agent Skills label Feb 28, 2026
danmoseley and others added 2 commits February 28, 2026 01:27
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PowerShell script (Get-PrTriageData.ps1) fetches and scores open PRs using
batched GraphQL. Analyzes 300+ PRs in ~80 seconds with 12 scoring dimensions
(0-10 scale), identifies next action and responsible person for each PR.

Features:
- 18 filter parameters (label, author, community, age, score, etc.)
- Build Analysis extraction for CI status
- Discussion complexity scoring (threads, commenters)
- Area owner and reviewer authority detection
- CSV and JSON output modes for SQL caching
- Time estimate guidance for users
- Optional sentiment check for top PRs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new pr-triage skill (PowerShell + reference docs) to rank open PRs by merge readiness and identify next actions/owners. It also includes a separate change in System.Text.RegularExpressions to precompute/vectorize set matching and adds a regression test.

Changes:

  • Add pr-triage skill docs, references, and Get-PrTriageData.ps1 to batch-fetch PR metadata via GraphQL and compute merge-readiness scores.
  • Update ci-analysis skill documentation to clarify workflow, tooling, and investigation guidance.
  • Optimize regex interpreter Set opcodes via precomputed SearchValues<char> and add a regression test for malformed “set-like” literals.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.Match.Tests.cs | Adds regression coverage for a previously crashing “set-like” literal. |
| src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreterCode.cs | Adds precomputed set matchers for vectorized Set opcode scanning. |
| src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreter.cs | Uses the new precomputed set matchers in Setrep/Setloop fast paths. |
| .github/skills/pr-triage/scripts/Get-PrTriageData.ps1 | Implements PR listing, batched GraphQL fetch, scoring, and next-action attribution. |
| .github/skills/pr-triage/SKILL.md | Defines the skill contract, usage guidance, and output format expectations. |
| .github/skills/pr-triage/references/runtime-signals.md | Documents runtime-specific labels/signals and ownership/approval heuristics. |
| .github/skills/pr-triage/references/merge-readiness-rubric.md | Provides a detailed scoring rubric and dimension definitions. |
| .github/skills/pr-triage/references/batch-analysis-workflow.md | Documents intended batching/rate-limiting workflow and caching approach. |
| .github/skills/ci-analysis/SKILL.md | Refines ci-analysis skill description, workflow, and anti-patterns. |
| .github/skills/ci-analysis/references/script-modes.md | Documents script modes/parameters for CI analysis. |
| .github/skills/ci-analysis/references/recommendation-generation.md | Documents how to synthesize recommendations from JSON output. |
| .github/skills/ci-analysis/references/manual-investigation.md | Adds guidance for remote Helix log/artifact searching. |
| .github/skills/ci-analysis/references/failure-interpretation.md | Adds/clarifies CI failure interpretation and recovery guidance. |
| .github/skills/ci-analysis/references/delegation-patterns.md | Updates delegation guidance around TRX/testResults discovery. |
| .github/skills/ci-analysis/references/azdo-helix-reference.md | Clarifies AzDO org selection based on build URL. |
| .github/skills/ci-analysis/references/analysis-workflow.md | Adds a concise “steps 1–3” workflow reference. |
Comments suppressed due to low confidence (11)

.github/skills/pr-triage/references/batch-analysis-workflow.md:40

  • This document contradicts itself about app/copilot-swe-agent: the filter table says to exclude it as a bot (line 18), but the default exclusions section says to include it (line 38). Align these statements (and the script behavior) so users know whether copilot-swe-agent PRs are scanned by default.
| Limit N (default 20) | "triage top 10 PRs" | `--limit N` |
| Area label | "triage area-System.Net PRs" | `--label area-System.Net` |
| Author | "triage PRs by stephentoub" | `--author stephentoub` |
| Exclude drafts (default on) | — | Filter `isDraft: false` |
| Exclude bots (default on) | — | Filter out `app/dotnet-maestro`, `app/copilot-swe-agent` |
| Exclude `needs-author-action` | "show PRs ready for maintainer" | Filter out label |
| Exclude `no-recent-activity` (default on) | — | Filter out label |
| Recently updated | "triage PRs updated this week" | `--sort updated` + date filter |
| Date range | "triage PRs opened in last 30 days" | Filter by `createdAt` or `updatedAt` |
| Community PRs | "triage community contributions" | `--label community-contribution` |
| Has maintainer approval | "show approved PRs" | Pass 2 filter: check reviews |
| Green Build Analysis | "show PRs with passing CI" | Pass 2 filter: check Build Analysis |
| Small/localized | "triage simple PRs" | Pass 2 filter: changedFiles, additions, deletions |

## Pass 1 — Quick-Screen

```bash
gh pr list --repo dotnet/runtime --state open --limit 100 \
  --json number,title,author,labels,isDraft,mergeable,createdAt,updatedAt,changedFiles,additions,deletions
```

Apply default exclusions:

  1. Remove isDraft: true
  2. Remove bot authors (app/dotnet-maestro) — but include app/copilot-swe-agent (invoked by maintainers, treat as normal PRs)
  3. Remove PRs with needs-author-action label (unless user explicitly includes them)
  4. Remove PRs with no-recent-activity label (unless user explicitly includes them)

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:36

  • The help text says `-Limit` defaults to 100 (line 10), but the parameter default is 500 (line 36). This mismatch will confuse callers; update either the comment or the default value so they agree.

.PARAMETER Limit
Maximum PRs to return from gh pr list (default 100)
.PARAMETER Repo
Repository (default "dotnet/runtime")
.EXAMPLE
.\Get-PrTriageData.ps1 -Label "area-CodeGen-coreclr"
#>
param(
[string]$Label,
[string]$Author,
[string]$Assignee,
[switch]$Community,
[int]$MinAge,
[int]$MaxAge,
[int]$UpdatedWithin,
[int]$MinApprovals,
[double]$MinScore,
[string]$HasLabel,
[string]$ExcludeLabel,
[switch]$IncludeDrafts,
[switch]$ExcludeCopilot,
[switch]$IncludeNeedsAuthor,
[switch]$IncludeStale,
[string]$MyActions,
[string]$NextAction,
[string]$PrNumber,
[int]$Top = 0,
[int]$Limit = 500,

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:57

  • Area owner parsing filters out any `@dotnet/<team>` mentions (`Where-Object { $_ -notmatch '^dotnet/' }`). In dotnet/runtime's `docs/area-owners.md`, many areas list owners as GitHub teams (e.g., `@dotnet/ncl`, `@dotnet/area-*`), so this will drop the primary owners and can make maintainer-approval detection and `who` tagging inaccurate. Consider retaining team handles for tagging and/or handling team-only rows explicitly (e.g., keep teams in the owner list instead of discarding them).
    if ($line -match '^\|\s*(area-\S+)\s*\|\s*@(\S+)\s*\|\s*(.+?)\s*\|') {
        $areaName = $matches[1].Trim()
        $lead = $matches[2].Trim()
        $ownerField = $matches[3].Trim()
        $people = @([regex]::Matches($ownerField, '@(\S+)') | ForEach-Object { $_.Groups[1].Value } |
            Where-Object { $_ -notmatch '^dotnet/' })
        if ($people.Count -eq 0) { $people = @($lead) }
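
A minimal sketch of the suggestion above: parse one owners-table row and keep team handles in a separate list instead of discarding them. The function name `ConvertFrom-AreaOwnerRow`, the returned property names, and the sample row are hypothetical; the row pattern is the one quoted from the script, while the handle pattern `@([\w/-]+)` is a narrower assumption chosen here so trailing punctuation is not captured.

```powershell
# Hypothetical helper: split an owners-table row into lead, people, and teams.
function ConvertFrom-AreaOwnerRow {
    param([string]$Line)

    # Same row shape the script matches: | area-X | @lead | owners... |
    if ($Line -notmatch '^\|\s*(area-\S+)\s*\|\s*@(\S+)\s*\|\s*(.+?)\s*\|') {
        return $null
    }
    $area = $Matches[1].Trim()
    $lead = $Matches[2].Trim()
    $ownerField = $Matches[3].Trim()

    # Collect every @handle; handles containing a slash are org/team mentions.
    $handles = @([regex]::Matches($ownerField, '@([\w/-]+)') |
        ForEach-Object { $_.Groups[1].Value })
    $teams  = @($handles | Where-Object { $_ -like '*/*' })
    $people = @($handles | Where-Object { $_ -notlike '*/*' })

    # Fall back to the lead only when the owners column names nobody at all.
    if ($handles.Count -eq 0) { $people = @($lead) }

    [pscustomobject]@{ Area = $area; Lead = $lead; People = $people; Teams = $teams }
}

# Usage with an invented row (not copied from docs/area-owners.md).
$row = ConvertFrom-AreaOwnerRow -Line '| area-Example | @lead-person | @example-org/example-team @some-person | notes |'
$row.Teams    # example-org/example-team
$row.People   # some-person
```

Keeping teams in their own list lets the caller decide whether to tag the team or expand it to members, which this sketch does not attempt.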

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:85

  • `$copilotAgent` is computed but never used, which is dead code and suggests an incomplete filter/reporting feature. Either remove it or use it (e.g., include in the scanned/excluded summary if intended).

$drafts = @($prsRaw | Where-Object { $_.isDraft })
$bots = @($prsRaw | Where-Object { -not $_.isDraft -and $_.author.login -match "^(app/)?dotnet-maestro|^(app/)?github-actions" })
$copilotAgent = @($prsRaw | Where-Object { -not $_.isDraft -and $_.author.login -match "copilot-swe-agent" })
$needsAuthor = @($prsRaw | Where-Object { -not $_.isDraft -and ($_.labels.name -contains "needs-author-action") })
$stale = @($prsRaw | Where-Object { -not $_.isDraft -and ($_.labels.name -contains "no-recent-activity") })

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:348

  • `next_action` selection checks `-not $hasAnyReview` before checking Build Analysis failure. That means a PR with red CI and no reviews will be classified as "Maintainer: review needed" even though CI is a hard blocker per the rubric. Consider prioritizing CI failure/pending ahead of review-needed so the suggested next step reflects merge blockers.
elseif (-not $hasAnyReview) {
    $prNextAction = "Maintainer: review needed"
    # Pick specific owners to tag: prefer owners who have reviewed similar PRs
    if ($prOwners.Count -gt 0) {
        $who = @($prOwners | Select-Object -First 2)
    } else {
        $who = @("area owner")
    }
}
elseif ($baConclusion -eq "FAILURE") {
    $prNextAction = "Author: fix CI failures"
    $who = @($pr.author.login)
}
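
A minimal sketch of the reordering suggested above, with the Build Analysis failure checked before the no-review case. Only the two branches quoted in the snippet are modeled; the function name `Get-NextAction`, its parameters, and the `$null` fallthrough are hypothetical stand-ins for the rest of the script's chain.

```powershell
# Hypothetical helper: pick the next action, hard blockers first.
function Get-NextAction {
    param(
        [string]$BaConclusion,      # e.g. "SUCCESS", "FAILURE", "IN_PROGRESS", "ABSENT"
        [bool]$HasAnyReview,
        [string]$Author,
        [string[]]$Owners = @()
    )

    # CI is a hard blocker, so it wins even when nobody has reviewed yet.
    if ($BaConclusion -eq 'FAILURE') {
        return [pscustomobject]@{ NextAction = 'Author: fix CI failures'; Who = @($Author) }
    }

    if (-not $HasAnyReview) {
        # Tag at most two owners; fall back to a generic label when none are known.
        $who = @('area owner')
        if ($Owners.Count -gt 0) { $who = @($Owners | Select-Object -First 2) }
        return [pscustomobject]@{ NextAction = 'Maintainer: review needed'; Who = $who }
    }

    # Remaining branches (unresolved threads, stale, ready to merge, ...) are omitted here.
    [pscustomobject]@{ NextAction = $null; Who = @() }
}

# Usage: red CI with no reviews now points at the author, not the maintainer.
$redNoReview = Get-NextAction -BaConclusion 'FAILURE' -HasAnyReview $false -Author 'octocat' -Owners 'owner-a', 'owner-b', 'owner-c'
$redNoReview.NextAction     # Author: fix CI failures

$greenNoReview = Get-NextAction -BaConclusion 'SUCCESS' -HasAnyReview $false -Author 'octocat' -Owners 'owner-a', 'owner-b', 'owner-c'
$greenNoReview.NextAction   # Maintainer: review needed
$greenNoReview.Who          # owner-a, owner-b
```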

.github/skills/pr-triage/SKILL.md:305

  • This section says the script scores **11** dimensions, but the implementation includes an additional "Discussion complexity" dimension (`$discussionScore * 1.5`) in the composite. Update the documentation to match the actual dimensions/weights used so users can interpret scores correctly.

Score Dimensions (0-10 scale)

The script scores 11 dimensions with weighted composite (see rubric):

| Weight | Dimension | What it measures |
|---|---|---|
| 3.0 | CI (Build Analysis) | Hard blocker — can't merge if BA is red |
| 3.0 | Conflicts | Hard blocker — unmergeable |
| 3.0 | Maintainer Review | Hard blocker — runtime requires owner/triager approval |
| 2.0 | Feedback | Unresolved review threads |
| 2.0 | Approval Strength | Who approved: area owner > triager > community |
| 1.5 | Staleness | Days since last update |
| 1.0 | Alignment | Has area label, not untriaged |
| 1.0 | Freshness | Recent activity |
| 1.0 | Size | Smaller = easier to review |
| 0.5 | Community | Flags community PRs for visibility — they need different handling |
| 0.5 | Velocity | Review momentum |

.github/skills/pr-triage/references/runtime-signals.md:17

  • The label taxonomy says `community-contribution` should "Boost priority", but the scoring rubric (and the script) treat community PRs as lower-scoring / flagged rather than boosted. Reword this triage impact to match the rubric (e.g., "Flag for visibility / consider prioritizing") so the guidance is consistent.

| needs-author-action | Reviewer requested changes, waiting on author | Maintainers, or auto when review has Changes_requested | ❌ Blocking — author must act |
| needs-further-triage | Author responded to needs-author-action, needs maintainer re-review | Auto (when author comments on non-untriaged issue) | ⚠️ Maintainer should re-engage |
| untriaged | PR hasn't been categorized by area yet | Auto on creation | ❌ Missing area routing |
| no-recent-activity | No activity for 14 days | Auto (resourceManagement.yml) | ⚠️ Will auto-close in 14 more days |
| backlog-cleanup-candidate | Issue inactive for 1644 days | Auto | Ignore for PR triage |
| community-contribution | PR from external contributor | Maintainers | Boost priority — retain contributors |
| area-* (e.g., area-System.Net) | Component area label | Maintainers/auto | Used to find area owners |

.github/skills/pr-triage/SKILL.md:24

  • The PR title/description are about adding the `pr-triage` skill, but this PR also changes the regex engine and adds a new regression test under `System.Text.RegularExpressions`. That looks like an unrelated product change and will make the PR harder to review/land; consider splitting the regex fix into a separate PR (or updating the PR description to explicitly cover it).

PR Triage Skill

Identify which pull requests in dotnet/runtime are ready to merge and determine the next action for each PR. This is a read-only analysis skill — it never modifies PRs, issues, labels, or comments.

Community contributions are flagged so maintainers can make informed prioritization
decisions. Timely feedback — even a quick "not right now" — respects contributors' time.

Architecture

All data fetching and scoring is done by a PowerShell script. Your job as the AI is:

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:288

  • `$feedbackScore` only considers unresolved review threads. If `-IncludeNeedsAuthor` is used, PRs with the `needs-author-action` label can be included but still score as fully addressing feedback when threads are resolved/absent, which contradicts the label meaning (blocking). Consider incorporating the label into feedback scoring (e.g., `needs-author-action` => 0.0) and/or into `next_action` selection.
# Labels
$isCommunity = $labelNames -contains "community-contribution"
$hasAreaLabel = ($labelNames | Where-Object { $_ -match "^area-" }).Count -gt 0
$isUntriaged = $labelNames -contains "untriaged"

# Dates
$updatedAt = [DateTime]::Parse($pr.updatedAt)
$createdAt = [DateTime]::Parse($pr.createdAt)
$daysSinceUpdate = ($now - $updatedAt).TotalDays
$ageInDays = ($now - $createdAt).TotalDays

# Check counts
$passed = @($checks | Where-Object { $_.conclusion -eq "SUCCESS" }).Count
$failed = @($checks | Where-Object { $_.conclusion -eq "FAILURE" }).Count
$running = @($checks | Where-Object { $_.status -eq "IN_PROGRESS" -or $_.status -eq "QUEUED" }).Count

# --- DIMENSION SCORING ---
$ciScore = switch ($baConclusion) { "SUCCESS" { 1.0 } "ABSENT" { 0.5 } "IN_PROGRESS" { 0.5 } default { 0.0 } }
$stalenessScore = if ($daysSinceUpdate -le 3) { 1.0 } elseif ($daysSinceUpdate -le 14) { 0.5 } else { 0.0 }
$maintScore = if ($hasOwnerApproval) { 1.0 } elseif ($hasTriagerApproval) { 0.75 } elseif ($hasAnyReview) { 0.5 } else { 0.0 }
$feedbackScore = if ($unresolvedThreads -eq 0) { 1.0 } else { 0.5 }
$conflictScore = switch ($pr.mergeable) { "MERGEABLE" { 1.0 } "UNKNOWN" { 0.5 } "CONFLICTING" { 0.0 } default { 0.5 } }

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:314

  • `$alignScore` is capped at 0.5 when aligned, but the composite normalization assumes a max weight sum of 20.0 (which implies alignment can contribute 1.0). As written, no PR can reach the documented maximum score because alignment never hits 1.0. Either make aligned PRs score 1.0 (when they have an area label and are not untriaged) or adjust `$rawMax`/weights to match the implemented scoring.
$alignScore = if ($isUntriaged -or -not $hasAreaLabel) { 0.0 } else { 0.5 }
$freshScore = if ($daysSinceUpdate -le 14) { 1.0 } elseif ($daysSinceUpdate -le 30) { 0.5 } else { 0.0 }
$totalLines = $pr.additions + $pr.deletions
$sizeScore = if ($pr.changedFiles -le 5 -and $totalLines -le 200) { 1.0 } elseif ($pr.changedFiles -le 20 -and $totalLines -le 500) { 0.5 } else { 0.0 }
$communityScore = if ($isCommunity) { 0.5 } else { 1.0 }
$approvalScore = if ($approvalCount -ge 2 -and $hasOwnerApproval) { 1.0 }
                 elseif ($hasOwnerApproval -or ($hasTriagerApproval -and $approvalCount -ge 2)) { 0.75 }
                 elseif ($hasTriagerApproval -or $approvalCount -ge 2) { 0.5 }
                 elseif ($approvalCount -ge 1) { 0.5 }
                 else { 0.0 }
$velocityScore = if ($reviews.Count -eq 0) { if ($ageInDays -le 14) { 0.5 } else { 0.0 } }
                 elseif ($daysSinceUpdate -le 7) { 1.0 } elseif ($daysSinceUpdate -le 14) { 0.5 } else { 0.0 }
# Discussion complexity: many threads/commenters = harder to push forward
# Light (≤5 threads, ≤2 commenters) = 1.0, moderate = 0.5, heavy (>15 threads or >5 commenters) = 0.0
$discussionScore = if ($totalThreads -le 5 -and $distinctCommenters -le 2) { 1.0 }
                   elseif ($totalThreads -le 15 -and $distinctCommenters -le 5) { 0.5 }
                   else { 0.0 }

# Composite: weighted sum normalized to 0-10 scale
$rawMax = 20.0
$rawScore = ($ciScore * 3) + ($conflictScore * 3) + ($maintScore * 3) +
    ($feedbackScore * 2) + ($approvalScore * 2) + ($stalenessScore * 1.5) +
    ($discussionScore * 1.5) +
    ($alignScore * 1) + ($freshScore * 1) + ($sizeScore * 1) +
    ($communityScore * 0.5) + ($velocityScore * 0.5)
$composite = [Math]::Round(($rawScore / $rawMax) * 10, 1)

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:521

  • The `-OutputCsv` path writes tab-separated data that includes untrusted PR metadata such as `title`, `who`, `next_action`, `blockers`, and `why` directly into cells without sanitization. If a PR title or other field begins with spreadsheet metacharacters (e.g., `=`, `+`, `-`, `@`), opening this TSV in Excel or similar tools can trigger formula execution (CSV/TSV injection), allowing data exfiltration or command execution via spreadsheet functions. To mitigate this, sanitize or escape these fields before emitting them for spreadsheet import (for example, by prefixing values that start with these characters with a safe character or otherwise neutralizing formula evaluation).
foreach ($r in $results) {
    $t = ($r.title -replace "`t"," ").Substring(0, [Math]::Min(70, $r.title.Length))
    $lines += "$($r.number)`t$t`t$($r.author)`t$($r.score)`t$($r.ci)`t$($r.ci_detail)`t$($r.unresolved_threads)`t$($r.total_threads)`t$($r.distinct_commenters)`t$($r.mergeable)`t$($r.approval_count)`t$(if ($r.is_community) {1} else {0})`t$($r.age_days)`t$($r.days_since_update)`t$($r.changed_files)`t$($r.lines_changed)`t$($r.next_action)`t$($r.who)`t$($r.blockers)`t$($r.why)"
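
A minimal sketch of the mitigation described above, assuming an apostrophe prefix is an acceptable way to neutralize formula-leading characters for the tools that will open the file. The function name `ConvertTo-SafeTsvField` and its `-MaxLength` parameter are hypothetical, and whether the prefix is sufficient depends on the consuming spreadsheet application.

```powershell
# Hypothetical helper: make one text value safe to emit as a single TSV cell.
function ConvertTo-SafeTsvField {
    param(
        [AllowNull()][string]$Value,
        [int]$MaxLength = 0          # 0 means no truncation
    )
    if ([string]::IsNullOrEmpty($Value)) { return '' }

    # Tabs and line breaks would split the cell or the row, so flatten them.
    $text = $Value -replace '[\t\r\n]+', ' '

    # Truncate using the length of the cleaned text, not the original.
    if ($MaxLength -gt 0 -and $text.Length -gt $MaxLength) {
        $text = $text.Substring(0, $MaxLength)
    }

    # A leading =, +, - or @ may be evaluated as a formula; prefix an apostrophe.
    if ($text -match '^[=+\-@]') { $text = "'" + $text }
    $text
}

# Usage with invented titles.
ConvertTo-SafeTsvField -Value '=HYPERLINK("http://example.invalid")' -MaxLength 70
# '=HYPERLINK("http://example.invalid")
ConvertTo-SafeTsvField -Value "Fix`tparser crash"
# Fix parser crash
```

This is meant for free-text fields such as `title`, `who`, `next_action`, `blockers`, and `why`; applying it to numeric columns would also prefix negative numbers.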

danmoseley and others added 2 commits February 28, 2026 01:41
- Fix #2: anchor bot-author regex to avoid false matches
- Fix #3: remove duplicate who assignment in unresolved threads block
- Fix #4: guard totalComments against null reviewThreads
- Fix #5: add total_comments to CSV output
- Fix #6: remove unused copilotAgent variable
- Fix #7: cap screened-out drafts to 10 to reduce JSON bloat
- Fix #8: safer empty check after owner-approver intersection
- Fix #9: add Discussion Complexity to SKILL.md scoring table (12 dims)
- Fix #10: correct rawMax from 20.0 to 20.5
- Fix #11: update example JSON to match actual output structure
- Fix #13: rewrite batch-analysis-workflow.md for script-based architecture
- Fix #14: remove stale hardcoded owners from runtime-signals.md
- Fix #15: align rubric weight table with script's 12 dimensions
- Fix #16: fix community framing inconsistency in runtime-signals.md
- Fix alignScore: binary (has area label or not) instead of capping at 0.5

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- B: Help text and SKILL.md now say Limit default is 500 (matching param)
- F: Rubric bot-PR section: copilot-swe-agent included by default (not excluded)
- H: SKILL.md example uses -Top 20 instead of -Limit 20 for top-N queries

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 28, 2026 08:50
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (9)

.github/skills/pr-triage/references/runtime-signals.md:45

  • This reference claims a CODEOWNERS-based fallback for PRs without an area-* label, but the script currently doesn’t consult .github/CODEOWNERS (it falls back to a generic "area owner" string). Either implement the CODEOWNERS fallback in the script or adjust this documentation so it doesn’t promise behavior that isn’t present.
**Fallback**: If the PR has no area label, use `.github/CODEOWNERS` to match file paths to reviewers.

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:353

  • The stale/rebase recommendation runs before the "Ready to merge" condition, which can prevent already-mergeable PRs from being classified as ready. Consider evaluating the merge-ready condition before recommending "merge main" (or base that recommendation on build staleness rather than PR updatedAt).
    elseif ($daysSinceUpdate -gt 14) {
        $prNextAction = "Author: merge main (stale $([int]$daysSinceUpdate)d)"
        $who = @($pr.author.login)
    }
    elseif ($ciScore -eq 1 -and $conflictScore -eq 1 -and $maintScore -ge 0.75 -and $feedbackScore -eq 1) {

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:69

  • The community triager list is hardcoded even though docs/area-owners.md is already fetched at startup and contains an authoritative "Community Triagers" section. Parsing triagers from the fetched markdown would avoid drift when the list changes and would keep behavior aligned with the references.
$communityTriagers = @("a74nh","am11","clamp03","Clockwork-Muse","filipnavara",
    "huoyaoyuan","martincostello","omajid","Sergio0694","shushanhf",
    "SingleAccretion","teo-tsirpanis","tmds","vcsjones","xoofx")

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:137

  • The GraphQL fragment requests reviews(last:10). On PRs with many review rounds, this can miss relevant approvals/changes-requested reviews and skew scoring/next-action ownership. Consider using latestReviews (one per reviewer) or increasing the window to better reflect current review state.
$fragment = 'number comments{totalCount} reviews(last:10){nodes{author{login}state}} reviewThreads(first:50){nodes{isResolved comments(first:5){nodes{author{login}}}}} commits(last:1){nodes{commit{statusCheckRollup{contexts(first:100){nodes{...on CheckRun{name conclusion status}}}}}}}'

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:244

  • Discussion metrics are computed from reviewThreads(... comments(first:5) ...), so total_comments/distinct_commenters can be undercounted for threads with >5 comments. If these are intended to be accurate (not sampled), consider querying per-thread comments { totalCount } (and reviewThreads { totalCount }) and only sampling comment nodes for author attribution.
    $threadCommentSum = if ($gql -and $gql.reviewThreads.nodes) {
        ($gql.reviewThreads.nodes | ForEach-Object { $_.comments.nodes.Count } | Measure-Object -Sum).Sum
    } else { 0 }

.github/skills/pr-triage/references/runtime-signals.md:58

  • This section implies community triagers are detected by parsing docs/area-owners.md, but the script currently uses a hardcoded triager list. To avoid drift, consider either parsing the "Community Triagers" section from the fetched markdown or updating this reference to reflect the current implementation.
| **1. Area owner/lead** | Listed in Owners/Lead column of `docs/area-owners.md` for the PR's area label | Parse area-owners.md, match reviewer login against Owners column | Has merge authority for the area |
| **2. Community triager** | Trusted community members with triage permissions, listed in area-owners.md | Parse the "Community Triagers" section at the bottom of `docs/area-owners.md` | Strong signal — deeply familiar with the repo. Does not have merge authority but review is highly valued. |
| **3. Frequent contributor** | Has many merged PRs or commits in the touched area | `gh api "repos/dotnet/runtime/commits?author={login}&path={dir}&per_page=5"` — 3+ hits = frequent | Valuable domain expertise. Weight 1× (standard). |
| **4. New contributor** | First-time or infrequent contributor to this area | No commits found in touched paths | Review appreciated — every contributor's feedback helps improve the PR. Weight 0.5×. |

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:518

  • -OutputCsv emits tab-separated output (TSV) rather than comma-separated CSV. This makes the parameter name misleading for callers. Consider either outputting real CSV or renaming the switch (and updating docs) to reflect TSV/tab-separated output.
if ($OutputCsv) {
    # Tab-separated output for easy SQL/spreadsheet import
    $header = "number`ttitle`tauthor`tscore`tci`tci_detail`tunresolved_threads`ttotal_threads`ttotal_comments`tdistinct_commenters`tmergeable`tapproval_count`tis_community`tage_days`tdays_since_update`tchanged_files`tlines_changed`tnext_action`twho`tblockers`twhy"

.github/skills/ci-analysis/references/manual-investigation.md:41

  • This new section adds content that isn’t reflected in the document’s Table of Contents at the top, so the TOC is now incomplete. Consider adding a TOC entry for "Search Helix Logs and Artifacts Remotely" to keep navigation accurate.
## Search Helix Logs and Artifacts Remotely

> 💡 **Prefer remote search over download.** `hlx_search_log` and `hlx_search_file` let you find errors in Helix console logs and uploaded files without downloading them first. Use these before falling back to `hlx_logs` (full log) or `hlx_download` (file download).

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:522

  • The -OutputCsv path writes user-controlled PR titles directly into a tab-separated file without escaping, which enables CSV/TSV formula injection when the file is opened in Excel or similar tools. A malicious contributor could craft a PR title beginning with =, +, -, or @ so that, when imported into a spreadsheet, it executes a formula (e.g., making external network requests and exfiltrating data from the maintainer’s environment). Before appending fields to $lines, ensure you neutralize or escape formula-leading characters in text fields (such as title, next_action, who, blockers, and why) so that spreadsheet applications always treat these values as plain text rather than formulas.
        $t = ($r.title -replace "`t"," ").Substring(0, [Math]::Min(70, $r.title.Length))
        $lines += "$($r.number)`t$t`t$($r.author)`t$($r.score)`t$($r.ci)`t$($r.ci_detail)`t$($r.unresolved_threads)`t$($r.total_threads)`t$($r.total_comments)`t$($r.distinct_commenters)`t$($r.mergeable)`t$($r.approval_count)`t$(if ($r.is_community) {1} else {0})`t$($r.age_days)`t$($r.days_since_update)`t$($r.changed_files)`t$($r.lines_changed)`t$($r.next_action)`t$($r.who)`t$($r.blockers)`t$($r.why)"

danmoseley and others added 2 commits February 28, 2026 01:57
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Why column uses emoji + plain English (e.g. '✅ CI passed · 👍 owner approved · 📦 small change')
- Community authors prefixed with 🌐 globe
- CI falls back to overall check results when Build Analysis absent (non-runtime repos)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 28, 2026 09:11
- CI failure now checked before no-review in next_action chain (CI is a
  hard blocker that should surface first)
- needs-author-action label sets feedbackScore=0 and drives next_action
  when -IncludeNeedsAuthor is used
- Added needs-author-action to blockers list

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (13)

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:137

  • The batched GraphQL query limits reviews(last:10) and reviewThreads(first:50) (and thread comments first:5). For PRs with lots of activity, this can drop older approvals and/or unresolved threads, which will mis-score maintainer approval, thread counts, and next-action ownership. Consider raising these limits (within cost constraints), or at least documenting clearly in the output/refs that scoring is based on truncated data so consumers don’t treat it as authoritative.
# --- Step 3: Batched GraphQL (reviews, threads, Build Analysis, thread authors) ---
$fragment = 'number comments{totalCount} reviews(last:10){nodes{author{login}state}} reviewThreads(first:50){nodes{isResolved comments(first:5){nodes{author{login}}}}} commits(last:1){nodes{commit{statusCheckRollup{contexts(first:100){nodes{...on CheckRun{name conclusion status}}}}}}}'

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:517

  • quick_actions.needs_maintainer_review counts only entries whose next_action matches "review needed". This will miss PRs categorized as Maintainer: review/merge, which are also maintainer actions. Consider matching on ^Maintainer: (or counting both patterns) so the summary metrics align with the actual next_action values being produced.
        drafts_count = $drafts.Count
        drafts = @($drafts | Select-Object -First 10 | ForEach-Object { @{ number = $_.number; author = $_.author.login; title = $_.title.Substring(0, [Math]::Min(60, $_.title.Length)) } })
        bots = @($bots | ForEach-Object { @{ number = $_.number; author = $_.author.login } })
        needs_author_action = @($needsAuthor | ForEach-Object { @{ number = $_.number; author = $_.author.login } })
        stale = @($stale | ForEach-Object { @{ number = $_.number; author = $_.author.login } })

.github/skills/pr-triage/references/runtime-signals.md:43

  • This reference says "Do not rely on hardcoded owner lists" and that community triagers are parsed from docs/area-owners.md, but the script currently hardcodes the community triager list. Either update the script to parse the triagers from area-owners.md or adjust this doc to describe the actual behavior and its drift risk.
**Note**: The script parses `docs/area-owners.md` dynamically at startup to get
the full owner table (~138 areas). Do not rely on hardcoded owner lists.

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:138

  • The GraphQL query only requests ... on CheckRun { name conclusion status } for statusCheckRollup.contexts. contexts can also include StatusContext items; those come back as empty objects here and then get filtered out by Where-Object { $_.name }, which can incorrectly produce ABSENT/"No CI" (especially for non-runtime repos). Consider also requesting the StatusContext fields and normalizing them into the same $checks list (or request __typename and handle both types).
# --- Step 3: Batched GraphQL (reviews, threads, Build Analysis, thread authors) ---
$fragment = 'number comments{totalCount} reviews(last:10){nodes{author{login}state}} reviewThreads(first:50){nodes{isResolved comments(first:5){nodes{author{login}}}}} commits(last:1){nodes{commit{statusCheckRollup{contexts(first:100){nodes{...on CheckRun{name conclusion status}}}}}}}'
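A minimal sketch of the normalization this comment suggests. It assumes the query is extended to request `__typename` and `...on StatusContext{context state}`. The helper name `ConvertTo-NormalizedCheck` and the mapping of `state` onto `conclusion`/`status` are illustrative, not the script's code.

```powershell
# Hypothetical helper: fold CheckRun and StatusContext nodes into one shape.
# Assumes the fragment also requests __typename and StatusContext fields.
function ConvertTo-NormalizedCheck {
    param([object[]]$Nodes)
    foreach ($n in $Nodes) {
        switch ($n.__typename) {
            'CheckRun' {
                [pscustomobject]@{ name = $n.name; conclusion = $n.conclusion; status = $n.status }
            }
            'StatusContext' {
                # StatusContext exposes a single 'state' (SUCCESS, FAILURE, ERROR, PENDING, EXPECTED).
                $done = $n.state -ne 'PENDING' -and $n.state -ne 'EXPECTED'
                [pscustomobject]@{
                    name       = $n.context
                    conclusion = if ($done) { $n.state } else { $null }
                    status     = if ($done) { 'COMPLETED' } else { 'IN_PROGRESS' }
                }
            }
        }
    }
}

# Illustrative input only; real nodes would come from statusCheckRollup.contexts.
$sample = @(
    [pscustomobject]@{ __typename = 'CheckRun'; name = 'Build Analysis'; conclusion = 'SUCCESS'; status = 'COMPLETED' }
    [pscustomobject]@{ __typename = 'StatusContext'; context = 'license/cla'; state = 'SUCCESS' }
)
$checks = @(ConvertTo-NormalizedCheck -Nodes $sample)
```

With both node types carrying a `name`, the existing `Where-Object { $_.name }` filter would no longer drop status contexts.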

.github/skills/pr-triage/references/runtime-signals.md:45

  • The doc claims a fallback to .github/CODEOWNERS when a PR has no area label, but the current script behavior falls back to a generic "area owner" placeholder and doesn’t consult CODEOWNERS. Update the doc to avoid promising behavior that isn’t implemented, or implement CODEOWNERS-based fallback in the script.
**Fallback**: If the PR has no area label, use `.github/CODEOWNERS` to match file paths to reviewers.

.github/skills/pr-triage/references/merge-readiness-rubric.md:427

  • The weight table shown above sums to 20.0, but this section states a max raw score of 20.5 and normalizes using 20.5. This conflicts with the implemented formula in Get-PrTriageData.ps1 and will confuse consumers about what scores mean. Align the documented max/normalization factor with the actual weights (or vice versa).
**Max possible raw score**: 20.5. Normalized to a 0–10 scale: `(rawScore / 20.5) × 10`.

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:162

  • gh api graphql ... 2>&1 | ConvertFrom-Json merges stderr into stdout before JSON parsing. If gh emits any non-JSON error/warning text, ConvertFrom-Json will throw and the original error payload may be lost. Prefer capturing stdout only for JSON parsing and handling failures explicitly (e.g., check $LASTEXITCODE / wrap the call and surface $result.errors from the GraphQL response).
    $repoOwner = $repoParts[0]
    $repoName = $repoParts[1]
    $query = "{ repository(owner:`"$repoOwner`",name:`"$repoName`") { $($parts -join ' ') } }"
    $result = gh api graphql -f query="$query" 2>&1 | ConvertFrom-Json
    for ($i = 0; $i -lt $b.Count; $i++) {
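One possible shape for the explicit handling described above, as a sketch. `ConvertFrom-GraphQlResponse` is a hypothetical helper: it takes stdout text and the exit code separately, so stderr never reaches `ConvertFrom-Json`.

```powershell
# Hypothetical helper: parse a GraphQL response captured from stdout only.
# $ExitCode would come from $LASTEXITCODE right after 'gh api graphql ...'.
function ConvertFrom-GraphQlResponse {
    param([string]$StdOut, [int]$ExitCode)
    if ($ExitCode -ne 0) {
        throw "gh api graphql failed with exit code $ExitCode"
    }
    $result = $StdOut | ConvertFrom-Json
    if ($result.errors) {
        # Surface GraphQL-level errors instead of silently continuing.
        $messages = @($result.errors | ForEach-Object { $_.message }) -join '; '
        throw "GraphQL errors: $messages"
    }
    return $result
}

# Intended call site (no 2>&1, so stderr stays out of the JSON):
#   $raw = gh api graphql -f query="$query"
#   $result = ConvertFrom-GraphQlResponse -StdOut ($raw -join "`n") -ExitCode $LASTEXITCODE
$ok = ConvertFrom-GraphQlResponse -StdOut '{"data":{"repository":{"pr0":{"number":1}}}}' -ExitCode 0
```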

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:58

  • Area owner parsing drops any @dotnet/... team handles from the Owners column (Where-Object { $_ -notmatch '^dotnet/' }). In dotnet/runtime’s docs/area-owners.md, many areas list owners as teams (e.g., area-CodeGen-coreclr has owners @dotnet/jit-contrib), so this logic collapses ownership to just the lead and will miss legitimate maintainer approvals by other team members. Consider either (a) not discarding team handles and treating them as valid “who to tag” outputs, and/or (b) resolving team membership via GitHub API/GraphQL to properly recognize approvals from team members.
        if ($line -match '^\|\s*(area-\S+)\s*\|\s*@(\S+)\s*\|\s*(.+?)\s*\|') {
            $areaName = $matches[1].Trim()
            $lead = $matches[2].Trim()
            $ownerField = $matches[3].Trim()
            $people = @([regex]::Matches($ownerField, '@(\S+)') | ForEach-Object { $_.Groups[1].Value } |
                Where-Object { $_ -notmatch '^dotnet/' })
            if ($people.Count -eq 0) { $people = @($lead) }
            $areaOwners[$areaName] = $people
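A sketch of option (a), keeping team handles instead of discarding them. `Get-AreaOwnerEntry` is a hypothetical helper and the sample row uses a made-up lead. Resolving team membership (option (b)) is not shown.

```powershell
# Hypothetical variant of the row parser: team handles (@dotnet/...) are kept
# in a separate list so they can still be reported as "who to tag".
function Get-AreaOwnerEntry {
    param([string]$Line)
    if (-not ($Line -match '^\|\s*(area-\S+)\s*\|\s*@(\S+)\s*\|\s*(.+?)\s*\|')) { return $null }
    $area = $Matches[1]
    $lead = $Matches[2]
    # Stop each handle at whitespace, comma, or semicolon.
    $handles = @([regex]::Matches($Matches[3], '@([^\s,;]+)') | ForEach-Object { $_.Groups[1].Value })
    $teams = @($handles | Where-Object { $_ -like 'dotnet/*' })
    $people = @($handles | Where-Object { $_ -notlike 'dotnet/*' })
    if ($people.Count -eq 0) { $people = @($lead) }
    [pscustomobject]@{ area = $area; lead = $lead; people = $people; teams = $teams }
}

# Illustrative row; '@exampleLead' is not a real entry from area-owners.md.
$entry = Get-AreaOwnerEntry -Line '| area-CodeGen-coreclr | @exampleLead | @dotnet/jit-contrib | notes |'
```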

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:323

  • $rawMax is set to 20.5, but the weights actually summed in $rawScore total 20.0 (3+3+3+2+2+1.5+1.5+1+1+1+0.5+0.5). This under-scales the displayed 0–10 score and makes -MinScore filtering inconsistent with the intended rubric. Either correct $rawMax to match the implemented weights, or adjust the weights/formula so the declared max is accurate.

    # Composite: weighted sum normalized to 0-10 scale
    $rawMax = 20.5
    $rawScore = ($ciScore * 3) + ($conflictScore * 3) + ($maintScore * 3) +
        ($feedbackScore * 2) + ($approvalScore * 2) + ($stalenessScore * 1.5) +
        ($discussionScore * 1.5) +
        ($alignScore * 1) + ($freshScore * 1) + ($sizeScore * 1) +
        ($communityScore * 0.5) + ($velocityScore * 0.5)
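One way to keep the declared maximum from drifting is to derive it from the weights, sketched below. The key names and `Get-CompositeScore` are illustrative. The weights themselves are the ones in the formula above.

```powershell
# Sketch: keep the weights in one table and derive the maximum from it.
$weights = [ordered]@{
    ci = 3; conflicts = 3; maintainer = 3
    feedback = 2; approval = 2
    staleness = 1.5; discussion = 1.5
    alignment = 1; freshness = 1; size = 1
    community = 0.5; velocity = 0.5
}
$rawMax = ($weights.Values | Measure-Object -Sum).Sum

# Hypothetical helper: $Scores maps the same keys to per-dimension scores in [0, 1].
function Get-CompositeScore {
    param([hashtable]$Scores, $Weights, [double]$RawMax)
    $raw = 0.0
    foreach ($k in $Weights.Keys) { $raw += $Scores[$k] * $Weights[$k] }
    [Math]::Round(($raw / $RawMax) * 10, 1)
}

# A PR that maxes out every dimension.
$perfect = @{}
foreach ($k in $weights.Keys) { $perfect[$k] = 1.0 }
$score = Get-CompositeScore -Scores $perfect -Weights $weights -RawMax $rawMax
```

The weights sum to 20, so a PR that maxes every dimension scores 10 here. Dividing by 20.5 instead would cap it at roughly 9.8.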

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:301

  • When -IncludeNeedsAuthor is used, PRs with the needs-author-action label are analyzed, but scoring/next-action logic doesn’t treat that label as a hard blocker (e.g., $feedbackScore is still 1.0 when there are no unresolved threads, and next_action can become maintainer-driven). To keep results accurate in include mode, incorporate the presence of needs-author-action into scoring (feedback dimension) and into next-action determination (prefer Author: address review feedback).
    # --- DIMENSION SCORING ---
    $ciScore = switch ($baConclusion) { "SUCCESS" { 1.0 } "ABSENT" { 0.5 } "IN_PROGRESS" { 0.5 } default { 0.0 } }
    $stalenessScore = if ($daysSinceUpdate -le 3) { 1.0 } elseif ($daysSinceUpdate -le 14) { 0.5 } else { 0.0 }
    $maintScore = if ($hasOwnerApproval) { 1.0 } elseif ($hasTriagerApproval) { 0.75 } elseif ($hasAnyReview) { 0.5 } else { 0.0 }
    $hasNeedsAuthorAction = $labelNames -contains "needs-author-action"
    $feedbackScore = if ($hasNeedsAuthorAction) { 0.0 } elseif ($unresolvedThreads -eq 0) { 1.0 } else { 0.5 }
    $conflictScore = switch ($pr.mergeable) { "MERGEABLE" { 1.0 } "UNKNOWN" { 0.5 } "CONFLICTING" { 0.0 } default { 0.5 } }
    $alignScore = if ($isUntriaged -or -not $hasAreaLabel) { 0.0 } else { 1.0 }
    $freshScore = if ($daysSinceUpdate -le 14) { 1.0 } elseif ($daysSinceUpdate -le 30) { 0.5 } else { 0.0 }
    $totalLines = $pr.additions + $pr.deletions

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:68

  • $communityTriagers is hardcoded, but the accompanying reference docs say community triagers should be parsed from docs/area-owners.md (and not hardcoded). Hardcoding will drift as the list changes and causes misclassification of maintainer/triager approvals. Consider parsing the “Community Triagers” section from docs/area-owners.md at startup (similar to how area owners are loaded) and falling back to the hardcoded list only if parsing fails.
$communityTriagers = @("a74nh","am11","clamp03","Clockwork-Muse","filipnavara",
    "huoyaoyuan","martincostello","omajid","Sergio0694","shushanhf",
    "SingleAccretion","teo-tsirpanis","tmds","vcsjones","xoofx")

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:47

  • $Repo is split and indexed as repoParts[0]/repoParts[1] without validating the format. If the caller passes an unexpected value (e.g., missing /), the script will throw an index error early (area-owners URL and GraphQL query). Consider validating $Repo up-front (must be owner/name) and emitting a clear error message before continuing.
$areaOwners = @{}
$repoParts = $Repo -split '/'
$areaOwnersUrl = "repos/$($repoParts[0])/$($repoParts[1])/contents/docs/area-owners.md"
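A sketch of up-front validation using `ValidatePattern`. `Split-RepoName` is a hypothetical helper, and the character class is an assumption about acceptable names rather than GitHub's exact rules.

```powershell
# Sketch: fail fast when -Repo is not in owner/name form.
function Split-RepoName {
    param(
        [Parameter(Mandatory)]
        [ValidatePattern('^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$')]
        [string]$Repo
    )
    # Exactly one slash is guaranteed by the pattern above.
    $owner, $name = $Repo -split '/', 2
    [pscustomobject]@{ Owner = $owner; Name = $name }
}

$parts = Split-RepoName -Repo 'dotnet/runtime'
```

Because the pattern rejects quotes and whitespace, the same check also addresses the concern about interpolating `-Repo` into REST paths and the GraphQL query.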

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:529

  • The -OutputCsv path writes untrusted PR titles directly into a tab-separated file that is intended for spreadsheet import. A malicious PR author can set a title beginning with =, +, -, or @ so that when the TSV is opened in Excel or similar tools it is interpreted as a formula, leading to potential CSV/Excel formula injection (for example, triggering outbound requests or DDE-based commands on the maintainer’s machine). To mitigate this, treat titles as plain text when exporting (e.g., escape or prefix titles that start with formula characters before writing them to the TSV).
        blocked_conflicts = @($results | Where-Object { $_.mergeable -eq "CONFLICTING" }).Count
    }
    prs = @($results)
    elapsed_seconds = [Math]::Round(((Get-Date) - $scriptStart).TotalSeconds, 1)
}

if ($OutputCsv) {

- Change Write-Host to Write-Verbose with [CmdletBinding()] so stdout
  stays clean JSON/TSV for machine parsing
- Remove emoji prefix from author field; is_community field already
  provides the signal, and the prefix broke -MyActions filtering
- Fix rawMax from 20.5 to 20.0 to match actual weight sum
- Guard TSV export against Excel formula injection for titles starting
  with =, +, -, or @

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (7)

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:129

  • This verbose summary always reports needs-author and stale counts as “excluded”, but those PRs may be included when -IncludeNeedsAuthor / -IncludeStale are set. This makes the diagnostic output misleading when those switches are used. Consider computing “excluded” counts based on the actual candidate filtering logic (or adjust the wording to avoid claiming exclusion unconditionally).
$excludedDrafts = if ($IncludeDrafts) { 0 } else { $drafts.Count }
$excludedBots = $bots.Count
Write-Verbose "Scanned $($prsRaw.Count) -> $($candidates.Count) candidates ($excludedDrafts drafts, $excludedBots bots, $($needsAuthor.Count) needs-author, $($stale.Count) stale excluded)"
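A small sketch of computing the excluded counts conditionally, mirroring how `$excludedDrafts` is already handled. `Get-ExcludedCount` is a hypothetical helper and the counts are stand-in values.

```powershell
# Sketch: report a category as excluded only when its -Include* switch is off.
function Get-ExcludedCount {
    param([int]$Count, [bool]$Included)
    if ($Included) { 0 } else { $Count }
}

# Stand-in values; in the script these would come from the parameters and PR lists.
$IncludeNeedsAuthor = $true
$IncludeStale = $false
$excludedNeedsAuthor = Get-ExcludedCount -Count 4 -Included $IncludeNeedsAuthor
$excludedStale = Get-ExcludedCount -Count 7 -Included $IncludeStale
$summary = "$excludedNeedsAuthor needs-author, $excludedStale stale excluded"
```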

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:537

  • The TSV/CSV output only guards against spreadsheet formula injection for title, but who will typically start with @ (and other fields could start with =, +, -, @ depending on future changes). To avoid CSV/TSV injection when users import into Excel/Sheets, apply the same leading-character escaping to all string fields written to the TSV (at least who, and ideally next_action, blockers, why, author).
    foreach ($r in $results) {
        $t = ($r.title -replace "`t"," ").Substring(0, [Math]::Min(70, $r.title.Length))
        if ($t.Length -gt 0 -and $t[0] -in '=','+','-','@') { $t = "'$t" }
        $lines += "$($r.number)`t$t`t$($r.author)`t$($r.score)`t$($r.ci)`t$($r.ci_detail)`t$($r.unresolved_threads)`t$($r.total_threads)`t$($r.total_comments)`t$($r.distinct_commenters)`t$($r.mergeable)`t$($r.approval_count)`t$(if ($r.is_community) {1} else {0})`t$($r.age_days)`t$($r.days_since_update)`t$($r.changed_files)`t$($r.lines_changed)`t$($r.next_action)`t$($r.who)`t$($r.blockers)`t$($r.why)"
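A sketch of applying the same guard to every string column. `ConvertTo-SafeTsvField` is a hypothetical helper; it reuses the apostrophe prefix from the title guard above and also strips tabs and newlines.

```powershell
# Hypothetical helper: make a string safe for a TSV cell that may be opened in a spreadsheet.
function ConvertTo-SafeTsvField {
    param([AllowNull()][object]$Value, [int]$MaxLength = 0)
    # Tabs and newlines would break the row shape, so collapse them to a space.
    $s = [string]$Value -replace "[`t`r`n]+", ' '
    if ($MaxLength -gt 0 -and $s.Length -gt $MaxLength) { $s = $s.Substring(0, $MaxLength) }
    # Prefix with an apostrophe so =, +, -, @ are read as text rather than a formula.
    if ($s.Length -gt 0 -and $s[0] -in '=', '+', '-', '@') { $s = "'" + $s }
    $s
}

$who = ConvertTo-SafeTsvField -Value '@EgorBo'
$title = ConvertTo-SafeTsvField -Value "=HYPERLINK(1)`tx" -MaxLength 70
```

This is meant for string columns only (`title`, `author`, `next_action`, `who`, `blockers`, `why`). Numeric columns would be written as-is.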

.github/skills/pr-triage/references/merge-readiness-rubric.md:427

  • The weight table above sums to 20.0 (3+3+3+2+2+1.5+1.5+1+1+1+0.5+0.5), but this section states a max raw score of 20.5 and uses 20.5 in the normalization formula. This will confuse readers and diverges from the script (which normalizes by 20.0). Recommend updating the max score and formula to match the implemented weights.
| # | Dimension | Weight | Rationale |
|---|-----------|--------|-----------|
| 1 | CI Status | 3 | Can't merge without green CI |
| 5 | Merge Conflicts | 3 | Can't merge with conflicts |
| 3 | Maintainer Review | 3 | Required for merge |
| 4 | Feedback Addressed | 2 | Blocks merge if outstanding |
| 12 | Approval Strength | 2 | Stronger approvals = closer to merge |
| 2 | Staleness | 1.5 | Days since last update |
| — | Discussion Complexity | 1.5 | Thread count and distinct commenters |
| 6 | Alignment | 1 | Organizational signal |
| 7 | Freshness | 1 | Stale PRs less likely to merge soon |
| 9 | PR Size / Complexity | 1 | Ease of review |
| 10 | Community Contribution Flag | 0.5 | Flags community PRs — typically more expensive to drive |
| 13 | Review Velocity | 0.5 | Momentum signal |

**Max possible raw score**: 20.5. Normalized to a 0–10 scale: `(rawScore / 20.5) × 10`.

.github/skills/pr-triage/references/merge-readiness-rubric.md:65

  • The rubric’s dimension #2 is titled “Build Staleness” and describes comparing build timestamps/SHA, but the later “as implemented in the script” table uses dimension #2 as “Staleness — Days since last update”. Since the batch script doesn’t currently compute build-age/SHA staleness, either rename/re-scope dimension #2 in the rubric to match the implementation, or clearly call out Build Staleness as a single-PR-only dimension that isn’t part of the batch composite.
### 2. Build Staleness

**How to check**

Compare `headRefOid` from `gh pr view` against build info from `gh pr checks`
timestamps.

**Scoring**

| Score | Condition |
|-------|-----------|
| ✅ 1 | Build ran on current HEAD and within last 3 days |
| ⚠️ 0.5 | Build is 3–14 days old |
| ❌ 0 | Build is >14 days old or ran on a different SHA than current HEAD |

**Next-action guidance**: Do NOT recommend "merge main / rebase" solely because
a build is a few days old. Running CI takes 2–3 hours, so a rebase is costly.
Only recommend rebase when:
- Build is >14 days old, OR
- Context drift (dimension 8) shows significant changes to touched directories, OR
- Build ran on a different SHA than current HEAD

A 7-day-old green build with no context drift is fine — the next action should
be whatever the other blocking dimensions indicate.
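The scoring table for this dimension could be expressed as below. This is a sketch only: `Get-BuildStalenessScore` and its parameters are hypothetical, and the review comment above notes that the batch script does not compute this dimension. The table leaves the 3-day boundary ambiguous, so exactly 3 days is treated as fresh here.

```powershell
# Sketch of the Build Staleness table: 1 / 0.5 / 0 by SHA match and build age.
function Get-BuildStalenessScore {
    param([string]$HeadSha, [string]$BuildSha, [double]$BuildAgeDays)
    # A build on a different SHA, or older than 14 days, scores 0.
    if ($BuildSha -ne $HeadSha -or $BuildAgeDays -gt 14) { return 0.0 }
    # Current HEAD and at most 3 days old scores 1.
    if ($BuildAgeDays -le 3) { return 1.0 }
    # Current HEAD, between 3 and 14 days old.
    return 0.5
}

$sevenDayGreen = Get-BuildStalenessScore -HeadSha 'abc' -BuildSha 'abc' -BuildAgeDays 7
```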

.github/skills/pr-triage/references/runtime-signals.md:66

  • This reference says the script should not rely on hardcoded owner/triager lists and should parse community triagers from the bottom of docs/area-owners.md. The current script hardcodes community triagers and also filters out @dotnet/* team owners, which makes “owner” resolution diverge from what’s documented here. Recommend aligning the docs with the implementation (or updating the script to match this documented behavior).
**Note**: The script parses `docs/area-owners.md` dynamically at startup to get
the full owner table (~138 areas). Do not rely on hardcoded owner lists.

**Fallback**: If the PR has no area label, use `.github/CODEOWNERS` to match file paths to reviewers.

---

## Approval Authority Levels

Not all approvals carry equal weight for merge decisions. This reflects dotnet/runtime's
governance model where area owners have merge authority — it is not a judgment of review quality.

| Level | Who | How to detect | Merge weight |
|-------|-----|--------------|--------------|
| **1. Area owner/lead** | Listed in Owners/Lead column of `docs/area-owners.md` for the PR's area label | Parse area-owners.md, match reviewer login against Owners column | Has merge authority for the area |
| **2. Community triager** | Trusted community members with triage permissions, listed in area-owners.md | Parse the "Community Triagers" section at the bottom of `docs/area-owners.md` | Strong signal — deeply familiar with the repo. Does not have merge authority but review is highly valued. |
| **3. Frequent contributor** | Has many merged PRs or commits in the touched area | `gh api "repos/dotnet/runtime/commits?author={login}&path={dir}&per_page=5"` — 3+ hits = frequent | Valuable domain expertise. Weight 1× (standard). |
| **4. New contributor** | First-time or infrequent contributor to this area | No commits found in touched paths | Review appreciated — every contributor's feedback helps improve the PR. Weight 0.5×. |

### Known Community Triagers

These are listed in the "Community Triagers" section of `docs/area-owners.md` and have triage permissions:

@a74nh, @am11, @clamp03, @Clockwork-Muse, @filipnavara, @huoyaoyuan, @martincostello, @omajid, @Sergio0694, @shushanhf, @SingleAccretion, @teo-tsirpanis, @tmds, @vcsjones, @xoofx

A Community Triager's APPROVED review is a stronger signal than a new contributor's approval. They are deeply familiar with the repo, its conventions, and its quality bar — but they are not area owners and their approval alone does not satisfy the "maintainer review" dimension.

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:48

  • $Repo -split '/' is assumed to yield exactly two parts, but the script doesn’t validate the input format. If -Repo is missing a slash (or contains extra segments), later indexing into $repoParts[0]/[1] will produce incorrect API paths or errors. Consider adding a validation attribute (e.g., ValidatePattern) or an explicit check that throws a clear error when -Repo isn’t in owner/repo form.
$repoParts = $Repo -split '/'
$areaOwnersUrl = "repos/$($repoParts[0])/$($repoParts[1])/contents/docs/area-owners.md"

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:69

  • $communityTriagers is hardcoded here, but the accompanying docs say community triagers should be parsed from the “Community Triagers” section of docs/area-owners.md. Hardcoding risks drift when the list changes. Consider parsing that section from the fetched area-owners.md content (or updating the docs to reflect the hardcoded behavior).
$communityTriagers = @("a74nh","am11","clamp03","Clockwork-Muse","filipnavara",
    "huoyaoyuan","martincostello","omajid","Sergio0694","shushanhf",
    "SingleAccretion","teo-tsirpanis","tmds","vcsjones","xoofx")

TrimEnd commas and semicolons from parsed @Handles in area-owners.md,
so owner names like 'lewing,' are normalized to 'lewing'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danmoseley danmoseley requested a review from lewing March 3, 2026 02:17
danmoseley added a commit to danmoseley/pr-dashboard that referenced this pull request Mar 3, 2026
Merge this after dotnet/runtime#125005 lands.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danmoseley
Member Author

We could have a job update the pages on a schedule, keeping permalinks that we could put in each repo's README.md.

e.g. https://danmoseley.github.io/pr-dashboard/

When copilot-pull-request-reviewer leaves an error review ('Copilot
encountered an error'), append '; rerequest Copilot review' to the
next_action and add 'Copilot review errored' to blockers.

Uses a targeted secondary GraphQL query (batches of 50) to fetch
review bodies only for PRs with copilot reviews, avoiding the 19x
payload bloat of adding body to the main fragment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 04:51
When a repo has no docs/area-owners.md (or no matching area label),
-Maintainers provides a fallback list of usernames to treat as
area owners. Their approvals count as 'owner approved' for scoring
and next-action determination.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (7)

.github/skills/pr-triage/references/merge-readiness-rubric.md:426

  • The weight table above sums to a max raw score of 20.0 (3+3+3+2+2+1.5+1.5+1+1+1+0.5+0.5), but this line says 20.5 and the normalization formula uses 20.5. Please correct the max and normalization value (or update the weights) so the documentation matches the script’s composite calculation.
**Max possible raw score**: 20.5. Normalized to a 0–10 scale: `(rawScore / 20.5) × 10`.

.github/skills/pr-triage/references/runtime-signals.md:57

  • The approval authority description says triagers are detected by parsing the “Community Triagers” section of docs/area-owners.md, but Get-PrTriageData.ps1 currently uses a hard-coded triager array. Either implement the described parsing or adjust this text so the detection method matches reality.
| **1. Area owner/lead** | Listed in Owners/Lead column of `docs/area-owners.md` for the PR's area label | Parse area-owners.md, match reviewer login against Owners column | Has merge authority for the area |
| **2. Community triager** | Trusted community members with triage permissions, listed in area-owners.md | Parse the "Community Triagers" section at the bottom of `docs/area-owners.md` | Strong signal — deeply familiar with the repo. Does not have merge authority but review is highly valued. |
| **3. Frequent contributor** | Has many merged PRs or commits in the touched area | `gh api "repos/dotnet/runtime/commits?author={login}&path={dir}&per_page=5"` — 3+ hits = frequent | Valuable domain expertise. Weight 1× (standard). |

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:39

  • -Repo is accepted as an arbitrary string and later interpolated into both REST paths and the GraphQL query. Please validate that it matches an expected owner/name pattern (and contains no quotes/whitespace) up front, and fail fast with a clear error if it’s invalid.
    [string]$NextAction,
    [string]$PrNumber,
    [int]$Top = 0,

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:69

  • The community triager list is hard-coded here, but the repo’s source of truth is docs/area-owners.md (it has a dedicated “Community Triagers” section). Consider parsing that section at runtime (same fetch you already do for area owners) to avoid drift when the list changes.
} catch {
    Write-Verbose "Warning: could not fetch area-owners.md, using empty owner table"
}
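A sketch of parsing the triager list at runtime with a hardcoded fallback. The section layout is an assumption: a Markdown heading containing "Community Triagers", followed by lines with `@handles` until the next heading. The real layout of `docs/area-owners.md` may differ. `Get-CommunityTriagers` is a hypothetical helper.

```powershell
# Hypothetical parser. ASSUMPTION: the section starts at a Markdown heading containing
# "Community Triagers" and lists @handles on the following lines until the next heading.
function Get-CommunityTriagers {
    param([string]$Markdown, [string[]]$Fallback = @())
    $inSection = $false
    $found = @()
    foreach ($line in ($Markdown -split "`r?`n")) {
        if ($line -match '^#+\s') {
            # Any heading either opens the triager section or closes it.
            $inSection = $line -match 'Community Triagers'
            continue
        }
        if ($inSection) {
            $found += @([regex]::Matches($line, '@([A-Za-z0-9-]+)') | ForEach-Object { $_.Groups[1].Value })
        }
    }
    # Fall back to the supplied list only when parsing finds nothing.
    if ($found.Count -gt 0) { $found } else { $Fallback }
}

# Illustrative document with made-up handles.
$md = "# Areas`n| area-X | @lead | @owner |`n## Community Triagers`n- @alice`n- @bob-smith`n## Other`n- @carol"
$triagers = @(Get-CommunityTriagers -Markdown $md -Fallback @('a74nh', 'am11'))
```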

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:582

  • In -OutputCsv mode, only title is sanitized for spreadsheet formula injection. Fields like who always start with @ and can still be interpreted as formulas by Excel/Sheets; next_action, blockers, and why are also unescaped. Consider applying the same leading-character quoting (and tab/newline stripping) consistently to all string columns.

if ($OutputCsv) {
    # Tab-separated output for easy SQL/spreadsheet import
    $header = "number`ttitle`tauthor`tscore`tci`tci_detail`tunresolved_threads`ttotal_threads`ttotal_comments`tdistinct_commenters`tmergeable`tapproval_count`tis_community`tage_days`tdays_since_update`tchanged_files`tlines_changed`tnext_action`twho`tblockers`twhy"

.github/skills/pr-triage/references/runtime-signals.md:43

  • This doc says “Do not rely on hardcoded owner lists” and implies runtime parsing from docs/area-owners.md, but the script currently hard-codes the Community Triagers list. Either parse triagers dynamically (from the same fetched markdown) or update this doc to reflect the current implementation to avoid drift.
**Note**: The script parses `docs/area-owners.md` dynamically at startup to get
the full owner table (~138 areas). Do not rely on hardcoded owner lists.

.github/skills/pr-triage/references/runtime-signals.md:45

  • The doc claims a CODEOWNERS-based fallback when a PR has no area-* label, but Get-PrTriageData.ps1 currently just falls back to a generic "area owner" (and doesn’t read .github/CODEOWNERS). Either implement the CODEOWNERS fallback in the script or remove/soften this guidance here to match reality.
**Fallback**: If the PR has no area label, use `.github/CODEOWNERS` to match file paths to reviewers.

Comment on lines +137 to +138
@{ scanned = $prsRaw.Count; analyzed = 0; prs = @() } | ConvertTo-Json -Depth 5
return

Copilot AI Mar 3, 2026

The GraphQL fragment only fetches reviews(last:10) and reviewThreads(first:50). For PRs with more than 10 reviews (or >50 threads), this can miss approvals / CHANGES_REQUESTED / unresolved threads and produce incorrect scores and next-action decisions. Consider switching to a query that captures the latest review state per reviewer (or increasing the page sizes and paging when needed).

When bash passes -Maintainers as a single comma-separated string
(e.g., 'JamesNK,eerhardt,mitchdenny'), split it into an array so
owner approval detection works correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@davidfowl
Member

A PowerShell script (Get-PrTriageData.ps1) does all the mechanical work: batched GraphQL queries, Build Analysis extraction, review/thread parsing, 12-dimension scoring, next-action determination, and "who" identification

Are you sure you need this at all?

…iewed

- next_action now shows '@mitchdenny: respond to ...' instead of
  'Author: respond to ...' for all author-facing actions
- Detect when approvals are not on the latest commit (stale review):
  small penalty to approval score, added to blockers and why
- 'community reviewed' in why field when non-owner approves
- GraphQL fragment now fetches commit oid for reviews and head commit

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 06:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (5)

.github/skills/pr-triage/references/runtime-signals.md:45

  • This doc says to fall back to .github/CODEOWNERS when a PR has no area label, but Get-PrTriageData.ps1 doesn’t implement CODEOWNERS-based owner resolution (it falls back to -Maintainers or the literal string "area owner"). Either implement CODEOWNERS lookup or adjust this guidance to match the script’s actual behavior.
**Fallback**: If the PR has no area label, use `.github/CODEOWNERS` to match file paths to reviewers.

.github/skills/pr-triage/references/merge-readiness-rubric.md:427

  • The rubric states a max raw score of 20.5 and normalizes with / 20.5, but the weight table shown sums to 20.0 (and the script uses $rawMax = 20.0). Please reconcile the documentation with the implementation so consumers don’t compute inconsistent composite scores.
**Max possible raw score**: 20.5. Normalized to a 0–10 scale: `(rawScore / 20.5) × 10`.

.github/skills/pr-triage/SKILL.md:125

  • The examples/documentation use next_action values prefixed with "Author:" (and later filters/counts refer to ^Author:), but the script currently formats author-owned actions as "@{login}: ...". This mismatch will confuse users and breaks the documented -NextAction author behavior unless either the docs or the script is updated to be consistent.
| Score | PR | Title | Who | Next Action | CI | Disc | Age | Updated | Files | Author |
|------:|---:|-------|-----|-------------|----|-----:|----:|--------:|------:|--------|
| 8.4 | [#123546](https://github.com/dotnet/runtime/pull/123546) | arm64: Remove widening casts before truncating | @EgorBo | Ready to merge ✅ | ✅ 91/2/0 | 1t/2p | 5d | 1d | 3 | @jonathandavies-arm |
| 7.9 | [#122485](https://github.com/dotnet/runtime/pull/122485) | [RISC-V] Enable instruction printing | @SkyShield | Author: merge main (stale 17d) | ✅ 91/2/0 | 1t/2p | 24d | 17d | 2 | @SkyShield |
| ... | | | | | | | | | | |

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:78

  • $communityTriagers is hardcoded, but the references in this skill describe it as being derived from docs/area-owners.md. Hardcoding will drift as that list changes; consider parsing the "Community Triagers" section from the already-fetched areaOwnersMd content instead (or update the docs to match).
$communityTriagers = @("a74nh","am11","clamp03","Clockwork-Muse","filipnavara",
    "huoyaoyuan","martincostello","omajid","Sergio0694","shushanhf",
    "SingleAccretion","teo-tsirpanis","tmds","vcsjones","xoofx")

.github/skills/pr-triage/scripts/Get-PrTriageData.ps1:321

  • approval_count and the approval/owner/triager flags currently count every APPROVED review node, which can overcount when the same reviewer submits multiple reviews (approve → comment → re-approve, etc.). This can inflate scores and mislabel PRs as "2+ approvals"; consider collapsing reviews to the latest state per reviewer (by author) before counting/deriving approval signals.
    foreach ($rev in $reviews) {
        $login = $rev.author.login
        $reviewerLogins += $login
        if ($rev.state -eq "APPROVED") {
            $approvalCount++
            $hasAnyApproval = $true
            $approverLogins += $login
            # Check if approval is on the current head commit
            $isStale = $headCommitOid -and $rev.commit -and $rev.commit.oid -and ($rev.commit.oid -ne $headCommitOid)
            if ($isStale) { $hasStaleApproval = $true }
            if ($prOwners -contains $login) { $hasOwnerApproval = $true }
            elseif ($communityTriagers -contains $login) { $hasTriagerApproval = $true }
        }
    }
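A sketch of the collapse described above. It assumes the review nodes arrive in chronological order and that a later COMMENTED review should not override an earlier APPROVED or CHANGES_REQUESTED. `Get-LatestReviewState` is a hypothetical helper.

```powershell
# Sketch: collapse reviews to each reviewer's latest decisive state before counting.
function Get-LatestReviewState {
    param([object[]]$Reviews)
    $latest = [ordered]@{}
    foreach ($rev in $Reviews) {
        if (-not $rev.author) { continue }   # skip reviews whose author is missing
        if ($rev.state -in 'APPROVED', 'CHANGES_REQUESTED', 'DISMISSED') {
            # Later entries overwrite earlier ones for the same login.
            $latest[$rev.author.login] = $rev.state
        }
    }
    $latest
}

# Illustrative data: alice approves twice, bob requests changes.
$reviews = @(
    [pscustomobject]@{ author = [pscustomobject]@{ login = 'alice' }; state = 'APPROVED' }
    [pscustomobject]@{ author = [pscustomobject]@{ login = 'alice' }; state = 'COMMENTED' }
    [pscustomobject]@{ author = [pscustomobject]@{ login = 'alice' }; state = 'APPROVED' }
    [pscustomobject]@{ author = [pscustomobject]@{ login = 'bob' }; state = 'CHANGES_REQUESTED' }
)
$latest = Get-LatestReviewState -Reviews $reviews
$approvalCount = @($latest.Values | Where-Object { $_ -eq 'APPROVED' }).Count
```

For this sample the per-node loop would count two approvals, while the collapsed view counts one approver and one outstanding change request.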

Comment on lines +535 to +549
$pattern = switch ($NextAction.ToLower()) {
"ready" { "Ready to merge" }
"review" { "review needed" }
"author" { "^Author:" }
"conflicts" { "resolve conflicts" }
"ci" { "fix CI" }
default { $NextAction }
}
$results = @($results | Where-Object { $_.next_action -match $pattern })
}
if ($MyActions) {
$me = $MyActions.TrimStart('@')
$results = @($results | Where-Object {
$_.who -match $me -or
($_.author -eq $me -and $_.next_action -match "^Author:")

Copilot AI Mar 3, 2026

The -NextAction author filter uses "^Author:", but next_action strings produced by the script are formatted like "@{login}: ..." (e.g., @author: fix CI failures). This means -NextAction author will never match, and quick_actions.needs_author_action will always be 0. Align either the next_action strings (e.g., prefix with Author:) or the regex patterns used for filtering/counting.

Suggested change
$pattern = switch ($NextAction.ToLower()) {
"ready" { "Ready to merge" }
"review" { "review needed" }
"author" { "^Author:" }
"conflicts" { "resolve conflicts" }
"ci" { "fix CI" }
default { $NextAction }
}
$results = @($results | Where-Object { $_.next_action -match $pattern })
}
if ($MyActions) {
$me = $MyActions.TrimStart('@')
$results = @($results | Where-Object {
$_.who -match $me -or
($_.author -eq $me -and $_.next_action -match "^Author:")
    $nextActionFilter = $NextAction.ToLower()
    if ($nextActionFilter -eq "author") {
        # "Author" means the PR author needs to take the next action.
        # next_action strings are formatted like "@{login}: ...", so for author
        # actions we expect the login to match the PR author.
        $results = @(
            $results | Where-Object {
                $_.next_action -like "@$($_.author):*"
            }
        )
    }
    else {
        $pattern = switch ($nextActionFilter) {
            "ready" { "Ready to merge" }
            "review" { "review needed" }
            "conflicts" { "resolve conflicts" }
            "ci" { "fix CI" }
            default { $NextAction }
        }
        $results = @($results | Where-Object { $_.next_action -match $pattern })
    }
}
if ($MyActions) {
    $me = $MyActions.TrimStart('@')
    # Braces in ${me} are required: "$me:" would be parsed as a scope-qualified variable reference.
    $results = @($results | Where-Object {
        $_.who -match $me -or
        ($_.author -eq $me -and $_.next_action -like "@${me}:*")

Comment on lines +598 to +604
# Tab-separated output for easy SQL/spreadsheet import
$header = "number`ttitle`tauthor`tscore`tci`tci_detail`tunresolved_threads`ttotal_threads`ttotal_comments`tdistinct_commenters`tmergeable`tapproval_count`tis_community`tage_days`tdays_since_update`tchanged_files`tlines_changed`tnext_action`twho`tblockers`twhy"
$lines = @($header)
foreach ($r in $results) {
    $t = ($r.title -replace "`t"," ").Substring(0, [Math]::Min(70, $r.title.Length))
    if ($t.Length -gt 0 -and $t[0] -in '=','+','-','@') { $t = "'$t" }
    $lines += "$($r.number)`t$t`t$($r.author)`t$($r.score)`t$($r.ci)`t$($r.ci_detail)`t$($r.unresolved_threads)`t$($r.total_threads)`t$($r.total_comments)`t$($r.distinct_commenters)`t$($r.mergeable)`t$($r.approval_count)`t$(if ($r.is_community) {1} else {0})`t$($r.age_days)`t$($r.days_since_update)`t$($r.changed_files)`t$($r.lines_changed)`t$($r.next_action)`t$($r.who)`t$($r.blockers)`t$($r.why)"

Copilot AI Mar 3, 2026

-OutputCsv only sanitizes the title field for formula injection / TSV breaking characters. Other exported string fields (notably next_action, which often starts with @) can still trigger spreadsheet formula evaluation and can also contain tabs/newlines that break the TSV shape. Consider sanitizing all string columns (or quoting/escaping consistently) before emitting TSV intended for spreadsheet/SQL import.
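
One way to apply this to every column is to sanitize fields in a loop over the column names, rather than one variable per string field. The following is a sketch under stated assumptions: `ConvertTo-SafeTsvField` and the sample row are hypothetical and not part of the PR, and only the column names come from the TSV header in the diff.

```powershell
# Hypothetical helper (not from the PR): neutralise one value for TSV export.
function ConvertTo-SafeTsvField {
    param([string] $Value)   # not Mandatory, so empty strings bind without error

    # Tabs and line breaks would shift columns or split rows; collapse them to one space.
    $v = $Value -replace "[`t`r`n]+", ' '

    # A leading =, +, - or @ may be evaluated as a formula by spreadsheet applications;
    # an apostrophe prefix makes the cell plain text.
    if ($v -match '^[=+\-@]') { $v = "'" + $v }
    return $v
}

# Hypothetical row using a few column names from the TSV header.
$row = [pscustomobject]@{
    number      = 42
    title       = "=SUM(A1:A9)`tinjected"
    next_action = '@alice: fix CI failures'
    why         = "line one`nline two"
}

# Sanitize every column in header order instead of listing string fields by hand.
$columns = 'number', 'title', 'next_action', 'why'
$fields  = foreach ($c in $columns) { ConvertTo-SafeTsvField -Value ([string]$row.$c) }
$line    = $fields -join "`t"
$line
```

Passing every column through the same helper means a newly added string column cannot be forgotten. The trade-off is that a negative number would also receive the apostrophe prefix, so purely numeric columns may be better left out of the loop.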

Suggested change
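function Sanitize-TsvField {
    param(
        # AllowEmptyString is needed because a Mandatory [string] parameter rejects "",
        # and null fields cast to [string] arrive here as empty strings.
        [Parameter(Mandatory = $true)]
        [AllowEmptyString()]
        [string] $Value
    )
    $v = $Value -replace "[`t`r`n]", " "
    if ($v.Length -gt 0 -and $v[0] -in '=','+','-','@') {
        $v = "'$v"
    }
    return $v
}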
# Tab-separated output for easy SQL/spreadsheet import
$header = "number`ttitle`tauthor`tscore`tci`tci_detail`tunresolved_threads`ttotal_threads`ttotal_comments`tdistinct_commenters`tmergeable`tapproval_count`tis_community`tage_days`tdays_since_update`tchanged_files`tlines_changed`tnext_action`twho`tblockers`twhy"
$lines = @($header)
foreach ($r in $results) {
    $title = if ($null -ne $r.title) {
        $r.title.Substring(0, [Math]::Min(70, $r.title.Length))
    } else {
        ""
    }
    $t = Sanitize-TsvField -Value $title
    $author = Sanitize-TsvField -Value ([string]$r.author)
    $ci = Sanitize-TsvField -Value ([string]$r.ci)
    $ciDetail = Sanitize-TsvField -Value ([string]$r.ci_detail)
    $mergeable = Sanitize-TsvField -Value ([string]$r.mergeable)
    $nextAction = Sanitize-TsvField -Value ([string]$r.next_action)
    $who = Sanitize-TsvField -Value ([string]$r.who)
    $blockers = Sanitize-TsvField -Value ([string]$r.blockers)
    $why = Sanitize-TsvField -Value ([string]$r.why)
    $lines += "$($r.number)`t$t`t$author`t$($r.score)`t$ci`t$ciDetail`t$($r.unresolved_threads)`t$($r.total_threads)`t$($r.total_comments)`t$($r.distinct_commenters)`t$mergeable`t$($r.approval_count)`t$(if ($r.is_community) {1} else {0})`t$($r.age_days)`t$($r.days_since_update)`t$($r.changed_files)`t$($r.lines_changed)`t$nextAction`t$who`t$blockers`t$why"

Track hasCurrentOwnerApproval separately from hasOwnerApproval.
When owner approved but only on an older commit, next_action shows
'Maintainer: re-review needed (approval on older commit)' directed
at the approving owner, instead of appearing ready to merge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
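
The distinction this commit message draws can be sketched as follows. This is a simplified illustration, not the script's actual logic: the sample reviews, the `commitOid` property name, and the fallback string `Maintainer: review needed` are assumptions, and the real scoring also weighs CI, conflicts, and other dimensions.

```powershell
# Hypothetical inputs: the PR's head commit and its approving reviews.
$headCommitOid = 'c3'
$prOwners      = @('owner1')
$approvals     = @(
    [pscustomobject]@{ login = 'owner1';  commitOid = 'a1' }  # owner approved an older commit
    [pscustomobject]@{ login = 'someone'; commitOid = 'c3' }  # current commit, but not an owner
)

$hasOwnerApproval        = $false   # any owner approval, on any commit
$hasCurrentOwnerApproval = $false   # owner approval on the current head commit
foreach ($rev in $approvals) {
    if ($prOwners -contains $rev.login) {
        $hasOwnerApproval = $true
        if ($rev.commitOid -eq $headCommitOid) { $hasCurrentOwnerApproval = $true }
    }
}

# Only an approval on the head commit counts towards merge readiness.
$nextAction = if ($hasCurrentOwnerApproval) {
    'Ready to merge'
} elseif ($hasOwnerApproval) {
    'Maintainer: re-review needed (approval on older commit)'
} else {
    'Maintainer: review needed'   # assumed wording for the no-approval case
}
$nextAction
```

Keeping the two flags separate lets the stale case be routed back to the owner who approved, rather than being treated as either fully approved or never reviewed.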
@danmoseley
Member Author

> Are you sure you need this at all?

@davidfowl you mean instead of the LLM? I started with just the LLM but found that putting the querying/math in PowerShell is just faster.

@danmoseley
Member Author

I'm going to close this, as for now I think it's better in https://github.com/danmoseley/pr-dashboard, where I can just regenerate periodically without a skill.

@danmoseley danmoseley closed this Mar 3, 2026
@artl93
Member

artl93 commented Mar 3, 2026

I thought @stephentoub had one of these in progress already?

@github-actions github-actions bot locked and limited conversation to collaborators Apr 3, 2026

Labels

area-skills Agent Skills


4 participants