refactor(ci): split PR triage into 4-job pipeline#4866
Replace the monolithic /triage skill + standalone qwen-code-pr-review.yml with a single qwen-pr-triage.yml workflow that orchestrates 4 jobs:
1. product-decision: template check + direction + approach (ubuntu-latest)
2. review: code review via /review skill (self-hosted, parallel)
3. tmux-testing: real-scenario testing (self-hosted, parallel, internal only)
4. approval-decision: reads all verdicts, approves/rejects (ubuntu-latest)
Key design decisions:
- Fork PRs run full pipeline except tmux-testing (no code execution)
- settings_json restricts commands per job (gh/curl only, no general shell)
- tmux-testing has no write token; approval-decision posts on its behalf
- Each job posts comments with unique markers for dedup on re-run
- qwen-triage.yml stripped to issue-only (PR triggers removed)
- qwen-code-pr-review.yml deleted (folded into review job)
Refs: #4570
Pull request overview
This PR refactors the repo’s GitHub Actions automation by separating issue triage from PR triage, and replacing the prior monolithic PR review workflow with a new multi-stage PR triage pipeline (resolve → product-decision → review/tmux in parallel → approval-decision).
Changes:
- Added two new Qwen skills to gate PRs on product alignment (product-decision) and to synthesize a final approve/request-changes decision (approval-decision).
- Updated qwen-triage.yml to become issue-only (removed PR triggers/logic).
- Introduced a new qwen-pr-triage.yml workflow to orchestrate the 4-job PR pipeline and deleted the old qwen-code-pr-review.yml.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| .qwen/skills/product-decision/SKILL.md | New skill definition for template + direction + approach gating before code review/testing. |
| .qwen/skills/approval-decision/SKILL.md | New skill definition for final synthesis and PR approve/request-changes action. |
| .github/workflows/qwen-triage.yml | Renamed and constrained existing triage workflow to issues only. |
| .github/workflows/qwen-pr-triage.yml | New orchestrating PR triage workflow with parallel review and tmux-testing stages. |
| .github/workflows/qwen-code-pr-review.yml | Removed the previous standalone PR review workflow. |
actionlint flagged github.event.comment.body in inline script as injection risk. Move all event context fields to env vars.
/triage now delegates PR work to /product-decision, /review, tmux-real-user-testing, and /approval-decision sequentially. This aligns local behavior with the CI pipeline (qwen-pr-triage.yml). Old 3-stage inline logic removed from pr-workflow.md.
tmux-testing now needs review to pass first — no point building and running the app if code review found critical issues. Flow is now: product-decision → review → tmux-testing → approval-decision Each step gates the next. Also updates pr-workflow.md to reflect the serial dependency chain.
Skills now write verdict JSON to /tmp/triage-results/<skill>.json. The workflow reads the file after the action completes and emits the verdict as a job output via $GITHUB_OUTPUT. Fallback: if the file doesn't exist (skill crashed or didn't follow instructions), verdict is inferred from the action's exit code.
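The capture step described in this commit could be sketched roughly as follows. The function name `capture_verdict`, the `.verdict` field name, and the mapping of exit code 0 to `pass` are assumptions for illustration; the commit only states that the verdict comes from the file when it exists and is inferred from the exit code otherwise.

```shell
#!/usr/bin/env bash
# Hypothetical capture helper: prefer the skill's verdict file,
# otherwise infer the verdict from the action's exit code.
capture_verdict() {
  local file="$1" exit_code="$2" verdict=""
  if [ -f "$file" ] && command -v jq >/dev/null 2>&1; then
    # ".verdict" is an assumed field name; yields an empty string when absent.
    verdict="$(jq -r '.verdict // empty' "$file" 2>/dev/null || true)"
  fi
  if [ -z "$verdict" ]; then
    # Fallback path: no usable file, so use the exit code (assumed mapping).
    if [ "$exit_code" -eq 0 ]; then verdict="pass"; else verdict="fail"; fi
  fi
  echo "verdict=$verdict"
}

# In a workflow step the line would be appended to the job outputs, e.g.:
#   capture_verdict "/tmp/triage-results/review.json" "$rc" >> "$GITHUB_OUTPUT"
```

Printing a `name=value` line keeps the helper testable on its own; the redirect to `$GITHUB_OUTPUT` stays in the calling step.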
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
Two fixes:
1. After review runs, check if it posted REQUEST_CHANGES. If so, set verdict=fail and skip tmux-testing (saves runner time).
2. Add environment protection rule (qwen-pr-review-delay) to the review job for auto-triggered PRs. This reuses the existing 10-minute wait timer to debounce rapid consecutive pushes. Manual triggers (@qwen-code /review, workflow_dispatch) skip it.
P1 fixes:
- Add synchronize/reopened/review_requested triggers (parity with old workflow)
- Fix environment empty-string issue — move delay logic to resolve job output
- Fix review verdict detection — filter only by bot login, not authorAssociation
- Add fallback-comment job for pipeline failure notification
P2 fixes:
- Replace || true with proper exit code handling (distinguish timeout vs error)
- Fix qwen-triage.yml to use env vars instead of direct ${{ }} interpolation
P3 fixes:
- Update stale "parallel" comment to reflect serial pipeline
- resolve job: add explicit read-only permissions (least privilege)
- tmux-testing: emit verdict=fail/timeout on CLI crash instead of always pass
- approval-decision skill: fix TMUX_COMMENT_BODY → read from downloaded artifact file, remove ambiguous var name
- product-decision skill: clarify fail is ONLY for template check, direction concerns always escalate to needs_human
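Distinguishing a timeout from an ordinary error, as the P2 and tmux-testing fixes above describe, might look like the sketch below. It assumes the CLI is wrapped in GNU `timeout`, which exits with status 124 when the time limit is hit; the helper name `verdict_from_exit` and the mapping of 0 to `pass` are hypothetical.

```shell
# Map an exit code from `timeout <duration> <cmd>` to a verdict string.
# GNU timeout exits with 124 when the command ran past its limit.
verdict_from_exit() {
  case "$1" in
    0)   echo "pass" ;;
    124) echo "timeout" ;;
    *)   echo "fail" ;;   # any other non-zero status is treated as a crash
  esac
}

# Possible use in a run step (the wrapped command is a placeholder):
#   rc=0
#   timeout 85m some-cli-command || rc=$?
#   echo "verdict=$(verdict_from_exit "$rc")" >> "$GITHUB_OUTPUT"
```

Capturing the status with `|| rc=$?` replaces the earlier `|| true` pattern without letting the step abort before the verdict is written.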
…ted runner
- review: job timeout 45→90m, inner qwen timeout 35→85m
- push→review debounce: comment updated to 1h (env wait timer set to 60 via API)
- product-decision: move to ecs-qwen self-hosted runner and call the local qwen CLI instead of qwen-code-action; this reuses the runner's preinstalled qwen, with no reinstall per run
- product-decision: pass CLAUDE_CODE_SRC (from vars.CLAUDE_CODE_SRC_PATH) so the skill can read local Claude Code source for product-direction reasoning
- product-decision skill: read local $CLAUDE_CODE_SRC (CHANGELOG + src) with a remote CHANGELOG fallback when the var is unset
- remove dead Cache node_modules step from review job (no npm ci ever runs)
…te in AGENTS.md
Q6 (readable CI logs without Ink/tmux overhead):
- add shared env STREAM_JSON_FMT: a jq program that renders qwen --output-format stream-json events into one-line progress (tool calls, assistant text, final result) instead of raw JSON envelopes
- pipe product-decision / review / tmux-testing through `| tee raw.jsonl | jq -Rr "$STREAM_JSON_FMT"`, keeping PIPESTATUS[0] for the qwen exit code and a raw .jsonl for debugging
- the non-interactive -p path never mounts Ink, so this avoids the full TUI render cost a tmux capture would incur
Q7 (stop PRs tripping the triage template gate after creation):
- AGENTS.md "Submitting PRs": instruct agents to read .github/pull_request_template.md and fill it section-for-section via --body-file before opening a PR, and list the required sections, noting product-decision posts CHANGES_REQUESTED if any are missing
- product-decision: clear stale /tmp/triage-results/product-decision.json before the run. /tmp persists on the self-hosted runner (unlike the previous ephemeral ubuntu-latest), so the capture step could otherwise read a verdict left by a previous PR if this run's skill fails to write one.
- STREAM_JSON_FMT: wrap the per-line render in try/catch and null-guard subtype/name, so one unexpected event shape can't error out jq mid-stream and blank the rest of the log.
- product-decision / review / tmux-testing: fall back to raw passthrough (cat) when jq is absent on the runner, so log output is never silently blanked.
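The tee-and-format pipeline with the cat fallback can be sketched as below. This is a minimal illustration, not the workflow's actual step: `run_with_formatter` is a hypothetical helper, and it assumes the formatter program is supplied in the `STREAM_JSON_FMT` environment variable as the commits describe.

```shell
#!/usr/bin/env bash
# Run a command, keep its raw output in a file, render it for the log,
# and return the command's own exit code rather than the formatter's.
run_with_formatter() {
  local raw_file="$1"; shift
  local rc
  if command -v jq >/dev/null 2>&1 && [ -n "${STREAM_JSON_FMT:-}" ]; then
    "$@" | tee "$raw_file" | jq -Rr "$STREAM_JSON_FMT"
    rc="${PIPESTATUS[0]}"   # status of the first pipeline stage (the command)
  else
    # jq missing (or no filter set): pass raw lines through so the log is never blank.
    "$@" | tee "$raw_file" | cat
    rc="${PIPESTATUS[0]}"
  fi
  return "$rc"
}
```

`PIPESTATUS` is a bash array and must be read immediately after the pipeline, which is why the assignment is the very next command in each branch.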
…ch dirs
Local review round 2 findings:
- product-decision: propagate the qwen exit code at the end of the run step. Previously a timeout/crash left the step green, and when no verdict file was written the capture fallback defaulted to "pass", silently waving the PR through the gate. Now the job fails and the verdict resolves to fail, matching the original action-based semantics.
- jq: add --unbuffered so progress lines stream into the CI log in real time instead of arriving in block-buffered chunks.
- Move runner scratch state from /tmp to runner.temp, which is emptied per job and is per runner instance: triage-results verdict dir (via TRIAGE_RESULTS_DIR env, skill falls back to /tmp locally), raw stream-json tee files, and the tmux-results artifact dir. /tmp persists across runs and is shared between runner instances on the same host, risking stale or cross-PR contamination. The approval-decision side stays on /tmp (ephemeral ubuntu-latest VM).
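Resolving the verdict directory and clearing any stale verdict before a run could be sketched like this. `prepare_results_dir` is a hypothetical helper name; the `TRIAGE_RESULTS_DIR` variable and the `/tmp/triage-results` fallback come from the commit messages above.

```shell
# Resolve the verdict file for a skill: TRIAGE_RESULTS_DIR in CI,
# /tmp/triage-results as the local fallback.
prepare_results_dir() {
  local skill="$1"
  local dir="${TRIAGE_RESULTS_DIR:-/tmp/triage-results}"
  mkdir -p "$dir"
  # Remove any verdict left by a previous run so it cannot be read by mistake.
  rm -f "$dir/$skill.json"
  echo "$dir/$skill.json"
}

# Possible use before invoking the skill:
#   verdict_file="$(prepare_results_dir product-decision)"
```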
It now runs the local qwen CLI on the same self-hosted runner as the review and tmux-testing stages, so it should use the same credential pair instead of the action-era OPENAI_* secrets.
The old qwen-code-pr-review.yml posted an immediate '<!-- qwen-review-ack -->' comment when a maintainer requested a review, updated in place on re-trigger. That feedback was lost in the pipeline split — a comment trigger gave no signal that the request was accepted. Add a lightweight ack-request job (no needs, fires instantly) mirroring resolve's comment-trigger gate.
The old qwen-code-pr-review.yml accepted @qwen-code /review from PR review threads (pull_request_review_comment) and review summaries (pull_request_review), not just the main conversation. Restore both: on: triggers, resolve gate + PR-number/trigger-type resolution, and the ack-request acknowledgement. Manual triggers still skip the debounce delay.
Port of the old delay job's post-wait re-check: with a 1-hour debounce the PR may be closed, merged, or drafted by the time the review job starts. Skip checkout/review in that case (verdict=skip) and gate tmux-testing and approval-decision off it, instead of burning an 85-minute runner slot on a dead PR.
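The post-wait re-check amounts to a small decision on the PR's current state. The sketch below assumes the state values reported by `gh pr view --json state,isDraft` (OPEN, CLOSED, MERGED and a boolean draft flag); the helper name `should_run_stage` is hypothetical, and the workflow may obtain these values differently.

```shell
# Decide whether a stage should still run after the debounce wait.
# Arguments: PR state (OPEN/CLOSED/MERGED) and draft flag (true/false).
should_run_stage() {
  local state="$1" is_draft="$2"
  if [ "$state" = "OPEN" ] && [ "$is_draft" = "false" ]; then
    echo "run"
  else
    echo "skip"   # closed, merged, or converted to draft while waiting
  fi
}

# Possible use (PR_NUMBER is a placeholder):
#   state="$(gh pr view "$PR_NUMBER" --json state --jq .state)"
#   draft="$(gh pr view "$PR_NUMBER" --json isDraft --jq .isDraft)"
#   [ "$(should_run_stage "$state" "$draft")" = "skip" ] && echo "verdict=skip" >> "$GITHUB_OUTPUT"
```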
- Auto-triggered runs (pull_request_target) again require the PR author to be OWNER/MEMBER/COLLABORATOR, as the old workflow did. External/fork PRs run only on explicit maintainer trigger; the gate can be relaxed later.
- workflow_dispatch regains the timeout_minutes input (default 90, min 10). The review job timeout and the inner qwen timeout (N-5) derive from it.
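The timeout derivation can be illustrated with a few lines of shell arithmetic. The default of 90, the minimum of 10, and the N-5 inner timeout are taken from the commit message; the helper name `derive_timeouts` and its output format are assumptions.

```shell
# Derive the job timeout and the inner qwen timeout (N-5) from the
# requested timeout_minutes value, defaulting to 90 and clamping to >= 10.
derive_timeouts() {
  local requested="${1:-90}" min=10
  if [ "$requested" -lt "$min" ]; then
    requested="$min"
  fi
  echo "job=$requested inner=$((requested - 5))"
}
```

For example, `derive_timeouts 90` prints `job=90 inner=85`, matching the 90/85 minute pair mentioned in the earlier timeout commit.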
ack-request already guards on state == open; resolve didn't, so a /review comment on a closed PR would still launch the pipeline. Align the gates.
…factor # Conflicts: # .github/workflows/qwen-code-pr-review.yml
PR #4962 extends the old qwen-code-pr-review.yml timeouts (review 60->90min, delay comment 10->30min). That file is deleted here — its intent is already implemented in the qwen-pr-triage.yml pipeline (review timeout 90min with a workflow_dispatch override). Resolved the modify/delete conflict by keeping the deletion; merging the branch lets #4962 auto-close as merged when this PR lands.
The qwen-pr-review-delay environment wait timer is set to 30 minutes (per #4962); update the stale 1-hour comments to match.
Gating (one formal-review owner per path):
- approval-decision now requires product verdict=pass (or review_only) AND review verdict=pass
- it no longer runs alongside an upstream stage's request-changes (template fail / review fail paths stop there)
- review propagates the qwen exit code: a crashed or timed-out review fails the job instead of defaulting the verdict to pass
- verdict read-back fails closed on API errors (retry 3x, then fail the job) so a network blip is never mistaken for a passing review
- bot login for the read-back is derived from the live token instead of a hardcoded account name
Orchestration:
- the four stage jobs share one per-PR concurrency group, so a new trigger cancels the whole in-flight pipeline, not just the same stage
- ack-request now consumes resolve's comment_trigger output instead of duplicating the 25-line trigger-validation if expression
ECS cross-border network resilience:
- all three self-hosted checkouts abort stalled transfers (GIT_HTTP_LOW_SPEED_LIMIT/TIME) and retry once instead of hanging until the job timeout
- gh API calls in the review job retry with backoff
Docs: align approval-decision SKILL.md and pr-workflow.md with the serial gated pipeline (the old text described parallel jobs)
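The "retry 3x, then fail" behaviour for gh API calls could be written as a small wrapper like the one below. The helper name `retry3`, the doubling backoff, and the `RETRY_BASE_DELAY` override are assumptions made for this sketch; the commit does not specify the delay schedule.

```shell
# Retry a command up to 3 times with a doubling delay between attempts.
# Returns non-zero if every attempt fails, so the caller fails closed.
retry3() {
  local attempt delay="${RETRY_BASE_DELAY:-2}"
  for attempt in 1 2 3; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -lt 3 ]; then
      sleep "$delay"
      delay=$((delay * 2))
    fi
  done
  return 1
}

# Possible use (REPO and PR_NUMBER are placeholders):
#   reviews="$(retry3 gh api "repos/$REPO/pulls/$PR_NUMBER/reviews")" || exit 1
```

Exiting on the final failure is what keeps an API error from being read as "no blocking review found".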
…factor # Conflicts: # .github/workflows/qwen-code-pr-review.yml
DragonnZhang
left a comment
Reviewed the full diff (8 files, +1191/-683). This is a well-structured refactoring that splits the monolithic PR review workflow into a 4-job gated pipeline (product-decision → review → tmux-testing → approval-decision) with proper dependency chains, retry logic for flaky API calls, and correct security boundaries (fork PRs skip code execution, self-hosted runners for code checkout, minimal permissions per job).
The issue triage workflow is cleanly separated from PR triage. The new skill definitions (product-decision, approval-decision) are thorough with proper comment dedup markers and bilingual output.
No bugs, security issues, or logic errors found. CI checks are still pending but the workflow structure is sound.
The approval-decision agent hit exit code 53 (maxSessionTurns exhausted) on the first live run even though it had already posted the review and written the verdict file. Two fixes:
1. Raise maxSessionTurns from 15 → 30 to give the skill more headroom.
2. Add continue-on-error + verdict-file arbitration so the stage succeeds when the verdict file exists, regardless of exit code.
ECS runners will have a proxy configured, making cross-border git transfers reliable. This removes the stall-abort env vars and retry steps from all three checkout sites (product-decision, review, tmux-testing), reducing complexity and eliminating the continue-on-error that masked checkout failures. Also adds a PR state pre-check to tmux-testing (matching the one review already has): if the PR was merged, closed, or converted to draft while review was running, tmux-testing skips gracefully instead of failing on a missing refs/pull/N/merge ref.
DragonnZhang
left a comment
Good refactor overall — the 4-stage pipeline is well-structured, the gating logic is sound, and the concurrency/error-handling design shows careful thought. A few issues worth addressing before merge:
1. fallback-comment missing tmux-testing in needs (medium)
    fallback-comment:
      needs: ['resolve', 'product-decision', 'review', 'approval-decision']
tmux-testing is absent from the needs array. Two consequences:
- When tmux-testing fails, approval-decision is skipped (its if requires tmux-testing.result == 'success' || 'skipped'). The failure() guard in fallback-comment checks its needs jobs; approval-decision was skipped, not failed. Per GitHub docs, failure() returns true for failed jobs, not skipped ones. So the fallback comment may not fire when tmux-testing is the stage that broke.
- Even if it does fire, the fallback has no visibility into which stage failed, so the comment is generic.
Fix: add 'tmux-testing' to the needs list:
    needs: ['resolve', 'product-decision', 'review', 'tmux-testing', 'approval-decision']
2. Inconsistent OpenAI secret naming in approval-decision (low-medium)
The approval-decision action passes:
    OPENAI_API_KEY: '${{ secrets.OPENAI_API_KEY }}'
    OPENAI_BASE_URL: '${{ secrets.OPENAI_BASE_URL }}'
But product-decision and review use secrets.REVIEW_OPENAI_API_KEY / secrets.REVIEW_OPENAI_BASE_URL. If the repo only has the REVIEW_* secrets configured, the approval stage would get empty credentials and fail. Should these be the same REVIEW_* secrets?
3. PR body claims GIT_HTTP_LOW_SPEED_LIMIT resilience that isn't in the code (low)
The PR description states:
> ECS cross-border network resilience: all three self-hosted checkouts abort stalled transfers (GIT_HTTP_LOW_SPEED_LIMIT/TIME) and retry once
But the actions/checkout steps in the diff don't configure any git retry or low-speed-limit settings. Either the claim should be removed from the PR body, or the resilience logic should be added (e.g., via env: on the checkout steps or a wrapper script).
What looks good:
- Serial job chain with proper gating: each stage blocks the next cleanly
- cancel-in-progress: true on the shared per-PR concurrency group for proper cancellation on new pushes
- Exit code propagation in product-decision and review (fail-closed on timeout/crash)
- 3x retry with backoff for gh API calls on flaky ECS connections
- PR state re-checks after debounce wait
- runner.temp for scratch state instead of shared /tmp
- continue-on-error on the approval action with verdict-file-as-source-of-truth is clever
- The STREAM_JSON_FMT jq formatter for readable CI logs
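Chinese summary
The refactor is good overall: the four-stage pipeline is well structured, the gating logic is correct, and the concurrency and error-handling design is careful. A few issues are worth addressing before merge:
- fallback-comment's needs is missing tmux-testing: when tmux-testing fails, approval-decision is skipped, and failure() returns true only for failed jobs, not skipped ones, so the fallback may not fire. Suggest adding 'tmux-testing' to needs.
- approval-decision uses inconsistent OpenAI secret names: it uses secrets.OPENAI_API_KEY rather than REVIEW_OPENAI_API_KEY; if the repo only has the latter configured, the approval stage gets no credentials.
- The PR description mentions GIT_HTTP_LOW_SPEED_LIMIT network resilience, but the code does not implement it: the checkout steps configure no git retry or low-speed limit. Either remove the claim from the description or add the implementation.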
Cross-border ECS → GitHub transfers are unreliable without a proxy. Restore the retry pattern with 3 total attempts (up from 2) and GIT_HTTP_LOW_SPEED_LIMIT/TIME for fast stall detection (60s instead of waiting until the job timeout). Applied to all 3 checkout sites: product-decision, review, and tmux-testing. The tmux PR state precheck from the previous commit is preserved — it guards against missing refs/pull/N/merge refs (merged/closed PRs), while retry handles transient network failures.
ECS runners now have a stable HTTP proxy for GitHub access, making retry logic and stall-abort env vars unnecessary. This removes:
- All checkout retry steps (3 sites × 2 retries = 6 steps removed)
- GIT_HTTP_LOW_SPEED_LIMIT/TIME env vars
- continue-on-error on checkout steps
Also changes review checkout from fetch-depth: 0 (full history) to fetch-depth: 1, because the /review skill reads PR diffs via gh API, not local git history. The tmux-testing PR state precheck is preserved: it guards against missing merge refs from merged/closed PRs, which is unrelated to network reliability.
- product/approval verdict defaults flip pass/approve -> fail when the skill writes a file with no .verdict field (malformed = do not wave through)
- tmux-testing emits verdict=skipped when the PR is closed/merged mid-flight, so approval-decision sees a meaningful value instead of empty
- fallback-comment now also depends on tmux-testing, so a tmux failure triggers the fallback path
…r labels
The /review skill leaves git worktrees under .qwen/tmp/ in the persistent
self-hosted workspace. A leftover worktree pins its branch, so a later
actions/checkout cannot clean the repo ('cannot delete branch ... used by
worktree'), falls back to a full recreate, and that fails with
'upload-pack: not our ref'. This was cross-PR contamination: a prior
review of #5002 broke product-decision checkout for #4868.
- add a 'Clean stale review worktrees' step before each self-hosted
checkout (product-decision / review / tmux-testing) that prunes
.qwen/tmp worktrees and stale qwen-review/* branches; no-op on a
fresh runner
- simplify self-hosted runs-on to ['self-hosted', 'ecs-qwen'] to match
the unified runner label set
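A cleanup step along the lines described above might be sketched as follows. This is an illustration under assumptions, not the workflow's actual step: the function name is hypothetical, and it relies only on standard git subcommands (`git worktree list --porcelain`, `git worktree remove --force`, `git worktree prune`, `git for-each-ref`, `git branch -D`).

```shell
# Prune leftover review worktrees under .qwen/tmp/ and stale
# qwen-review/* branches in the current repository.
clean_stale_review_state() {
  # Not inside a git repo (fresh runner): nothing to do.
  git rev-parse --git-dir >/dev/null 2>&1 || return 0
  local wt branch
  # Remove registered worktrees that live under .qwen/tmp/.
  git worktree list --porcelain | sed -n 's/^worktree //p' |
    while IFS= read -r wt; do
      case "$wt" in
        */.qwen/tmp/*) git worktree remove --force "$wt" >/dev/null 2>&1 || true ;;
      esac
    done
  # Drop bookkeeping for worktree directories that no longer exist.
  git worktree prune
  # Delete review branches now that no worktree pins them.
  git for-each-ref --format='%(refname:short)' refs/heads/qwen-review/ |
    while IFS= read -r branch; do
      git branch -D "$branch" >/dev/null 2>&1 || true
    done
}
```

Removing the worktrees before deleting the branches matters, since git refuses to delete a branch that is still checked out in a worktree.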
Drop the OWNER/MEMBER/COLLABORATOR author gate on pull_request_target so fork and external-contributor PRs are triaged automatically, to speed up PR handling. Accepted-risk note: product-decision/review run an LLM agent that reads untrusted PR content while CI_BOT_PAT is in the job env, so a prompt injection could exfiltrate the token. This is a known, accepted trade-off for full automation under current org constraints (no fine-grained PAT / GitHub App available). Code execution (tmux) stays gated on is_fork==false.
The product-decision and review agents read untrusted PR content, so they must never hold a write-capable credential. Move the write PAT out of both agents and into separate no-agent "Publish" steps:
- Agents run with QWEN_REVIEW_READ_TOKEN (zero-scope, public reads only) and emit their comment/verdict to files via QWEN_PRODUCT_EMIT_ONLY / QWEN_REVIEW_EMIT_ONLY. They fail closed if the read token is unset, and isolate GH_CONFIG_DIR so gh cannot fall back to the runner's ambient auth.
- Publish steps hold CI_BOT_PAT, run no agent, and read only agent-emitted files, so untrusted PR content cannot reach the PAT.
- review verdict now reads the emitted event (fail-closed: only APPROVE / COMMENT pass) instead of re-querying the bot's posted review.
- Publish review is gated on explicit success() so a crashed/partial review never publishes; product request-changes dedups against an existing CHANGES_REQUESTED review on re-runs.
- Gate auto-approval behind is_fork == 'false': fork PRs get product + review but a human approves.
Residual risk (deferred): OPENAI_API_KEY is still in the agent env and could be exfiltrated via egress; mitigate later with an egress allowlist + a spend-limited, rotatable model key.
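Two of the fail-closed checks above can be sketched in shell. The variable name QWEN_REVIEW_READ_TOKEN and the APPROVE / COMMENT rule come from the commit message; the helper names `require_read_token` and `verdict_from_event` are hypothetical.

```shell
# Refuse to start the agent when the read-only token is missing,
# rather than letting gh fall back to some other credential.
require_read_token() {
  if [ -z "${QWEN_REVIEW_READ_TOKEN:-}" ]; then
    echo "QWEN_REVIEW_READ_TOKEN is unset; refusing to run the agent" >&2
    return 1
  fi
}

# Map the review event emitted by the agent to a pipeline verdict.
# Only APPROVE and COMMENT pass; anything else, including empty, fails.
verdict_from_event() {
  case "$1" in
    APPROVE|COMMENT) echo "pass" ;;
    *)               echo "fail" ;;
  esac
}
```

Treating every unrecognised event as a failure means a malformed or missing emit file cannot be read as approval.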
actionlint/shellcheck SC2012: ls with a glob mishandles non-alphanumeric filenames. Switch the review-result fallback lookup to find -maxdepth 1.
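The SC2012 fix replaces parsing `ls` output with `find`. A minimal sketch of such a lookup is shown below; the helper name `first_result_file` and the `*.json` pattern are assumptions, since the commit does not show the actual glob.

```shell
# Pick the first matching file directly inside a directory without
# parsing `ls` output, so names containing spaces survive intact.
first_result_file() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f -name '*.json' -print | sort | head -n 1
}
```

This still assumes file names contain no newlines; a stricter version could use `-print0` with a NUL-aware reader.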
What this PR does
Replaces the monolithic /triage skill + standalone qwen-code-pr-review.yml with a single qwen-pr-triage.yml that orchestrates a staged pipeline, and narrows qwen-triage.yml to issues only (qwen-code-pr-review.yml is deleted).
Each stage gates the next: a failed gate stops the pipeline there, and the Actions DAG view shows exactly which stage a PR is sitting at.
- New skills: /product-decision (PR-template + product-direction gate) and /approval-decision (final verdict).
- The three reasoning stages run on the self-hosted ecs-qwen runner using its preinstalled qwen CLI (no per-run install via the action): product-decision and review only read the PR through the API and do not execute code; tmux-testing executes code only for internal PRs (is_fork == false).
- product-decision reads a local Claude Code checkout on the runner (CLAUDE_CODE_SRC ← vars.CLAUDE_CODE_SRC_PATH) for direction signals, falling back to the remote CHANGELOG when unset. Its qwen exit code propagates, so a timed-out or crashed gate run fails the job instead of defaulting the verdict to pass.
- Exactly one stage owns the formal review verdict on every path: template fail → product-decision posts request-changes; review finds blockers → the /review CHANGES_REQUESTED is the verdict (tmux + approval skip); all pass → approval-decision posts the single final verdict. No path produces competing formal reviews.
- A crashed or timed-out review fails its job instead of silently defaulting to pass, and the verdict read-back fails closed on API errors (retry 3×, then fail), so a network blip is never mistaken for a passing review.
- The four stage jobs share one per-PR concurrency group, so a new push cancels the whole in-flight pipeline (not just the same stage) and restarts the debounce clock.
- ECS cross-border network resilience: all three self-hosted checkouts abort stalled transfers (GIT_HTTP_LOW_SPEED_LIMIT/TIME) and retry once instead of hanging until the job timeout; gh API calls in the review job retry with backoff.
- CI logs: qwen --output-format stream-json is piped through a shared jq --unbuffered formatter into readable one-line progress (tool calls / narration / result) instead of raw JSON, keeping a raw .jsonl and falling back to cat if jq is absent.
- Runner scratch state (verdict files, raw logs, tmux artifact dir) lives under runner.temp, which is emptied per job and is per runner instance, instead of shared persistent /tmp.
- Automatic runs are gated to internal authors (OWNER/MEMBER/COLLABORATOR), as before the split; external/fork PRs run only on an explicit maintainer trigger. The queued-acknowledgement comment, review-thread/review-body triggers, post-debounce PR state re-check, and the timeout_minutes dispatch input (default 90) are all carried over from the old workflow.
- AGENTS.md: front-loads the PR-template requirement so agents fill it before opening a PR.
Why it's needed
The single-run workflow conflated product judgment, code review, and testing into one opaque pass. Staged gates let each stage block early, run with least privilege, and be triggered/retried independently. Running on the self-hosted runner with the preinstalled CLI avoids repeated installs and lets product-decision reason against local source. Raw stream-json logs were unreadable; the jq formatter makes runs debuggable without the Ink TUI cost a tmux capture would add. Front-loading the template stops PRs from being auto-flagged by the new gate right after creation.
Reviewer Test Plan
How to verify
- workflow_dispatch on an open internal PR → confirm the serial DAG resolve → product-decision → review → tmux-testing → approval-decision (this also works pre-merge: gh workflow run qwen-pr-triage.yml --ref <this branch> -f pr_number=<N> uses the branch's workflow file).
- product-decision posts a comment carrying the <!-- qwen-triage:product --> marker; a PR missing template sections gets CHANGES_REQUESTED.
- CI logs show ▶ tool / 💬 text / ■ result lines streaming in real time (not raw JSON).
- @qwen-code /review comment trigger skips product-decision and runs review only; @qwen-code /triage runs the full pipeline; both skip the debounce wait and get an immediate "queued" ack comment.
- Auto-triggered review waits for the debounce delay (qwen-pr-review-delay environment, aligned with ci: extend qwen PR review timeout to 90min and queue delay to 30min #4962).
Evidence (Before & After)
Not user-visible (CI workflow + skills). Log readability: before = raw stream-json envelopes; after = one readable line per event. Locally validated: YAML parses, yamllint + actionlint clean (custom runner label aside), act -g renders the expected serial DAG, and the resolve script was executed directly against the live API for all three trigger paths (workflow_dispatch → full, @qwen-code /review → review_only + ack, @qwen-code /triage → full + ack) with correct outputs.
Tested on
Environment (optional)
Self-hosted ecs-qwen runners must have qwen CLI, jq, and tmux preinstalled; vars.CLAUDE_CODE_SRC_PATH should point at a local Claude Code checkout (the skill falls back to the remote CHANGELOG until set). Local validation used act 0.2.88 + direct script execution; not yet run end-to-end on the runner, which needs a maintainer workflow_dispatch.
Risk & Scope
- product-decision on self-hosted + --approval-mode yolo drops the action's settings_json tool allowlist (a fork-PR defense-in-depth layer); this matches review's existing posture.
- product-decision runs undebounced for fast template feedback: rapid pushes re-run it (each cancelled by the next), trading some ECS time for author experience.
- Out of scope: an end-to-end run on the runner (needs a maintainer workflow_dispatch); setting vars.CLAUDE_CODE_SRC_PATH is runner configuration, not part of this PR.
- Breaking: qwen-code-pr-review.yml removed (supersedes ci: extend qwen PR review timeout to 90min and queue delay to 30min #4962, whose content is carried here); qwen-triage.yml no longer triages PRs (issues only); auto-review waits up to 30 min after a push.
Linked Issues
None — CI infrastructure refactor.
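Chinese summary
Replaces the former "monolithic /triage skill + standalone qwen-code-pr-review.yml" with a single qwen-pr-triage.yml that orchestrates a staged pipeline: resolve → product-decision → review → tmux-testing → approval-decision (serially gated: when a stage blocks, the pipeline stops there, and the Actions DAG view shows directly which step a PR is stuck at). qwen-triage.yml is narrowed to issues only, and qwen-code-pr-review.yml is deleted.
- New skills: /product-decision (PR template + product-direction gate) and /approval-decision (final verdict).
- The reasoning stages run on the ecs-qwen runner with its preinstalled qwen CLI (no more per-run install via the action): product-decision / review only read the PR through the API and do not execute code; tmux-testing executes code only for internal (non-fork) PRs.
- product-decision reads the local Claude Code source on the runner (CLAUDE_CODE_SRC ← vars.CLAUDE_CODE_SRC_PATH) for direction judgment, falling back to the remote CHANGELOG when unset; its qwen exit code propagates, so on timeout/crash the job fails instead of passing by default.
- Review finds blockers → the /review CHANGES_REQUESTED is the verdict (tmux and approval skip); all pass → approval-decision posts the single final verdict. No path produces competing formal reviews.
- Self-hosted checkouts abort stalled transfers (GIT_HTTP_LOW_SPEED_LIMIT/TIME) and retry once instead of hanging until the job timeout; gh API calls in the review job retry with backoff.
- stream-json is rendered in real time by a shared jq --unbuffered formatter into readable single lines (tool calls / narration / result), keeping the raw .jsonl and falling back to cat when jq is missing.
- Scratch state lives under runner.temp (emptied per job, separate per instance) instead of the shared persistent /tmp.
- The timeout_minutes manual input (default 90), among the behaviors carried over from the old workflow, is retained.
- AGENTS.md: front-loads the PR-template requirement so agents fill in the template before creating a PR, avoiding an immediate triage rejection.
Why: staged gating allows early blocking, least privilege, and independent trigger/retry; the preinstalled CLI saves installs and can read local source; jq rendering makes logs readable without paying the Ink TUI cost; front-loading the template avoids being stopped by the new gate.
Verification: yamllint + actionlint pass locally; act -g renders the expected serial DAG; the resolve script was run directly against the real API for all three trigger paths (dispatch → full, /review → review_only + ack, /triage → full + ack) with all outputs correct. Before merge, gh workflow run qwen-pr-triage.yml --ref <this branch> -f pr_number=<N> can be used for an end-to-end test (dispatch uses the workflow file on the branch). Not yet run end-to-end on the runner; a maintainer needs to dispatch once.
Risk/scope: switching product-decision to yolo removes one tool-allowlist layer for forks (consistent with review); product-decision is not debounced (a trade-off for fast template feedback; on consecutive pushes it re-runs and is cancelled by later triggers); the runner needs qwen, jq, and tmux preinstalled. Out of scope: setting CLAUDE_CODE_SRC_PATH (the skill falls back to the remote CHANGELOG when unset) and end-to-end testing on the runner. Breaking: deletes qwen-code-pr-review.yml (absorbs and supersedes #4962); qwen-triage.yml no longer handles PRs; automatic review waits up to 30 minutes after a push.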