GH#14964: require model escalation before BLOCKED in worker policy by alex-solovyev · Pull Request #15190 · marcusquinn/aidevops

alex-solovyev · 2026-04-01T17:17:56Z

Summary

Implements the worker escalation policy required by GH#14964: workers must attempt model escalation before exiting BLOCKED or handing off to a human.

Changes

.agents/prompts/worker-efficiency-protocol.md — Added section 4 "Model escalation before BLOCKED" with decision matrix and escalation pattern
.agents/scripts/commands/full-loop.md — Added rule 9 to headless dispatch rules: BLOCKED only valid after escalation attempt
.agents/scripts/headless-runtime-helper.sh — Added rule 6 to the HEADLESS_CONTINUATION_CONTRACT_V1 injected into every worker prompt
.agents/tools/context/model-routing.md — Extended "Failure-Based Escalation" section with BLOCKED policy reference
.agents/scripts/tests/test-headless-contract-escalation.sh — 8 regression tests verifying contract content and injection behavior

Acceptance Criteria

[DONE] Headless worker contract updated to require escalation before BLOCKED
[DONE] full-loop.md headless dispatch rules updated
[DONE] worker-efficiency-protocol.md has escalation decision matrix
[DONE] model-routing.md cross-references the new policy
[DONE] Regression tests added and passing (8/8)

Runtime Testing

Risk level: Low — documentation/prompt changes only, no runtime code paths changed. The test script validates contract content statically.

Test results:

PASS contract heredoc includes GH#14964 escalation reference
PASS contract heredoc includes model escalation requirement text
PASS contract heredoc includes rule 6 (escalation)
PASS contract not injected when AIDEVOPS_HEADLESS_APPEND_CONTRACT=0
PASS contract not injected for non-/full-loop prompts
PASS contract injection is idempotent (not duplicated)
PASS contract distinguishes genuine blockers from invalid ones
PASS contract injected for /full-loop prompts

Ran 8 tests, 0 failed.
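
The injection behaviour these tests exercise can be sketched roughly as follows. This is a minimal Python illustration, not the Bash code in headless-runtime-helper.sh. The function name `append_contract`, the placeholder contract text, the substring check for `/full-loop`, and the assumption that injection is enabled unless `AIDEVOPS_HEADLESS_APPEND_CONTRACT=0` are all hypothetical.

```python
import os

# Marker name taken from the PR description; the surrounding text is a placeholder.
CONTRACT_MARKER = "HEADLESS_CONTINUATION_CONTRACT_V1"
CONTRACT_TEXT = (
    f"[{CONTRACT_MARKER}]\n"
    "6. Attempt model escalation before exiting BLOCKED (GH#14964)."
)


def append_contract(prompt, env=None):
    """Return the prompt with the contract appended when injection applies."""
    env = os.environ if env is None else env
    # Assumption: injection is on by default and only "0" turns it off.
    if env.get("AIDEVOPS_HEADLESS_APPEND_CONTRACT", "1") == "0":
        return prompt
    # Assumption: a /full-loop prompt is detected by a simple substring check.
    if "/full-loop" not in prompt:
        return prompt
    # Idempotency: never append the contract a second time.
    if CONTRACT_MARKER in prompt:
        return prompt
    return f"{prompt}\n\n{CONTRACT_TEXT}"
```

Checking for the marker before appending is what would make a repeated call a no-op, which corresponds to the idempotency test listed above.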

Closes #14964

aidevops.sh v3.5.555 plugin for OpenCode v1.3.13 with claude-sonnet-4-6

Summary by CodeRabbit

Documentation
- Updated internal protocol documentation and model escalation guidance.
Tests
- Added regression test suite for model escalation behavior validation.

Note: This release contains internal infrastructure and documentation updates with no visible changes to end-user functionality.

…964)

- Add rule 6 to headless continuation contract: workers must attempt model escalation (sonnet → opus) before exiting BLOCKED
- Update full-loop.md headless dispatch rules with escalation requirement
- Add escalation decision matrix to worker-efficiency-protocol.md
- Clarify worker BLOCKED policy in model-routing.md failure-based escalation
- Add regression test suite (8 tests) verifying contract content and injection

Review-policy metadata, nominal GitHub states, and lower-tier model limits are not valid blockers on their own. Genuine blockers require evidence that persists after escalation.

Closes #14964

coderabbitai · 2026-04-01T17:18:04Z

Walkthrough

This pull request enforces a mandatory worker policy requiring model escalation before marking tasks as BLOCKED. Changes span protocol documentation, command specifications, script contract injection, model routing guidance, and new regression tests to validate escalation behavior.

Changes

Cohort / File(s)	Summary
Policy Documentation `.agents/prompts/worker-efficiency-protocol.md`, `.agents/scripts/commands/full-loop.md`, `.agents/tools/context/model-routing.md`	Updated worker policy to mandate model escalation before exiting `BLOCKED`. Clarifies that review-policy metadata, nominal GitHub states, and lower-tier model limits are not standalone blockers. Adds decision matrix mapping specific failure scenarios (model stalls, auth errors, failing checks) to escalation vs. immediate block determination.
Runtime Contract Injection `.agents/scripts/headless-runtime-helper.sh`	Injected new mandatory rule `#6` into the headless worker contract requiring escalation attempt (sonnet → opus via `--model anthropic/claude-opus-4-6`) before returning `BLOCKED` outcome. Contract rule specifies that genuine blockers require evidence persisting after escalation.
Regression Test Suite `.agents/scripts/tests/test-headless-contract-escalation.sh`	Added comprehensive Bash test script (265 lines) validating contract injection behavior. Includes extraction helpers for contract text via regex, behavioral tests for marker presence/absence across prompt types, idempotency checks, and early-exit validation when `headless-runtime-helper.sh` is missing.

Sequence Diagram

sequenceDiagram
    participant W as Worker (Tier 1)
    participant M1 as Model: Sonnet
    participant M2 as Model: Opus
    participant GH as GitHub
    
    W->>M1: Attempt task with tier-1 model
    M1->>GH: Fetch PR/issue context
    GH-->>M1: Context data
    M1->>M1: Detect stall/unable to proceed
    
    alt Escalation Path
        M1-->>W: Signal escalation needed
        W->>M2: Re-invoke with --model opus-4-6
        M2->>GH: Fetch context (retained state)
        GH-->>M2: Context data
        M2->>M2: Attempt with higher capacity
        M2-->>W: Success or genuine blocker evidence
    else Genuine Blocker
        M2-->>W: Evidence persists (failing checks,<br/>missing permissions, conflicts)
        W->>W: Return BLOCKED (justified)
    end
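
As a rough illustration, the guard implied by the diagram can be written as a small predicate. This is a sketch under stated assumptions, not the decision matrix from worker-efficiency-protocol.md: the names `blocked_is_justified` and `escalation_flags` and the reason strings are hypothetical, and the sketch ignores any scenarios the real matrix may treat as immediately blocking.

```python
# Reasons the PR says are not valid blockers on their own (labels are illustrative).
NOT_BLOCKERS_ON_THEIR_OWN = frozenset({
    "review-policy metadata",
    "nominal GitHub state",
    "lower-tier model limit",
})

# Model id quoted in the PR walkthrough for the escalation step.
ESCALATED_MODEL = "anthropic/claude-opus-4-6"


def escalation_flags():
    """Flags for re-invoking the worker on the higher tier."""
    return ["--model", ESCALATED_MODEL]


def blocked_is_justified(reason, escalation_attempted, evidence_persists):
    """BLOCKED is justified only with evidence that survives an escalation attempt."""
    if reason in NOT_BLOCKERS_ON_THEIR_OWN:
        return False
    return escalation_attempted and evidence_persists
```

For example, `blocked_is_justified("failing checks", True, True)` is `True`, while the same reason without an escalation attempt is rejected.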

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🚀 When models hit a wall and stall,
Don't give up—escalate the call!
From Sonnet's reach to Opus's might,
Let higher tiers illuminate the night.
Only when all paths are spent,
Then claim the task is truly BLOCKED and bent.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly summarizes the main change: requiring model escalation before BLOCKED in worker policy, with explicit issue reference.
Linked Issues check	✅ Passed	All acceptance criteria from GH#14964 are met: escalation requirement enforced in headless contract, failure classification defined, regression tests added, and full-loop updated.
Out of Scope Changes check	✅ Passed	All changes directly address GH#14964 requirements; no extraneous modifications detected outside scope of worker escalation policy.

github-actions · 2026-04-01T17:18:37Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

SonarCloud: 0 bugs, 0 vulnerabilities, 1 code smells

Wed Apr 1 17:18:32 UTC 2026: Code review monitoring started
Wed Apr 1 17:18:33 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 1

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 1
VULNERABILITIES: 0

Generated on: Wed Apr 1 17:18:35 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-04-01T17:18:57Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

gemini-code-assist

Code Review

This pull request implements a mandatory model escalation policy (GH#14964) requiring workers to attempt higher-tier model execution (e.g., Sonnet to Opus) before exiting with a BLOCKED status. The changes update the worker efficiency protocol, full-loop command documentation, and the headless runtime helper script to enforce this behavior. Additionally, a new regression test suite, test-headless-contract-escalation.sh, has been introduced to verify that the escalation rules are correctly injected and maintained within the headless contract. I have no feedback to provide.

coderabbitai

🧹 Nitpick comments (1)

.agents/scripts/tests/test-headless-contract-escalation.sh (1)
75-114: Function extraction via Python subprocess is a pragmatic testing approach.

Extracting and running append_worker_headless_contract in isolation avoids sourcing the full script with its main() execution. The regex relies on the function ending with \n} at the outermost level, which works because the embedded heredoc ends with EOF, not a brace.

One note: if the function structure ever changes to include nested } at line start, the regex could truncate. Consider documenting this assumption or using a more robust AST-based extraction in the future if the function grows in complexity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/tests/test-headless-contract-escalation.sh around lines 75-114,
the current extraction in _call_append_contract uses a regex in the Python
snippet to grab append_worker_headless_contract from HEADLESS_HELPER, which can
fail if the function contains nested braces starting at line-begin. Replace the
fragile regex approach by implementing a balanced-brace extractor (e.g., locate
the line that starts "append_worker_headless_contract()", then read lines while
counting "{" and "}" until the count returns to zero) in the Python block, or at
minimum add an inline comment documenting the assumption about no nested leading
"}" and why the regex is acceptable. Reference the symbols
append_worker_headless_contract, _call_append_contract, and HEADLESS_HELPER when
making the change.
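
The balanced-brace approach described above could look roughly like this. It is a minimal sketch, not the code in the test script: `extract_shell_function` and the `SAMPLE` text are hypothetical, and the counter is deliberately naive, so an unbalanced brace inside a string or heredoc body would still confuse it.

```python
def extract_shell_function(source, name):
    """Return the text of shell function `name` by counting braces line by line."""
    header = f"{name}()"
    collected = []
    depth = 0
    seen_open = False
    capturing = False
    for line in source.splitlines():
        if not capturing:
            # Skip everything until the function header line.
            if not line.lstrip().startswith(header):
                continue
            capturing = True
        collected.append(line)
        if "{" in line:
            seen_open = True
        # Naive count: braces inside strings or heredocs are counted too.
        depth += line.count("{") - line.count("}")
        if seen_open and depth == 0:
            return "\n".join(collected)
    raise ValueError(f"function {name!r} not found or braces unbalanced")


# Hypothetical input with a nested "}" at line start, the case that would
# truncate a regex anchored on the first "\n}".
SAMPLE = """\
helper() {
  echo "before"
}
append_worker_headless_contract() {
  local prompt="$1"
  {
    echo "${prompt}"
}
  echo "done"
}
main() {
  :
}
"""

EXTRACTED = extract_shell_function(SAMPLE, "append_worker_headless_contract")
```

Because the counter only returns to zero at the function's own closing brace, the nested `}` at line start no longer ends the capture early.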

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 388455f and 69b75bd.

📒 Files selected for processing (5)

.agents/prompts/worker-efficiency-protocol.md
.agents/scripts/commands/full-loop.md
.agents/scripts/headless-runtime-helper.sh
.agents/scripts/tests/test-headless-contract-escalation.sh
.agents/tools/context/model-routing.md

codacy-production · 2026-04-01T17:23:54Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

marcusquinn

Auto-approved by pulse — collaborator PR (author: @alex-solovyev). All pre-merge checks passed.

marcusquinn · 2026-04-01T17:26:44Z

Completed via PR #15190, merged to main.

Merged by deterministic merge pass (pulse-wrapper.sh). No worker summary was available — the worker either crashed before writing one or this PR predates the merge summary convention.

alex-solovyev added the origin:worker label (Created from worker session) Apr 1, 2026

gemini-code-assist bot reviewed Apr 1, 2026

coderabbitai bot reviewed Apr 1, 2026

coderabbitai bot approved these changes Apr 1, 2026

marcusquinn approved these changes Apr 1, 2026

marcusquinn merged commit e76ed95 into main Apr 1, 2026
39 checks passed

marcusquinn deleted the feature/GH-14964-worker-escalation-policy branch April 1, 2026 17:26

marcusquinn mentioned this pull request Apr 1, 2026

Worker policy: require model escalation before BLOCKED or human handoff #14964

Closed
