
GH#14964: require model escalation before BLOCKED in worker policy #15190

Merged
marcusquinn merged 1 commit into main from feature/GH-14964-worker-escalation-policy
Apr 1, 2026

@alex-solovyev (Collaborator) commented Apr 1, 2026

Summary

Implements the worker escalation policy required by GH#14964: workers must attempt model escalation before exiting BLOCKED or handing off to a human.

Changes

  • .agents/prompts/worker-efficiency-protocol.md — Added section 4 "Model escalation before BLOCKED" with decision matrix and escalation pattern
  • .agents/scripts/commands/full-loop.md — Added rule 9 to headless dispatch rules: BLOCKED only valid after escalation attempt
  • .agents/scripts/headless-runtime-helper.sh — Added rule 6 to the HEADLESS_CONTINUATION_CONTRACT_V1 injected into every worker prompt
  • .agents/tools/context/model-routing.md — Extended "Failure-Based Escalation" section with BLOCKED policy reference
  • .agents/scripts/tests/test-headless-contract-escalation.sh — 8 regression tests verifying contract content and injection behavior

Acceptance Criteria

  • [DONE] Headless worker contract updated to require escalation before BLOCKED
  • [DONE] full-loop.md headless dispatch rules updated
  • [DONE] worker-efficiency-protocol.md has escalation decision matrix
  • [DONE] model-routing.md cross-references the new policy
  • [DONE] Regression tests added and passing (8/8)

Runtime Testing

Risk level: Low — documentation/prompt changes only, no runtime code paths changed. The test script validates contract content statically.

Test results:

PASS contract heredoc includes GH#14964 escalation reference
PASS contract heredoc includes model escalation requirement text
PASS contract heredoc includes rule 6 (escalation)
PASS contract not injected when AIDEVOPS_HEADLESS_APPEND_CONTRACT=0
PASS contract not injected for non-/full-loop prompts
PASS contract injection is idempotent (not duplicated)
PASS contract distinguishes genuine blockers from invalid ones
PASS contract injected for /full-loop prompts

Ran 8 tests, 0 failed.
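
The injection behavior these tests exercise can be sketched roughly as follows. This is an illustrative Python model, not the Bash code in headless-runtime-helper.sh: the function name `append_contract`, the placeholder contract text, the substring check for /full-loop, and the assumption that injection is enabled by default are all hypothetical. Only the marker name and the AIDEVOPS_HEADLESS_APPEND_CONTRACT=0 opt-out come from this PR.

```python
import os

# Marker name taken from the PR description; the contract body is a placeholder.
CONTRACT_MARKER = "HEADLESS_CONTINUATION_CONTRACT_V1"
CONTRACT_TEXT = (
    f"[{CONTRACT_MARKER}]\n"
    "6. Attempt model escalation before exiting BLOCKED (GH#14964)."
)


def append_contract(prompt, env=None):
    """Return the prompt with the contract appended when the rules allow it."""
    env = os.environ if env is None else env
    # Opt-out switch: injection is skipped when the variable is "0".
    # Assumption: injection is on when the variable is unset.
    if env.get("AIDEVOPS_HEADLESS_APPEND_CONTRACT", "1") == "0":
        return prompt
    # Only /full-loop prompts receive the contract (assumed: substring match).
    if "/full-loop" not in prompt:
        return prompt
    # Idempotency: never append a second copy of the contract.
    if CONTRACT_MARKER in prompt:
        return prompt
    return f"{prompt}\n\n{CONTRACT_TEXT}"
```

Calling `append_contract` twice on the same /full-loop prompt returns the same text as calling it once, which is the property the idempotency test checks.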

Closes #14964


aidevops.sh v3.5.555 plugin for OpenCode v1.3.13 with claude-sonnet-4-6

Summary by CodeRabbit

  • Documentation

    • Updated internal protocol documentation and model escalation guidance.
  • Tests

    • Added regression test suite for model escalation behavior validation.

Note: This release contains internal infrastructure and documentation updates with no visible changes to end-user functionality.

…964)

- Add rule 6 to headless continuation contract: workers must attempt model
  escalation (sonnet → opus) before exiting BLOCKED
- Update full-loop.md headless dispatch rules with escalation requirement
- Add escalation decision matrix to worker-efficiency-protocol.md
- Clarify worker BLOCKED policy in model-routing.md failure-based escalation
- Add regression test suite (8 tests) verifying contract content and injection

Review-policy metadata, nominal GitHub states, and lower-tier model limits
are not valid blockers on their own. Genuine blockers require evidence that
persists after escalation.

Closes #14964
@alex-solovyev added the origin:worker (Created from worker session) label Apr 1, 2026
@coderabbitai bot (Contributor) commented Apr 1, 2026

Walkthrough

This pull request enforces a mandatory worker policy requiring model escalation before marking tasks as BLOCKED. Changes span protocol documentation, command specifications, script contract injection, model routing guidance, and new regression tests to validate escalation behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Policy Documentation: .agents/prompts/worker-efficiency-protocol.md, .agents/scripts/commands/full-loop.md, .agents/tools/context/model-routing.md | Updated worker policy to mandate model escalation before exiting BLOCKED. Clarifies that review-policy metadata, nominal GitHub states, and lower-tier model limits are not standalone blockers. Adds decision matrix mapping specific failure scenarios (model stalls, auth errors, failing checks) to escalation vs. immediate block determination. |
| Runtime Contract Injection: .agents/scripts/headless-runtime-helper.sh | Injected new mandatory rule #6 into the headless worker contract requiring escalation attempt (sonnet → opus via --model anthropic/claude-opus-4-6) before returning BLOCKED outcome. Contract rule specifies that genuine blockers require evidence persisting after escalation. |
| Regression Test Suite: .agents/scripts/tests/test-headless-contract-escalation.sh | Added comprehensive Bash test script (265 lines) validating contract injection behavior. Includes extraction helpers for contract text via regex, behavioral tests for marker presence/absence across prompt types, idempotency checks, and early-exit validation when headless-runtime-helper.sh is missing. |

Sequence Diagram

sequenceDiagram
    participant W as Worker (Tier 1)
    participant M1 as Model: Sonnet
    participant M2 as Model: Opus
    participant GH as GitHub
    
    W->>M1: Attempt task with tier-1 model
    M1->>GH: Fetch PR/issue context
    GH-->>M1: Context data
    M1->>M1: Detect stall/unable to proceed
    
    alt Escalation Path
        M1-->>W: Signal escalation needed
        W->>M2: Re-invoke with --model opus-4-6
        M2->>GH: Fetch context (retained state)
        GH-->>M2: Context data
        M2->>M2: Attempt with higher capacity
        M2-->>W: Success or genuine blocker evidence
    else Genuine Blocker
        M2-->>W: Evidence persists (failing checks,<br/>missing permissions, conflicts)
        W->>W: Return BLOCKED (justified)
    end
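
The flow in the diagram can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the worker's actual logic: the `Attempt` type, the `run` callback, and the outcome strings are hypothetical, and the sonnet model id is an assumed counterpart to the opus id quoted in the summary above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# The opus id appears in the review summary; the sonnet id is assumed.
TIERS = ["anthropic/claude-sonnet-4-6", "anthropic/claude-opus-4-6"]


@dataclass
class Attempt:
    model: str
    succeeded: bool
    blocker_evidence: Optional[str] = None  # e.g. "failing checks"


def run_with_escalation(task: str, run: Callable[[str, str], Attempt]) -> str:
    """Try each tier in order; report BLOCKED only after the top tier fails."""
    last = None
    for model in TIERS:
        last = run(model, task)
        if last.succeeded:
            return f"DONE via {model}"
    # Reached only once the escalated attempt has also failed.
    if last is not None and last.blocker_evidence:
        return f"BLOCKED: {last.blocker_evidence}"
    # Hypothetical outcome for a failure with no evidence to report.
    return "FAILED: no blocker evidence after escalation"
```

The point of the structure is that a BLOCKED result cannot be produced by the first tier alone: the evidence has to come from the escalated attempt.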

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🚀 When models hit a wall and stall,
Don't give up—escalate the call!
From Sonnet's reach to Opus's might,
Let higher tiers illuminate the night.
Only when all paths are spent,
Then claim the task is truly BLOCKED and bent.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: requiring model escalation before BLOCKED in worker policy, with explicit issue reference. |
| Linked Issues check | ✅ Passed | All acceptance criteria from GH#14964 are met: escalation requirement enforced in headless contract, failure classification defined, regression tests added, and full-loop updated. |
| Out of Scope Changes check | ✅ Passed | All changes directly address GH#14964 requirements; no extraneous modifications detected outside scope of worker escalation policy. |


@github-actions bot (Contributor) commented Apr 1, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

SonarCloud: 0 bugs, 0 vulnerabilities, 1 code smell

Wed Apr 1 17:18:32 UTC 2026: Code review monitoring started
Wed Apr 1 17:18:33 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 1

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 1
  • VULNERABILITIES: 0

Generated on: Wed Apr 1 17:18:35 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud bot commented Apr 1, 2026

@gemini-code-assist bot left a comment

Code Review

This pull request implements a mandatory model escalation policy (GH#14964) requiring workers to attempt higher-tier model execution (e.g., Sonnet to Opus) before exiting with a BLOCKED status. The changes update the worker efficiency protocol, full-loop command documentation, and the headless runtime helper script to enforce this behavior. Additionally, a new regression test suite, test-headless-contract-escalation.sh, has been introduced to verify that the escalation rules are correctly injected and maintained within the headless contract. I have no feedback to provide.

@coderabbitai bot (Contributor) left a comment

🧹 Nitpick comments (1)
.agents/scripts/tests/test-headless-contract-escalation.sh (1)

75-114: Function extraction via Python subprocess is a pragmatic testing approach.

Extracting and running append_worker_headless_contract in isolation avoids sourcing the full script with its main() execution. The regex relies on the function ending with \n} at the outermost level, which works because the embedded heredoc ends with EOF, not a brace.

One note: if the function structure ever changes to include nested } at line start, the regex could truncate. Consider documenting this assumption or using a more robust AST-based extraction in the future if the function grows in complexity.
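
A brace-counting extractor along the lines suggested here might look like the sketch below. It is illustrative only: `extract_shell_function` is a hypothetical helper, not code from the test script, and it counts every `{` and `}` character, including those inside strings, comments, and heredocs, so it assumes braces in the function body are balanced.

```python
def extract_shell_function(source: str, name: str) -> str:
    """Return the text of `name() { ... }` by counting braces line by line."""
    lines = source.splitlines()
    header = f"{name}()"
    for start, line in enumerate(lines):
        if line.startswith(header):
            break
    else:
        raise ValueError(f"function {name} not found")

    depth = 0
    seen_open = False
    for end in range(start, len(lines)):
        # Naive count: braces in strings and heredocs are counted too.
        depth += lines[end].count("{") - lines[end].count("}")
        seen_open = seen_open or "{" in lines[end]
        if seen_open and depth == 0:
            return "\n".join(lines[start : end + 1])
    raise ValueError(f"unbalanced braces in {name}")
```

Unlike a regex that stops at the first `}` at line start, this keeps reading until the opening brace of the function is closed, so a nested block that ends with `}` in column one does not truncate the result.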

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/tests/test-headless-contract-escalation.sh around lines 75 -
114, The current extraction in _call_append_contract uses a regex in the Python
snippet to grab append_worker_headless_contract from HEADLESS_HELPER which can
fail if the function contains nested braces starting at line-begin; replace the
fragile regex approach by implementing a balanced-brace extractor (e.g., locate
the line that starts "append_worker_headless_contract()" then read lines while
counting "{" and "}" until the count returns to zero) in the Python block, or at
minimum add an inline comment documenting the assumption about no nested leading
"}" and why the regex is acceptable; reference the symbols
append_worker_headless_contract, _call_append_contract, and HEADLESS_HELPER when
making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 079ffa2d-1dde-4749-879c-3eb4cbf4ab9a

📥 Commits

Reviewing files that changed from the base of the PR and between 388455f and 69b75bd.

📒 Files selected for processing (5)
  • .agents/prompts/worker-efficiency-protocol.md
  • .agents/scripts/commands/full-loop.md
  • .agents/scripts/headless-runtime-helper.sh
  • .agents/scripts/tests/test-headless-contract-escalation.sh
  • .agents/tools/context/model-routing.md

@codacy-production

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


@marcusquinn (Owner) left a comment

Auto-approved by pulse — collaborator PR (author: @alex-solovyev). All pre-merge checks passed.

@marcusquinn merged commit e76ed95 into main Apr 1, 2026
39 checks passed
@marcusquinn (Owner)

Completed via PR #15190, merged to main.

Merged by deterministic merge pass (pulse-wrapper.sh). No worker summary was available — the worker either crashed before writing one or this PR predates the merge summary convention.

@marcusquinn deleted the feature/GH-14964-worker-escalation-policy branch April 1, 2026 17:26

Labels

origin:worker Created from worker session


Development

Successfully merging this pull request may close these issues.

Worker policy: require model escalation before BLOCKED or human handoff

2 participants