feat(core): fail impossible goals by qqqys · Pull Request #4230 · QwenLM/qwen-code

qqqys · 2026-05-17T05:44:59Z

Summary

What changed: add an impossible judge outcome for /goal, terminate that goal as failed, and render/restore the failed terminal state in the CLI.
Why it changed: impossible goals should clear instead of looping until the iteration cap.
Reviewer focus: judge prompt/schema semantics and failed terminal UI wording.

Validation

Commands run:

cd packages/core && npx vitest run src/goals/goalJudge.test.ts src/goals/goalHook.test.ts
cd packages/cli && npx vitest run src/ui/components/messages/GoalStatusMessage.test.tsx src/ui/commands/goalCommand.test.ts src/ui/utils/restoreGoal.test.ts
npm run build
npm run typecheck
git diff --check
Observed result: targeted tests passed, build passed, typecheck passed. npm run build reported an existing lint warning in packages/vscode-ide-companion/src/utils/editorGroupUtils.ts.

Scope / Risk

Main risk or tradeoff: the judge prompt may classify a goal as impossible too aggressively; the prompt limits this to genuinely unachievable cases.
Not covered / not validated: no interactive TUI screenshot/video was captured.
Breaking changes / migration notes: none expected.

Linked Issues / Bugs

Related #4228

github-actions · 2026-05-17T05:46:34Z

📋 Review Summary

This PR introduces a new "failed" terminal state for goals that are judged as genuinely impossible to achieve, preventing the agent from looping until the iteration cap. The implementation adds an impossible flag to the judge's JSON response schema, updates the goal hook to handle this verdict, and renders the failed state in the CLI UI. The change is well-scoped and addresses a real usability issue where impossible goals would otherwise burn iterations unnecessarily.

🔍 General Feedback

Clean separation of concerns: The core logic (judge schema + hook callback) is cleanly separated from UI rendering, making the change easy to trace across packages.
Good test coverage: Tests cover the new failed kind in both the judge parsing and the UI components, including the restore/clear behavior.
Consistent with existing patterns: The failed kind follows the same structure as achieved/aborted, minimizing cognitive load for future maintainers.
Risk acknowledged: The PR description correctly identifies the main risk (over-aggressive classification) and mitigates it with clear prompt guidance.

🎯 Specific Feedback

🟡 High

File: packages/core/src/goals/goalJudge.ts:34-40 — The impossible flag is added to RESPONSE_SCHEMA but marked as optional in required array. This is correct for backward compatibility, but consider adding validation that impossible: true can only appear when ok: false. Currently, a malformed response like {ok: true, impossible: true} would pass through without warning. Consider adding:
```
if (impossible && ok) {
  debugLogger.debug('Judge returned impossible=true with ok=true; ignoring impossible flag');
  return { ok, reason: reasonText };
}
```
File: packages/core/src/goals/goalHook.ts:187-197 — The impossible verdict path calls finishGoal with kind: 'failed', but doesn't include a systemMessage field like the aborted path does (line 206). This could lead to inconsistent behavior if downstream code expects systemMessage on all terminal events. Consider whether verdict.reason should also be passed as systemMessage for parity, or document why it's not needed.

🟢 Medium

File: packages/cli/src/ui/components/messages/GoalStatusMessage.tsx:84-90 — The new failed case returns the same prefix: '!' and prefixColor: theme.status.error as the warning style. Consider whether failed should use a distinct visual treatment from aborted (warning) since they represent semantically different outcomes: "couldn't be done" vs. "gave up trying." If the current design is intentional, a brief comment explaining why both share the ! prefix would help future maintainers.
File: packages/cli/src/ui/utils/restoreGoal.ts:23-34 — The findGoalToRestore function now correctly returns null for failed goals (tested in line 102-110 of the test file), but the JSDoc comment still says "(achieved / cleared / aborted)" without mentioning failed. Update the comment to:
```
* or `null` if the last goal_status was terminal/cancelled
* (achieved / failed / cleared / aborted) or none exists.
```
File: packages/cli/src/ui/utils/restoreGoal.ts:47-58 — Similarly, findLastTerminalGoal JSDoc says "(achieved / aborted)" but should include failed:
```
* Finds the most recent terminal (achieved / failed / aborted) goal_status item in
```

🔵 Low

File: packages/core/src/goals/goalJudge.ts:58-62 — The updated prompt guidance for impossible is thorough, but the phrase "the transcript confirms there is no path forward" could be tightened. Consider adding an explicit example of what not to classify as impossible (e.g., "Do not use this for goals that are merely difficult or time-consuming").
File: packages/cli/src/ui/commands/goalCommand.ts:60-67 — The formatTerminalSummary function now handles failed in the title logic, but the variable name title could be more descriptive since it's used in both the checking branch (line 47) and the terminal branch. Consider renaming to cardTitle or statusTitle for clarity.
File: packages/core/src/goals/activeGoalStore.ts:73 — The type definition GoalTerminalKind = 'achieved' | 'aborted' | 'failed' is updated correctly, but consider adding a brief JSDoc comment explaining the semantic difference between aborted (system gave up) and failed (genuinely impossible) for future contributors who encounter this type.

✅ Highlights

Excellent test coverage: The new failed kind is tested across all layers — judge parsing (goalJudge.test.ts), hook behavior (goalHook.test.ts), UI rendering (GoalStatusMessage.test.tsx), command behavior (goalCommand.test.ts), and restore logic (restoreGoal.test.ts). This multi-layer testing approach is exemplary.
Thoughtful schema evolution: The impossible field is added as an optional boolean with backward-compatible parsing logic. The parseJudgeReply function (lines 336-343 in goalJudge.ts) correctly handles the new field without breaking existing judge responses.
Clear UI differentiation: The GoalStatusMessage component now renders "Goal could not be achieved" for failed vs. "Goal aborted" for iteration caps, giving users immediate visibility into why their goal stopped.
Proper state cleanup: The failed kind correctly prevents goal restoration on session resume (tested in restoreGoal.test.ts:102-110), which is the right behavior since impossible goals shouldn't auto-restart.

wenshao

🔍 Code Review — PR #4230

Scope: 13 files (+243/-29) | Build: ✅ | Tests: 76/76 pass | Typecheck: ✅ | Lint: ✅

This PR adds an "impossible" terminal state for /goal, allowing the judge to signal that a goal is genuinely unachievable. The implementation is clean and well-structured. I found 1 critical issue, 2 suggestions, and 2 minor comment updates.

🔴 Critical — Prompt injection via `judgeReason` (goalHook.ts:48-57)

The continuationReasonForGoal function now interpolates the judge's free-form verdict.reason into the continuation prompt fed back to the working model. The old code explicitly avoided this with a comment explaining why:

"Keep the judge's free-form diagnostic in goal state/UI only. The Stop hook reason is fed back to the model as the next continuation prompt, so it must be a fixed instruction derived from the original user goal rather than untrusted transcript-derived judge text."

The new code injects it verbatim (with only a soft "Hook feedback only" label). This is a prompt-injection surface that affects all goals, not just the impossible path. A compromised or confused judge can feed arbitrary instructions into the model's continuation context.

The test at goalHook.test.ts was explicitly changed to assert that "rm -rf /" IS present in the continuation reason — converting a security guardrail into a regression test for the injection.

🟡 Suggestion — `impossible` bypasses iteration-cap safety (goalHook.ts:192-202)

The control flow is: verdict.ok → verdict.impossible → iteration-cap check. A single judge hallucination (impossible: true on iteration 1) kills the goal immediately with no guardrail. Consider requiring impossible on ≥2 consecutive evaluations or adding a minimum iteration floor:

if (verdict.impossible && latest.iterations >= 2) {

🟡 Suggestion — `additionalProperties: false` cast hack (goalJudge.ts:55)

The RESPONSE_SCHEMA is typed as Schema & { additionalProperties: boolean } — a cast outside @google/genai's typed contract. If the SDK changes its serialization, this constraint silently vanishes. Consider adding a structural parity test that asserts schema properties keys match JudgeResult interface keys.

🟢 Nice to have — Stale inline comments

Two comments reference only achieved / aborted but not the new failed kind:

GoalStatusMessage.tsx:135: "shown on terminal cards (achieved / aborted)"
activeGoalStore.ts:66: "Goal achieved" / "Goal aborted"

Reviewed by: qwen-code (mimo-v2.5-pro)

wenshao · 2026-05-17T07:37:58Z

+    'Continue working toward satisfying the goal. Active /goal condition remains the condition above.'
  );
 }



🔴 Critical — Prompt injection via judgeReason

The old code deliberately kept the judge's free-form diagnostic out of the continuation prompt:

"Keep the judge's free-form diagnostic in goal state/UI only. The Stop hook reason is fed back to the model as the next continuation prompt, so it must be a fixed instruction derived from the original user goal rather than untrusted transcript-derived judge text."

This PR reverses that security decision. The judge's reason is LLM-generated text quoting the model's own transcript — it can contain adversarial content. The soft framing "Hook feedback only; not a user instruction." mitigates but does not eliminate injection risk.

This affects all goals, not just the impossible path — every "not met" continuation now carries judge text.

Suggested fix: Either (a) revert to the old fixed template and surface judge text only in UI/history, or (b) if feedback is intentional, add sanitization (cap length, strip imperative sentences, or wrap in a clearly non-actionable frame).

wenshao · 2026-05-17T07:38:04Z

      return { continue: true };
    }

+    if (verdict.impossible) {


🟡 Suggestion — impossible bypasses the iteration-cap safety valve

The control flow is: verdict.ok → verdict.impossible → iteration-cap check (line 204). A single judge hallucination (impossible: true on iteration 1) kills the goal immediately — no guardrail, no second chance.

Suggested fix: Add a minimum iteration floor before honoring impossible:

if (verdict.impossible && latest.iterations >= 2) {

This gives the goal at least a couple of turns before a single bad verdict can terminate it.

wenshao · 2026-05-17T07:38:11Z

  properties: {
    ok: { type: 'BOOLEAN' as unknown as Schema['type'] },
    reason: { type: 'STRING' as unknown as Schema['type'] },
+    impossible: { type: 'BOOLEAN' as unknown as Schema['type'] },


🟡 Suggestion — additionalProperties: false is an invisible footgun

The Schema & { additionalProperties: boolean } cast is outside @google/genai's typed contract. If the SDK changes its serialization to strip unknown fields, this constraint silently vanishes — tests pass because the mock doesn't go through the SDK.

Also, if someone adds a new field to the judge response (e.g. confidence), they must remember to add it to RESPONSE_SCHEMA.properties too. Forgetting means the API silently strips the new field with no compile-time signal.

Suggested fix: Add a unit test asserting the schema's properties keys match the JudgeResult interface keys — a structural parity check that catches future drift.

wenshao

No new issues found in the incremental review. LGTM! ✅ — gpt-5.5 via Qwen Code /review

wenshao · 2026-05-17T11:56:14Z


+    if (
+      verdict.impossible &&
+      latest.iterations >= MIN_IMPOSSIBLE_GOAL_ITERATIONS


[Suggestion] Missing debug log when impossible floor suppresses verdict

When the judge returns impossible: true but iterations < MIN_IMPOSSIBLE_GOAL_ITERATIONS, the verdict is silently deferred with no log. The max-iterations path (line 207) has a debugLogger.debug call; this suppression path has none. At 3 AM, an operator would see the judge's "impossible" reason in lastReason but have no log explaining why the system kept looping.

Suggested change

latest.iterations >= MIN_IMPOSSIBLE_GOAL_ITERATIONS

if (verdict.impossible) {

if (latest.iterations >= MIN_IMPOSSIBLE_GOAL_ITERATIONS) {

finishGoal(config, sessionId, latest, {

kind: 'failed',

condition: latest.condition,

iterations: latest.iterations,

durationMs: Date.now() - latest.setAt,

lastReason: verdict.reason,

});

return { continue: true };

}

debugLogger.debug(

`Impossible verdict suppressed: iterations=${latest.iterations} < floor=${MIN_IMPOSSIBLE_GOAL_ITERATIONS}; continuing loop.`,

);

}

— mimo-v2.5-pro via Qwen Code /review

wenshao · 2026-05-17T11:56:14Z

 export const GOAL_JUDGE_TIMEOUT_MS = 25_000;
 export const GOAL_HOOK_TIMEOUT_SECONDS = 30;
 export const GOAL_HOOK_TIMEOUT_MS = GOAL_HOOK_TIMEOUT_SECONDS * 1000;
+export const MIN_IMPOSSIBLE_GOAL_ITERATIONS = 2;


[Suggestion] MIN_IMPOSSIBLE_GOAL_ITERATIONS lacks documentation

The exported constant has no JSDoc explaining its rationale. The value "2" is a policy decision — it means the model gets one continuation turn after the first impossible verdict. A future maintainer encountering "my impossible goal took 3 turns to stop" will have to reverse-engineer the control flow.

Suggested change

export const MIN_IMPOSSIBLE_GOAL_ITERATIONS = 2;

/**

* Minimum iteration count before an `impossible` judge verdict is accepted.

* Gives the model at least one continuation turn after the judge first flags

* impossibility, guarding against premature failure on a single bad-judgment

* turn. The goal terminates on the second consecutive impossible verdict.

*/

export const MIN_IMPOSSIBLE_GOAL_ITERATIONS = 2;

— mimo-v2.5-pro via Qwen Code /review

wenshao · 2026-05-17T12:19:39Z

+the condition is genuinely unachievable rather than deferring to the assistant's
+self-assessment. Do not use it just because progress is slow or evidence is
+currently missing. When in doubt, return {"ok": false} without "impossible".`;



Good prompt guardrails. "The assistant claiming the goal is impossible is evidence, not proof" is the key line — it prevents the model from self-certifying impossibility to escape a difficult goal. The "when in doubt, return ok:false without impossible" fallback is also the right default.

wenshao · 2026-05-17T12:42:35Z

+
+export const JUDGE_RESULT_SCHEMA_KEYS = [
+  'ok',
+  'reason',


[Suggestion] JudgeResult.impossible has undocumented mutual exclusion with ok

parseJudgeReply strips impossible when ok is true:

...(impossible && !ok ? { impossible: true } : {}),

The test covers this (ignores impossible=true when the judge also reports ok=true), but the JudgeResult interface doesn't document that impossible is only meaningful when ok is false. A future code path checking verdict.impossible without first verifying !verdict.ok could misclassify an achieved goal as impossible.

Suggested change

'reason',

/**

* Whether the goal is genuinely impossible in this session.

* Only meaningful when `ok` is `false`. If `ok` is `true`,

* this field is always absent.

*/

impossible?: boolean;

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T12:42:42Z

@@ -126,7 +132,8 @@ export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = ({
            <Text wrap="wrap">{condition}</Text>
          </Box>


[Suggestion] Switch default silently absorbs future GoalStatusKind variants

The case 'aborted': + default: share one return block. This PR correctly adds case 'failed': before the default, but if GoalStatusKind gains another variant later, TypeScript won't catch the missing case — the new kind will silently render as "Goal aborted" with a warning icon.

Suggested change

</Box>

case 'aborted':

return {

prefix: '!',

prefixColor: theme.status.warning,

title: 'Goal aborted',

};

// No default — let TypeScript enforce exhaustiveness.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T12:42:49Z

+      return { continue: true };
+    }
+
    // Give the latest assistant output one final evaluation before aborting.


[Suggestion] Missing debugLogger.debug() on the actual failed termination path

The suppressed-impossible path (below floor) has a debugLogger.debug() call, which is great. But the actual termination path where the goal IS marked as failed (iterations ≥ MIN) calls finishGoal without any debug log. At 3 AM, if a user reports their goal was prematurely marked impossible, there's no structured evidence that the judge returned impossible: true — only the freeform lastReason text in UI history.

Suggested change

// Give the latest assistant output one final evaluation before aborting.

debugLogger.debug(

`Goal judge ruled impossible; clearing goal.`,

{ reason: verdict.reason, iterations: latest.iterations },

);

finishGoal(config, sessionId, latest, {

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao

Additional notes (could not be mapped to specific diff lines)

Suggestion — Removed prompt injection safety comment. The PR removed the comment explaining why the continuation prompt must use continuationReasonForGoal(condition) instead of verdict.reason: verdict.reason is untrusted model-generated text and feeding it back as the continuation prompt would be a prompt injection vector. The safety property is encoded in the runtime string but the design rationale is lost. Consider restoring the comment or moving it to continuationReasonForGoal's JSDoc.

Suggestion — Missing test for impossible precedence over MAX_GOAL_ITERATIONS. The code correctly checks impossible before the iteration cap, so failed takes precedence over aborted when both conditions are true. But no test verifies this — if the check order is accidentally reversed in a future refactor, the behavior would silently regress.

Suggestion — Missing integration test for impossible→failed pipeline. Only unit-level tests with mocked judgeGoal cover the new path. An integration test through the real HookSystem would catch bugs in how judgeGoalWithTimeout interacts with the impossible verdict (e.g., if the timeout wrapper accidentally swallows impossible).

Verdict: Comment — no Critical issues. Build ✅, tests 77/77 ✅.

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T12:48:43Z

+    : never;
+
+export const JUDGE_RESULT_SCHEMA_COVERS_INTERFACE: SchemaCoversJudgeResult = true;
+


Nit: this exported constant is never read at runtime — it exists solely to trigger the SchemaCoversJudgeResult compile-time check. Consider making it unexported (just const) or adding a brief // compile-time only annotation, since it currently shows up in the public API surface but has no runtime consumers.

wenshao · 2026-05-17T12:49:05Z

+export const JUDGE_RESULT_SCHEMA_COVERS_INTERFACE: SchemaCoversJudgeResult = true;
+
+const RESPONSE_SCHEMA: Schema & { additionalProperties: boolean } = {
  // Schema typing in @google/genai uses an enum-like Type, but accepts the


The additionalProperties field is bolted on via Schema & { additionalProperties: boolean } because @google/genai's Schema type does not include it. This works but is fragile — if the upstream type adds the property with a different shape, the intersection could silently conflict. Consider a // @ts-expect-error or a short comment noting the upstream gap so future readers understand the intent.

wenshao · 2026-05-17T12:49:09Z

            line). One Text + natural wrap keeps the continuation flush. */}
-        {(kind === 'achieved' || kind === 'aborted') && lastReason?.trim() ? (
+        {(kind === 'achieved' || kind === 'aborted' || kind === 'failed') &&
+        lastReason?.trim() ? (


This disjunction is growing — it now matches all terminal kinds except set, cleared, and checking. The same pattern appears in restoreGoal.ts:findLastTerminalGoal and goalCommand.ts:formatTerminalSummary. Consider extracting a shared isTerminalGoalKind() helper or a TERMINAL_GOAL_KINDS set so adding another terminal kind in the future only requires one change instead of N.

wenshao · 2026-05-17T12:49:12Z

-    return { decision: 'block', reason: continuationReasonForGoal(condition) };
+    return {
+      decision: 'block',
+      reason: continuationReasonForGoal(condition),


The old code had a multi-line comment here explaining why the stop hook reason uses fixed text (continuationReasonForGoal) instead of passing through the judge's free-form diagnostic. That security rationale (untrusted transcript-derived text must not flow into the model's continuation prompt) is non-obvious and was removed in this reformatting. Worth preserving — new contributors may wonder why verdict.reason isn't included in the hook response.

wenshao · 2026-05-17T13:33:05Z

+        durationMs: Date.now() - latest.setAt,
+        lastReason: verdict.reason,
+      });
+      return { continue: true };


[Suggestion] failed terminal state missing systemMessage

The failed path returns { continue: true } with no systemMessage, while the aborted path (line 222) returns { continue: true, systemMessage: GOAL_ABORTED_REASON }. After a failed outcome, the user gets no explicit hint to rephrase and re-attempt the goal — unlike for aborted goals which include actionable guidance ("Re-set with /goal <condition> if you still need it").

Consider adding a GOAL_FAILED_REASON constant and including it as systemMessage in both the finishGoal call and the return, matching the aborted pattern:

const GOAL_FAILED_REASON = 'Goal determined to be impossible; cleared. Re-phrase and re-set with `/goal <condition>` to try a different approach.';

— glm-5.1 via Qwen Code /review

wenshao · 2026-05-17T13:33:05Z

+
+// Compile-time only: fails if JudgeResult grows a key that the response schema
+// key list does not include.
+const JUDGE_RESULT_SCHEMA_COVERS_INTERFACE: SchemaCoversJudgeResult = true;


[Suggestion] Compile-time assertion is exported but never imported

JUDGE_RESULT_SCHEMA_COVERS_INTERFACE is exported but never imported anywhere in the codebase. Its sole purpose is to trigger a compile-time type error when JUDGE_RESULT_SCHEMA_KEYS diverges from JudgeResult. An aggressive "remove unused exports" lint rule or manual cleanup could delete it, silently breaking the compile-time safety net.

Consider either:

Keeping it unexported with a void statement (which the diff already has) — drop the export keyword, or

Adding a comment explaining why it must be kept.

— glm-5.1 via Qwen Code /review

wenshao

No new issues found. Existing 7 open review comments already cover all concerns — no additional findings in this pass. LGTM! ✅ — DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T14:12:53Z

+    ok,
+    reason: reasonText,
+    ...(impossible && !ok ? { impossible: true } : {}),
+  };


[Suggestion] parseJudgeReply uses strict === true for impossible

const impossible = (payload as { impossible?: unknown }).impossible === true;

If the model returns "true" (string) or 1 (number), the verdict is silently treated as not-impossible and the goal continues looping. The structured-output schema requires boolean, but parseJudgeReply already has fallback parsing for malformed JSON (bracket-search in start/end), suggesting the model doesn't always respect the schema.

Consider adding a tolerance test for non-boolean impossible values and optionally using Boolean() coercion to be defensive.

Suggested change

};

const impossible = Boolean((payload as { impossible?: unknown }).impossible);

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T14:12:54Z

@@ -81,13 +85,20 @@ export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = ({
          prefixColor: theme.text.secondary,
          title: 'Goal cleared',


[Suggestion] GoalStatusMessage is a pure functional component but not wrapped in React.memo

In the Ink/React rendering model, every streaming event triggers a parent re-render, causing all historical GoalStatusMessage instances in the scrollback to re-render even though their props (kind, condition, iterations, durationMs, lastReason) are immutable once the card is rendered.

Suggested change

title: 'Goal cleared',

export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = React.memo(({

— DeepSeek/deepseek-v4-pro via Qwen Code /review

wenshao · 2026-05-17T14:25:08Z

  | 'checking';

+export const TERMINAL_GOAL_STATUS_KINDS = [
+  'achieved',


Nice. The satisfies readonly GoalStatusKind[] constraint plus isTerminalGoalStatusKind type guard means adding a new terminal kind to GoalTerminalKind without adding it here produces a type error at the assertNever call sites. Single source of truth for terminal-kind membership — exactly what I was suggesting.

wenshao · 2026-05-17T14:25:19Z

@@ -209,9 +239,12 @@ export function createGoalStopHookCallback(args: {
    recordGoalIteration(sessionId, verdict.reason);
    // Keep the judge's free-form diagnostic in goal state/UI only. The Stop
    // hook reason is fed back to the model as the next continuation prompt, so


Thanks for restoring this. The security rationale (untrusted judge text must not flow into the continuation prompt) is non-obvious and worth preserving for future contributors.

wenshao · 2026-05-17T14:25:29Z

+// Compile-time only: fails if JudgeResult grows a key that the response schema
+// key list does not include.
+const JUDGE_RESULT_SCHEMA_COVERS_INTERFACE: SchemaCoversJudgeResult = true;
+void JUDGE_RESULT_SCHEMA_COVERS_INTERFACE;


Clean fix — unexported with void to suppress the unused-variable warning. The // Compile-time only comment makes the intent immediately clear.

wenshao · 2026-05-17T15:34:07Z

Local validation report

Verified on 9f2430a in a clean checkout.

Unit / typecheck

Command	Result
`npx vitest run packages/core/src/goals/`	46/46 passed across 4 files
`npm run typecheck --workspace packages/core`	exit 0
`npx vitest run packages/cli/src/ui/commands/goalCommand.test.ts packages/cli/src/ui/components/messages/GoalStatusMessage.test.tsx packages/cli/src/ui/utils/restoreGoal.test.ts`	37/37 passed across 3 files
`npm run typecheck --workspace packages/cli`	exit 0

Spot check

parseJudgeReply impossible-detection (goalJudge.ts:374) — the strict === true works when the structured schema is honored. The post-approval [Suggestion] flagged that string "true" or number 1 from a misbehaving model would fall through silently to "not impossible." Non-blocking but worth folding into a follow-up that widens the predicate or asserts the schema before the cast.
isTerminalGoalStatusKind (types.ts:513) — satisfies readonly GoalStatusKind[] + the type guard make GoalTerminalKind a single source of truth; adding a new terminal kind without registering it here produces a tsc error at the assertNever call sites. Confirmed by toggling the array locally — npm run typecheck fails as expected if the constant drifts.
goalHook.ts:241 security-rationale comment about not letting untrusted judge text flow into the continuation prompt — preserved as requested.
GoalStatusMessage.tsx:86 is a pure functional component; the post-approval React.memo suggestion is a perf nit specifically for Ink scrollback re-renders. Not a correctness issue.

GitHub CI

All 8 required checks green on 9f2430a: Lint / CodeQL ×2 / Test (mac+ubuntu+win) / Classify / Coverage skip.

) Ported from QwenLM#4230, adapted to our goal subsystem (core/src/goal, single goalEvaluator) rather than upstream's goalJudge. The goal evaluator can now return a terminal "impossible" verdict — used only when the condition is genuinely unachievable in this session (self- contradictory, depends on an unavailable resource, or all reasonable approaches are exhausted). When it fires, the autonomous /goal loop stops and the goal is recorded as failed instead of continuing forever. Guardrails mirror upstream: the assistant claiming impossibility is evidence, not proof; "impossible" is only honoured alongside met:false; and an evaluator error or empty reply never fails a goal (stays met:false, continues). - types: `GoalEvaluationResult.impossible`, `GoalState.failedAt` - goalEvaluator: prompt + parser support the third verdict shape - GoalManager: `markImpossible(reason)` / `getLastFailedGoal()`; reset clears it - client: impossible verdict → markImpossible + stop (no continuation) - goalCommand: `/goal` status reports an abandoned-as-impossible goal, preferring the most recent terminal outcome - tests: parser, manager, and status-text coverage (60 goal tests pass) Co-authored-by: Automaker <automaker@localhost> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…sion, SDK canUseTool timeout) (#331) Documents behavior shipped in 0.51.x–0.52.0 that lacked coverage: - guides/goal.md: the evaluator's three verdicts (met / not met / impossible), the conservative impossible→abandoned terminal state, and how `/goal` status reports an abandoned goal. (QwenLM#4230) - explanation/agent-harness.md: new "Reactive compression" section — proactive vs. reactive compression and the one-shot force-compress-and-retry on a provider context-overflow rejection. (QwenLM#3879) - reference/sdk-api.md + contributing/sdk-typescript.md: document the `timeout.canUseTool` option and that the SDK now forwards it to the CLI so the control-plane timeout matches the callback's. (QwenLM#4491) (The `--max-tool-calls` / `--max-wall-time` budgets were already documented in guides/run-headless.md when those features shipped.) Co-authored-by: Automaker <automaker@localhost> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

feat(core): fail impossible goals

6814a9a

qqqys mentioned this pull request May 17, 2026

feat(core): fail impossible goals #4229

Closed

fix(core): refine impossible goal judgement

ea74be8

qqqys marked this pull request as ready for review May 17, 2026 05:53

qqqys added 2 commits May 17, 2026 14:05

fix(core): include goal feedback when continuing

ffd7328

fix(core): clarify impossible goal terminal state

bdc071e

wenshao requested changes May 17, 2026

View reviewed changes

wenshao reviewed May 17, 2026

View reviewed changes

fix(core): harden impossible goal feedback

877a8c0

wenshao approved these changes May 17, 2026

View reviewed changes

wenshao previously approved these changes May 17, 2026

View reviewed changes

fix(core): log suppressed impossible verdicts

679f2dc

qqqys dismissed wenshao’s stale review via 679f2dc May 17, 2026 12:06

tanzhenxin added the type/feature-request New feature or enhancement request label May 17, 2026

wenshao reviewed May 17, 2026

View reviewed changes

fix(goal): address review suggestions

9f2430a

wenshao reviewed May 17, 2026

View reviewed changes

wenshao previously approved these changes May 17, 2026

View reviewed changes

wenshao reviewed May 17, 2026

View reviewed changes

test(goal): cover impossible parsing suggestions

0673089

qqqys dismissed wenshao’s stale review via 0673089 May 17, 2026 15:32

wenshao approved these changes May 17, 2026

View reviewed changes

wenshao merged commit f84ddd4 into QwenLM:main May 17, 2026
9 checks passed

mabry1985 mentioned this pull request May 27, 2026

feat(goal): abandon impossible /goal conditions instead of looping protoLabsAI/protoCLI#322

Merged

mabry1985 mentioned this pull request May 27, 2026

docs: cover impossible goals, reactive compression, and SDK canUseTool timeout protoLabsAI/protoCLI#331

Merged

-      latest.iterations >= MIN_IMPOSSIBLE_GOAL_ITERATIONS
+    if (verdict.impossible) {
+      if (latest.iterations >= MIN_IMPOSSIBLE_GOAL_ITERATIONS) {
+        finishGoal(config, sessionId, latest, {
+          kind: 'failed',
+          condition: latest.condition,
+          iterations: latest.iterations,
+          durationMs: Date.now() - latest.setAt,
+          lastReason: verdict.reason,
+        });
+        return { continue: true };
+      }
+      debugLogger.debug(
+        `Impossible verdict suppressed: iterations=${latest.iterations} < floor=${MIN_IMPOSSIBLE_GOAL_ITERATIONS}; continuing loop.`,
+      );
+    }

-export const MIN_IMPOSSIBLE_GOAL_ITERATIONS = 2;
+/**
+ * Minimum iteration count before an `impossible` judge verdict is accepted.
+ * Gives the model at least one continuation turn after the judge first flags
+ * impossibility, guarding against premature failure on a single bad-judgment
+ * turn. The goal terminates on the second consecutive impossible verdict.
+ */
+export const MIN_IMPOSSIBLE_GOAL_ITERATIONS = 2;

-  'reason',
+/**
+ * Whether the goal is genuinely impossible in this session.
+ * Only meaningful when `ok` is `false`. If `ok` is `true`,
+ * this field is always absent.
+ */
+impossible?: boolean;

		@@ -126,7 +132,8 @@ export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = ({
		<Text wrap="wrap">{condition}</Text>
		</Box>

-          </Box>
+      case 'aborted':
+        return {
+          prefix: '!',
+          prefixColor: theme.status.warning,
+          title: 'Goal aborted',
+        };
+      // No default — let TypeScript enforce exhaustiveness.

		: never;

		export const JUDGE_RESULT_SCHEMA_COVERS_INTERFACE: SchemaCoversJudgeResult = true;

	};
	const impossible = Boolean((payload as { impossible?: unknown }).impossible);

		@@ -81,13 +85,20 @@ export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = ({
		prefixColor: theme.text.secondary,
		title: 'Goal cleared',

	title: 'Goal cleared',
	export const GoalStatusMessage: React.FC<GoalStatusMessageProps> = React.memo(({

Conversation

qqqys commented May 17, 2026

Summary

Validation

Scope / Risk

Linked Issues / Bugs

Uh oh!

github-actions Bot commented May 17, 2026

📋 Review Summary

🔍 General Feedback

🎯 Specific Feedback

🟡 High

🟢 Medium

🔵 Low

✅ Highlights

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

🔍 Code Review — PR #4230

🔴 Critical — Prompt injection via judgeReason (goalHook.ts:48-57)

🟡 Suggestion — impossible bypasses iteration-cap safety (goalHook.ts:192-202)

🟡 Suggestion — additionalProperties: false cast hack (goalJudge.ts:55)

🟢 Nice to have — Stale inline comments

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Additional notes (could not be mapped to specific diff lines)

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wenshao commented May 17, 2026

Local validation report

Unit / typecheck

Spot check

GitHub CI

Uh oh!

🔴 Critical — Prompt injection via `judgeReason` (goalHook.ts:48-57)

🟡 Suggestion — `impossible` bypasses iteration-cap safety (goalHook.ts:192-202)

🟡 Suggestion — `additionalProperties: false` cast hack (goalJudge.ts:55)