fix(loop-detection): escalate generic_repeat to critical at criticalThreshold by jwchmodx · Pull Request #60248 · openclaw/openclaw

jwchmodx · 2026-04-03T11:17:16Z

Summary

The generic_repeat detector only checked warningThreshold and always returned level: "warning", making criticalThreshold a no-op for the most common runaway loop pattern (same tool, same arguments, repeated calls).

Added a criticalThreshold check in the generic_repeat code path, before the existing warningThreshold check
Updated the test that was asserting the broken warn-only behavior to assert the correct critical escalation

This makes generic_repeat consistent with known_poll_no_progress and ping_pong, which both escalate to "critical" at criticalThreshold.

Detector	warningThreshold	criticalThreshold
`known_poll_no_progress`	⚠️ Warns	✅ Blocks
`ping_pong`	⚠️ Warns	✅ Blocks (with noProgressEvidence)
`generic_repeat` (before)	⚠️ Warns	❌ Not checked
`generic_repeat` (after)	⚠️ Warns	✅ Blocks

Test plan

Existing 29 tests pass
Updated test now asserts level: "critical" and detector: "generic_repeat" at criticalThreshold

🤖 Generated with Claude Code

greptile-apps · 2026-04-03T11:19:18Z

Greptile Summary

This PR correctly fixes the generic_repeat loop detector so it escalates to "critical" at criticalThreshold (previously it only ever returned "warning"), making it consistent with the known_poll_no_progress and ping_pong detectors. It also bundles an unrelated fix in extensions/mattermost that prevents DM replies from being threaded by adding a required kind: ChatType guard to resolveMattermostReplyRootId; all three call sites are updated and new tests cover the DM-gating behavior.

Confidence Score: 5/5

Safe to merge — logic is correct, all call sites are updated, and only P2 suggestions remain.

No P0 or P1 issues found. The critical-threshold check for generic_repeat is correctly ordered before the warning check, matching the established pattern. The mattermost DM threading guard is complete and tested. Remaining comments are style suggestions only.

No files require special attention.

Comments Outside Diff (1)

extensions/mattermost/src/mattermost/monitor.ts, line 160-173 (link)

Unrelated fix bundled without PR description coverage

The kind: ChatType parameter addition to resolveMattermostReplyRootId — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is exported, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.

Prompt To Fix With AI

This is a comment left during a code review.
Path: extensions/mattermost/src/mattermost/monitor.ts
Line: 160-173

Comment:
**Unrelated fix bundled without PR description coverage**

The `kind: ChatType` parameter addition to `resolveMattermostReplyRootId` — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is `export`ed, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.

How can I resolve this? If you propose a fix, please make it concise.

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/tool-loop-detection.test.ts
Line: 302-316

Comment:
**Missing count assertion in updated test**

The renamed test now asserts `level: "critical"` and `detector: "generic_repeat"`, but the previous test also checked `result.count`. Adding `expect(loopResult.count).toBe(CRITICAL_THRESHOLD)` would keep coverage parity with the existing warning-level test (line 296) and pin the count value returned by the new critical path.

```suggestion
      expect(loopResult.stuck).toBe(true);
      if (loopResult.stuck) {
        expect(loopResult.level).toBe("critical");
        expect(loopResult.detector).toBe("generic_repeat");
        expect(loopResult.count).toBe(CRITICAL_THRESHOLD);
        expect(loopResult.message).toContain("CRITICAL");
      }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/mattermost/src/mattermost/monitor.ts
Line: 160-173

Comment:
**Unrelated fix bundled without PR description coverage**

The `kind: ChatType` parameter addition to `resolveMattermostReplyRootId` — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is `export`ed, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(loop-detection): escalate generic_re..." | Re-trigger Greptile}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e02478669a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a02183ce7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

…hreshold The generic_repeat detector only checked warningThreshold and always returned level "warning", making criticalThreshold effectively a no-op for the most common runaway loop pattern (same tool + same args). Add a critical-threshold check before the warning check, consistent with how known_poll_no_progress and ping_pong already behave. Fixes openclaw#60111 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vincentkoc · 2026-04-28T07:37:39Z

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #60248
Validation: pnpm test src/agents/tool-loop-detection.test.ts -- --reporter=dot; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

clawsweeper · 2026-04-28T07:42:05Z

Codex review: needs changes before merge.

What this changes:

This PR changes generic_repeat loop detection to return a critical no-progress block at criticalThreshold, and updates unit coverage, docs, and the active changelog.

Required change before merge:

This is a narrow automation-safe PR repair: update the existing branch's e2e and diagnostic expectations for the already-chosen no-progress-gated behavior, without needing product, security, or release approval.

Security review:

Security review cleared: Security review cleared: the current diff touches loop-detection logic, tests, docs, and changelog only, with no dependency, CI, release, secret, or package execution changes.

Review findings:

[P2] Update hook e2e expectations for generic repeats — src/agents/tool-loop-detection.ts:617-621

Review details

Best possible solution:

Land the no-progress-gated generic_repeat critical behavior with matching unit, before-tool-call e2e, structured diagnostic, docs, and changelog coverage, while keeping broader turn-abort policy in the separate related reports.

Do we have a high-confidence way to reproduce the issue?

Yes. Source and test inspection on current main shows generic_repeat remains warning-only at CRITICAL_THRESHOLD, and the PR-specific mismatch is reproducible by following the existing before-tool-call read fixture that currently resolves through CRITICAL_THRESHOLD + 5 no-progress calls.

Is this the best way to solve the issue?

No, not as currently submitted. The no-progress-gated critical path is the narrow maintainable direction and avoids progressive-output false positives, but the PR needs the before-tool-call e2e and diagnostic expectations updated before it is the best mergeable fix.

Full review comments:

[P2] Update hook e2e expectations for generic repeats — src/agents/tool-loop-detection.ts:617-621
This branch makes generic_repeat block once a no-progress streak reaches criticalThreshold, but the existing before-tool-call e2e test still runs the same no-progress read fixture through CRITICAL_THRESHOLD + 5 calls and expects every call to resolve. Critical loop outcomes are returned as failures and wrapped tools throw, so update the e2e and structured diagnostic expectations for the new generic critical path.
Confidence: 0.9

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

pnpm test src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts -- --reporter=dot
pnpm exec oxfmt --check --threads=1 src/agents/tool-loop-detection.ts src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts docs/tools/loop-detection.md CHANGELOG.md
pnpm check:changed in Testbox

What I checked:

current_main_warn_only: On current main, detectToolCallLoop still has only the generic warning branch and returns level: "warning" for repeated same-tool/same-argument calls. (src/agents/tool-loop-detection.ts:612, a256745323ab)
unit_test_records_old_behavior: The current unit test drives a generic no-progress read loop to CRITICAL_THRESHOLD and still expects warning-level behavior. (src/agents/tool-loop-detection.test.ts:383, a256745323ab)
hook_e2e_conflicts_with_pr_behavior: The before-tool-call e2e fixture uses identical read output and asserts all CRITICAL_THRESHOLD + 5 executions resolve; the PR's generic critical branch would make this path block earlier. (src/agents/pi-tools.before-tool-call.e2e.test.ts:228, a256745323ab)
critical_loop_blocks_wrapped_tools: Critical loop results are converted to blocked: true with kind: "failure", and wrapped tools throw for non-veto blocked outcomes, so the e2e resolve expectation must change with this PR. (src/agents/pi-tools.before-tool-call.ts:424, a256745323ab)
release_not_fixed: The latest release tag v2026.4.29 still contains the same generic warn-only branch, confirming this PR has not already shipped on main or in the latest release. (src/agents/tool-loop-detection.ts:612, a448042c2edd)
discussion_context: The discussion says ProjectClownfish pushed a narrow repair to keep the contributor path canonical, and the latest ClawSweeper review already identified the missing before-tool-call e2e and diagnostic coverage as the required follow-up. (9fd4b85d9a4b)

Likely related people:

steipete: Git history shows Peter Steinberger introduced configurable tool loop detection and owns the current-main blame/refactor path for the central detector files. (role: introduced behavior and recent maintainer; confidence: high; commits: 076df941a326, 9300d48244ca, a448042c2edd; files: src/agents/tool-loop-detection.ts, src/agents/pi-tools.before-tool-call.ts, docs/tools/loop-detection.md)
akramcodez: The earlier stuck-loop detection and backoff infrastructure for agents appears in the relevant history under Sk Akram, making this a plausible routing point for loop-detection semantics. (role: adjacent feature introducer; confidence: medium; commits: e5eb5b3e43d7; files: src/agents, src/agents/pi-tools.before-tool-call.e2e.test.ts)
vincentkoc: Vincent Koc has recent current-main maintenance on the before-tool-call e2e file and also pushed the narrow repair commit to this PR branch, so the missing e2e follow-up is likely in their context. (role: recent test maintainer and branch repair contributor; confidence: medium; commits: 5959c9927ed2, c7a562683a64; files: src/agents/pi-tools.before-tool-call.e2e.test.ts)

Remaining risk / open question:

The before-tool-call e2e and diagnostic surface can remain failing or undercovered until the PR updates the generic no-progress critical expectations.
The broader hard-abort turn policy discussed in Loop detector should hard-abort agent after N identical tool calls, not just warn #58726 and OpenShell should fail fast on malformed generated commands and hard-abort repeated tool loops #72373 remains separate and should not be treated as solved by this narrow detector change.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a256745323ab.

greptile-apps Bot reviewed Apr 3, 2026

View reviewed changes

Comment thread src/agents/tool-loop-detection.test.ts

chatgpt-codex-connector Bot reviewed Apr 3, 2026

View reviewed changes

Comment thread src/agents/tool-loop-detection.ts

chatgpt-codex-connector Bot reviewed Apr 3, 2026

View reviewed changes

Comment thread src/infra/provider-usage.fetch.minimax.ts

steipete mentioned this pull request Apr 3, 2026

fix(xiaomi): disable thinking for MiMo reasoning models #60283

Closed

BingqingLyu mentioned this pull request Apr 28, 2026

fix(loop-detection): escalate generic_repeat to critical at criticalThreshold BingqingLyu/openclaw#1912

Open

2 tasks

jwchmodx and others added 2 commits April 28, 2026 07:37

fix(loop-detection): block generic repeats with no-progress evidence

9fd4b85

vincentkoc force-pushed the fix/generic-repeat-critical-threshold branch from 9a02183 to 9fd4b85 Compare April 28, 2026 07:37

vincentkoc added clownfish:human-review labels Apr 28, 2026

openclaw-barnacle Bot added docs Improvements or additions to documentation agents Agent runtime and tooling size: S labels Apr 28, 2026

vincentkoc added clawsweeper Tracked by ClawSweeper automation and removed clownfish:merge-ready labels Apr 28, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(loop-detection): escalate generic_repeat to critical at criticalThreshold#60248

fix(loop-detection): escalate generic_repeat to critical at criticalThreshold#60248
jwchmodx wants to merge 2 commits intoopenclaw:mainfrom
jwchmodx:fix/generic-repeat-critical-threshold

jwchmodx commented Apr 3, 2026

Uh oh!

greptile-apps Bot commented Apr 3, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

vincentkoc commented Apr 28, 2026

Uh oh!

clawsweeper Bot commented Apr 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

jwchmodx commented Apr 3, 2026

Summary

Test plan

Uh oh!

greptile-apps Bot commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Comments Outside Diff (1)

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

vincentkoc commented Apr 28, 2026

Uh oh!

clawsweeper Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Apr 3, 2026 •

edited

Loading

clawsweeper Bot commented Apr 28, 2026 •

edited

Loading