Skip to content

fix(loop-detection): escalate generic_repeat to critical at criticalThreshold#60248

Open
jwchmodx wants to merge 2 commits intoopenclaw:mainfrom
jwchmodx:fix/generic-repeat-critical-threshold
Open

fix(loop-detection): escalate generic_repeat to critical at criticalThreshold#60248
jwchmodx wants to merge 2 commits intoopenclaw:mainfrom
jwchmodx:fix/generic-repeat-critical-threshold

Conversation

@jwchmodx
Copy link
Copy Markdown
Contributor

@jwchmodx jwchmodx commented Apr 3, 2026

Summary

Fixes #60111.

The generic_repeat detector only checked warningThreshold and always returned level: "warning", making criticalThreshold a no-op for the most common runaway loop pattern (same tool, same arguments, repeated calls).

  • Added a criticalThreshold check in the generic_repeat code path, before the existing warningThreshold check
  • Updated the test that was asserting the broken warn-only behavior to assert the correct critical escalation

This makes generic_repeat consistent with known_poll_no_progress and ping_pong, which both escalate to "critical" at criticalThreshold.

Detector warningThreshold criticalThreshold
known_poll_no_progress ⚠️ Warns ✅ Blocks
ping_pong ⚠️ Warns ✅ Blocks (with noProgressEvidence)
generic_repeat (before) ⚠️ Warns ❌ Not checked
generic_repeat (after) ⚠️ Warns ✅ Blocks

Test plan

  • Existing 29 tests pass
  • Updated test now asserts level: "critical" and detector: "generic_repeat" at criticalThreshold

🤖 Generated with Claude Code

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 3, 2026

Greptile Summary

This PR correctly fixes the generic_repeat loop detector so it escalates to "critical" at criticalThreshold (previously it only ever returned "warning"), making it consistent with the known_poll_no_progress and ping_pong detectors. It also bundles an unrelated fix in extensions/mattermost that prevents DM replies from being threaded by adding a required kind: ChatType guard to resolveMattermostReplyRootId; all three call sites are updated and new tests cover the DM-gating behavior.

Confidence Score: 5/5

Safe to merge — logic is correct, all call sites are updated, and only P2 suggestions remain.

No P0 or P1 issues found. The critical-threshold check for generic_repeat is correctly ordered before the warning check, matching the established pattern. The mattermost DM threading guard is complete and tested. Remaining comments are style suggestions only.

No files require special attention.

Comments Outside Diff (1)

  1. extensions/mattermost/src/mattermost/monitor.ts, line 160-173 (link)

    P2 Unrelated fix bundled without PR description coverage

    The kind: ChatType parameter addition to resolveMattermostReplyRootId — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is exported, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: extensions/mattermost/src/mattermost/monitor.ts
    Line: 160-173
    
    Comment:
    **Unrelated fix bundled without PR description coverage**
    
    The `kind: ChatType` parameter addition to `resolveMattermostReplyRootId` — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is `export`ed, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/tool-loop-detection.test.ts
Line: 302-316

Comment:
**Missing count assertion in updated test**

The renamed test now asserts `level: "critical"` and `detector: "generic_repeat"`, but the previous test also checked `result.count`. Adding `expect(loopResult.count).toBe(CRITICAL_THRESHOLD)` would keep coverage parity with the existing warning-level test (line 296) and pin the count value returned by the new critical path.

```suggestion
      expect(loopResult.stuck).toBe(true);
      if (loopResult.stuck) {
        expect(loopResult.level).toBe("critical");
        expect(loopResult.detector).toBe("generic_repeat");
        expect(loopResult.count).toBe(CRITICAL_THRESHOLD);
        expect(loopResult.message).toContain("CRITICAL");
      }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/mattermost/src/mattermost/monitor.ts
Line: 160-173

Comment:
**Unrelated fix bundled without PR description coverage**

The `kind: ChatType` parameter addition to `resolveMattermostReplyRootId` — preventing DM threading — is a separate behaviour fix not mentioned in the PR description or summary. Since the function is `export`ed, reviewers need to know it was intentionally included and that all three call sites (lines 536, 744, 1456) were updated. The change itself is correct, but it would be worth adding a note in the PR description so it can be tracked against the issue it resolves.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(loop-detection): escalate generic_re..." | Re-trigger Greptile

Comment thread src/agents/tool-loop-detection.test.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e02478669a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/tool-loop-detection.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9a02183ce7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/infra/provider-usage.fetch.minimax.ts
jwchmodx and others added 2 commits April 28, 2026 07:37
…hreshold

The generic_repeat detector only checked warningThreshold and always
returned level "warning", making criticalThreshold effectively a no-op
for the most common runaway loop pattern (same tool + same args).

Add a critical-threshold check before the warning check, consistent
with how known_poll_no_progress and ping_pong already behave.

Fixes openclaw#60111

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vincentkoc vincentkoc force-pushed the fix/generic-repeat-critical-threshold branch from 9a02183 to 9fd4b85 Compare April 28, 2026 07:37
@vincentkoc
Copy link
Copy Markdown
Member

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #60248
Validation: pnpm test src/agents/tool-loop-detection.test.ts -- --reporter=dot; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation agents Agent runtime and tooling size: S labels Apr 28, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 28, 2026

Codex review: needs changes before merge.

What this changes:

This PR changes generic_repeat loop detection to return a critical no-progress block at criticalThreshold, and updates unit coverage, docs, and the active changelog.

Required change before merge:

This is a narrow automation-safe PR repair: update the existing branch's e2e and diagnostic expectations for the already-chosen no-progress-gated behavior, without needing product, security, or release approval.

Security review:

Security review cleared: Security review cleared: the current diff touches loop-detection logic, tests, docs, and changelog only, with no dependency, CI, release, secret, or package execution changes.

Review findings:

  • [P2] Update hook e2e expectations for generic repeats — src/agents/tool-loop-detection.ts:617-621
Review details

Best possible solution:

Land the no-progress-gated generic_repeat critical behavior with matching unit, before-tool-call e2e, structured diagnostic, docs, and changelog coverage, while keeping broader turn-abort policy in the separate related reports.

Do we have a high-confidence way to reproduce the issue?

Yes. Source and test inspection on current main shows generic_repeat remains warning-only at CRITICAL_THRESHOLD, and the PR-specific mismatch is reproducible by following the existing before-tool-call read fixture that currently resolves through CRITICAL_THRESHOLD + 5 no-progress calls.

Is this the best way to solve the issue?

No, not as currently submitted. The no-progress-gated critical path is the narrow maintainable direction and avoids progressive-output false positives, but the PR needs the before-tool-call e2e and diagnostic expectations updated before it is the best mergeable fix.

Full review comments:

  • [P2] Update hook e2e expectations for generic repeats — src/agents/tool-loop-detection.ts:617-621
    This branch makes generic_repeat block once a no-progress streak reaches criticalThreshold, but the existing before-tool-call e2e test still runs the same no-progress read fixture through CRITICAL_THRESHOLD + 5 calls and expects every call to resolve. Critical loop outcomes are returned as failures and wrapped tools throw, so update the e2e and structured diagnostic expectations for the new generic critical path.
    Confidence: 0.9

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

  • pnpm test src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts -- --reporter=dot
  • pnpm exec oxfmt --check --threads=1 src/agents/tool-loop-detection.ts src/agents/tool-loop-detection.test.ts src/agents/pi-tools.before-tool-call.e2e.test.ts docs/tools/loop-detection.md CHANGELOG.md
  • pnpm check:changed in Testbox

What I checked:

  • current_main_warn_only: On current main, detectToolCallLoop still has only the generic warning branch and returns level: "warning" for repeated same-tool/same-argument calls. (src/agents/tool-loop-detection.ts:612, a256745323ab)
  • unit_test_records_old_behavior: The current unit test drives a generic no-progress read loop to CRITICAL_THRESHOLD and still expects warning-level behavior. (src/agents/tool-loop-detection.test.ts:383, a256745323ab)
  • hook_e2e_conflicts_with_pr_behavior: The before-tool-call e2e fixture uses identical read output and asserts all CRITICAL_THRESHOLD + 5 executions resolve; the PR's generic critical branch would make this path block earlier. (src/agents/pi-tools.before-tool-call.e2e.test.ts:228, a256745323ab)
  • critical_loop_blocks_wrapped_tools: Critical loop results are converted to blocked: true with kind: "failure", and wrapped tools throw for non-veto blocked outcomes, so the e2e resolve expectation must change with this PR. (src/agents/pi-tools.before-tool-call.ts:424, a256745323ab)
  • release_not_fixed: The latest release tag v2026.4.29 still contains the same generic warn-only branch, confirming this PR has not already shipped on main or in the latest release. (src/agents/tool-loop-detection.ts:612, a448042c2edd)
  • discussion_context: The discussion says ProjectClownfish pushed a narrow repair to keep the contributor path canonical, and the latest ClawSweeper review already identified the missing before-tool-call e2e and diagnostic coverage as the required follow-up. (9fd4b85d9a4b)

Likely related people:

  • steipete: Git history shows Peter Steinberger introduced configurable tool loop detection and owns the current-main blame/refactor path for the central detector files. (role: introduced behavior and recent maintainer; confidence: high; commits: 076df941a326, 9300d48244ca, a448042c2edd; files: src/agents/tool-loop-detection.ts, src/agents/pi-tools.before-tool-call.ts, docs/tools/loop-detection.md)
  • akramcodez: The earlier stuck-loop detection and backoff infrastructure for agents appears in the relevant history under Sk Akram, making this a plausible routing point for loop-detection semantics. (role: adjacent feature introducer; confidence: medium; commits: e5eb5b3e43d7; files: src/agents, src/agents/pi-tools.before-tool-call.e2e.test.ts)
  • vincentkoc: Vincent Koc has recent current-main maintenance on the before-tool-call e2e file and also pushed the narrow repair commit to this PR branch, so the missing e2e follow-up is likely in their context. (role: recent test maintainer and branch repair contributor; confidence: medium; commits: 5959c9927ed2, c7a562683a64; files: src/agents/pi-tools.before-tool-call.e2e.test.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against a256745323ab.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling clawsweeper Tracked by ClawSweeper automation docs Improvements or additions to documentation size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: generic_repeat loop detector never escalates to blocking — criticalThreshold has no effect

2 participants