Skip to content

fix: exec tool loop detection bypassed by volatile details fields#292

Open
BingqingLyu wants to merge 7 commits into
mainfrom
fork-pr-41502-fix-exec-loop-detection
Open

fix: exec tool loop detection bypassed by volatile details fields#292
BingqingLyu wants to merge 7 commits into
mainfrom
fork-pr-41502-fix-exec-loop-detection

Conversation

@BingqingLyu

@BingqingLyu BingqingLyu commented Apr 27, 2026

Copy link
Copy Markdown
Owner

Summary

hashToolOutcome() hashes the entire details object for exec tool results, which includes volatile fields (durationMs, pid, startedAt, sessionId) that change on every invocation. This makes every exec call produce a unique result hash, so the no-progress streak detector never triggers — even when the same command is called 121+ times with identical output.

  • Add exec-specific handling in hashToolOutcome() that only hashes stable fields (status, exitCode, text), matching the existing pattern for process tool's poll/log actions
  • Add 4 regression tests covering: volatile durationMs, volatile pid/startedAt, critical threshold, and negative case (different commands)

Root Cause

In tool-loop-detection.ts, the fallback path at the end of hashToolOutcome() hashes the raw details object:

return digestStable({
  details,    // ← includes durationMs, pid, cwd — volatile!
  text,
});

For exec tool results, details is typed as ExecToolDetails which contains:

  • Completed: durationMs (changes every call), cwd
  • Running: pid, startedAt, sessionId (all different every call)

Since every hash is unique, getNoProgressStreak() always returns count: 0, and the loop is never detected.

Fix

Add an exec-specific branch before the fallback that strips volatile fields:

if (toolName === "exec") {
  return digestStable({
    status: details.status,
    exitCode: details.exitCode ?? null,
    text,
  });
}

This follows the same pattern already used for process tool's poll and log actions (lines 200-223).

Test plan

Fixes openclaw#34574

🤖 Generated with Claude Code

Zcg2021 and others added 7 commits March 10, 2026 04:38
hashToolOutcome() hashed the entire `details` object for exec tool
results, which includes volatile fields (durationMs, pid, startedAt,
sessionId) that change on every invocation. This made every exec call
produce a unique hash, so loop detection never triggered — even when
the same command was called 121+ times with identical output.

Add exec-specific handling that only hashes stable fields (status,
exitCode, text), matching the existing pattern for process tool's
poll/log actions.

Fixes openclaw#34574

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address greptile review feedback:
- Rename "blocks repeated exec calls at critical threshold" to
  "warns for exec calls repeated past warning threshold" (it asserts
  "warning", not "critical")
- Add dedicated test for GLOBAL_CIRCUIT_BREAKER_THRESHOLD (30
  iterations) — the most impactful code path fixed by this PR,
  asserting level: "critical" and detector: "global_circuit_breaker"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address Codex review feedback: running exec results embed volatile
session id and pid in content.text ("Command still running (session
sess-123, pid 45678)"), so hashing text still produces unique hashes.

For running exec, hash only status + tail output (which reflects
actual command progress). For completed exec, content.text already
mirrors the stable aggregated output, so keep hashing it.

Add test for running exec with volatile text + varying pid/sessionId,
and a negative test verifying that changing tail output prevents
escalation to global_circuit_breaker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
content.text can diverge from details.aggregated in two ways:
- node-host path: content.text uses short-circuit OR (stdout || stderr ||
  errorText), dropping stderr/error when stdout is non-empty
- gateway path: content.text prepends warnings and uses "(no output)" fallback

Use details.aggregated which always contains the full combined output,
falling back to content text when aggregated is unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

loopDetection does not catch repeated exec tool calls

2 participants