fix: exec tool loop detection bypassed by volatile details fields#292
Open
BingqingLyu wants to merge 7 commits into
Open
fix: exec tool loop detection bypassed by volatile details fields#292BingqingLyu wants to merge 7 commits into
BingqingLyu wants to merge 7 commits into
Conversation
hashToolOutcome() hashed the entire `details` object for exec tool results, which includes volatile fields (durationMs, pid, startedAt, sessionId) that change on every invocation. This made every exec call produce a unique hash, so loop detection never triggered — even when the same command was called 121+ times with identical output. Add exec-specific handling that only hashes stable fields (status, exitCode, text), matching the existing pattern for process tool's poll/log actions. Fixes openclaw#34574 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address greptile review feedback: - Rename "blocks repeated exec calls at critical threshold" to "warns for exec calls repeated past warning threshold" (it asserts "warning", not "critical") - Add dedicated test for GLOBAL_CIRCUIT_BREAKER_THRESHOLD (30 iterations) — the most impactful code path fixed by this PR, asserting level: "critical" and detector: "global_circuit_breaker" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address Codex review feedback: running exec results embed volatile
session id and pid in content.text ("Command still running (session
sess-123, pid 45678)"), so hashing text still produces unique hashes.
For running exec, hash only status + tail output (which reflects
actual command progress). For completed exec, content.text already
mirrors the stable aggregated output, so keep hashing it.
Add test for running exec with volatile text + varying pid/sessionId,
and a negative test verifying that changing tail output prevents
escalation to global_circuit_breaker.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
content.text can diverge from details.aggregated in two ways: - node-host path: content.text uses short-circuit OR (stdout || stderr || errorText), dropping stderr/error when stdout is non-empty - gateway path: content.text prepends warnings and uses "(no output)" fallback Use details.aggregated which always contains the full combined output, falling back to content text when aggregated is unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
hashToolOutcome()hashes the entiredetailsobject for exec tool results, which includes volatile fields (durationMs,pid,startedAt,sessionId) that change on every invocation. This makes every exec call produce a unique result hash, so the no-progress streak detector never triggers — even when the same command is called 121+ times with identical output.hashToolOutcome()that only hashes stable fields (status,exitCode,text), matching the existing pattern for process tool'spoll/logactionsdurationMs, volatilepid/startedAt, critical threshold, and negative case (different commands)Root Cause
In
tool-loop-detection.ts, the fallback path at the end ofhashToolOutcome()hashes the rawdetailsobject:For exec tool results,
detailsis typed asExecToolDetailswhich contains:durationMs(changes every call),cwdpid,startedAt,sessionId(all different every call)Since every hash is unique,
getNoProgressStreak()always returnscount: 0, and the loop is never detected.Fix
Add an exec-specific branch before the fallback that strips volatile fields:
This follows the same pattern already used for
processtool'spollandlogactions (lines 200-223).Test plan
npx vitest run src/agents/tool-loop-detection.test.ts)Fixes openclaw#34574
🤖 Generated with Claude Code