Bug Description
The /goal judge marks a goal as DONE based solely on the agent's textual response claiming success, without requiring verifiable evidence (e.g., read_file output or filesystem confirmation).
Reproduction Steps
- Set a goal:
/goal Criar um resumo prático em português dos principais conceitos de machine learning para revisão, com exemplos de código Python, e salvar em /home/ubuntu/ml-resumo.md
- Agent responds claiming the file was created successfully
- Judge evaluates the response and returns
{"done": true, "reason": "..."}
- Goal shows:
✓ Goal done (1/20 turns)
- File does not exist on disk — verified with
find and read_file
Expected Behavior
The judge should require verifiable evidence of completion for filesystem-related goals — such as read_file output, ls confirmation, or terminal exit code — rather than trusting the agent's self-declaration.
Actual Behavior
The judge trusts the agent's textual claim of success and marks the goal as DONE, even though the file was never written to disk. This is a false positive — the documented failure mode where the judge says done when work remains.
Root Cause
The judge only sees ~4KB of the agent's last response text. It has no access to tool outputs, filesystem state, or terminal results. If the agent says "file created successfully" but write_file failed silently (or the tool call was never actually executed), the judge has no way to detect the discrepancy.
Suggested Fix
- For goals involving file creation/modification, the judge system prompt should require the agent to include verification evidence (e.g.,
read_file output or ls -la result) in its final response before marking done.
- Alternatively, the judge could be given access to tool output metadata (not just response text) to cross-reference claims against actual tool execution results.
- The
CONTINUATION_PROMPT_TEMPLATE could instruct the agent: "Before declaring a goal complete, always verify filesystem changes with a read/list operation and include the output in your response."
Environment
- Hermes Agent v0.12.0 (2026.4.30)
- Provider: Dialagram (unlimited)
- OS: Ubuntu (VPS)
Workaround
Include explicit verification instructions in the goal text:
/goal Create file X, then read it back with read_file to confirm it exists and has content. Only consider the goal complete after verification.
Bug Description
The
/goaljudge marks a goal as DONE based solely on the agent's textual response claiming success, without requiring verifiable evidence (e.g.,read_fileoutput or filesystem confirmation).Reproduction Steps
/goal Criar um resumo prático em português dos principais conceitos de machine learning para revisão, com exemplos de código Python, e salvar em /home/ubuntu/ml-resumo.md{"done": true, "reason": "..."}✓ Goal done (1/20 turns)findandread_fileExpected Behavior
The judge should require verifiable evidence of completion for filesystem-related goals — such as
read_fileoutput,lsconfirmation, or terminal exit code — rather than trusting the agent's self-declaration.Actual Behavior
The judge trusts the agent's textual claim of success and marks the goal as DONE, even though the file was never written to disk. This is a false positive — the documented failure mode where the judge says done when work remains.
Root Cause
The judge only sees ~4KB of the agent's last response text. It has no access to tool outputs, filesystem state, or terminal results. If the agent says "file created successfully" but
write_filefailed silently (or the tool call was never actually executed), the judge has no way to detect the discrepancy.Suggested Fix
read_fileoutput orls -laresult) in its final response before marking done.CONTINUATION_PROMPT_TEMPLATEcould instruct the agent: "Before declaring a goal complete, always verify filesystem changes with a read/list operation and include the output in your response."Environment
Workaround
Include explicit verification instructions in the goal text: