
fix(goals): require filesystem verification evidence #18679

Open
LeonSGP43 wants to merge 1 commit into NousResearch:main from LeonSGP43:fix/goal-filesystem-verification-18421


@LeonSGP43

Contributor

Fixes #18421.

Summary

  • teach goal continuations to verify file creation/modification before claiming completion
  • make the goal judge prompt reject bare success claims for filesystem goals
  • add a deterministic pre-judge guard so filesystem goals continue when the response lacks read/list-style evidence
  • cover both the missing-evidence path and verified path in goal tests
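The pre-judge guard described above could look roughly like the sketch below. This is a minimal illustration, not the code in this PR: the function names, the keyword patterns, and the `"continue"` return value are all assumptions, and the real guard may detect filesystem goals and evidence differently.

```python
import re
from typing import Optional

# Hypothetical heuristic: a goal counts as a "filesystem goal" when it pairs
# a write-style verb with a file/folder noun. Illustrative only.
_FS_GOAL = re.compile(
    r"\b(create|write|save|modify|edit|update|append|delete)\b"
    r".*\b(file|folder|directory)\b",
    re.IGNORECASE,
)

# Hypothetical heuristic: read/list-style evidence is any mention of a
# command or tool that inspects the filesystem. Case-sensitive on purpose
# so that ordinary prose is less likely to match.
_EVIDENCE = re.compile(r"\b(ls|cat|stat|head|tail|read_file|list_dir)\b")


def is_filesystem_goal(goal: str) -> bool:
    """Return True if the goal text looks like it changes files."""
    return bool(_FS_GOAL.search(goal))


def has_verification_evidence(response: str) -> bool:
    """Return True if the response mentions a read/list-style check."""
    return bool(_EVIDENCE.search(response))


def pre_judge_verdict(goal: str, response: str) -> Optional[str]:
    """Return "continue" to skip the judge, or None to defer to it.

    A filesystem goal whose response carries no read/list-style evidence
    is never allowed to reach the judge as a bare success claim.
    """
    if is_filesystem_goal(goal) and not has_verification_evidence(response):
        return "continue"
    return None
```

Under these assumptions, a response such as "Done, I created notes.txt." keeps the goal running, while one that reports an `ls` or `cat` check is handed to the judge as before. Keyword matching is crude; it is used here only because it is deterministic and runs before the model-based judge.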

Verification

  • scripts/run_tests.sh tests/hermes_cli/test_goals.py::TestJudgeGoal::test_filesystem_goal_without_verification_continues_before_judge tests/hermes_cli/test_goals.py::TestJudgeGoal::test_filesystem_goal_with_verification_uses_judge tests/hermes_cli/test_goals.py::TestGoalManager::test_continuation_prompt_shape
  • scripts/run_tests.sh tests/hermes_cli/test_goals.py
  • git diff --check origin/main...HEAD

@alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/cli (CLI entry point, hermes_cli/, setup wizard) labels on May 2, 2026

Development

Successfully merging this pull request may close these issues.

/goal judge produces false positive when agent claims file created but write silently fails

2 participants