chore(agents): default bug-investigator to Opus + add grounding contract by Yeraze · Pull Request #3666 · Yeraze/meshmonitor

Yeraze · 2026-06-23T12:15:34Z

Summary

Hardens the meshmonitor-bug-investigator agent, which had been producing confidently wrong file paths, line numbers, and symbol names, plus false "X doesn't exist" claims, all of which had to be caught and corrected downstream.

Two root causes, both addressed:

It ran on Sonnet (meshtastic-expert is already Opus) → bumped to model: opus.
No requirement to verify what it cited → added a mandatory Grounding rules section.

Grounding rules added

Cite only what you opened — every path/line/symbol must come from a tool result in that session.
Verify before reporting — re-run ls / grep -n on each citation and keep the output; drop unconfirmable claims.
Prove absence — "X doesn't exist anywhere" requires the actual grep -rn command + its empty output (the most common failure mode).
Grep-first, no bulk-reads — locate with grep, read only ±30 lines; an errored/empty Read means it doesn't exist (don't synthesize contents).
Label confidence — every finding tagged CONFIRMED (with output) or HYPOTHESIS; trust the code over the user's framing when they conflict.
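The "verify before reporting" and "prove absence" rules can be illustrated with a small sketch. This is not part of the PR: it is a pure-Python stand-in for the `ls` / `grep -n` / `grep -rn` checks the rules name, and the function names `verify_citation` and `search_symbol` are hypothetical, chosen only for this example.

```python
from pathlib import Path


def verify_citation(root: Path, rel_path: str, line_no: int, symbol: str) -> tuple[bool, str]:
    """Check one cited path/line/symbol and return (confirmed, evidence).

    Hypothetical helper: mirrors `ls <path>` followed by `grep -n <symbol> <path>`.
    """
    path = root / rel_path
    if not path.is_file():
        return False, f"{rel_path}: no such file"
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    if not 1 <= line_no <= len(lines):
        return False, f"{rel_path}: has {len(lines)} lines, cited line {line_no}"
    text = lines[line_no - 1]
    # The evidence string is kept either way so the report can show what was seen.
    return symbol in text, f"{rel_path}:{line_no}:{text}"


def search_symbol(root: Path, symbol: str) -> list[str]:
    """Rough equivalent of `grep -rn <symbol> <root>`: one 'path:line:text' per hit.

    An empty list is the evidence required before claiming the symbol is absent.
    """
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for i, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append(f"{path.relative_to(root).as_posix()}:{i}:{line}")
    return hits
```

Under these assumptions, a citation would be dropped whenever `verify_citation` returns `False`, and an "X doesn't exist" claim would be reported only together with the empty result of `search_symbol`.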

The report's Evidence items must now be tool-confirmed.

Config-only change (.claude/agents/); no application code touched.

🤖 Generated with Claude Code

The meshmonitor-bug-investigator agent ran on Sonnet and had no requirement to verify the file paths / line numbers / symbols it cited, which led to confident hallucinations (nonexistent files, wrong field names, "X doesn't exist" claims that were false) that had to be caught and corrected downstream.

- Bump model sonnet -> opus (meshtastic-expert is already opus).
- Add a mandatory "Grounding rules" section: cite only what you opened; re-run ls/grep to confirm each citation before reporting; prove absence with a real grep + its empty output; grep-first instead of bulk-reading large files; and label every finding CONFIRMED (with tool output) or HYPOTHESIS.
- Require the report's Evidence items to be tool-confirmed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4

github-actions · 2026-06-23T12:15:51Z

Claude finished @Yeraze's task

PR Review: chore(agents): default bug-investigator to Opus + add grounding contract

Tasks

Read changed file
Review changes and provide feedback

Review

This is a clean, focused improvement to the agent configuration. The changes are well-motivated and well-executed.

Model upgrade (line 5)

Bumping to model: opus is appropriate given the agent's job — it needs strong reasoning to avoid hallucinating file paths and line numbers. The PR description notes meshtastic-expert is already on Opus, so this brings meshmonitor-bug-investigator in line with that precedent.

Grounding rules (lines 14–45)

The 5 rules are specific, actionable, and directly address the failure modes described in the PR summary:

Cite only what you opened — closes the "memory/guessing" loophole cleanly.
Verify before you report — the ls + grep -n verification step is concrete and easy for the model to follow.
Prove absence — requiring the actual empty grep -rn output to substantiate "X doesn't exist" is the right fix for the most common failure mode.
Grep first, never bulk-read — good constraint; the ±30-line guidance is specific enough to be actionable.
Label confidence — CONFIRMED/HYPOTHESIS tagging makes uncertainty visible to downstream engineers.
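As an illustration of the CONFIRMED / HYPOTHESIS tagging, here is a minimal sketch of how a finding record could enforce the rule that CONFIRMED items carry their tool evidence. The `Finding` and `Confidence` names and the rendered layout are assumptions made for this example, not the agent's actual output format.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    CONFIRMED = "CONFIRMED"
    HYPOTHESIS = "HYPOTHESIS"


@dataclass(frozen=True)
class Finding:
    """One Evidence item (hypothetical structure for illustration)."""

    claim: str
    confidence: Confidence
    command: str = ""  # the tool invocation that backs the claim
    output: str = ""   # its output; may be empty when proving absence

    def __post_init__(self) -> None:
        # A CONFIRMED claim needs the command that produced the evidence.
        # The output itself may be empty: an empty grep result is how absence is proven.
        if self.confidence is Confidence.CONFIRMED and not self.command.strip():
            raise ValueError("CONFIRMED findings must include the command that backs them")

    def render(self) -> str:
        lines = [f"[{self.confidence.value}] {self.claim}"]
        if self.command:
            lines.append(f"$ {self.command}")
            lines.append(self.output if self.output else "(no output)")
        return "\n".join(lines)
```

Requiring the command rather than non-empty output is a deliberate choice in this sketch: it lets an absence claim be CONFIRMED by an empty search result while still rejecting a CONFIRMED tag with no evidence at all.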

Evidence section in output format (lines 275–278)

The inline comment in the output template reinforces the grounding rules at the exact point where the agent would produce citations. This is a nice touch — it makes the rules hard to overlook even if the agent "skims" the system prompt.

Minor observations (no blocking issues)

The existing HYPOTHESIS / EVIDENCE / ROOT CAUSE block format at lines 128–135 predates the grounding rules and uses the HYPOTHESIS label. The new grounding rule 5 introduces CONFIRMED / HYPOTHESIS tagging on individual Evidence items rather than top-level sections. These aren't in conflict, but a future cleanup could unify the two patterns for clarity.
Rule 4 says "read only the targeted ±30 lines" but the tools frontmatter (line 4) gives the agent the read tool with no explicit constraint. The rule is advisory text; whether the model reliably respects the 30-line bound depends on model behavior. Low-risk for now, worth monitoring.
No test coverage for agent config files is expected or needed here; this is purely a prompt/config change.
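Regarding the second observation, the ±30-line bound exists only as prompt text. If it ever needed to be enforced in tooling rather than by instruction, a bounded read could look roughly like the sketch below. The helper name `read_window` and its parameters are hypothetical; nothing like it is part of this PR.

```python
from pathlib import Path


def read_window(path: Path, center: int, radius: int = 30) -> list[str]:
    """Return only the lines within `radius` of the 1-based line `center`.

    Each returned entry is prefixed with its line number, similar to `grep -n`.
    A `center` beyond the end of the file yields an empty list instead of
    invented content.
    """
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    start = max(1, center - radius)
    end = min(len(lines), center + radius)
    return [f"{n}:{lines[n - 1]}" for n in range(start, end + 1)]
```

The window is clamped at both ends of the file, so a hit near the top or bottom returns fewer than 61 lines.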

Overall: Straightforward, well-scoped hardening. No blocking concerns. ✓

Yeraze merged commit be88753 into main Jun 23, 2026
19 checks passed

Yeraze deleted the chore/bug-investigator-grounding-opus branch June 23, 2026 12:41
