prompt: harden audit-mode reasoning against known failure modes #610

@esengine

When users ask Reasonix Code to audit the project's own architecture ("what tools are missing?", "review our prompt system"), the model produces confident, well-structured recommendations built on factually wrong premises. The existing prompt rules don't catch it.

What we saw (real session, 2026-05-10)

The model was asked to audit the tool registry and proposed eight new tools. Checked against the code:

  • read_files (batch read): the model claimed each read_file call is sequential and round-trips per file. In fact, read_file is annotated parallelSafe: true (src/tools/filesystem.ts:152), and the dispatcher already chunks consecutive parallel-safe calls into concurrent batches (src/loop.ts:1060-1068). The model read tools.ts (and saw the flag) and read loop.ts twice, first with head and then with range:"200-350"; both reads miss the dispatcher at L1060.
  • Structured git_status / git_diff / git_log / git_branch: the model claimed each git command goes through human approval. Nine git read commands are already in BUILTIN_ALLOWLIST (src/tools/shell/parse.ts:6-14), so they trigger zero confirmation prompts.
  • checkpoint_create / checkpoint_restore exposed to the model: checkpoints already auto-snapshot on plan-step continue (src/cli/ui/App.tsx:3134), and restore is intentionally user-controlled via /checkpoint. Putting checkpoint_restore on the model is also a category error: it lets the model silently roll back user work.
  • preview_diff: edit_file / multi_edit already return a diff after applying (renderEditDiff in src/tools/fs/edit.ts). A pre-apply preview adds no information when SEARCH/REPLACE is deterministic.
  • Fabricated "40-60% token savings": never measured, and not derivable from anything in context.
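
For readers who have not opened src/loop.ts, the batching behavior that the read_files proposal overlooked can be sketched roughly as follows. This is a minimal illustration, not the code at src/loop.ts:1060-1068: the ToolCall shape and the chunkCalls and dispatch names are assumptions made for the example.

```typescript
// Hypothetical shapes and names; the real types in src/tools and
// src/loop.ts may differ.
interface ToolCall {
  name: string;
  parallelSafe?: boolean;
}

// Group consecutive parallel-safe calls into one batch. Every other call
// becomes a batch of its own, so the original ordering is preserved.
function chunkCalls(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (call.parallelSafe) {
      current.push(call);
      continue;
    }
    if (current.length > 0) {
      batches.push(current);
      current = [];
    }
    batches.push([call]);
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Run batches in order; calls inside one batch run concurrently.
async function dispatch<T>(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const batch of chunkCalls(calls)) {
    results.push(...(await Promise.all(batch.map(run))));
  }
  return results;
}
```

Under this sketch, several back-to-back read_file calls already land in a single concurrent batch, which is why a dedicated batch-read tool would add schema cost without removing any round trips.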

Why the existing rules don't catch this

src/code/prompt.ts:23-39 ("Cite or shut up") covers "X is missing / Y is not implemented" claims and forces a search_content before asserting absence. Good, but the failures above aren't absence claims — they're:

  • positive claims about runtime behavior (model didn't trace flag → consumer)
  • fabricated quantities (no rule against unmeasured percentages)
  • design proposals contradicting MEMORY.md feedback
  • model-facing recommendations for user-facing features

Failure modes the prompt should explicitly call out

  1. Auto-preview is for locating, not auditing. For ≥500-line files (loop.ts is 1185 lines), the default head + tail preview will hide the relevant runtime code. Audit-style tasks must use range:"A-B" to read the dispatcher / handler bodies in full.
  2. Flag-to-consumer trace. Reading a type definition (parallelSafe?: boolean) is not the same as understanding behavior. Before claiming "this tool runs in mode Y", grep the flag's consumer (isParallelSafe, stormExempt, etc.) and read the dispatch site.
  3. No fabricated percentages. "40-60% savings" looks load-bearing but is invented. Either compute against a real transcript / token count, or use hedged language ("small but non-zero", "may compound").
  4. Schema cost is real. Tool descriptions are part of every request. Recommending a new tool requires (a) which existing tool composition fails to do this, (b) rough description size, (c) why a prompt change can't achieve the same end. Default to "improve prompt or existing tool description" before "add tool".
  5. MEMORY.md is part of the design space. Recommendations must not contradict feedback already loaded into the session (e.g. "checkpoints are user-controlled, never auto-commit"). Cross-check the loaded memory before proposing.
  6. User-facing ≠ model-facing. Reasonix has three action surfaces: slash commands (user), tools (model), UI (user). Mechanically promoting a user-level feature to a model tool can break user control invariants.
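
Rule 1 comes down to a coverage check: did the lines actually read include the lines the claim depends on? A minimal sketch of that check is below. The Range type and the parseRange and isCovered helpers are hypothetical names for illustration, and the 100-line head size in the usage note is an assumed value, not the real preview size.

```typescript
// 1-based, inclusive line range. Hypothetical helper type for illustration.
type Range = { start: number; end: number };

// Parse a range:"A-B" style spec such as "200-350".
function parseRange(spec: string): Range {
  const m = /^(\d+)-(\d+)$/.exec(spec.trim());
  if (!m) throw new Error(`bad range: ${spec}`);
  const start = Number(m[1]);
  const end = Number(m[2]);
  if (start < 1 || end < start) throw new Error(`bad range: ${spec}`);
  return { start, end };
}

// True when every line of `target` falls inside at least one range that was read.
function isCovered(target: Range, read: Range[]): boolean {
  for (let line = target.start; line <= target.end; line++) {
    if (!read.some((r) => r.start <= line && line <= r.end)) return false;
  }
  return true;
}
```

For the session described above, with the head preview assumed to be 100 lines, isCovered(parseRange("1060-1068"), [parseRange("1-100"), parseRange("200-350")]) is false: the dispatcher was never in view, so no claim about its behavior was supported by what the model had read.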

Proposal

Add a "## When auditing or reviewing this codebase" section to src/code/prompt.ts, with the six rules above compressed to one line each. Approx 12-15 lines. PR to follow.
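
To give a sense of the size, one possible shape for the section is sketched below. The wording is a draft for discussion rather than the text of the eventual PR, and AUDIT_RULES is a hypothetical constant name; how src/code/prompt.ts actually assembles its sections is not shown here.

```typescript
// Draft only: illustrative wording, not the text of the eventual PR.
// AUDIT_RULES is a hypothetical name.
const AUDIT_RULES: string = [
  "## When auditing or reviewing this codebase",
  '- Auto-preview locates, it does not audit: for files of 500+ lines, read dispatcher / handler bodies in full with range:"A-B".',
  "- Trace flag to consumer: before claiming runtime behavior, grep the flag's consumer and read the dispatch site.",
  "- No invented percentages: compute from a real transcript or token count, or use hedged language.",
  "- Schema cost is real: before proposing a tool, state which existing composition fails, the rough description size, and why a prompt change cannot do it.",
  "- MEMORY.md is part of the design space: cross-check loaded feedback and never contradict it.",
  "- User-facing is not model-facing: do not promote slash-command or UI features to model tools without checking user-control invariants.",
].join("\n");
```

Each rule fits on one source line here; once wrapped to the file's line width it would likely fall in the 12-15 line range estimated above.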

Out of scope for this issue: tightening the existing "Cite or shut up" wording, or adding a runtime "fact-check before recommend" rail. Those are separate.

Metadata

    Labels

    enhancement (New feature or request), rfc (Architecture proposal / request for comments)