feat(prompt): add audit-mode rails for review/critique tasks #611
Merged
When the user asks Reasonix Code to audit its own architecture, the existing "cite or shut up" rule covers absence claims but doesn't catch the more common audit-mode failure: confident, well-structured proposals built on factually wrong premises about runtime behavior, fabricated quantities, or recommendations that contradict pinned memory. Adds a six-bullet section after "Cite or shut up": auto-preview is for locating not auditing, flag→consumer trace before claiming runtime behavior, no fabricated percentages, schema-cost accounting for new-tool proposals, MEMORY.md as design constraint, and user-facing ≠ model-facing as a category-error guardrail. Closes #610.
prompt: tighten rails #2 and #6 from real audit-session failure modes

Audit session run against the original 6-rail section (#610) showed two failures the wording didn't catch:

1. **Inventory-claim hallucination.** Asked which tools have `stormExempt: true`, the model enumerated 6 file-system tools as having it; only 2 actually do. The rail said "trace flag to consumer", which the model interpreted as "for one named tool", not "for an inventory claim covering many tools." Add an explicit inventory clause: grep the flag, don't enumerate from memory.
2. **Library API → dead-code mischaracterization.** The model labeled `registerSubagentTool` "dead code from CLI perspective" on the basis of a clean grep in `src/cli/`. It's a deliberate library export consumed by embedders via `src/index.ts`. The rail enumerated three surfaces (slash / tools / UI); add a fourth (library) so library exports aren't mistaken for unused code.

Two-test bump on `tests/code-prompt.test.ts` so the tightened wording can't silently regress.
prompt: broaden rail #1 to cover doc / state claims, not just runtime

Second audit session run against the tightened rails (#611) showed the same head-only-then-conclude failure mode again, this time on a plan-doc file rather than runtime code: the model read the head of `docs/plans/architecture-refactoring-roadmap.md`, saw "8 services still use singletons", and asserted the plan was now stale, without reading the rest of the doc to check for a "Status: done" section that might have been there. The original rail was scoped to "runtime behavior" because that was the loop.ts dispatcher case from #610. The same blind spot applies to any file: don't conclude what's in the elided middle from head + tail alone. Broaden the wording to cover runtime behavior, current architectural state, and doc freshness explicitly.

One test bump on `tests/code-prompt.test.ts` so the broader scope can't silently regress to runtime-only.
ChasLui pushed a commit to ChasLui/DeepSeek-Reasonix that referenced this pull request on May 23, 2026
Summary
Closes #610.
When users ask Reasonix Code to audit its own architecture, the
existing "Cite or shut up" rule catches absence-style claims but not
the more common audit-mode failure: confident proposals built on
factually wrong premises about runtime behavior, invented quantities,
or recommendations that contradict pinned memory.
Adds a six-bullet `# When auditing or reviewing this codebase`
section right after "Cite or shut up". Placed there because both are
about evidence integrity for evaluative claims, and audit-mode is
the most common context where a model produces evaluative output.
The six rails:

1. Auto-preview is for locating, not auditing. Previews come
   back as `head + tail` with the middle elided; re-read with
   `range:"A-B"` against the actual section before asserting what's
   there (covers runtime behavior, current architectural state, and
   doc freshness).
2. Trace flag → consumer before claiming to understand behavior;
   grep the flag's consumer first. For inventory claims ("which
   tools have flag F?"), grep; don't enumerate from memory.
3. No fabricated percentages. Cite a measurement or hedge; never
   present unmeasured numbers as measured.
4. Schema-cost accounting for new-tool proposals. State (a) why
   existing-tool composition fails, (b) description-token cost, (c)
   why a prompt change can't reach the same end.
5. `MEMORY.md` is a design constraint. Recommendations
   contradicting loaded user feedback are wrong by construction.
6. User-facing ≠ model-facing. There are four surfaces: slash
   commands (user), tools (model), UI (user), library exports
   (`src/index.ts`). Promoting a user-level feature to a model tool
   breaks user-control invariants; treating a library export as
   "dead code" because the CLI doesn't register it misreads the
   design.
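
As an illustration of the inventory clause in rail #2 ("grep, don't enumerate from memory"), here is a minimal sketch. The helper `filesSettingFlag`, the file names, and the tool definitions are all hypothetical stand-ins, not Reasonix's actual code; the point is only that the answer is derived from the source text rather than recalled.

```typescript
// Hypothetical sketch: helper name, paths, and tool definitions are illustrative only.

/** Return the (sorted) names of files whose source text sets `<flag>: true`. */
function filesSettingFlag(sources: Record<string, string>, flag: string): string[] {
  // Escape regex metacharacters so the flag name is matched literally.
  const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\s*:\\s*true\\b`);
  return Object.entries(sources)
    .filter(([, text]) => pattern.test(text))
    .map(([name]) => name)
    .sort();
}

// Made-up tool definitions standing in for real source files.
const exampleSources: Record<string, string> = {
  "tools/read.ts": "export const read = { name: 'read', stormExempt: true };",
  "tools/write.ts": "export const write = { name: 'write' };",
  "tools/list.ts": "export const list = { name: 'list', stormExempt: true };",
  "tools/move.ts": "export const move = { name: 'move', stormExempt: false };",
};

// The inventory comes from scanning the text, not from what "should" be there.
const flagged = filesSettingFlag(exampleSources, "stormExempt");
console.log(flagged); // [ 'tools/list.ts', 'tools/read.ts' ]
```

A text scan like this only finds literal `flag: true` assignments; flags set indirectly (spread objects, computed keys) would still need a consumer trace.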
Tightening pass (5093c79)
First real audit session against the original six rails surfaced
two failure modes the original wording didn't catch:
- Rail #2's inventory clause was added after the model claimed 6
  file-system tools had `stormExempt: true` (only 2 actually do).
  The original rail covered the single-tool case; the inventory case
  needed its own clause.
- Rail #6's library-facing surface was added after the model
  labeled `registerSubagentTool` "dead code from CLI perspective";
  it's a deliberate `src/index.ts` export consumed by embedders.

Broadening pass (cd60b80)
A second session against the tightened rails showed the same
head-only-then-conclude failure on a plan-doc file: the model read
the head of `docs/plans/architecture-refactoring-roadmap.md`, saw
"8 services still use singletons", and asserted the plan was stale
without checking the rest of the doc. Rail #1's "runtime behavior"
scope was inherited from the loop.ts dispatcher case from #610; it
is now broadened to cover doc freshness and architectural-state
claims too.
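
To make the head-only-then-conclude failure concrete, here is a minimal sketch of a head + tail preview hiding a status note that a ranged re-read would surface. `previewHeadTail`, `readRange`, and the plan-doc contents are hypothetical illustrations; the real auto-preview and `range:"A-B"` implementations may differ.

```typescript
// Hypothetical sketch: these helpers are stand-ins, not the actual auto-preview code.

interface Preview {
  shown: string[];     // lines the reader actually sees
  elidedCount: number; // lines hidden in the middle
}

/** Keep the first `head` and last `tail` lines; everything between is elided. */
function previewHeadTail(lines: string[], head: number, tail: number): Preview {
  if (lines.length <= head + tail) {
    return { shown: [...lines], elidedCount: 0 };
  }
  return {
    shown: [...lines.slice(0, head), ...lines.slice(lines.length - tail)],
    elidedCount: lines.length - head - tail,
  };
}

/** Read an inclusive, 1-indexed line range, in the spirit of a ranged re-read. */
function readRange(lines: string[], from: number, to: number): string[] {
  return lines.slice(from - 1, to);
}

// Made-up plan doc: the status note sits in the middle, outside head and tail.
const planDoc: string[] = [
  "# Roadmap",
  "8 services still use singletons",
  "## Phase 1",
  "details...",
  "## Status: done",
  "more details...",
  "## Appendix",
  "references",
];

const preview = previewHeadTail(planDoc, 2, 2);
const seenInPreview = preview.shown.some((l) => l.includes("Status: done"));
const seenInRange = readRange(planDoc, 3, 6).some((l) => l.includes("Status: done"));
console.log(preview.elidedCount, seenInPreview, seenInRange); // 4 false true
```

Any conclusion about the doc's freshness drawn from `preview.shown` alone would miss the four elided lines, which is the gap the broadened rail #1 is meant to close.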
Test plan
- `tests/code-prompt.test.ts` extended with audit-rail asserts
  (now 8 assertions across the 6 rails), anchored on stable
  phrases plus the concrete tool / param tokens (`range:"A-B"`,
  `parallelSafe?: boolean`, `40-60% tokens`,
  `tighten prompt / existing tool`, `grep the flag`,
  ``library exports (`src/index.ts`)``,
  `current architectural state`,
  `whether a plan doc is still accurate`).
- `npm run typecheck` / `npm run lint` clean
- `npx vitest run tests/code-prompt.test.ts`: 25 pass
- `npx vitest run tests/comment-policy.test.ts`: 9 pass
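
For readers unfamiliar with phrase-anchored prompt tests, the idea can be sketched as below. This is an assumption-laden illustration, not the contents of `tests/code-prompt.test.ts`: `missingAnchors` and `samplePrompt` are hypothetical, and the real suite runs under vitest rather than the plain checks shown here.

```typescript
// Hypothetical sketch: the prompt text and helper are illustrative, not the real test file.

/** Return the anchors that do not occur verbatim in the prompt. */
function missingAnchors(prompt: string, anchors: readonly string[]): string[] {
  return anchors.filter((a) => !prompt.includes(a));
}

// Stand-in prompt fragment; the real prompt wording is not reproduced here.
const samplePrompt = [
  "# When auditing or reviewing this codebase",
  'Re-read with range:"A-B" before asserting current architectural state.',
  "For inventory claims, grep the flag.",
].join("\n");

// A subset of the anchor tokens listed in the test plan.
const anchors = ['range:"A-B"', "grep the flag", "current architectural state"] as const;

// With the broadened wording in place, nothing is missing.
const missing = missingAnchors(samplePrompt, anchors);

// Simulate a regression back to runtime-only scope: the anchor check catches it.
const regressedPrompt = samplePrompt.replace("current architectural state", "runtime behavior");
const missingAfterRegression = missingAnchors(regressedPrompt, anchors);

console.log(missing, missingAfterRegression); // [] [ 'current architectural state' ]
```

Anchoring on short literal tokens keeps the checks stable under rewording elsewhere in the prompt while still failing if a rail's key phrase is dropped.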