feat(prompt): add audit-mode rails for review/critique tasks by esengine · Pull Request #611 · esengine/DeepSeek-Reasonix

esengine · 2026-05-10T13:42:21Z

Summary

Closes #610.

When users ask Reasonix Code to audit its own architecture, the
existing "Cite or shut up" rule catches absence-style claims but not
the more common audit-mode failure: confident proposals built on
factually wrong premises about runtime behavior, invented quantities,
or recommendations that contradict pinned memory.

Adds a six-bullet `# When auditing or reviewing this codebase`
section right after "Cite or shut up". It is placed there because both are
about evidence integrity for evaluative claims, and audit mode is
the most common context in which a model produces evaluative output.

The six rails:

1. **Auto-preview is for locating, not auditing.** Long files come
   back as head + tail with the middle elided; re-read with
   `range:"A-B"` against the actual section before asserting what's
   there (covers runtime behavior, current architectural state, and
   doc freshness).
2. **Flag → consumer trace.** Reading a type field is not
   understanding behavior; grep the flag's consumer first. For
   inventory claims ("which tools have flag F?"), grep; don't
   enumerate from memory.
3. **No fabricated percentages.** Ground numbers in a cited
   measurement or hedge; never present unmeasured numbers as
   measured.
4. **Schema cost is real.** New-tool proposals must cover (a) which
   existing-tool composition fails, (b) description-token cost, (c)
   why a prompt change can't reach the same end.
5. **`MEMORY.md` is part of the design space.** Recommendations
   contradicting loaded user feedback are wrong by construction.
6. **User-facing ≠ model-facing ≠ library-facing.** Four action
   surfaces: slash commands (user), tools (model), UI (user),
   library exports (`src/index.ts`). Promoting a user-level feature
   to a model tool breaks user-control invariants; treating a
   library export as "dead code" because the CLI doesn't register
   it misreads the design.
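Rail 1 can be illustrated with a minimal sketch. The helpers `previewLines` and `readRange` are hypothetical and are not Reasonix's actual preview or read-tool code; the sketch only assumes that a preview keeps the head and tail of a file and that a range is a 1-indexed inclusive `"A-B"` string.

```typescript
// Hypothetical sketch: not Reasonix's actual preview or read-tool implementation.

/** Head + tail preview with the middle replaced by an elision marker. */
function previewLines(lines: string[], head: number, tail: number): string[] {
  if (lines.length <= head + tail) return [...lines];
  const elided = lines.length - head - tail;
  return [
    ...lines.slice(0, head),
    `[... ${elided} lines elided ...]`,
    ...lines.slice(lines.length - tail),
  ];
}

/** Parse a 1-indexed inclusive "A-B" range and return those lines. */
function readRange(lines: string[], range: string): string[] {
  const match = /^(\d+)-(\d+)$/.exec(range);
  if (!match) throw new Error(`invalid range: ${range}`);
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (start < 1 || end < start) throw new Error(`invalid range: ${range}`);
  return lines.slice(start - 1, end);
}

// A 10-line document whose decisive line sits in the elided middle.
const doc: string[] = Array.from({ length: 10 }, (_, i) => `line ${i + 1}`);
doc[5] = "Status: done";

const preview = previewLines(doc, 2, 2);
const previewSeesStatus = preview.some((l) => l.includes("Status: done"));
const rangeSeesStatus = readRange(doc, "5-7").some((l) => l.includes("Status: done"));
```

Here the preview never contains the `Status: done` line, while the ranged re-read does, which is why a conclusion drawn from head + tail alone is unsafe.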

Tightening pass (`5093c79`)

First real audit session against the original six rails surfaced
two failure modes the original wording didn't catch:

- Rail #2's inventory clause was added after the model claimed 6
  file-system tools had `stormExempt: true` (only 2 actually do).
  The original rail covered the single-tool case; the inventory case
  needed its own clause.
- Rail #6's library-facing surface was added after the model
  labeled `registerSubagentTool` "dead code from CLI perspective";
  it's a deliberate `src/index.ts` export consumed by embedders.
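The inventory clause amounts to deriving the list from the source text instead of recalling it. A minimal sketch, assuming tool definitions are available as plain strings: the file names and definitions below are invented for illustration, and `toolsWithFlag` is a hypothetical helper, not part of the codebase.

```typescript
// Hypothetical sketch: the file names and tool sources below are made up.

/** Return the names of sources in which `<flag>: true` literally appears. */
function toolsWithFlag(sources: Record<string, string>, flag: string): string[] {
  // Escape regex metacharacters so the flag is matched literally.
  const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`\\b${escaped}\\s*:\\s*true\\b`);
  return Object.entries(sources)
    .filter(([, text]) => pattern.test(text))
    .map(([name]) => name)
    .sort();
}

const sources: Record<string, string> = {
  "tool_a.ts": "export const toolA = { name: 'a', stormExempt: true };",
  "tool_b.ts": "export const toolB = { name: 'b' };",
  "tool_c.ts": "export const toolC = { name: 'c', stormExempt: false };",
  "tool_d.ts": "export const toolD = { name: 'd', stormExempt:true };",
};

// The inventory comes from the text, not from memory.
const exempt = toolsWithFlag(sources, "stormExempt");
```

A textual match like this only answers the inventory question; what the flag actually does still requires reading its consumer, as the first half of the rail says.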

Broadening pass (`cd60b80`)

A second session against the tightened rails showed the same
head-only-then-conclude failure on a plan-doc file: the model read
the head of `docs/plans/architecture-refactoring-roadmap.md`, saw
"8 services still use singletons", and asserted the plan was stale
without checking the rest of the doc. Rail #1's "runtime behavior"
scope was inherited from the `loop.ts` dispatcher case in #610; it
is now broadened to cover doc freshness and architectural-state
claims too.

Test plan

- `tests/code-prompt.test.ts` extended with audit-rail asserts
  (now 8 assertions across the 6 rails), anchored on stable
  phrases plus the concrete tool / param tokens (`range:"A-B"`,
  `parallelSafe?: boolean`, `40-60% tokens`, `tighten prompt / existing tool`,
  `grep the flag`, ``library exports (`src/index.ts`)``,
  `current architectural state`, `whether a plan doc is still accurate`).
- `npm run typecheck` / `npm run lint` clean
- `npx vitest run tests/code-prompt.test.ts`: 25 pass
- `npx vitest run tests/comment-policy.test.ts`: 9 pass
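The anchoring idea can be sketched as a plain substring check. This is not the contents of `tests/code-prompt.test.ts`: `missingAnchors` is a hypothetical helper, the sample prompt is a stand-in rather than the real system prompt, and the real tests presumably express the same checks as vitest assertions.

```typescript
// Hypothetical sketch: the prompt text is a stand-in, not the real system prompt.

/** Return every anchor phrase that does not appear verbatim in the prompt. */
function missingAnchors(prompt: string, anchors: readonly string[]): string[] {
  return anchors.filter((anchor) => !prompt.includes(anchor));
}

// Three of the anchor tokens listed in the test plan.
const anchors = ['range:"A-B"', "grep the flag", "current architectural state"] as const;

const samplePrompt = [
  "# When auditing or reviewing this codebase",
  '- Re-read with range:"A-B" before asserting runtime behavior,',
  "  current architectural state, or doc freshness.",
  "- For inventory claims, grep the flag.",
].join("\n");

// All anchors are present in the sample prompt.
const missing = missingAnchors(samplePrompt, anchors);

// Simulate a wording regression back to a narrower scope.
const missingAfterRegression = missingAnchors(
  samplePrompt.replace("current architectural state, or ", ""),
  anchors,
);
```

Anchoring on short, stable phrases keeps the check cheap while still failing when a rail's scope is silently narrowed.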

When the user asks Reasonix Code to audit its own architecture, the existing "cite or shut up" rule covers absence claims but doesn't catch the more common audit-mode failure: confident, well-structured proposals built on factually wrong premises about runtime behavior, fabricated quantities, or recommendations that contradict pinned memory. Adds a six-bullet section after "Cite or shut up": auto-preview is for locating not auditing, flag→consumer trace before claiming runtime behavior, no fabricated percentages, schema-cost accounting for new-tool proposals, MEMORY.md as design constraint, and user-facing ≠ model-facing as a category-error guardrail. Closes #610.

Audit session run against the original 6-rail section (#610) showed two failures the wording didn't catch:

1. **Inventory-claim hallucination.** Asked which tools have `stormExempt: true`, the model enumerated 6 file-system tools as having it; only 2 actually do. The rail said "trace flag to consumer", which the model interpreted as "for one named tool", not "for an inventory claim covering many tools." Add an explicit inventory clause: grep the flag, don't enumerate from memory.
2. **Library API → dead-code mischaracterization.** The model labeled `registerSubagentTool` "dead code from CLI perspective" on the basis of a clean grep in `src/cli/`. It's a deliberate library export consumed by embedders via `src/index.ts`. The rail enumerated three surfaces (slash / tools / UI); add a fourth (library) so library exports aren't mistaken for unused code.

Two-test bump on `tests/code-prompt.test.ts` so the tightened wording can't silently regress.

Second audit session run against the tightened rails (#611) showed the same head-only-then-conclude failure mode again, this time on a plan-doc file rather than runtime code: the model read the head of docs/plans/architecture-refactoring-roadmap.md, saw "8 services still use singletons", and asserted the plan was now stale — without reading the rest of the doc to check for a "Status: done" section that might have been there. The original rail was scoped to "runtime behavior" because that was the loop.ts dispatcher case from #610. The same blind spot applies to any file: don't conclude what's in the elided middle off head + tail. Broaden the wording to cover runtime behavior, current architectural state, and doc freshness explicitly. One test bump on tests/code-prompt.test.ts so the broader scope can't silently regress to runtime-only.


esengine added 3 commits May 10, 2026 06:40

esengine mentioned this pull request May 10, 2026

feat(tools): add delete/copy fs tools + compress shell descriptions #615

Merged

4 tasks

esengine merged commit 56fcb2a into main May 10, 2026
3 checks passed

esengine deleted the feat/audit-mode-rails branch May 10, 2026 14:39

esengine merged 3 commits into main from feat/audit-mode-rails
