perf: apply non-inferable-only principle to system prompts and memory injection #188

## Context

ETH Zurich research ([arXiv:2602.11988](https://arxiv.org/abs/2602.11988), AGENTbench, 138 tasks) found that repository context files (AGENTS.md, CLAUDE.md-style) often **hurt** AI agent performance:

- LLM-generated context: **-3% success rate** vs no context
- Human-written context: **+4% success rate**, but **+19-20% inference cost**
- Agents follow ALL instructions literally — including unnecessary ones (excessive testing, quality checks beyond scope)
- Generic architecture overviews don't reduce exploration — agents read code anyway

## The Non-Inferable Principle

System prompts and injected memories should include **only** information the agent cannot discover by reading the codebase or environment:

| Include (non-inferable) | Exclude (inferable) |
|------------------------|-------------------|
| Role constraints, authority | Architecture overviews |
| Custom build commands | File structure descriptions |
| Project-specific conventions | General coding best practices |
| Organizational policies (specific) | Generic quality rules |
| Prior decisions, historical outcomes | What's already in the code |
| Interpersonal context | Standard library usage |

## Current State Audit

The current prompt template (`engine/prompt_template.py` v1.2.0) is **mostly consistent** with this principle:
- Identity, Personality, Skills, Authority, Autonomy sections: all non-inferable ✅
- Task section: essential ✅
- Company Context: low-value generic info (department listing); already correctly marked as the first section to trim ✅
- **Tools section**: potentially redundant if tools are also passed via the API's tool_use mechanism ⚠️
- **org_policies**: content-dependent — no guidance on what makes a good policy ⚠️

## Tasks

### Prompt Builder (`engine/prompt.py`, `engine/prompt_template.py`)

- [ ] Add docstring/comment documenting the non-inferable-only principle
- [ ] Evaluate whether the Tools section is redundant when tools are passed via the LLM API's native tool mechanism — if so, skip the section or make it opt-in
- [ ] Consider making Company Context opt-in rather than default (currently always included if company is provided)
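
The opt-in behavior described in the tasks above could look roughly like the sketch below. It is illustrative only: `PromptOptions`, `build_sections`, and their parameters are hypothetical names, not the existing API of `engine/prompt.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PromptOptions:
    """Hypothetical switches for sections an agent can usually infer."""

    include_tools_section: bool = False    # tools already arrive via the API
    include_company_context: bool = False  # generic info, opt-in only


def build_sections(
    base_sections: dict[str, str],
    tools_text: Optional[str],
    company_text: Optional[str],
    options: PromptOptions = PromptOptions(),
) -> list[tuple[str, str]]:
    """Return (title, body) pairs, adding inferable sections only on request."""
    # Non-inferable sections (Identity, Authority, Task, ...) always pass through.
    sections = list(base_sections.items())
    if options.include_tools_section and tools_text:
        sections.append(("Tools", tools_text))
    if options.include_company_context and company_text:
        sections.append(("Company Context", company_text))
    return sections
```

With the defaults, a caller that supplies company data still gets a prompt without the Company Context section unless it sets `include_company_context=True`.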

### Org Policies Validation

- [ ] Add guidance/validation for `org_policies` content: policies should be actionable + non-inferable
- [ ] Document what makes a good org policy (specific conventions, custom tooling) vs bad (generic "maintain quality")
- [ ] Consider a `validate_policy_quality()` helper or at minimum a docstring contract
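
One possible shape for the `validate_policy_quality()` helper is sketched below. The phrase list and the "specificity" regex are assumptions chosen for illustration, and the heuristic is deliberately crude; it only flags candidates for human review.

```python
import re

# Hypothetical markers of generic, inferable policy text.
_GENERIC_PHRASES = (
    "maintain quality",
    "best practices",
    "write clean code",
)

# Rough signals of specificity: a backticked identifier, a path, or a number.
_SPECIFIC_PATTERN = re.compile(r"`[^`]+`|\b[\w.-]+/[\w./-]+|\d+")


def validate_policy_quality(policy: str) -> list[str]:
    """Return warnings for a policy string; an empty list means none found."""
    text = policy.strip()
    if not text:
        return ["policy is empty"]

    warnings = []
    lowered = text.lower()
    for phrase in _GENERIC_PHRASES:
        if phrase in lowered:
            warnings.append(f"generic phrase: {phrase!r}")
    if not _SPECIFIC_PATTERN.search(text):
        warnings.append("no concrete identifier, path, or number found")
    return warnings
```

Under this sketch, a policy such as "Maintain quality at all times." produces two warnings, while "Run `make lint-strict` before every commit." produces none.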

### Memory Injection (`memory/retrieval/`)

- [ ] Add a non-inferable filter stage to the context injection pipeline (Strategy 1)
- [ ] Filter should exclude memories that restate what's discoverable from code
- [ ] Consider relevance-score penalty for generic/inferable memories
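
The relevance-score penalty could be applied as a small re-ranking step after retrieval, as in the sketch below. `ScoredMemory`, `apply_inferable_penalty`, and the `penalty` and `min_score` defaults are hypothetical; only the `non-inferable` tag name comes from this issue.

```python
from dataclasses import dataclass

NON_INFERABLE_TAG = "non-inferable"


@dataclass(frozen=True)
class ScoredMemory:
    """Hypothetical stand-in for a retrieved memory with a relevance score."""

    content: str
    tags: frozenset[str]
    score: float


def apply_inferable_penalty(
    memories: list[ScoredMemory],
    penalty: float = 0.5,
    min_score: float = 0.3,
) -> list[ScoredMemory]:
    """Down-weight untagged memories, drop weak ones, and re-rank the rest."""
    adjusted = []
    for memory in memories:
        # Memories without the tag are assumed to restate discoverable facts.
        if NON_INFERABLE_TAG in memory.tags:
            score = memory.score
        else:
            score = memory.score * penalty
        if score >= min_score:
            adjusted.append(ScoredMemory(memory.content, memory.tags, score))
    return sorted(adjusted, key=lambda m: m.score, reverse=True)
```

A penalty keeps highly relevant but untagged memories available, whereas a hard filter would drop them outright; which behavior is preferable depends on how reliably memories are tagged.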

### Cost-Aware Context Budgeting

- [ ] Factor prompt token overhead into auto-loop selection cost estimation
- [ ] Track and log prompt tokens as a percentage of total run cost
- [ ] Consider a `prompt_cost_ratio` metric in `TaskCompletionMetrics`
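
A `prompt_cost_ratio` value could be computed as the prompt's share of total token cost. The sketch below is a standalone function with hypothetical parameter names; how it would attach to `TaskCompletionMetrics` is not specified here.

```python
def prompt_cost_ratio(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price: float,
    completion_price: float,
) -> float:
    """Return the fraction of a run's token cost spent on prompt tokens.

    Prices are per token and may use any currency unit, since only the
    ratio matters. Returns 0.0 when the run has no cost at all.
    """
    if min(prompt_tokens, completion_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if min(prompt_price, completion_price) < 0:
        raise ValueError("prices must be non-negative")

    prompt_cost = prompt_tokens * prompt_price
    total_cost = prompt_cost + completion_tokens * completion_price
    return prompt_cost / total_cost if total_cost > 0 else 0.0
```

For example, 3000 prompt tokens at a unit price of 1.0 and 1000 completion tokens at a unit price of 3.0 cost the same amount, so the ratio is 0.5.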

## References

- [Evaluating AGENTS.md (arXiv:2602.11988)](https://arxiv.org/abs/2602.11988)
- [Codified Context (arXiv:2602.20478)](https://arxiv.org/abs/2602.20478) — complementary research; reconciled in DESIGN_SPEC
- DESIGN_SPEC §6.5 (step 3, updated), §7.7 (updated with non-inferable filter note)

## Labels

`type:perf`, `scope:engine`, `scope:memory`

---

## Design Decisions Finalized

- **D22 (Remove Tools Section):** Do NOT list tools in the system prompt. The API's `tools` parameter already injects richer definitions with schemas, so the section is redundant. Removing it saves 200-400+ tokens per call (20%+ cost reduction). Behavioral guidance ("when to use") may be added later.
- **D23 (Memory Filter):** Pluggable `MemoryFilterStrategy` protocol. The initial implementation is tag-based and applied at write time: the `non-inferable` tag convention is enforced at the `MemoryBackend.store()` boundary. It reuses the existing `MemoryMetadata.tags` and `MemoryQuery.tags`, so no new models are needed.

**Common pattern:** All strategies use pluggable protocol interfaces with one initial implementation. Alternative strategies are documented in DESIGN_SPEC.md for future use.
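
As a rough illustration of the pluggable pattern in D23, the sketch below defines a protocol and one tag-based implementation. Only the names `MemoryFilterStrategy` and the `non-inferable` tag come from this issue; `MemoryEntry`, `TagBasedFilter`, the `select` method, and `inject` are hypothetical, and the real signatures of `MemoryBackend.store()`, `MemoryMetadata`, and `MemoryQuery` are not assumed.

```python
from dataclasses import dataclass, field
from typing import Protocol

NON_INFERABLE_TAG = "non-inferable"


@dataclass
class MemoryEntry:
    """Hypothetical stand-in for a stored memory and its metadata tags."""

    content: str
    tags: set[str] = field(default_factory=set)


class MemoryFilterStrategy(Protocol):
    """Anything with a matching `select` method satisfies the protocol."""

    def select(self, entries: list[MemoryEntry]) -> list[MemoryEntry]: ...


class TagBasedFilter:
    """Keep only entries that were tagged as non-inferable when stored."""

    def __init__(self, required_tag: str = NON_INFERABLE_TAG) -> None:
        self.required_tag = required_tag

    def select(self, entries: list[MemoryEntry]) -> list[MemoryEntry]:
        return [e for e in entries if self.required_tag in e.tags]


def inject(entries: list[MemoryEntry], strategy: MemoryFilterStrategy) -> list[str]:
    """Return the memory contents that the strategy allows into the prompt."""
    return [e.content for e in strategy.select(entries)]
```

Because `MemoryFilterStrategy` is a structural `Protocol`, an alternative strategy only needs a compatible `select` method and does not have to inherit from anything.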
