feat(contract): clean-chat-output-v1 — codify M287 cascade sanitization invariants by noahgift · Pull Request #1859 · paiml/aprender

noahgift · 2026-05-21T16:22:38Z

Summary

Authors the provable contract behind `clean_chat_output` so the six invariants established by the M287 → #1852 → #1853 cascade are falsifier-backed instead of merely tested.

Why

The cascade fixed three things in concert:

- #1852, "fix(try_qwen3_moe_backend): populate stop_tokens with EOS — fixes M287 runaway 'Human:' generation": EOS stop-token detection (`<|im_end|>` / `<|endoftext|>`)
- #1853, "fix(clean_chat_output): strip leading turn-marker prefix at start-of-string": leading "Human:"/"User:"/"Assistant:" prefix strip
- M287 surface: 'Human: I need to...' runaway pattern post-EOS-miss

The implementation (`crates/aprender-serve/src/api/realize_handlers.rs::clean_chat_output`) already exists; this contract retroactively codifies its guarantees so that future stop-sequence changes require a contract bump alongside the code change. It also hooks `pv lint` and contract-coverage audits onto a previously uncontracted sanitization layer.
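The coupling between the code constant and the contract could, in principle, be checked mechanically. The following is only a sketch under stated assumptions: `missing_from_contract` is a hypothetical helper, not part of `pv` or this repository, and it performs a naive one-directional substring check (code to YAML) that ignores YAML escaping.

```rust
/// Hypothetical helper (not part of `pv` or aprender): returns every stop
/// sequence from the code-side list that does not appear verbatim in the
/// contract text. This is a naive substring check, so sequences that YAML
/// would escape (for example, ones containing newlines) are not handled.
fn missing_from_contract<'a>(stop_sequences: &[&'a str], contract_yaml: &str) -> Vec<&'a str> {
    stop_sequences
        .iter()
        .copied()
        // Keep only the sequences the contract text does not mention.
        .filter(|seq| !contract_yaml.contains(*seq))
        .collect()
}
```

An empty result means every code-side sequence is mentioned somewhere in the contract text; it does not show that the contract lists nothing extra, so a full two-way sync would need a real YAML parse.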

Six falsifiers

| ID | Guarantee |
| --- | --- |
| V1_001 | Leading "Human:" / "User:" / "Assistant:" stripped |
| V1_002 | Stop sequence inside body truncates at first occurrence |
| V1_003 | Earliest stop sequence wins when multiple are present |
| V1_004 | Clean text passes through (trim-only) |
| V1_005 | Empty / whitespace / stop-only collapses to "" |
| V1_006 | STOP_SEQUENCES code constant ↔ contract YAML stay synced |
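To make guarantees V1_001 through V1_005 concrete, here is a minimal sketch of a sanitizer that would satisfy them. It is not the code in `realize_handlers.rs`: the function name `clean_chat_output_sketch`, the contents of both constants, and the order of the steps are assumptions made for illustration, since the PR does not show the real `STOP_SEQUENCES` list.

```rust
// Assumed stop sequences; the real STOP_SEQUENCES constant is not shown in this PR.
const STOP_SEQUENCES: &[&str] = &["<|im_end|>", "<|endoftext|>"];
// Turn markers named by V1_001.
const TURN_PREFIXES: &[&str] = &["Human:", "User:", "Assistant:"];

/// Illustrative sanitizer, not the actual `clean_chat_output`.
fn clean_chat_output_sketch(raw: &str) -> String {
    let mut text = raw.trim_start();

    // V1_001: strip a single leading turn marker, if one is present.
    for prefix in TURN_PREFIXES {
        if let Some(rest) = text.strip_prefix(*prefix) {
            text = rest;
            break;
        }
    }

    // V1_002 / V1_003: cut at the earliest occurrence of any stop sequence.
    let cut = STOP_SEQUENCES
        .iter()
        .filter_map(|seq| text.find(*seq))
        .min()
        .unwrap_or(text.len());

    // V1_004 / V1_005: clean text is only trimmed; empty, whitespace-only,
    // or stop-only input collapses to the empty string.
    text[..cut].trim().to_string()
}
```

Under these assumptions, `clean_chat_output_sketch("Human: I need to")` yields `"I need to"`, and `clean_chat_output_sketch("Hello<|im_end|>junk")` yields `"Hello"`.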

All six are already covered by existing unit tests in `crates/aprender-serve/src/api/realize_handlers_clean_chat.rs` and `crates/aprender-serve/src/api/tests/format_chat_02.rs`.

Validation

```bash
$ cargo run -p aprender-contracts-cli --bin pv -- validate contracts/clean-chat-output-v1.yaml
0 error(s), 0 warning(s)
Contract is valid.
```

🤖 Generated with Claude Code

…on invariants

Author the provable contract behind `clean_chat_output` so the six invariants established by the M287 → #1852 → #1853 cascade are falsifier-backed instead of merely tested.

## Why

The cascade fixed three things in concert:

- #1852: EOS stop-token detection (`<|im_end|>` / `<|endoftext|>`)
- #1853: leading "Human:"/"User:"/"Assistant:" prefix strip
- M287 surface: 'Human: I need to...' runaway pattern post-EOS-miss

The implementation (`crates/aprender-serve/src/api/realize_handlers.rs::clean_chat_output`) already lives; this contract retroactively codifies its guarantees so future stop-sequence changes require a contract bump alongside the code change. Hooks `pv lint` / contract-coverage audits onto a previously-uncontracted sanitization layer.

## Six falsifiers

- V1_001: leading "Human:" / "User:" / "Assistant:" stripped
- V1_002: stop sequence inside body truncates at first occurrence
- V1_003: earliest stop sequence wins when multiple are present
- V1_004: clean text passes through (trim-only)
- V1_005: empty / whitespace / stop-only collapses to ""
- V1_006: STOP_SEQUENCES code constant ↔ contract YAML stay synced (manual audit for now; could be pv-lint check later)

## Evidence

All six are already covered by existing unit tests in `crates/aprender-serve/src/api/realize_handlers_clean_chat.rs` and `crates/aprender-serve/src/api/tests/format_chat_02.rs`.

## Validation

`pv validate contracts/clean-chat-output-v1.yaml` → "Contract is valid."

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

noahgift enabled auto-merge (squash) May 21, 2026 16:22

noahgift added 2 commits May 21, 2026 18:49

Merge branch 'main' into feat/clean-chat-output-v1-contract

c8a27e1

Merge branch 'main' into feat/clean-chat-output-v1-contract

7bf721c

noahgift merged commit fa67c9e into main May 21, 2026
10 checks passed

noahgift deleted the feat/clean-chat-output-v1-contract branch May 21, 2026 17:35

noahgift merged 3 commits into main from feat/clean-chat-output-v1-contract
