docs(ship-two-001): §26 + §26.8 — three-priority plan + apr-is-canonical stack-tool-extension rule — spec v2.69.0 → v2.71.0 by noahgift · Pull Request #1079 · paiml/aprender

noahgift · 2026-04-27T06:07:19Z

Summary

Two-section spec amendment: §26 codifies the three-priority next-session execution plan (P1/P2/P3 with falsifiable binding criteria), and §26.8 codifies a binding methodology rule discovered mid-session:

When a stack CLI lacks a feature, extend the tool via contract→code, NEVER route around to a non-stack shim like huggingface-cli.

§26.8 triggering incident (2026-04-27)

P1 sub-agent recommended huggingface-cli download --include 'data/train-000[0-7][0-9]-of-00880.parquet' because batuta hf pull lacks --include. This is muda — huggingface-cli is non-stack, batuta hf pull is stack-canonical. Reaching for a non-stack CLI to bypass a missing flag violates feedback_fix_root_cause_never_route_around.md and feedback_pv_not_bash_for_contracts.md.
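To make the scope of that `--include` pattern concrete, here is a minimal sketch of which shards it would select. It assumes fnmatch-style glob semantics and a hypothetical listing of 880 shards named `train-00000` through `train-00879` (inferred only from the `-of-00880` suffix); the actual repository layout is not confirmed by this PR.

```python
from fnmatch import fnmatchcase

PATTERN = "data/train-000[0-7][0-9]-of-00880.parquet"

# Assumption: 880 shards named train-00000 ... train-00879, as the
# "-of-00880" suffix suggests. The real repository listing may differ.
shards = [f"data/train-{i:05d}-of-00880.parquet" for i in range(880)]

# Keep only the shard names that match the glob, case-sensitively.
selected = [name for name in shards if fnmatchcase(name, PATTERN)]

print(len(selected))   # 80
print(selected[0])
print(selected[-1])
```

Under those assumptions the pattern picks the first 80 shards (indices 00000 to 00079), which is the subset a stack-native `--include` flag would need to reproduce.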

§26.8.1 Binding rule

When a stack CLI (batuta, apr, pv, …) lacks a feature:

1. Author contracts/<tool>-cli-<flag>-v1.yaml provable contract
2. Extend the tool via in-tree implementation
3. Use the extended stack tool

Acceptable narrow exceptions: one-off uv run --with data prep, one-off xxd forensics. Recurring workflows MUST extend the stack tool.

§26 priority matrix

Priority	Wall-time	Binding	Discharges
P1	~6-8 hr	manifest.total_tokens > 1e9 AND vocab_size == 50257	enables P2
P2	7.3 hr	best_val_loss < 9.75	up to 9 MODEL-2 PARTIALs
P3	~4 hr	APR vs GGUF layer-3 ffn_swigl ratio ≥10× or <2×	up to 5 MODEL-1 PARTIALs

P1 + P3 parallel; P2 gated on P1.
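The scheduling note above implies a critical path, which can be worked out from the matrix. This is a rough sketch that assumes the wall-time estimates simply add, that the only dependency is P2 after P1, and that the `~` figures are ranges; it does not include the extra code-authoring time introduced in §26.9.

```python
# Wall-time estimates in hours, copied from the matrix as (low, high) ranges.
P1 = (6.0, 8.0)
P2 = (7.3, 7.3)
P3 = (4.0, 4.0)

def critical_path(p1, p2, p3):
    """P2 starts only after P1; P3 runs alongside, so the longer branch wins."""
    low = max(p1[0] + p2[0], p3[0])
    high = max(p1[1] + p2[1], p3[1])
    return low, high

low, high = critical_path(P1, P2, P3)
print(f"critical path: {low:.1f}-{high:.1f} hr")
```

Because P3 is shorter than P1 alone, it never lands on the critical path under these estimates; the P1 then P2 chain determines the total.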

§26.9 Revised P1 chain (per §26.8 rule)

P1 now has a prerequisite:

P1.0 Author contracts/batuta-cli-pull-pattern-include-v1.yaml
P1.1 Implement batuta hf pull --include <glob> per contract
P1.2 Drift-prevention test
P1.3 THEN: batuta hf pull dataset codeparrot/github-code-clean --include '...'
P1.4 manifest.json.total_tokens > 1e9 AND vocab_size == 50257

Adds ~3-6 hours code-authoring before download, but produces a durable batuta improvement.
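The P1.4 binding criterion is mechanical enough to check in a few lines. The sketch below assumes `manifest.json` exposes `total_tokens` and `vocab_size` as top-level keys, which the PR implies but does not show; the function name `p1_failures` and the sample values are hypothetical.

```python
import json

MIN_TOKENS = 1e9          # criterion is strictly greater than 1e9
EXPECTED_VOCAB = 50257

def p1_failures(manifest: dict) -> list[str]:
    """Return human-readable reasons the P1 criterion is not met (empty if met)."""
    failures = []
    total = manifest.get("total_tokens")
    vocab = manifest.get("vocab_size")
    if not isinstance(total, (int, float)) or not total > MIN_TOKENS:
        failures.append(f"total_tokens={total!r} is not > 1e9")
    if vocab != EXPECTED_VOCAB:
        failures.append(f"vocab_size={vocab!r} is not 50257")
    return failures

# Hypothetical manifest content, for illustration only.
sample = json.loads('{"total_tokens": 1200000000, "vocab_size": 50257}')
print(p1_failures(sample))  # []
```

Returning the list of failures rather than a bare boolean keeps the check falsifiable in the sense the spec asks for: a failing run states which half of the criterion was missed.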

Coverage tally evolution (unchanged)

State	PARTIAL	DISCHARGED
Now	33	12
Both criteria met	19	26 (58% DISCHARGED)
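The "Both criteria met" row follows from the priority matrix, assuming every item listed as "up to" in the Discharges column actually flips. A short worked check:

```python
# Current tally from the table above.
partial_now, discharged_now = 33, 12

# Best case from the priority matrix: up to 9 MODEL-2 PARTIALs (P2)
# plus up to 5 MODEL-1 PARTIALs (P3).
max_flips = 9 + 5

partial_after = partial_now - max_flips
discharged_after = discharged_now + max_flips
total = partial_now + discharged_now

share = round(100 * discharged_after / total)
print(partial_after, discharged_after, f"{share}%")  # 19 26 58%
```

The total stays at 45 items; only the split between PARTIAL and DISCHARGED changes.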

Test plan

CI workspace-test passes
CI gate passes
Spec banner v2.71.0 reflects §26 + §26.8
§26.8 binding rule cross-referenced to existing feedback memories

🤖 Generated with Claude Code

…cal stack-tool-extension rule - spec v2.69.0 → v2.71.0

§26: Three-priority execution plan with falsifiable binding criteria. P1 (Stack v2 corpus), P2 (convergence run, depends P1), P3 (GGUF forward_traced for SHIP-007 pin). Maximum theoretical flip: 14 PARTIAL→DISCHARGED.

§26.8 (added 2026-04-27 after triggering incident): **BINDING METHODOLOGY RULE: `apr` is the canonical stack CLI post-monorepo. When `apr` lacks a feature, extend `apr` via contract→code, NEVER route around to non-stack shims like `huggingface-cli` or to deprecated namespaces like `batuta hf pull`.**

Triggering incident: P1 sub-agent recommended downloading codeparrot/github-code-clean via `huggingface-cli download --include '...'` because `apr pull` is model-only today (no dataset asset-type, no --include, no --license-allowlist). This violates three rules:
- feedback_fix_root_cause_never_route_around: missing surface is a feature gap, fix at root
- feedback_pv_not_bash_for_contracts: re-implementing what a stack tool should do via non-stack CLI is muda
- feedback_monorepo_single_source_of_truth: `apr` is canonical post-APR-MONO; `batuta hf pull` is deprecated namespace

Binding rule §26.8.1:
1. Author contracts/apr-cli-<subcommand>-v1.yaml
2. Extend `apr` via in-tree implementation
3. Use the extended `apr`

Acceptable exceptions: one-off `uv run --with` data-prep where no stack tool covers the niche; one-off xxd forensics. Recurring workflows (every dataset pull) MUST extend `apr`.

§26.9: P1 prerequisite chain:
P1.0 Author contracts/apr-cli-pull-dataset-v1.yaml
P1.1 Implement `apr pull dataset` with --include, --license-allowlist
P1.2 Drift-prevention test
P1.3 Update apr-cli-commands-v1.yaml registry per feedback_cli_subcommand_three_surface_drift
P1.4 THEN: `apr pull dataset codeparrot/github-code-clean --include '...' --license-allowlist '...' --output ...`
P1 manifest.json.total_tokens > 1e9 AND vocab_size == 50257

§26.2 corpus target updated: codeparrot/github-code-clean (directly downloadable; ~12-16B Python tokens after filters) replaces bigcode/the-stack-v2-dedup (uses Software Heritage IDs, too complex for session window). Sub-agent corpus survey ratified.

Adds ~3-6 hours code-authoring before download, but produces a durable `apr pull dataset` extension benefiting every future dataset pull, not a one-off shim.

Spec v2.69.0 → v2.70.0 (§26) → v2.71.0 (§26.8). Coverage unchanged at amendment: §26 is the plan, §26.8 the methodology clarification.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…irmed APR-side at inference.rs:160-164 - spec v2.71.0 → v2.72.0 (#1084)

Live evidence on noah-Lambda-Vector RTX 4090 2026-04-27. Built apr from PR #1083 branch (commits 77c016b + c657968 + f249464 from PR A+B+C cascade). Ran `apr trace --payload` on canonical 7B teacher in BOTH formats with identical prompt + tokenizer.

Result:

| Layer | APR ffn_swigl std | GGUF ffn_swigl std | Ratio |
|------:|------------------:|-------------------:|------:|
| 3 | 1.2216 | 0.0670 | 18.23x |

§26.4 binding criterion threshold: ≥10x → APR-side bug. **Observed 18.23x, 8x past the threshold, decisive verdict.**

The investigation chain that started in §15.4 (GPU GQA elimination) has reached its conclusion at §27:

§15.4 → §16 → §17 → §23 → §27 (this)
"Whole forward path" → "GPU eliminated" → "(layer=3, FFN sub-block)" → "(layer=3, ffn_swigl)" → "**APR-side at inference.rs:160-164**"

Cascade-damping signature confirmed:
- Layers 0-2: ratio ~1.1x (normal)
- Layer 3: 18.23x (anomaly)
- Layers 4-5: 3.3-4.5x (cascade)
- Layer 6+: ~1x (recovered)

This is consistent with a localized perturbation (off-by-one, buffer aliasing, or F32-vs-Q4K dequant defect at layer-3 specifically) rather than persistent residual-stream corruption.

Per §17.5, SHIP-007 fix discharges 5 MODEL-1 PARTIALs at once (SHIP-002/005/006/007/008). §26.5 expected coverage flip: 33+12 → 28+17 when fix lands.

§27 does NOT discharge by itself; it locates the bug for fixing. Next investigation reads `inference.rs:160-164` and tests 4 hypotheses:
1. Off-by-one slice indexing
2. Buffer aliasing (scratch reuse pattern)
3. F32-vs-Q4K dequant defect at layer-3 input range
4. Activation overflow (SiLU saturation amplifies multiply)

Methodology held throughout: zero eprintln!, zero route-arounds, apr is canonical (§26.8), all instrumentation via `apr trace --payload`. Lambda-labs lane pre-authorized.

Evidence persisted to evidence/ship-007-apr-vs-gguf-2026-04-27/:
- apr-trace.txt (13.5 KB)
- gguf-trace.txt (13.7 KB)
- binding-criterion-summary.json

Note: §27 reproduction requires PR #1081 + #1082 + #1083 cascade to merge first (the apr trace --payload <gguf> wiring is in PR C). Evidence was generated with a local build of PR #1083 branch.

Spec v2.71.0 → v2.72.0. Coverage flip pending fix.

Spec: SPEC-SHIP-TWO-001 §26.4 P3 verdict

References:
- §15.4 (PR #1062): GPU GQA eliminated
- §16 (PR #1063): APR CPU isolated
- §17 (PR #1064): layer-3 FFN sub-block
- §23 (PR #1075): layer-3 ffn_swigl named
- §26.8 (PR #1079): apr-is-canonical methodology rule
- PR #1081 (P3 PR A scaffold)
- PR #1082 (P3 PR B sub-FFN populate)
- PR #1083 (P3 PR C CLI wiring)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
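The layer-3 ratio reported in this commit message can be recomputed from the two standard deviations it quotes. The sketch below does that and applies the ≥10x threshold. The label for the <2x branch and the "inconclusive" middle band are naming assumptions made here, since this message only spells out what ≥10x means.

```python
# Layer-3 ffn_swigl standard deviations quoted in the commit message.
APR_STD = 1.2216
GGUF_STD = 0.0670

def verdict(ratio: float) -> str:
    """Classify the APR/GGUF ratio against the P3 thresholds (>=10x or <2x)."""
    if ratio >= 10.0:
        return "APR-side bug"      # stated in the commit message
    if ratio < 2.0:
        return "below 2x"          # assumed label; meaning not given here
    return "inconclusive"          # assumed label for the 2x-10x band

ratio = APR_STD / GGUF_STD
print(f"{ratio:.2f}x -> {verdict(ratio)}")  # 18.23x -> APR-side bug
```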

noahgift enabled auto-merge (squash) April 27, 2026 06:07

noahgift force-pushed the feat/spec-26-next-session-execution-plan branch from 99d3ad2 to e1c7ec9 Compare April 27, 2026 06:28

noahgift changed the title ~~docs(ship-two-001): §26 — three-priority execution plan + binding criteria — spec v2.69.0 → v2.70.0~~ docs(ship-two-001): §26 + §26.8 — three-priority plan + stack-tool-extension methodology rule — spec v2.69.0 → v2.71.0 Apr 27, 2026

noahgift force-pushed the feat/spec-26-next-session-execution-plan branch from e1c7ec9 to 7eae6d7 Compare April 27, 2026 06:36

noahgift merged commit 1556983 into main Apr 27, 2026
10 checks passed

noahgift deleted the feat/spec-26-next-session-execution-plan branch April 27, 2026 07:11

noahgift mentioned this pull request Apr 27, 2026

docs(ship-two-001): §29 — EOD 2026-04-27 goal recap + coverage scoreboard — spec v2.73.0 → v2.74.0 #1087

Merged

4 tasks

noahgift merged 1 commit into main from feat/spec-26-next-session-execution-plan
