feat(aprender-serve): apr-cli-trace-save-tensor-v1 write_stage_file composition #1136
Merged
…omposition

Per apr-cli-trace-save-tensor-v1.yaml v1.0.0 PROPOSED: combine #1133's byte-format primitives with #1135's directory-layout helpers into a single ergonomic API. The future apr trace --save-tensor CLI calls write_stage_file(dir, layer, stage, values) once per (layer, stage) without managing file handles or paths separately.

New module crates/aprender-serve/src/inference_trace/save_tensor_compose.rs:
- pub fn write_stage_file(dir, layer, stage_name, values) -> Result<PathBuf>
  * ensure_layer_dir → File::create → BufWriter → write_tensor_file → flush
  * Returns resolved path so callers can log it / pass to apr diff
- pub fn read_stage_file(path) -> Result<(TensorHeader, Vec<f32>)>
  * Symmetric one-shot reader for apr diff --values consumers
- thiserror-derived WriteStageError (Io)

10 unit tests cover:
- write_stage_file_per_layer_roundtrip (canonical case)
- write_stage_file_whole_model_roundtrip (no layer-* segment)
- write_stage_file_creates_missing_parent (mkdir -p)
- write_stage_file_truncates_existing (no append behavior)
- write_stage_file_zero_length_tensor (12-byte file)
- write_stage_file_preserves_nan_inf (sign-bit roundtrip via to_bits())
- write_stage_file_header_has_expected_magic_and_layer (raw byte check)
- read_stage_file_propagates_missing_path
- write_stage_file_returns_resolved_path_for_logging (matches output_path)
- write_then_read_three_stages_in_one_layer (mirrors FALSIFY-APR-TRACE-SAVE-005 multi-stage scenario at the filesystem level: 3 distinct .bin files under same layer-N/)

Live results: 10 passed; 0 failed.

Save-tensor contract progress: 4 modules now in main: save_tensor (#1133) + save_tensor_paths (#1135) + save_tensor_compose (this PR), plus save_tensor_stage in flight (#1134). The CLI-wiring PR now needs only to: parse stages via SaveTensorStage::from_str + call write_stage_file at each capture point in the forward pass.

Five-Whys (Toyota Way):
Why 1: SHIP-007 next-session priority is per-stage element-wise diff.
Why 2: 3 building blocks already merged (byte format + paths) but no single ergonomic API. CLI authors would have to re-derive the compose pattern, which invites drift.
Why 3: A 60-LOC wrapper + 10 tests pins the writer ↔ reader ↔ layout invariants in one place, including BufWriter::flush() durability + truncating-not-appending semantics.
Why 4: write_stage_file_returns_resolved_path_for_logging asserts the returned path matches output_path(); downstream tooling (apr diff --values) relies on this invariant.
Why 5: §26.8 stack-tool-extension methodology: extend apr in falsifier-sized slices.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request on Apr 29, 2026
…1137)

Per apr-cli-trace-save-tensor-v1.yaml v1.0.0 PROPOSED: integration tests that exercise the public API of #1133 byte format + #1135 path helpers exactly as a future apr trace --save-tensor CLI implementation will, and as apr diff --values will when reading the produced files. These complement the unit tests in those modules with public-API-surface assertions, catching regressions that internal tests can miss.

New file crates/aprender-serve/tests/save_tensor_integration.rs (5 tests):
- falsify_apr_trace_save_002_byte_determinism_two_writes
  Two writes with identical inputs MUST produce byte-identical files. Partial-discharge of FALSIFY-APR-TRACE-SAVE-002 at the library level.
- falsify_apr_trace_save_004_header_format_via_public_api
  Reads raw file bytes, verifies APRT magic, decodes header via parse_header, asserts header.total_file_size() == actual file size, decodes f32 LE body element-wise. Partial-discharge of FALSIFY-APR-TRACE-SAVE-004 at the public-API surface (complements unit tests).
- falsify_apr_trace_save_005_three_stages_one_layer_independent_files
  Three writes (embedding, ffn_gate, ffn_swigl) at layer 0 produce exactly 3 distinct .bin files under layer-0/, each with its own correct dim_product. Partial-discharge of FALSIFY-APR-TRACE-SAVE-005 at the filesystem level (complements parser-level test in #1134).
- whole_model_stages_dont_collide_with_per_layer_zero
  Writes lm_head at WHOLE_MODEL_LAYER and at layer=0; verifies both files exist at distinct paths and preserve their own dim_product values. Defends against future bugs where the WHOLE_MODEL_LAYER sentinel is miscompared at the path-builder layer.
- parse_header_on_truncated_file_errors_via_public_api
  Writes 8 bytes of an APRT header (truncated below the 12-byte minimum); parse_header MUST error cleanly. Defends against silent zero-fill on filesystem corruption.

Live results from cargo test -p aprender-serve --test save_tensor_integration:
test result: ok. 5 passed; 0 failed.

Save-tensor contract progress:
- 4 lib modules merged/in-flight (#1133 + #1134 + #1135 + #1136)
- 2 public-API integration tests added (this PR)
- Independent of all in-flight save-tensor PRs (#1134, #1136); compiles against the modules already in main from #1133 + #1135.

Five-Whys (Toyota Way):
Why 1: SHIP-007 next-session priority is per-stage element-wise diff.
Why 2: Lib-level unit tests cover internal-state invariants well; but a public-API caller can violate invariants the unit tests can't see (e.g., header offsets at the byte level).
Why 3: Integration tests against the same public surface that the future CLI uses catch a different regression class.
Why 4: 3 of the 5 falsifiers (002, 004, 005) are partial-dischargeable at the integration level today without waiting on the CLI.
Why 5: §26.8 stack-tool-extension methodology: extend apr in falsifier-sized slices.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
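The header checks described in this commit (APRT magic, a 12-byte minimum, and `total_file_size()` matching the file on disk) can be illustrated with a small standalone sketch. The field order, the `u32` widths, and the `HeaderError` type below are assumptions made for illustration; the real `parse_header` and `TensorHeader` in aprender-serve may differ.

```rust
// Illustrative sketch only: the 12-byte layout assumed here is
// magic "APRT" (4 bytes) + layer (u32 LE) + dim_product (u32 LE).
// `HeaderError` is a hypothetical error type, not the crate's.

#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    /// Fewer than 12 bytes were available.
    Truncated { got: usize },
    /// The first four bytes were not b"APRT".
    BadMagic,
}

#[derive(Debug, PartialEq, Eq)]
pub struct TensorHeader {
    pub layer: u32,
    pub dim_product: u32,
}

impl TensorHeader {
    /// Header bytes plus one 4-byte f32 per element.
    pub fn total_file_size(&self) -> u64 {
        12 + 4 * u64::from(self.dim_product)
    }
}

/// Decode the assumed header, refusing short input instead of zero-filling.
pub fn parse_header(bytes: &[u8]) -> Result<TensorHeader, HeaderError> {
    if bytes.len() < 12 {
        return Err(HeaderError::Truncated { got: bytes.len() });
    }
    if &bytes[0..4] != b"APRT" {
        return Err(HeaderError::BadMagic);
    }
    let layer = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    let dim_product = u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]);
    Ok(TensorHeader { layer, dim_product })
}
```

Checking the length before the magic means an 8-byte fragment reports truncation rather than being misread, which is the failure mode the truncated-file test guards against.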
Summary
Per `apr-cli-trace-save-tensor-v1.yaml` v1.0.0 PROPOSED: combine #1133's byte-format primitives with #1135's directory-layout helpers into a single ergonomic API. The future `apr trace --save-tensor` CLI calls `write_stage_file(dir, layer, stage, values)` once per (layer, stage) without managing file handles or paths separately.
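As a rough illustration of the composition described above, the following std-only sketch chains directory creation, `File::create`, `BufWriter`, a tensor writer, and `flush`. It is not the code from this PR: the header layout (magic, layer, element count), the `WHOLE_MODEL_LAYER` value, the `layer-N/<stage>.bin` naming, and the use of `io::Result` in place of `WriteStageError` are all assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// Hypothetical sentinel value; the real constant's value is not shown in the PR.
pub const WHOLE_MODEL_LAYER: u32 = u32::MAX;

/// Assumed layout: `<dir>/layer-<N>/<stage>.bin`, or `<dir>/<stage>.bin`
/// for whole-model stages (no layer-* segment).
pub fn output_path(dir: &Path, layer: u32, stage: &str) -> PathBuf {
    if layer == WHOLE_MODEL_LAYER {
        dir.join(format!("{stage}.bin"))
    } else {
        dir.join(format!("layer-{layer}")).join(format!("{stage}.bin"))
    }
}

/// Assumed byte format: 12-byte header (b"APRT", layer, count) then f32 LE values.
fn write_tensor_file<W: Write>(w: &mut W, layer: u32, values: &[f32]) -> io::Result<()> {
    if values.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "tensor too large"));
    }
    w.write_all(b"APRT")?;
    w.write_all(&layer.to_le_bytes())?;
    w.write_all(&(values.len() as u32).to_le_bytes())?;
    for v in values {
        w.write_all(&v.to_le_bytes())?;
    }
    Ok(())
}

/// One call per (layer, stage): create parents, truncate, write, flush.
pub fn write_stage_file(
    dir: &Path,
    layer: u32,
    stage: &str,
    values: &[f32],
) -> io::Result<PathBuf> {
    let path = output_path(dir, layer, stage);
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?; // mkdir -p behaviour
    }
    // File::create truncates an existing file, so repeated writes never append.
    let mut writer = BufWriter::new(File::create(&path)?);
    write_tensor_file(&mut writer, layer, values)?;
    writer.flush()?; // surface buffered write errors before returning
    Ok(path)
}
```

Returning the resolved path from the same function that builds it keeps the logged location and the written location from drifting apart, which is the invariant the description attributes to `write_stage_file_returns_resolved_path_for_logging`.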
What's added
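New module `crates/aprender-serve/src/inference_trace/save_tensor_compose.rs` (~240 LOC):
- `pub fn write_stage_file(dir, layer, stage_name, values) -> Result<PathBuf>`: runs `ensure_layer_dir` → `File::create` → `BufWriter` → `write_tensor_file` → `flush`, and returns the resolved path so callers can log it or pass it to `apr diff`.
- `pub fn read_stage_file(path) -> Result<(TensorHeader, Vec<f32>)>`: symmetric one-shot reader for `apr diff --values` consumers.
- `thiserror`-derived `WriteStageError` (`Io`).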
Tests (10 unit tests, all green)
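One of the listed tests, `write_stage_file_preserves_nan_inf`, compares values through `to_bits()`. The sketch below shows why that comparison is needed: NaN never equals itself under `==`, so a round-trip check has to compare bit patterns. The helper names (`encode_body`, `decode_body`, `bits_equal`) are hypothetical and cover only the f32 little-endian body, not the header.

```rust
/// Serialize values as consecutive little-endian f32 words.
/// (Helper names in this sketch are hypothetical, not from the PR.)
pub fn encode_body(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Decode a body; returns None if the length is not a multiple of 4.
pub fn decode_body(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Bitwise equality: distinguishes +0.0 from -0.0 and treats identical
/// NaN bit patterns as equal, unlike `==` on f32.
pub fn bits_equal(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```

A round trip of `[f32::NAN, f32::NEG_INFINITY, -0.0]` passes `bits_equal` but would fail a plain `assert_eq!` on the vectors because of the NaN entry.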
Live verification
```
$ cargo test -p aprender-serve --lib inference_trace::save_tensor_compose
test result: ok. 10 passed; 0 failed; 0 ignored
```
Save-tensor contract progress
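After this PR, 4 modules will be in main:
- `save_tensor` (#1133)
- `save_tensor_paths` (#1135)
- `save_tensor_compose` (this PR)
- `save_tensor_stage` (#1134, still in flight)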
The CLI-wiring PR now needs only to: parse stages via `SaveTensorStage::from_str` + call `write_stage_file` at each capture point in the forward pass.
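A hedged sketch of what that wiring might look like follows. The `SaveTensorStage` variants are inferred from stage names mentioned on this page (`embedding`, `ffn_gate`, `ffn_swigl`, `lm_head`) and the real enum in #1134 may have a different set; the comma-separated input format, `parse_stage_list`, and `capture` are hypothetical helpers, and the writer is passed in as a closure so the sketch does not depend on the real `write_stage_file` signature.

```rust
use std::io;
use std::str::FromStr;

/// Assumed subset of stages; the real enum lives in save_tensor_stage (#1134).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SaveTensorStage {
    Embedding,
    FfnGate,
    FfnSwigl,
    LmHead,
}

impl SaveTensorStage {
    /// Stage name as used in the output file name.
    pub fn name(self) -> &'static str {
        match self {
            Self::Embedding => "embedding",
            Self::FfnGate => "ffn_gate",
            Self::FfnSwigl => "ffn_swigl",
            Self::LmHead => "lm_head",
        }
    }
}

impl FromStr for SaveTensorStage {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "embedding" => Ok(Self::Embedding),
            "ffn_gate" => Ok(Self::FfnGate),
            "ffn_swigl" => Ok(Self::FfnSwigl),
            "lm_head" => Ok(Self::LmHead),
            other => Err(format!("unknown stage: {other}")),
        }
    }
}

/// Hypothetical helper: parse a comma-separated stage list from a CLI argument.
pub fn parse_stage_list(arg: &str) -> Result<Vec<SaveTensorStage>, String> {
    arg.split(',').map(SaveTensorStage::from_str).collect()
}

/// Hypothetical capture point: call the writer only for requested stages.
/// Returns Ok(true) if the stage was written, Ok(false) if it was skipped.
pub fn capture<F>(
    requested: &[SaveTensorStage],
    stage: SaveTensorStage,
    layer: u32,
    values: &[f32],
    mut write: F,
) -> io::Result<bool>
where
    F: FnMut(u32, &str, &[f32]) -> io::Result<()>,
{
    if !requested.contains(&stage) {
        return Ok(false);
    }
    write(layer, stage.name(), values)?;
    Ok(true)
}
```

In a forward pass, each point that produces a tensor would call `capture` with a closure that forwards to `write_stage_file`, so unrequested stages cost only a slice lookup.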
Why this is small
This PR is tight: 1 new file (~240 LOC including tests), 1 line added to `mod.rs`. No CLI surface change, no behavior change to existing binaries. Depends on #1133 + #1135 (both merged).
Five-Whys (Toyota Way)
Test plan
🤖 Generated with Claude Code