feat(pretrain): add wall_ms to StepMetrics - Residual B per spec §19.4 #1069
Merged
Per `contracts/training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0 (additive minor): adds the 7th required field `wall_ms: f32` to per-step JSONL emission. Discharges the §19.4 Residual B prerequisite for promoting GATE-GPUTRAIN-004 (per-step latency budget < 500ms on RTX 4090 / 370M) to ACTIVE_WITH_LIVE_EVIDENCE.

## What changed

- `contracts/training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0:
  - per_step_metrics.required adds `wall_ms` (f32)
  - Includes consistency invariant note: tokens_per_sec * (wall_ms / 1000.0) ≈ batch_tokens
- `StepMetrics` (pretrain.rs:104-118) gains `wall_ms: f32` field
  - `#[serde(default)]` to keep older JSONL parseable on read
  - Doc-comment cites the contract version
- `PretrainLoop::train_step` (pretrain.rs:565-586) populates wall_ms from the same `t0.elapsed()` span as tokens_per_sec; single-source derivation prevents independent drift
- `validate_finite()` rejects non-finite or negative wall_ms
- 3 new unit tests:
  - `step_metrics_rejects_negative_wall_ms`
  - `step_metrics_rejects_nan_wall_ms`
  - `step_metrics_wall_ms_consistent_with_tokens_per_sec`

## Backward compat

`#[serde(default)]` on the new field means JSONL emitted by older binaries (without wall_ms) still deserializes; wall_ms defaults to 0.0 when absent. Newly emitted JSONL always has the field set.

## Test plan

- [x] `pv validate` on contract: 0 errors, 0 warnings
- [x] `cargo test -p aprender-train --release --lib pretrain::tests::`: 25 passed (was 22)
- [x] `cargo build --workspace --release` succeeds
- [x] No downstream consumer of StepMetrics broke

## What this does NOT do

- Does NOT capture live evidence on a cuda:0 dispatch (Residual B step 2: operator action, scoped to a follow-up).
- Does NOT promote GATE-GPUTRAIN-004 to ACTIVE_WITH_LIVE_EVIDENCE (that requires the live dispatch + persisted evidence file).

This PR closes the *code* prerequisite for GATE-GPUTRAIN-004 discharge. The next operator dispatch can persist `wall_ms` per-step and use it as the GATE-004 verdict input.
Closes task #156.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
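The single-source derivation and the `wall_ms` validation rule described under "What changed" can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in pretrain.rs: the `StepTiming` struct and the `derive_timing` and `validate_wall_ms` functions are hypothetical names, and the real `StepMetrics` carries further fields that are not shown here.

```rust
use std::time::Duration;

/// Hypothetical pair of timing fields; the real `StepMetrics` has more fields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct StepTiming {
    pub tokens_per_sec: f32,
    pub wall_ms: f32,
}

/// Derive both timing fields from one elapsed span so they cannot drift apart.
pub fn derive_timing(batch_tokens: u32, elapsed: Duration) -> StepTiming {
    let secs = elapsed.as_secs_f32();
    StepTiming {
        // Guard against a zero-length span to avoid dividing by zero.
        tokens_per_sec: if secs > 0.0 {
            batch_tokens as f32 / secs
        } else {
            0.0
        },
        wall_ms: secs * 1000.0,
    }
}

/// Reject non-finite or negative wall_ms, following the rule stated for
/// `validate_finite()` (the error type here is an assumption).
pub fn validate_wall_ms(wall_ms: f32) -> Result<(), String> {
    if !wall_ms.is_finite() {
        return Err(format!("wall_ms is not finite: {wall_ms}"));
    }
    if wall_ms < 0.0 {
        return Err(format!("wall_ms is negative: {wall_ms}"));
    }
    Ok(())
}
```

Because both fields come from the same `Duration`, a step of 4096 tokens taking 250 ms yields `wall_ms = 250.0` and `tokens_per_sec = 16384.0`, which satisfies the contract's consistency note by construction.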
This was referenced Apr 26, 2026
noahgift added a commit that referenced this pull request Apr 26, 2026

…2.64.0 → v2.65.0

§19 verified `apr pretrain --device cuda` is wired but the canonical apr binary lacked `--features cuda`. §20 records the next step: **rebuild + live dispatch + evidence capture** on RTX 4090.

## What §20 contains (9 subsections)

1. §20.1: Rebuild (40s incremental, `--features cuda` enabled apr-cli)
2. §20.2: Live dispatch command + 100-step JSONL output
3. §20.3: wall_ms statistics: median=264.74ms (47% headroom under GATE-GPUTRAIN-004's 500ms budget)
4. §20.4: nvidia-smi PID 1658504 / 6636 MiB GPU memory captured mid-run
5. §20.5: Gate-by-gate impact table (GATE-GPUTRAIN-002/003/004/005)
6. §20.6: Evidence files at evidence/task-132-residual-b/
7. §20.7: Long-path status: §19.5 step (a) DONE
8. §20.8: What §20 is NOT (contract bump is follow-up PR)
9. §20.9: Methodological alignment (live-evidence pattern, not chain-of-thought)

## Live evidence captured

- 100 real CUDA training steps on noah-Lambda-Vector RTX 4090
- Real corpus: /mnt/nvme-raid0/data/csn-python-shards
- Real tokenizer: /mnt/nvme-raid0/models/model-2-tokenizer-v1 (vocab=50,257)
- wall_ms median: 264.74 ms (range 257.86–467.66 with step 0 = 467.66 kernel-warmup outlier)
- train_loss step 0=11.02 → step 99=10.50 (Δ=−0.52, decreasing)
- val_loss=10.31 triggered GATE-TRAIN-005 ship-blocker abort at epoch boundary (correct behavior for fresh-init 370M before convergence)
- nvidia-smi PID 1658504 / 6636 MiB stable mid-run

## Spec progression

v2.64.0 → v2.65.0. Coverage tally update is **pending** the contract bump for `gpu-training-backend-v1.yaml` GATE-GPUTRAIN-004 PARTIAL_ALGORITHM_LEVEL → ACTIVE_WITH_LIVE_EVIDENCE (separate follow-up PR; §20 records the data, the contract amendment captures the durable verdict).

## Stacks under

- #1068 (§19: task #132 correction)
- #1067 (§18: training status snapshot)
- Concrete progress on §19.4 Residual B (live evidence half)
- Pairs with PR #1069 (wall_ms code half; provided the JSONL field used for the GATE-GPUTRAIN-004 timing data)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
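The headroom figure quoted in §20.3 follows from the median and the 500 ms budget: (500 − 264.74) / 500 ≈ 47%. A small sketch of that arithmetic is shown below. The function names `median_ms` and `headroom_pct` are hypothetical and are not taken from the repository, and the sketch works on an in-memory slice rather than on the evidence JSONL.

```rust
/// Median of per-step wall_ms samples; returns None for empty input.
pub fn median_ms(samples: &[f32]) -> Option<f32> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    // total_cmp gives a total order over f32, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        // Even count: average the two middle values.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Headroom under a latency budget, expressed as a percentage of the budget.
pub fn headroom_pct(median: f32, budget_ms: f32) -> f32 {
    (budget_ms - median) / budget_ms * 100.0
}
```

Using the median rather than the mean keeps a single slow step, such as the step 0 kernel-warmup outlier reported above, from dominating the statistic.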
noahgift added a commit that referenced this pull request Apr 26, 2026

…2.64.0 → v2.65.0 (#1070)
## Summary

- Adds `wall_ms: f32` to `StepMetrics`, the 7th required field per `contracts/training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0 (additive minor bump).
- … `wall_ms` persisted) is the operator follow-up.

## What this PR delivers

- `training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0 with `wall_ms` in `per_step_metrics.required` + consistency invariant note (`tokens_per_sec * (wall_ms / 1000.0) ≈ batch_tokens`).
- `StepMetrics.wall_ms: f32` with `#[serde(default)]` for backward compat on older JSONL.
- `PretrainLoop::train_step` populates `wall_ms` from the same `t0.elapsed()` span as `tokens_per_sec`; single-source derivation prevents independent drift.
- `validate_finite()` rejects non-finite or negative `wall_ms`.

## Test plan

- `pv validate contracts/training-loop-pretrain-v1.yaml`: 0 errors, 0 warnings
- `cargo test -p aprender-train --release --lib pretrain::tests::`: 25 passed (was 22, +3 new)
- `cargo build --workspace --release`: succeeds, no downstream consumer broke

## What this does NOT do

- … `evidence/task-132/gputrain-004-live-2026-04-XX.{json,csv}`.

## References

- `docs/specifications/aprender-train/ship-two-models-spec.md`
- `memory/project_task_132_cuda_training_backend_gap.md` (updated description)

Closes task #156.
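A consumer of the per-step JSONL could check the consistency invariant noted in the contract along these lines. This is only a sketch: the function name `timing_invariant_holds` and the relative-tolerance parameter are assumptions, since the source states the relation with "≈" and does not give a tolerance.

```rust
/// Check that tokens_per_sec * (wall_ms / 1000.0) is approximately batch_tokens.
/// `rel_tol` is a hypothetical relative tolerance, e.g. 1e-3 for 0.1%.
pub fn timing_invariant_holds(
    tokens_per_sec: f32,
    wall_ms: f32,
    batch_tokens: f32,
    rel_tol: f32,
) -> bool {
    // Tokens implied by the two timing fields taken together.
    let implied = tokens_per_sec * (wall_ms / 1000.0);
    // Non-finite inputs or a non-positive batch size can never satisfy the check.
    if !implied.is_finite() || batch_tokens <= 0.0 {
        return false;
    }
    ((implied - batch_tokens) / batch_tokens).abs() <= rel_tol
}
```

Records written by older binaries deserialize with `wall_ms` defaulting to 0.0, so they would fail this check; a reader may want to skip the invariant for such records rather than treat them as corrupt.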
🤖 Generated with Claude Code