
feat(pretrain): add wall_ms to StepMetrics — Residual B per spec §19.4 #1069

Merged
noahgift merged 1 commit into main from feat/wall-ms-per-step-residual-b
Apr 26, 2026
Conversation

@noahgift

Contributor

Summary

  • Adds wall_ms: f32 to StepMetrics, the 7th required field per contracts/training-loop-pretrain-v1.yaml v1.4.0 → v1.5.0 (additive minor bump).
  • Discharges the code prerequisite for §19.4 Residual B — promoting GATE-GPUTRAIN-004 (per-step latency budget < 500ms on RTX 4090 / 370M) to ACTIVE_WITH_LIVE_EVIDENCE.
  • Live evidence dispatch (50-step cuda:0 run with wall_ms persisted) is the operator follow-up.

What this PR delivers

  • Contract bump: training-loop-pretrain-v1.yaml v1.4.0 → v1.5.0 with wall_ms in per_step_metrics.required + consistency invariant note (tokens_per_sec * (wall_ms / 1000.0) ≈ batch_tokens).
  • Struct field: StepMetrics.wall_ms: f32 with #[serde(default)] for backward compat on older JSONL.
  • Producer wired: PretrainLoop::train_step populates wall_ms from the same t0.elapsed() span as tokens_per_sec — single-source derivation prevents independent drift.
  • Validation: validate_finite() rejects non-finite or negative wall_ms.
  • 3 new tests covering negative/NaN rejection + consistency invariant.
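The validation rule above can be sketched as follows. This is a minimal illustration, not the code in pretrain.rs: `StepMetricsSketch` is a hypothetical stand-in that carries only `wall_ms`, and the `String` error type is an assumption.

```rust
/// Hypothetical, trimmed-down stand-in for `StepMetrics`.
/// The real struct has seven required fields; only `wall_ms` is modeled here.
#[derive(Debug, Clone, Copy)]
struct StepMetricsSketch {
    wall_ms: f32,
}

impl StepMetricsSketch {
    /// Rejects NaN, infinite, or negative `wall_ms`.
    /// The `String` error type is an assumption made for this sketch.
    fn validate_finite(&self) -> Result<(), String> {
        if !self.wall_ms.is_finite() {
            return Err(format!("wall_ms is not finite: {}", self.wall_ms));
        }
        if self.wall_ms < 0.0 {
            return Err(format!("wall_ms is negative: {}", self.wall_ms));
        }
        Ok(())
    }
}

fn main() {
    // One passing value and the three kinds of rejected value.
    for wall_ms in [250.0_f32, -1.0, f32::NAN, f32::INFINITY] {
        let metrics = StepMetricsSketch { wall_ms };
        println!("{:?} -> {:?}", metrics, metrics.validate_finite());
    }
}
```

Note that with `#[serde(default)]` an absent field reads back as 0.0, which a check of this shape accepts.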

Test plan

  • `pv validate contracts/training-loop-pretrain-v1.yaml`: 0 errors, 0 warnings
  • `cargo test -p aprender-train --release --lib pretrain::tests::`: 25 passed (was 22, +3 new)
  • `cargo build --workspace --release`: succeeds, so no downstream consumer broke
  • PMAT pre-commit gates pass

What this does NOT do

  • Does NOT capture live evidence on a cuda:0 dispatch — that's the operator-action step in Residual B.
  • Does NOT promote GATE-GPUTRAIN-004 to ACTIVE_WITH_LIVE_EVIDENCE — needs the live dispatch + persisted evidence file at evidence/task-132/gputrain-004-live-2026-04-XX.{json,csv}.

References

Closes task #156.

🤖 Generated with Claude Code

Per `contracts/training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0
(additive minor): adds the 7th required field `wall_ms: f32` to
per-step JSONL emission. Discharges §19.4 Residual B prerequisite
for promoting GATE-GPUTRAIN-004 (per-step latency budget < 500ms
on RTX 4090 / 370M) to ACTIVE_WITH_LIVE_EVIDENCE.

## What changed

- `contracts/training-loop-pretrain-v1.yaml` v1.4.0 → v1.5.0:
  - per_step_metrics.required adds `wall_ms` (f32)
  - Includes consistency invariant note: tokens_per_sec * (wall_ms / 1000.0) ≈ batch_tokens
- `StepMetrics` (pretrain.rs:104-118) gains `wall_ms: f32` field
  - `#[serde(default)]` to keep older JSONL parseable on read
  - Doc-comment cites the contract version
- `PretrainLoop::train_step` (pretrain.rs:565-586) populates wall_ms
  from the same `t0.elapsed()` span as tokens_per_sec — single-source
  derivation prevents independent drift
- `validate_finite()` rejects non-finite or negative wall_ms
- 3 new unit tests:
  - `step_metrics_rejects_negative_wall_ms`
  - `step_metrics_rejects_nan_wall_ms`
  - `step_metrics_wall_ms_consistent_with_tokens_per_sec`
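A minimal sketch of the single-source derivation and the consistency invariant described above. The helper names `timing_from_span` and `is_consistent`, the tolerance, and the batch size of 4096 tokens are assumptions for illustration; the actual logic lives in `PretrainLoop::train_step`.

```rust
use std::time::{Duration, Instant};

/// Derives both timing metrics from ONE elapsed span so they cannot drift apart.
/// Returns `(tokens_per_sec, wall_ms)`. Hypothetical helper, not the real API.
fn timing_from_span(elapsed: Duration, batch_tokens: u64) -> (f32, f32) {
    let secs = elapsed.as_secs_f64();
    let wall_ms = (secs * 1000.0) as f32;
    let tokens_per_sec = if secs > 0.0 {
        (batch_tokens as f64 / secs) as f32
    } else {
        0.0
    };
    (tokens_per_sec, wall_ms)
}

/// Checks tokens_per_sec * (wall_ms / 1000.0) ≈ batch_tokens
/// within a relative tolerance (the tolerance value is an assumption).
fn is_consistent(tokens_per_sec: f32, wall_ms: f32, batch_tokens: u64, rel_tol: f64) -> bool {
    let implied = tokens_per_sec as f64 * (wall_ms as f64 / 1000.0);
    let expected = batch_tokens as f64;
    (implied - expected).abs() <= rel_tol * expected.max(1.0)
}

fn main() {
    let t0 = Instant::now();
    // Stand-in for the real training step.
    std::thread::sleep(Duration::from_millis(5));
    let (tokens_per_sec, wall_ms) = timing_from_span(t0.elapsed(), 4096);
    println!("tokens_per_sec={tokens_per_sec:.1} wall_ms={wall_ms:.3}");
    assert!(is_consistent(tokens_per_sec, wall_ms, 4096, 1e-3));
}
```

Because both values come from the same `Duration`, the invariant holds up to `f32` rounding; two separate clock reads would not give that guarantee.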

## Backward compat

`#[serde(default)]` on the new field means JSONL emitted by older
binaries (without wall_ms) still deserializes — wall_ms defaults to
0.0 when absent. Newly emitted JSONL always has the field set.

## Test plan
- [x] `pv validate` on contract: 0 errors, 0 warnings
- [x] `cargo test -p aprender-train --release --lib pretrain::tests::`: 25 passed (was 22)
- [x] `cargo build --workspace --release` succeeds
- [x] No downstream consumer of StepMetrics broke

## What this does NOT do

- Does NOT capture live evidence on a cuda:0 dispatch (Residual B
  step 2 — operator action, scoped to a follow-up).
- Does NOT promote GATE-GPUTRAIN-004 to ACTIVE_WITH_LIVE_EVIDENCE
  (that requires the live dispatch + persisted evidence file).

This PR closes the *code* prerequisite for GATE-GPUTRAIN-004
discharge. The next operator dispatch can persist `wall_ms` per-step
and use it as the GATE-004 verdict input.

Closes task #156.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge (squash) April 26, 2026 08:49
@noahgift noahgift merged commit 9d9a390 into main Apr 26, 2026
11 checks passed
@noahgift noahgift deleted the feat/wall-ms-per-step-residual-b branch April 26, 2026 09:47
noahgift added a commit that referenced this pull request Apr 26, 2026
…2.64.0 → v2.65.0

§19 verified `apr pretrain --device cuda` is wired but the canonical
apr binary lacked `--features cuda`. §20 records the next step:
**rebuild + live dispatch + evidence capture** on RTX 4090.

## What §20 contains (9 subsections)

1. §20.1 — Rebuild (40s incremental, `--features cuda` enabled apr-cli)
2. §20.2 — Live dispatch command + 100-step JSONL output
3. §20.3 — wall_ms statistics: median=264.74ms (47% headroom under
   GATE-GPUTRAIN-004's 500ms budget)
4. §20.4 — nvidia-smi PID 1658504 / 6636 MiB GPU memory captured mid-run
5. §20.5 — Gate-by-gate impact table (GATE-GPUTRAIN-002/003/004/005)
6. §20.6 — Evidence files at evidence/task-132-residual-b/
7. §20.7 — Long-path status: §19.5 step (a) DONE
8. §20.8 — What §20 is NOT (contract bump is follow-up PR)
9. §20.9 — Methodological alignment (live-evidence pattern, not chain-of-thought)

## Live evidence captured

- 100 real CUDA training steps on noah-Lambda-Vector RTX 4090
- Real corpus: /mnt/nvme-raid0/data/csn-python-shards
- Real tokenizer: /mnt/nvme-raid0/models/model-2-tokenizer-v1 (vocab=50,257)
- wall_ms median: 264.74 ms (range 257.86–467.66 with step 0 = 467.66
  kernel-warmup outlier)
- train_loss step 0=11.02 → step 99=10.50 (Δ=−0.52, decreasing)
- val_loss=10.31 triggered GATE-TRAIN-005 ship-blocker abort at epoch
  boundary (correct behavior for fresh-init 370M before convergence)
- nvidia-smi PID 1658504 / 6636 MiB stable mid-run
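The headroom figure follows from the reported median and the 500 ms budget: (500 − 264.74) / 500 ≈ 47%. A minimal sketch of that arithmetic, assuming hypothetical helpers `median` and `headroom_pct`; the sample holds only the three wall_ms values quoted above, not the full 100-step series.

```rust
/// Median of a sample; sorts a copy and averages the two middle
/// values when the length is even. Returns None for an empty slice.
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Percentage of the latency budget left unused by `value_ms`.
fn headroom_pct(value_ms: f64, budget_ms: f64) -> f64 {
    (budget_ms - value_ms) / budget_ms * 100.0
}

fn main() {
    // Only the min, median, and step-0 max quoted in this message;
    // the real evidence file has 100 per-step values.
    let quoted_wall_ms = [467.66, 257.86, 264.74];
    let budget_ms = 500.0; // GATE-GPUTRAIN-004 per-step budget

    let med = median(&quoted_wall_ms).expect("non-empty sample");
    println!("median = {med:.2} ms");
    println!("median headroom = {:.1}%", headroom_pct(med, budget_ms));
    println!("step-0 headroom = {:.1}%", headroom_pct(467.66, budget_ms));
}
```

Even the step 0 warmup outlier (467.66 ms) stays under the 500 ms budget, so every quoted value passes on its own, not just the median.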

## Spec progression

v2.64.0 → v2.65.0. Coverage tally update is **pending** the contract
bump for `gpu-training-backend-v1.yaml` GATE-GPUTRAIN-004
PARTIAL_ALGORITHM_LEVEL → ACTIVE_WITH_LIVE_EVIDENCE (separate
follow-up PR; §20 records the data, the contract amendment captures
the durable verdict).

## Stacks under

- #1068 (§19 — task #132 correction)
- #1067 (§18 — training status snapshot)
- Concrete progress on §19.4 Residual B (live evidence half)
- Pairs with PR #1069 (wall_ms code half — provided the JSONL field
  used for the GATE-GPUTRAIN-004 timing data)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
noahgift added a commit that referenced this pull request Apr 26, 2026
…2.64.0 → v2.65.0 (#1070)
