Summary
apr cbtop --model-path <FILE> --headless --json produces completely fake BrickScore data in JSON output, despite the BrickProfiler collecting correct GPU timing internally. The stderr shows real data (LmHead avg=595µs) but JSON reports LmHead actual_us=1.9µs.
Root Cause
brick_scores_from_profiler() in gguf.rs hardcodes all scoring fields and divides by the wrong denominator.
Bugs (5 total)
Bug 1: gguf.rs:406-413 — brick_scores_from_profiler hardcodes everything
- score: 100, grade: "R", gap_factor: 1.0 are all hardcoded
- budget_us and actual_us are both set to per_token_us (total_ns / profiler.total_tokens)
- profiler.total_tokens() counts brick ELEMENTS (~952K), not decoded tokens (~3K)
- Result: all bricks show score=100, gap=1.0, actual=budget, which makes the output useless
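The denominator mix-up can be checked with a little arithmetic. The sketch below is not the code in gguf.rs; `us_per` is a hypothetical helper, and it assumes total_ns is the LmHead brick's accumulated time and rounds the counts to the approximate figures quoted above (3,000 calls, 952,000 elements). Under those assumptions it reproduces both the 595µs seen on stderr and the 1.9µs seen in the JSON.

```rust
/// Hypothetical helper: convert an accumulated time in nanoseconds into
/// microseconds per unit of `denominator`.
fn us_per(total_ns: u64, denominator: u64) -> f64 {
    total_ns as f64 / denominator as f64 / 1_000.0
}

/// Assumed figures, rounded from the numbers in this issue:
/// 3,000 LmHead calls at 595 µs each.
const LMHEAD_TOTAL_NS: u64 = 595_000 * 3_000;
const LMHEAD_CALLS: u64 = 3_000; // ~3K decoded tokens
const BRICK_ELEMENTS: u64 = 952_000; // ~952K, what total_tokens() reportedly returns

/// Per-call average, the quantity the profiler prints on stderr.
fn correct_actual_us() -> f64 {
    us_per(LMHEAD_TOTAL_NS, LMHEAD_CALLS)
}

/// Same total divided by the element count, the quantity that ends up in JSON.
fn buggy_actual_us() -> f64 {
    us_per(LMHEAD_TOTAL_NS, BRICK_ELEMENTS)
}
```

Dividing by calls gives 595µs; dividing by elements gives 1.875µs, which rounds to the 1.9µs reported in the JSON.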
Bug 2: cbtop_measure_batch.rs:328-335 — weighted score truncation
- pmat_brick_score uses 7 hardcoded weights [1.5, 6.0, 1.0, 10.0, 3.5, 1.5, 12.2]
- Real profiler returns 11 bricks; zip silently drops 4
- Aggregate brick_score is therefore wrong
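The truncation follows from how `Iterator::zip` works: it stops as soon as the shorter side is exhausted. The sketch below illustrates the effect with the seven weights listed above; the function names and the normalisation by the weight sum are assumptions for illustration, not the actual body of pmat_brick_score.

```rust
/// The seven weights quoted in this issue.
const WEIGHTS: [f64; 7] = [1.5, 6.0, 1.0, 10.0, 3.5, 1.5, 12.2];

/// Assumed shape of the buggy aggregate: scores are zipped with a fixed
/// weight array, so any score past weights.len() never contributes.
fn weighted_score_zip(scores: &[f64], weights: &[f64]) -> f64 {
    let weighted: f64 = scores.iter().zip(weights).map(|(s, w)| s * w).sum();
    let total_weight: f64 = weights.iter().sum();
    weighted / total_weight
}

/// How many (score, weight) pairs zip actually yields.
fn pairs_used(scores: &[f64], weights: &[f64]) -> usize {
    scores.iter().zip(weights).count()
}

/// Equal-weight alternative that adapts to any brick count.
/// Returns None for an empty slice instead of dividing by zero.
fn equal_weight_score(scores: &[f64]) -> Option<f64> {
    if scores.is_empty() {
        return None;
    }
    Some(scores.iter().sum::<f64>() / scores.len() as f64)
}
```

With 11 scores where the last four are 0, the zipped version still reports a perfect aggregate because those four bricks are never visited, while the equal-weight version reflects them.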
Bug 3: cbtop_measure_batch.rs:362-365 — hardcoded PMAT scores
- rust_project_score: 173.9, tdg_score: 98.1, cuda_tdg_score: 95.2 are constants
- Not computed from any real data
Bug 4: cbtop_measure_batch.rs:369-374 — hardcoded FalsificationSummary
- total_points: 137, passed: 137, failed: 0, blocked: 0
- No actual falsification tests are run
Bug 5: cbtop_measure_batch.rs:338 — hardcoded target
- target_tok_s = 976.0 is a stale CPU spec target
- Should be derived from hardware capability or removed
Contract Violation
Violates C-GDP-001 (gpu-decode-profiling-v1.yaml): brick data must reflect real GPU execution time.
Fix Plan
- Use stats.avg_us() for actual_us (per-call GPU time)
- Derive decoded_tokens from LmHead.count (exactly 1 per decoded token)
- Compute real score/grade/gap_factor using existing compute_brick_score()
- Use equal weights for dynamic brick count instead of hardcoded 7-element array
- Remove hardcoded PMAT scores and falsification summary; report 0 if not computed
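The first two items of the plan could look roughly like the sketch below. `BrickStats` and its fields are hypothetical stand-ins, since the real profiler types are not shown in this issue; only the names avg_us, LmHead, and count come from the text above. compute_brick_score() is deliberately not reproduced because its signature is not given here.

```rust
/// Hypothetical stand-in for the profiler's per-brick statistics.
struct BrickStats {
    name: String,
    total_ns: u64, // accumulated GPU time for this brick
    count: u64,    // number of times the brick was invoked
}

impl BrickStats {
    /// Per-call average in microseconds; 0.0 if the brick never ran.
    fn avg_us(&self) -> f64 {
        if self.count == 0 {
            return 0.0;
        }
        self.total_ns as f64 / self.count as f64 / 1_000.0
    }
}

/// Decoded-token count taken from the LmHead brick, which per this issue
/// runs exactly once per decoded token. Returns 0 if LmHead is absent.
fn decoded_tokens(stats: &[BrickStats]) -> u64 {
    stats
        .iter()
        .find(|s| s.name == "LmHead")
        .map_or(0, |s| s.count)
}

/// (brick name, actual_us) pairs ready to be passed to the scoring step.
fn actual_us_per_brick(stats: &[BrickStats]) -> Vec<(String, f64)> {
    stats.iter().map(|s| (s.name.clone(), s.avg_us())).collect()
}
```

Each actual_us value would then be passed, together with the brick's budget, to the existing compute_brick_score() to obtain score, grade, and gap_factor.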