Overview
Implement GPU performance benchmarks for APR format per Section Y.2 of the spec.
Requirement
- Y7: APR decode speed must be ≥200 tok/s on GPU (RTX 4090 reference)
- Must match or exceed GGUF decode speed on same hardware
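A minimal sketch of how decode throughput could be measured for the requirement above. `measure_tok_per_s` and `decode_fn` are hypothetical names, not part of the APR codebase. The sketch assumes `decode_fn` blocks until all tokens are produced, which means that on a GPU it must synchronize the device before returning.

```python
import time
from typing import Callable


def measure_tok_per_s(
    decode_fn: Callable[[int], int],
    n_tokens: int,
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Return the best observed decode throughput in tokens per second.

    decode_fn is a hypothetical callable that decodes up to n_tokens tokens,
    blocks until the work is finished, and returns the number of tokens
    actually produced.
    """
    # Warm-up runs are excluded from timing (kernel compilation, cache fill).
    for _ in range(warmup):
        decode_fn(n_tokens)

    best = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        produced = decode_fn(n_tokens)
        elapsed = time.perf_counter() - start
        if elapsed > 0:
            best = max(best, produced / elapsed)
    return best
```

Running the same helper against both the APR and GGUF decode paths on the same GPU would give the two numbers the requirement compares.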
Falsification Condition
APR decodes at < 200 tok/s while GGUF reaches ≥ 200 tok/s on the same GPU
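The falsification condition can be written as a single predicate. This is a sketch only: `y7_falsified` and `THRESHOLD_TOK_S` are illustrative names, and both inputs are assumed to be measured on the same GPU.

```python
THRESHOLD_TOK_S = 200.0  # Y7 target from the requirement


def y7_falsified(
    apr_tok_s: float,
    gguf_tok_s: float,
    threshold: float = THRESHOLD_TOK_S,
) -> bool:
    """True when GGUF meets the threshold but APR does not."""
    return gguf_tok_s >= threshold and apr_tok_s < threshold
```

Note that the condition is not triggered when GGUF itself falls below 200 tok/s. In that case the hardware, not the format, is the limiting factor, and only the "match or exceed GGUF" requirement applies.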
Implementation Tasks
Blocked By
- Requires GPU hardware for development and testing
References
- Spec: docs/specifications/apr-whisper-and-cookbook-support-eoy-2025.md, Section Y.2
- Related: Y6 (CPU benchmarks) - ✅ Verified at 206.4 tok/s
Priority
P2 - Deferred (no GPU hardware available currently)