forked from ggml-org/llama.cpp
Experiment: SQuat query-orthogonal error projection #11
Hypothesis
Projecting the quantization error perpendicular to the query subspace reduces the effective error in the attention computation for head_dim=128 models.
Background
SQuat (arXiv:2503.24358) observes that the component of key quantization error orthogonal to the query subspace does not affect attention scores; only the component lying within the query subspace perturbs q·k. After the FWHT rotation, the rotated Q subspace may still be effectively low-rank, which would make projecting the error out of that subspace worthwhile.
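The core identity is elementary: q·(k + e) = q·k + q·e, and q·e depends only on the component of e along q. A minimal sketch (illustrative only, not llama.cpp code) for the rank-1 case, where the error is projected onto the orthogonal complement of a single query vector:

```python
# Sketch: only the component of the key quantization error e that lies
# along the query q perturbs the attention logit q.k; projecting e onto
# the orthogonal complement of q leaves the logit unchanged.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_out(e, q):
    """Remove the component of e along q: e_perp = e - (q.e / q.q) * q."""
    scale = dot(q, e) / dot(q, q)
    return [ei - scale * qi for ei, qi in zip(e, q)]

q = [1.0, 2.0, -1.0, 0.5]
k = [0.3, -0.7, 1.1, 2.0]
e = [0.05, -0.02, 0.04, 0.01]   # raw quantization error on the key

e_perp = project_out(e, q)

logit_exact = dot(q, k)
logit_raw   = dot(q, [ki + ei for ki, ei in zip(k, e)])
logit_proj  = dot(q, [ki + ei for ki, ei in zip(k, e_perp)])

print(abs(logit_raw - logit_exact) > 1e-9)   # raw error shifts the logit
print(abs(logit_proj - logit_exact) < 1e-9)  # projected error does not
```

The projection changes the stored error vector, not the key itself, so the dequantized key moves, but only in directions the query cannot see.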
What to test
- Implement query-orthogonal error projection in the dequant path
- Measure PPL on head_dim=128 models
- Check interaction with the pre-rotate-queries optimization (both operate on Q)
- Measure decode speed impact (one additional projection per token)
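For the multi-query case the same idea extends to a low-rank query subspace: build an orthonormal basis for a few representative query vectors, then project the error onto the orthogonal complement of their span. A hedged sketch (names are illustrative, not llama.cpp APIs; the choice of representative queries is exactly what the experiment would need to decide):

```python
# Sketch: project the dequantization error out of a low-rank query
# subspace. Gram-Schmidt builds an orthonormal basis for the span of the
# representative queries; any query in that span then sees zero error.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, eps=1e-12):
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n = math.sqrt(dot(w, w))
        if n > eps:                      # drop near-dependent vectors
            basis.append([wi / n for wi in w])
    return basis

def project_out_subspace(e, basis):
    out = list(e)
    for b in basis:
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out

queries = [[1.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 0.0, 1.0]]        # toy rank-2 query subspace
basis = gram_schmidt(queries)

e = [0.05, -0.03, 0.02, 0.04]
e_perp = project_out_subspace(e, basis)

# Any linear combination of the representative queries sees no error.
q_mix = [2.0 * a + 0.5 * b for a, b in zip(*queries)]
print(abs(dot(q_mix, e_perp)) < 1e-9)
```

The per-token cost is one projection per basis vector, which is the decode-speed question above: if the rotated Q subspace really is low-rank, a rank-r projection is r dot products plus r AXPYs per key.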
Expected outcome
Could close the head_dim=128 PPL gap, but it is more complex than the CAT diagonal approach; try CAT first.
Priority
Low: depends on CAT diagonal results and requires a more complex implementation.
Source
AutoRepl: TODO-001 (buun, fork_dc582a), arXiv:2503.24358