[WIP] SSM LRU Baseline — First State Space Model Submission#220

Draft
timothywangdev wants to merge 1 commit into openai:main from timothywangdev:ssm-lru-baseline
Conversation

@timothywangdev

Summary

  • First non-transformer submission to parameter golf — uses a Linear Recurrent Unit (LRU) state space model
  • SSM blocks are 36% smaller than attention blocks at equivalent model dimension, allowing 12-15 layers in the 16MB budget where transformers fit only 9-10
  • Complex diagonal recurrence with parallel scan (cumsum trick), gated projection, ReLU^2 MLP
  • MuonAdamW optimizer with SSM-aware parameter groups
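
The "cumsum trick" for the complex diagonal recurrence can be sketched in a few lines. This is not the PR's implementation (which is in PyTorch and would use mamba_ssm kernels on H100); it is a minimal NumPy illustration with toy dimensions and hypothetical function names:

```python
import numpy as np

def lru_scan(log_a, b):
    """Parallel form of the recurrence h_t = a * h_{t-1} + b_t for complex diagonal a.

    Cumsum trick: h_t = a^t * sum_{s<=t} a^{-s} b_s, with a kept in log space.
    Only stable for short sequences, since |a|^{-s} grows as s increases;
    production kernels use an associative scan instead.

    log_a: (d,) complex log of the diagonal weights, Re(log_a) < 0 so |a| < 1
    b:     (T, d) complex per-step inputs
    """
    T = b.shape[0]
    t = np.arange(1, T + 1)[:, None]   # (T, 1) step indices
    log_at = t * log_a                 # (T, d): elementwise log of a^t
    return np.exp(log_at) * np.cumsum(np.exp(-log_at) * b, axis=0)
```

A sequential loop `h = a * h + b_t` over the same inputs reproduces the output exactly, which is a convenient correctness check for any fused-kernel replacement.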

Status: WIP

Applying for compute credits to validate on H100. Current results on RTX 3090 (5 min budget):

  • val_bpb: 1.848 (bottlenecked by pure PyTorch scan speed, 2.8% MFU)
  • With mamba_ssm CUDA kernels on H100, expect 10-50x speedup → competitive results

Why SSMs Could Win

  • Parameter efficiency: SSM-specific params are <0.2% of total; projections dominate
  • No KV cache: Native sliding window eval without recomputation
  • Linear complexity: More tokens processed in fixed time budget
  • Compressibility: Research shows 50% of SSM weights can be pruned with zero accuracy loss (SparseSSM)

Research Backing

  • 6 deep-dive research documents (LinOSS, Mamba-3, SSM taxonomy, compression frontiers)
  • Autonomous experiment loop with brainstorming, arXiv paper reading, and self-reflection
  • Concrete 16MB configs identified from parameter analysis

Test plan

  • Validate on H100 with mamba_ssm CUDA kernels
  • Establish competitive val_bpb baseline
  • Optimize architecture (depth, width, d_state) for 16MB budget
  • Add int8+zlib quantization pipeline
  • Statistical significance testing (3+ seeds, p < 0.01)
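
The int8+zlib quantization step is still on the to-do list above; a minimal sketch of such a pipeline (symmetric per-tensor scaling, function names hypothetical, a real submission may prefer per-channel scales) could look like:

```python
import zlib
import numpy as np

def quantize_compress(w, level=9):
    """Symmetric per-tensor int8 quantization followed by zlib.

    Returns (compressed bytes, scale); the tensor shape must be stored
    separately to reconstruct.
    """
    scale = max(float(np.abs(w).max()) / 127.0, 1e-12)  # avoid div-by-zero
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return zlib.compress(q.tobytes(), level), scale

def decompress_dequantize(blob, scale, shape):
    """Inverse: zlib-decompress, reinterpret as int8, rescale to float32."""
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int8).reshape(shape)
    return q.astype(np.float32) * scale
```

Round-trip error is bounded by half a quantization step (scale / 2), and the zlib pass only helps to the extent the int8 weights have low entropy, which is where the SparseSSM-style pruning cited above would compound with compression.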

First non-transformer submission to parameter golf. Uses Linear Recurrent
Unit (LRU) with complex diagonal recurrence and parallel scan. SSM blocks
are 36% smaller than attention — can fit 12-15 layers in 16MB vs 9-10
for transformers. WIP pending H100 compute validation.
