
PROTEUS v4 — non-record submission (val_bpb: 1.2037) #368

Open
MatoTeziTanka wants to merge 1 commit into openai:main from MatoTeziTanka:proteus-v4

Conversation

@MatoTeziTanka

Summary

Non-record submission. 10-layer transformer with mixed INT5/INT6 quantization.

  • val_bpb: 1.20368943 (post-quant sliding window eval)
  • Artifact: 12,499,612 bytes (78.1% of 16MB cap)
  • Run on Modal 8×H100

See PR #95 for our earlier submission with documented negative results on INT4, depth recurrence, and EMA overhead.
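The mixed-precision scheme amounts to symmetric fake quantization at two bit-widths (INT5 for MLP weights, INT6 for attention). A minimal NumPy sketch, illustrative only; the actual PROTEUS bit-packing and zstd-22 artifact format are not shown:

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    # Symmetric per-tensor fake quantization: round onto a signed
    # `bits`-bit grid, then dequantize. Sketch only; the real
    # submission's packing/serialization may differ.
    qmax = 2 ** (bits - 1) - 1            # 15 for INT5, 31 for INT6
    scale = float(np.abs(w).max()) / qmax
    if scale == 0.0:
        return w.copy()
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return (q * scale).astype(w.dtype)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
w_int5 = fake_quantize(w, bits=5)   # e.g. MLP weights
w_int6 = fake_quantize(w, bits=6)   # e.g. attention weights
```

The extra bit in the attention path halves the grid spacing, so INT6 reconstruction error is roughly half that of INT5 at the same scale.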

Techniques

| Technique | Source |
| --- | --- |
| Mixed INT5 MLP / INT6 attention | @nanlliu PR #39 concept, our implementation |
| SmearGate + BigramHash + OrthoInit | @unnir PR #162 |
| Muon WD 0.04 | @notapplica PR #60 |
| Sliding window eval stride=64 | @mattqlf PR #50 |
| EMA, 3% pruning, RoPE 50K, zstd-22 | Various / our additions |
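The 3% pruning row refers to magnitude pruning; a minimal sketch of unstructured, global-threshold pruning. The function name and the one-shot (no schedule) form are assumptions, not the submission's actual procedure:

```python
import numpy as np

def magnitude_prune(w: np.ndarray, frac: float = 0.03) -> np.ndarray:
    # Zero out the smallest `frac` of weights by absolute value.
    # Hypothetical sketch: one-shot, per-tensor; PROTEUS's actual
    # pruning schedule is not documented here.
    k = int(frac * w.size)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    pruned = w.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 128)).astype(np.float32)
p = magnitude_prune(w, 0.03)
```

Pruning before quantization also helps the zstd-22 pass, since exact zeros compress well.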

Built with PROTEUS by LightSpeedUp

10L, mixed INT5/INT6, SmearGate+BigramHash+OrthoInit, Muon WD,
EMA, 3% magnitude pruning, sliding window eval. Artifact 12.5MB.

Built with PROTEUS by LightSpeedUp — lightspeedup.com

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
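The sliding-window eval (stride=64, per PR #50) scores each token with near-full left context rather than resetting at block boundaries. A sketch of the window bookkeeping and the nats-to-bpb conversion; the window size of 512 is a hypothetical choice, only the stride comes from the PR:

```python
import math

def sliding_windows(n_tokens: int, window: int = 512, stride: int = 64):
    # Enumerate (ctx_start, end, score_from) triples: each window feeds
    # tokens [ctx_start, end) to the model but accumulates loss only on
    # [score_from, end), so after the first window every scored token
    # sees up to `window - stride` tokens of prior context.
    spans = []
    prev_end = 0
    for ctx_start in range(0, n_tokens, stride):
        end = min(ctx_start + window, n_tokens)
        spans.append((ctx_start, end, prev_end))
        prev_end = end
        if end == n_tokens:
            break
    return spans

def bpb(total_nll_nats: float, n_bytes: int) -> float:
    # Convert summed NLL (nats) over the eval set to bits per byte.
    return total_nll_nats / math.log(2) / n_bytes

spans = sliding_windows(1000)
```

Each scored span is disjoint and the spans together cover every token exactly once, so the summed NLL is well defined.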
