Non-record: Universal Transformer + Adaptive Density (val_bpb 1.4390)#1193

Open
dentity007 wants to merge 3 commits into openai:main from NathanMaine:research/universal-transformer

Conversation

@dentity007

A single shared-weight block is looped 12× with per-iteration parameters and trained with a 50% sparse→dense curriculum. The model has 4.56M parameters (2.87MB). This implements the 'Universal transformer' direction requested by OpenAI and confirms the findings of PR #363 on depth recurrence.
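For readers unfamiliar with depth recurrence, here is a minimal sketch of the idea: one block's weights are reused on every loop iteration, and each iteration only owns a small set of its own parameters. It is not the code in this PR. The dimension, the `tanh` update standing in for attention + MLP, and the per-iteration gain/bias vectors are all assumptions made for illustration; only the loop count of 12 comes from the description.

```python
import math
import random

def make_shared_block(dim, seed=0):
    """One weight matrix that every loop iteration reuses (the shared block)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]

def make_per_iteration_params(dim, n_iters):
    """Hypothetical per-iteration parameters: one gain and one bias vector per step."""
    return [{"gain": [1.0] * dim, "bias": [0.0] * dim} for _ in range(n_iters)]

def apply_block(weights, x):
    """Residual update x + tanh(W x); a stand-in for a real attention + MLP block."""
    out = []
    for row, xi in zip(weights, x):
        pre = sum(w * v for w, v in zip(row, x))
        out.append(xi + math.tanh(pre))
    return out

def universal_forward(weights, step_params, x):
    """Run the same block once per entry in step_params (depth recurrence)."""
    for p in step_params:
        # Per-iteration modulation, then the shared block.
        x = [g * v + b for g, v, b in zip(p["gain"], x, p["bias"])]
        x = apply_block(weights, x)
    return x

def count_params(weights, step_params):
    """Return (shared parameter count, total per-iteration parameter count)."""
    shared = sum(len(row) for row in weights)
    per_iter = sum(len(p["gain"]) + len(p["bias"]) for p in step_params)
    return shared, per_iter

DIM, N_ITERS = 8, 12  # DIM is illustrative; 12 matches the loop count above
W = make_shared_block(DIM)
steps = make_per_iteration_params(DIM, N_ITERS)
y = universal_forward(W, steps, [0.1] * DIM)
shared, per_iter = count_params(W, steps)
print(shared, per_iter)  # prints: 64 192
```

In this toy setup, a stack of 12 independent blocks would need 12 × 64 = 768 block weights, while the looped version needs 64 shared weights plus 192 per-iteration values. That is the parameter saving the shared-weight design is after, shown here at a deliberately tiny scale.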

🤖 Generated with Claude Code

@dentity007 dentity007 closed this Apr 1, 2026
@dentity007 dentity007 reopened this Apr 1, 2026