
Non-record: H-Net Dynamic Chunking — Learned Tokenization Layer (val_bpb 1.3587)#1191

Open

dentity007 wants to merge 3 commits into openai:main from NathanMaine:research/hnet-chunking

Conversation

@dentity007

Summary

Adds a learned dynamic chunking layer (inspired by H-Net) to the standard transformer baseline. The chunker predicts a soft boundary probability between each pair of adjacent token embeddings and, where that probability is low, blends each embedding with its neighbor — a differentiable approximation of H-Net's hard chunking.
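The soft-blending step described above can be sketched as follows. This is a minimal numpy illustration of the mechanism, not the PR's actual module: the function name, the use of a left-neighbor average, and the sigmoid parameterization are all assumptions.

```python
import numpy as np

def soft_chunk(x, boundary_logits):
    """Differentiable approximation of hard chunking (hypothetical sketch).

    x: (T, D) token embeddings.
    boundary_logits: (T,) scores for a boundary *before* each position.

    Where the boundary probability is high, the embedding passes through
    unchanged; where it is low, the embedding is blended with its left
    neighbor, softly merging the two positions into one chunk.
    """
    b = 1.0 / (1.0 + np.exp(-boundary_logits))   # sigmoid -> boundary prob in [0, 1]
    prev = np.roll(x, 1, axis=0)
    prev[0] = x[0]                               # position 0 has no left neighbor
    return b[:, None] * x + (1.0 - b[:, None]) * 0.5 * (x + prev)
```

With strongly positive logits every position is its own chunk and the layer reduces to the identity, which is the behavior the control run relies on.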

val_bpb: 1.3587 | 1×RTX 5090, 600s | TTT enabled

  • Nearly matches baseline (1.3577) with only ~263K extra parameters
  • Demonstrates learned tokenization is viable for parameter-constrained models
  • Implements one of OpenAI's explicitly requested research directions (H-net tokenization)
  • Setting HNET_ENABLED=0 produces identical behavior to base script

Changes

  • DynamicChunker module: boundary prediction + soft blending
  • Inserted after embedding norm, before transformer blocks
  • New env vars: HNET_ENABLED, HNET_LAYERS
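The env vars above might be wired up along these lines. The defaults and parsing here are assumptions, not the PR's actual code; the key property (stated in the summary) is that `HNET_ENABLED=0` must reproduce the baseline exactly, so the chunker should only be constructed when the flag is on.

```python
import os

def hnet_config(env=os.environ):
    """Hypothetical parsing of the PR's env vars (names from the PR;
    defaults assumed). Returns (enabled, num_chunker_layers)."""
    enabled = env.get("HNET_ENABLED", "1") != "0"
    layers = int(env.get("HNET_LAYERS", "1"))
    return enabled, layers
```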

Test plan

  • Verified on 1×RTX 5090 (600s wallclock)
  • Control run with HNET_ENABLED=0 matches baseline
  • Model fits in 16MB after int8+zlib compression
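The 16MB size check could be reproduced with something like the sketch below. The per-tensor symmetric int8 quantization scheme is an assumption — the PR only states "int8+zlib", not the exact scheme.

```python
import zlib
import numpy as np

def compressed_size_bytes(tensors):
    """Rough int8+zlib size estimate (sketch; per-tensor symmetric
    int8 quantization assumed, which may differ from the PR's scheme)."""
    total = 0
    for w in tensors:
        scale = max(float(np.abs(w).max()) / 127.0, 1e-12)
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        total += len(zlib.compress(q.tobytes(), level=9))
    return total
```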

🤖 Generated with Claude Code

@dentity007 dentity007 closed this Apr 1, 2026
@dentity007 dentity007 reopened this Apr 1, 2026