Very excited about this line of research on fast-slow learning:
1) potential to solve a lot of issues with current RL (e.g., entropy collapse, sparse rewards)
2) an intuitive way of incorporating rich feedback with RL
3) provides a way to transfer knowledge of text-only based
Can LLMs adapt continually without losing base skills?
Fast-Slow Training (FST) pairs "slow" weights with "fast" context.
FST vs. RL:
• 3x more sample-efficient
• Higher performance ceiling
• Less KL drift (better plasticity)
• Continual learning: succeeds where RL stalls