There is a paper by Google trending right now that claims transformer in-context learning cannot generalize between two function classes.
I have reproduced their experiment in a Colab and come to a very different conclusion...
Datadog Post-Training and advising at PriorLabs. Ex-Meta, Ex-DeepL, Ex-Amazon. ETH BSc, Cambridge MPhil, PhD from Freiburg. Opinions are my own. (he/him)

