@arcprize 2024 with more than 16k entrants just ended after 5 months, and we rank #1 (@bayesilicon @MohamedOsmanML)! We just scored 58% with a submission that finished after the deadline! We're just getting started. We hope to have an announcement about @tufalabs soon.
ARC-AGI update. We posted a 41% on the official leaderboard! Thanks to my great team! @bayesilicon @epsil0ndelta Here's to hoping and working towards an even higher score.
This is very interesting work. But if we are going to play the SoTA game, our solution that first broke SoTA on the private test set (34%) for ARC scored 60% on the public test set back in February, 2024. You wouldn't want us to run it on the training dataset like Ryan did.
ARC-AGI's been hyped over the last week as a benchmark that LLMs can't solve. This claim triggered my dear coworker Ryan Greenblatt so he spent the last week trying to solve it with LLMs. Ryan gets 71% accuracy on a set of examples where humans get 85%; this is SOTA.
49.5 on ARC-AGI (gained .5 points). Is that 50% level magical, or will it fall soon? Our fantastic @bayesilicon is cooking something hot, and our superb @MohamedOsmanML has some very cool stuff in the works. What a great team! @arcprize
Always love to hear @fchollet. He has created many years of hard challenges for people and machines. More importantly, it is advancing the science of AI. ARC-AGI is one of his brilliant creations and a steppingstone towards AGI.
I finally got to meet @fchollet in person recently to interview him about @arcprize, intelligence vs memorization, human cognitive development, learning abstractions, limits of pattern recognition and consciousness development. These are the best bits. Full show released tomorrow
I found some time ago that it was good to think of the interview being reversed. In other words, what does the choice of interview process say about the environment of the organization? How did I feel as a result of the interaction? Would I want to work with people who treat
Excited to be full time on ARC-AGI-2. It's going to be a slow cook for the new ideas to be realized. Maybe we'll see a new face or two on the team! @arcprize @MohamedOsmanML @bayesilicon @DriesSmit1