Reached above 50% on ARC AGI.
Spent the morning testing a few ideas stuck in my mind.
GPT-4 Turbo is ~45% better than GPT-4o.
Built a few-shot dataset from where GPT-4 Turbo outperforms. Tested system message improvement, threaded n-shots, and a GPT-4o fine-tune.