| Rank (score) | Model | Type | 💻 Coding | 🧠 Reasoning | 🤖 Agents & Tools | 💬 Conversation | 🔢 Math | 👁️ Multimodal | 🧠 Knowledge | Price | Speed |
|---|---|---|---|---|---|---|---|---|---|---|---|
| #1 81.6% | GPT 5.4 High | Proprietary | 84.7% * | 82.9% * | 83.8% * | 68.6% * | 88.3% * | 72.8% * | 81.4% * | $8.75 | 75.3 t/s |
| #— — | Claude Opus 4.6 Thinking | Proprietary | 81.5% * | 80.0% * | — | — | — | — | 88.8% * | $15.00 | 67.8 t/s |
| #2 80.1% | Gemini 3.1 Pro Preview | Proprietary | 80.6% * | 81.9% * | — | — | — | 68.2% * | 89.5% * | $7.00 | 130 t/s |
| #5 74.5% | Claude Opus 4.5 Thinking | Proprietary | 80.3% | 67.7% | 78.0% | 72.3% | 82.3% | 61.0% | 77.6% | $15.00 | 35 t/s |
| #10 73.1% | Claude Sonnet 4.6 Thinking | Proprietary | 80.2% * | 71.0% * | — | — | 60.2% | 67.7% * | 86.1% * | $9.00 | 45 t/s |
| #4 76.9% | GPT 5.2 Pro | Proprietary | 79.7% * | 75.5% * | 78.9% | 64.8% * | 87.7% * | 71.3% * | 77.6% * | $94.50 | 28 t/s |
| #8 74.5% | GPT 5.2 | Proprietary | 79.4% * | 73.2% * | 63.8% * | 75.0% * | 84.9% * | 70.2% * | 81.1% * | $7.88 | 187 t/s |
| #14 71.0% | Claude Opus 4.5 | Proprietary | 78.8% | 64.7% | 70.2% | 69.2% | 79.7% | 56.9% | 73.3% * | $15.00 | 65 t/s |
| #20 67.5% | Claude Sonnet 4.6 | Proprietary | 78.8% * | 61.2% * | — * | — | 54.2% * | 62.2% * | 83.7% * | $9.00 | 77 t/s |
| #6 74.5% | Gemini 3 Flash Thinking | Proprietary | 78.6% | 68.1% | 78.5% * | 72.2% | 78.1% * | 67.5% * | 84.1% * | $1.75 | 180 t/s |
| #18 67.9% | Claude Sonnet 4.5 Thinking | Proprietary | 78.4% | 58.3% | 65.8% | 68.5% | 77.1% | 54.4% * | 71.2% | $9.00 | 45 t/s |
| #15 70.3% | Gemini 3 Flash | Proprietary | 78.3% * | 55.8% * | 78.1% * | 71.3% | 69.3% * | 67.3% * | 84.0% | $1.75 | 218 t/s |
| #3 78.8% | Kimi K2.5 Thinking | Open Source | 78.2% | 76.5% * | — | — | 85.2% * | — | 80.9% * | $1.55 | 45 t/s |
| #— — | Claude Opus 4.6 | Proprietary | 78.0% * | 70.7% * | — * | — | — * | — | 88.2% * | $15.00 | 67.8 t/s |
| #7 74.5% | GPT 5.2 High | Proprietary | 77.6% | 71.7% | 77.3% | 62.4% | 87.0% | 68.9% | 75.5% | $7.88 | 45 t/s |
| #9 74.2% | Gemini 3 Pro | Proprietary | 76.9% | 68.3% | 71.5% | 75.2% | 84.4% | 70.7% | 85.6% | $7.00 | 128 t/s |
| #22 65.9% | Claude Sonnet 4.5 | Proprietary | 76.8% | 54.3% * | 64.4% | 64.1% | 73.7% | 62.8% | 70.9% | $9.00 | 77 t/s |
| #17 68.1% | GPT 5.1 High | Proprietary | 75.2% | 61.7% | 58.8% * | 68.2% | 83.2% | 63.8% | 75.9% | $67.50 | 40 t/s |
| #25 62.7% | Claude Opus 4.1 | Proprietary | 74.0% | 49.3% * | 64.4% | 63.9% | 63.6% | 60.3% * | 66.2% * | $45.00 | 52 t/s |
| #11 72.4% | Gemini 3.1 Pro Preview Base | Proprietary | 73.2% * | 73.7% * | — | — | — * | 61.4% * | 80.9% * | $7.00 | 130 t/s |
| #19 67.7% | Kimi K2 Thinking | Open Source | 72.9% * | 51.7% * | 78.9% * | 61.8% * | 79.6% * | — | 73.6% * | $1.55 | 45 t/s |
| #12 72.1% | Kimi K2.5 Instant | Open Source | 72.4% * | 69.7% * | — * | — * | 77.2% * | — * | 73.1% * | $1.55 | 85 t/s |
| #23 62.9% | OpenAI o3 | Proprietary | 71.4% * | 52.7% * | 58.2% * | 58.3% * | 79.4% * | 59.3% | 76.5% * | $25.00 | 35 t/s |
| #16 68.6% | Grok 4.1 Thinking | Proprietary | 71.0% * | 63.0% * | 58.5% * | 67.8% * | 80.3% * | 86.5% * | 78.3% * | $9.00 | 45 t/s |
| #— — | MiniMax M2.1 | Open Source | 70.5% * | — | — | — | — | — | — | $0.75 | 148 t/s |
| #13 71.2% | o4-mini | Proprietary | 70.2% * | 65.3% * | — | — | 84.0% * | 83.0% * | 59.6% * | $10.00 | 100 t/s |
| #21 67.5% | Grok 4.1 | Proprietary | 70.0% * | 58.8% * | 61.1% * | 66.2% * | 79.5% * | 86.2% | 77.7% * | $9.00 | 95 t/s |
| #30 60.5% | Qwen3 Max Preview | Proprietary | 68.2% * | 37.0% | 67.4% * | 62.2% * | 76.4% * | 67.0% * | 75.6% * | $3.60 | 85 t/s |
| #27 61.8% | GPT 5.1 | Proprietary | 68.0% * | 49.2% * | 54.1% * | 73.9% * | 73.1% * | 59.3% * | 79.0% * | $3.75 | 120 t/s |
| #24 62.7% | DeepSeek V3.2 Thinking | Open Source | 67.3% | 54.0% | 55.6% | 60.4% | 73.4% | 81.2% | 70.2% | $0.69 | 60 t/s |
| #26 62.3% | Gemini 2.5 Pro | Proprietary | 67.2% | 55.2% | 53.1% | 64.9% | 77.0% | 61.4% | 78.8% | $3.13 | 165 t/s |
| #34 58.8% | MiniMax M2 | Open Source | 65.8% * | 60.2% * | — | 42.2% * | — | — | 55.1% * | $0.75 | 100 t/s |
| #32 59.9% | Kimi K2 | Open Source | 65.8% * | 46.7% * | 65.7% * | 56.3% * | 76.1% * | — | 46.5% * | $1.55 | 85 t/s |
| #33 59.6% | DeepSeek V3.2 | Open Source | 65.5% | 48.8% | 55.8% | 54.8% | 66.0% | 81.7% | 68.8% | $0.69 | 120 t/s |
| #31 60.3% | Qwen3 235B | Open Source | 65.2% * | 55.6% * | 55.3% * | 59.4% * | 73.1% * | 50.3% * | 73.1% * | Free | 75 t/s |
| #28 61.6% | DeepSeek R1 | Open Source | 62.8% | 53.0% | 55.1% * | 60.6% | 78.5% | 79.3% | 67.6% | $1.37 | 85 t/s |
| #29 60.7% | OpenAI o3-mini | Proprietary | 62.0% * | 54.2% | 58.9% * | — | 79.4% | — | 52.5% * | $2.75 | 115 t/s |
| #35 56.5% | Longcat Flash Chat | Open Source | 60.1% * | 36.5% | 67.1% * | 57.6% * | 80.8% * | — | 42.6% * | $0.45 | 100 t/s |
| #39 48.7% | Mistral Large 3 | Open Source | 56.4% * | 24.9% | — | 57.0% * | 74.7% * | — | 62.6% * | $1.00 | 90 t/s |
| #38 52.1% | Qwen3 32B | Open Source | 53.9% * | 46.8% * | 48.0% * | 44.9% * | 68.4% * | 62.4% | 55.2% * | Free | 145 t/s |
| #37 53.5% | Gemini 2.5 Flash | Proprietary | 53.8% * | 44.6% * | 51.1% * | 59.8% * | 71.0% * | 49.2% * | 65.2% * | $0.38 | 372 t/s |
| #— — | MiniMax M2.5 | Open Source | 53.4% | — | — | — | — | — | — | $0.75 | 39.3 t/s |
| #40 46.3% | Llama 4 Maverick | Open Source | 53.1% * | 41.0% * | 49.7% * | 41.7% * | 46.4% * | 36.5% * | 54.9% * | Free | 155 t/s |
| #36 53.7% | GPT-4.5 | Proprietary | 48.6% * | 44.8% * | 51.7% * | 67.2% * | 68.5% * | 55.4% * | 75.0% * | $7.50 | 85 t/s |
| #42 38.8% | Llama 4 Scout | Open Source | 42.5% * | 34.6% * | 42.0% * | 38.6% * | 38.5% * | 28.4% * | 50.3% * | Free | 2.6k t/s |
| #41 42.0% | GPT-4o | Proprietary | 40.7% * | 34.1% * | 48.5% * | 45.6% * | 44.9% * | 40.9% * | 56.9% * | $6.25 | 110 t/s |
| #— — | Grok 4.20 | Proprietary | — | — * | — | — | — * | — | — * | $9.00 | 100 t/s |
| #— — | Grok 4.20 Thinking | Proprietary | — | — | — | — | — | — | — | $9.00 | 100 t/s |