13 models · 150 puzzles
g2c2
| Rank | Model | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| #1 | Grok 4.1 Fast | 58.7% | 33.3% | 53.3% | 93.3% | 73.3% | 40% | 2.3765 | 2,294,216 |
| #2 | Gemini 3.1 Pro Preview | 55.3% | 43.3% | 50% | 83.3% | 56.7% | 43.3% | 4.5200 | 433,511 |
| #3 | Grok 4.20 Beta | 53.3% | 20% | 43.3% | 90% | 83.3% | 30% | 7.8989 | 1,730,162 |
| #4 | Gemini 3 Flash Preview | 48.7% | 36.7% | 30% | 83.3% | 53.3% | 40% | 0.8700 | 624,635 |
| #5 | Gemini 3.1 Flash Image Preview | 46.0% | 30% | 33.3% | 90% | 40% | 36.7% | 2.4400 | 481,223 |
| #6 | GPT-5.4 | 32.7% | 6.7% | 23.3% | 76.7% | 43.3% | 13.3% | 7.6800 | 607,885 |
| #7 | Qwen 3.6 Plus | 28.0% | 20% | 10% | 86.7% | 16.7% | 6.7% | 0.0000 | 7,667,154 |
| #8 | Qwen 3.6 Plus Preview | 22.0% | 13.3% | 23.3% | 56.7% | 16.7% | 0% | 0.0000 | 4,428,821 |
| #9 | Claude Opus 4.6 | 16.7% | 13.3% | 6.7% | 46.7% | 10% | 6.7% | 13.0000 | 617,547 |
| #10 | GLM-5 | 12.0% | 20% | 6.7% | 33.3% | 0% | 0% | 1.8700 | 689,388 |
| #11 | Claude Sonnet 4.6 | 10.7% | 3.3% | 6.7% | 40% | 3.3% | 0% | 8.9400 | 704,037 |
| #12 | Claude Haiku 4.5 | 8.7% | 13.3% | 0% | 30% | 0% | 0% | 0.1490 | 359,946 |
| #13 | Gemini 2.5 Pro | 5.3% | 10% | 0% | 16.7% | 0% | 0% | 4.7000 | 563,618 |
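
The column headers did not survive, but the numbers suggest how the first percentage relates to the five that follow it. The sketch below is a consistency check under two stated assumptions, neither confirmed by the source: that the first percentage in each row is an overall solve rate, and that the 150 puzzles split evenly into five categories of 30, one per following percentage column. The names `ROWS`, `solved_count`, and `recomputed_overall` are made up for illustration, and only a handful of rows are copied in.

```python
# Assumption (not stated in the source): 150 puzzles = 5 categories x 30 puzzles.
PUZZLES_PER_CATEGORY = 30

# (model, first percentage in the row, the five percentages that follow it),
# copied verbatim from the table above.
ROWS = [
    ("Grok 4.1 Fast", 58.7, [33.3, 53.3, 93.3, 73.3, 40.0]),
    ("Gemini 3.1 Pro Preview", 55.3, [43.3, 50.0, 83.3, 56.7, 43.3]),
    ("GPT-5.4", 32.7, [6.7, 23.3, 76.7, 43.3, 13.3]),
    ("Claude Opus 4.6", 16.7, [13.3, 6.7, 46.7, 10.0, 6.7]),
    ("Gemini 2.5 Pro", 5.3, [10.0, 0.0, 16.7, 0.0, 0.0]),
]


def solved_count(pct, n=PUZZLES_PER_CATEGORY):
    """Recover the whole number of solved puzzles from a rounded percentage."""
    return round(pct / 100 * n)


def recomputed_overall(category_pcts):
    """Pool the per-category solve counts and express them as one percentage."""
    solved = sum(solved_count(p) for p in category_pcts)
    total = PUZZLES_PER_CATEGORY * len(category_pcts)
    return round(100 * solved / total, 1)


for model, reported, cats in ROWS:
    print(f"{model}: reported {reported}, recomputed {recomputed_overall(cats)}")
```

If the recomputed value matches the reported one for every row, the "five categories of 30" reading is at least consistent with the data. It does not say what the categories are, nor what the last two numeric columns measure.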