
⚡️ GPT-5 Hits the Sudoku Scene

December 6, 2025




Sudoku-Bench just dropped their latest test results, and WOW, have times changed! Back in May '25, no LLM could crack a classic 9x9 puzzle. Fast forward to now: GPT-5 is the champ, solving 33% of puzzles, about twice the solve rate of its closest rival! 🏆
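Curious what a "solve rate" like that 33% boils down to? Here is a minimal sketch, not Sudoku-Bench's actual harness: the helper names `is_valid_solution` and `solve_rate` are made up for illustration, and the check covers only the classic row/column/box rules. It ignores the puzzle's given digits and any variant rules.

```python
def is_valid_solution(grid):
    """Return True if a 9x9 grid satisfies the classic Sudoku rules.

    Assumption: `grid` is a list of 9 lists of 9 ints. Givens and
    variant constraints are NOT checked in this sketch.
    """
    # Reject malformed grids before indexing into them.
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r + dr][c + dc] for dr in range(3) for dc in range(3)}
        for r in range(0, 9, 3)
        for c in range(0, 9, 3)
    ]
    # Every row, column, and 3x3 box must contain exactly the digits 1..9.
    return all(unit == digits for unit in rows + cols + boxes)


def solve_rate(answers):
    """Fraction of submitted grids that pass the validity check."""
    if not answers:
        return 0.0
    return sum(is_valid_solution(g) for g in answers) / len(answers)


# A known valid grid built from a shifting pattern, plus a broken copy.
valid = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
broken = [row[:] for row in valid]
broken[0][0] = broken[0][1]  # duplicate a digit in the first row

print(f"solve rate: {solve_rate([valid, broken, valid]):.0%}")
```

A real benchmark would also compare each answer against the puzzle's unique solution, since a grid can be internally consistent and still contradict the givens.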

But hold up! The big challenge? The other 67% of puzzles, the tougher ones, are still out of reach. Why? Modern models struggle with the core of Sudoku: grasping new rules, keeping the big picture in mind, building long logical chains, and spotting that "aha!" moment that seasoned players nail right away. 🤔

We tried some cool tweaks, like GRPO-tuning on Qwen2.5-7B and Thought Cloning from Cracking the Cryptic solve videos, but they only made a dent. Spatial reasoning and creative flair are still a tough nut to crack for these models.

So here’s the deal: noticeable progress for sure, but we’re still light-years from human-level logic and spatial thinking. 🚀

Want to dive deeper? Check it out: https://pub.sakana.ai/sudoku-gpt5/