⚡️ Introducing Agent S3 - the next-level computer agent getting closer to human-like capabilities! 🤖💻

Instead of making a single model more complex, the creators launch several agents in parallel and pick the best outcome. They call this method **Behavior Best-of-N (bBoN)**. 🏆

Here's the scoop:
- Each agent tackles the same task independently.
- Its actions are condensed into a behavior narrative - a snappy summary of what actually changed on-screen.
- A special judge compares these narratives and picks the winner! 🥇

Results? Check this out:
- GPT-5 with 10 parallel agents → 69.9% success rate! 💯
- For comparison, GPT-5 Mini hits 60.2%.
- Agent S3 scores roughly 10 percentage points above the previous SOTA! 🚀✨
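
For the curious, the bBoN loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' implementation: `Rollout`, `narrate`, `toy_judge`, and the hard-coded agents are all hypothetical names made up for this example, and the keyword-overlap judge is a stand-in for whatever model-based judge the real system uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rollout:
    agent_id: int
    changes: List[str]  # observed on-screen changes, one entry per step


def narrate(rollout: Rollout) -> str:
    """Condense a rollout into a behavior narrative (here: a joined summary)."""
    return "; ".join(rollout.changes)


def toy_judge(task: str, narratives: List[str]) -> int:
    """Hypothetical judge: pick the narrative sharing the most words with the task.

    Assumption: a real judge would be a model comparing narratives, not word overlap.
    """
    goal_words = set(task.lower().split())
    scores = [
        len(goal_words & set(n.lower().replace(";", " ").split()))
        for n in narratives
    ]
    return scores.index(max(scores))  # ties go to the earliest candidate


def behavior_best_of_n(
    task: str,
    agents: List[Callable[[str], Rollout]],
    judge: Callable[[str, List[str]], int],
) -> Rollout:
    """Run every agent on the task, narrate each attempt, return the judge's pick."""
    rollouts = [agent(task) for agent in agents]  # sequential here for simplicity
    narratives = [narrate(r) for r in rollouts]
    return rollouts[judge(task, narratives)]


# Three fake agents with fixed outcomes, standing in for real computer-use agents.
agents = [
    lambda task: Rollout(0, ["opened file manager", "deleted report.txt"]),
    lambda task: Rollout(1, ["opened file manager", "report.txt renamed to final.txt"]),
    lambda task: Rollout(2, ["opened terminal", "nothing changed"]),
]

task = "rename report.txt to final.txt"
best = behavior_best_of_n(task, agents, toy_judge)
print(best.agent_id, "-", narrate(best))
```

The key design point is that the judge never sees raw actions, only the narratives, so attempts are compared by what they actually changed rather than by how they got there.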
October 25, 2025