New OpenAI models are hallucinating way more than the old-school ones! 🤯 Internal tests reveal that the new reasoning models, o3 and o4-mini, are cooking up wild responses more often than their predecessors like o1, o1-mini, and o3-mini. 📊 In fact, o3 hallucinated on 33% of test questions, while o4-mini hit a whopping 48%! For a little perspective, o1 and o3-mini came in at only 16% and 14.8%, respectively. The worrying part: as the reasoning models scale up, hallucinations seem to ramp up too, and OpenAI isn't sure why. The new models know more, but they also make more claims overall, so some are spot on and others totally made up! 😳 If this trend keeps up, we could have a serious problem as the industry leans into reasoning models. Let's hope they find a way to keep it real! 🙏💡 Check out the full scoop here: https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-models-hallucinate-more/
May 7, 2025