🔍 Anthropic is diving deep into AI interpretability, building an "MRI for AI"! 🧠💡

Co-founder Dario Amodei points out a major issue: we're shipping ever more powerful AI systems without really understanding how they work. 🤔 That opacity makes it hard to judge whether these AIs could be conscious (and maybe deserve some rights?). There's a race between scaling up AI capability and companies figuring out how these systems actually operate, and Dario predicts AGI-level models could arrive by 2026-2027! 🚀

Anthropic is developing "mechanistic interpretability" techniques to peer inside AI models and uncover their inner workings. Think of it as an MRI scan for AI! 🩻 The goal is to spot issues like deceptive tendencies, power-seeking behavior, vulnerabilities, and cognitive strengths and weaknesses. Amodei is also calling on neurobiologists to join the effort: gathering data from artificial neural networks is far easier than from biological ones! 💪

So far, they've managed to:
1. Identify over 30 million features inside a mid-sized model
May 13, 2025