🖥 Big news from Nvidia! They’re pivoting away from one-size-fits-all GPUs. 🚀 Now, each chip is tailored for a specific stage of LLM inference. Let’s break it down:

- **Prefill**: the first step; needs serious compute but barely touches memory bandwidth. 💪
- **Decode**: the second step flips the script: heavy on memory bandwidth but light on compute. 🧠

Remember the R200? It tried to do it all, powerful compute blocks AND tons of memory, but that was a pricey mess:

- Memory sat idle during prefill 💤
- Compute blocks twiddled their thumbs during decode 🙄

🟢 Enter Nvidia’s new game plan: different GPUs for different tasks! Check this out:

- **Rubin CPX**: the prefill champ 🎯
  - 20 PFLOPS of compute ⚡️
  - 128 GB GDDR7 🚀
  - 2 TB/s memory bandwidth 📈
- **R200**: the decode pro 💻
  - 288 GB HBM4 🧠💪
  - 20.5 TB/s memory bandwidth 🌊

Get ready for some serious GPU magic! ✨ #Nvidia #TechTalk
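Why does prefill hammer compute while decode hammers memory? A back-of-envelope arithmetic-intensity sketch makes it concrete. All numbers here (a hypothetical 70B-parameter dense model, FP8 weights, the ~2 FLOPs/param/token rule of thumb) are illustrative assumptions, not Nvidia figures:

```python
# Back-of-envelope arithmetic intensity for the two LLM inference phases.
# Model size and precision are illustrative assumptions.

PARAMS = 70e9          # assumed dense model: 70B parameters
BYTES_PER_PARAM = 1    # assumed FP8 weights

def flops(tokens: int) -> float:
    # Rough rule of thumb: ~2 FLOPs per parameter per token in a forward pass.
    return 2 * PARAMS * tokens

def arithmetic_intensity(tokens: int) -> float:
    # Prefill streams the weights once while processing the whole prompt;
    # decode streams all the weights just to produce ONE new token.
    bytes_moved = PARAMS * BYTES_PER_PARAM
    return flops(tokens) / bytes_moved

prefill = arithmetic_intensity(4096)  # 4K-token prompt, one pass over weights
decode = arithmetic_intensity(1)      # one token per decoding step

print(f"prefill intensity: {prefill:.0f} FLOPs/byte")  # 8192
print(f"decode  intensity: {decode:.0f} FLOPs/byte")   # 2
```

Thousands of FLOPs per byte means prefill saturates the math units long before the memory system matters; a couple of FLOPs per byte means decode is pure bandwidth, which is exactly the split the two chips target.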
October 1, 2025