Crynet.io

🎤 Meet VibeVoice: Microsoft’s next-gen text-to-speech (TTS) model for lengthy chats with multiple voices

✨ 1.5B parameters

📜 MIT licensed

⏳ Generate up to 90 minutes of audio

💖 Highly expressive, emotionally rich speech

So, what’s the scoop? VibeVoice is designed to generate expressive, long-form conversational audio with multiple speakers, such as podcasts. It tackles the common headaches of traditional TTS systems, like scalability and speaker consistency.

With this model, you can synthesize conversations featuring up to 4 distinct speakers, way beyond the usual 1 or 2!

Curious? Check out all the deets and examples here: https://microsoft.github.io/VibeVoice/ 🚀