Crynet.io

🎙 NVIDIA just dropped the Canary-1B v2 — an open model for speech recognition and translation that works with 25 European languages

Here’s the lowdown:

- 📝 Spot-on ASR (Automatic Speech Recognition) in all 25 languages, plus AST (Automatic Speech Translation) between English and the 24 others, in both directions.

- Auto punctuation, capitalization, and precise word timestamps.

- Supports Russian, French, German, Spanish, and more!

Why it’s a game-changer:

- Inference up to 10× faster than models three times its size.

- Already rocking state-of-the-art accuracy among open models on Hugging Face.

- CC-BY-4.0 license means you can use it in your projects, commercial ones included, as long as you credit NVIDIA!

What’s under the hood:

- Architecture: FastConformer encoder + Transformer decoder (~978M parameters).

- Input audio: .wav or .flac files, mono, sampled at 16 kHz.

- Easy integration via NVIDIA NeMo or directly from Hugging Face.
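Here's a rough sketch of what the input check and the NeMo route could look like. `check_wav` and `transcribe` are made-up helper names for this post. The NeMo calls (`ASRModel.from_pretrained`, `transcribe` with `source_lang` / `target_lang`) follow the usage shown on the model card as best recalled, so treat them as an assumption and double-check against the card. The check uses only the standard library and covers `.wav` files, not `.flac`.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return a list of problems found in a WAV file (empty list means OK)."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != expected_channels:
            problems.append(
                f"expected {expected_channels} channel(s), got {wf.getnchannels()}"
            )
        if wf.getframerate() != expected_rate:
            problems.append(
                f"expected {expected_rate} Hz, got {wf.getframerate()} Hz"
            )
    return problems


def transcribe(path, source_lang="en", target_lang="en"):
    """Assumed NeMo usage: needs NeMo's ASR collection installed and downloads the weights.

    Same source and target language means plain transcription;
    a different target asks for translation.
    """
    # Imported lazily so check_wav works even without NeMo installed.
    from nemo.collections.asr.models import ASRModel

    model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")
    output = model.transcribe([path], source_lang=source_lang, target_lang=target_lang)
    return output[0].text
```

Running `check_wav("clip.wav")` first is a cheap way to catch stereo or 44.1 kHz recordings before they reach the model; the file name is just a placeholder.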

Where you can use it:

🟢 Voice assistants

🟢 Video subtitles & translations

🟢 Voice-input chatbots

🟢 Real-time speech analysis
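For the subtitles use case, word timestamps are the handy part. Below is a minimal sketch that turns them into SRT cues. The input shape (a list of dicts with `word`, `start`, `end` in seconds) is an assumption made for this example, and `srt_time` / `words_to_srt` are hypothetical helper names, so adapt them to whatever your transcription step actually returns.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group word-level timestamps into numbered SRT cues of up to max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        # A cue spans from the first word's start to the last word's end.
        start, end = chunk[0]["start"], chunk[-1]["end"]
        cues.append(f"{len(cues) + 1}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(cues)


words = [
    {"word": "hello", "start": 0.0, "end": 0.4},
    {"word": "world", "start": 0.5, "end": 1.25},
]
print(words_to_srt(words))
```

Splitting every seven words is a deliberately naive choice; real subtitles usually break on punctuation or pauses instead.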

At ~978M parameters, it's lighter, faster, and cheaper to run than the much bigger models it competes with!
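To put "lighter" into numbers, here's some back-of-the-envelope arithmetic for the weights alone. It's only parameters times bytes per parameter; activations, audio buffers, and framework overhead come on top, and the precision you actually get depends on how you load the model. `weights_gb` is a hypothetical helper name.

```python
PARAMS = 978_000_000  # approximate parameter count quoted in the post

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2}


def weights_gb(params, bytes_per_param):
    """Size of the raw weights in gigabytes (10**9 bytes)."""
    return params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weights_gb(PARAMS, nbytes):.2f} GB")
# fp32: ~3.91 GB
# fp16/bf16: ~1.96 GB
```

So at half precision the weights come to roughly 2 GB, and a model three times the size would need about three times that.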

🟠 Check it out here: https://huggingface.co/nvidia/canary-1b-v2