Introducing speech-to-text, text-to-speech, and more for 1,100+ languages (via) New from Meta AI: Massively Multilingual Speech. “MMS supports speech-to-text and text-to-speech for 1,107 languages and language identification for over 4,000 languages. [...] Some of these, such as the Tatuyo language, have only a few hundred speakers, and for most of these languages, no prior speech technology exists.”
It’s licensed CC-BY-NC 4.0 though, so it’s not available for commercial use.
“In a like-for-like comparison with OpenAI’s Whisper, we found that models trained on the Massively Multilingual Speech data achieve half the word error rate, but Massively Multilingual Speech covers 11 times more languages.”
The training data was mostly sourced from audio Bible translations.
Recent articles
- Two publishers and three authors fail to understand what "vibe coding" means - 1st May 2025
- Understanding the recent criticism of the Chatbot Arena - 30th April 2025
- Qwen 3 offers a case study in how to effectively release a model - 29th April 2025