Text-to-Speech AI News & Updates

Mistral AI Launches Open-Source Voxtral TTS Model for Real-Time Speech Generation

Mistral AI released Voxtral TTS, an open-source text-to-speech model that supports nine languages and can run on edge devices such as smartphones and smartwatches. The model adapts to a new voice from a five-second sample, delivers real-time performance with a 90 ms time-to-first-audio, and preserves voice characteristics across languages. This positions Mistral to compete with ElevenLabs, Deepgram, and OpenAI in enterprise voice AI applications such as customer support and sales.

OpenAI Enhances Voice and Transcription AI Models with Advanced Control Features

OpenAI has released new AI models for transcription and voice generation that offer better accuracy and more control than previous versions. The new text-to-speech model lets developers steer voice characteristics using natural-language instructions, while the new transcription models reduce hallucinations but still show significant error rates for certain languages.