Date: 2026-02-19

Indian AI startup Gnani.ai has launched two new speech models, Vachana STT and Vachana TTS, under its Inya VoiceOS stack at the India AI Impact Summit. Vachana STT is a speech-to-text model focused on Indic languages, while Vachana TTS is a text-to-speech system with support for voice synthesis and voice cloning. Together, the two models cover both sides of the speech pipeline and are positioned as production-scale systems rather than research demos.

Gnani.ai said both models have been built, trained and deployed within India, with training data and inference operations hosted in Indian data centres.

Gnani’s Vachana models: Details

Vachana STT is the first model to be released under Inya VoiceOS. Gnani.ai said the model has been trained on more than one million hours of real-world voice data spanning over 1,000 domains. The company said the system is designed to handle noisy, multi-channel audio and is already being used at scale, processing around 10 million calls per day with a reported p95 latency of about 200 milliseconds.
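For readers unfamiliar with the p95 figure quoted above: it is the latency that 95 per cent of requests stay at or below. The article does not say how Gnani.ai computes it, so the sketch below uses the common nearest-rank definition as an assumption. The function name and the sample values are purely illustrative and are not Gnani.ai measurements.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method.

    Assumption: nearest-rank is only one of several percentile definitions;
    the article does not state which one Gnani.ai uses.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Smallest rank such that at least pct per cent of samples are <= that value.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latencies in milliseconds, for illustration only.
latencies_ms = [120, 150, 180, 200, 900]
p50 = percentile_nearest_rank(latencies_ms, 50)
p95 = percentile_nearest_rank(latencies_ms, 95)
```

In this made-up sample the median is unaffected by the single slow request while the p95 is not, which is why tail percentiles are often quoted for real-time systems.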

On accuracy, Gnani.ai said the model achieves a word error rate of under 5 per cent for Hindi and Indian English, under 10 per cent for languages such as Bengali, Tamil, Telugu, Kannada, Malayalam, Gujarati, Marathi, Punjabi, Odia and Assamese, and under 20 per cent for lower-resource Indian languages. The model is also built to handle code-mixed speech, regional accents and compressed telephony audio without requiring domain-specific fine-tuning.

According to the company, Vachana STT supports both real-time streaming and batch transcription and is being deployed across sectors such as banking and financial services, telecom, insurance, automotive, retail, logistics and healthcare. The model is available via APIs for enterprise and government users, with on-premise deployment options for organisations with stricter data residency requirements.
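The word error rate (WER) figures quoted above follow a standard definition: the minimum number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not Gnani.ai's evaluation code. The function name and example sentences are hypothetical, and it assumes simple whitespace tokenisation with no text normalisation, which real evaluations usually add.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly,
    with no case folding or punctuation handling.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

On this definition, a WER of under 5 per cent means fewer than one word-level error for every 20 reference words.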

Alongside this, Gnani.ai has launched Vachana TTS, a text-to-speech model focused on Indian languages. The company said the model supports 12 languages: Hindi, Bengali, Tamil, Telugu, Kannada, Malayalam, Gujarati, Marathi, Punjabi, Odia, Assamese and Indian English. Gnani.ai said that, in internal and external evaluations, the model achieves a mean opinion score of 4.23 and a character error rate below 0.6 per cent.

A key feature of Vachana TTS is zero-shot voice cloning, which allows the system to replicate a speaker's voice using less than 10 seconds of reference audio. The company said this can be used to maintain a consistent voice identity across multiple languages, including in cross-lingual settings.