Date: 2026-02-17

Indian AI startup Gnani.ai has released a five-billion-parameter voice-to-voice foundational artificial intelligence model at the India AI Impact Summit 2026. Called Inya VoiceOS, the model has been built under the IndiaAI Mission and is being released as a research preview ahead of a larger 14-billion-parameter voice-to-voice model that Gnani.ai is developing. According to the company, Inya VoiceOS supports more than 15 Indian languages, produces 24 kHz audio output, and is designed to operate with sub-second end-to-end latency.

What is Inya VoiceOS?

Inya VoiceOS is designed to operate directly on speech rather than relying on the more common pipeline of converting speech to text and then back to speech. Gnani.ai said the model works in what it describes as acoustic and semantic space, allowing it to consume and generate speech tokens without intermediate transcription steps.

The company said the model jointly encodes phonetics, prosody, semantics and intent, and is built to retain paralinguistic cues such as tone, emotion, pacing and pauses. It also supports streaming and interruption-aware inference, and can handle overlapping speech and mid-utterance corrections without resetting a conversation.

On the technical side, Gnani.ai said Inya VoiceOS has 5 billion parameters and has been trained on more than 14 million hours of multilingual speech data, with an additional 1.2 million hours of task-specific fine-tuning data. The model has also been trained on more than 8 trillion text tokens for linguistic grounding and reasoning. The company said it supports more than 15 Indian languages, produces 24 kHz audio output, and is designed to operate with sub-second end-to-end latency, including in code-mixed speech scenarios.

Where it will be used

Gnani.ai said the model is aimed at use cases that require real-time spoken interaction rather than text-first interfaces. In government systems, the company said this could include conversational AI for helplines, grievance redressal and emergency response services, where systems need to handle multilingual and natural speech input. In enterprise settings, the company said the model is being positioned for voice-driven workflows in sectors such as banking and financial services, healthcare, insurance and logistics, where hands-free operation and conversational interfaces are increasingly being tested.