Every day, hundreds of thousands of court orders, petitions and other dictations are converted to text documents using Nuance Communications’ Dragon NaturallySpeaking software across the high courts of Mumbai, Kerala, Andhra Pradesh (AP) and Delhi, and the judicial offices of Karnataka. Dipen S, a practising attorney in the AP High Court, says, “I regularly give dictations running into hundreds of pages. But availability of stenographers was always an issue. With the Dragon software, I can now give dictations even at midnight from my home. The accuracy is almost 97 per cent and the typing speed is very fast. I have drafted hundreds of petitions and the software actually learns from experience.”
Sunny Rao of Nuance Communications, the speech recognition company currently powering Apple’s Siri in the iPhone 4S, claims this is a first-of-its-kind initiative in India, with the judiciary adopting voice solutions to read out judgments in courts. Rao, who is the managing director (India and South East Asia) for the firm, adds that Nuance’s voice solutions are also being used by the Income Tax Department, RBI and NABARD.
Dragon NaturallySpeaking (legal version) speech recognition software lets users dictate documents, send and manage e-mail and give commands to their computers. The firm claims dictation with it is three times faster than typing. The software offers a pre-configured vocabulary of over 30,000 legal terms, automatically formats legal citations and supports third-party correction to improve dictation accuracy. Users can also dictate into a hand-held device, and Dragon Legal will transcribe the audio and send it to a PC.
Nuance is also bullish on Dragon Medical, its speech recognition software designed specifically for the healthcare sector. It is being used by several hospitals, including SevenHills Hospital in Mumbai, Apollo Hospital in Bangalore, Medanta in Gurgaon, Narayana Hrudayalaya in Bangalore and Lilavati Hospital and Research Centre in Mumbai, enabling faster delivery of medical reports and discharge summaries to patients. With speech recognition incorporated into mobile apps, doctors and nurses can now dictate their notes into patients’ electronic health records (EHRs) on an iPhone, iPad or any Android device. “If you go to these hospitals, it is not unusual to see a physician speak to an iPhone or iPad, documenting an electronic health record,” says Rao.
It’s not just the big players that want to make use of speech solutions; the government, too, is actively integrating voice-activated platforms into several of its e-governance projects across the country. In one such project, Nuance, along with Evolgence IT Systems, a global technology and development consulting firm, has deployed voice biometrics and speech recognition solutions across a range of rural healthcare programmes. The speech recognition deployments are powered by the ninth-generation automatic speech recognition (ASR) engine from Nuance Communications, which supports 14 Indian languages: Indian English, Hindi, Bengali, Gujarati, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu, Assamese, Bhojpuri and Urdu.
Technologies such as voice verification are also being used to authenticate workers and record attendance in social schemes such as the National Rural Employment Guarantee Act (NREGA). This has helped eliminate fraudulent cases.
According to Rao, it is only a matter of time before people demand to speak directly to machines instead of jabbing at buttons. And Nuance isn't limiting itself to computers, phones and tablets. At the Consumer Electronics Show in January this year, the company rolled out Dragon TV. The product brings voice recognition to television, allowing firms to build voice and touch apps geared specifically towards video.
Nuance is also betting on automobiles coming embedded with its voice solutions. The company has been integral to the development of Ford Motor’s SYNC, a voice-activated in-car connectivity system, which will make its European debut later this year.
“We along with Ford are now working towards embedded systems for interpretation of the user’s natural speech, allowing people to intuitively operate vehicle features by speaking as they would to a friend, for a more responsive experience with more voice activation possibilities. This includes researching voice recognition that can understand the user’s intent based on keywords or phrases, even when the exact command is not given,” explains Rao.