Indian Institute of Technology (IIT) Madras is collaborating with the Mobile Payment Forum of India (MPFI) to develop voice-based solutions, focusing on vernacular languages, for digital money transactions.
“At present if you are using the UPI app, it assumes that the user has some literacy. You have to fill in a lot of details and then navigate. But speaking comes naturally to us and we can use voice to enable the transaction,” said Gaurav Raina, faculty, Department of Electrical Engineering, IIT Madras, and chairman of MPFI.
Raina elaborates that the challenge lies not in technology or computing but in the Indian languages themselves. “We have an excellent payment infrastructure in India. But the number of vernacular languages and the dialects that we have just adds to the complexity,” he added.
IIT Madras researchers will focus on two aspects: first, making the technology backend robust and secure; second, at the front end, as more people use digital payments, it will create a useful digital data history and footprint. Using machine learning and artificial intelligence, one can then aim to provide customised financial solutions and other value-added services.
To begin with, IIT Madras will look for partnerships with corporates, startups and the government. “Of course, an important element for this to succeed will be standards. Standards that let the ecosystem of developers know what can be done, and what needs to be kept out,” said Raina.
He further adds, “One of the good things about the voice platform that we are envisaging is that it’s a transactional conversation rather than a conversation. So from a language point we will bring down the complexity too. Because a yes or no is almost the same in each language and then the other part is numbers that are also simple.”
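The simplification Raina describes can be illustrated with a minimal sketch. The vocabulary entries and function names below are hypothetical examples, not MPFI's actual system: the point is that a transactional flow only needs to map a tiny per-language vocabulary of confirmations and digits to intents, rather than understand open-ended conversation.

```python
# Illustrative sketch: a transactional voice flow maps a small vocabulary
# (confirmation words and digits) to intents. Entries are example words only.
CONFIRM = {
    "yes": True, "haan": True,      # English / Hindi (examples)
    "no": False, "nahi": False,
}

DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
    "shunya": 0, "ek": 1, "do": 2, "teen": 3, "char": 4,   # Hindi examples
}

def parse_amount(words):
    """Turn a spoken digit sequence like 'five zero zero' into an integer."""
    digits = [DIGITS[w] for w in words if w in DIGITS]
    if not digits:
        raise ValueError("no digits recognised")
    return int("".join(str(d) for d in digits))

def parse_confirmation(word):
    """Map a spoken confirmation word to True/False; None if unrecognised."""
    return CONFIRM.get(word.lower())

print(parse_amount("five zero zero".split()))  # 500
print(parse_confirmation("haan"))              # True
```

Because the grammar is this small, adding a new language mostly means adding a dozen vocabulary entries rather than training a full conversational model.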
That voice is the future was recently demonstrated by ShareChat, a regional-language social networking site. As of May, ShareChat’s Chatrooms had over 10.5 million users spending over 1.2 billion minutes on audio chat rooms monthly, Gaurav Mishra, SVP, ShareChat, told Business Standard.
The two main reasons live audio is engaging for ShareChat’s regional-language user base, he said, were that “conversation is audio-based not text-based, which automatically makes it more understandable and engaging” and that “audio also removes the friction of ‘getting ready for the camera’ that creators on other platforms are used to doing. This makes it easier for both men and women to participate in the conversation.”
Indian language conundrum
Cracking the Indian language conundrum is top of mind not only for academic institutes in India but also for the government. The Natural Language Translation Mission (NLTM) is a step in that direction. Last year, under the NLTM programme, the government launched Bahubhashak.
The goal of Bahubhashak, which is managed by the Principal Scientific Adviser and the Ministry of Electronics and Information Technology (MeitY), is to have Indian language technology systems and products deployed in the field with the help of startups. The coordinating institutes provide technical and research support for the deployment of these technologies through startups.
Bahubhashak is a speech-to-speech machine translation system. One of the explicit goals of this project is to engage with startups in creating data. Under the umbrella of this project, academic institutes IIT Bombay, C-DAC, IISc Bangalore, IIIT Hyderabad and IIT Madras are engaging with startups to create a language database.
“Bahubhashak started with the idea of making engineering course content and even higher education lecture content available in vernacular languages in real-time. Suppose I am delivering a lecture on AI at IIT Bombay; the same lecture in real-time will also be heard in Marathi by a student in Maharashtra. Our efforts are towards creating a system that can translate and deliver content in real-time,” said Dr Pushpak Bhattacharyya, Professor, Department of Computer Science and Engineering, IIT Bombay.
Bhattacharyya adds that a pilot project for speech-to-speech machine translation was started last year and is nearing completion. “We are attempting to combine three aspects of SSMT: Automatic Speech Recognition, Machine Translation and Text-to-Speech, through Bahubhashak and the National Language Technology Mission.”
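The three-stage composition Bhattacharyya describes can be sketched as a simple pipeline. The stage functions below are stubs standing in for real ASR, MT and TTS models; all names and behaviour here are hypothetical, not Bahubhashak's actual implementation.

```python
# Illustrative sketch of an SSMT pipeline: three stages chained together.
# Each stage is a stub where a real model (ASR / MT / TTS) would sit.
def automatic_speech_recognition(audio: bytes) -> str:
    # A real ASR model would decode audio into source-language text.
    return audio.decode("utf-8")  # stub: pretend the audio is already text

def machine_translation(text: str, target_lang: str) -> str:
    # A real MT model would translate; this stub just tags the language.
    return f"[{target_lang}] {text}"

def text_to_speech(text: str) -> bytes:
    # A real TTS model would synthesise audio; this stub re-encodes text.
    return text.encode("utf-8")

def speech_to_speech(audio: bytes, target_lang: str) -> bytes:
    """Compose the three SSMT stages: ASR -> MT -> TTS."""
    transcript = automatic_speech_recognition(audio)
    translated = machine_translation(transcript, target_lang)
    return text_to_speech(translated)

out = speech_to_speech(b"lecture on AI", "mr")  # 'mr' = Marathi
print(out)  # b'[mr] lecture on AI'
```

The design choice worth noting is that the stages are independent: each can be swapped for a better model per language pair without touching the others, which is what makes engaging many institutes and startups on separate components feasible.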
He is also of the view that voice-based interfaces are the future. “Keyboard-based interfaces are not very convenient and their usage will reduce as more and more voice-based applications emerge,” he added.