Microsoft's speech recognition system achieves new milestone

IANS San Francisco
Last Updated : Aug 21 2017 | 12:02 PM IST

Microsoft's conversational speech recognition system, designed to recognise the words in a conversation as accurately as humans do, has reached a 5.1 per cent error rate, its lowest so far.

This milestone means that, for the first time, a computer can recognise the words in a conversation as well as a person would.

"Our research team reached that 5.1 per cent error rate with our speech recognition system, a new industry milestone, substantially surpassing the accuracy we achieved last year," Microsoft said in a blog post late on Sunday.

Last year in October, the team from Microsoft Artificial Intelligence and Research reported a speech recognition system that makes the same or fewer errors than professional transcriptionists.

The researchers had then reported a word error rate (WER) of 5.9 per cent.
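For readers unfamiliar with the metric: WER is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not Microsoft's scoring pipeline, the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenisation with no text normalisation, which benchmark scoring tools usually apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative sketch: tokens are split on whitespace and compared exactly,
    with no case folding or punctuation handling (an assumption made here).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion: reference word missing
                curr[j - 1] + 1,                  # insertion: extra hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of four reference words.
    print(word_error_rate("we talked about sports", "we walked about sports"))
```

On this definition, a 5.1 per cent WER corresponds to roughly 51 word errors per 1,000 words of reference transcript, against about 59 at last year's 5.9 per cent.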

"Last year, Microsoft's speech and dialog research group announced a milestone in reaching human parity on the 'Switchboard' conversational speech recognition task, meaning we had created technology that recognised words in a conversation as well as professional human transcribers," said Xuedong Huang, Technical Fellow, Microsoft.

'Switchboard' is a corpus of recorded telephone conversations that the speech research community has used for more than 20 years to benchmark speech recognition systems.

The task involves transcribing conversations between strangers discussing topics such as sports and politics.

The team used Microsoft Cognitive Toolkit 2.1 (CNTK), which the company describes as the most scalable deep learning software available, to explore model architectures.

Additionally, Microsoft's investment in cloud compute infrastructure, specifically Azure GPUs, helped improve the effectiveness and speed with which the team trains its models and tests new ideas.

Reaching human parity in speech recognition accuracy has been a research goal for the last 25 years.

"Microsoft's willingness to invest in long-term research is now paying dividends for our customers in products and services such as Cortana, Presentation Translator, and Microsoft Cognitive Services," the post read.

"Moving from recognising to understanding speech is the next major frontier for speech technology," the post added.

--IANS


Disclaimer: No Business Standard Journalist was involved in creation of this content

First Published: Aug 21 2017 | 11:52 AM IST
