AI system improves performance by surfing the internet

IANS New York
Last Updated : Nov 13 2016 | 2:42 PM IST

Researchers from the US have developed an artificial intelligence (AI) system that surfs the internet, extracts information from the available plain text and organises it for quantitative analysis in very little time.

Recently at the Association for Computational Linguistics' Conference on Empirical Methods in Natural Language Processing, researchers from the Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory won a best-paper award for a new approach to information extraction that turns conventional machine learning on its head.

Most machine-learning systems work by combing through training examples and looking for patterns that correspond to classifications provided by human annotators.

In their new paper, the MIT researchers train their system on scanty data -- because in the scenario they're investigating, that's usually all that's available. But then they find the limited information an easy problem to solve.

"In information extraction, traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract correctly from this article," said Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science.

"That's very different from what you or I would do. When you are reading an article that you cannot understand, you are going to go on the web and find one that you can understand," Barzilay, who is also a senior author of the paper, added.

A machine-learning system assigns each of its classifications a confidence score -- which is a measure of the statistical likelihood that the classification is correct -- given the patterns discerned in the training data.

With the researchers' new system, if the confidence score is too low, the system automatically does a web search to pull up texts likely to contain the data it is trying to extract.

It then attempts to extract the relevant data from one of the new texts and reconciles the results with those of its initial extraction.

If the confidence score remains too low, it moves on to the next text pulled up by the search string, and so on.

Eventually, the system learns how to generate search queries, gauge the likelihood that a new text is relevant to its extraction task, and determine the best strategy for fusing the results of multiple attempts at extraction.
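
The loop described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the function names, the 0.8 threshold, the cap of five texts and the "keep the more confident result" rule are all assumptions made for illustration. In the system described, the query generation, relevance judgement and result fusion are learned rather than fixed by hand.

```python
from typing import Callable, Iterable, Tuple

# (extracted value, confidence score between 0 and 1)
Extraction = Tuple[str, float]


def reconcile(current: Extraction, candidate: Extraction) -> Extraction:
    """Hypothetical fusion rule: keep whichever result is more confident."""
    return candidate if candidate[1] > current[1] else current


def extract_with_fallback(
    article: str,
    extract: Callable[[str], Extraction],   # assumed extractor
    search: Callable[[str], Iterable[str]], # assumed web-search wrapper
    make_query: Callable[[str], str],       # assumed query builder
    threshold: float = 0.8,                 # assumed confidence cut-off
    max_docs: int = 5,                      # assumed limit on extra texts
) -> Extraction:
    best = extract(article)
    if best[1] >= threshold:
        return best  # confident enough, no web search needed

    # Confidence too low: pull up other texts and try them one by one.
    for i, text in enumerate(search(make_query(article))):
        if i >= max_docs:
            break
        best = reconcile(best, extract(text))
        if best[1] >= threshold:
            break
    return best


# Toy stand-ins so the sketch runs without network access.
scores = {
    "unclear report": ("3", 0.4),
    "vague follow-up": ("3", 0.6),
    "clear follow-up": ("4", 0.9),
}
result = extract_with_fallback(
    "unclear report",
    extract=lambda text: scores[text],
    search=lambda query: ["vague follow-up", "clear follow-up"],
    make_query=lambda text: text,
)
print(result)
```

In this toy run the first text scores too low, so the loop moves through the search results until one clears the threshold.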

--IANS

Disclaimer: No Business Standard Journalist was involved in creation of this content

First Published: Nov 13 2016 | 2:36 PM IST
