Researchers create first image-recognition software to improve web searches

ANI Washington
Last Updated : Nov 19 2014 | 6:50 PM IST

Researchers have developed image-recognition software that uses photos to locate documents on the Internet with far greater accuracy than ever before.

Created by Dartmouth researchers, the new system, which was tested on photos and is now being applied to videos, shows for the first time that a machine learning algorithm for image recognition and retrieval is accurate and efficient enough to improve large-scale document searches online.

The system uses pixel data in images and potentially video - rather than just text - to locate documents. It learns to recognize the pixels associated with a search phrase by studying the results from text-based image search engines. The knowledge gleaned from those results can then be applied to other photos without tags or captions, making for more accurate document search results.

According to the press release, the findings appear in the journal PAMI (IEEE Transactions on Pattern Analysis and Machine Intelligence).

"Images abound on the Internet and our approach means they'll no longer be ignored during document retrieval," says Associate Professor Lorenzo Torresani, a co-author of the study. "Over the last 30 years, the Web has evolved from a small collection of mostly text documents to a modern, gigantic, fast-growing multimedia dataset, where nearly every page includes multiple pictures or videos."

He adds that "when a person looks at a Web page, she immediately gets the gist of it by looking at the pictures in it. Yet, surprisingly, all existing popular search engines, such as Google or Bing, strip away the information contained in the photos and use exclusively the text of Web pages to perform the document retrieval. Our study is the first to show that modern machine vision systems are accurate and efficient enough to make effective use of the information contained in image pixels to improve document search."

The researchers designed and tested a machine vision system - a type of artificial intelligence that allows computers to learn without being explicitly programmed - that extracts semantic information from the pixels of photos in Web pages.

This information is used to enrich the description of the HTML page used by search engines for document retrieval. The researchers tested their approach using more than 600 search queries on a database of 50 million Web pages.
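As a rough illustration of what "enriching the description" of a page could look like, the sketch below merges a page's text terms with labels predicted for its images, then ranks pages by simple term overlap. It is not the researchers' method: the function names, the `(label, score)` classifier output, the `min_score` threshold and the overlap scoring are all assumptions made for the example.

```python
from collections import Counter


def enrich_page_terms(text_terms, image_labels, min_score=0.5):
    """Merge a page's text terms with labels predicted for its images.

    image_labels is a list of (label, score) pairs from a hypothetical
    image classifier; labels scoring below min_score are ignored.
    """
    terms = Counter(text_terms)
    for label, score in image_labels:
        if score >= min_score:
            terms[label] += 1
    return terms


def rank(query_terms, pages):
    """Rank page ids by term-overlap score, highest first (ties by id)."""
    scores = {
        page_id: sum(terms[t] for t in query_terms)
        for page_id, terms in pages.items()
    }
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


if __name__ == "__main__":
    # Page "a" never mentions "beach" in its text, but one of its photos
    # is (hypothetically) recognized as a beach with high confidence.
    page_a = enrich_page_terms(["travel", "guide"], [("beach", 0.9), ("dog", 0.2)])
    page_b = Counter(["hotel", "booking"])
    print(rank(["beach"], {"a": page_a, "b": page_b}))
```

In this toy setup, a text-only index would give page "a" no credit for the query "beach"; the image-derived label is what lets it match.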

They selected the text-retrieval search engine with the best performance and modified it to make use of the additional semantic information extracted by their method from the pictures of the Web pages. They found that this produced a 30 percent improvement in precision over the original, purely text-based search engine.
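The report does not say how precision was measured or whether the 30 percent figure is relative or absolute. One common reading is a relative gain in precision among the top-ranked results, which could be computed along these lines. The cutoff `k` and the sample numbers are invented purely for illustration and are not figures from the study.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_ids) / len(top)


def relative_improvement(baseline, improved):
    """Percentage gain of `improved` over a non-zero `baseline`."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up example: two of the top four results are relevant.
    print(precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4))
    # Made-up precisions for a text-only and an image-enriched engine.
    print(round(relative_improvement(0.40, 0.52), 1))
```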

The new system was developed by researchers at Dartmouth College, Tecnalia Research and Innovation and Microsoft Research Cambridge.


First Published: Nov 19 2014 | 6:38 PM IST
