Business Standard
Saturday, Feb 04, 2012
Search market to get another engine
Shivani Shinde / Mumbai Aug 27, 2009, 02:58 IST

HP, along with the Indian Institute of Technology, Bombay, is working on an engine to make online search more meaningful

Last year, Hewlett-Packard (HP) Labs initiated open research grants to dozens of universities worldwide. One such grant went to the Computer Science Department of the Indian Institute of Technology, Bombay (IIT-B).

Professor Soumen Chakrabarti and his group at IIT-B used this grant to work on a new search engine which would trawl the web to provide relevant answers to queries. Their efforts are yielding results.

The IIT-B team has already created billions of annotation links between a 500-million web page corpus and millions of entities known to Wikipedia. The data is being processed on 42 high-end HP servers with over 350 gigabytes of RAM and over 150 terabytes of disk storage, donated by Yahoo. HP Labs and Microsoft Research have provided additional research funding.

The initial results have been very encouraging. “The search for quantity queries get answered in 2-5 seconds,” says Sayali Kulkarni, a student working on the project at IIT-B. Prof Chakrabarti adds that the search engine will also allow searching for entities and relations, with queries like “how old is Feng Shui” or “how many people are infected with AIDS worldwide.”

What makes this system different from others? While the existing major players still expect 2-3 word queries and return URLs (web addresses) to browse, the new engine will understand more of the structure in a query and respond with information nuggets and tables, not just links to the pages (sources) from which this knowledge is distilled.

So queries like “length of the Nile River” or “maximum speed of a Mercedes-Benz SLR McLaren” would be answered using encyclopaedic sources like Wikipedia. In many cases, however, such sources are not sufficient and the answer needs support from unstructured web text like news and blogs. The system being built can aggregate, for each query, tens of thousands of snippets into quantitative answers.
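As a rough illustration of how many snippets might be folded into one quantitative answer, the Python sketch below pulls kilometre values out of a few made-up snippets and reports their median. The snippets, the regular expression and the function name aggregate_quantity are assumptions made for this example; the article does not describe how the IIT-B system actually aggregates evidence.

```python
import re
from statistics import median

# Made-up snippets for illustration; real input would be retrieved web text.
SNIPPETS = [
    "The Nile is about 6,650 km long.",
    "At 6,695 km the Nile is often called the longest river.",
    "The Nile flows for roughly 6,650 km through eleven countries.",
    "A 4,000 km stretch of the Nile is navigable.",
]

# Matches a number such as 6,650 or 6650.5 followed by the unit "km".
QUANTITY = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*km\b")

def aggregate_quantity(snippets):
    """Return the median of all km values mentioned in the snippets, or None."""
    values = []
    for text in snippets:
        for match in QUANTITY.finditer(text):
            values.append(float(match.group(1).replace(",", "")))
    return median(values) if values else None

print(aggregate_quantity(SNIPPETS))
```

The median is used here only because it tolerates a stray value such as the "4,000 km" snippet, which measures something else; a real system would presumably weigh snippets by relevance as well.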

To be successful, any search engine needs a robust mechanism for indexing web pages. The web holds a vast number of pages at any given time; Google alone, for instance, has indexed over 8 billion pages and over 1.1 billion images. Add to that an efficient crawler, which fetches pages from servers across the World Wide Web.

In the case of the HP-IIT-B engine, the mainstay is annotation, the indexing of annotations alongside ordinary text, and support for a query language that can combine categories, annotations, quantities and regular text in creative ways, typically ending with evidence aggregation. The key to moving up the search value chain, according to Chakrabarti, is threefold: add semi-structured knowledge to the unstructured corpus in the form of type, entity, category and relationship annotations; index these annotations along with the text; and open up search application programming interfaces (APIs) and query languages to probe these indices and aggregate the resulting knowledge.
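To make the idea of indexing annotations alongside ordinary text concrete, here is a minimal Python sketch of an inverted index in which words and annotation keys share the same posting lists. The documents, the "type:" and "entity:" key format, and the functions build_index and search are hypothetical; the project's real index and query language are not described in this article.

```python
from collections import defaultdict

# Hypothetical documents: raw text plus annotations attached by some annotator.
DOCS = {
    1: ("the nile is 6650 km long", {"entity:Nile", "type:river"}),
    2: ("the amazon carries more water", {"entity:Amazon", "type:river"}),
    3: ("amazon opened a new warehouse", {"entity:Amazon.com", "type:company"}),
}

def build_index(docs):
    """Map every word and every annotation key to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, (text, annotations) in docs.items():
        for token in text.split():
            index[token].add(doc_id)
        for annotation in annotations:
            index[annotation].add(doc_id)
    return index

def search(index, *terms):
    """Return the sorted doc ids that contain all terms (words or annotations)."""
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []

index = build_index(DOCS)
print(search(index, "amazon", "type:river"))
```

Because the word "amazon" and the annotation "type:river" live in the same index, a single query can ask for pages that mention the word and are about a river, which plain keyword search cannot express.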

Chakrabarti adds that most of the popular search engines offer little or no support for at least two important kinds of queries: “For example, you cannot ask for a table of actors and the number of academy awards they won. Typing in ‘actor number academy awards’ is a shot in the dark, as the existing players do not expose to you any catalogue of actors that they know about, and let you implicitly expand actor into each known instance of that category.”

Second, he says, existing engines are not very good at letting people query and manipulate physical quantities, “although this is the single most important data type on the web”. He adds: “Sure, you can go to an e-commerce vertical and ask for digital SLRs (cameras) priced between $700 and $1,010, but you won’t be that successful asking a generic search engine for a laptop with battery life between 4 and 6 hours, or the typical driving time between Stuttgart and Mainz.”
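A quantity query of the kind Chakrabarti describes can be sketched in a few lines of Python: extract numeric values with a unit from text and keep only the snippets whose value falls in the requested range. The laptop snippets, the regular expression and the function in_range are invented for illustration and do not reflect the project's actual query language.

```python
import re

# Hypothetical product snippets; names and figures are invented for illustration.
SNIPPETS = [
    "Laptop A offers battery life of 5 hours",
    "Laptop B lasts 3.5 hours on a charge",
    "Laptop C runs for 6 hours",
    "Laptop D weighs 2 kg",
]

# Matches a number such as 5 or 3.5 followed by "hour" or "hours".
HOURS = re.compile(r"(\d+(?:\.\d+)?)\s*hours?\b")

def in_range(snippets, low, high):
    """Return the snippets that mention an hours value within [low, high]."""
    hits = []
    for text in snippets:
        values = [float(m.group(1)) for m in HOURS.finditer(text)]
        if any(low <= v <= high for v in values):
            hits.append(text)
    return hits

print(in_range(SNIPPETS, 4, 6))
```

Treating the number and its unit as one typed value is what lets the range condition work; a keyword engine would see "4", "6" and "hours" only as unrelated words.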

But analysts are not convinced. Asheesh Raina, principal research analyst at Gartner, believes that even if HP does decide to launch this for the masses, it will not make much difference. “First I would like to see the system. But this would just be an incremental enhancement to the already existing platforms. Even if you think that this might be useful for enterprises, there are very few who would want this and that also in selected departments,” he added.

Nevertheless, Chakrabarti and his team plan to release their new search API to key research partners, including several universities, by the end of this year. The initial target is to handle thousands of queries per day — a far cry from the hundreds of millions of queries processed by big search engines like Google, Yahoo and now Bing. The goal, however, is a new level of extraction, ranking, aggregation and consolidation of information nuggets, and the IIT-B team believes it can do it.
