What happens when a US $29-billion tech giant accuses its US $63-billion rival of stealing intellectual property and puts the evidence in the public domain? Geeks everywhere settle down to enjoy the ensuing flame war.
On February 1, Google provided damning evidence that Microsoft’s (MS’) search engine Bing copy-pastes results from Google’s search engine (SE). Amit Singhal, who oversees Google’s search-ranking algorithm, explained on the Google blog how he ran a sting operation.
In mid-2010, Google started suspecting copying. “A misspelled query, ‘torsorophy’ (sic) for ‘tarsorrhaphy’ (an eye-surgery procedure), showed exactly the same top results on Google and Bing.” By October 2010, Google was certain: the same top results showed up in too many queries.
SEs use proprietary algorithms to rank and list query results. It’s statistically unlikely that two algorithms will list results in exactly the same order. This is like two random people listing the same favourite songs in the same order. Google also tinkers continuously with its algorithm, in part to prevent its AdSense system from being reverse-engineered and gamed. This makes accidental duplication all but impossible.
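For a rough feel of the odds, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that two engines have picked the same ten pages and order them independently and at random. Real rankings are not random, so this is a toy calculation and not a measurement; the function name is made up for the example.

```python
import math

def chance_same_order(n_results: int) -> float:
    """Chance that two independent, uniformly random orderings of the
    same n_results items come out identical (a toy assumption)."""
    return 1 / math.factorial(n_results)

# Ten results can be ordered in 10! different ways.
print(math.factorial(10))      # 3628800
print(chance_same_order(10))   # roughly 2.8e-07
```

Under that assumption, an identical top ten would be a one-in-3.6-million coincidence for a single query, before counting repeats across many queries.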
Singhal’s team created about 100 “synthetic queries”: nonsense alphanumeric combinations like “hiybbprqag”. No SE should return results for a nonsense word, which shows up nowhere (though you’ll get interesting results now if you type “hiybbprqag”). For each synthetic query, Google manually added one real webpage as its unique result. Singhal compared this to marking currency notes.
When Bing was queried on the same synthetics, the faked Google results showed up. Matt Cutts, who heads Google’s Webspam team, has posted a 40-minute video showing the identical faked results with a series of Google-Bing screenshots.
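The logic of the sting can be sketched in a few lines of Python. This is only an illustration of the marked-notes idea, not Google’s actual tooling: the function names, the placeholder page addresses and the stand-in “rival” are all hypothetical, and the number of leaks in the toy run is arbitrary.

```python
import random
import string

def make_synthetic_query(rng: random.Random, length: int = 10) -> str:
    """Build a nonsense lowercase string unlikely to appear on any real page."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def plant_markers(rng, pages, count):
    """Map each synthetic query to one hand-picked page (the 'marked note')."""
    planted = {}
    while len(planted) < count:  # loop guards against a duplicate query string
        planted[make_synthetic_query(rng)] = rng.choice(pages)
    return planted

def find_leaks(planted, rival_search):
    """Return the synthetic queries for which the rival shows the planted page."""
    return [q for q, page in planted.items() if page in rival_search(q)]

# Toy usage: a stand-in rival that has somehow learned two planted pairs.
rng = random.Random(2011)
planted = plant_markers(rng, ["example.com/a", "example.com/b"], count=100)
learned = dict(list(planted.items())[:2])
rival = lambda q: [learned[q]] if q in learned else []
print(len(find_leaks(planted, rival)), "of", len(planted), "markers surfaced")
```

Since a nonsense query has no honest reason to return anything, every marker that surfaces on the rival points back to the engine where it was planted.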
Stefan Weitz, Bing Director, responded with an initial denial that was, in effect, not a denial. “We do not copy Google’s results. We use multiple signals and approaches. Opt-in programs like the Bing toolbar help us with clickstream data, one of many input signals we use to help rank sites. This ‘Google experiment’ seems like a hack to confuse and manipulate these signals.”
Translated, this says Bing analyses and weights results from Google and other SEs as well. “Clickstream” is a record of the clicks made by a surfer. Internet Explorer 8 (IE8) comes with a “suggested sites” feature and a Bing toolbar. Both monitor clickstream data and send it to MS.
MS can, therefore, figure out exactly what IE8 users queried for, when, and on which SE. Bing’s algorithm weights the clickstream SE queries as part of its input. The sting suggests that the weight is very high indeed for Google SE queries.
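How much a single signal matters depends on its weight. The Python sketch below is a toy model, not Bing’s algorithm, whose signals and weights are not public: the signal names, the numbers and the function are all invented for illustration. It ranks two pages by a weighted sum and shows the order flipping once the clickstream weight is raised.

```python
def rank_pages(signals_by_page, weights):
    """Order pages by a weighted sum of their signal values, highest first."""
    def score(page):
        return sum(weights.get(name, 0.0) * value
                   for name, value in signals_by_page[page].items())
    return sorted(signals_by_page, key=score, reverse=True)

# Hypothetical signal values for two pages on the same query.
pages = {
    "page-a": {"text_match": 0.9, "links": 0.8, "clickstream": 0.1},
    "page-b": {"text_match": 0.4, "links": 0.3, "clickstream": 0.9},
}

# Invented weightings: clickstream as a minor input, then as the dominant one.
low_click = {"text_match": 0.5, "links": 0.3, "clickstream": 0.2}
high_click = {"text_match": 0.2, "links": 0.1, "clickstream": 0.7}

print(rank_pages(pages, low_click))    # ['page-a', 'page-b']
print(rank_pages(pages, high_click))   # ['page-b', 'page-a']
```

With a heavy enough clickstream weight, whatever users clicked on elsewhere ends up deciding the order.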
According to January 2011 data from Net Applications, about 85 per cent of all global searches are made on Google. Bing is the third-ranked SE with 3.7 per cent (behind Yahoo’s 6 per cent). Hence, the high weightage is not surprising. As the war of words escalated, MS Senior VP Yusuf Mehdi called Google’s sting “click fraud”. Google likened Bing’s approach to kids copying in exams.
The two companies are bitter rivals. In the 1990s, Google was part of a widespread movement that broke the IE monopoly through antitrust lobbying. MS is now part of a consortium, FairSearch.org (this includes travel sites TripAdvisor, Expedia, Hotwire, Kayak and Travelocity), that is trying to stop Google’s bid to buy flight information software company ITA Software for $700 million.
Both are monopolies in different spaces. Around 89 per cent of all PCs use Windows, MS Office has 80 per cent application suite market share and IE has 56 per cent of browser market share. Google, apart from SE and AdSense, also has a major smart telephony play in Android. It has pushed hard into MS’ domain with the Chrome browser+OS, and with Google Docs, an online alternative to MS Office. The battle for dominance won’t end with a few synthetic words.