There's an old saying that "statistics don't lie, but liars use statistics". This is worth keeping in mind as we digest the plethora of polls that profess to take the pulse of the nation as elections approach.
Of the many recent polls, the one that has garnered by far the most play in the media asks respondents to compare possible prime ministerial candidates. The survey, conducted by GFK for CNN-IBN, finds that 38 per cent of respondents prefer Narendra Modi as PM, while Manmohan Singh weighs in at a measly 13 per cent and Rahul Gandhi fares only slightly better at 14 per cent.
The message from such polls for political commentators sympathetic to Modi is self-evident: the Bharatiya Janata Party (BJP) will have no choice but to nominate him as its PM candidate if it is to have any hope of winning next year. A screaming headline in FirstPost leaves little room for debate: "Three polls, one message: No alternative to Modi for the BJP".
The BJP itself appears to have heeded this message this past weekend in Goa and has elevated Modi to campaign chief, a possible first step to becoming the PM candidate.
The political judgement that Modi is the BJP's best shot at winning the 2014 elections may well be correct. But what is problematic is claiming a scientific basis for that judgement by appealing to opinion polls such as the GFK survey. This is especially so because many such polls in India do not disclose their survey methodology in any detail.
This is where one must separate fact from spin and keep that old adage in mind.
Consider further the GFK poll, whose authors report that they surveyed 2,466 adults spread across 12 major metros. In Mumbai, for instance, 102 men and 102 women were surveyed, for 204 respondents in total. For a city with a population estimated at 18 million people, to call this a small sample is to put it mildly.
Nevertheless, as a matter of statistics, a sample, even if small, may lead to a reasonably accurate prediction, provided one crucial condition is met: it must be representative of the population being sampled. This, in turn, requires that the sample be chosen truly randomly - for instance, by using a statistical algorithm to draw names from voter lists. Failing that, there is the danger of what is known in the trade as "sampling bias", as distinct from ordinary sampling error, the chance variation that even a properly random sample carries. Put simply, a sample that has not been selected randomly is likely to lead to biased predictions.
The GFK poll tells us only that interviews were conducted "in respondents' homes and in street corners", but gives us no indication that subjects were picked randomly. Also, as is typical with Indian polls, we are not told the margin of error, so we have no way to assess the accuracy of the predictions.
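For a sense of scale, the textbook margin of error can be worked out from the sample size alone. The sketch below is a rough illustration, not the poll's actual margin of error: it assumes a simple random sample and the worst-case proportion of 50 per cent, neither of which the GFK report claims, and the function name is invented for this example.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes are those reported for the poll; random sampling is an
# ASSUMPTION made here purely for illustration.
mumbai = margin_of_error(204)
all_metros = margin_of_error(2466)

print(f"Mumbai (n=204): +/- {mumbai * 100:.1f} percentage points")
print(f"All metros (n=2,466): +/- {all_metros * 100:.1f} percentage points")
```

Under those assumptions the figures come to roughly 7 percentage points for Mumbai alone and roughly 2 for the pooled sample. Without random selection, no such figure can be attached to the results at all.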
This is not merely an arcane matter of arid statistical theory. Polls in advance of the 2004 elections uniformly predicted a thumping victory for the National Democratic Alliance. We all know how accurate those polls turned out to be.
As my co-author and I have argued in a recent book, it is widely acknowledged by polling experts that a massive sampling bias was one of the major culprits in 2004. The polls oversampled urban middle class voters - the BJP's natural constituency - and undersampled other groups more likely to support the Congress or other parties, leading to predictions that were badly skewed.
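The effect of oversampling one group can be sketched with a small simulation. Every number below is hypothetical and chosen only to show the mechanism; none is drawn from the 2004 polls or from the GFK survey, and the function name is invented for this example.

```python
import random

def run_poll(n, urban_share, support_urban, support_other, rng):
    """Poll n respondents, a fraction urban_share of whom are urban,
    and return the share who say they back the party."""
    backers = 0
    for _ in range(n):
        is_urban = rng.random() < urban_share
        support = support_urban if is_urban else support_other
        if rng.random() < support:
            backers += 1
    return backers / n

# HYPOTHETICAL electorate, for illustration only.
POP_URBAN_SHARE = 0.30   # urban share of the electorate
SUPPORT_URBAN = 0.45     # party support among urban voters
SUPPORT_OTHER = 0.25     # party support among everyone else

true_support = (POP_URBAN_SHARE * SUPPORT_URBAN
                + (1 - POP_URBAN_SHARE) * SUPPORT_OTHER)

rng = random.Random(2004)  # fixed seed so the run is repeatable
representative = run_poll(100_000, POP_URBAN_SHARE, SUPPORT_URBAN, SUPPORT_OTHER, rng)
oversampled = run_poll(100_000, 0.90, SUPPORT_URBAN, SUPPORT_OTHER, rng)

print(f"True support:        {true_support:.3f}")
print(f"Representative poll: {representative:.3f}")
print(f"Urban-heavy poll:    {oversampled:.3f}")
```

Even with 100,000 respondents, the urban-heavy poll settles near 43 per cent rather than the true 31 per cent. A larger sample shrinks random noise but does nothing to correct a skewed one.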
The same caution applies to the GFK poll and others like it. Urban middle class voters are Modi's natural constituency, and his polling strongly among them does not necessarily imply a similar pattern among everyone else.
Tellingly, the GFK poll reports that a whopping 73 per cent of respondents believe that social media will play an important role in upcoming election campaigns. Since only about 10 per cent of the population even have an internet connection, it doesn't take an advanced degree in statistics to realise how unrepresentative of the Indian population the survey is.
There is a more basic point to be made that involves understanding how our Westminster parliamentary democracy works. Asking whether respondents support one or another individual for PM may be of interest, but there is no guarantee that people will actually vote on that basis.
In our model, unlike a presidential system, voters elect members of Parliament, and the national election is an aggregation of individual contests across constituencies. While a party's presumptive or declared leader may be one criterion that affects a voter's choice, so will be a host of local, regional, and other factors that often have nothing to do with the person at the top.
The upshot is that even those people who say they prefer Modi for PM may not necessarily vote for the BJP. And since the party's support has historically been bunched in about 300 constituencies, the vote-to-seat translation in our first-past-the-post system is unlikely to favour the BJP. This, whether one likes it or not, makes it less likely that a Modi-inspired vote for the BJP will translate into big electoral gains.
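A toy example shows why the distribution of votes matters as much as their number under first-past-the-post. It assumes just two parties and ten equal-sized seats, with entirely hypothetical vote counts; it is a sketch of the mechanism, not a model of any real constituency map.

```python
def seats_won(party_votes, voters_per_seat=100):
    """Count seats where the party beats its single rival, who is
    assumed to take every remaining vote in that seat."""
    return sum(1 for v in party_votes if v > voters_per_seat - v)

# HYPOTHETICAL vote counts in ten seats of 100 voters each.
spread = [55, 55, 55, 55, 55, 55, 55, 5, 5, 5]
bunched = [90, 90, 90, 90, 10, 10, 5, 5, 5, 5]

# Both patterns add up to the same 40 per cent national vote share.
assert sum(spread) == sum(bunched) == 400

print("Spread support: ", seats_won(spread), "of 10 seats")
print("Bunched support:", seats_won(bunched), "of 10 seats")

# A surge of 5 extra votes in each stronghold lifts the vote share
# from 40 to 42 per cent, yet adds no seats.
surge = [v + 5 if v >= 90 else v for v in bunched]
print("After surge:    ", seats_won(surge), "of 10 seats")
```

With the same 40 per cent of the vote, the spread pattern wins seven seats and the bunched pattern only four. Extra votes piled up where a party already wins are, in seat terms, wasted.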
The Indian electorate may yet make fools of all of the pollsters, pundits and prognosticators.
From the Ivory Tower makes research from the academic world accessible to all our readers