Business Standard

Too much or too little?

How will historians, who are after all students of the arts, ever cope with exabytes (or is it zettabytes or yottabytes?) of data?

Rrishi Raote  |  New Delhi 

What do you think the big problem of future historians will be — too much information or too little?

Moment’s pause for thought: 3 million books a year in 2011, millions of articles, millions of hours of television and video and audio, billions of emails and tweets (the US Library of Congress is now saving every public tweet), trillions of social media updates. Corporations are hard at work building electronic archives of human intellectual output. The habit can only spread.

Am I missing something? Oh yes, government papers, the raw material of conventional history. Well, who knows how many of those.

So, the historians of the near future will have to deal with exabytes (or is it zettabytes or yottabytes?) of data. How will historians, who are after all students of the arts, ever cope?

This is a philosophical question. When you are trying to tease a story from numbing quantities of data, you have to ask the right questions. For example, you can mine the mountain of tweets to learn which topics were trending at what time, or when certain terms came into use or faded out of use, or to connect two or more variables such as trending topics and the Sensex. For this you need not an intelligent eye but clever software. That is being developed. Expect hectic interdisciplinary work in universities.

But let’s imagine the distant historian of the far future. He will be human, we presume, and manipulating data on a godlike scale. If software reduces “research” to “setting a query” then good historians will have to change focus. Perhaps they will be less interested in everyday minutiae than in slow processes, long trends.

Which suggests something like the “psychohistory” of Isaac Asimov’s science-fiction Foundation series (1942 onwards). A mathematician of the future pioneers a way of predicting history, not in detail but broadly, by tracking very large numbers of people. The algorithms that can analyse the past via vast quantities of data could conceivably be applied to extrapolating the future. Until now, this has been done chiefly by human writers writing fiction.

So, not much future in conventional history.

But two recent sets of articles suggest that the future might not be so data-rich. In one excellent series in March in the New York Times, historian Dinyar Patel described the alarming state of affairs in India’s archives and libraries. Indian history, he says, is literally crumbling to dust. Paper is not immortal if it is not cared for.

And last week the Economist wrote about the risk to digital data. Disks and drives degrade, and the software and hardware to access them become obsolete. Careful libraries and archives keep upgrading their storage; but most do not. Swathes of digital history are already lost.

In some ways, paper is still best. Take care of it and it will last for ever. Some public and private institutions in the West are trying to protect what they can. In one private archive in America a philanthropist named Brewster Kahle is storing every book he can lay his hands on in sealed shipping containers. Others are attempting to print out and store parts of the Internet. But paper cannot capture the experience of the Internet, which is also a historical artefact.

For the moment, there is scope for fiction. Imagine, if data does survive and accumulate, a future society where one kind of worker mines data while another sets the philosophical goals, or questions. Or, a society where all information is consciously archived by those, including ordinary folk, who generate it — how will this change the way they behave? Or, a society in which ancient techniques of memorisation (think of the Vedas, faultlessly transmitted orally for millennia) are put to modern use, to store or catalogue data. A short story which raises the issue of private versus public data — should your emails be allowed to perish? A society in which, because of ways of storing and processing information, people and computers are co-dependent?

It sounds like old-fashioned science fiction. Interesting how the future can begin to look like the past.  

First Published: Sat, May 05 2012. 00:09 IST