Big data: Big concerns

Cathy O'Neil brings in interesting human stories to explain risks of indiscriminate use of big data

M S Sriram

Last Updated : Jan 26 2017 | 11:24 PM IST


Weapons of Math Destruction

How Big Data Increases Inequality and Threatens Democracy

Cathy O’Neil

Allen Lane

259 pages; £12.99

Big data seems to be the new fix for all the problems that we foresee, particularly in providing technology-enabled solutions for the most pressing problems of the world. Big data will help you diagnose diseases, predict fraud and spot patterns in customer behaviour and, of course, there is a whole host of free stuff that you will get in return for authorising an app to use your personal information. What possibly started as the Google experience, where you get the ease of searching without the intrusive big pasted advertisements or pop-ups, has now become almost a lifestyle. As we merrily share data, the intrusion of commerce into our lives is subtle and, slowly, we are unable to see where our private persona ends and the public persona takes over.

Cathy O’Neil has been there and done that. She has worked on big data and has seen from close quarters how the modelling takes place and how the results are interpreted. She recognises the importance of big data and the benefits it brings. At the same time, Ms O’Neil rings a warning bell on the indiscriminate use of big data: how a human being or a life is seen as a data point in a larger journey of data becoming commerce. Hers is an important voice to be heard when the big advocates of the JanDhan-Aadhaar-Mobile trinity are talking of India moving from a data-poor country to a data-rich one. We need to understand what being data-rich means and what it implies.

Ms O’Neil talks about where the data and patterns would be useful: certainly in baseball games (or, for that matter, in cricket), where you could use these to analyse the opposing team and frame your strategies. In the process you are making the game even more interesting and not killing anybody. However, what happens when the data that you use turns out to be circular and possibly leads to patterns similar to racial profiling in crime data? This is the problem. What big data does is exactly what our minds do: create patterns based on past experience. These patterns would record the exceptions as “errors”. But what happens to these exceptions in real life? Would they become victims of a predictive model? This is an important question to ask. It then leads us to consider whether more and more “scientific” models will have an objective way of getting people in, but no objective way of making exceptions. After all, each human being is an exception and unique. While it is okay to make a game-based prediction, how fair is it to take legal action based on a suspected movement, just because the machine told you so?

Ms O’Neil brings in interesting human stories to explain the risks. One is the story of Sarah Wysocki and other teachers who were classified as failures because the district administration had used a sophisticated modelling technique. First, firing her was an “error”. A large part of the evaluation was based on the difference between what her students scored when they came in and what they scored when they went out; there was no objective way of telling if they had come in with scores artificially inflated by the previous teacher. Secondly, the fact that firing her was an error was not even reported back for the system to learn. Most big data models work as black boxes, without the feedback that is necessary to refine them. In any case, treating Ms Wysocki as a data point should in itself be an ethical and moral problem.

Given that we are on the verge of introducing many tech-enabled start-ups to help, say, peer-to-peer lending, payday lending and cross-selling of third-party products, the scene is getting scary. There are companies that are building credit behaviour models based on data mined from Facebook, WhatsApp posts and geo-locations. Big brother is watching like never before. In this context it is important to read this book, look at the limitations of data and seriously examine the ethical limits of the machine invading our lives and making decisions for us.

Ms O’Neil’s book is in the same league as Michael Sandel’s, though without quite the same breadth or depth, in reminding us of the limits of commerce, bringing fairness to the fore and asking difficult questions on whether the poor and customers in general are to be seen as data points or as human beings. This is a book that should be read by all the youngsters building “apps” to play around with human behaviour and all the venture funders who encourage these youngsters. It is important that they stand up to the start-ups.

The reviewer is Visiting Faculty, Centre for Public Policy, Indian Institute of Management, Bangalore

mssriram@gmail.com