What is scraping and how it was used to breach data at Facebook, LinkedIn

Facebook has previously filed cases against companies that have developed tools to scrape its user data; LinkedIn also said scraping violates its terms of service

Neha Alawadhi New Delhi
3 min read Last Updated : Apr 12 2021 | 1:30 AM IST

The past week saw massive data breaches at two tech giants, Facebook and LinkedIn. While neither denied that customer data had been leaked, both said it had not been hacked from their systems but scraped.

Data of around 533 million Facebook users was leaked online, according to a report by Business Insider on April 3.

This includes the data of six million users from India, covering their phone numbers, Facebook IDs, full names, locations, birthdates, bios and, in some cases, email addresses.

India, one of Facebook's largest markets, accounts for about 410 million of its users.

In response, Facebook said, “It is important to understand that malicious actors obtained this data not through hacking our systems but by scraping it from our platform prior to September 2019.”

Similarly, data from 500 million LinkedIn accounts was leaked online, as first reported by CyberNews, including names, addresses, emails, phone numbers, workplace information and so on.

“We have investigated an alleged set of LinkedIn data that has been posted for sale and have determined that it is actually an aggregation of data from a number of websites and companies. It does include publicly viewable member profile data that appears to have been scraped from LinkedIn. This was not a LinkedIn data breach, and no private member account data from LinkedIn was included in what we’ve been able to review,” LinkedIn said in a statement.

Rajshekhar Rajaharia, the cybersecurity researcher who first tweeted about an alleged data breach at MobiKwik, explained, “When a data breach like this happens, it leaves not just the company whose data is leaked vulnerable, but some of this data can be used to hack or access information and even money of individuals from other organisations.”

“For example, one can make a spoof call using a mobile number leaked in a data breach, call a payment firm or bank’s IVR, which will recognise the spoof number as registered. Secondly, most payment companies and banks ask for the last four digits of your debit or credit card to verify you. That number has also been leaked, linked to your number in the data breach. Hence, unscrupulous elements can access and even steal your financial data or money using the data from data breaches,” he said.

What is scraping?

Infrastructure and website security firm Cloudflare explains what scraping means. Essentially, it is the process of using an application to extract valuable information from a website.

Contact scraping, which is what has happened at LinkedIn, is executed when the perpetrator “scrapes” locations like an online employee directory. By doing so, “a scraper is able to aggregate contact details for bulk mailing lists, robo calls, or malicious social engineering attempts. This is one of the primary methods both spammers and scammers use to find new targets.”
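To make the idea concrete, here is a minimal Python sketch of how an application can pull contact details out of a web page. It is an illustration only, not the method used in either incident: the directory page, the names in it and the helper names (`SAMPLE_PAGE`, `ContactParser`, `scrape_contacts`) are all hypothetical, and the page is hard-coded so that nothing is fetched from a real website.

```python
from html.parser import HTMLParser

# Hypothetical employee-directory page, hard-coded so the sketch
# runs without any network access.
SAMPLE_PAGE = """
<ul class="staff">
  <li><span class="name">A. Example</span>
      <a href="mailto:a.example@example.com">Email</a>
      <a href="tel:+15550100">Call</a></li>
  <li><span class="name">B. Sample</span>
      <a href="mailto:b.sample@example.com">Email</a></li>
</ul>
"""


class ContactParser(HTMLParser):
    """Collects e-mail addresses and phone numbers from link targets."""

    def __init__(self):
        super().__init__()
        self.emails = []
        self.phones = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the mailto:/tel: links we look for.
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith("mailto:"):
            self.emails.append(href[len("mailto:"):])
        elif href.startswith("tel:"):
            self.phones.append(href[len("tel:"):])


def scrape_contacts(html):
    """Parse one page of HTML and return the contact details found in it."""
    parser = ContactParser()
    parser.feed(html)
    return {"emails": parser.emails, "phones": parser.phones}


contacts = scrape_contacts(SAMPLE_PAGE)
print(contacts)
```

Run against one page, such a script yields a handful of records; repeated automatically across millions of publicly viewable profiles, the same approach produces the kind of aggregated dataset described above.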

Facebook has previously filed cases in the US and UK against app developers and companies that have developed tools to scrape its user data as it is against the tech firm’s terms of service.

LinkedIn also said scraping violates its terms of service. “Any misuse of our members’ data, such as scraping, violates LinkedIn terms of service. When anyone tries to take member data and use it for purposes LinkedIn and our members haven’t agreed to, we work to stop them and hold them accountable”.

LinkedIn had a user base of 71 million in India in 2019, making it the company's second-largest market.
