Hate speech-detecting AIs can be fooled: Study

IANS London
Last Updated : Sep 16 2018 | 5:35 PM IST

Machine learning detectors deployed by major social media and online platforms to track hate speech are "brittle and easy to deceive", a study claims.

The study, led by researchers from Aalto University in Finland, found that bad grammar and awkward spelling -- intentional or not -- can make toxic social media comments harder for artificial intelligence (AI) detectors to spot.

Modern natural language processing (NLP) techniques can classify text based on individual characters, words or sentences. When faced with textual data that differs from the data used in their training, they begin to fumble, the researchers said.
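To make the point about unfamiliar input concrete, here is a minimal Python sketch. It is an illustration only and not the researchers' code: the two training sentences and the helper names (word_tokens, char_ngrams, unseen_fraction) are assumptions made for this example. It measures how much of a modified comment a model would never have seen in training, once at the word level and once at the character level.

```python
# Toy illustration (hypothetical data and helper names, not the study's code).

def word_tokens(text):
    """Split lower-cased text on whitespace."""
    return text.lower().split()

def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of the lower-cased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def unseen_fraction(tokens, vocab):
    """Fraction of tokens that never appeared in the training vocabulary."""
    if not tokens:
        return 0.0
    return sum(t not in vocab for t in tokens) / len(tokens)

# Assumed two-sentence "training set" for the sketch.
train = ["I hate you", "I love you"]
word_vocab = {t for s in train for t in word_tokens(s)}
char_vocab = {g for s in train for g in char_ngrams(s)}

modified = "Ihateyou"  # spaces removed
print(unseen_fraction(word_tokens(modified), word_vocab))  # 1.0
print(unseen_fraction(char_ngrams(modified), char_vocab))  # 0.5
```

In this toy setup the word-level view sees a single token it has never encountered, while half of the character trigrams still match the training text. Real detectors are far more complex, so this only shows the kind of mismatch the researchers describe.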

"We inserted typos, changed word boundaries or added neutral words to the original hate speech. Removing spaces between words was the most powerful attack, and a combination of these methods was effective even against Google's comment-ranking system Perspective," said Tommi Grondahl, a doctoral student at the varsity.

The team put seven state-of-the-art hate speech detectors to the test for the study. All of them failed.

Among them was Google's Perspective. It ranks the "toxicity" of comments using text analysis methods.

It was found earlier that Perspective could be fooled by introducing simple typos.

But Grondahl's team discovered that although Perspective has since become resilient to simple typos, it can still be fooled by other modifications such as removing spaces or adding innocuous words like "love".

A sentence like "I hate you" slipped through the sieve and was rated as non-hateful when modified into "Ihateyou love".
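The modifications named in the study can be sketched in a few lines of Python. This is a hedged illustration and not the researchers' implementation: the function names are invented for this example, and the word-weight scorer is a deliberately naive stand-in that has nothing to do with how Perspective or any of the tested detectors works.

```python
import random

def insert_typo(text, rng):
    """Swap two adjacent inner letters of one randomly chosen word."""
    words = text.split()
    candidates = [i for i, w in enumerate(words) if len(w) >= 4]
    if not candidates:
        return text  # no word long enough to perturb
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(1, len(w) - 2)  # keep first and last letters in place
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)

def remove_spaces(text):
    """Change word boundaries by deleting all whitespace."""
    return "".join(text.split())

def add_neutral_word(text, word="love"):
    """Append an innocuous word to the comment."""
    return f"{text} {word}"

# Hypothetical word weights for a naive word-level scorer (assumption).
WEIGHTS = {"hate": 1.0, "love": -1.0}

def toy_toxicity(text):
    """Average the weights of whitespace-separated words; unknown words count 0."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(WEIGHTS.get(t, 0.0) for t in tokens) / len(tokens)

original = "I hate you"
attacked = add_neutral_word(remove_spaces(original))
print(attacked)                           # Ihateyou love
print(round(toy_toxicity(original), 2))   # 0.33
print(toy_toxicity(attacked))             # -0.5
```

The naive scorer flags the original sentence but not the modified one, because "Ihateyou" is an unknown token and "love" pulls the average down. That mirrors the article's example in spirit only; the scores here say nothing about the actual output of any real system.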

Hate speech is subjective and context-specific, which renders text analysis techniques insufficient as stand-alone solutions, the researchers said.

They recommend that more attention be paid to the quality of the data sets used to train machine learning models, rather than to refining the model design.

The results will be presented at the forthcoming ACM AISec workshop in Toronto.

--IANS


Disclaimer: No Business Standard Journalist was involved in creation of this content

First Published: Sep 16 2018 | 5:28 PM IST