Detecting Abusive Language on Online Platforms: A Critical Analysis

Preslav Nakov, Vibha Nayak, Kyle Dent, Ameya Bhatawdekar, Sheikh Muhammad Sarwar, Momchil Hardalov, Yoan Dinkov, Dimitrina Zlatkova, Guillaume Bouchard, Isabelle Augenstein

27 Feb 2021

Abstract

Abusive language on online platforms is a major societal problem, often leading to serious harms such as the marginalisation of underrepresented minorities. There are many different forms of abusive language, such as hate speech, profanity, and cyber-bullying, and online platforms seek to moderate it in order to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Within the field of Natural Language Processing, researchers have developed different methods for automatically detecting abusive language, often focusing on specific subproblems or on narrow communities, as what is considered abusive language differs greatly by context. We argue that there is currently a dichotomy between the types of abusive language that online platforms seek to curb and the research efforts to detect abusive language automatically. We thus survey existing methods as well as the content moderation policies of online platforms in this light, and we suggest directions for future work.

Type

Manuscript

Publication

CoRR, abs/2103.00153

Date

February, 2021