Deep Learning for Hate Speech Detection: A Comparative Study
Abstract
Automated hate speech detection is an important tool in combating the spread of hate speech, particularly in social media. Numerous methods have been developed for the task, including a recent proliferation of deep-learning based approaches. A variety of datasets have also been developed, exemplifying various manifestations of the hate-speech detection problem. We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods, mediated through the three most commonly used datasets. Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art. We particularly focus our analysis on measures of practical performance, including detection accuracy, computational efficiency, capability in using pre-trained models, and domain generalization. In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions. Code and dataset are available at https://github.com/jmjmalik22/Hate-Speech-Detection.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2022
- DOI:
- 10.48550/arXiv.2202.09517
- arXiv:
- arXiv:2202.09517
- Bibcode:
- 2022arXiv220209517S
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence;
- Computer Science - Information Retrieval;
- Computer Science - Machine Learning
- E-Print:
- 18 pages, 4 figures, and 6 tables