One-step and Two-step Classification for Abusive Language Detection on Twitter

doi:10.48550/arXiv.1706.01206

Automatic abusive language detection is a difficult but important task for online social media. Our research explores a two-step approach, which first classifies text as abusive or not and then classifies the abusive text into specific types, and compares it with a one-step approach that performs a single multi-class classification to detect sexist and racist language. On a public English Twitter corpus of 20 thousand tweets labeled for sexism and racism, our approach shows promising performance: an F-measure of 0.827 using HybridCNN in one step and 0.824 using logistic regression in two steps.
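The difference between the two setups can be sketched in a few lines of Python. This is only an illustration of how the decisions are composed, not the paper's implementation: the keyword-based classifiers, the cue tokens, and all function names below are hypothetical placeholders standing in for trained models such as logistic regression or HybridCNN.

```python
from typing import Callable

Label = str  # one of "none", "sexism", "racism"

# Hypothetical placeholder cues; a real system would use trained models.
SEXISM_CUES = {"sexist_cue"}
RACISM_CUES = {"racist_cue"}


def _tokens(text: str) -> set:
    """Lowercase whitespace tokenization (an assumption for this sketch)."""
    return set(text.lower().split())


def toy_multiclass(text: str) -> Label:
    """Stand-in for a single three-way classifier."""
    toks = _tokens(text)
    if toks & SEXISM_CUES:
        return "sexism"
    if toks & RACISM_CUES:
        return "racism"
    return "none"


def toy_is_abusive(text: str) -> bool:
    """Stand-in for the first-stage binary classifier (abusive or not)."""
    return bool(_tokens(text) & (SEXISM_CUES | RACISM_CUES))


def toy_abuse_type(text: str) -> Label:
    """Stand-in for the second-stage classifier, run only on abusive text."""
    return "sexism" if _tokens(text) & SEXISM_CUES else "racism"


def one_step(text: str, multiclass: Callable[[str], Label]) -> Label:
    """One-step: a single multi-class decision over all labels."""
    return multiclass(text)


def two_step(
    text: str,
    is_abusive: Callable[[str], bool],
    abuse_type: Callable[[str], Label],
) -> Label:
    """Two-step: detect abuse first, then assign a specific type."""
    if not is_abusive(text):
        return "none"
    return abuse_type(text)


if __name__ == "__main__":
    for tweet in ["hello world", "a sexist_cue here", "some racist_cue"]:
        print(
            tweet,
            one_step(tweet, toy_multiclass),
            two_step(tweet, toy_is_abusive, toy_abuse_type),
        )
```

In the two-step setup the second classifier never sees text that the first stage rejected, so the two stages can in principle use different models or training data.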

Publication:

arXiv e-prints

Pub Date:

June 2017

DOI:

10.48550/arXiv.1706.01206

arXiv:

arXiv:1706.01206

Bibcode:

2017arXiv170601206P

Keywords:

Computer Science - Computation and Language

E-Print:

ALW1: 1st Workshop on Abusive Language Online, to be held at the annual meeting of the Association for Computational Linguistics (ACL) 2017 (Vancouver, Canada), August 4th, 2017
