UBC-NLP at SemEval-2019 Task 6: Ensemble Learning of Offensive Content With Enhanced Training Data
Abstract
We examine learning offensive content on Twitter with limited, imbalanced data. To this end, we investigate the utility of various data enhancement methods combined with a host of classical ensemble classifiers. Among the 75 teams participating in SemEval-2019 sub-task B, our system ranks 6th (with 0.706 macro F1-score). For sub-task C, among the 65 participating teams, our system ranks 9th (with 0.587 macro F1-score).
- Publication:
- arXiv e-prints
- Pub Date:
- June 2019
- DOI:
- 10.48550/arXiv.1906.03692
- arXiv:
- arXiv:1906.03692
- Bibcode:
- 2019arXiv190603692R
- Keywords:
- Computer Science - Computation and Language
- E-Print:
- 7 pages, 2 figures, Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval)