Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features
Abstract
Author profiling is the characterization of an author through key attributes such as gender, age, and language. In this paper, an RNN model with attention (RNNwA) is proposed to predict the gender of a Twitter user from their tweets. Both word-level and tweet-level attention are used to learn 'where to look'. This model (https://github.com/Darg-Iztech/gender-prediction-from-tweets) is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, and Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and competitive results on Spanish and Arabic.
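The two attention levels and the feature concatenation described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the dot-product scoring against a context vector, the function names (`attention_pool`, `user_representation`), and the toy vectors standing in for RNN hidden states and LSA-reduced n-gram features are all assumptions, since the abstract does not specify them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Return the attention-weighted sum of `vectors` and the weights.

    Scores are dot products with a context vector (an assumed scoring
    function; the paper may use a different one).
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def user_representation(tweets, word_context, tweet_context, ngram_features):
    """Pool words into tweet vectors, tweets into a user vector, then
    append hand-crafted features (hypothetical placeholders here)."""
    tweet_vectors = [attention_pool(words, word_context)[0] for words in tweets]
    user_vector, tweet_weights = attention_pool(tweet_vectors, tweet_context)
    return user_vector + list(ngram_features), tweet_weights


# Toy input: two tweets whose word vectors stand in for RNN hidden states.
tweets = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
representation, tweet_weights = user_representation(
    tweets,
    word_context=[1.0, 0.0],
    tweet_context=[0.0, 1.0],
    ngram_features=[0.1, 0.2, 0.3],  # placeholder for LSA-reduced n-grams
)
print(len(representation))  # 2 neural dimensions + 3 feature dimensions
```

In a trained model the context vectors would be learned parameters and the concatenated vector would feed a classifier; here they are fixed values chosen only to make the sketch runnable.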
- Publication: arXiv e-prints
- Pub Date: August 2019
- DOI: 10.48550/arXiv.1908.09919
- arXiv: arXiv:1908.09919
- Bibcode: 2019arXiv190809919S
- Keywords: Computer Science - Computation and Language; Computer Science - Machine Learning; Statistics - Machine Learning