Bayesian Optimization of Text Representations
Abstract
When applying machine learning to problems in NLP, there are many choices to make about how to represent input texts. These choices can have a big effect on performance, but they are often uninteresting to researchers or practitioners who simply need a module that performs well. We propose an approach to optimizing over this space of choices, formulating the problem as global optimization. We apply a sequential model-based optimization technique and show that our method makes standard linear models competitive with more sophisticated, expensive state-of-the-art methods based on latent variable models or neural networks on various topic classification and sentiment analysis problems. Our approach is a first step towards black-box NLP systems that work with raw text and do not require manual tuning.
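To make the idea concrete, the setup described above (treating text-representation choices as a search space and tuning them with a sequential model-based / Bayesian optimizer wrapped around a linear classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the dataset, hyperparameter ranges, and use of scikit-learn with scikit-optimize's gp_minimize are assumptions made for the example.

```python
# Sketch: Bayesian optimization over text-representation choices
# (n-gram order, binary vs. tf-idf weighting, rare-term cutoff) plus the
# linear model's regularization strength, scored by cross-validated accuracy.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from skopt import gp_minimize
from skopt.space import Categorical, Integer, Real

# Illustrative topic-classification data; any labeled text corpus would do.
data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])

# The "space of representation choices" the optimizer searches over.
space = [
    Integer(1, 3, name="max_ngram"),                  # use 1- up to n-grams
    Categorical(["binary", "tfidf"], name="weighting"),
    Integer(1, 5, name="min_df"),                     # drop very rare terms
    Real(1e-3, 1e3, prior="log-uniform", name="C"),   # L2 strength of the linear model
]

def objective(params):
    max_ngram, weighting, min_df, C = params
    if weighting == "tfidf":
        vec = TfidfVectorizer(ngram_range=(1, int(max_ngram)), min_df=int(min_df))
    else:
        vec = CountVectorizer(ngram_range=(1, int(max_ngram)), min_df=int(min_df),
                              binary=True)
    clf = make_pipeline(vec, LogisticRegression(C=float(C), max_iter=1000))
    # Negate accuracy because gp_minimize minimizes its objective.
    return -cross_val_score(clf, data.data, data.target, cv=3).mean()

# GP-based sequential model-based optimization: each evaluation trains and
# scores one candidate representation; the surrogate proposes the next one.
result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("best configuration:", result.x, "cv accuracy:", -result.fun)
```

The point of the sketch is that the practitioner never hand-tunes the representation: the optimizer spends a fixed budget of classifier trainings and returns the best-performing configuration it found.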
- Publication: arXiv e-prints
- Pub Date: March 2015
- DOI: 10.48550/arXiv.1503.00693
- arXiv: arXiv:1503.00693
- Bibcode: 2015arXiv150300693Y
- Keywords: Computer Science - Computation and Language; Computer Science - Machine Learning; Statistics - Machine Learning