Product Classification in E-Commerce using Distributional Semantics
Abstract
Product classification is the task of automatically predicting a taxonomy path for a product in a predefined taxonomy hierarchy, given a textual product description or title. Efficient product classification requires a suitable feature-vector representation for a document (the textual description of a product) and fast algorithms for prediction. To address these challenges, we propose a new distributional semantics representation for document vector formation. We also develop a new two-level ensemble approach that uses path-wise, node-wise, and depth-wise classifiers (defined with respect to the taxonomy tree) to reduce error in the final product classification. Our experiments on data sets from a leading e-commerce platform show the effectiveness of the distributional representation and the ensemble approach, which achieve better results on various evaluation metrics than earlier approaches.
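The abstract does not spell out how the path-wise, node-wise and depth-wise classifiers define their targets, so the following is only a rough sketch of one plausible reading. It assumes a taxonomy path is a list of node names from root to leaf. It also assumes that the three classifier types differ in what they treat as a label: the whole path, each node, or one node per depth level. The function names (`path_label`, `node_labels`, `depth_labels`, `majority_vote`) and the example categories are hypothetical. The simple majority vote is a stand-in for illustration, not the paper's two-level ensemble.

```python
from collections import Counter


def path_label(path):
    """Path-wise view (assumed): the whole root-to-leaf path is one class."""
    return " > ".join(path)


def node_labels(path):
    """Node-wise view (assumed): every node on the path is its own target."""
    return set(path)


def depth_labels(path):
    """Depth-wise view (assumed): one target per taxonomy level, keyed by depth."""
    return {depth: node for depth, node in enumerate(path)}


def majority_vote(predicted_paths):
    """Stand-in combiner: pick the path predicted most often (not the paper's ensemble)."""
    counts = Counter(path_label(p) for p in predicted_paths)
    return counts.most_common(1)[0][0]


# Hypothetical taxonomy path and three hypothetical classifier outputs.
example = ["Electronics", "Phones", "Smartphones"]
votes = [
    ["Electronics", "Phones", "Smartphones"],
    ["Electronics", "Phones", "Smartphones"],
    ["Electronics", "Phones", "Feature Phones"],
]
winner = majority_vote(votes)
```

Under these assumptions, the same product yields one label in the path-wise view, three labels in the node-wise view, and one label per level in the depth-wise view, which is why errors made by the three classifier types can differ and an ensemble may reduce them.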
- Publication: arXiv e-prints
- Pub Date: June 2016
- DOI: 10.48550/arXiv.1606.06083
- arXiv: arXiv:1606.06083
- Bibcode: 2016arXiv160606083G
- Keywords: Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Information Retrieval