Privacy-preserving Active Learning on Sensitive Data for User Intent Classification

doi:10.48550/arXiv.1903.11112

Active learning holds the promise of significantly reducing data annotation costs while maintaining reasonable model performance. However, it requires sending data to annotators for labeling, which presents a possible privacy leak when the training set includes sensitive user data. In this paper, we describe an approach for carrying out privacy-preserving active learning with quantifiable guarantees. We evaluate our approach by showing the tradeoff between privacy, utility, and annotation budget on a binary classification task in an active learning setting.

Publication:

arXiv e-prints

Pub Date:

March 2019

DOI:

10.48550/arXiv.1903.11112

arXiv:

arXiv:1903.11112

Bibcode:

2019arXiv190311112F

Keywords:

Computer Science - Machine Learning;
Computer Science - Computation and Language;
Statistics - Machine Learning

E-Print:

To appear at PAL: Privacy-Enhancing Artificial Intelligence and Language Technologies as part of the AAAI Spring Symposium Series (AAAI-SSS 2019)
