Robust Importance Weighting for Covariate Shift
Abstract
In many learning problems, the training and testing data follow different distributions, and a particularly common situation is covariate shift. To correct for sampling biases, most approaches, including the popular kernel mean matching (KMM), focus on estimating the importance weights between the two distributions. Reweighting-based methods, however, are exposed to high variance when the distributional discrepancy is large and the weights are poorly estimated. On the other hand, the alternative approach of using nonparametric regression (NR) incurs high bias when the training size is limited. In this paper, we propose and analyze a new estimator that systematically integrates the residuals of NR with KMM reweighting, based on a control-variate perspective. The proposed estimator can be shown to either strictly outperform or match the best-known existing rates for both KMM and NR, and thus is a robust combination of both estimators. The experiments show that the estimator works well in practice.
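The combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact construction: it assumes a one-dimensional covariate, uses a Nadaraya-Watson smoother as the NR step, splits the training sample in half, and substitutes the known Gaussian density ratio for KMM-estimated weights so that the example needs no quadratic-programming solver. All function names, the bandwidth, and the data-generating process are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def true_ratio(x):
    """Stand-in for KMM weights (assumption): exact ratio p_test / p_train."""
    return gaussian_pdf(x, 0.5, 0.8) / gaussian_pdf(x, 0.0, 1.0)


def nadaraya_watson(x_fit, y_fit, x_query, bandwidth=0.3):
    """Kernel-smoothed regression estimate at x_query (the NR step)."""
    diffs = (x_query[:, None] - x_fit[None, :]) / bandwidth
    k = np.exp(-0.5 * diffs ** 2)
    return (k @ y_fit) / np.maximum(k.sum(axis=1), 1e-12)


def combined_estimate(x_tr, y_tr, x_te, weight_fn, bandwidth=0.3):
    """NR prediction on test points plus importance-weighted NR residuals.

    The first half of the training data fits the regression; the second
    half supplies the reweighted residual correction (control variate).
    """
    half = len(x_tr) // 2
    x_fit, y_fit = x_tr[:half], y_tr[:half]
    x_res, y_res = x_tr[half:], y_tr[half:]
    g_te = nadaraya_watson(x_fit, y_fit, x_te, bandwidth)
    g_res = nadaraya_watson(x_fit, y_fit, x_res, bandwidth)
    w = weight_fn(x_res)
    return g_te.mean() + np.mean(w * (y_res - g_res))


# Synthetic covariate shift: train X ~ N(0, 1), test X ~ N(0.5, 0.8^2).
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=2000)
y_train = np.sin(x_train) + rng.normal(0.0, 0.1, size=2000)
x_test = rng.normal(0.5, 0.8, size=2000)

estimate = combined_estimate(x_train, y_train, x_test, true_ratio)
print("estimated test-distribution mean of y:", estimate)
```

If the regression fit were perfect, the residual term would vanish and the estimate would reduce to plain NR; if the regression were identically zero, it would reduce to pure importance weighting. This is the sense in which the residual correction acts as a control variate in the sketch.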
- Publication: arXiv e-prints
- Pub Date: October 2019
- arXiv: arXiv:1910.06324
- Bibcode: 2019arXiv191006324L
- Keywords: Computer Science - Machine Learning; Mathematics - Statistics Theory; Statistics - Machine Learning