Wasserstein Distributionally Robust Optimization with Heterogeneous Data Sources
Abstract
We study decision problems under uncertainty, where the decision-maker has access to $K$ data sources that carry {\em biased} information about the underlying risk factors. The biases are measured by the mismatch between the risk factor distribution and the $K$ data-generating distributions with respect to an optimal transport (OT) distance. In this situation, the decision-maker can exploit the information contained in the biased samples by solving a distributionally robust optimization (DRO) problem, where the ambiguity set is defined as the intersection of $K$ OT neighborhoods, each of which is centered at the empirical distribution on the samples generated by a biased data source. We show that if the decision-maker has a prior belief about the biases, then the out-of-sample performance of the DRO solution can improve with $K$, irrespective of the magnitude of the biases. We also show that, under standard convexity assumptions, the proposed DRO problem is computationally tractable if either $K$ or the dimension of the risk factors is kept constant.
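For orientation, the DRO problem described in the abstract can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $x \in \mathcal{X}$ denotes the decision, $\xi$ the risk factors, $\ell(x,\xi)$ a loss function, $\hat{\mathbb{P}}_k$ the empirical distribution on the samples from data source $k$, $d$ the OT distance, and $r_k \ge 0$ a radius reflecting the decision-maker's prior belief about the bias of source $k$.

```latex
% Hypothetical notation (not from the source): x, \xi, \ell, \hat{\mathbb{P}}_k, d, r_k.
% Ambiguity set: intersection of K OT neighborhoods, one per biased data source.
\[
  \mathcal{B} \;=\; \bigcap_{k=1}^{K}
  \bigl\{ \mathbb{Q} \,:\, d\bigl(\mathbb{Q}, \hat{\mathbb{P}}_k\bigr) \le r_k \bigr\}
\]
% DRO problem: minimize the worst-case expected loss over the ambiguity set.
\[
  \min_{x \in \mathcal{X}} \;\; \sup_{\mathbb{Q} \in \mathcal{B}} \;
  \mathbb{E}_{\xi \sim \mathbb{Q}} \bigl[ \ell(x, \xi) \bigr]
\]
```

Under this reading, each additional data source adds one constraint to $\mathcal{B}$ and can only shrink it, which is one way to interpret the claim that the out-of-sample performance can improve with $K$.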
- Publication: arXiv e-prints
- Pub Date: July 2024
- DOI: 10.48550/arXiv.2407.13582
- arXiv: arXiv:2407.13582
- Bibcode: 2024arXiv240713582R
- Keywords: Mathematics - Optimization and Control; Mathematics - Probability; Mathematics - Statistics Theory