Online Convex Optimization with Perturbed Constraints
Abstract
This paper addresses Online Convex Optimization (OCO) problems where the constraints have additive perturbations that (i) vary over time and (ii) are not known at the time to make a decision. Perturbations may not be i.i.d. generated and can be used to model a time-varying budget or commodity in resource allocation problems. The problem is to design a policy that obtains sublinear regret while ensuring that the constraints are satisfied on average. To solve this problem, we present a primal-dual proximal gradient algorithm that has $O(T^\epsilon \vee T^{1-\epsilon})$ regret and $O(T^\epsilon)$ constraint violation, where $\epsilon \in [0,1)$ is a parameter in the learning rate. Our results match the bounds of previous work on OCO with time-varying constraints when $\epsilon = 1/2$; however, we (i) define the regret using a time-varying set of best fixed decisions; (ii) can balance between regret and constraint violation; and (iii) use an adaptive learning rate that allows us to run the algorithm for any time horizon.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2019
- DOI:
- 10.48550/arXiv.1906.00049
- arXiv:
- arXiv:1906.00049
- Bibcode:
- 2019arXiv190600049V
- Keywords:
-
- Mathematics - Optimization and Control