Online Convex Optimization with Perturbed Constraints
Abstract
This paper addresses Online Convex Optimization (OCO) problems where the constraints have additive perturbations that (i) vary over time and (ii) are not known at the time a decision is made. The perturbations need not be generated i.i.d. and can be used to model a time-varying budget or commodity in resource allocation problems. The problem is to design a policy that obtains sublinear regret while ensuring that the constraints are satisfied on average. To solve this problem, we present a primal-dual proximal gradient algorithm that has $O(T^\epsilon \vee T^{1-\epsilon})$ regret and $O(T^\epsilon)$ constraint violation, where $\epsilon \in [0,1)$ is a parameter in the learning rate. Our results match the bounds of previous work on OCO with time-varying constraints when $\epsilon = 1/2$; however, we (i) define the regret using a time-varying set of best fixed decisions; (ii) can balance between regret and constraint violation; and (iii) use an adaptive learning rate that allows us to run the algorithm for any time horizon.
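As a brief worked reading of the stated bounds (an illustration that assumes $\vee$ denotes the maximum, as is standard; it is not taken from the paper's analysis), the regret and constraint-violation exponents can be compared directly:

$$
\text{regret: } O\!\left(T^{\max\{\epsilon,\,1-\epsilon\}}\right), \qquad \text{constraint violation: } O\!\left(T^{\epsilon}\right), \qquad \epsilon \in [0,1).
$$

$$
\epsilon = \tfrac{1}{2}: \quad \max\{\tfrac{1}{2}, \tfrac{1}{2}\} = \tfrac{1}{2} \;\Rightarrow\; \text{regret } O(\sqrt{T}), \;\; \text{violation } O(\sqrt{T}).
$$

$$
\epsilon < \tfrac{1}{2}: \quad \text{regret } O\!\left(T^{1-\epsilon}\right), \;\; \text{violation } O\!\left(T^{\epsilon}\right).
$$

Under this reading, lowering $\epsilon$ below $1/2$ reduces the violation exponent at the cost of a larger regret exponent, while for $\epsilon > 1/2$ both exponents equal $\epsilon$ and exceed $1/2$, so the trade-off appears to be of interest mainly for $\epsilon \in [0, 1/2]$.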
 Publication:

arXiv e-prints
 Pub Date:
 May 2019
 DOI:
 10.48550/arXiv.1906.00049
 arXiv:
 arXiv:1906.00049
 Bibcode:
 2019arXiv190600049V
 Keywords:

 Mathematics - Optimization and Control