Logarithmic regret in the dynamic and stochastic knapsack problem with equal rewards
Abstract
We study a dynamic and stochastic knapsack problem in which a decision maker is sequentially presented with items arriving according to a Bernoulli process over $n$ discrete time periods. Items have equal rewards and independent weights that are drawn from a known non-negative continuous distribution $F$. The decision maker seeks to maximize the expected total reward of the items that she includes in the knapsack while satisfying a capacity constraint and while making terminal decisions as soon as each item weight is revealed. Under mild regularity conditions on the weight distribution $F$, we prove that the regret---the expected difference between the performance of the best sequential algorithm and that of a prophet who sees all of the weights before making any decision---is, at most, logarithmic in $n$. Our proof is constructive. We devise a reoptimized heuristic that achieves this regret bound.
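To make the setting concrete, the following is a minimal simulation sketch, not the authors' construction. It assumes Uniform(0,1) weights, an arrival probability `p`, and a threshold rule that is re-solved every period from the fluid budget equation $k\,p\,\mathbb{E}[W \mathbf{1}\{W \le h\}] = x$, where $k$ is the number of periods left and $x$ is the remaining capacity. The function names (`prophet_count`, `reoptimized_count`, `estimate_regret`) and the specific threshold rule are illustrative assumptions. The gap it measures is prophet minus this heuristic, which is an upper bound on the regret as defined in the abstract.

```python
import math
import random


def prophet_count(weights, capacity):
    """Offline benchmark: with equal rewards, packing the lightest items first is optimal."""
    total, count = 0.0, 0
    for w in sorted(weights):
        if total + w > capacity:
            break
        total += w
        count += 1
    return count


def reoptimized_count(arrivals, capacity, p):
    """Online threshold policy, re-solved every period (assumes Uniform(0,1) weights, p > 0).

    arrivals[t] is the weight of the item arriving in period t, or None if no item arrives.
    With k periods left and remaining capacity x, the threshold h solves
    k * p * E[W; W <= h] = x, which for Uniform(0,1) reads k * p * h**2 / 2 = x.
    """
    n = len(arrivals)
    x, count = capacity, 0
    for t, w in enumerate(arrivals):
        if w is None:
            continue
        k = n - t  # periods remaining, including the current one
        h = min(1.0, math.sqrt(2.0 * x / (k * p)))
        if w <= min(h, x):  # accept only if under the threshold and feasible
            x -= w
            count += 1
    return count


def estimate_regret(n, capacity, p, trials=2000, seed=0):
    """Monte Carlo estimate of E[prophet reward - heuristic reward]."""
    rng = random.Random(seed)
    gap = 0
    for _ in range(trials):
        # Bernoulli(p) arrival in each period; weight drawn only if an item arrives.
        arrivals = [rng.random() if rng.random() < p else None for _ in range(n)]
        weights = [w for w in arrivals if w is not None]
        gap += prophet_count(weights, capacity) - reoptimized_count(arrivals, capacity, p)
    return gap / trials
```

Calling `estimate_regret(n, capacity, p)` for increasing `n` gives a rough empirical picture of how the gap grows under these assumptions; how the capacity should scale with `n` is a modelling choice left to the reader.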
- Publication: arXiv e-prints
- Pub Date: September 2018
- DOI: 10.48550/arXiv.1809.02016
- arXiv: arXiv:1809.02016
- Bibcode: 2018arXiv180902016A
- Keywords: Mathematics - Probability; Computer Science - Discrete Mathematics; Computer Science - Data Structures and Algorithms; Mathematics - Optimization and Control; 90C39 (Primary); 60C05; 68W27; 68W40; 90C27 (Secondary)
- E-Print: 33 pages, 2 figures