Scalable Policies for the Dynamic Traveling Multi-Maintainer Problem with Alerts
Abstract
Downtime of industrial assets such as wind turbines and medical imaging devices is costly. To avoid such downtime costs, companies seek to initiate maintenance just before failure, which is challenging because: (i) asset failures are notoriously difficult to predict, even in the presence of real-time monitoring devices that signal degradation; and (ii) limited resources are available to serve a network of geographically dispersed assets. In this work, we study the dynamic traveling multi-maintainer problem with alerts ($K$-DTMPA) under perfect condition information, with the objective of devising scalable solution approaches to maintain large networks with $K$ maintenance engineers. Since such large-scale $K$-DTMPA instances are computationally intractable, we propose an iterative deep reinforcement learning (DRL) algorithm that optimizes long-term discounted maintenance costs. The efficiency of the DRL approach is vastly improved by a reformulation of the action space (which relies on the Markov structure of the underlying problem) and by choosing a suitable initial solution. The initial solution is created by extending existing heuristics with a dispatching mechanism. These extensions further serve as compelling benchmarks for tailored instances. We demonstrate through extensive numerical experiments that DRL can solve single-maintainer instances to optimality, regardless of the chosen initial solution. Experiments with hospital networks containing up to $35$ assets show that the proposed DRL algorithm is scalable. Lastly, the trained policies are shown either to be robust against network modifications, such as removing an asset or an engineer, or to yield a suitable initial solution for the DRL approach.
- Publication: arXiv e-prints
- Pub Date: January 2024
- arXiv: arXiv:2401.04574
- Bibcode: 2024arXiv240104574V
- Keywords: Mathematics - Optimization and Control