Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar
Abstract
In this work, we first describe a framework for applying Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz of spectrum with an uncooperative communications system. We examine policy iteration, which handles an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which use a form of Q-learning to approximate a parameterized function that the radar uses to select optimal actions. We show that RL techniques offer benefits over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective.
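To make the policy iteration idea concrete, the following is a minimal sketch on a toy two-band MDP. It is not the authors' implementation. The state space (which of two sub-bands the interferer occupies), the action space (which sub-band the radar transmits in), the persistence probability `stay`, the +1/-1 collision reward, and the discount `gamma` are all hypothetical choices made for illustration. This textbook form of policy iteration also returns a deterministic greedy policy rather than the stochastic mapping described above.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Textbook policy iteration for a small tabular MDP.

    P[a, s, s2] is the probability of moving from state s to s2 under
    action a; R[s, a] is the expected immediate reward.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, states]          # shape (S, S)
        r_pi = R[states, policy]          # shape (S,)
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: greedy one-step lookahead on v.
        q = R + gamma * np.einsum("ast,t->sa", P, v)
        new_policy = np.argmax(q, axis=1)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy

# Hypothetical toy environment (assumption, not from the paper):
# state  = sub-band currently occupied by the interferer (0 or 1)
# action = sub-band the radar transmits in next (0 or 1)
stay = 0.8                                 # interferer keeps its band w.p. 0.8
T = np.array([[stay, 1 - stay],
              [1 - stay, stay]])
P = np.stack([T, T])                       # interferer ignores the radar's action
R = T @ (1 - 2 * np.eye(2))                # expected reward: +1 no collision, -1 collision

policy, v = policy_iteration(P, R)
print(policy, v)
```

With a persistent interferer (`stay` above 0.5), the resulting policy transmits in the band where the interferer was not just observed, which coincides with simple avoidance. Setting `stay` below 0.5 in this toy model reverses the choice, since the interferer is then more likely to hop into the other band.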
- Publication:
- arXiv e-prints
- Pub Date:
- January 2020
- DOI:
- 10.48550/arXiv.2001.01799
- arXiv:
- arXiv:2001.01799
- Bibcode:
- 2020arXiv200101799T
- Keywords:
- Computer Science - Machine Learning;
- Electrical Engineering and Systems Science - Signal Processing;
- Statistics - Machine Learning
- E-Print:
- Accepted for publication at IEEE Intl. Radar Conference, Washington DC, Apr. 2020. This is the author's version of the work