Deep Reinforcement Learning with Swin Transformers

doi:10.48550/arXiv.2206.15269

Transformers are neural network models that use multiple layers of self-attention heads and have shown enormous potential in natural language processing tasks. There have also been efforts to adapt transformers to visual machine learning tasks, including Vision Transformers and Swin Transformers. Although some researchers have used Vision Transformers for reinforcement learning tasks, their experiments remain small in scale because of the high computational cost. This article presents the first online reinforcement learning scheme based on Swin Transformers: Swin DQN. In contrast to existing research, our approach demonstrates superior performance in experiments on 49 games in the Arcade Learning Environment. The results show that our approach achieves significantly higher maximal evaluation scores than the baseline method in 45 of the 49 games (92%), and higher mean evaluation scores than the baseline method in 40 of the 49 games (82%).
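
The two percentages quoted in the abstract follow directly from the game counts. As a quick arithmetic check (a minimal sketch; the constant and function names are illustrative and do not come from the paper), they can be reproduced by rounding each ratio to the nearest whole percent:

```python
# Reproduce the percentages quoted in the abstract from the raw game counts.
# All names below are hypothetical, chosen only for this illustration.
TOTAL_GAMES = 49       # games evaluated in the Arcade Learning Environment
WINS_MAX_SCORE = 45    # games with a higher maximal evaluation score
WINS_MEAN_SCORE = 40   # games with a higher mean evaluation score


def win_rate_percent(wins: int, total: int) -> int:
    """Return wins/total as a percentage rounded to the nearest integer."""
    return round(100 * wins / total)


max_pct = win_rate_percent(WINS_MAX_SCORE, TOTAL_GAMES)
mean_pct = win_rate_percent(WINS_MEAN_SCORE, TOTAL_GAMES)
print(f"max-score wins: {max_pct}%, mean-score wins: {mean_pct}%")
```

The unrounded ratios are roughly 91.8% and 81.6%, which round to the 92% and 82% reported.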

Publication: arXiv e-prints
Pub Date: June 2022
DOI: 10.48550/arXiv.2206.15269
arXiv: arXiv:2206.15269
Bibcode: 2022arXiv220615269M
Keywords: Computer Science - Machine Learning
