Acquisition of chess knowledge in AlphaZero
Abstract
Seventy years ago, Alan Turing conjectured that a chess-playing machine could be built that would self-learn and continuously profit from its own experience. The AlphaZero system—a neural network-powered reinforcement learner—passed this milestone. In this paper, we ask the following questions. How did it do it? What did it learn from its experience, and how did it encode it? Did it learn anything like a human understanding of chess, in spite of having never seen a human game? Remarkably, we find many strong correspondences between human concepts and AlphaZero's representations that emerge during training, even though none of these concepts were initially present in the network.
- Publication:
- Proceedings of the National Academy of Sciences
- Pub Date:
- November 2022
- DOI:
- 10.1073/pnas.2206625119
- arXiv:
- arXiv:2111.09259
- Bibcode:
- 2022PNAS..11906625M
- Keywords:
- Computer Science - Artificial Intelligence;
- Statistics - Machine Learning
- E-Print:
- 69 pages, 44 figures