Acquisition of chess knowledge in AlphaZero

Seventy years ago, Alan Turing conjectured that a chess-playing machine could be built that would self-learn and continuously profit from its own experience. The AlphaZero system—a neural network-powered reinforcement learner—passed this milestone. In this paper, we ask the following questions. How did it do it? What did it learn from its experience, and how did it encode it? Did it learn anything like a human understanding of chess, in spite of having never seen a human game? Remarkably, we find many strong correspondences between human concepts and AlphaZero's representations that emerge during training, even though none of these concepts were initially present in the network.

Publication: Proceedings of the National Academy of Sciences

Pub Date: November 2022

DOI: 10.1073/pnas.2206625119

arXiv: arXiv:2111.09259

Bibcode: 2022PNAS..11906625M

Keywords: Computer Science - Artificial Intelligence; Statistics - Machine Learning

E-Print: 69 pages, 44 figures