Learning Structure-Aware Representations of Dependent Types

Agda is a dependently-typed programming language and a proof assistant, pivotal in proof formalization and programming language theory. This paper extends the Agda ecosystem into machine learning territory, and, vice versa, makes Agda-related resources available to machine learning practitioners. We introduce and release a novel dataset of Agda program-proofs that is elaborate and extensive enough to support various machine learning applications -- the first of its kind. Leveraging the dataset's ultra-high resolution, detailing proof states at the sub-type level, we propose a novel neural architecture targeted at faithfully representing dependently-typed programs on the basis of structural rather than nominal principles. We instantiate and evaluate our architecture in a premise selection setup, where it achieves strong initial results.

Publication: arXiv e-prints
Pub Date: February 2024
DOI: 10.48550/arXiv.2402.02104
arXiv: arXiv:2402.02104
Bibcode: 2024arXiv240202104K
Keywords: Computer Science - Machine Learning; Computer Science - Programming Languages
E-Print: 15 pages, submitted to ICML2024
