Describing the syntax of programming languages using conjunctive and Boolean grammars
Abstract
A classical result by Floyd ("On the non-existence of a phrase structure grammar for ALGOL 60", 1962) states that the complete syntax of any sensible programming language cannot be described by the ordinary kind of formal grammars (Chomsky's ``context-free''). This paper uses grammars extended with conjunction and negation operators, known as conjunctive grammars and Boolean grammars, to describe the set of well-formed programs in a simple typeless procedural programming language. A complete Boolean grammar, which defines such concepts as declaration of variables and functions before their use, is constructed and explained. Using the Generalized LR parsing algorithm for Boolean grammars, a program can then be parsed in time $O(n^4)$ in its length, while another known algorithm allows subcubic-time parsing. Next, it is shown how to transform this grammar to an unambiguous conjunctive grammar, with square-time parsing. This becomes apparently the first specification of the syntax of a programming language entirely by a computationally feasible formal grammar.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2020
- DOI:
- 10.48550/arXiv.2012.03538
- arXiv:
- arXiv:2012.03538
- Bibcode:
- 2020arXiv201203538O
- Keywords:
-
- Computer Science - Formal Languages and Automata Theory;
- Computer Science - Programming Languages