We consider languages generated by weighted context-free grammars. It is shown that the behaviour of large texts is controlled by saddle-point equations for an appropriate generating function. We then consider ensembles of grammars, in particular the Random Language Model of E. DeGiuli, Phys. Rev. Lett., 122, 128301, 2019. This model is solved in the replica-symmetric ansatz, which is valid in the high-temperature, disordered phase. It is shown that in the phase in which languages carry information, the replica symmetry must be broken.
- Pub Date: February 2019
- Keywords: Condensed Matter - Disordered Systems and Neural Networks; Computer Science - Computation and Language; Computer Science - Formal Languages and Automata Theory
- 16 pages + 1 appendix