Clique-Based Lower Bounds for Parsing Tree-Adjoining Grammars
Abstract
Tree-adjoining grammars are a generalization of context-free grammars that are well suited to model human languages and are thus popular in computational linguistics. In the tree-adjoining grammar recognition problem, given a grammar $\Gamma$ and a string $s$ of length $n$, the task is to decide whether $s$ can be obtained from $\Gamma$. Rajasekaran and Yooseph's parser (JCSS'98) solves this problem in time $O(n^{2\omega})$, where $\omega < 2.373$ is the matrix multiplication exponent. The best algorithms avoiding fast matrix multiplication take time $O(n^6)$. The first evidence for hardness was given by Satta (J. Comp. Linguist.'94): For a more general parsing problem, any algorithm that avoids fast matrix multiplication and is significantly faster than $O(|\Gamma| n^6)$ in the case of $|\Gamma| = \Theta(n^{12})$ would imply a breakthrough for Boolean matrix multiplication. Following an approach by Abboud et al. (FOCS'15) for context-free grammar recognition, in this paper we resolve many of the disadvantages of the previous lower bound. We show that, even on constant-size grammars, any improvement on Rajasekaran and Yooseph's parser would imply a breakthrough for the $k$-Clique problem. This establishes tree-adjoining grammar parsing as a practically relevant problem with the unusual running time of $n^{2\omega}$, up to lower order factors.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2018
- DOI:
- 10.48550/arXiv.1803.00804
- arXiv:
- arXiv:1803.00804
- Bibcode:
- 2018arXiv180300804B
- Keywords:
-
- Computer Science - Computational Complexity;
- Computer Science - Data Structures and Algorithms;
- F.2.2
- E-Print:
- Presented at CPM'17. 15 pages