On Approximating Functions of the Singular Values in a Stream
Abstract
For any real number $p > 0$, we nearly completely characterize the space complexity of estimating $\|A\|_p^p = \sum_{i=1}^n \sigma_i^p$ for $n \times n$ matrices $A$ in which each row and each column has $O(1)$ nonzero entries and whose entries are presented one at a time in a data stream model. Here the $\sigma_i$ are the singular values of $A$, and when $p \geq 1$, $\|A\|_p^p$ is the $p$-th power of the Schatten $p$-norm. We show that when $p$ is not an even integer, to obtain a $(1+\epsilon)$-approximation to $\|A\|_p^p$ with constant probability, any $1$-pass algorithm requires $n^{1-g(\epsilon)}$ bits of space, where $g(\epsilon) \rightarrow 0$ as $\epsilon \rightarrow 0$ and $\epsilon > 0$ is a constant independent of $n$. However, when $p$ is an even integer, we give an upper bound of $n^{1-2/p} \textrm{poly}(\epsilon^{-1}\log n)$ bits of space, which holds even in the turnstile data stream model. The latter is optimal up to $\textrm{poly}(\epsilon^{-1} \log n)$ factors. Our results considerably strengthen lower bounds in previous work for arbitrary (not necessarily sparse) matrices $A$: the previous best lower bound was $\Omega(\log n)$ for $p\in (0,1)$, $\Omega(n^{1/p-1/2}/\log n)$ for $p\in [1,2)$ and $\Omega(n^{1-2/p})$ for $p\in (2,\infty)$. We note for $p \in (2, \infty)$, while our lower bound for even integers is the same, for other $p$ in this range our lower bound is $n^{1-g(\epsilon)}$, which is considerably stronger than the previous $n^{1-2/p}$ for small enough constant $\epsilon > 0$. We obtain similar near-linear lower bounds for Ky-Fan norms, SVD entropy, eigenvalue shrinkers, and M-estimators, many of which could have been solvable in logarithmic space prior to our work.
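To make the quantity concrete, the following is a minimal offline sketch (not the paper's streaming algorithm) that computes $\sum_i \sigma_i^p$ from a full SVD with NumPy. It also checks the standard identity $\sum_i \sigma_i^p = \mathrm{tr}((A^T A)^{p/2})$, which holds when $p$ is an even integer and gives some intuition for why that case behaves differently. The function names and the small test matrix are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def schatten_p_power(A, p):
    """Return sum_i sigma_i(A)^p using a full SVD (offline, stores all of A)."""
    sigma = np.linalg.svd(A, compute_uv=False)  # singular values only
    return float(np.sum(sigma ** p))


def schatten_even_power_via_trace(A, p):
    """For a positive even integer p, sum_i sigma_i^p = trace((A^T A)^(p/2))."""
    if p <= 0 or p % 2 != 0:
        raise ValueError("p must be a positive even integer")
    gram = A.T @ A
    return float(np.trace(np.linalg.matrix_power(gram, p // 2)))


def sparse_example(n):
    """Hypothetical test matrix with O(1) nonzeros per row and column:
    2 on the diagonal and 1 on the superdiagonal."""
    return 2.0 * np.eye(n) + np.eye(n, k=1)


if __name__ == "__main__":
    A = sparse_example(6)
    for p in (0.5, 1, 2, 3, 4):
        print(p, schatten_p_power(A, p))
    # For even p the SVD-free trace formula agrees with the SVD value.
    print(schatten_even_power_via_trace(A, 4))
```

For even $p$ the trace form expands into a sum of products of entries of $A$, with no singular values needed; for other $p$ no such polynomial expression in the entries is available, which is consistent with the gap between the two cases described in the abstract.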
 Publication:

arXiv e-prints
 Pub Date:
 April 2016
 arXiv:
 arXiv:1604.08679
 Bibcode:
 2016arXiv160408679L
 Keywords:

 Computer Science - Data Structures and Algorithms
 E-Print:
 fixed a flaw in Section 6