On model misspecification and KL separation for Gaussian graphical models
Abstract
We establish bounds on the KL divergence between two multivariate Gaussian distributions in terms of the Hamming distance between the edge sets of the corresponding graphical models. We show that the KL divergence is bounded below by a constant when the graphs differ by at least one edge; this is essentially the tightest possible bound, since classes of graphs exist for which the edge discrepancy increases but the KL divergence remains bounded above by a constant. As a natural corollary to our KL lower bound, we also establish a sample size requirement for correct model selection via maximum likelihood estimation. Our results make rigorous the notion that accurately estimating the edge structure of a Gaussian graphical model is essential for closely approximating the true distribution.
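To make the two quantities in the abstract concrete, the following minimal sketch computes the KL divergence between two zero-mean Gaussians from their precision matrices, and the Hamming distance between the edge sets those matrices imply. It uses the standard closed form for the Gaussian KL divergence and is not taken from the paper; the function names `gaussian_kl` and `edge_hamming`, the zero-mean assumption, the tolerance `tol`, and the example matrices are all illustrative assumptions.

```python
import numpy as np


def gaussian_kl(theta_p, theta_q):
    """KL(P || Q) for zero-mean Gaussians P, Q given by precision matrices.

    Standard closed form (assumed zero means):
        0.5 * (tr(Theta_q Sigma_p) - p + log det Theta_p - log det Theta_q)
    """
    p = theta_p.shape[0]
    sigma_p = np.linalg.inv(theta_p)
    _, logdet_p = np.linalg.slogdet(theta_p)
    _, logdet_q = np.linalg.slogdet(theta_q)
    return 0.5 * (np.trace(theta_q @ sigma_p) - p + logdet_p - logdet_q)


def edge_hamming(theta_a, theta_b, tol=1e-12):
    """Number of vertex pairs that are an edge in exactly one of the two graphs.

    An edge (i, j) is present when the off-diagonal precision entry is nonzero
    (here: larger than the illustrative tolerance `tol` in absolute value).
    """
    edges_a = np.abs(np.triu(theta_a, k=1)) > tol
    edges_b = np.abs(np.triu(theta_b, k=1)) > tol
    return int(np.sum(edges_a != edges_b))


# Hypothetical example: the empty graph on 3 nodes versus a graph with one edge.
theta_empty = np.eye(3)
theta_one_edge = np.eye(3)
theta_one_edge[0, 1] = theta_one_edge[1, 0] = 0.4

hamming = edge_hamming(theta_empty, theta_one_edge)
kl = gaussian_kl(theta_empty, theta_one_edge)
print(f"edge Hamming distance: {hamming}, KL divergence: {kl:.4f}")
```

In this toy pair the graphs differ by a single edge and the KL divergence is strictly positive, which is the situation the lower bound addresses. The size of the gap here depends on the chosen entry 0.4 and says nothing about the constant in the paper's bound.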
- Publication: arXiv e-prints
- Pub Date: January 2015
- DOI: 10.48550/arXiv.1501.02320
- arXiv: arXiv:1501.02320
- Bibcode: 2015arXiv150102320J
- Keywords: Computer Science - Information Theory; Mathematics - Statistics Theory; Statistics - Machine Learning; 62B10
- E-Print: Accepted to ISIT 2015