Semiparametric Exponential Families for Heavy-Tailed Data
Abstract
We propose a semiparametric method for fitting the tail of a heavy-tailed population given a relatively small sample from that population and a larger sample from a related background population. We model the tail of the small sample as an exponential tilt of the better-observed large-sample tail, using a robust sufficient statistic motivated by extreme value theory. In particular, our method induces an estimator of the small-population mean, and we give theoretical and empirical evidence that this estimator outperforms methods that do not use the background sample. We demonstrate substantial efficiency gains over competing methods in simulation and on data from a large controlled experiment conducted by Facebook.
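The exponential-tilt idea in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator; it rests on several stated assumptions. The sufficient statistic is taken to be T(x) = log(x / u) for a hypothetical threshold u, the background tail is treated as a fixed empirical carrier, and the tilt parameter is fit by matching the mean of T between the tilted background tail and the small-sample tail. All function names (`tilt_weights`, `fit_theta`, `tilted_tail_mean`) are illustrative.

```python
import numpy as np

def tilt_weights(t_bg, theta):
    """Normalized weights w_i proportional to exp(theta * t_i) on background tail points."""
    z = theta * t_bg
    z = z - z.max()  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def fit_theta(t_small, t_bg, lo=-20.0, hi=20.0, tol=1e-10):
    """Solve sum_i w_i(theta) * t_i = mean(t_small) by bisection.

    The tilted mean is nondecreasing in theta (its derivative is a variance),
    so bisection on the assumed bracket [lo, hi] is sufficient.
    """
    target = t_small.mean()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilt_weights(t_bg, mid) @ t_bg < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tilted_tail_mean(x_small, x_bg, threshold):
    """Return (theta, estimated mean of the small population's tail above threshold)."""
    x_small = np.asarray(x_small, dtype=float)
    x_bg = np.asarray(x_bg, dtype=float)
    tail_small = x_small[x_small > threshold]
    tail_bg = x_bg[x_bg > threshold]
    # Assumed sufficient statistic: T(x) = log(x / threshold).
    t_small = np.log(tail_small / threshold)
    t_bg = np.log(tail_bg / threshold)
    theta = fit_theta(t_small, t_bg)
    w = tilt_weights(t_bg, theta)
    # Tail mean under the tilted background distribution.
    return theta, float(w @ tail_bg)

# Usage on simulated Pareto data (shape parameters chosen arbitrarily).
rng = np.random.default_rng(0)
background = rng.pareto(2.5, size=20000) + 1.0  # large background sample
small = rng.pareto(2.0, size=200) + 1.0         # small, heavier-tailed sample
theta_hat, tail_mean_hat = tilted_tail_mean(small, background, threshold=2.0)
print(theta_hat, tail_mean_hat)
```

The sketch estimates only the conditional mean above the threshold. A full mean estimate would also need the small sample's below-threshold mean and its exceedance probability, and the bisection bracket is an assumption that fails if the target lies outside the range attainable by tilting.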
- Publication: arXiv e-prints
- Pub Date: July 2013
- DOI: 10.48550/arXiv.1307.7830
- arXiv: arXiv:1307.7830
- Bibcode: 2013arXiv1307.7830F
- Keywords: Statistics - Methodology; 62G32; 62G35 (Primary) 62G20 (Secondary)
- E-Print: To appear in Biometrika