Automated Selection of r for the r Largest Order Statistics Approach with Adjustment for Sequential Testing
Abstract
The r largest order statistics approach is widely used in extreme value analysis because it may use more information from the data than just the block maxima. In practice, the choice of r is critical. If r is too large, bias can occur; if r is too small, the variance of the estimator can be high. The limiting distribution of the r largest order statistics, denoted by GEVr, extends that of the block maxima. Two specification tests are proposed to select r sequentially. The first is a score test for the GEVr distribution. Due to the special characteristics of the GEVr distribution, the classical chi-square asymptotics cannot be used. The simplest approach is to use the parametric bootstrap, which is straightforward to implement but computationally expensive. An alternative fast weighted bootstrap, or multiplier, procedure is developed for computational efficiency. The second test uses the difference in estimated entropy between the GEVr and GEV(r-1) models, applied to the r largest order statistics and the r-1 largest order statistics, respectively. The asymptotic distribution of the difference statistic is derived. In a large-scale simulation study, both tests held their size and had substantial power to detect various misspecification schemes. A new approach to multiple, sequential hypothesis testing is adapted to this setting to control the false discovery rate or familywise error rate. The utility of the procedures is demonstrated with extreme sea level and precipitation data.
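To make the GEVr model concrete, here is a minimal sketch of its log-density for a single block, written from the standard form of the joint density of the r largest order statistics. It is not code from the paper. The function name `gevr_loglik` and its argument names are hypothetical, and the sketch assumes a nonzero shape parameter (the Gumbel limit is not handled).

```python
import math


def gevr_loglik(x, mu, sigma, xi):
    """Log-density of one block's r largest order statistics under GEVr.

    x     : the r largest observations, sorted in decreasing order
            (x[0] is the block maximum)
    mu    : location parameter
    sigma : scale parameter (must be positive)
    xi    : shape parameter (assumed nonzero in this sketch)
    """
    if len(x) == 0:
        raise ValueError("x must contain at least one observation")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if xi == 0:
        raise ValueError("the xi = 0 (Gumbel) case is not handled here")
    if any(a < b for a, b in zip(x, x[1:])):
        raise ValueError("x must be sorted in decreasing order")

    # w_j = 1 + xi * (x_j - mu) / sigma must be positive on the support.
    w = [1.0 + xi * (xj - mu) / sigma for xj in x]
    if min(w) <= 0:
        return -math.inf

    r = len(x)
    # The exponential term depends only on the smallest retained value x_r;
    # every retained value contributes one log term.
    return (
        -r * math.log(sigma)
        - w[-1] ** (-1.0 / xi)
        - (1.0 / xi + 1.0) * sum(math.log(wj) for wj in w)
    )
```

With r = 1 this reduces to the ordinary GEV log-density of the block maximum, which is the sense in which GEVr extends the block maxima model. Summing the function over blocks gives the log-likelihood that a fit for a candidate r would maximize.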
- Publication:
arXiv e-prints
- Pub Date:
- April 2016
- DOI:
- 10.48550/arXiv.1604.01984
- arXiv:
- arXiv:1604.01984
- Bibcode:
- 2016arXiv160401984B
- Keywords:
- Statistics - Methodology;
- Mathematics - Statistics Theory
- E-Print:
- 21 pages (with tables and figures), 11 pages without, 6 tables, 7 figures