Outlier detection and a tail-adjusted boxplot based on extreme value theory
Abstract
Whether an extreme observation is an outlier or not, depends strongly on the corresponding tail behaviour of the underlying distribution. We develop an automatic, data-driven method to identify extreme tail behaviour that deviates from the intermediate and central characteristics. This allows for detecting extreme outliers or sets of extreme data that show less spread than the bulk of the data. To this end we extend a testing method proposed in Bhattacharya et al 2019 for the specific case of heavy tailed models, to all max-domains of attraction. Consequently we propose a tail-adjusted boxplot which yields a more accurate representation of possible outliers. Several examples and simulation results illustrate the finite sample behaviour of this approach.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2019
- DOI:
- 10.48550/arXiv.1912.02595
- arXiv:
- arXiv:1912.02595
- Bibcode:
- 2019arXiv191202595B
- Keywords:
-
- Statistics - Methodology;
- Mathematics - Statistics Theory