Multiple Outliers in Small Samples
Abstract
Z-scores are often used to detect outliers in a dataset. For small samples, the presence of multiple outliers imposes a finite supremum on the absolute value of the attainable z-scores, and this supremum decreases as the number of outliers increases, creating a "masking effect" that hinders identification of true outliers. We give an illustrative case study in which accurately detecting the number of outliers is critical, and provide a closed-form expression for the maximum possible z-score in terms of the sample size and the number of outliers. In addition, we perform a corresponding analysis of the $t$-statistic.
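The masking effect described in the abstract can be illustrated numerically. The sketch below is not taken from the paper and does not reproduce its closed-form expression. It assumes one specific configuration: $k$ outliers sharing a single value, $n-k$ inliers sharing another, and z-scores computed with the sample standard deviation (divisor $n-1$). Under those assumptions the outliers' z-score works out to $\sqrt{(n-1)(n-k)/(nk)}$, which for $k=1$ reduces to the familiar bound $(n-1)/\sqrt{n}$. The function names and the `offset` parameter are hypothetical and chosen only for illustration.

```python
import math
import statistics


def z_scores(data):
    """Return z-scores using the sample standard deviation (divisor n - 1)."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [(x - mean) / sd for x in data]


def coincident_outlier_z(n, k):
    """z-score of k coincident outliers among n - k coincident inliers.

    Assumption for this sketch only: all outliers share one value and all
    inliers share another. This is not claimed to be the paper's formula.
    """
    return math.sqrt((n - 1) * (n - k) / (n * k))


def demo(n=10, offset=1e6, max_k=3):
    """Compare observed and derived z-scores for k = 1..max_k outliers."""
    rows = []
    for k in range(1, max_k + 1):
        # n - k inliers at 0 and k outliers at an arbitrarily large offset.
        data = [0.0] * (n - k) + [offset] * k
        rows.append((k, max(z_scores(data)), coincident_outlier_z(n, k)))
    return rows


if __name__ == "__main__":
    for k, observed, derived in demo():
        print(f"k={k}: observed z = {observed:.4f}, derived z = {derived:.4f}")
```

In this configuration the z-score does not depend on `offset`: moving the outliers farther away does not raise their z-score, while adding more of them lowers it. With a fixed cutoff such as $|z| > 3$, a small sample may therefore flag none of them.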
- Publication: arXiv e-prints
- Pub Date: January 2016
- DOI: 10.48550/arXiv.1601.07521
- arXiv: arXiv:1601.07521
- Bibcode: 2016arXiv160107521C
- Keywords: Mathematics - Statistics Theory
- E-Print: This paper has been withdrawn due to additional research which has led to new conclusions