Non-Parametric Methods Applied to the N-Sample Series Comparison
Abstract
Anomaly and similarity detection in multidimensional series have a long history and have found practical use in many fields, such as medicine, networks, and finance. Anomaly detection appeals to many disciplines: for example, mathematicians searching for a unified mathematical formulation based on probability, statisticians searching for error bound estimates, and computer scientists trying to design fast algorithms, to name just a few. We make two contributions. First, we present a self-contained survey of the most promising methods in use today in machine learning, statistics, and bioinformatics, including discussions of conformal prediction, kernels in the Hilbert space, Kolmogorov's information measure, and non-parametric cumulative distribution function comparison methods (NCDF). Second, building upon this foundation, we provide a powerful NCDF method for series with small dimensionality. Through a combination of data organization and statistical tests, we describe extensions that scale well with increased dimensionality.
- Publication: arXiv e-prints
- Pub Date: May 2012
- DOI:
- arXiv: arXiv:1205.1880
- Bibcode: 2012arXiv1205.1880D
- Keywords: Statistics - Computation; G.3
- E-Print: 65 pages, 21 figures