Visualizing Dependence in High-Dimensional Data: An Application to S&P 500 Constituent Data
Abstract
The notions of a zenpath and a zenplot are introduced to search for and detect dependence in high-dimensional data for model building and statistical inference. Using any measure of dependence between two random variables (such as correlation, Spearman's rho, Kendall's tau, or tail dependence), a zenpath constructs paths through pairs of variables in different ways; these paths can then be laid out and displayed by a zenplot. The approach is illustrated by investigating tail dependence and model fit in constituent data of the S&P 500 during the financial crisis of 2007-2008. The corresponding Global Industry Classification Standard (GICS) sector information is also addressed. Zenpaths and zenplots are useful tools for exploring dependence in high-dimensional data, for example, from the realm of finance, insurance and quantitative risk management. All presented algorithms are implemented in the R package zenplots, and all examples and graphics in the paper can be reproduced using the accompanying demo SP500.
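To make the idea of a path through pairs of variables concrete, the following is a minimal Python sketch. It is not the algorithm of the R package zenplots: the function names `kendall_tau` and `greedy_dependence_path`, the greedy ordering rule, and the toy data are all assumptions made for illustration. It scores every pair of variables by the absolute value of Kendall's tau, starts from the strongest pair, and repeatedly appends the unvisited variable most dependent on the current end of the path.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (tied pairs add zero)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def greedy_dependence_path(data):
    """Order variable names so that strongly dependent variables are adjacent.

    Illustrative greedy rule (an assumption, not the zenplots method):
    begin with the most dependent pair, then extend from the path's end.
    """
    names = list(data)
    strength = {
        frozenset((a, b)): abs(kendall_tau(data[a], data[b]))
        for a, b in combinations(names, 2)
    }
    first = max(strength, key=strength.get)
    path = sorted(first)
    remaining = [n for n in names if n not in first]
    while remaining:
        end = path[-1]
        nxt = max(remaining, key=lambda n: strength[frozenset((end, n))])
        path.append(nxt)
        remaining.remove(nxt)
    return path


# Hypothetical toy data: four variables observed six times.
toy = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [1, 2, 3, 4, 6, 5],
    "C": [2, 1, 4, 3, 6, 5],
    "D": [3, 6, 1, 5, 2, 4],
}
path = greedy_dependence_path(toy)
# Consecutive variables on the path are the pairs a layout would display.
pairs = list(zip(path, path[1:]))
```

On this toy data the path is A, B, C, D, so the pairs to display would be (A, B), (B, C) and (C, D). Any other pairwise measure, such as an estimate of tail dependence, could be substituted for `kendall_tau` without changing the path construction.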
- Publication: arXiv e-prints
- Pub Date: September 2016
- DOI: 10.48550/arXiv.1609.09429
- arXiv: arXiv:1609.09429
- Bibcode: 2016arXiv160909429H
- Keywords: Statistics - Applications; 62-09; 62H99; 65C60
- E-Print: The figures had to be massively reduced in size in order for the paper to fulfill the 10M limit