Separation of the largest eigenvalues in eigenanalysis of genotype data from discrete subpopulations
Abstract
We present a mathematical model, and the corresponding mathematical analysis, that justifies and quantifies the use of principal component analysis of biallelic genetic marker data for a set of individuals to detect the number of subpopulations represented in the data. We indicate that the power of the technique relies more on the number of individuals genotyped than on the number of markers.
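The idea in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or analysis: it assumes a Balding-Nichols-style model for subpopulation allele frequencies, normalizes each marker by its pooled sample mean and binomial variance, and counts eigenvalues above a Marchenko-Pastur-style bulk edge multiplied by an arbitrary slack factor. The function names, parameter values, and threshold are all hypothetical choices made for illustration. With column centering, K subpopulations are expected to produce roughly K-1 separated leading eigenvalues.

```python
import numpy as np


def simulate_genotypes(n_per_pop, n_markers, fst, rng):
    """Simulate biallelic genotypes (0, 1, 2) for discrete subpopulations.

    Assumption (not from the paper): subpopulation allele frequencies are
    drawn from a Balding-Nichols-style Beta distribution around a shared
    ancestral frequency.
    """
    k = len(n_per_pop)
    ancestral = rng.uniform(0.1, 0.9, size=n_markers)
    a = ancestral * (1.0 - fst) / fst
    b = (1.0 - ancestral) * (1.0 - fst) / fst
    pop_freqs = rng.beta(a, b, size=(k, n_markers))
    # Row i of the genotype matrix uses the frequencies of individual i's subpopulation.
    labels = np.repeat(np.arange(k), n_per_pop)
    return rng.binomial(2, pop_freqs[labels])


def count_separated_eigenvalues(genotypes, slack=2.0):
    """Count eigenvalues lying well above the expected noise bulk.

    Assumption: the threshold is slack * (1 + sqrt(n/m))**2, a heuristic
    based on the Marchenko-Pastur upper edge for unit-variance noise.
    """
    g = genotypes.astype(float)
    p_hat = g.mean(axis=0) / 2.0
    keep = (p_hat > 0.0) & (p_hat < 1.0)  # drop monomorphic markers
    g, p_hat = g[:, keep], p_hat[keep]
    # Center each marker and scale to roughly unit variance.
    x = (g - 2.0 * p_hat) / np.sqrt(2.0 * p_hat * (1.0 - p_hat))
    n, m = x.shape
    # Eigenvalues of the individuals-by-individuals matrix, largest first.
    eigvals = np.linalg.eigvalsh(x @ x.T / m)[::-1]
    edge = (1.0 + np.sqrt(n / m)) ** 2
    return int(np.sum(eigvals > slack * edge)), eigvals


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes = simulate_genotypes([40, 40, 40], n_markers=2000, fst=0.1, rng=rng)
    count, eigvals = count_separated_eigenvalues(genotypes)
    print("separated eigenvalues:", count)
    print("top five eigenvalues:", np.round(eigvals[:5], 2))
```

In this setup the separated eigenvalues grow roughly in proportion to the number of individuals per subpopulation, while adding markers mainly tightens the noise bulk, which is in line with the abstract's remark about the number of individuals mattering more than the number of markers.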
- Publication: Theoretical Population Biology
- Pub Date: November 2013
- DOI: 10.1016/j.tpb.2013.08.004
- arXiv: 1301.4511
- Bibcode: 2013TPBio..89...34B
- Keywords: Principal components analysis; Eigenanalysis; Population structure; Eigenvalues; Number of subpopulations; Quantitative Biology - Populations and Evolution
- E-Print: Corrected typos in Section 3.1 (M=120, N=2500) and proof of Lemma 2