Combinatorics of a dissimilarity measure for pairs of draws from discrete probability vectors on finite sets of objects
Abstract
Motivated by a problem in population genetics, we examine the combinatorics of dissimilarity for pairs of random unordered draws of multiple objects, with replacement, from a collection of distinct objects. Consider two draws of size $K$ taken with replacement from a set of $I$ objects, where the two draws represent samples from potentially distinct probability distributions over the set of $I$ objects. We define the set of \emph{identity states} for pairs of draws via a series of actions by permutation groups, describing the enumeration of all such states for a given $K \geq 2$ and $I \geq 2$. Given two probability vectors for the $I$ objects, we compute the probability of each identity state. From the set of all such probabilities, we obtain the expectation for a dissimilarity measure, finding that it has a simple form that generalizes a result previously obtained for the case of $K=2$. We determine when the expected dissimilarity between two draws from the same probability distribution exceeds that of two draws taken from different probability distributions. We interpret the results in the setting of the genetics of polyploid organisms, those whose genetic material contains many copies of the genome ($K > 2$).
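The setup in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's method: it enumerates unordered draws of size $K$ with replacement from $I$ objects, assigns each draw its multinomial probability under a given probability vector, and averages a dissimilarity over all pairs of draws. The abstract does not define the dissimilarity measure, so `cross_dissimilarity` (the fraction of between-draw element pairs that differ) is an assumption chosen for illustration, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement, product
from math import comb, factorial, prod


def unordered_draws(I, K):
    """All unordered draws (multisets) of size K from objects 0..I-1."""
    return list(combinations_with_replacement(range(I), K))


def draw_probability(draw, p):
    """Multinomial probability of an unordered draw under probability vector p."""
    counts = Counter(draw)
    coef = factorial(len(draw))
    for c in counts.values():
        coef //= factorial(c)  # exact at every step: partial multinomial coefficient
    return coef * prod(p[i] ** c for i, c in counts.items())


def cross_dissimilarity(a, b):
    """ASSUMED measure (not taken from the paper): fraction of pairs
    (x from a, y from b) in which the two objects differ."""
    mismatches = sum(x != y for x, y in product(a, b))
    return mismatches / (len(a) * len(b))


def expected_dissimilarity(p, q, K):
    """Brute-force expectation over all pairs of unordered draws,
    the first drawn from p and the second from q."""
    draws = unordered_draws(len(p), K)
    return sum(
        draw_probability(a, p) * draw_probability(b, q) * cross_dissimilarity(a, b)
        for a in draws
        for b in draws
    )


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    q = [0.1, 0.1, 0.8]
    print(len(unordered_draws(3, 3)), comb(3 + 3 - 1, 3))
    print(expected_dissimilarity(p, q, 3))
    print(1 - sum(pi * qi for pi, qi in zip(p, q)))
```

Under this assumed measure, linearity of expectation gives $1 - \sum_i p_i q_i$ for every $K$, which the brute-force sum reproduces. Whether this coincides with the simple form reported in the paper cannot be determined from the abstract alone.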
- Publication: arXiv e-prints
- Pub Date: September 2024
- DOI: 10.48550/arXiv.2410.00221
- arXiv: arXiv:2410.00221
- Bibcode: 2024arXiv241000221A
- Keywords: Mathematics - Combinatorics; Quantitative Biology - Populations and Evolution; 05A05; 05A15; 05A17; 20B05; 92D10
- E-Print: 14 pages, 0 figures