Unbiased mixed variables distance
Abstract
Defining a distance in a mixed setting requires quantifying observed differences for variables of different types and for variables that are measured on different scales. Several mixed variable distances have been proposed; however, such distances tend to be biased towards specific variable types and measurement units. That is, the variable types and scales influence the contribution of individual variables to the overall distance. In this paper, we define unbiased mixed variable distances, for which the contributions of individual variables to the overall distance are not influenced by measurement types or scales. We define the relevant concepts to quantify such biases, and we provide a general formulation that can be used to construct unbiased mixed variable distances.
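The kind of bias described in the abstract can be illustrated with a small sketch. It assumes an additive distance (a sum of per-variable distances) and measures each variable's contribution by its mean pairwise distance; dividing each per-variable distance by that mean is one simple way to equalise contributions. The toy data, the helper names (`mean_pairwise`, `scaled`) and the scaling rule itself are assumptions made for illustration, not the formulation proposed in the paper.

```python
from itertools import combinations


def mean_pairwise(values, dist):
    """Average of dist(a, b) over all unordered pairs of observations."""
    pairs = list(combinations(values, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def abs_diff(a, b):
    """Absolute difference for a numerical variable."""
    return abs(a - b)


def mismatch(a, b):
    """Simple matching distance for a categorical variable."""
    return 0.0 if a == b else 1.0


def scaled(values, dist):
    """Return dist divided by its mean pairwise value on `values`.

    Illustrative assumption: after this scaling every variable has a mean
    pairwise distance of 1, whatever its type or unit.
    """
    m = mean_pairwise(values, dist)
    return lambda a, b: dist(a, b) / m


# Hypothetical toy data: the same heights in two units, plus a categorical.
height_m = [1.5, 1.75, 2.0, 1.25]
height_cm = [100 * h for h in height_m]
colour = ["red", "blue", "red", "green"]

# Raw mean contributions depend on the unit and on the variable type.
raw_m = mean_pairwise(height_m, abs_diff)
raw_cm = mean_pairwise(height_cm, abs_diff)
raw_colour = mean_pairwise(colour, mismatch)

# Scaled distances no longer depend on the unit of measurement.
d_m = scaled(height_m, abs_diff)
d_cm = scaled(height_cm, abs_diff)
d_colour = scaled(colour, mismatch)
```

In an unscaled sum, the height measured in centimetres would dominate the categorical variable, while the same height in metres would contribute less than it. After scaling, both versions of the height variable give identical distances and every variable has the same mean contribution.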
- Publication:
- arXiv e-prints
- Pub Date:
- November 2024
- DOI:
- 10.48550/arXiv.2411.00429
- arXiv:
- arXiv:2411.00429
- Bibcode:
- 2024arXiv241100429V
- Keywords:
- Statistics - Methodology
- E-Print:
- 40 pages, 9 figures