Weighted paths between partitions
Abstract
How to quantify the distance between any two partitions of a finite set is an important issue in statistical classification, whenever different clustering results need to be compared. Developing from the traditional Hamming distance between subsets, or cardinality of their symmetric difference, this work considers alternative metric distances between partitions. With one exception, all of them are obtained as minimum-weight paths in the undirected graph corresponding to the Hasse diagram of the partition lattice. Firstly, by focusing on the atoms of the lattice, one well-known partition distance is recognized to be in fact the analog of the Hamming distance between subsets, with weights on edges of the Hasse diagram determined through the number of atoms in the unique maximal join-decomposition of partitions. Secondly, another partition distance known as "variation of information" is seen to correspond to a minimum-weight path with edge weights determined by the entropy of partitions. These two distances are next compared in terms of their upper and lower bounds over all pairs of partitions that are complements of one another. What emerges is that the two distances share the same minimizers and maximizers, while a much rawer behavior is observed for the partition distance which does not correspond to a minimum-weight path. The idea of measuring the distance between partitions by means of minimum-weight paths in the Hasse diagram is further explored by considering alternative symmetric and order-preserving/inverting partition functions (such as the rank, in the simplest case) for assigning weights to edges. What matters most, in such a general setting, turns out to be whether the weighting function is supermodular or else submodular, as this makes any minimum-weight path visit the meet or else the join of the two partitions, depending on whether the function is order-preserving or order-inverting.
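The two path-based distances named in the abstract can be illustrated on small examples. The following is a minimal Python sketch, not the paper's own construction: it assumes the usual closed forms, namely that the atom-based (Hamming-type) distance counts the element pairs grouped together by exactly one of the two partitions, and that variation of information equals twice the entropy of the meet minus the entropies of the two partitions. The function names (`meet`, `atom_count`, `hamming_distance`, `entropy`, `variation_of_information`) are hypothetical and chosen only for illustration.

```python
from math import comb, log


def meet(p, q):
    """Coarsest common refinement: all nonempty pairwise block intersections."""
    return [a & b for a in p for b in q if a & b]


def atom_count(p):
    """Number of atoms finer than p, i.e. element pairs sharing a block."""
    return sum(comb(len(block), 2) for block in p)


def hamming_distance(p, q):
    """Pairs grouped together by exactly one of p, q (assumed closed form)."""
    return atom_count(p) + atom_count(q) - 2 * atom_count(meet(p, q))


def entropy(p, n):
    """Shannon entropy (natural log) of the block-size distribution of p."""
    return -sum(len(b) / n * log(len(b) / n) for b in p)


def variation_of_information(p, q):
    """VI(p, q) = 2 H(p meet q) - H(p) - H(q) (assumed closed form)."""
    n = sum(len(b) for b in p)  # size of the ground set
    return 2 * entropy(meet(p, q), n) - entropy(p, n) - entropy(q, n)


# Two partitions of {1, 2, 3, 4}, each given as a list of disjoint sets.
P = [{1, 2}, {3, 4}]
Q = [{1, 2, 3}, {4}]

print(hamming_distance(P, Q))          # pairs {3,4}, {1,3}, {2,3} differ
print(variation_of_information(P, Q))
```

Both functions pass through the meet of the two partitions, which matches the abstract's remark that minimum-weight paths visit the meet or the join depending on the weighting function. Partitions are represented as lists of Python sets, so `a & b` is plain set intersection.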
- Publication: arXiv e-prints
- Pub Date: September 2015
- DOI: 10.48550/arXiv.1509.01852
- arXiv: arXiv:1509.01852
- Bibcode: 2015arXiv150901852R
- Keywords: Computer Science - Discrete Mathematics; 05A18; 05C12