Local data structures
Abstract
Local data structures are systems of neighbourhoods within data sets. Specifications of neighbourhoods can arise in multiple ways: for example, from global geometric structure (stellar charts), combinatorial structure (weighted graphs), desired computational outcomes (natural language processing), or sampling. These examples are discussed in the context of a theory of neighbourhoods. This theory is a step towards understanding clustering for large data sets. Clusters for such data sets can only be approximated in practice, but approximations can be constructed from neighbourhoods via patching arguments that are derived from the Healy-McInnes UMAP construction. The patching arguments are enabled by changing the theoretical basis for data set structure from metric spaces to extended pseudo-metric spaces.
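To make the terms concrete, the following is a minimal sketch of a neighbourhood and an extended pseudo-metric on a small point set. It is an illustration under stated assumptions, not the construction used in the paper: the helper names `k_neighbourhood` and `local_ep_metric` are hypothetical, the ambient distance is assumed to be Euclidean, and neighbourhoods are assumed to be k-nearest-neighbour sets. The distance is "extended" because it may be infinite, and "pseudo" because distinct data points (for example, duplicated rows) may sit at distance zero.

```python
import math

def k_neighbourhood(points, i, k):
    """Indices of the k points nearest to points[i], excluding i itself.

    Assumption: neighbourhoods are k-nearest-neighbour sets under the
    Euclidean distance; the abstract lists other ways to specify them.
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def local_ep_metric(points, i, k):
    """Return a function d(a, b) on indices of `points`.

    d restricts the Euclidean distance to the neighbourhood of points[i]
    (including i) and is infinite whenever either argument lies outside it.
    This keeps symmetry and the triangle inequality, so d is an extended
    pseudo-metric on the whole data set.
    """
    patch = set(k_neighbourhood(points, i, k)) | {i}

    def d(a, b):
        if a == b:
            return 0.0
        if a in patch and b in patch:
            return math.dist(points[a], points[b])
        return math.inf  # points outside the patch are infinitely far away

    return d

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
d0 = local_ep_metric(points, 0, 2)
print(k_neighbourhood(points, 0, 2))  # neighbourhood of point 0
print(d0(0, 1), d0(0, 3))             # finite inside the patch, infinite outside
```

Each data point gets its own local distance of this kind; patching arguments of the sort the abstract mentions are then concerned with combining such local pieces into a global structure.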
- Publication: arXiv e-prints
- Pub Date: March 2023
- arXiv: arXiv:2303.01415
- Bibcode: 2023arXiv230301415J
- Keywords: Mathematics - Algebraic Topology; 62R40 (Primary) 55U10, 68T09 (Secondary)