From Data to the p-Adic or Ultrametric Model
Abstract
We model anomaly and change in data by embedding the data in an ultrametric space. Taking our initial data as cross-tabulation counts (or other input data formats), Correspondence Analysis allows us to endow the information space with a Euclidean metric. We then model anomaly or change by an induced ultrametric. The induced ultrametric that we are particularly interested in takes a sequential (e.g. temporal) ordering of the data into account. We apply this work to the flow of narrative expressed in the script of the film Casablanca, and to the evolution between 1988 and 2004 of the Colombian social conflict and violence.
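As a minimal sketch of what an ultrametric induced from a Euclidean embedding with a sequence constraint can look like, the Python below merges only segments that are adjacent in the sequence, using a complete-link criterion, and records the merge height as the ultrametric distance. This is one possible construction, assumed here for illustration; the abstract does not name the algorithm used. The function name `sequence_constrained_ultrametric` and the toy coordinates are hypothetical, and the points stand in for factor coordinates that a Correspondence Analysis would supply.

```python
from math import dist


def sequence_constrained_ultrametric(points):
    """Return an n x n ultrametric matrix from sequence-ordered points.

    Assumption: agglomeration is complete-link and restricted to segments
    that are adjacent in the given order (illustrative choice only).
    """
    n = len(points)
    segments = [[i] for i in range(n)]  # contiguous runs, kept in sequence order
    u = [[0.0] * n for _ in range(n)]

    def complete_link(a, b):
        # Largest Euclidean distance between any member of a and any member of b.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(segments) > 1:
        # Only neighbouring segments are merge candidates.
        gaps = [complete_link(segments[k], segments[k + 1])
                for k in range(len(segments) - 1)]
        k = min(range(len(gaps)), key=gaps.__getitem__)
        height = gaps[k]
        # Every pair first joined by this merge gets the merge height.
        for i in segments[k]:
            for j in segments[k + 1]:
                u[i][j] = u[j][i] = height
        segments[k:k + 2] = [segments[k] + segments[k + 1]]
    return u


# Toy sequence: the last point is close to the first in Euclidean terms,
# but far from it in the sequence.
points = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0), (3.2, 0.1), (0.05, 0.0)]
u = sequence_constrained_ultrametric(points)
```

In this toy sequence the first and last points are Euclidean neighbours, yet their ultrametric distance is large because they can only be joined once everything between them has been merged. A large jump in merge height between adjacent segments is the kind of signal one might read as change or anomaly in the sequence.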
- Publication:
- arXiv e-prints
- Pub Date:
- September 2008
- DOI:
- 10.48550/arXiv.0809.0492
- arXiv:
- arXiv:0809.0492
- Bibcode:
- 2008arXiv0809.0492M
- Keywords:
- Statistics - Machine Learning;
- Statistics - Applications
- E-Print:
- 15 pages, 6 figures. To appear in: Proceedings of Third International Conference on p-Adic Mathematical Physics: From Planck Scale Physics to Complex Systems to Biology, Steklov Mathematics Institute, Russian Academy of Sciences