A Stochastic Analysis of the Linguistic Provenance of English Place Names
Abstract
In English place name analysis, meanings are often derived from the resemblance of roots in place names to topographical features, proper names and/or habitation terms in one of the languages that have influenced English place names. The difficulty is that it is not always clear which base language should be used to interpret the roots. The purpose of this paper is to stochastically determine the resemblance between 18799 English place names and 84687 place names from Ireland, Scotland, Wales, Denmark, Norway, Sweden, France, Germany, the Netherlands and Ancient Rome. Each English place name is ranked according to the extent to which it resembles place names from the other countries, and this ranking provides a basis for determining the likely language to use to interpret the place name. A number of observations can be made using the ranking. In particular, 'Harlington' is found to be the most archetypically English place name in the English sample, and 'Anna' the least. Furthermore, the place names in the non-English datasets are found to be most similar to Norwegian place names and least similar to Welsh place names.
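The ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which resemblance measure the paper uses, so the character-bigram Jaccard similarity below is an assumption chosen only for illustration, and the function names (`bigrams`, `jaccard`, `rank_languages`) and the tiny `reference` dictionary are hypothetical stand-ins rather than the paper's method or datasets.

```python
def bigrams(name):
    """Return the set of character bigrams of a name, with boundary markers."""
    padded = f"^{name.lower()}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def jaccard(a, b):
    """Jaccard similarity of the bigram sets of two names (assumed measure)."""
    ga, gb = bigrams(a), bigrams(b)
    return len(ga & gb) / len(ga | gb)


def rank_languages(name, reference):
    """Rank countries by the mean similarity of `name` to their place names.

    `reference` maps a country to a non-empty list of place names.
    """
    scores = {
        country: sum(jaccard(name, other) for other in names) / len(names)
        for country, names in reference.items()
    }
    # Highest mean similarity first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy reference data for illustration only; not taken from the paper's datasets.
reference = {
    "Denmark": ["Kirkeby", "Nyby"],
    "Wales": ["Llanelli", "Llandudno"],
}

ranking = rank_languages("Kirkby", reference)
for country, score in ranking:
    print(f"{country}: {score:.3f}")
```

With this toy data, the English name "Kirkby" shares several bigrams with the Danish names and none with the Welsh ones, so Denmark ranks first. A real analysis would need the full datasets and whichever stochastic measure the paper defines.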
- Publication: arXiv e-prints
- Pub Date: December 2023
- DOI: 10.48550/arXiv.2312.12850
- arXiv: arXiv:2312.12850
- Bibcode: 2023arXiv231212850D
- Keywords: Computer Science - Computation and Language; 68T50; I.2.7