Where Did the Web Archive Go?
Abstract
To perform a longitudinal investigation of web archives and detecting variations and changes replaying individual archived pages, or mementos, we created a sample of 16,627 mementos from 17 public web archives. Over the course of our 14-month study (November, 2017 - January, 2019), we found that four web archives changed their base URIs and did not leave a machine-readable method of locating their new base URIs, necessitating manual rediscovery. Of the 1,981 mementos in our sample from these four web archives, 537 were impacted: 517 mementos were rediscovered but with changes in their time of archiving (or Memento-Datetime), HTTP status code, or the string comprising their original URI (or URI-R), and 20 of the mementos could not be found at all.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2021
- DOI:
- 10.48550/arXiv.2108.05939
- arXiv:
- arXiv:2108.05939
- Bibcode:
- 2021arXiv210805939A
- Keywords:
-
- Computer Science - Digital Libraries
- E-Print:
- 18 pages