Cobwebs from the Past and Present: Extracting Large Social Networks using Internet Archive Data
Abstract
Social graph construction from various sources has been of interest to researchers due to its application potential and the broad range of technical challenges involved. The World Wide Web provides a huge amount of continuously updated data and information on a wide range of topics, created by a variety of content providers, and makes the study of extracted people networks and their temporal evolution valuable for social as well as computer scientists. In this paper we present SocGraph, an extraction and exploration system for social relations from the content of around 2 billion web pages collected by the Internet Archive over the 17-year period between 1996 and 2013. We describe methods for constructing large social graphs from extracted relations and introduce an interface to study their temporal evolution.
- Publication: arXiv e-prints
- Pub Date: January 2017
- DOI: 10.48550/arXiv.1701.03277
- arXiv: arXiv:1701.03277
- Bibcode: 2017arXiv170103277S
- Keywords: Computer Science - Social and Information Networks; Computer Science - Information Retrieval; Physics - Physics and Society
- E-Print: 5 pages, 5 figures, SIGIR '16, July 17-21, 2016, Pisa, Italy