DomainDemo: a dataset of domain-sharing activities among different demographic groups on Twitter
Abstract
Social media play a pivotal role in disseminating web content, particularly during elections, yet our understanding of the association between demographic factors and political discourse online remains limited. Here, we introduce a unique dataset, DomainDemo, linking domains shared on Twitter (X) with the demographic characteristics of associated users, including age, gender, race, political affiliation, and geolocation, from 2011 to 2022. This new resource was derived from a panel of over 1.5 million Twitter users matched against their U.S. voter registration records, facilitating a better understanding of a decade of information flows on one of the most prominent social media platforms and trends in political and public discourse among registered U.S. voters from different sociodemographic groups. By aggregating user demographic information onto the domains, we derive five metrics that provide critical insights into over 129,000 websites. In particular, the localness and partisan audience metrics quantify the domains' geographical reach and ideological orientation, respectively. These metrics show substantial agreement with existing classifications, suggesting the effectiveness and reliability of DomainDemo's approach.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.09035
- Bibcode:
- 2025arXiv250109035Y
- Keywords:
-
- Computer Science - Social and Information Networks;
- Computer Science - Computers and Society
- E-Print:
- 19 pages, 1 figure