Epidemic Intelligence for the Crowd, by the Crowd (Full Version)
Abstract
Tracking Twitter for public health has shown great potential. However, most recent work has focused on correlating Twitter messages with rates of influenza, a disease that exhibits a marked seasonal pattern. In the presence of sudden outbreaks, how can social media streams be used to strengthen surveillance capacity? In May 2011, Germany reported an outbreak of Enterohemorrhagic Escherichia coli (EHEC). It was one of the largest described outbreaks of EHEC/HUS worldwide and the largest in Germany. In this work, we study the crowd's behavior on Twitter during the outbreak. In particular, we report how tracking Twitter helped to detect key user messages that triggered signal detection alarms before MedISys and other well-established early warning systems. We also introduce a personalized learning-to-rank approach for ranking tweets for epidemic intelligence. It exploits the relationships discovered through (i) latent semantic topics computed using Latent Dirichlet Allocation (LDA) and (ii) the social tagging behavior observed in Twitter. Our results provide the grounds for new public health research based on social media.
- Publication: arXiv e-prints
- Pub Date: March 2012
- DOI: 10.48550/arXiv.1203.1378
- arXiv: arXiv:1203.1378
- Bibcode: 2012arXiv1203.1378D
- Keywords: Computer Science - Social and Information Networks; Computer Science - Computers and Society; Physics - Physics and Society
- E-Print: A short version of this work has been accepted for publication at the International AAAI Conference on Weblogs and Social Media (ICWSM 2012)