Data Mining Twitter for Science Applications
Abstract
The Twitter social microblogging database, which recently passed its tenth anniversary, is potentially a rich source of real-time and historical global information for science applications (beyond the by-now fairly familiar use of Twitter for natural hazards monitoring). Over the past several years, we have been exploring the feasibility of extracting from the Twitter data stream useful information for application to NASA precipitation research, with both "passive" and "active" participation by the twitterers. In the passive case, we have experimented with listening to the Twitter stream in real time for "precipitation" and related tweets (in different languages), applying basic filters for exact phrases, extracting location information, and mapping the resulting tweet distributions. In the active case, we have conducted preliminary experiments to evaluate different methods of engaging with potential participants. The time-varying set of "precipitation" tweets can be thought of as an organic network of rain gauges, potentially providing a widespread view of precipitation occurrence. The validation of satellite precipitation estimates is challenging, because many regions lack data or access to data, especially outside of the U.S. and in remote and developing areas. Mining the Twitter stream could augment these validation programs and, potentially, help tune existing algorithms. Though exploratory, our efforts thus far could significantly extend the application realm of Twitter, as a platform for citizen science, beyond natural hazards monitoring to science applications.
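The passive workflow described above (exact-phrase filtering, location extraction, and mapping of tweet distributions) can be illustrated with a minimal sketch. It is not the authors' implementation. The phrase list, the function names, the sample tweets, and the 1-degree grid size are all hypothetical, and the sketch assumes each tweet arrives as a dictionary with a `text` field and an optional GeoJSON-style `coordinates` field holding a `[longitude, latitude]` point. Collecting tweets from the live stream is left out entirely.

```python
import math
import re
from collections import Counter

# Hypothetical exact phrases in several languages; the abstract does not
# list the phrases that were actually used.
PHRASES = ["it is raining", "está lloviendo", "il pleut"]
PATTERN = re.compile("|".join(re.escape(p) for p in PHRASES), re.IGNORECASE)


def is_precip_tweet(text):
    """Return True if the text contains one of the exact phrases."""
    return PATTERN.search(text) is not None


def extract_location(tweet):
    """Return (lon, lat), or None if the tweet carries no point location.

    Assumes a GeoJSON-style field: {"type": "Point", "coordinates": [lon, lat]}.
    """
    geo = tweet.get("coordinates")
    if not geo:
        return None
    lon, lat = geo["coordinates"]
    return lon, lat


def grid_cell(lon, lat, size=1.0):
    """Return the south-west corner of the size-degree cell containing the point."""
    return (math.floor(lon / size) * size, math.floor(lat / size) * size)


def bin_tweets(tweets, size=1.0):
    """Count geolocated precipitation tweets per grid cell."""
    counts = Counter()
    for tweet in tweets:
        if not is_precip_tweet(tweet.get("text", "")):
            continue
        location = extract_location(tweet)
        if location is None:
            continue  # no usable location, so the tweet cannot be mapped
        counts[grid_cell(*location, size=size)] += 1
    return counts


# Invented sample records, for illustration only.
SAMPLE_TWEETS = [
    {"text": "Ugh, it is raining again",
     "coordinates": {"type": "Point", "coordinates": [-77.03, 38.9]}},
    {"text": "Il pleut sur Paris",
     "coordinates": {"type": "Point", "coordinates": [2.35, 48.85]}},
    {"text": "Sunny day!",
     "coordinates": {"type": "Point", "coordinates": [2.35, 48.85]}},
    {"text": "it is raining", "coordinates": None},
]

if __name__ == "__main__":
    for cell, count in sorted(bin_tweets(SAMPLE_TWEETS).items()):
        print(cell, count)
```

The per-cell counts are the kind of quantity that could be drawn on a map or set against gridded satellite precipitation estimates. Tweets without a point location are simply dropped here; a fuller treatment would need a strategy for place names or user profile locations.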
- Publication: AGU Fall Meeting Abstracts
- Pub Date: December 2016
- Bibcode: 2016AGUFMPA53A2276T
- Keywords:
- 0499 New fields (not classifiable under other headings), BIOGEOSCIENCES
- 9810 New fields (not classifiable under other headings), GENERAL OR MISCELLANEOUS
- 9820 Techniques applicable in three or more fields, GENERAL OR MISCELLANEOUS
- 1926 Geospatial, INFORMATICS