Validation of Twitter opinion trends with national polling aggregates: Hillary Clinton vs Donald Trump
Measuring and forecasting opinion trends from real-time social media is a long-standing goal of big-data analytics. Despite the large amount of work addressing this question, there has been no clear validation of online social media opinion trend with traditional surveys. Here we develop a method to infer the opinion of Twitter users by using a combination of statistical physics of complex networks and machine learning based on hashtags co-occurrence to build an in-domain training set of the order of a million tweets. We validate our method in the context of 2016 US Presidential Election by comparing the Twitter opinion trend with the New York Times National Polling Average, representing an aggregate of hundreds of independent traditional polls. The Twitter opinion trend follows the aggregated NYT polls with remarkable accuracy. We investigate the dynamics of the social network formed by the interactions among millions of Twitter supporters and infer the support of each user to the presidential candidates. Our analytics unleash the power of Twitter to uncover social trends from elections, brands to political movements, and at a fraction of the cost of traditional surveys.