It's no secret that the job of a political pollster gets harder every election cycle. People are cutting the landline, and regulations make it difficult for pollsters to reach voters on their cell phones. Mass onslaughts of get-out-the-vote phone calls near Election Day swamp phone lines and make voters recoil from the idea of actually picking up the phone. Finding voters who are willing to talk about their political attitudes and beliefs over the phone is an increasingly steep challenge. It's hard out there for a pollster these days.
Enter the latest and greatest tool for gauging public opinion: sentiment analysis and "big data". Advances in computing allow us to analyze huge quantities of unstructured data (think "my random 140-character musings" instead of "my clear answer to a yes or no question"). Culturally, people are more and more comfortable putting it all out there online, from their tastes in music to their political preferences. Not to mention, samples can be enormous, dwarfing the "small data" samples of a pollster who interviews a thousand registered voters. Technological innovation and a cultural shift toward sharing (and oversharing) make it possible for researchers to assess what people think without having to go to the trouble of actually asking questions.
Or do they? This week, the Pew Research Center is out with a study throwing cold water on the idea that analyzing data from sources like Twitter can be an accurate substitute for more traditional research methods. Pew finds that Tweets are inconsistent in how they match up with polling data. Twitter users were more excited than American voters as a whole about the re-election of Barack Obama, yet less excited about Obama's inaugural address than poll respondents were.
If the challenges facing more traditional "small data" pollsters are pretty big, the challenges facing "big data" analysts here are huge. It seems obvious that the demographics of the universe of "people Tweeting about the inaugural address" might differ from the universe of "registered voters nationwide." While traditional pollsters can get a sense of the race, age, and gender of their samples and make corrections accordingly, it's a lot harder to know the demographics behind the Tweets being analyzed. Not to mention, it's much less clear what counts as a "positive" or "negative" Tweet in any given context, and this up-or-down-vote approach to sentiment analysis might be too blunt an instrument to be useful.
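To see why the up-or-down-vote approach can be so blunt, consider a minimal lexicon-based sentiment scorer. This is a sketch, not Pew's actual methodology; the word lists and sample Tweets are invented for illustration. It counts positive and negative words and votes accordingly, which means sarcasm sails right past it.

```python
# A minimal sketch of "up-or-down-vote" sentiment analysis: count matches
# against small positive/negative word lists and take the majority.
# The word lists and example tweets below are illustrative assumptions,
# not any real pollster's lexicon or data.

POSITIVE = {"great", "love", "excited", "win", "proud", "hope"}
NEGATIVE = {"bad", "hate", "boring", "lose", "angry", "fail"}

def score_tweet(text: str) -> int:
    """Return +1 (positive), -1 (negative), or 0 (neutral/unclear)."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return 1
    if neg > pos:
        return -1
    return 0

tweets = [
    "So excited about the inaugural address!",  # clearly positive
    "That speech was boring.",                  # clearly negative
    "Oh great, another four years...",          # sarcasm, misread as positive
]
print([score_tweet(t) for t in tweets])  # → [1, -1, 1]
```

The third Tweet is the problem case: "great" registers as a positive word, so sarcasm gets counted as enthusiasm, and there's no demographic information attached to correct the sample the way a pollster would weight a phone survey.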
As technology moves forward, so too must the way people gather information about public opinion. But don't count the "small data" polls out quite yet. While some high-profile misses by political pollsters raised important questions about how accurate election polls really are, quite a few pollsters managed to get it very close to right, even given all the challenges described above. Both "big data" analysis of online conversations and "small data" surveys and focus groups have a role to play in politics, and smart campaigns will value them as complementary ways of learning where voters stand.