The Devil Is in the Data

Simply adding more observational data won't expand the ranks of the GOP.


RNC Chairman Reince Priebus unveils his post-election "autopsy" of the GOP.

Adam Schaeffer is co-founder and director of research of Evolving Strategies.

The GOP's "Growth and Opportunity Project," which details a plan for revitalizing the Republican Party in the aftermath of the 2012 defeat, is necessarily broader than it is deep. There is, however, a topic that will need to be thoroughly explored if the Republican Party is to successfully execute this ambitious plan.

Two words were used nearly 300 times throughout the report: "data" and "testing." One word—"experiment"—was not mentioned at all. But experiments are the only type of test that can produce the kind of data the GOP needs.

Put simply, "data" is information about the world we live in, and it comes in two types: "observational" and "experimental." "Observational" data is static; it's information about things as they are, or were. For example, voters who are pro-life are also less supportive of gun control. That's the world as it is. But it doesn't tell us whether being pro-life causes people to be more pro-gun or whether a pro-life message will decrease support for gun control.

"Experimental" data is dynamic; it's information about what causes things to change and how things could be. Experiments show us how specific messages or modes of contact—like telephone calls, mailers or TV ads—push or pull on voter opinion and behavior. Experiments open our eyes to a counterfactual universe: what if every citizen watched this ad, knew that fact, or was visited at their door by a volunteer? Will it shift the vote or turn more people out to the polls? Will it work with some voters, but not others, or even cause a backlash?

The experimental method is simple in concept, but difficult in practice. The core of a true experiment is random assignment of a large number of test subjects to "treatment" and "control" groups, like a clinical drug trial. With large numbers, random assignment ensures there is no systematic bias in who ends up in each group. We can then attribute any difference in the outcome between the "treatment" and "control" groups, whether that's blood pressure or support for a candidate, to the effect of the "treatment." It's the only way to confidently identify a causal relationship.
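
The logic of random assignment can be sketched in a few lines of Python. This is an illustration only: the simulated voters, the range of baseline support, and the 5-point "true" effect of the message are all invented assumptions, not figures from any campaign.

```python
import random
from statistics import mean

def run_experiment(n_voters=20000, true_effect=0.05, seed=0):
    """Randomly assign simulated voters to treatment or control and
    estimate the message's effect as a difference in support rates.
    All numbers here are hypothetical."""
    rng = random.Random(seed)
    treatment, control = [], []
    for _ in range(n_voters):
        # Each voter starts with a different baseline chance of support.
        baseline = rng.uniform(0.2, 0.6)
        # Random assignment: a coin flip that ignores the baseline.
        treated = rng.random() < 0.5
        p_support = baseline + (true_effect if treated else 0.0)
        supports = 1 if rng.random() < p_support else 0
        (treatment if treated else control).append(supports)
    # Difference in support rates between the two groups.
    return mean(treatment) - mean(control)

estimate = run_experiment()
print(f"Estimated effect of the message: {estimate:.3f}")
```

Because the coin flip is independent of each voter's baseline, the two groups look alike on average, and the gap in their support rates should land near the assumed effect. Shrinking `n_voters` makes the estimate noisier, which is why large numbers matter.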

This all might sound far too fussy and academic, even philosophical, to be a core part of a political effort. But this is the new world of politics in which we're already living.

What made the Obama campaign so accurate in its predictions of the vote across contested states was its use of experimental results from the "lab" and the "field" in its voter modeling. Because the campaign had a large amount of experimental data showing how different kinds of people shifted in response to various messages (toward or away from Obama, greater or lesser likelihood of voting), it could predict with astonishing accuracy the aggregate results of its efforts.

Simply adding more observational data won't expand the ranks of the GOP. In Iowa's 2008 caucus, the Romney campaign turned out just under 30,000 votes and lost badly to a late-surging Mike Huckabee. Romney maintained his database on the state's voters. In 2011, his campaign commenced a quiet but ambitious "data-driven" effort to win Iowa. All the experience, information and algorithms hard-won over the previous four years were plowed into a massive persuasion and turnout effort. But when their work was completed and the counting was done, Romney received just under 30,000 votes once again. Four years and millions of dollars later, Romney had earned about 140 fewer votes and a loss to yet another late-surging social conservative.

Observational data and the modeling it generates are cold and static. And no statistical technique, regardless of its sophistication, can overcome the inherent limitations of observational data. In contrast, experimental data and the modeling it generates are alive and dynamic.