The idea of reading a thousand political blogs every day might seem like Geneva-worthy torture, but it's hard to deny that there's some sort of useful information coming out of the blogosphere. Now the good news: you needn't actually read the blogs to find out what it is.
Wonkosphere.com, which formally launched this week, aggregates posts from about 1,000 political blogs of all types and stripes, whether run by a newspaper or magazine or someone in his parents' basement, and extracts trends and patterns. Every four hours, the site updates its readings of which presidential candidates have the most buzz among bloggers, as well as how mean or nice that buzz is. Blogs are classified as conservative, liberal, or independent, and official candidate sites are excluded.
In this way, the site's designers, two Arizona State University professors who run a company called Crawdad Technologies, say they hope to capture the collective conscience of the blogosphere at any given time, while also collecting a lot of data that they can eventually sell to analysts. In the meantime, the site saves readers a lot of time by identifying the hottest links of the day.
Wonkosphere.com uses patented software to "read" every new post from its directory of blogs and analyze it using a linguistic technique known as "centering theory," which identifies the most important words in a text. By contrast, most media organizations doing text analysis merely identify the most common thematic words in a speech or text. Centering theory is a more sophisticated approach, says cofounder Kevin Dooley, a professor at the ASU business school. It produces, he says, a "network representation of the text," in which the important words are those that span the boundaries between others.
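To make the contrast with simple word counts concrete, here is a minimal sketch in Python. It is not Crawdad's patented software, and it does not implement centering theory itself. It assumes a much cruder stand-in: link each word to the next word in the same sentence, then score every word by betweenness centrality (Brandes' algorithm), a standard measure of how often a node sits on the shortest path between two others. The function names, the stopword list, and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict, deque

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "is", "for"}

def build_word_network(sentences):
    """Link each content word to the next content word in its sentence."""
    graph = defaultdict(set)
    for sentence in sentences:
        words = [w for w in sentence.lower().split() if w not in STOPWORDS]
        for left, right in zip(words, words[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return dict(graph)

def betweenness(graph):
    """Betweenness centrality (Brandes) for an unweighted, undirected graph."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                              # nodes in order of distance
        preds = {node: [] for node in graph}    # predecessors on shortest paths
        paths = {node: 0 for node in graph}     # number of shortest paths
        dist = {node: -1 for node in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                            # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):               # accumulate from farthest back
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}

# Made-up "posts": two topic clusters joined by a single bridging word.
posts = [
    "taxes economy jobs",
    "jobs taxes",
    "war iraq troops",
    "troops war",
    "economy budget war",
]
frequency = Counter(word for post in posts for word in post.split())
scores = betweenness(build_word_network(posts))
most_frequent = frequency.most_common(1)[0][0]
most_central = max(scores, key=scores.get)
```

In this toy data "war" is the most frequent word, but "budget," which appears only once, gets the highest centrality score because every shortest path between the economic cluster and the war cluster runs through it. That is the sense in which a word can "span the boundaries between others" without being common.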
Armed with this kind of data, Dooley and cofounder Steve Corman, a communications professor at ASU, can also measure the tone of a post and track over time how favorably the blogs are treating one candidate or another. For each candidate, they produce a graph of both buzz share and tone over time. (Here's an example.)
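The site does not spell out how it computes either number, so the following Python sketch rests on two stated assumptions: that "buzz share" is a candidate's fraction of all candidate mentions in a given window, and that "tone" is the average of a per-mention score running from -1 (hostile) to 1 (friendly). The function name, the scoring scale, and the placeholder candidates are hypothetical.

```python
from collections import defaultdict

def buzz_and_tone(mentions):
    """Summarize (candidate, tone) pairs, with tone assumed to lie in [-1, 1]."""
    counts = defaultdict(int)
    tone_sums = defaultdict(float)
    for candidate, tone in mentions:
        counts[candidate] += 1
        tone_sums[candidate] += tone
    total = sum(counts.values())
    return {
        candidate: {
            "buzz_share": counts[candidate] / total,             # share of all mentions
            "tone": tone_sums[candidate] / counts[candidate],    # mean tone score
        }
        for candidate in counts
    }

# Made-up mentions from one four-hour window.
window = [
    ("Candidate A", 0.5),
    ("Candidate A", -0.5),
    ("Candidate B", 1.0),
    ("Candidate A", 0.0),
]
summary = buzz_and_tone(window)
```

Running the same function on each successive window and plotting the two values per candidate would give the kind of buzz-and-tone-over-time graph described above.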
One must still confront the question of what "buzz share" actually means within the perverse and often baffling economy of the blogosphere. While no one is claiming that this hippodrome is a stand-in for public opinion at large, Corman argues that anyone who takes the time to maintain a decent blog is going to be pretty informed.
"My guess is that the people on those blogs are opinion leaders," he says. "Let's face it—to take the time, to put that kind of effort into it, you've got to know what's going on."
The current site will evolve steadily over the next 14 months leading up to the election, Corman and Dooley say. Once the field is narrowed to a few candidates, they plan to introduce more information about which issues and topics bloggers discuss most. Those data, Corman says, could then be compared with data on buzz and tone for a measure of which issues bloggers consider strengths and weaknesses for each candidate.
"We would like to move into a more analyst role," Corman says. "There's a lot deeper we can go into the data."
Dooley concurs: "We're only showing a fraction of the posts," he says. "We have a lot of data that's not available to the end user." He adds that a large portion of those more intensive data may eventually be sold as a report or given to paid subscribers.
In the meantime, they say they will also work on expanding the directory of blogs they track, which they admit is too heavy on conservative blogs at the moment (though the results are broken out by ideology, which limits the skew).
Henry Farrell, an assistant professor of international relations at George Washington University who is unaffiliated with Wonkosphere.com, says the possibilities for such tracking sites are still wide open.
"The technology is to some extent in its infancy," Farrell says. "You certainly can find out some interesting things. It's probably going to be more relevant after the primaries." He cautions, however, that the potentially vast difference in prominence and impact between one blog and another can make the results misleading: "Trying to measure that influence is really difficult."