The World According To Google
Todd Hylton's business lived by Google, and it nearly died by Google. Last fall, his Maryland-based insurance agency ranked No. 2 in the search engine's listings when Web surfers punched in "car insurance." Orders rolled in from online customers around the country. Then one Monday in November, his site was nowhere to be found. "I was stunned," says Hylton. "Just like that, I saw my business disappearing--poof!"
The feared "Google death penalty" had fallen with no warning, and Hylton struggled to keep his business going. He wasn't alone. Online bulletin boards late last year filled with panicked webmasters trying to divine what had happened to their Google rankings. The search engine had quietly revised its secret formula for ranking Web sites. In doing so, it had altered which sites appear on its first page of results--that piece of Internet real estate coveted by webmasters worldwide.
It was just another, if particularly vivid, display of Google's grip on the Web's wealth of information. To bring some order to the wildly disorganized Web, Google has written rules that have changed the way we interact with the Internet, with knowledge in general, and even with each other.
That "Google" has become a verb synonymous with search is only one testament to its reach. The company has been thriving in spite of the Internet bust. Its technology handled nearly 80 percent of U.S. Web queries last year. Now it is planning to sell stock to the public in a megadeal that will add $2.7 billion to a war chest already fattened by profits and make paper millionaires of many Google employees (box, Page 47).
Google has succeeded because it's quicker than riffling through Yellow Pages for a store's number, cheaper than an investigator for scoping a date, and closer than the library when crashing on a term paper. It even finds a sunny escape for a midwesterner tired of the cold. "I Googled 'home for rent' and 'January and February,' and up popped a description and photos of the place we're now staying," says Larry Bodine, a marketing consultant who temporarily fled his Illinois home for Tucson, Ariz. "I can't imagine living without Google."
But with power comes scrutiny. Many of us feel a touch of discomfort at Google's ability to unearth long-forgotten facts about ourselves, preserving them in an Internet time warp. And disappointed entrepreneurs aren't alone in asking how Google decides what to display and what to leave out. "Google essentially determines what exists on the Internet and what doesn't," says Harvard law Prof. Jonathan Zittrain.
Web of secrecy. Zittrain and others say Google's largely secret criteria may not always serve up what is most important on the Web but what is most popular. They are calling on Google and other search engines to be more open about how they rank sites. Otherwise, those critics say, we are entrusting what amounts to our biggest public library to pure market forces--forces that are sure to intensify. Going public will subject Google to pressure from investors to increase profits. And the search firm is facing new competition, as giants from Yahoo! to Microsoft try to beat Google at its own game.
At the fast-growing Silicon Valley company, executives say their goal is simply to keep serving the searching public with carefully vetted information. In a letter attached to last week's stock filing, Google's founders, Sergey Brin and Larry Page, vowed that new financial pressures won't make the company any less dedicated to that goal. After all, Google execs note, any user can consign Google to oblivion with little more than a mouse click. "The costs of switching search engines are ridiculously low," says Craig Silverstein, Google's technology director. So the company periodically tweaks its ranking criteria to better identify what's relevant and weed out sites that try to trick Google and boost their standing. It also safeguards its core rankings from the questionable ethics that afflicted early search engines--the sale of top rankings, for example. "We know there are lines that we cannot cross without p - - - ing off a lot of people," says Google CEO Eric Schmidt.
The key to Google's success, though, is the radically different approach to making sense of the Web mapped out in the late 1990s by Brin and Page while they were graduate students at Stanford University. Earlier search engines like AltaVista and Lycos scanned Web pages for words matching a user's search terms, then ranked sites largely based on how often those terms appeared. Brin and Page's breakthrough formula, dubbed "PageRank" partly after its codesigner, instead toted up the number of other pages linking to a particular page, counting each one as an endorsement of the site's content.
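The contrast between the two approaches can be made concrete with a toy sketch. It is only an illustration of the idea as described above, not Google's actual formula: the page names, text, and links are invented, and the link ranking simply counts inbound links. The published PageRank method also weights each link by the importance of the page it comes from, which this sketch leaves out.

```python
from collections import Counter

# Hypothetical toy corpus: page name -> (page text, outbound links). Invented data.
PAGES = {
    "a.example": ("car insurance car insurance car insurance", ["b.example"]),
    "b.example": ("compare car insurance quotes", []),
    "c.example": ("weekend notes and a link to insurance advice", ["b.example"]),
}

def rank_by_term_frequency(pages, query):
    """Older approach: order pages by how often the query terms appear."""
    terms = query.lower().split()
    scores = {}
    for name, (text, _links) in pages.items():
        words = Counter(text.lower().split())
        scores[name] = sum(words[t] for t in terms)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda n: (-scores[n], n))

def rank_by_inbound_links(pages):
    """Link-counting approach: each distinct linking page is one endorsement."""
    inbound = {name: set() for name in pages}
    for source, (_text, links) in pages.items():
        for target in links:
            if target in inbound and target != source:
                inbound[target].add(source)
    return sorted(inbound, key=lambda n: (-len(inbound[n]), n))
```

In this made-up corpus, the page that merely repeats "car insurance" wins under term counting, while the page that two others link to wins under link counting.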
They named their company after "googol," a math term for an impossibly high number--the seeming breadth of the burgeoning Web. Google's consciously unsplashy look added to its splash. Its simple search box was refreshing at a time of clutter in other engines, which often became "portals" that buried query boxes under news, weather, and stock quotes--and flashing, gyrating, pulsating banner ads. Combined with Google's uncanny ability to deliver relevant results fast, the clean look made searching the Web seem simple. "It gives you a false sense that you are close to the entire Internet, that it's all just a click away," says Siva Vaidhyanathan, who teaches communication studies at New York University.
Behind that illusion are powerful minds and computers, constantly refining the secret algorithms that determine what pops up on a search. "Fundamentally, we depend on our intellectual horsepower to stay ahead," says Silverstein. Indeed, the company is said to have more Ph.D.'s per square foot than any other U.S. firm--though gravitas isn't what first strikes a visitor. Rather, the self-conscious playfulness of the dot-com culture is visible throughout the corridors and common rooms of its Silicon Valley headquarters. Brightly colored exercise balls and a pool table beckon employees needing a break from brain overload. Lava lamps decorate nooks and crannies like so many company pendants. For errands, motorized scooters and Segways, the high-wheeled transporters, stand at the ready.
Spiders' stratagem. Hidden from view is the company's secret weapon--thousands of high-performance computers. The company won't say just how many, but outside estimates put the figure at more than 100,000, in perhaps a dozen centers worldwide. That could make Google the world's largest operator of a distributed computer network. The PCs send software "spiders" out to crawl the Web and retrieve page copies to store on several hundred thousand Google hard drives--a library that now includes 4.3 billion pages. It's those stored copies that a user actually searches, rather than the Web itself, which helps explain Google's fast results. Searches that typically took three or four seconds with predecessors are finished in split seconds, a fact Google proudly displays with the results.
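Why searching stored copies is fast can be sketched in a few lines. The example below assumes a word-to-page lookup table (an inverted index) built ahead of time from the stored copies. That structure is a common technique and an assumption here, not something Google has confirmed, and the function names and sample pages are invented.

```python
from collections import defaultdict

def build_index(stored_pages):
    """Build a word -> set-of-URLs lookup from stored page copies."""
    index = defaultdict(set)
    for url, text in stored_pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs whose stored copy contains every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start with pages containing the first word, then narrow by each later word.
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

The slow work of reading every page happens once, when the index is built. Each query then touches only the entries for its own words and never visits the live Web.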
Those results can include a remarkable, even unnerving, range of information. Criminal convictions from a wayward youth, thought to be long locked away, can surface under a name search as courts post records to the Web. "Googledorks" troll the search engine hoping to find illicit data, such as lists of credit card numbers, that landed in Google's library after its spiders slipped through holes in a company's Web site security.
Lucretia Marcus of Alamo, Calif., Googled herself and was startled to find her unlisted, unpublished phone number. She hadn't realized that the dog clubs she joined post newsletters to the Web. "I also didn't know someone would be gathering such bits of information," she says. "It felt a little like Big Brother at work." Marcus is a little unnerved by her life's new transparency, although she considers Google a crucial resource.
Brin and Page added to the credibility of their nascent Web tool when they refused to sell companies a place in their rankings or even to allow "paid inclusion," in which companies pay for a guaranteed listing in a search engine's library. Those steps cemented Google's popularity among Internet-savvy academics and technologists. "Google wants no whiff of impropriety," says Nate Elliott, an analyst at Jupiter Research. "Its greatest asset is customer trust."
The company reaps profits with unobtrusive ads displayed off to the right side of results. Advertisers bid on certain keywords, such as "swimming pools," and their ads pop up when a user searches on those terms. Google also earns profits by placing ads on other Web sites, matching the ads to keywords that appear on those pages. For example, it might run pool ads on a news site displaying stories about hot weather.
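The keyword matching described above might look something like the following sketch. The advertiser names and bid amounts are invented, and ordering the matched ads by bid is an assumption made for illustration; the article does not say how Google orders them.

```python
def match_ads(query, bids):
    """Return advertisers whose keyword phrase appears in the query.

    bids: list of (advertiser, keyword_phrase, bid_in_dollars) tuples.
    Matched advertisers are ordered by bid, highest first (an assumption).
    """
    # Normalize whitespace and pad with spaces so phrases match on word boundaries.
    padded = " " + " ".join(query.lower().split()) + " "
    matched = [b for b in bids if f" {b[1].lower()} " in padded]
    return [adv for adv, _kw, _bid in sorted(matched, key=lambda b: -b[2])]
```

The same function could be pointed at the text of a news page instead of a search query, which is roughly the second line of business the paragraph describes.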
Price of success? Yet the drive for profits is already forcing Google to change. "It will become more like Yahoo!, while Yahoo! will become more like Google," says Greg Boser, an Internet consultant. Boser predicted not long ago that the first victim would be the home page's sparse look. Sure enough, Google recently made telling changes to its opening page. It added prominent links to moneymaking services, such as "Froogle," which helps shoppers on the Web, and removed a gray screen behind the ads that had made them easy to tell apart from the free listings.
While Google loses some of its perceived purity, competitors are racing to catch up with its power. Internet giant Yahoo! earlier this year dropped Google as its search engine in favor of one that it is cobbling together from several acquisitions. That alone dropped Google's market share to about 50 percent, and now the largest of the computing giants, Microsoft, is preparing to launch its own engine. These days, Google's link-counting strategy is widely employed across the Web, while competitors (and Google, for that matter) try new tweaks to deliver better results. The site Ask Jeeves, for example, bought Teoma, an engine that identifies "hub" sites, ones linked to many related pages. It's a means of tapping "community discussions" to find helpful results and even related topics a searcher might not have thought of. Says Jim Lanzone of Ask Jeeves, "We're using social networking to find relevant results."
Money from Google's IPO, meanwhile, will accelerate its continual upgrades of hardware and software and fuel new lines of business, such as a groundbreaking free Web-mail service announced last month. Gmail, as it's called, will offer consumers a gigabyte of free mail storage, many times more than the Yahoo! and Microsoft Web mail services. The catch: Subscribers have to endure E-mail ads targeted to their interests--as determined by Google's spiders, which will automatically scan every message. The proposed service has met with stiff criticism from some privacy advocates, but it plays to Google's supremacy in running computer systems that scan huge amounts of data and tailor advertising to the individual user.
For now, Google's dominance makes it the main target for Web site "optimizers": an army of consultants who try to reverse-engineer the ranking formulas and push a particular site to the top. Pages can gain rank legitimately by, for example, posting informative content, such as how-to instructions, that attracts links from other sites. Less scrupulous methods include "cloaking," or presenting one page to users and another full of spider-friendly keywords to Web crawlers, and "link farms" of pages that exist only to point to a target site and boost its ranking. "We try to put those sites back in their place," says Peter Norvig, Google's director of search quality.
But changes in Google's ranking rules can also mean upheaval for businesses that have come by their rankings honestly. After last November's update, "I had people come to my booth and literally break down crying over lost rankings," says Bruce Clay, a longtime search-engine consultant. Webmasters greet such updates with the same enthusiasm as nasty storms and give them names reminiscent of hurricanes; November's became "Florida."
Google also referees noncommercial sites. A study coauthored by Zittrain at Harvard's Berkman Center for Internet and Society found that the search giant, apparently at the demand of German authorities, expunges neo-Nazi sites from its German-language version. It also removed links to sites that the Church of Scientology accused of illegally posting its copyrighted material. But Google will not remove sites unless the complaints carry legal weight, and in each case it refers users to the freedom-of-information site chillingeffects.org, which posts the legal notices themselves, along with the questionable Web addresses.
The Web world may be in good hands with Google, says Zittrain, but it bothers him that only Google knows for sure. He believes search engines could reveal enough about how they rank sites to satisfy public concerns without giving away commercial secrets. "They wouldn't have to give up the recipe itself."
Winners and losers. But some critics do worry about Google's recipe itself--the PageRank algorithm. "It turns knowledge into a popularity contest," says NYU's Vaidhyanathan. And once a site ends up high in the rankings, it attracts new links and becomes harder to unseat, reinforcing the established hierarchy. That conservatism can make it hard for new sites, offbeat ideas, and minority views to find their way onto the first few screens of search results.
It can also showcase fads at the expense of substance. "Popular does not make a site accurate, comprehensive, or even interesting," Vaidhyanathan says, using his own first name as an example. A Google search on "Siva" returns links to a song by the rock group Smashing Pumpkins and to Vaidhyanathan's own weblog. "Now, I'm named for an important Hindu god, worshipped by a billion people," he says. "Don't you think that Siva should get top billing?"
Google CEO Schmidt says that letting the majority decide shouldn't be seen as a bias. "I view it as an algorithm"--an automated process to distill the Web's values, he says. "Google doesn't know the truth, but it knows what others think is the truth." If so, Google's users just need to recognize its limits. And anyone who is still worried that the company skews search results should try this: Do a Google query on "search engine."
Google ranks down at No. 6.
LEXICON: "GOOGLEWHACK"
A game to find two words, without quotes, that appear on only one page in Google's index (such as mothproof underpants or potbellied veeps).
LEXICON: "GOOGLEBOMB"
Mischievously pushing a Web page to a top ranking. The trick: Multiple Web site owners use the same phrase on their sites while also linking to the target page.
LEXICON: "GOOGLOPOLY"
Google's dominance of Web searches--though its position is now being challenged.
AN ENGINE THAT COULD
A chart of the leading search companies' share of the market understates the dominance of Google's technology. Until February, for example, Yahoo! relied on the Google engine.
[Chart: share of the search market, 0 to 60 percent, '99 through '04. Labeled series: Yahoo!, Excite, Altavista, MSN, AOL. Chart data is incomplete. Source: WebSideStory. Credit: USN&WR]
This story appears in the May 10, 2004 print edition of U.S. News & World Report.
