Dataclysm: Who We Are (When We Think No One's Looking)
by Christian Rudder
How prejudiced is today's society? What does Facebook predict about the stability of a marriage? Where and why are gay people staying in the closet? How do political views affect romantic relationships?
Christian Rudder delved deep into the "statistical slag pits" and emerged with a bold, thought-provoking book that answers these questions and more. In Dataclysm: Who We Are (When We Think No One's Looking), he shows how technology is offering an "unprecedented sociological opportunity" and helping to transform our understanding of race, politics, sex, beauty, humor, anger and other subjects previously challenging to quantify.
We often hear about "Big Data," or large stores of information, discussed in the context of how it might be used to entice us to purchase products we don't need or to spy on us in the name of national security. But Rudder's aim is to better understand human nature and behavior. Some 87% of the United States' population is online--working, socializing, romancing. With every click, post, Tweet and web search, "our hidden thoughts are becoming part of the world. With a little creative typing, a few workarounds, and some math, we are giving humanity's inner monologue a wider audience."
Rudder's role as Virgil through the digital world has been 10 years in the making. He is a co-founder of OkCupid, one of the largest dating websites in the world, and chief analyst of the vast repository of data the company has amassed since it launched a decade ago. As more and more information was collected from the site's millions of users, trends and patterns began to emerge.
Rudder realized that this deep, varied data set of person-to-person interaction could be used to directly examine taboos like race. "I could go and look at what actually happens when, say, 100,000 white men and 100,000 black women interact in private. The data was sitting right there on our servers," he says. Unlike surveys, in which respondents can edit their answers or even outright lie, he had the unvarnished truth at his fingertips.
In Dataclysm, Rudder combines existing work with his own original research, analyzing information from OkCupid, Google, Twitter, Facebook, Reddit, Tumblr and other websites. He reveals his findings in a series of vignettes organized into three main categories: the data of people connecting, the data of division and the data of the individual.
Rudder begins by putting hard numbers to the timeless mystery of sex appeal and what brings two people together in the first blush of attraction. (Surprising find: embrace your flaws.) From there he moves on to other topics, like written communication, demonstrating that Twitter is actually improving its users' writing ability and changing the study of language.
Next, Rudder probes society's great divides, exploring charged issues like faith, politics and race. As he discovered, the unvarnished truth isn't always pleasant. Although expressing racist views publicly is no longer considered socially acceptable, digital activity proves that in private, racism is pervasive and still an implicit factor in people's decision making.
Rudder also uses his findings to give strength and nuance to previous work and suggests ways to build on it. For example, it's not news that looks matter, particularly for women, as Naomi Wolf put forth in the bestseller The Beauty Myth. But the atomized actions of millions of online participants mean anecdotes are now bolstered by evidence. From the dating world to the workplace, Rudder illustrates how, "not unlike race, beauty is a card you're dealt, and it has huge repercussions."
In Dataclysm's third section, Rudder turns his attention to the individual, exploring how ethnic, sexual and political identity is expressed. He reveals how whites, blacks, Asians and Latinos are most and least likely to define themselves; how location shapes a person; and why data surrounding self-reported gay populations across the country has a sobering meaning.
Rudder is even-handed in exemplifying both the good and the bad taking place on the Internet, "a vibrant, brutal, loving, forgiving, deceitful, sensual, angry place" that reflects its users. Tumblr is reaching out to help those with eating disorders, while virtual lynch mobs have formed on Twitter, inciting collaborative rage with far-reaching effects.
Dataclysm covers broad territory, ranging from interesting curiosities, like which state's residents bathe the most frequently, and what men and women are most eager to know about the opposite sex, to subjects with larger social ramifications. Google leads the way in using data for public good, including its flu tracker, which utilizes searches for remedies and symptoms to pinpoint outbreaks and alert the CDC, and Constitute, a database of hundreds of documents emerging nations can use as a guide in designing their own constitutions.
A book based on statistics could easily be dry and boring, but not with Rudder at the helm. If numbers are the narrative, he is the consummate storyteller--smart, witty and a perceptive interpreter of the data. His pithy, conversational tone and fast-paced writing style make Dataclysm both amusing and informative. Charts and graphs appear throughout the book, each one explained in clear, colorful detail, along with pop-culture references and entertaining personal anecdotes.
The book's title is drawn from Kataklysmos, Greek for the Old Testament Flood and the origin of the English word "cataclysm," and was chosen partly in reference to the unprecedented deluge of data being collected today. Rudder concludes by ruminating on some of the challenges the data deluge is bringing with it, chiefly privacy concerns, and where we're headed from here. With every click, the floodgates will open further, strengthening the reach and power of Big Data.
"More than stretching out my arms to say This is the pinnacle, I mean to communicate the power of what's to come," says Rudder. "The cliché would be to say that this is just the tip of the iceberg, but we're not even at sea yet. In the dataclysm, the water's hardly up to our knees." Dataclysm is a valuable read for anyone who would like to know what's going on behind the scenes in cyberspace.
--Shannon McKenna Schmidt