A classic sociology experiment is getting the Facebook treatment. This week, Yahoo Research launched an online study to test the oft-repeated—but as yet unresolved—theory that anyone in the world can reach anyone else in “six degrees of separation.” Using Facebook and its social graph of 750 million users, Yahoo’s “Small World Experiment” invites people worldwide to attempt to send a message to a specified “target” by creating an online chain of connections.
Led by Duncan Watts, principal research scientist at Yahoo Research, the study puts a high-tech spin on a 1967 experiment by Harvard researcher Stanley Milgram. Though Milgram’s findings were not definitive, his experiment ultimately led to the popular “six degrees” hypothesis.
Cameron Marlow, Facebook’s research scientist and “in-house sociologist,” said that because Facebook’s social graph is essentially the best representation of real-world relationships available, “our data can speak more definitively to this question than anything else in history.
“We believe that the average distance between two people is shrinking mostly because the types of connections you maintain online are a much better representation of people you know than the set of people you think about on a regular basis,” he said. “The fact that you transcribed all of your relationships into something like Facebook allows you to stay in touch with a much wider audience. This gives us not only a measurement on just how the world actually is but also how well people can utilize those relationships to route messages across the world.”
Adweek chatted with Watts about his study and its noteworthy roots; excerpts follow.
Adweek: Describe the background of the Small World Study.
Watts: In Milgram’s day, he used physical packets and he had just one target, this stockbroker who lived just outside of Boston, and he had about 300 people trying to reach the target. The famous result was about 20 percent of the chains got to the target, and the average length of those chains was six degrees.
What was the conclusion?
There are two ways to read this result. One is that the world is small because the chains that got through were shorter—much shorter than people expected. The other way is that most chains didn’t get through, so you might suspect that the reason they didn’t get through is because most chains are actually long and the world is really not connected. A minority of people can reach each other in a small number of steps, but the majority of people cannot.
What’s your hypothesis in this study?
This dual interpretation has replicated itself for all other [similar] experiments. When we re-did this experiment 10 years ago, we used email through a Web interface and we did it on a much, much larger scale. We had 18 different targets around the world and 20,000 people trying to reach them, and we basically got the same sort of result. A small fraction of chains actually reached the targets, and the ones that got through were very short. But, of course, you have to wonder about the ones that did not get through.
The history of this is we have reasonable confidence that [for] the chains that didn’t get through, if people had kept passing them along, then about half of them would get to their targets in seven steps or less. That’s the hypothesis that we’re working with.
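The reasoning behind that kind of estimate can be illustrated with a small sketch. It assumes a deliberately simplified model that is not taken from the interview: every holder of a message drops it with the same fixed probability, so long chains are far less likely to be observed than short ones. The chain counts and the drop rate below are invented for illustration, and the function names are hypothetical.

```python
def completion_probability(length, drop_rate):
    """Chance that a chain needing `length` hops survives every hop."""
    return (1 - drop_rate) ** length

def normalize(counts):
    """Turn raw counts into shares that sum to 1."""
    total = sum(counts.values())
    return {length: c / total for length, c in counts.items()}

def reweight(observed_counts, drop_rate):
    """Estimate the length distribution if no chain had been abandoned.

    Each completed chain of a given length stands in for the many chains
    of that length that were dropped along the way.
    """
    weighted = {
        length: c / completion_probability(length, drop_rate)
        for length, c in observed_counts.items()
    }
    return normalize(weighted)

def median_length(distribution):
    """Smallest length at which the cumulative share reaches one half."""
    cumulative = 0.0
    for length in sorted(distribution):
        cumulative += distribution[length]
        if cumulative >= 0.5:
            return length

# Invented example data: number of completed chains by length.
observed = {3: 39, 4: 30, 5: 16, 6: 10, 7: 5}
DROP_RATE = 0.5  # assumed constant per-hop drop probability

observed_share = normalize(observed)
corrected = reweight(observed, DROP_RATE)

print("median of completed chains:", median_length(observed_share))
print("median after correcting for dropouts:", median_length(corrected))
```

With these made-up numbers the completed chains have a median of four hops, while the corrected estimate is five, which shows why the chains that get through can make the world look smaller than it is.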
How does Facebook help?
The problem that all of the experiments have had—and the problem that we’re trying to address with this one—is that you never really know what the ground truth is. You know that there’s some network out there involved that connects people, and you know that messages are being passed along on top of this network. The problem is because you can’t see the network underneath them, you don’t know whether people are making the right choices, you don’t know if the chains are as short as possible, and you don’t know why the chains that aren’t completing are stopping.
The major difference here is that Facebook [is] the network over which these messages are being passed. We can see through Facebook how everyone is really connected to everyone else. We can see whether people can actually find these short paths. In previous experiments you were missing this background picture, but now we have the background and we can run the experiment on top of it.
So is this experiment testing whether people are actually six degrees apart from one another?
There’s actually two versions of the small world hypothesis. One is what we call topological, [which refers to] the structure of the underlying network. There’s a huge amount of evidence that in different kinds of networks (social networks, neural networks, other kinds of networks), they tend to have this feature that even when the network is very large it’s possible to find a short path between any pair of nodes.
The other version of the small world hypothesis is what’s called the algorithmic small world hypothesis, which is that people don’t actually know this network, and they have to solve this search problem using local information. And that’s a much, much harder exercise. There’s some relationship between the search process and the underlying network, but we don’t really know what it is. We have theories about it, and mathematical models, and there’s all this work that has been done. But we’ve never actually been able to see in very simple terms the relationship between the search distance and the network distance.
So the Small World Experiment is looking at how people navigate their connections to create the shortest (six-degree) chain.
We know that in this underlying network people actually are pretty connected; people are within six hops of everyone else on Facebook. But that’s not the same thing as people being able to find the shortest path because each person only has local information about this network; they only know about their friends. And they’re given this random target person and they have to pick just one person to send it to, and if they pick the right person it could take just six hops, but if they don’t, it could take much, much longer.
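The gap between network distance and search distance can be made concrete with a toy example. The sketch below is an illustration only, not the experiment's method: it assumes a ring of 24 people with a single long-range friendship, and a forwarding rule in which each holder passes the message to whichever friend sits closest to the target on the ring. All names and numbers are hypothetical.

```python
from collections import deque

def ring_distance(a, b, n):
    """Number of steps between positions a and b around a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)

def build_graph(n, shortcuts):
    """Ring of n nodes, each tied to its two neighbors, plus shortcut ties."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for u, v in shortcuts:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def network_distance(adj, source, target):
    """Breadth-first search over the whole graph: the true shortest hop count."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return hops[node]
        for friend in adj[node]:
            if friend not in hops:
                hops[friend] = hops[node] + 1
                queue.append(friend)
    return None  # target unreachable

def search_distance(adj, source, target, n):
    """Greedy forwarding using local information only.

    Each holder looks at its own friends and hands the message to the one
    closest to the target on the ring. It cannot see friends of friends.
    """
    hops, node = 0, source
    while node != target:
        node = min(adj[node], key=lambda v: (ring_distance(v, target, n), v))
        hops += 1
    return hops

N = 24
adj = build_graph(N, shortcuts=[(1, 12)])  # one long-range friendship

print("network distance:", network_distance(adj, 0, 13))
print("search distance:", search_distance(adj, 0, 13, N))
```

Here the shortest path from person 0 to person 13 is three hops (0 to 1, across the shortcut to 12, then to 13), but person 0 cannot see that friend 1 has the shortcut, so the greedy chain walks the other way around the ring and takes 11 hops.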
Is there a concern that the people who choose to participate won’t represent the real world?
There are two issues here. You might be concerned that the Facebook network is somehow an unrepresentative sample of the real social graph of the world. The other concern is the people participating in the sample might be an unrepresentative sample of Facebook. I’m not worried about the first concern. Facebook has 750 million users. If it works on Facebook, it’s increasingly difficult to argue that it wouldn’t work for the rest of the world. But the second problem is one that we’re concerned about. It’s really just a matter of getting a broad enough recruiting effort.
You’ve said that when the study is complete, the results will be published in a scientific journal. But Yahoo and Facebook stand to gain a significant amount of data from the experiment, and the terms of service indicate that the companies reserve the right to give data to third parties. Are commercial applications planned for this study as well?
Certainly it will be published in a journal. There are no plans for us to use this information in any sort of commercial sense. It’s very much a scientific question. It’s not inconceivable that what we learn from this, in the way that’s always true of science, could be useful for something else we do down the line. But that’s not something we have any plans for. It’s purely a scientific exercise.
Is there any concern that people will be reluctant to participate because it allows Yahoo to collect some personal information, particularly for those who choose to be “targets”?
For the majority of participants, we’re not exposing information to anyone who can’t already see that information. [For senders] in this experiment, the only people you interact with are people who are already your friends. Your friends can already see your friends. For targets, you’re filling out this quite extensive profile and that is going to be shown to anybody who signs up for the experiment, which could be anyone on Facebook.
We’re very clear with the targets that what they give us will effectively become public information. That’s something we explain to them [in the terms of service], and they have to agree to. People may be reluctant to do that; in which case, that’s fine. As with any human subject experiment, the rule is informed consent. You inform people about what you’re doing and why you’re doing it and they get to agree or not to agree. I think we’re very clear about that.
What happens to the data once it’s collected?
Once we’ve analyzed [the data], it all gets anonymized. The only things that get published are aggregate statistics. We don’t identify individuals in the publications. We’re aware of the fact that people are very hesitant about this stuff and [we] have lots of internal discussions about these sorts of things with our privacy people.
How many people have signed up so far?
[Tuesday] was the first day, and I don’t have [the most recent] numbers. But as of [Tuesday] afternoon, there were a few thousand. But we really need many, many more than that. We would like to get tens of thousands at minimum and preferably hundreds of thousands.
How many targets do you hope to recruit?
Ideally we would like to have on the order of one hundred or something like that. What we would like to do is achieve as much coverage as we can, both the geographical world [and to] cover different demographics. The minimum number is whatever we can get. We already have about 15 or 20. And if that’s all we ever got, we would make do with that, but that’s not ideal.
How long do you expect the study to last?
It depends. If a week from now a million people signed up, we could call it quits. If that doesn’t happen, we could run it for months. There’s no time sensitivity to it; it really just depends on how we can recruit participants.