Over the past few days Facebook has been giving developers a look inside its engineering thought process. Chris Piro posted an article on the engineering blog about the high-level challenges the team faced when scaling the Facebook chat system. While I’m not a high-level developer, plenty of developers will benefit from the information Piro shared.
Roddy Lindsay also posted an interesting article about some of the sentiment analysis work the company has been doing. Mark Zuckerberg previously alluded to sentiment analysis projects in a conversation with Robert Scoble in Davos, but this post gives us more insight into what they’re working on.
Lindsay’s post is from October, but it was only just publicized through Facebook’s new Facebook People subsite, which displays public messages from Facebook employees. In his article, Lindsay illustrates how large a project he has been tackling: “We developed a corpus of 5000 tagged posts labeled positive, negative or neutral about certain objects. We then started generating synonyms for sentiment words by comparing every word to every other word in a single day of data, ranking by similarity of their immediately neighboring words.”
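To make the idea in that quote a bit more concrete, here is a minimal sketch of ranking words by the similarity of their immediate neighbors. It is only an illustration, not Facebook's actual code: the post does not say which similarity measure was used, so cosine similarity is an assumption here, and the function names (`neighbor_vectors`, `cosine`, `most_similar`) and the sample posts are made up for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def neighbor_vectors(posts):
    """Map each word to counts of the words immediately to its left and right."""
    vectors = defaultdict(Counter)
    for post in posts:
        tokens = post.lower().split()
        for i, word in enumerate(tokens):
            if i > 0:
                vectors[word]["L:" + tokens[i - 1]] += 1
            if i < len(tokens) - 1:
                vectors[word]["R:" + tokens[i + 1]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two neighbor-count vectors (assumed metric)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, top_n=3):
    """Compare one word to every other word and rank by neighbor similarity."""
    target = vectors.get(word, Counter())
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical sample data standing in for "a single day of data".
sample_posts = [
    "i love this movie",
    "i adore this movie",
    "i hate this song",
    "we love this band",
]
vectors = neighbor_vectors(sample_posts)
print(most_similar("adore", vectors, top_n=2))
```

Notice that on this toy data “hate” scores as close to “adore” as “love” does, because all three words appear in the same slots. Immediate neighbors capture how a word is used, not whether it is positive or negative, which is presumably where a hand-tagged corpus comes in. Also note that comparing every word to every other word grows quadratically with vocabulary size, so it is easy to see how a full day of real data could turn into an enormous job.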
Lindsay goes on to describe how long the computational analysis took: “The computation is indeed enormous and took 12 hours on our 80-node cluster, producing 10 terabytes of intermediate map data.” He then explains that the sentiment was an extremely good predictor of election poll results. This is only a taste of what Facebook is working on, but it illustrates some of the larger projects the company is experimenting with.
Oftentimes we find startups with tons of data but no resources to do anything with it, since they are so busy programming new features and managing scalability (as Facebook is doing currently). With massive amounts of timely data, Facebook could become an excellent source of global research. Thankfully, as the company experiments and pushes new boundaries, it is giving back to the community by sharing many of its lessons learned.