What’s Behind The Door Of Facebook’s ‘Data Store?’

By David Cohen 

Facebook offered some behind-the-scenes details about TAO (The Associations and Objects), the system the social network refers to as its graph data store, in a note on the Facebook Engineering page by software engineer Mark Marchukov.

Marchukov described TAO as follows:

Facebook puts an extremely demanding workload on its data back end. Every time any one of more than 1 billion active users visits Facebook through a desktop browser or on a mobile device, they are presented with hundreds of pieces of information from the social graph. Users see News Feed stories; comments, likes, and shares for those stories; photos and check-ins from their friends — the list goes on. The high degree of output customization, combined with a high update rate of a typical user’s News Feed, makes it impossible to generate the views presented to users ahead of time. Thus, the data set must be retrieved and rendered on the fly in a few hundred milliseconds.

This challenge is made more difficult because the data set is not easily partitionable, and by the tendency of some items, such as photos of celebrities, to have request rates that can spike significantly. Multiply this by the millions of times per second this kind of highly customized data set must be delivered to users, and you have a constantly changing, read-dominated workload that is incredibly challenging to serve efficiently.

Marchukov went on to discuss why the social network began using memcache in addition to MySQL, and how it developed the Objects and Associations application-programming interface. He also described the subgraph of objects and associations that is created in TAO after Alice checks in at the Golden Gate Bridge and tags Bob there, Cathy comments on the check-in, and David likes it.
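To make that check-in example more concrete, here is a minimal Python sketch of how such a subgraph could be modeled as typed objects joined by typed, directed associations. It is an illustration only, not Facebook's API: the GraphStore class, its method names, and the object and association type names ("authored", "tagged", "liked_by" and so on) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class GraphStore:
    """Toy in-memory store of objects and associations (hypothetical, not TAO's API)."""
    objects: dict = field(default_factory=dict)   # object id -> {"type": ..., other fields}
    assocs: dict = field(default_factory=dict)    # (source id, assoc type) -> [destination ids]
    _ids: count = field(default_factory=lambda: count(1))

    def add_object(self, otype, **data):
        """Create a typed object and return its id."""
        oid = next(self._ids)
        self.objects[oid] = {"type": otype, **data}
        return oid

    def add_assoc(self, src, atype, dst):
        """Add a directed, typed edge from src to dst, newest first."""
        self.assocs.setdefault((src, atype), []).insert(0, dst)

    def get_assocs(self, src, atype):
        """Return the destination ids for one source and association type."""
        return list(self.assocs.get((src, atype), []))


store = GraphStore()
alice = store.add_object("user", name="Alice")
bob = store.add_object("user", name="Bob")
cathy = store.add_object("user", name="Cathy")
david = store.add_object("user", name="David")
bridge = store.add_object("location", name="Golden Gate Bridge")
checkin = store.add_object("checkin")
comment = store.add_object("comment")

# Alice checks in at the bridge and tags Bob.
store.add_assoc(alice, "authored", checkin)
store.add_assoc(checkin, "location", bridge)
store.add_assoc(checkin, "tagged", bob)
# Cathy comments on the check-in; David likes it.
store.add_assoc(cathy, "authored", comment)
store.add_assoc(checkin, "comment", comment)
store.add_assoc(checkin, "liked_by", david)

liker_names = [store.objects[uid]["name"] for uid in store.get_assocs(checkin, "liked_by")]
```

Rendering the check-in then comes down to a handful of lookups keyed by the check-in's id and an association type, which is the kind of read-heavy access pattern the note describes.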

He concluded:

The TAO service runs across a collection of server clusters geographically distributed and organized logically as a tree. Separate clusters are used for storing objects and associations persistently, and for caching them in RAM and flash memory. This separation allows us to scale different types of clusters independently and to make efficient use of the server hardware.

We chose eventual consistency as the default consistency model for TAO. Our choice was driven by both performance considerations and the inescapable consequences of CAP theorem for practical distributed systems, where machine failures and network partitioning (even within the data center) are a virtual certainty. For many of our products, TAO losing consistency is a lesser evil than losing availability. TAO tries hard to guarantee with high probability that users always see their own updates. For the few use cases requiring strong consistency, TAO clients may override the default policy at the expense of higher processing cost and potential loss of availability.

A massive amount of effort has gone into making TAO the easy-to-use and powerful distributed data store that it is today. TAO has become one of the most important data stores at Facebook — the power of graph helps us tame the demanding and dynamic social workload. For more details on the design, implementation, and performance of TAO, I invite you to read our technical paper published in USENIX ATC ’13 proceedings.
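As a rough illustration of the consistency trade-off Marchukov describes, the Python sketch below puts two caching nodes in front of one persistent store. Reads are served from the local cache by default and may be stale, a writer sees its own update, and a caller can request a strong read at the cost of a trip to the store. The class names, the strong flag, and the key format are hypothetical, and the sketch leaves out how stale caches are eventually refreshed, so it should not be read as TAO's actual protocol.

```python
class PersistentStore:
    """Stands in for the persistent storage clusters (hypothetical)."""

    def __init__(self):
        self.data = {}


class CacheNode:
    """Stands in for one caching cluster in front of the store (hypothetical)."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, key, strong=False):
        # Default policy: serve from the local cache, even if it may be stale.
        if not strong and key in self.cache:
            return self.cache[key]
        # Cache miss or strong read: pay for a round trip to the store.
        value = self.store.data.get(key)
        self.cache[key] = value
        return value

    def write(self, key, value):
        # Write through to the store and refresh the local cache, so the
        # writer sees its own update on its next read from this node.
        self.store.data[key] = value
        self.cache[key] = value


store = PersistentStore()
west = CacheNode(store)
east = CacheNode(store)

west.write("checkin:likes", 1)
first_east_read = east.read("checkin:likes")        # miss, fetched from the store

west.write("checkin:likes", 2)
writer_read = west.read("checkin:likes")            # writer sees its own update
stale_read = east.read("checkin:likes")             # other node still serves the old value
strong_read = east.read("checkin:likes", strong=True)  # override goes to the store
```

In this toy version the strong read simply bypasses the cache. In a real deployment that path would also fail whenever the store is unreachable, which is the loss of availability the note mentions.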