Facebook Caffe2Go Deep-Learning System Powers Video ‘Style Transfer’ Test

Facebook’s focus on artificial intelligence may soon help users turn their videos into “works of art.”

The social network announced that it has begun testing a new creative-effect camera feature within its flagship mobile application that applies a “style transfer” technique to users’ videos via a Facebook-developed deep-learning technology called Caffe2Go.

Research scientists Yangqing Jia and Peter Vajda explained the test in a blog post:

We recently began testing a new creative-effect camera in the Facebook app that helps people turn videos into works of art in the moment. That technique is called “style transfer.” It takes the artistic qualities of one image style, like the way Van Gogh paintings look, and applies it to other images and videos.

It’s a technically difficult trick to pull off, normally requiring the content to be sent off to data centers for processing on big-compute servers–until now.

We’ve developed a new deep-learning platform on mobile so it can–for the first time–capture, analyze and process pixels in real time, putting state-of-the-art technology in the palm of your hand. This is a full-fledged deep learning system called Caffe2Go, and the framework is now embedded into our mobile apps.

By condensing the size of the artificial-intelligence model used to process images and videos by 100 times, we’re able to run various deep neural networks with high efficiency on both iOS and Android. Ultimately, we were able to provide AI inference on some mobile phones at less than 1/20th of a second, essentially 50 milliseconds–a human eye blink happens at one-third of a second, or 300 ms.
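To put the figures quoted above in perspective, here is a minimal Python sketch of the arithmetic. It uses only the numbers stated in the blog post (1/20th of a second per inference and a 300 ms eye blink, which the post rounds down from one-third of a second). The function and variable names are illustrative, not part of Caffe2Go, and the frame-rate ceiling assumes one inference pass per video frame with no other per-frame overhead.

```python
# Figures quoted in the blog post.
INFERENCE_MS = 1000 / 20   # "less than 1/20th of a second", i.e. 50 ms
BLINK_MS = 300             # "one-third of a second, or 300 ms" (rounded in the post)


def max_frames_per_second(latency_ms: float) -> float:
    """Upper bound on frame rate, assuming one inference pass per frame."""
    return 1000.0 / latency_ms


def passes_per_blink(latency_ms: float, blink_ms: float = BLINK_MS) -> float:
    """How many inference passes fit inside a single eye blink."""
    return blink_ms / latency_ms


if __name__ == "__main__":
    print(f"Inference latency: {INFERENCE_MS:.0f} ms")
    print(f"Frame-rate ceiling: {max_frames_per_second(INFERENCE_MS):.0f} fps")
    print(f"Inference passes per blink: {passes_per_blink(INFERENCE_MS):.0f}")
```

Under these assumptions, a 50 ms inference time caps the effect at about 20 processed frames per second, and roughly six passes fit inside one blink.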

Video: Style transfer demo, posted by Facebook Engineering on Monday, Nov. 7, 2016.

Facebook chief technology officer Mike Schroepfer also addressed Caffe2Go in a detailed Newsroom post updating Facebook’s AI initiatives:

Just three months ago, we set out to do something nobody else had done before: ship AI-based style transfer running live, in real-time, on mobile devices. This was a major engineering challenge, as we needed to design software that could run high-powered computing operations on a device with unique resource constraints in areas like power, memory and compute capability.

The result is Caffe2Go, a new deep-learning platform that can capture, analyze and process pixels in real-time on a mobile device. We found that by condensing the size of the AI model used to process images and videos by 100 times, we’re able to run deep neural networks with high efficiency on both iOS and Android. This is all happening in the palm of your hand, so you can apply styles to videos as you’re taking them.

Having an industrial-strength deep-learning platform on mobile enables other possibilities, too. We can create gesture-based controls, where the computer can see where you’re pointing and activate different styles or commands. We can recognize facial expressions and perform related actions, like putting a “yay” filter over your selfie when you smile. With Caffe2Go, AI has opened the door to new ways for people to express themselves.

Schroepfer also provided details on tools, platforms and infrastructure that allow Facebook employees to incorporate AI into their projects:

  • FBLearner Flow: He described this as the “backbone of AI-based product development at Facebook,” saying it makes AI so easy to use that 70 percent of the social network’s employees who are using the platform are not AI experts.
  • AutoML: Infrastructure that allows engineers to optimize new AI models using existing AI.
  • Lumos: A “self-serve platform” that allows teams at Facebook to “harness the power of computer vision for their products and services without the need for prior expertise.”

Readers: What are your initial impressions of Caffe2Go?