Mark Zuckerberg Introduces Jarvis, His 2016 Personal Challenge

How much progress has Facebook co-founder and CEO Mark Zuckerberg made in fulfilling his New Year’s resolution for 2016?

Zuckerberg said in a Jan. 3 Facebook post that his personal challenge for 2016 was to use artificial intelligence to create a personal assistant, which he described as his own version of Jarvis from Iron Man. On Monday, he posted a detailed update on Jarvis; highlights follow.

He provided the following overview of Jarvis:

So far this year, I’ve built a simple AI that I can talk to on my phone and computer; that can control my home, including lights, temperature, appliances, music and security; that learns my tastes and patterns; that can learn new words and concepts; and that can even entertain Max. It uses several artificial intelligence techniques, including natural language processing, speech recognition, face recognition and reinforcement learning, written in Python, PHP and Objective C. In this note, I’ll explain what I built and what I learned along the way.


Zuckerberg explained the challenges of connecting his home to Jarvis:

Before I could build any AI, I first needed to write code to connect these systems, which all speak different languages and protocols. We use a Crestron system with our lights, thermostat and doors; a Sonos system with Spotify for music; a Samsung TV; a Nest cam for Max; and of course my work is connected to Facebook’s systems. I had to reverse-engineer application-programming interfaces for some of these to even get to the point where I could issue a command from my computer to turn the lights on or get a song to play.

Further, most appliances aren’t even connected to the internet yet. It’s possible to control some of these using internet-connected power switches that let you turn the power on and off remotely. But often that isn’t enough. For example, one thing I learned is it’s hard to find a toaster that will let you push the bread down while it’s powered off so you can automatically start toasting when the power goes on. I ended up finding an old toaster from the 1950s and rigging it up with a connected switch. Similarly, I found that connecting a food dispenser for Beast or a grey T-shirt cannon would require hardware modifications to work.
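
To illustrate the integration problem Zuckerberg describes, here is a minimal Python sketch of one common approach: wrapping each system in an adapter that exposes the same interface. This is an assumption for illustration, not his actual code. The names Device, LightsAdapter, MusicAdapter and Home are hypothetical, and the adapters return strings where real ones would speak each vendor's protocol.

```python
from abc import ABC, abstractmethod


class Device(ABC):
    """Hypothetical common interface that hides each system's protocol."""

    @abstractmethod
    def send(self, action: str) -> str:
        """Translate a generic action into this device's own protocol."""


class LightsAdapter(Device):
    def send(self, action: str) -> str:
        # A real adapter would call the lighting system here (assumption).
        return f"lights:{action}"


class MusicAdapter(Device):
    def send(self, action: str) -> str:
        # A real adapter would call the music system here (assumption).
        return f"music:{action}"


class Home:
    """Routes a command to whichever adapter is registered under a name."""

    def __init__(self) -> None:
        self._devices = {}

    def register(self, name: str, device: Device) -> None:
        self._devices[name] = device

    def command(self, name: str, action: str) -> str:
        if name not in self._devices:
            raise KeyError(f"unknown device: {name}")
        return self._devices[name].send(action)
```

With adapters registered under "lights" and "music", a single call such as home.command("lights", "on") reaches the right system without the caller knowing which protocol sits behind it.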

On Jarvis’ use of facial-recognition technology, he wrote:

To do this, I installed a few cameras at my door that can capture images from all angles. AI systems today cannot identify people from the back of their heads, so having a few angles ensures that we see the person’s face. I built a simple server that continuously watches the cameras and runs a two-step process: First, it runs face detection to see if any person has come into view, and second, if it finds a face, then it runs face recognition to identify who the person is. Once it identifies the person, it checks a list to confirm I’m expecting that person, and if I am, then it will let them in and tell me they’re here.
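
The two-step process in that passage can be sketched in Python as follows. This is a rough outline under stated assumptions, not Zuckerberg's implementation: process_frame is a hypothetical name, and detect_faces and recognize are stand-in callables for whatever face detection and face recognition models the server actually runs.

```python
from typing import Callable, Iterable, List, Optional, Set


def process_frame(
    frame: object,
    detect_faces: Callable[[object], Iterable[object]],
    recognize: Callable[[object], Optional[str]],
    expected: Set[str],
) -> List[str]:
    """Return the names of expected visitors found in one camera frame."""
    admitted = []
    # Step 1: face detection finds any faces that have come into view.
    for face in detect_faces(frame):
        # Step 2: face recognition tries to put a name to each face.
        name = recognize(face)
        # Only people on the expected list are let in.
        if name is not None and name in expected:
            admitted.append(name)
    return admitted
```

Running detection first keeps the more expensive recognition step off frames that contain no faces at all, which matters for a server that watches the cameras continuously.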


Zuckerberg also created a Messenger bot for Jarvis, writing:

I can text anything to my Jarvis bot, and it will instantly be relayed to my Jarvis server and processed. I can also send audio clips and the server can translate them into text and then execute those commands. In the middle of the day, if someone arrives at my home, Jarvis can text me an image and tell me who’s there, or it can text me when I need to go do something.

One thing that surprised me about my communication with Jarvis is that when I have the choice of either speaking or texting, I text much more than I would have expected. This is for a number of reasons, but mostly, it feels less disturbing to people around me. If I’m doing something that relates to them, like playing music for all of us, then speaking feels fine, but most of the time, text feels more appropriate. Similarly, when Jarvis communicates with me, I’d much rather receive that over text message than voice. That’s because voice can be disruptive and text gives you more control of when you want to look at it. Even when I speak to Jarvis, if I’m using my phone, I often prefer it to text or display its response.
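
The relay Zuckerberg describes, where the bot accepts either text or an audio clip and turns both into a command, might look something like the Python sketch below. The Message class and to_command function are hypothetical names, and transcribe stands in for an unspecified speech recognition step; none of this reflects the actual Messenger bot code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    """Hypothetical incoming bot message carrying text or an audio clip."""
    text: Optional[str] = None
    audio: Optional[bytes] = None


def to_command(message: Message, transcribe: Callable[[bytes], str]) -> str:
    """Normalize a text or audio message into a plain-text command."""
    if message.text is not None:
        return message.text.strip()
    if message.audio is not None:
        # Audio is first converted to text, then handled like any other text.
        return transcribe(message.audio).strip()
    raise ValueError("message has neither text nor audio")
```

Converting audio to text at the edge means everything downstream only has to understand one input format.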