AI, Not User Experience, Is Holding Voice Back From Its Full Potential

Artificial intelligence is just scratching the surface of what’s needed

Every time I sign into any social media platform, someone has an opinion on the future of voice. It’s either the future we all deserve from the technology at our fingertips or it’s another passing fad that will come and go. The conversation is overwhelmingly about the user experience around voice: it’s too public, too confusing, too nuanced, and there is a general feeling that people will not take to it. It’s too futuristic and intimidating for the average person to interact with regularly. All of those points might very well be true, but I take issue with the reasoning behind accepting them.

Throughout the digital evolution, we have pointed to a myriad of technologies that would “never be adopted” for whatever reason was given at the time. The fact is that at every turn, there have been challenges with the UX because humans are, by conditioning, bad at change. What overcomes those challenges is function, and function is the real issue in voice right now.

Voice is the most natural method with which modern humans communicate, so much so that we often speak to ourselves to get to a resolution. But the impetus behind using voice is the speed, ease and accuracy with which we can access information, along with the added benefit of being able to multitask with the devices that are permanently glued to our hands. So while we prophesy about all the reasons that voice will or won’t work, I implore us all to look at the true reason for its success or failure: artificial intelligence.

Present-day zero-UI technology is challenging at best. Every morning, I ask Siri to tell me the weather and my calendar for the day with varying degrees of success. Sometimes it works, sometimes it doesn’t. But I do it every morning anyway because it allows me the opportunity to do other things while ingesting information I need to prepare for the day. But that’s about where the usefulness ends. The limitations of what Siri knows stop me from using it further. If I could ask it more, such as what those meetings were, who was in attendance, or specifics about the weather, I would. But if I did any of those things, I would get the same answer as with the initial ask, because Siri’s ability to parse information at that level of granularity does not exist.

The true power behind zero UI is its ability to not only detect but understand language. That’s not a user issue, that’s a technology issue. The level of true AI available to these interfaces is just scratching the surface of what would need to be available in order to unlock the actual potential of the concept. Today we can give specific, isolated directions and get linear responses. If the promise of AI is real, then as it becomes more conversational, the potential to actually speak to machines, leverage the access those machines have and put that information to use is limitless. And as the human brain adapts to multitask in ways that would have been unimaginable to even our recent ancestors, the opportunity for true innovation and creation expands exponentially.

As we often do in technology and media, we are focused on the wrong thing. Our users, customers and clients are highly capable of adapting, but there needs to be a reason behind that change that’s not “this thing does something that I can do already, moderately better, sometimes.” Let’s not fall in love with the idea of something before we fully understand its true potential and current limitations. It is my personal belief that the future of voice is real, so let’s not ruin it now before it has had its chance to develop.