A new browser extension will read any article aloud to you with a passable imitation of human cadence and intonation.
Podcastle, the company behind it, built the tool on Google DeepMind’s WaveNet technology, which uses a neural network trained on hours of human voices. The extension was also tuned to pick up on the tone of each piece it reads and speak with corresponding emotion, according to founder Artavazd Yeritsyan, who is also the vp of engineering at visual editing studio PicsArt.
Yeritsyan started the extension as a tool for his own use, a way to listen to articles throughout busy days. “I could not listen to any existing text-to-speech long enough because there is a kind of robotic voice. And it doesn’t consider like the emotions and the context of what the writer is saying,” Yeritsyan said. “They’re very, very neutral about it, and you cannot understand if it’s sad news or it’s very exciting news or what happened.”
But his vision has grown since then. The extension will eventually play only a small role in what Yeritsyan says will be primarily a podcast production platform that gives creators access to their own AI text-to-speech capabilities. The founder says he is currently in talks with investors about funding.
“We are trying to power everyone to be able to make their voice heard, to remove the barrier for people to create the podcasts,” he said. “There are a lot of people that don’t have the necessary skills to create a podcast. So we are going to give them the ability to create podcasts just using text so that people and companies can buy a branded voice. Just put your text there and convert it to the podcast.”
The fledgling startup is riding a wave of new ventures built on research that lets AI voices sound more human than ever. In April, McClatchy newspapers signed a deal with another AI text-to-speech firm, Trinity Audio, to embed AI narrations with similar emotive qualities above each story. Google has also built the tech into a system called Duplex that calls restaurants and hair salons and makes reservations on a user’s behalf. And Microsoft researchers recently showed that AI can convincingly sing lyrics from a page.
All this new tech also comes as media companies like The New York Times are doubling down on audio strategies. The paper acquired audio narration platform Audm in March and has frequently rolled out versions of its articles narrated by voice actors since.