Text-to-speech is a widespread and extremely useful technology, but it doesn’t burst into tears often enough while reading to you, does it? There’s a fix for that from Sonantic, a company that claims to have invented “the world’s first AI capable of crying.” Finally! AI can be just as sad as we all are.
Okay, the AI isn’t actually sad. It’s engaging in a text-to-speech process that doesn’t just read the words you give it, but simulates the emotion of an acting performance. I guess in an age of CGI and deepfakes, computer-simulated voice acting was the next troubling item on the list.
“The aim of the company is to really capture this deep emotion using machine learning,” says Felix Vaughhan, deep learning researcher at Sonantic. “And the first thing we focused on was sadness.”
You can see it for yourself in the video below. According to Sonantic, the voices of the mother and daughter in the video “are entirely computer generated.” Check it out. It’s pretty wild.
The video also includes some of Sonantic’s creators, who are unfortunately pretty darn cagey about explaining how it all works. The process does involve real human actors who help build Sonantic’s artificial voices, one of whom is also shown in the video. Actors who partner with Sonantic “can earn passive income when clients around the world use their synthetic voice within commercially released projects,” according to the company’s website.
Users, meanwhile, will be able to import a script, choose from a selection of “voice models” to perform the dialogue, and swap between different voices with “just a few clicks.” You’ll be able to “direct” the AI by adjusting its performance for more or less emotion, projection, pacing, and other tweakable settings.
While the technology does seem pretty neat, there’s also something kinda icky about it, because this isn’t how acting or directing works. Actors aren’t a bundle of sliders and knobs, and directing a performance isn’t done by tweaking a few settings. I definitely understand the appeal for game developers to be able to change a few lines of dialogue at the last minute or adjust the tone of a performance, but acting is, y’know, an art form. It’s weird to see it boiled down to assigning a number to the ‘Emotion’ meter on a website.
But this is our weird, troubling future and it’s clear we’re all going to be replaced by computers eventually. Maybe a computer is writing this article. Maybe a computer is reading it, too. There’s no way to tell anymore.