‘Never Heard Me Before’

December 9, 2013

“Never heard me before.” That’s what William, a 9-year-old boy with a speech-language disorder, said the first time he used the prosthetic voice that Northeastern associate professor Rupal Patel made just for him.

In San Francisco on Thursday, Patel, who has joint appointments in the Bouvé College of Health Science and the College of Computer and Information Science, shared William’s story with thousands of viewers at TEDWomen, a four-day conference organized by the nonprofit organization TED, devoted to “Ideas Worth Spreading.” This year’s TEDWomen focused on invention in all its forms. Patel’s talk was livestreamed for a group of Northeastern students, faculty, and staff gathered at the Behrakis Center.

There are 2.5 million Americans like William who are unable to speak, Patel told the audience, and many of them use the same computerized voice to communicate. “That’s millions of people worldwide who are using generic voices,” she said.

So much of our personality is contained in our voice, Patel explained. People with speech-language disorders retain the ability to control the element of speech that is critical for determining individuality, yet a grown man may still have the same prosthetic voice as a young girl.

Through a project launched simultaneously with her TEDWomen talk, Patel is trying to change that. She and her team at Northeastern’s Communication Analysis and Design Laboratory have developed a technology called VocaliD that allows them to create prosthetic voices that sound like the people with speech impairments for whom they were designed. As William’s mother put it, “This is what William would have sounded like had he been able to speak,” Patel told the audience.

To create these voices, VocaliD extracts acoustic properties from a target talker’s speech (whatever sounds they can still produce) and applies these features to a synthetic voice that was created from a surrogate voice donor who is similar in traits such as age, size, and gender. What is produced is a synthetic voice containing as much of the vocal identity of the target talker as possible yet the speech clarity of the surrogate talker.

The surrogate talker donates hours’ worth of recorded sentences, which the team parses into “small snippets of speech” that can be reassembled into any other combination of words.

What happens next has been described by Patel’s own daughter as “mixing colors to paint voices.” William’s vowel sound, for example, acts like a concentrated drop of red food dye. This is then mixed with the recorded speech snippets and infuses each of them with his unique vocal identity.

“So far we have a few surrogate talkers from around the U.S. who have donated their voices,” Patel said. “We have been using and reusing them to build our first few personalized voices. But there’s so much more work to be done.”

With VocaliD.org, Patel has created a crowd-sourced portal for people around the world to donate their voices to the voiceless.

“We wouldn’t dream of fitting a little girl with a prosthetic limb of a grown man,” Patel said. “So why then the same prosthetic voice?” With VocaliD, that’s no longer necessary, she said.