Personalizing synthetic voices for people with progressive speech disorders: judging voice similarity

S.M. Creer, S.P. Cunningham, P.D. Green, Kaniz Fatema

Research output: Chapter in Book/Published conference output › Conference publication

Abstract

In building personalized synthetic voices for people with speech disorders, the output should capture the individual's vocal identity. This paper reports a listener judgment experiment on the similarity of Hidden Markov Model-based synthetic voices, built with varying amounts of adaptation data, to two non-impaired target speakers. We conclude that around 100 sentences of adaptation data are needed to build a voice that retains the characteristics of the target speaker, although using more data further improves the voice. Experiments using Multi-Layer Perceptrons (MLPs) are conducted to determine which acoustic features contribute to the similarity judgments. Results show that mel-cepstral distortion and fraction-of-voicing agreement contribute most to replicating the similarity judgments, but the combination of all features is required for accurate prediction. Ongoing work applies these findings to voice building for people with impaired speech.
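The abstract names mel-cepstral distortion (MCD) as the feature contributing most to the similarity prediction. As a minimal sketch (not the authors' implementation), the conventional dB formulation of MCD over a pair of time-aligned mel-cepstral frames can be written as follows; function names and the choice to exclude the energy coefficient c0 are assumptions here:

```python
import math

def mel_cepstral_distortion(frame_a, frame_b):
    """MCD in dB between two aligned mel-cepstral frames.

    Uses the standard (10 / ln 10) * sqrt(2 * sum of squared
    coefficient differences) formulation. The energy coefficient
    c0 is conventionally excluded, so pass coefficients c1..cD.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * sq_diff)

def average_mcd(frames_a, frames_b):
    """Mean MCD over a sequence of time-aligned frame pairs."""
    distortions = [
        mel_cepstral_distortion(a, b) for a, b in zip(frames_a, frames_b)
    ]
    return sum(distortions) / len(distortions)
```

Identical frame sequences give an MCD of zero; larger values indicate greater spectral divergence between the synthetic and target voices.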
Original language: English
Title of host publication: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, Brighton, UK
Publication status: Published - 2009
