Creating and evaluating New Zealand-accented synthesised voices using model talker voice banking technology

Type of content
Theses / Dissertations
Thesis discipline
Speech and Language Sciences
Degree name
Master of Science
Publisher
University of Canterbury
Language
English
Date
2018
Authors
Westley, Michelle Beverley
Abstract

Communication, in all its modalities, is an important way for individuals to express themselves and connect with others. An individual’s own voice portrays many aspects of their personality and identity. Individuals with conditions that reduce their ability to speak using their natural voice face a lack of personalisation and customisation in the synthetic voices available for speech generating devices. This may lead to a decrease in device acceptance. In New Zealand, there are currently no locally-accented voices for speech generating device users. Voice banking is the process of recording one’s voice to create a personalised synthetic voice for use on a speech generating device. This study explored the experiences of those who voice bank, and investigated the quality of the resulting voices. Eight healthy adults and two healthy children participated in the ModelTalker voice banking process and completed a questionnaire about their perceptions of the experience. Fifteen unfamiliar listeners assessed perceptual aspects of the synthetic voices created. The measures used included the Speech Intelligibility Test, intelligibility and naturalness visual analogue scales, and age and gender identification tasks. Personalised synthetic voices were successfully created using the ModelTalker system. The voice donors reported positive experiences and identified multiple strengths and challenges of the ModelTalker voice banking system, which were consistent with previous literature (Creer, Green, & Cunningham, 2009; Hyppa-Martin, Friese, & Barnes, 2017; Jackson et al., 2017). The synthesised voices were found to have intelligibility similar to that previously reported for synthetic speech, and age and gender estimations followed patterns reported in the literature (Cerrato, Falcone, & Paoloni, 2000; Jreige, Patel, & Bunnell, 2009; Von Berg, Panorska, Uken, & Qeadan, 2009; Waller, Eriksson, & Sorqvist, 2015).
Future research in this area should explore perceptions of the voice banking experience among clinical populations, such as those with progressive speech loss. Personalisation of the voice banking process for the New Zealand accent should continue, as should the creation of a fully synthetic te reo Māori voice. The voices created in this study are available to New Zealand speech generating device users who want a locally-accented voice on their device. By making these voices available, this thesis has addressed the lack of New Zealand-accented synthetic voices for speech generating device users.

Rights
All Rights Reserved