Acoustic and Perceptual Evaluation of the Quality of Radio-Transmitted Speech
Degree Grantor: University of Canterbury
Degree Name: Master of Audiology
Aim: When speech signals are transmitted via radio, the transmission process may add noise to the signal of interest. This study examines the effect of radio transmission on the quality of speech signals, using a combined acoustic and perceptual approach.

Method: A standard acoustic recording of the Phonetically Balanced Kindergarten (PBK) word list read by a male speaker was played back in three conditions: one without radio transmission and two with different types of radio transmission. The vowel segments (/i, a, o, u/) embedded in the original and the re-recorded signals were analysed to yield measures of the frequencies of the first two formants (F1 and F2), the amplitude difference between the first two harmonics (H1-H2), and the singing power ratio (SPR). Other measures included Spectral Moment One (mean), Spectral Moment Two (variance), and the energy ratio between consonant and vowel (CV energy ratio). To examine how H1-H2 and SPR were related to the perception of vowel intelligibility and clarity, vowels at five levels of each of these two measures were selected as stimuli in the perceptual study. The auditory stimuli were presented to 20 normal-hearing listeners (10 males and 10 females) aged 21 to 42 years. Listeners were asked to identify the vowel for each stimulus in the vowel identification task and to judge which vowel of a contrast pair sounded "clearer" in the clarity discrimination task. A follow-up study using vowel stimuli with a constant length and five H1-H2 or five SPR levels was conducted on five listeners to determine the relationship between the perception of speech clarity and H1-H2 or SPR.

Results: A series of one-way or two-way analyses of variance (ANOVAs), or ANOVAs on ranks, with post-hoc tests revealed that radio transmission had a significant effect on all of the selected acoustic measures except the CV energy ratio. Signal degradation due to radio transmission was characterized by shifts in F1 or F2 frequencies toward a more compressed vowel space, an H1-H2 value indicating increased H1 dominance, an SPR value suggesting increased energy around the 2-4 kHz region, and a loss of differentiation between /s/ and /sh/ on Spectral Moments One and Two. Vowel duration was also found to play a major role in the perception of vowel intelligibility and clarity. The follow-up study, which controlled for vowel duration, found that SPR affected the perception of vowel intelligibility and clarity.

Conclusion: Measures of the energy ratio between different frequency regions, as well as the frequencies of the first two formants, were sensitive in detecting the effect of radio transmission.
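The spectral measures named in the abstract can be illustrated with a short sketch. This is not the analysis procedure used in the thesis, which the abstract does not specify. It is a minimal FFT-based approximation under stated assumptions: SPR is taken as the level of the strongest spectral peak in the 2-4 kHz band minus that in the 0-2 kHz band (a commonly used definition), the fundamental frequency `f0` is assumed to be known in advance, and all function names and the `tol` parameter are hypothetical.

```python
import numpy as np

def power_spectrum(x, fs):
    """Return (freqs, power) for a Hann-windowed segment sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x * np.hanning(len(x)))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, np.abs(spec) ** 2

def band_peak_db(freqs, power, lo, hi):
    """Level in dB of the strongest spectral bin with lo <= f < hi."""
    band = (freqs >= lo) & (freqs < hi)
    return 10.0 * np.log10(power[band].max())

def spectral_moments(x, fs):
    """Spectral Moment One (mean, Hz) and Moment Two (variance, Hz^2)."""
    freqs, power = power_spectrum(x, fs)
    p = power / power.sum()              # treat the spectrum as a distribution
    m1 = np.sum(freqs * p)
    m2 = np.sum((freqs - m1) ** 2 * p)
    return m1, m2

def spr_db(x, fs):
    """SPR (assumed definition): peak level in 2-4 kHz minus peak level in 0-2 kHz."""
    freqs, power = power_spectrum(x, fs)
    return (band_peak_db(freqs, power, 2000.0, 4000.0)
            - band_peak_db(freqs, power, 0.0, 2000.0))

def h1_h2_db(x, fs, f0, tol=0.1):
    """H1-H2: level of the first harmonic minus the second, given a known f0.

    Each harmonic is searched within +/- tol (as a fraction) of its
    nominal frequency; tol is an illustrative choice.
    """
    freqs, power = power_spectrum(x, fs)
    h1 = band_peak_db(freqs, power, f0 * (1 - tol), f0 * (1 + tol))
    h2 = band_peak_db(freqs, power, 2 * f0 * (1 - tol), 2 * f0 * (1 + tol))
    return h1 - h2

def cv_energy_ratio_db(consonant, vowel):
    """Consonant-to-vowel energy ratio in dB, using mean-square energy."""
    ec = np.mean(np.square(np.asarray(consonant, dtype=float)))
    ev = np.mean(np.square(np.asarray(vowel, dtype=float)))
    return 10.0 * np.log10(ec / ev)
```

Under these definitions, a more positive H1-H2 corresponds to greater H1 dominance, and a higher (less negative) SPR corresponds to relatively more energy in the 2-4 kHz region, which matches the direction of the changes reported in the Results.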