Remapping auditory-motor representations in voice production. (33/576)

Evidence regarding visually guided limb movements suggests that the motor system learns and maintains neural maps between motor commands and sensory feedback. Such systems are hypothesized to be used in a feed-forward control strategy that permits precision and stability without the delays of direct feedback control. Human vocalizations involve precise control over vocal and respiratory muscles. However, little is known about the sensorimotor representations underlying speech production. Here, we manipulated the heard fundamental frequency of the voice during speech to demonstrate learning of auditory-motor maps. Mandarin speakers repeatedly produced words with specific pitch patterns (tone categories). On each successive utterance, the frequency of their auditory feedback was increased by 1/100 of a semitone until they heard their feedback one full semitone above their true pitch. Subjects automatically compensated for these changes by lowering their vocal pitch. When feedback was unexpectedly returned to normal, speakers significantly increased the pitch of their productions beyond their initial baseline frequency. This adaptation was found to generalize to the production of another tone category. However, results indicate that a more robust adaptation was produced for the tone that was spoken during feedback alteration. The immediate aftereffects suggest a global remapping of the auditory-motor relationship after an extremely brief training period. However, this learning does not represent a complete transformation of the mapping; rather, it is in part target dependent.
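The feedback schedule described above (steps of 1/100 of a semitone up to one full semitone) can be sketched numerically. This is only an illustration, not the authors' procedure: it assumes the standard equal-tempered definition of a semitone as a frequency ratio of 2^(1/12), and the function names and the 200 Hz example pitch are hypothetical.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in equal-tempered semitones (assumed definition)."""
    return 2.0 ** (semitones / 12.0)


def shifted_feedback_hz(produced_hz: float, utterance_index: int,
                        step_semitones: float = 0.01,
                        max_semitones: float = 1.0) -> float:
    """Heard frequency after `utterance_index` upward steps, capped at the maximum shift."""
    shift = min(utterance_index * step_semitones, max_semitones)
    return produced_hz * semitone_ratio(shift)


# Hypothetical speaker producing 200 Hz: feedback after 0, 50 and 100 steps.
example_hz = [shifted_feedback_hz(200.0, n) for n in (0, 50, 100)]
```

Under these assumptions, reaching the full one-semitone shift takes 100 utterances, and the final feedback is roughly 5.9% higher in frequency than the produced pitch.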

Seeing speech affects acoustic information processing in the human brainstem. (34/576)

Afferent auditory processing in the human brainstem is often assumed to be determined by acoustic stimulus features alone and immune to stimulation by other senses or cognitive factors. In contrast, we show that lipreading during speech perception influences early acoustic processing. Event-related brainstem potentials were recorded from ten healthy adults to concordant (acoustic-visual match), conflicting (acoustic-visual mismatch) and unimodal stimuli. Audiovisual (AV) interactions occurred as early as approximately 11 ms post-acoustic stimulation and persisted for the first 30 ms of the response. Furthermore, the magnitude of interaction depended on AV pairings. These findings indicate considerable plasticity in early auditory processing.

Acoustic characteristics of the vowel systems of six regional varieties of American English. (35/576)

Previous research by speech scientists on the acoustic characteristics of American English vowel systems has typically focused on a single regional variety, despite decades of sociolinguistic research demonstrating the extent of regional phonological variation in the United States. In the present study, acoustic measures of duration and first and second formant frequencies were obtained from five repetitions of 11 different vowels produced by 48 talkers representing both genders and six regional varieties of American English. Results revealed consistent variation due to region of origin, particularly with respect to the production of low vowels and high back vowels. The Northern talkers produced shifted low vowels consistent with the Northern Cities Chain Shift, the Southern talkers produced fronted back vowels consistent with the Southern Vowel Shift, and the New England, Midland, and Western talkers produced the low back vowel merger. These findings indicate that the vowel systems of American English are better characterized in terms of the region of origin of the talkers than in terms of a single set of idealized acoustic-phonetic baselines of "General" American English and provide benchmark data for six regional varieties.

Production and perception of clear speech in Croatian and English. (36/576)

Previous research has established that naturally produced English clear speech is more intelligible than English conversational speech. The major goal of this paper was to establish the presence of the clear speech effect in production and perception of a language other than English, namely Croatian. A systematic investigation of the conversational-to-clear speech transformations across languages with different phonological properties (e.g., large versus small vowel inventory) can provide a window into the interaction of general auditory-perceptual and phonological, structural factors that contribute to the high intelligibility of clear speech. The results of this study showed that naturally produced clear speech is a distinct, listener-oriented, intelligibility-enhancing mode of speech production in both languages. Furthermore, the acoustic-phonetic features of the conversational-to-clear speech transformation revealed cross-language similarities in clear speech production strategies. In both languages, talkers exhibited a decrease in speaking rate and an increase in pitch range, as well as an expansion of the vowel space. Notably, the findings of this study showed equivalent vowel space expansion in English and Croatian clear speech, despite the difference in vowel inventory size across the two languages, suggesting that the extent of vowel contrast enhancement in hyperarticulated clear speech is independent of vowel inventory size.

Intelligibilities of 1-octave rectangular bands spanning the speech spectrum when heard separately and paired. (37/576)

There is a need, both for speech theory and for many practical applications, to know the intelligibilities of individual passbands that span the speech spectrum when they are heard singly and in combination. While indirect procedures have been employed for estimating passband intelligibilities (e.g., the Speech Intelligibility Index), direct measurements have been blocked by the confounding contributions from transition band slopes that accompany filtering. A recent study has reported that slopes of several thousand dBA/octave produced by high-order finite impulse response filtering were required to produce the effectively rectangular bands necessary to eliminate appreciable contributions from transition bands [Warren et al., J. Acoust. Soc. Am. 115, 1292-1295 (2004)]. Using such essentially vertical slopes, the present study employed sentences, and reports the intelligibilities of their six 1-octave contiguous passbands having center frequencies from 0.25 to 8 kHz when heard alone, and for each of their 15 possible pairings.
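The band layout in this abstract can be checked with a little arithmetic. The sketch below assumes that each 1-octave band is centered geometrically, so its edges lie at the center frequency divided and multiplied by the square root of 2; that convention and all variable names are assumptions for illustration, not details taken from the paper.

```python
from itertools import combinations
from math import sqrt

# Six contiguous 1-octave bands: centers double from 0.25 kHz up to 8 kHz.
centers_khz = [0.25 * 2 ** i for i in range(6)]

# Assumed convention: band edges at fc / sqrt(2) and fc * sqrt(2) (geometric centering).
band_edges_khz = [(fc / sqrt(2), fc * sqrt(2)) for fc in centers_khz]

# Every unordered pairing of two distinct bands.
band_pairs = list(combinations(centers_khz, 2))
pair_count = len(band_pairs)  # "6 choose 2"
```

With six bands there are 6 × 5 / 2 = 15 unordered pairs, matching the 15 pairings reported in the abstract.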

Discrimination of speaker size from syllable phrases. (38/576)

The length of the vocal tract is correlated with speaker size and, so, speech sounds have information about the size of the speaker in a form that is interpretable by the listener. A wide range of different vocal tract lengths exist in the population and humans are able to distinguish speaker size from the speech. Smith et al. [J. Acoust. Soc. Am. 117, 305-318 (2005)] presented vowel sounds to listeners and showed that the ability to discriminate speaker size extends beyond the normal range of speaker sizes which suggests that information about the size and shape of the vocal tract is segregated automatically at an early stage in the processing. This paper reports an extension of the size discrimination research using a much larger set of speech sounds, namely, 180 consonant-vowel and vowel-consonant syllables. Despite the pronounced increase in stimulus variability, there was actually an improvement in discrimination performance over that supported by vowel sounds alone. Performance with vowel-consonant syllables was slightly better than with consonant-vowel syllables. These results support the hypothesis that information about the length of the vocal tract is segregated at an early stage in auditory processing.

Comparative analysis of perceptual evaluation, acoustic analysis and indirect laryngoscopy for vocal assessment of a population with vocal complaint. (39/576)

As a result of technological development, methods of voice evaluation have changed in both medical and speech-language pathology practice. AIM: To relate the results of perceptual evaluation, acoustic analysis and medical evaluation in the diagnosis of vocal and/or laryngeal disorders in a population with vocal complaints. STUDY DESIGN: Clinical prospective. MATERIAL AND METHOD: Twenty-nine people who attended a vocal health protection campaign were evaluated. They underwent perceptual evaluation (AFPA), acoustic analysis (AA), indirect laryngoscopy (LI) and telelaryngoscopy (TL). RESULTS: Correlations between the medical and speech-language pathology evaluation methods were examined, and statistical significance was tested with Fisher's exact test. There were statistically significant correlations between AFPA and LI, AFPA and TL, and LI and TL. CONCLUSION: This study, conducted during a vocal health protection campaign, found correlations among speech-language pathology evaluation, perceptual evaluation and clinical evaluation, as well as between vocal disorders and/or laryngeal medical exams.

Women use voice parameters to assess men's characteristics. (40/576)

The purpose of this study was: (i) to provide additional evidence regarding the existence of human voice parameters, which could be reliable indicators of a speaker's physical characteristics and (ii) to examine the ability of listeners to judge voice pleasantness and a speaker's characteristics from speech samples. We recorded 26 men enunciating five vowels. Voices were played to 102 female judges who were asked to assess vocal attractiveness and speakers' age, height and weight. Statistical analyses were used to determine: (i) which physical component predicted which vocal component and (ii) which vocal component predicted which judgment. We found that men with low-frequency formants and small formant dispersion tended to be older, taller and tended to have a high level of testosterone. Female listeners were consistent in their pleasantness judgment and in their height, weight and age estimates. Pleasantness judgments were based mainly on intonation. Female listeners were able to correctly estimate age by using formant components. They were able to estimate weight but we could not explain which acoustic parameters they used. However, female listeners were not able to estimate height, possibly because they used intonation incorrectly. Our study confirms that in all mammal species examined thus far, including humans, formant components can provide a relatively accurate indication of a vocalizing individual's characteristics. Human listeners have the necessary information at their disposal; however, they do not necessarily use it.