Interarticulator programming in VCV sequences: lip and tongue movements.
This study examined the temporal phasing of tongue and lip movements in vowel-consonant-vowel sequences where the consonant is a bilabial stop consonant /p, b/ and each vowel is one of /i, a, u/; only asymmetrical vowel contexts were included in the analysis. Four subjects participated. Articulatory movements were recorded using a magnetometer system. The onset of the tongue movement from the first to the second vowel almost always occurred before the oral closure. Most of the tongue movement trajectory from the first to the second vowel took place during the oral closure for the stop. For all subjects, the onset of the tongue movement occurred earlier with respect to the onset of the lip closing movement as the tongue movement trajectory increased. The influence of consonant voicing and vowel context on interarticulator timing and tongue movement kinematics varied across subjects. Overall, the results are compatible with the hypothesis that there is a temporal window before the oral closure for the stop during which the tongue movement can start. A very early onset of the tongue movement relative to the stop closure together with an extensive movement before the closure would most likely produce an extra vowel sound before the closure.
Exchange of stuttering from function words to content words with age.
Dysfluencies on function words in the speech of people who stutter mainly occur when function words precede, rather than follow, content words (Au-Yeung, Howell, & Pilgrim, 1998). It is hypothesized that such function word dysfluencies occur when the plan for the subsequent content word is not ready for execution. Repetition and hesitation on the function words buy time to complete the plan for the content word. Stuttering arises when speakers abandon the use of this delaying strategy and carry on, attempting production of the subsequent, partly prepared content word. To test these hypotheses, the relationship between dysfluency on function and content words was investigated in the spontaneous speech of 51 people who stutter and 68 people who do not stutter. These participants were subdivided into the following age groups: 2-6-year-olds, 7-9-year-olds, 10-12-year-olds, teenagers (13-18 years), and adults (20-40 years). Very few dysfluencies occurred for either fluency group on function words that occupied a position after a content word. For both fluency groups, dysfluency within each phonological word occurred predominantly on either the function word preceding the content word or on the content word itself, but not both. Fluent speakers had a higher percentage of dysfluency on initial function words than content words. Whether dysfluency occurred on initial function words or content words changed over age groups for speakers who stutter. For the 2-6-year-old speakers who stutter, there was a higher percentage of dysfluencies on initial function words than content words. In subsequent age groups, dysfluency decreased on function words and increased on content words. These data are interpreted as suggesting that fluent speakers use repetition of function words to delay production of the subsequent content words, whereas people who stutter carry on and attempt a content word on the basis of an incomplete plan.
PET imaging of cochlear-implant and normal-hearing subjects listening to speech and nonspeech.
Functional neuroimaging with positron emission tomography (PET) was used to compare the brain activation patterns of normal-hearing (NH) subjects with those of postlingually deaf, cochlear-implant (CI) subjects listening to speech and nonspeech signals. The speech stimuli were derived from test batteries for assessing speech-perception performance of hearing-impaired subjects with different sensory aids. Subjects were scanned while passively listening to monaural (right ear) stimuli in five conditions: Silent Baseline, Word, Sentence, Time-reversed Sentence, and Multitalker Babble. Both groups showed bilateral activation in superior and middle temporal gyri to speech and backward speech. However, group differences were observed in the Sentence compared to Silence condition. CI subjects showed more activated foci in right temporal regions, where lateralized mechanisms for prosodic (pitch) processing have been well established; NH subjects showed a focus in the left inferior frontal gyrus (Brodmann's area 47), where semantic processing has been implicated. Multitalker Babble activated auditory temporal regions in the CI group only. Whereas NH listeners probably habituated to this multitalker babble, the CI listeners may be using a perceptual strategy that emphasizes 'coarse' coding to perceive this stimulus globally as speechlike. The group differences provide the first neuroimaging evidence suggesting that postlingually deaf CI and NH subjects may engage differing perceptual processing strategies under certain speech conditions.
Phonotactics, neighborhood activation, and lexical access for spoken words.
Probabilistic phonotactics refers to the relative frequencies of segments and sequences of segments in spoken words. Neighborhood density refers to the number of words that are phonologically similar to a given word. Despite a positive correlation between phonotactic probability and neighborhood density, nonsense words with high probability segments and sequences are responded to more quickly than nonsense words with low probability segments and sequences, whereas real words occurring in dense similarity neighborhoods are responded to more slowly than real words occurring in sparse similarity neighborhoods. This contradiction may be resolved by hypothesizing that effects of probabilistic phonotactics have a sublexical focus and that effects of similarity neighborhood density have a lexical focus. The implications of this hypothesis for models of spoken word recognition are discussed.
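As an illustration of the neighborhood density measure defined in the abstract above, the following minimal sketch counts the words in a toy lexicon that are phonologically similar to a target word. The abstract does not say how similarity is computed; the sketch assumes the common operationalization in which two words are neighbors if they differ by exactly one phoneme substitution, addition, or deletion. The function names, the phoneme symbols, and the toy lexicon are all hypothetical.

```python
def is_neighbor(a, b):
    """Return True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion (an assumed similarity criterion)."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: removing one phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighborhood_density(word, lexicon):
    """Count the lexicon entries that are neighbors of `word`."""
    return sum(is_neighbor(word, other) for other in lexicon)


# Hypothetical toy lexicon; each word is a tuple of phoneme symbols.
TOY_LEXICON = [
    ("k", "ae", "t"),       # cat (the target itself is not counted)
    ("b", "ae", "t"),       # bat: substitution
    ("k", "ah", "t"),       # cut: substitution
    ("ae", "t"),            # at: deletion
    ("k", "ae", "t", "s"),  # cats: addition
    ("d", "ao", "g"),       # dog: not a neighbor
]

density = neighborhood_density(("k", "ae", "t"), TOY_LEXICON)
```

Words are represented as tuples of phoneme symbols rather than spelled strings, so that the comparison operates on segments and not on letters.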
Specialization of left auditory cortex for speech perception in man depends on temporal coding.
Speech perception requires cortical mechanisms capable of analysing and encoding successive spectral (frequency) changes in the acoustic signal. To study temporal speech processing in the human auditory cortex, we recorded intracerebral evoked potentials to syllables in right and left human auditory cortices including Heschl's gyrus (HG), planum temporale (PT) and the posterior part of superior temporal gyrus (area 22). Natural voiced (/ba/, /da/, /ga/) and voiceless (/pa/, /ta/, /ka/) syllables, spoken by a native French speaker, were used to study the processing of a specific temporally based acoustico-phonetic feature, the voice onset time (VOT). This acoustic feature is present in nearly all languages, and it is the VOT that provides the basis for the perceptual distinction between voiced and voiceless consonants. The present results show a lateralized processing of acoustic elements of syllables. First, processing of voiced and voiceless syllables is distinct in the left, but not in the right HG and PT. Second, only the evoked potentials in the left HG, and to a lesser extent in PT, reflect a sequential processing of the different components of the syllables. Third, we show that this acoustic temporal processing is not limited to speech sounds but applies also to non-verbal sounds mimicking the temporal structure of the syllable. Fourth, there was no difference between responses to voiced and voiceless syllables in either left or right area 22. Our data suggest that a single mechanism in the auditory cortex, involved in general (not only speech-specific) temporal processing, may underlie the further processing of verbal (and non-verbal) stimuli. This coding, bilaterally localized in auditory cortex in animals, takes place specifically in the left HG in man. A defect of this mechanism could account for hearing discrimination impairments associated with language disorders.
Training Japanese listeners to identify English /r/ and /l/: long-term retention of learning in perception and production.
Previous work from our laboratories has shown that monolingual Japanese adults who were given intensive high-variability perceptual training improved in both perception and production of English /r/-/l/ minimal pairs. In this study, we extended those findings by investigating the long-term retention of learning in both perception and production of this difficult non-native contrast. Results showed that 3 months after completion of the perceptual training procedure, the Japanese trainees maintained their improved levels of performance on the perceptual identification task. Furthermore, perceptual evaluations by native American English listeners of the Japanese trainees' pretest, posttest, and 3-month follow-up speech productions showed that the trainees retained their long-term improvements in the general quality, identifiability, and overall intelligibility of their English /r/-/l/ word productions. Taken together, the results provide further support for the efficacy of high-variability laboratory speech sound training procedures, and suggest an optimistic outlook for the application of such procedures for a wide range of "special populations."
Interarticulator phasing, locus equations, and degree of coarticulation.
A locus equation plots the frequency of the second formant at vowel onset against the target frequency of the same formant for the vowel in a consonant-vowel sequence, across different vowel contexts. It has generally been assumed that the slope of the locus equation reflects the degree of coarticulation between the consonant and the vowel, with a steeper slope showing more coarticulation. This study examined the articulatory basis for this assumption. Four subjects participated and produced VCV sequences of the consonants /b, d, g/ and the vowels /i, a, u/. The movements of the tongue and the lips were recorded using a magnetometer system. One articulatory measure was the temporal phasing between the onset of the lip closing movement for the bilabial consonant and the onset of the tongue movement from the first to the second vowel in a VCV sequence. A second measure was the magnitude of the tongue movement during the oral stop closure, averaged across four receivers on the tongue. A third measure was the magnitude of the tongue movement from the onset of the second vowel to the tongue position for that vowel. When compared with the corresponding locus equations, no measure showed any support for the assumption that the slope serves as an index of the degree of coarticulation between the consonant and the vowel.
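The locus equation described in the abstract above is a straight-line fit of second-formant (F2) frequency at vowel onset against the F2 target frequency of the vowel, and its slope is the quantity conventionally read as an index of coarticulation. The following minimal sketch shows one way such a slope could be obtained by ordinary least squares. It is an illustration under stated assumptions, not the authors' analysis: the function name is hypothetical and the formant values are invented for the example, not data from the study.

```python
def fit_locus_equation(f2_vowel, f2_onset):
    """Least-squares fit of F2_onset = slope * F2_vowel + intercept.

    f2_vowel: F2 target frequencies (Hz) of the vowels, one per token.
    f2_onset: F2 frequencies (Hz) at vowel onset for the same tokens.
    Returns (slope, intercept).
    """
    n = len(f2_vowel)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 tokens")
    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    if sxx == 0:
        raise ValueError("F2 targets must vary across vowel contexts")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_vowel, f2_onset))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented F2 values (Hz) for one consonant before three vowels such as
# /i, a, u/; chosen to lie exactly on the line onset = 0.5 * target + 800.
f2_targets = [2200.0, 1200.0, 900.0]
f2_onsets = [1900.0, 1400.0, 1250.0]

slope, intercept = fit_locus_equation(f2_targets, f2_onsets)
```

Under the traditional reading, a slope near 1 means the onset frequency tracks the vowel target closely, and a slope near 0 means the onset stays near a fixed locus regardless of the vowel. The study summarized above found no articulatory support for treating that slope as an index of coarticulation.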
Recognition of spoken words by native and non-native listeners: talker-, listener-, and item-related factors.
In order to gain insight into the interplay between the talker-, listener-, and item-related factors that influence speech perception, a large multi-talker database of digitally recorded spoken words was developed, and was then submitted to intelligibility tests with multiple listeners. Ten talkers produced two lists of words at three speaking rates. One list contained lexically "easy" words (words with few phonetically similar sounding "neighbors" with which they could be confused), and the other list contained lexically "hard" words (words with many phonetically similar sounding "neighbors"). An analysis of the intelligibility data obtained with native speakers of English (experiment 1) showed a strong effect of lexical similarity. Easy words had higher intelligibility scores than hard words. A strong effect of speaking rate was also found whereby slow and medium rate words had higher intelligibility scores than fast rate words. Finally, a relationship was also observed between the various stimulus factors whereby the perceptual difficulties imposed by one factor, such as a hard word spoken at a fast rate, could be overcome by the advantage gained through the listener's experience and familiarity with the speech of a particular talker. In experiment 2, the investigation was extended to another listener population, namely, non-native listeners. Results showed that the ability to take advantage of surface phonetic information, such as a consistent talker across items, is a perceptual skill that transfers easily from first to second language perception. However, non-native listeners had particular difficulty with lexically hard words even when familiarity with the items was controlled, suggesting that non-native word recognition may be compromised when fine phonetic discrimination at the segmental level is required. Taken together, the results of this study provide insight into the signal-dependent and signal-independent factors that influence spoken language processing in native and non-native listeners.