Magnetoencephalographic evidence for a precise forward model in speech production. (49/576)

Electroencephalography and magnetoencephalography studies have shown that auditory cortical responses to self-produced speech are attenuated when compared with responses to tape-recorded speech, but that this attenuation disappears if auditory feedback is altered. These results suggest that auditory feedback during speaking is processed by comparing the feedback with an internal prediction. The present study used magnetoencephalography to investigate the precision of this matching process. Auditory responses to speech feedback were recorded under altered feedback conditions. During speech production, the M100 amplitude was reduced most strongly in response to participants' own unaltered voice feedback, relative to pitch-shifted and alien (another talker's) speech feedback. This suggests that the feedback comparison process may be very precise, allowing the auditory system to distinguish between internal and external sources of auditory information.
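
As a toy illustration of the matching process described above, the sketch below scores auditory suppression by the correlation between an internally predicted signal and the actual feedback; the waveforms, the correlation-based match rule, and the name m100_suppression are illustrative assumptions, not the study's model.

```python
import numpy as np

def m100_suppression(predicted, actual):
    """Return a 0-to-1 suppression factor from the prediction/feedback match."""
    r = np.corrcoef(predicted, actual)[0, 1]
    return max(r, 0.0)  # treat uncorrelated or anti-correlated feedback as no match

rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.1, 1600)
own = np.sin(2 * np.pi * 120 * t)          # internally predicted own-voice feedback
shifted = np.sin(2 * np.pi * 150 * t)      # pitch-shifted feedback
alien = rng.standard_normal(t.size)        # a different talker entirely

for label, fb in (("unaltered", own), ("pitch-shifted", shifted), ("alien", alien)):
    print(f"{label:13s} suppression {m100_suppression(own, fb):.2f}")
```

Only the self-matching feedback yields full suppression, which is the qualitative pattern the M100 result implies.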

Predictions of formant-frequency discrimination in noise based on model auditory-nerve responses. (50/576)

To better understand how the auditory system extracts speech signals in the presence of noise, discrimination thresholds for the second formant frequency were predicted with simulations of auditory-nerve responses. These predictions employed either average-rate information or combined rate and timing information, and either populations of model fibers tuned across a wide range of frequencies or a subset of fibers tuned to a restricted frequency range. In general, combined temporal and rate information for a small population of model fibers tuned near the formant frequency was most successful in replicating the trends reported in behavioral data for formant-frequency discrimination. To explore the nature of the temporal information that contributed to these results, predictions based on model auditory-nerve responses were compared to predictions based on the average rates of a population of cross-frequency coincidence detectors. These comparisons suggested that average response rate (count) of cross-frequency coincidence detectors did not effectively extract important temporal information from the auditory-nerve population response. Thus, the relative timing of action potentials across auditory-nerve fibers tuned to different frequencies was not the aspect of the temporal information that produced the trends in formant-frequency discrimination thresholds.
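
The cross-frequency coincidence detector used as a comparison model can be sketched minimally as follows; the Poisson spike trains, firing rates, and coincidence window here are illustrative assumptions, not the published auditory-nerve model.

```python
import numpy as np

rng = np.random.default_rng(1)

def poisson_spikes(rate_hz, dur_s, dt=1e-4):
    """Binary spike train sampled at step dt from a homogeneous Poisson process."""
    return rng.random(int(dur_s / dt)) < rate_hz * dt

def coincidence_rate(train_a, train_b, window_bins=5, dt=1e-4):
    """Average output rate of a detector that fires when its two input fibers
    spike within +/- window_bins time bins of each other."""
    idx_a = np.flatnonzero(train_a)
    idx_b = np.flatnonzero(train_b)
    hits = sum(np.any(np.abs(idx_b - i) <= window_bins) for i in idx_a)
    return hits / (len(train_a) * dt)  # coincidences per second

# Two model fibers tuned near a second formant, with similar average rates.
f2_fiber = poisson_spikes(150, 1.0)
neighbor_fiber = poisson_spikes(150, 1.0)
print(f"coincidence rate: {coincidence_rate(f2_fiber, neighbor_fiber):.1f} /s")
```

The detector's output is itself just an average rate, which is why, per the abstract, it can discard the fine cross-fiber timing that mattered for the discrimination predictions.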

How far, how long: on the temporal scope of prosodic boundary effects. (51/576)

Acoustic lengthening at prosodic boundaries is well explored, and the articulatory bases for this lengthening are becoming better understood. However, the temporal scope of prosodic boundary effects has not been examined in the articulatory domain. The few acoustic studies examining the distribution of lengthening indicate that boundary effects extend from one to three syllables before the boundary, and that effects diminish as distance from the boundary increases. This diminishment is consistent with the pi-gesture model of prosodic influence [Byrd and Saltzman, J. Phonetics 31, 149-180 (2003)]. The present experiment tests the preboundary and postboundary scope of articulatory lengthening at an intonational phrase boundary. Movement-tracking data are used to evaluate durations of consonant closing and opening movements, acceleration durations, and consonant spatial magnitude. Results indicate that prosodic boundary effects are concentrated locally, near the phrase boundary in both directions, and diminish in magnitude at more remote positions for those subjects who exhibit extended effects. Small postboundary effects that are compensatory in direction are also observed.
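
The pi-gesture model cited above can be caricatured in a few lines: a prosodic gesture's activation slows the local clock, so lengthening peaks at the boundary and tapers with distance. The triangular activation, its scope, and the slowing gain below are illustrative assumptions, not the published model's parameters.

```python
import numpy as np

def pi_activation(t, boundary=0.5, scope=0.15):
    """Triangular activation peaking at the boundary, zero beyond its scope."""
    return np.clip(1.0 - np.abs(t - boundary) / scope, 0.0, 1.0)

def local_slowing(t, gain=0.6):
    """Instantaneous stretch factor: 1 far from the boundary, 1 + gain at it."""
    return 1.0 + gain * pi_activation(t)

t = np.linspace(0.0, 1.0, 1001)          # nominal (unstretched) utterance time
dt = t[1] - t[0]
elapsed = np.sum(local_slowing(t) * dt)  # wall-clock duration after warping
print(f"stretch at boundary: {local_slowing(np.array([0.5]))[0]:.2f}x")
print(f"utterance duration:  {elapsed:.3f} s vs 1.000 s nominal")
```

Movements near the boundary inherit the largest stretch, reproducing the locally concentrated, distance-diminishing lengthening the experiment reports.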

Somatosensory precision in speech production. (52/576)

Speech production is dependent on both auditory and somatosensory feedback. Although audition may appear to be the dominant sensory modality in speech production, somatosensory information plays a role that extends from brainstem responses to cortical control. Accordingly, the motor commands that underlie speech movements may have somatosensory as well as auditory goals. Here we provide evidence that, independent of the acoustics, somatosensory information is central to achieving the precision requirements of speech movements. We were able to dissociate auditory and somatosensory feedback by using a robotic device that altered the jaw's motion path, and hence proprioception, without affecting speech acoustics. The loads were designed to target either the consonant- or vowel-related portion of an utterance because these are the major sound categories in speech. We found that, even in the absence of any effect on the acoustics, subjects learned to correct to an equal extent for both kinds of loads. This finding suggests that there are comparable somatosensory precision requirements for both kinds of speech sounds. We provide experimental evidence that the neural control of stiffness, or impedance (the resistance to displacement), provides for somatosensory precision in speech production.
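
The closing point about stiffness lends itself to a one-equation sketch: for a spring-like jaw model under a static load, K x = F, so deviation from the target path shrinks as impedance rises. The load and stiffness values below are illustrative, not measured values from the study.

```python
def path_error(load_n, stiffness_n_per_m):
    """Static deviation (m) of a spring-like jaw under a constant load:
    K * x = F  =>  x = F / K, so higher impedance means smaller displacement."""
    return load_n / stiffness_n_per_m

for k in (50.0, 150.0, 450.0):  # increasing jaw stiffness, N/m
    print(f"stiffness {k:5.0f} N/m -> deviation {1000 * path_error(2.0, k):5.1f} mm")
```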

Children hear the forest. (53/576)

How do children develop the ability to recognize phonetic structure in their native language with the accuracy and efficiency of adults? In particular, how do children learn what information in speech signals is relevant to linguistic structure in their native language, and what information is not? These questions are the focus of considerable investigation, including several studies by Catherine Mayo and Alice Turk. A Letter proposed by Mayo and Turk questioned the comparative role of the isolated consonant-vowel formant transition in children's and adults' speech perception. Although Mayo and Turk ultimately decided to withdraw their Letter, this note, originally written as a reply, was retained. It highlights the fact that the isolated formant transition must be viewed as part of a more global aspect of structure in the acoustic speech stream, one that arises from the rather slowly changing adjustments made in vocal-tract geometry. Only by maintaining this perspective on acoustic speech structure can we ensure that we design stimuli that provide valid tests of our hypotheses and interpret results in a meaningful way.

Speech production: the force of your words. (54/576)

Research on speech production has traditionally focused on how acoustic goals are met. A recent study shows that talking also involves somatosensory goals that do not have any acoustic consequences.

The mean matters: effects of statistically defined nonspeech spectral distributions on speech categorization. (55/576)

Adjacent speech, and even nonspeech, contexts influence phonetic categorization. Four experiments investigated how preceding sequences of sine-wave tones influence phonetic categorization. This experimental paradigm provides a means of investigating the statistical regularities of acoustic events that influence online speech categorization and, reciprocally, reveals regularities of the sound environment tracked by auditory processing. The tones comprising the sequences were drawn from distributions sampling different acoustic frequencies. Results indicate that whereas the mean of the distributions predicts contrastive shifts in speech categorization, the variability of the distributions has little effect. Moreover, speech categorization is influenced by the global mean of the tone sequence, without significant influence of local statistical regularities within the tone sequence. Further supporting a strong relation between the effect and the average spectrum of the sequence, notched-noise spectral complements of the tone sequences produce a complementary effect on speech categorization. Lastly, these effects are modulated by the number of tones in the acoustic history and the overall duration of the sequence, but not by the density with which the distribution defining the sequence is sampled. Results are discussed in light of stimulus-specific adaptation to statistical regularity in the acoustic input, and a speculative link to talker normalization is postulated.
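
The stimulus logic can be sketched by sampling tone sequences from frequency distributions that share a mean but differ in variance; per the results, the shared global mean, not the variance, should drive the contrastive shift. The distribution parameters below are illustrative, not the experiments' values.

```python
import numpy as np

rng = np.random.default_rng(2)

def tone_sequence(mean_hz, sd_hz, n_tones=21):
    """Sample tone frequencies (Hz) for one acoustic-history sequence."""
    return rng.normal(mean_hz, sd_hz, n_tones)

low_var = tone_sequence(1800.0, 50.0)
high_var = tone_sequence(1800.0, 300.0)
for name, seq in (("low variance", low_var), ("high variance", high_var)):
    print(f"{name:13s} mean {seq.mean():7.1f} Hz, sd {seq.std():6.1f} Hz")
# Per the abstract, both sequences should shift categorization similarly,
# because the shared global mean (1800 Hz here) is the effective property.
```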

Perceptual adaptation to spectrally shifted vowels: training with nonlexical labels. (56/576)

Although normal-hearing (NH) and cochlear implant (CI) listeners are able to adapt to spectrally shifted speech to some degree, auditory training has been shown to provide more complete and/or accelerated adaptation. However, it is unclear whether listeners use auditory and visual feedback to improve discrimination of speech stimuli, or to learn the identity of speech stimuli. The present study investigated the effects of training with lexical and nonlexical labels on NH listeners' perceptual adaptation to spectrally degraded and spectrally shifted vowels. An eight-channel sine wave vocoder was used to simulate CI speech processing. Two degrees of spectral shift (moderate and severe shift) were studied with three training paradigms, including training with lexical labels (i.e., "hayed," "had," "who'd," etc.), training with nonlexical labels (i.e., randomly assigned letters "f," "b," "g," etc.), and repeated testing with lexical labels (i.e., "test-only" paradigm without feedback). All training and testing were conducted over 5 consecutive days, with two to four training exercises per day. Results showed that with the test-only paradigm, lexically labeled vowel recognition significantly improved for moderately shifted vowels; however, there was no significant improvement for severely shifted vowels. Training with nonlexical labels significantly improved the recognition of nonlexically labeled vowels for both shift conditions; however, this improvement failed to generalize to lexically labeled vowel recognition with severely shifted vowels. Training with lexical labels significantly improved lexically labeled vowel recognition with severely shifted vowels. These results suggest that storage and retrieval of speech patterns in the central nervous system are somewhat robust to tonotopic distortion and spectral degradation. Although training with nonlexical labels may improve discrimination of spectrally distorted peripheral patterns, lexically meaningful feedback is needed to identify these peripheral patterns. The results also suggest that training with lexically meaningful feedback may be beneficial to CI users, especially patients with shallow electrode insertion depths.
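
An eight-channel sine-wave vocoder of the general kind described can be sketched as below (using SciPy); the filter design, band edges, and spectral-shift rule are illustrative assumptions rather than the study's exact processing.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def sine_vocoder(x, fs, n_ch=8, lo=200.0, hi=7000.0, shift=1.0):
    """Analyze x in n_ch log-spaced bands, then resynthesize each band's
    envelope on a sine carrier at the (optionally shifted) band center."""
    edges = np.geomspace(lo, hi, n_ch + 1)
    t = np.arange(len(x)) / fs
    y = np.zeros(len(x))
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfilt(sos, x)))  # band envelope
        fc = shift * np.sqrt(f1 * f2)           # carrier, shifted basally if shift > 1
        y += env * np.sin(2 * np.pi * fc * t)
    return y

fs = 16000
t = np.arange(fs) / fs
vowel = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
processed = sine_vocoder(vowel, fs, shift=1.3)  # crude stand-in for a severe shift
print(f"output rms: {np.sqrt(np.mean(processed ** 2)):.3f}")
```

Raising the shift factor moves every carrier away from its analysis band, mimicking the tonotopic mismatch of a shallow electrode insertion that the training paradigms were designed to overcome.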