Using EEG/MEG: Testing hypotheses about speech perception - IV

van Wassenhove et al. (2005)
• EEG
• 3 experiments
• N1/P2
• Implications for models of speech perception

van Wassenhove et al. (2005)
• Experiment 1 (n=16)
• Different blocks:
‣ AV stimuli ([pa], [ta], [ka], McGurk [ta])
‣ Visual only ([pa], [ta], [ka])
‣ Audio only ([pa], [ta], [ka])
• Task: What do you hear (AV & A) / see (V)?
‣ Forced choice (pa? ta? ka?)
N1 and P2 Amplitude Reduction for AV Stimuli
• Q: Can this be accounted for by superposition of audio + video?
• A: Estimate the sum of Audio-only + Video-only responses and see how much AV deviates from it.
• Result: The AV results cannot be accounted for on the basis of superposition alone.
• BUT... Potential confound: in unimodal V blocks, one always knows whether to expect an auditory stimulus.

van Wassenhove et al. (2005)
• Experiment 2 (n=10)
• Blocks mixed (controls for the confound):
‣ AV stimuli ([pa], [ta], [ka], McGurk [ta])
‣ Visual only ([pa], [ta], [ka])
‣ Audio only ([pa], [ta], [ka])
• Task: What do you hear (AV & A) / see (V)?
‣ Forced choice (pa? ta? ka?)
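The superposition check above can be sketched numerically. The following toy example uses made-up Gaussian waveforms standing in for the grand-average ERPs (all shapes and values are invented for illustration, not the paper's data):

```python
import numpy as np

# Toy single-channel ERPs (µV over time). Superposition test: if the AV
# response were just the linear sum of the unimodal responses, then
# AV - (A + V) would be approximately zero everywhere.
t = np.linspace(-0.1, 0.5, 601)                      # seconds around audio onset
erp_a = -3.0 * np.exp(-((t - 0.10) / 0.02) ** 2)     # audio-alone N1-like dip
erp_v = -0.5 * np.exp(-((t - 0.05) / 0.05) ** 2)     # video-alone response
erp_av = -2.0 * np.exp(-((t - 0.09) / 0.02) ** 2)    # AV: reduced and earlier

predicted_av = erp_a + erp_v        # superposition prediction
residual = erp_av - predicted_av    # deviation from pure additivity

# A clearly non-zero residual means AV != A + V: the AV response cannot be
# explained by summing the unimodal responses.
print(np.abs(residual).max())
```

In practice this comparison is done per subject and tested statistically, but the logic is exactly this subtraction.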
What’s an interaction?
[Illustrative interaction plot: "statements" as a function of "human proximity to cat"]
Results: Experiments 1 & 2
• Comparing A and congruent AV:
‣ Interaction between Modality (A vs. AV) and Stimulus Identity (pa vs. ta vs. ka): latency facilitation + amplitude reduction ordered p < t < k
‣ Place of articulation affects processing differently! - WHY?
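Concretely, an interaction means the effect of one factor depends on the level of the other. A toy numeric sketch (the latency values are invented for illustration; only the p < t < k pattern echoes the paper):

```python
import numpy as np

# Made-up N1 peak latencies (ms) for a Modality x Stimulus design.
latency = np.array([
    [120.0, 130.0, 140.0],   # audio-alone:  pa, ta, ka
    [105.0, 122.0, 136.0],   # audiovisual:  pa, ta, ka
])
facilitation = latency[0] - latency[1]   # A minus AV, per syllable

# No interaction would mean the A-vs-AV difference is the same for every
# syllable; here it differs across syllables (15, 8, 4 ms), i.e. the effect
# of Modality depends on Stimulus Identity -- an interaction.
print(facilitation)
```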
[Fig. 1 schematic (time axis in ms): A[ka], VOT = 85 ms; A[pa], VOT = 115 ms; A[ta], VOT = 155 ms; trigger pulse marks the audio onset]
Fig. 1. Timing in natural AV speech. Articulatory movements of the face naturally precede the onset of the audio speech signal by a few tens of milliseconds. The first detectable motion frame marks the aspiration preceding the production of the consonantal burst in natural speech. Values are for the stimuli used here. The consonantal burst in the audio portion is the "audio onset" and corresponds to the onset or "index zero" in all figures and text unless otherwise indicated. VOT, voice onset time.
Results: Experiments 1 & 2
• Information present in V is not equally informative about place of articulation either:
‣ [pa] correctly identified 100% of the time
‣ [ka] correctly identified only 65% of the time
• Peak latency: how early the waveform exhibits a peak
• Peak amplitude: "height" of the peak
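Both measures are simple to extract from a waveform. A minimal sketch on a toy ERP (the sampling rate, search windows, and waveform shape are assumptions for illustration):

```python
import numpy as np

# Toy ERP: a negative deflection (N1-like) followed by a positive one (P2-like).
fs = 1000                                  # sampling rate in Hz (assumed)
t = np.arange(0, 400) / fs                 # 0-400 ms after audio onset
erp = (-4.0 * np.exp(-((t - 0.10) / 0.015) ** 2)    # N1 near 100 ms
       + 3.0 * np.exp(-((t - 0.20) / 0.020) ** 2))  # P2 near 200 ms

def peak(erp, t, window, polarity):
    """Peak latency (s) and amplitude (µV) within a search window.

    polarity=-1 finds the most negative point (N1), +1 the most positive (P2).
    """
    mask = (t >= window[0]) & (t <= window[1])
    idx = np.argmax(polarity * erp[mask])
    return t[mask][idx], erp[mask][idx]

n1_lat, n1_amp = peak(erp, t, (0.05, 0.15), polarity=-1)
p2_lat, p2_amp = peak(erp, t, (0.15, 0.30), polarity=+1)
print(round(n1_lat * 1000), round(p2_lat * 1000))  # → 100 200
```

Latency facilitation in the paper is then just latency(A) minus latency(AV) per component, so a positive value means AV peaks earlier.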
[Fig. 3 plot: Temporal Facilitation (ms) and Absolute Amplitude Reduction (µV) as a function of Correct Identification in Visual Speech Alone (65%-95%)]
Fig. 3. Latency facilitation and amplitude reduction. Shown is the latency and amplitude difference of N1/P2 in AV syllables as a function of correct identification in the visual-alone condition (experiments 1 and 2, n = 26). The latency (Upper) and amplitude (Lower) differences are the latency (or amplitude) values for the A condition minus the latency (or amplitude) for the AV condition for the N1 (blue) and P2 (red) ERPs. A positive value means that AV is faster than A. The temporal facilitation of the N1 and P2 increased as the saliency (correct identification) of visual inputs improved. The amplitude reduction in AV speech (Lower) remained constant across syllables and is independent of visual saliency.
Don’t do this at home kids!
Bad graph design.
[Bar charts: P2 Latency Facilitation (ms) and P2 Amplitude Reduction (µV) for congruent AV [pa] vs. McGurk [ta] fusion]
Results: Experiments 1 & 2
• What happens then in the McGurk case?
‣ Predictions:
- Not as much latency facilitation for the McGurk AV stimulus as for a congruent AV [pa], which has redundant and highly predictive information
- Amplitude reduction (seems to be independent of place of articulation) - Why?
‣ Maybe an attentional effect: the visual modality divides attention that could otherwise be dedicated to the auditory stimulus

van Wassenhove et al. (2005)
• Experiment 3 (n=10; also in exp. 1)
‣ Only incongruent AV stimuli (McGurk [ta] & combinations, e.g. [Vta+Aka])
• Task: What do you see?
‣ Forced choice (pa? ta? ka?)
[Fig. 4 bar charts: Temporal Facilitation (ms) and Amplitude Decrease (µV) for congruent AV [pa], fusion (McGurk [ta]), and fusion with visual attention; Exps 1 & 2 vs. Exp 3]
Fig. 4. P2 latency facilitation and intersensory bias. Compared to congruent AV [pa] (a, Left), no latency facilitation was observed for fusion (a, Center). When attention is directed to visual inputs in AV conditions, temporal facilitation is recovered in fusion (a, Right), suggesting that visual attention can enhance the biasing effect of a weak predictor. (b) The amplitude decrease was consistent across all stimuli and independent of attended modality, pointing to the automaticity of AV speech integration.
No Amplitude Reduction Change for Visual Attention
Fig. 5. Analysis by synthesis in AV speech integration. Visual speech inputs typically precede the auditory signals and elicit an abstract speech representation. The predictive value of the abstract token varies as a function of visual saliency and is updated as more visual information becomes available. Incoming auditory speech inputs are evaluated against the prediction. Redundant information shared by the predictor and the auditory inputs is decorrelated, such that greater redundancy leads to greater decorrelation. The stronger the predictor, the faster the auditory speech processing. The N1/P2 complex reflects the residual error of the evaluation process and is observed as an amplitude decrease.
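The caption's logic can be caricatured as a toy model: a matching visual prediction "explains away" part of the auditory input, so the residual (and hence N1/P2 amplitude) shrinks as the predictor gets stronger. The function and saliency values below are hypothetical, loosely echoing the visual-alone identification rates:

```python
# Hypothetical prediction strengths per syllable (assumed values, roughly
# mirroring how identifiable each articulation is from video alone).
visual_saliency = {"pa": 1.00, "ta": 0.80, "ka": 0.65}

def residual_error(prediction_strength, match=True):
    """Residual left after evaluating the audio against the visual prediction.

    A stronger, matching predictor decorrelates more of the redundant signal,
    leaving less residual (smaller N1/P2) and faster processing. Incongruent
    audio is not explained at all, so the full signal remains.
    """
    if match:
        return 1.0 - prediction_strength
    return 1.0

for syllable, strength in visual_saliency.items():
    print(syllable, residual_error(strength))
```

This is only a caricature of the verbal model, but it makes the key prediction explicit: residual error (and so N1/P2 amplitude) decreases monotonically with visual saliency for congruent stimuli.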
Analysis-by-Synthesis
• Visual-to-Auditory prediction implies a Visual-to-Auditory transformation
• Remember phonology? What is a distinctive feature again?