Perceiving spoken language benefits from face-to-face interaction, because the mouth supplies useful visual information about how specific sounds are articulated.
For instance, in an up-close and unobstructed situation, an individual can watch their friend mention going to the beach. In this case, they use visual input—observing the movement around the lips and tongue—to clearly comprehend what was said.
However, if the friend continues to talk out of sight in another room, they might be tempted to watch the muted television and therefore must solely rely on the obstructed voice to make sense of the message.
In this case, the word actually spoken at the tail end, "pick," was combined with a silently mouthed "kick" on the television and misheard as "tick." This is an example of the McGurk Effect, a perceptual illusion that arises from a mismatch between auditory and visual cues.
This video demonstrates how to construct the audiovisual stimuli to test the phenomenon originally discovered by McGurk and MacDonald. It also investigates how vision interacts with sound production to understand how individuals learn language at a very young age.
In this experiment, participants are asked to watch muted videos, in which a word like gain is mouthed, while a sound such as bane is played simultaneously in the background. Afterwards, they are asked to share what they heard.
To understand the outcome and how the illusion is produced, let's first discuss how phonemes, the minimal units of speech sound, are articulated.
For example, bane and gain share the same sounds in every position except the first, where they differ: /b/ versus /g/.
Although words with these initial phonemes may sound similar, when /g/ is shown and /b/ is played, individuals are expected to hear a completely different third sound—/d/—instead.
The sound /d/ is heard because all three sounds are produced in essentially the same manner, differing only in where the speaker obstructs the airflow, a location called the point of articulation, or POA.
For instance, when a /b/ sound is made, the lips provide the obstruction, resulting in a labial POA, whereas for /g/ the POA is referred to here as palatal, at the back of the mouth (phoneticians more precisely call this place velar). As for /d/, the POA is described as dental, with the tongue touching just behind the upper teeth (strictly, the alveolar ridge in English).
When the brain integrates the conflicting visual /g/ and auditory /b/, it concludes that the sound must lie somewhere between the two POAs, so the participant hears /d/ and reports the word Dane.
In preparation for the demonstration, obtain a computer to present videos on and a smartphone with a video camera.
First, position the camera so that your head fills the display. Now, record four 10-s clips, each containing a different word (gain, bane, can, or pan) repeated 10 times at a rate of 1 word/s. Make sure to transfer the gain and can videos to the computer for visual playback.
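The recording plan above can be written down as a small schedule. The following Python sketch is only an illustration: the `Clip` class, its field names, and the `"computer"` and `"phone"` labels are hypothetical, and the onset times are nominal values implied by the stated rate rather than measurements.

```python
from dataclasses import dataclass

CLIP_SECONDS = 10      # each clip lasts 10 s
WORDS_PER_SECOND = 1   # the word is repeated once per second


@dataclass
class Clip:
    word: str
    playback_device: str  # hypothetical label for where the clip is played

    def onsets(self):
        """Nominal onset time, in seconds, of each repetition of the word."""
        repetitions = CLIP_SECONDS * WORDS_PER_SECOND
        return [i / WORDS_PER_SECOND for i in range(repetitions)]


def build_clips():
    """Return the four clips used in the demonstration."""
    visual_words = ["gain", "can"]  # shown muted on the computer
    audio_words = ["bane", "pan"]   # played aloud from the hidden phone
    clips = [Clip(word, "computer") for word in visual_words]
    clips += [Clip(word, "phone") for word in audio_words]
    return clips


if __name__ == "__main__":
    for clip in build_clips():
        print(clip.word, clip.playback_device, len(clip.onsets()), "repetitions")
```

Keeping the repetition rate identical across clips matters because the mouthed and spoken words need to line up when the two videos are started together.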
To conduct the experiment, sit a participant in front of the computer. Open up the video file for the word gain and turn off the audio.
On the phone, open up the video for bane. Place it behind the computer so that its screen is hidden and only the sound can be heard clearly.
Instruct the participant to watch the computer monitor and listen. Then, play both videos simultaneously.
When the clips end, ask the participant what they heard. [Participant says: "Dane"]. Repeat the procedure by playing the video of the word can on the computer and presenting the audio for pan on the phone. Once again, question the participant as to what they heard. [Participant says: "tan"].
Here, the words bane and pan were played aloud as the participant watched gain and can being mouthed. Typically, when a term with the /g/ phoneme is shown visually and paired with the sound /b/, individuals will hear /d/.
Likewise, when a word starting with /k/ is paired with the sound /p/, individuals will hear /t/.
This perception arises from the way the sounds are produced. The brain tries to resolve conflicting information: the eyes see palatal articulations, /g/ and /k/, while the ears hear labial sounds, /b/ and /p/. As a result, it concludes that the sounds must lie in the middle, resulting in the perception of dental phonemes, /d/ and /t/.
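The "middle of the mouth" account can be expressed as a toy lookup. The Python sketch below is a simplification under stated assumptions, not a model of actual audiovisual integration: it uses the place labels as given in this video, orders them from front to back, takes voicing from the auditory signal, and picks the place midway between the seen and heard phonemes. The names `POA_ORDER`, `PHONEMES`, and `predict_percept` are hypothetical.

```python
# Places of articulation ordered front to back, using the labels from this video.
POA_ORDER = ["labial", "dental", "palatal"]

# phoneme -> (place of articulation, voiced?)
PHONEMES = {
    "b": ("labial", True),
    "d": ("dental", True),
    "g": ("palatal", True),
    "p": ("labial", False),
    "t": ("dental", False),
    "k": ("palatal", False),
}


def predict_percept(visual, auditory):
    """Predict the fused phoneme for a mouthed (visual) and spoken (auditory) phoneme.

    Assumption: the percept keeps the voicing of the auditory phoneme and takes
    the place of articulation midway between the two inputs.
    """
    visual_place, _ = PHONEMES[visual]
    auditory_place, auditory_voiced = PHONEMES[auditory]
    # Integer midpoint between the two positions in the front-to-back ordering.
    middle = (POA_ORDER.index(visual_place) + POA_ORDER.index(auditory_place)) // 2
    target = (POA_ORDER[middle], auditory_voiced)
    for phoneme, features in PHONEMES.items():
        if features == target:
            return phoneme
    raise ValueError(f"no phoneme with features {target}")


if __name__ == "__main__":
    print(predict_percept("g", "b"))  # gain mouthed, bane played
    print(predict_percept("k", "p"))  # can mouthed, pan played
```

For the two pairings used in the demonstration, the lookup returns /d/ and /t/, matching the reported words Dane and tan. With only three places in the table, the integer midpoint is meaningful mainly for the labial and palatal pairings described here.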
Now that you are familiar with how to produce the McGurk effect, let’s look at some other ways that researchers use this perceptual phenomenon to investigate language development and cases in which the effect is altered.
Infants can even be tested on the McGurk effect as early as five months of age, when they are pre-linguistic, using a habituation-of-looking-time paradigm.
In this procedure, Rosenblum and colleagues repeatedly presented infants with a particular syllable, like va, in both the audio and visual domains before introducing mismatched phonemes in a testing phase.
Infants showed signs of habituation to va, seen as reduced looking times, and dishabituation, seen as increased looking, when something other than va was perceived. Thus, even before infants can talk, they show results similar to those of adults, relying on visual information to discriminate speech sounds.
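The logic of a looking-time measure can be sketched in a few lines of Python. This is a generic illustration, not the analysis used by Rosenblum and colleagues: the function names, the three-trial window, and the 50% habituation criterion are all assumptions, and the numbers in the example are invented purely to show the arithmetic.

```python
from statistics import mean


def is_habituated(habituation_looks, window=3, criterion=0.5):
    """True if looking time over the last `window` habituation trials has fallen
    to `criterion` times the mean of the first `window` trials, or lower.

    The window size and the 50% criterion are assumed defaults.
    """
    if len(habituation_looks) < 2 * window:
        raise ValueError("need at least 2 * window habituation trials")
    first = mean(habituation_looks[:window])
    last = mean(habituation_looks[-window:])
    return last <= criterion * first


def dishabituation_score(habituation_looks, test_looks, window=3):
    """Mean test-phase looking time minus the mean of the last `window`
    habituation trials. A positive value indicates increased looking."""
    if len(habituation_looks) < window or not test_looks:
        raise ValueError("not enough trials")
    baseline = mean(habituation_looks[-window:])
    return mean(test_looks) - baseline


if __name__ == "__main__":
    # Invented looking times in seconds, for illustration only.
    habituation = [10, 8, 6, 4, 4, 4]
    test = [7, 9]
    print(is_habituated(habituation))
    print(dishabituation_score(habituation, test))
```

A positive score on a mismatched test stimulus would suggest the infant perceived something other than the habituated syllable, while a score near zero would suggest it was perceived as the same.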
However, children with autism exhibit the McGurk effect less readily than controls, due to their impaired ability to understand and attend to the visual components of the face. This indicates fundamental differences in processing audiovisual speech, which may contribute to their difficulty with language and communication.
Lastly, patients with lesions in their left hemisphere, the side typically dominant for understanding and learning language, often use visual facial features to help during speech therapy. Interestingly, when tested on the McGurk effect, they reported hearing dental sounds more often than controls did. Such perceptions are likely due to their greater focus on visual information.
You've just watched JoVE's video on the McGurk Effect. Now you should know how to produce this audiovisual illusion and relate phonemes to sound production. In addition, you should have a better understanding of the interactions between vision and hearing, and how they can be affected during development and adulthood.
Thanks for watching!