Introduction
Speech perception does not rely on the auditory channel alone; other forms of information, visual in particular, also contribute to how a message is formed. The literature reports many experiments on how the number of perceptual channels influences the information received. Messages perceived through the audiovisual channel are clearer and are processed faster, whereas purely auditory information is perceived more slowly and often less accurately (Rosenblum, 2019). Thus, the integration of different channels of perception plays an important role in communication.
Speech perception is affected not only by external factors but also by internal ones. For example, a person’s lexicon, which is used to determine the meaning of the words received, is an internal factor. The main objective of the study is to determine whether the size of the lexicon correlates with the speed of perception of audiovisual materials. The hypothesis is that the wider a person’s lexicon is, the faster he or she processes both audiovisual and audio information. To test the hypothesis, two experiments were conducted with two participants.
Background
A person has several channels of information perception, and these channels are not isolated from each other. In face-to-face communication, listeners also observe visual information, so speech perception under normal conditions is rarely purely auditory. In the literature, the main phenomenon associated with this process is the McGurk effect, in which “seeing the articulatory gestures of a talker influences the auditory speech percept” (Tiippana et al., 2003, p. 460). Theorists disagree about which channel of audiovisual information is primary, or whether the percept results from the integration of the two streams (Rosenblum, 2019). Rosenblum (2019) indicates that “the experiments reported here demonstrate lexical influences on the McGurk effect” (p. 460). Experiments with the McGurk effect are aimed precisely at identifying the relationship between the quality of speech perception and visual information.
The combination of audio and visual materials improves the perception of speech in noisy conditions, as well as the perception of semantically difficult words or phrases. However, despite the wide range of experiments aimed at identifying the dominant source of information, the role of the lexicon in perception is poorly understood (Brancazio, 2004). McGurk effect experiments involve the audio presentation of one phoneme and the simultaneous visual presentation of the articulation of another. Thus, a person hears one sound but sees the articulation of a different one, which often makes the phoneme difficult to identify. No such difficulty arises when the same sounds are presented through audio alone.
A person perceives information in context, drawing on many factors, including lexical ones. It is therefore necessary to understand how the lexicon influences the perception of audiovisual information. The assumption is that the lexical information a person already knows takes priority over visual perception. That is, a person first hears a word or phoneme, then compares it with existing lexical knowledge in search of a match, and turns to visual information only in case of uncertainty.
Method
The experiment was conducted with two participants who had different lexicon sizes. To estimate the size of each participant’s lexicon, they were asked to read 100 words and explain their meaning. The words were presented in decreasing order of frequency of use. The number of words explained correctly gives the percentage of words known by each participant: 83% for the first participant and 67% for the second. It is worth noting that this way of measuring the size of the lexicon is rather imprecise because it depends on the sample of words presented. Nevertheless, the scores indicate a considerable difference in lexicon size between the two participants.
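As a rough illustration of this scoring, the sketch below computes the share of known words from the counts reported for each participant. The function name `lexicon_score` and the dictionary layout are hypothetical conveniences introduced here, not part of the study's procedure.

```python
def lexicon_score(words_explained: int, words_presented: int = 100) -> float:
    """Return the percentage of presented words the participant could explain."""
    if words_presented <= 0:
        raise ValueError("words_presented must be positive")
    return 100.0 * words_explained / words_presented


# Counts reported in the text: 83 and 67 words explained out of 100.
scores = {
    "participant_1": lexicon_score(83),
    "participant_2": lexicon_score(67),
}

# Gap between the two participants, in percentage points.
difference = scores["participant_1"] - scores["participant_2"]
print(scores, difference)
```

The gap is a descriptive figure only; with a single 100-word sample, it says nothing about statistical significance.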
After the lexicon measurement, each participant was asked to complete a series of audio and audiovisual experiments. The experiments were performed against artificially created noise, which made it difficult for the participants to understand speech. In the first experiment, participants listened to the words and repeated them. In the second experiment, participants listened to the same words while watching a video recording showing the articulation of similar but not identical words.
Data analysis was based on the records of observations made during the experiments and on the counts of correct answers. For both experiments, the number of correctly repeated words was recorded, as well as how often uncertainty occurred. The result is the ratio of the number of correctly repeated words in the audio experiment to that in the audiovisual experiment, considered alongside the size of each participant’s lexicon. Thus, the experiment is aimed at identifying the correlation between the size of the lexicon and the quality of speech perception.
Results
During the first experiment, the participants were asked to close their eyes and perceive the audio information, repeating the words they heard. Each participant was offered the same set of 10 words: cue, allude, continual, insight, later, overdo, perspective, severe, vicious, gibe. The number of correct repetitions was recorded, and any inaccuracies and uncertainties that arose were noted. The first participant, with the more extensive lexicon, repeated 8 out of 10 words correctly. The second participant, with the less extensive lexicon, repeated 6 out of 10 words correctly.
In the second experiment, the participants were asked to perceive the same words audiovisually. However, the recorded articulation showed the pronunciation of similar but not identical words. Pairs of similar words were chosen; the first word of each pair was read aloud, and the articulation of the second appeared on the video. The following word pairs were used: cue-queue, allude-elude, continual-continuous, insight-incite, later-latter, overdo-overdue, perspective-prospective, severe-sever, vicious-viscous, gibe-jibe. The first participant repeated six words correctly. When perceiving audiovisual information, this participant was often unsure of which word had been pronounced and frequently compared the word heard with the articulation shown in the video. The second participant also repeated six words correctly in the second experiment, and uncertainty was noted infrequently.
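The ratio named in the Method section can be sketched from the counts reported above. This is a minimal illustration that assumes the ratio is taken as audiovisual correct answers over audio-only correct answers; the text does not fix the direction, and the names `Result` and `av_to_audio_ratio` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    lexicon_pct: int    # share of the 100 test words explained, in percent
    audio_correct: int  # correct repetitions in the audio-only experiment
    av_correct: int     # correct repetitions in the audiovisual experiment
    n_words: int = 10   # words presented in each experiment

    def av_to_audio_ratio(self) -> float:
        # Assumed direction: audiovisual score relative to audio-only score.
        return self.av_correct / self.audio_correct


# Counts taken from the Method and Results sections.
results = {
    "participant_1": Result(lexicon_pct=83, audio_correct=8, av_correct=6),
    "participant_2": Result(lexicon_pct=67, audio_correct=6, av_correct=6),
}

for name, r in results.items():
    print(
        name,
        f"lexicon={r.lexicon_pct}%",
        f"audio={r.audio_correct}/{r.n_words}",
        f"audiovisual={r.av_correct}/{r.n_words}",
        f"ratio={r.av_to_audio_ratio():.2f}",
    )
```

Under this assumption, a ratio below 1 means that accuracy dropped when mismatched articulation was shown. With only two participants, the figures are descriptive and cannot establish a correlation.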
Discussion
The experiments showed that the participant with the more extensive lexicon was often unsure of what he was hearing when perceiving audiovisual information. Tiippana et al. (2003) stated that auditory perception is influenced by visual material, articulatory gestures in particular. Accordingly, the first participant attended to the video recording of the articulation, which showed words different from those he heard. Rosenblum (2019) also identified a correlation between audiovisual information perception and the lexicon. The broader lexicon probably led the participant to consider the possibility that a different word was being offered to him. Given the artificially generated noise, the video articulation was often the only source of information. In 2 cases out of 10, the first participant reported the word whose articulation was shown in the video. The second participant, by contrast, rarely doubted what he was hearing, and there were no recorded cases in which he repeated the word articulated in the video. Thus, the less extensive lexicon apparently gave the participant no reason to doubt that the same word was being offered to him.
Thus, the experiments suggest that a broader lexicon leads the participant to process incoming information more carefully and to consider all available cues, which slows perception. This approach also gives rise to uncertainty, because the articulation shown in the video does not match the word heard. However, the results cannot be interpreted unambiguously, which limits the study. Two participants are too few to support general conclusions. Moreover, the lexicon size measurement is rather subjective, which also limits the results.
Conclusion
Perception of speech occurs through several channels, including the integration of visual and audio information. However, the internal processes associated with speech and language also have an impact on how a person approaches the processing of received data. In the course of the experiments, it was revealed that the size of the lexicon could have an effect on the speed and accuracy of processing audiovisual speech information.
The hypothesis of this study was that a broader lexicon has a positive effect on information processing. However, the results of the experiments suggest the opposite. The participant with the more extensive vocabulary was more likely to doubt his interpretation of what he heard and tried to take the video information into account as well. This approach led to longer responses than those of the second participant. However, this observation holds only for the experiment with audiovisual materials; with exclusively audio materials, the first participant achieved a higher score. Future research should involve a larger sample of participants to obtain more accurate results.
References
Brancazio, L. (2004). Lexical influences in audiovisual speech perception. Journal of Experimental Psychology: Human Perception and Performance, 30(3), 445–463. Web.
Rosenblum, L. D. (2019). Audiovisual speech perception and the McGurk effect. Oxford Research Encyclopedia, Linguistics. Web.
Tiippana, K., Andersen, T., & Sams, M. (2003). Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology, 16(3), 457–472. Web.