Abstract
This research focuses on the role facial configuration plays in decoding Arabic words. Twenty females aged 17 to 25 took part in the study. The participants watched silent videos showing faces (upright and upside down) and were asked to say the word they thought was being pronounced. 72% of the words shown upright were decoded correctly, compared with 28% of the words shown upside down. It is concluded that facial configuration plays an important role in word recognition. It is also clear that people can decode words even when the image of the face is distorted. Recognition also depends on the word itself, as some words were understood (almost) equally well in both orientations.
Introduction
It has been acknowledged that facial configuration affects the way speech is decoded. Moreover, researchers argue that lip reading is “unconsciously practiced” when people communicate (Dell’Aringa, Adachi & Dell’Aringa, 2007). Facial configuration is especially important for people with hearing impairments, as it can help them decode messages more effectively.
There is a significant body of research on the matter. For instance, Schmidt, Koller, Ney, Hoyoux and Piater (2013) stress that facial configuration facilitates word recognition during lip reading, which is essential in sign language translation. Jordan, Paterson and Kurtev (2009) also state that words and non-words can be decoded irrespective of their position relative to the fovea.
Massaro and Jesse (2009) claim that understanding of song lyrics decreases when the songs are sung, since singers often distort the duration of sounds, which affects their facial movements. At the same time, lip reading can also be less effective when the level of noise is high; facial configuration aids word recognition most when moderate noise is present (Ma, Zhou, Ross, Foxe & Parra, 2009). Notably, researchers have examined the relationship between facial configuration and speech decoding across different languages.
It is clear that different approaches have been used to reveal the importance of facial configuration in speech decoding (noise, distorted pronunciation). However, no research has examined decoding when the image of the face itself is distorted. This research focuses on the impact facial configuration has on decoding Arabic words. The hypothesis can be formulated as follows: if participants employ facial configuration when decoding words, they will understand speech better when looking at upright faces than at upside-down images; if they do not, there will be no difference in speech decoding irrespective of the orientation of the faces shown.
Method
Twenty participants took part in the research, all of them females aged 17 to 25. The participants were shown a silent video of faces pronouncing 19 Arabic words. The video was 3:20 long, and each clip lasted approximately 6 seconds. The clips depicted faces in one of two orientations (upright or upside down), chosen at random for each participant. The participants were asked to say the word pronounced by the face on the screen.
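The per-participant randomization described above can be sketched as follows. This is an illustrative reconstruction, not the actual stimulus software; the word list below includes only the words named in the Results section (the study used 19 words in total), and the function name is hypothetical.

```python
import random

# Words named in the Results section; the full study used 19 words.
WORDS = ["Awraaq", "Beet", "Noor", "Modeer", "Muhamed", "Rabea3",
         "Safeenah", "Dirham", "Setarah", "Mjiles", "Dubai", "Jam3yah"]

def assign_orientations(words, seed=None):
    """Randomly assign each word clip an orientation (upright or
    upside-down) for one participant, as described in the Method."""
    rng = random.Random(seed)
    return {word: rng.choice(["upright", "upside-down"]) for word in words}

# Example: one participant's stimulus schedule.
schedule = assign_orientations(WORDS, seed=1)
```

A fixed seed per participant would make each schedule reproducible while still varying the orientations across participants.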
Results
The results show that participants decode words more easily when looking at upright faces: 72% of the words pronounced by faces in the upright position were decoded correctly (see Figure 1). Interestingly, some words were decoded (almost) equally well in both orientations (see Figure 2). These words were Awraaq, Beet, Noor and Modeer. Notably, the word Muhamed was decoded correctly irrespective of the face orientation.
By contrast, many words were not understood at all when upside-down images were shown: Rabea3, Safeenah, Dirham, Setarah, Mjiles, Dubai and Jam3yah.
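The size of the orientation effect (72% vs. 28% correct) can be checked with a simple two-proportion z-test. The sketch below assumes roughly 190 trials per condition (19 words, with about half of the 20 participants seeing each word in each orientation); the paper does not report the exact per-condition totals, so these counts are illustrative only.

```python
import math

# Assumed trial counts per condition (not reported in the paper).
n_upright, n_inverted = 190, 190
correct_upright = round(0.72 * n_upright)    # 72% reported accuracy
correct_inverted = round(0.28 * n_inverted)  # 28% reported accuracy

def two_proportion_z(x1, n1, x2, n2):
    """Two-proportion z-test for a difference in decoding accuracy."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

z, p = two_proportion_z(correct_upright, n_upright, correct_inverted, n_inverted)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Under these assumed counts the difference is far larger than sampling noise would explain, which is consistent with the paper's conclusion that orientation matters.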
Discussion
The research shows that facial configuration plays an important role in decoding Arabic words. The present results are therefore consistent with other studies showing the importance of facial configuration for speech decoding in different languages. It is clear that people are able to recognize the vast majority of words (72%) by looking at the speaker's face alone, without sound.
At the same time, in some cases (almost a third) people can decode words even when looking at an inverted image. This does not mean that they simply guess the word, as they still see the face and can observe certain facial configurations, even distorted ones. Clearly, more effort is required and the distortion is a significant obstacle for the participants. Nevertheless, this finding suggests that participants were able to recognize the words because they focused on certain cues.
Schmidt et al. (2013) note that mouthing plays a major role in decoding. This may be the cue used by the participants of the present research: they looked at the speaker's mouth and were able to guess the word even when the face was shown upside down. This also suggests that facial configuration facilitates recognition of words.
It should be added that recognition of words, as well as the use and effectiveness of facial configuration, depends on the word itself. Some words are easily decoded, while others require additional effort to recognize. This can be explained by the way certain sounds are pronounced, as some sounds and words are quite distinctive and involve a specific facial configuration.
In sum, the research reveals the importance of facial configuration in word decoding. At the same time, there are certain limitations. The sample size is small, and the sample is quite homogeneous, which can distort the results. Future studies should include more participants, including people of both genders.
As for future research, the same procedure could be used with phrases or even simple sentences. Fluent speech is more difficult to decode, and participants are unlikely to recognize words when looking at faces shown upside down. It would also be interesting to compare how different groups of people decode speech: for instance, males and females, or teenagers, young adults, adults and elderly people.
This research has a number of implications. First, it contributes to the body of research on the matter. It also shows that facial configuration is important for decoding Arabic words, which can be applied in teaching and in sign language use. The procedure employed can also be utilized in other studies focusing on the role of mouthing in decoding.
Reference List
Dell’Aringa, A.H.B., Adachi, E.S., & Dell’Aringa, A.R. (2007). Lip reading role in the hearing aid fitting process. Brazilian Journal of Otorhinolaryngology, 73(1), 101-105.
Jordan, T.R., Paterson, K.B., & Kurtev, S. (2009). Reevaluating split-fovea processing in word recognition: Hemispheric dominance, retinal location, and the word–nonword effect. Cognitive, Affective, & Behavioral Neuroscience, 9(1), 113-121.
Ma, W.J., Zhou, X., Ross, L.A., Foxe, J.J., & Parra, L.C. (2009). Lip-reading aids word recognition most in moderate noise: A Bayesian explanation using high-dimensional feature space. PLoS ONE, 4(3), 1-14.
Massaro, D.W., & Jesse, A. (2009). Read my lips: Speech distortions in musical lyrics can be overcome (slightly) by facial information. Speech Communication, 51, 604-621.
Schmidt, C., Koller, O., Ney, H., Hoyoux, T., & Piater, J. (2013). Using viseme recognition to improve a sign language translation system. International Workshop on Spoken Language Translation, 197-203.