Language testing and assessment require a guiding framework and a solid benchmark against which a student’s performance can be evaluated. According to the Bachman and Palmer (1996) model, a language test should focus not only on oral proficiency but also on other important components of a language.
In this regard, Bachman and Palmer (1996) developed four structures of language testing, namely second language reading performance, state and trait strategy use, strategic competence and structural equation modelling. This paper uses Bachman and Palmer (1996) as a framework for an evaluative commentary on TOEFL as an international English language test.
Second language reading performance focuses on the development of good reading habits. Good reading habits also reflect a student’s attitude towards the language being studied, which in this case is English. TOEFL tests reading through passages that students read before answering comprehension questions. These questions test students’ analytical reading skills.
The state and trait strategy use thrusts the state and trait metacognition into the limelight. The aspects of this paradigm include planning, self-checking, cognition strategy and awareness (Cohen, 1998). State metacognition is transitory while trait metacognition is relatively stable.
This strategy focuses more on goal achievement. The test should show whether the state’s goal of offering English as either a first or a second language was accomplished. In addition, the results of the test should help students understand whether they achieved their individual goals.
Strategic competence refers to the ability to compensate for incompetence in one area through competence in others. In linguistics, this is done through either coining words or circumlocution. In extreme cases, one may resort to sign language. Other important components discussed under the umbrella of strategic competence are grammatical, sociolinguistic and discourse competencies.
Lastly, this paper examines structural equation modelling in relation to language testing. It deals mainly with statistics and with their testing and analysis. Working from the data obtained, it estimates the relationships between variables by combining statistical data with qualitative assumptions. It allows for both confirmatory and exploratory modelling. It is, therefore, crucial to theory testing and theory development.
No evaluative commentary on language tests can overlook the Bachman and Palmer framework. Bachman and Palmer (1996) developed four structures of language testing, namely second language reading performance, state and trait strategy use, strategic competence and structural equation modelling. These structures were outlined in the preceding section.
Second language reading performance has been the subject of widespread research. A key feature of second language reading performance is that it centres on the development of good reading habits and a liking for reading.
One thing that has been noted to impact profoundly on reading is the attitude of learners. It is true that people walk into academic institutions inevitably armed with various attitudes towards the subjects to be studied. According to Day and Bamford’s (1998) model, one of the factors influencing second language reading is the attitude towards a learner’s first language.
It has been established that there is always a relationship between first language reading and second language reading. This is because the first language reading ability is usually transferred to the second language reading. This implies that a learner must first develop the necessary proficiency in their first language before undertaking a second language.
Therefore, a language test based on this structure would be useful if it focused on extensive reading. Extensive reading also enables the learner to develop an affinity for the second language and, hence, to master the use of the language in context (Day & Bamford, 1998).
An important illustration of this is the reading test provided by TOEFL. It consists of three or four passages which are based on academic topics. These passages call for an in-depth understanding of the language structure and usage to be able to comprehend the passage and effectively answer the subsequent questions.
They require students to analyze main ideas and vocabulary, make relevant inferences, insert sentences and pick out essential details from a given passage. It is, therefore, proper to aver that a language test based on these criteria is not only relevant but also quite in order.
The TOEFL reading test is the most appropriate in assessing a learner’s mastery of the English language since it does not require prior knowledge of the subject matter presented in the passages. It is the student’s analytical skills that are put to the test. This also ensures that the test is fair to all students.
It is important to note that reading is an essential skill in any given language, particularly English. It is important that each student undertaking the study be well-prepared. This can be achieved through constant and extensive reading.
Without a proper reading background, a student may struggle because of limited time and poor comprehension. A slow reader may not be able to complete the exam on time, while a poor reader will be unable to synthesize the issues presented in the passage. Either way, the student is likely to get many of the questions wrong.
In view of the fact that reading is a crucial element of the English language, researchers and educators are becoming increasingly aware of the importance of extensive reading. They further agree that extensive reading has to be anchored in the cultivation of good reading habits, building up of vocabulary and structure and encouraging a liking for reading (Richards and Schmidt, 2002).
There are other factors that influence second language reading. These include anxiety, which particularly affects students sitting the exam for the first time. Others are comfort, self-perception and the value attached to the course or exam. The remedy for these can be found in extensive reading itself.
Another paradigm set by Bachman and Palmer (1996) is the structural equation modelling. It refers to a statistical technique for testing causal relationships between variables. It uses both qualitative and quantitative methodologies. It gives room for both confirmatory and exploratory modelling. This implies that it is suited to both theory testing and theory development.
Confirmatory modelling begins with an assumption about the causal relationships between variables. The concepts involved must also be operationalized so that they can be used to test possible relationships between variables. The model is then tested against the obtained data to assess its practical applicability. The causal assumptions in the model have certain implications which can be tested against the data (Bollen and Long, 1993).
Various steps are followed in the creation of structural equation modelling. First, there is model specification. The model must be specified properly depending on the type of analysis a researcher is trying to confirm.
In order to achieve the correct model, two types of variables are usually brought into play. They are the exogenous and endogenous variables. Exogenous variables send out arrowheads when plotted on a path diagram, whereas endogenous variables are the recipients of the arrowheads.
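The distinction between the two kinds of variable can be made concrete by treating a path diagram as a list of arrows. The sketch below is only an illustration and is not part of Bachman and Palmer's framework: the variable names (`vocabulary`, `strategy_use`, `reading_score`) and the helper `classify_variables` are hypothetical, and the code assumes the common convention that a variable counts as exogenous only if no arrow points at it.

```python
def classify_variables(arrows):
    """Split the variables of a path diagram into exogenous and endogenous sets.

    `arrows` is a list of (source, target) pairs, one pair per arrow.
    Assumption: a variable is endogenous if at least one arrow points at it,
    and exogenous if it only sends arrows out.
    """
    sources = {src for src, _ in arrows}
    targets = {dst for _, dst in arrows}
    endogenous = targets
    exogenous = sources - targets
    return exogenous, endogenous


# Hypothetical diagram: vocabulary influences strategy use, and both
# influence the reading score.
arrows = [
    ("vocabulary", "strategy_use"),
    ("vocabulary", "reading_score"),
    ("strategy_use", "reading_score"),
]
exogenous, endogenous = classify_variables(arrows)
print(sorted(exogenous))
print(sorted(endogenous))
```

In this hypothetical diagram `strategy_use` both receives and sends an arrow, so under the stated convention it is treated as endogenous.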
Structural equation modelling has two main components. The first is the structural model, which depicts the potential causal dependencies between the exogenous and endogenous variables. The second is the measurement model, which shows the relationships between the latent variables and their indicators.
However, some models may contain only one component or the other. Examples are exploratory and confirmatory factor analysis models, which contain only the measurement part.
The next step in structural equation modelling is the estimation of free parameters. This is done by comparing the observed covariance matrices, which represent the relationships between variables, with the covariance matrices implied by the model. Numerical maximization of a fit criterion is then used to obtain the best-fitting model.
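The estimation step can be sketched in a few lines. The example below is a simplified illustration under stated assumptions rather than the procedure of any particular SEM package: it assumes a one-factor model in which all standardized indicators share a single loading, it uses an unweighted least-squares discrepancy as the fit criterion (real software more often uses maximum likelihood), and it replaces an iterative optimizer with a plain grid search. Minimizing the discrepancy is equivalent to maximizing fit. The function names and the observed correlations are hypothetical.

```python
def implied_covariance(loading, n_indicators=3):
    """Covariance matrix implied by a one-factor model whose standardized
    indicators all share the same loading (an assumption made here only
    to keep the example down to one free parameter)."""
    return [
        [1.0 if i == j else loading * loading for j in range(n_indicators)]
        for i in range(n_indicators)
    ]


def uls_discrepancy(observed, implied):
    """Unweighted least-squares fit criterion: half the sum of squared
    differences between observed and model-implied covariances."""
    return 0.5 * sum(
        (o - m) ** 2
        for row_o, row_m in zip(observed, implied)
        for o, m in zip(row_o, row_m)
    )


def estimate_loading(observed, steps=100):
    """Grid search over candidate loadings in [0, 1]; returns the candidate
    whose implied matrix is closest to the observed one."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(
        candidates,
        key=lambda lam: uls_discrepancy(
            observed, implied_covariance(lam, len(observed))
        ),
    )


# Hypothetical observed correlations among three reading subscores.
observed = [
    [1.00, 0.49, 0.49],
    [0.49, 1.00, 0.49],
    [0.49, 0.49, 1.00],
]
best = estimate_loading(observed)
print(best)
```

With these made-up correlations the search settles on a loading of 0.7, because 0.7 squared reproduces the observed value of 0.49.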
The third step is model modification. The model is modified in order to improve the fit and thereby estimate the most likely relationships between variables. Many statistical programs provide modification indices that indicate the improvement in fit resulting from additions to the model.
The modifications that improve the fit are flagged as potential changes to the model. These modifications must also make theoretical sense.
The next step is the interpretation of the model, so that claims about the constructs can be made on the basis of the best-fitting model. Caution is needed when making claims of causality. The term “causal model” should be understood as a model that conveys causal assumptions, not necessarily one that yields verified causal conclusions.
Some rival hypotheses can be eliminated by collecting data at multiple time points and by using experimental or quasi-experimental designs. Nevertheless, no research design other than interventional experiments can help distinguish such rival hypotheses (Pearl, 2000).
Sample size is also an important consideration in structural equation modelling. A rule of thumb of at least ten observations per indicator is often used to set the lower bound for an adequate sample size (Nunnally, 1967). Although such rules are easy to apply, they often lead to inadequate sample sizes.
It is for this reason that Westland (2010) concluded that 80% of research articles drew conclusions from insufficient samples. As the number of potential combinations of variables increases, so do the information demands of structural model estimation.
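The rule of ten is simple enough to state as arithmetic. The short sketch below only encodes that rule of thumb; the function names are hypothetical, and, as noted above, passing this check does not guarantee that a sample is actually adequate.

```python
def minimum_sample_size(n_indicators, per_indicator=10):
    """Lower bound on sample size under the 'ten observations per
    indicator' rule of thumb attributed to Nunnally (1967)."""
    if n_indicators < 1:
        raise ValueError("a model needs at least one indicator")
    return n_indicators * per_indicator


def meets_rule_of_ten(n_observations, n_indicators):
    """True if the sample reaches the rule-of-thumb lower bound.
    This is a necessary check under the rule, not proof of adequacy."""
    return n_observations >= minimum_sample_size(n_indicators)


# Hypothetical study: 12 indicators measured on 100 test takers.
print(minimum_sample_size(12))
print(meets_rule_of_ten(100, 12))
```

For the hypothetical study in the example, the rule asks for 120 observations, so a sample of 100 falls short even by this lenient standard.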
Bachman and Palmer (1996) also developed the paradigm of strategic competence in relation to language testing. Strategic competence refers to the ability to compensate for lack of ability in other areas often associated with the use of communication strategies (Bachman and Palmer, 1996).
For instance, language learners may develop certain communication strategies when communicating in their acquired language to be able to pass across information even if their knowledge of the language is insufficient.
For example, a hungry person who wishes to order a meal in a language in which they have insufficient vocabulary could still make the wish known by employing sign language. Alternatively, they could coin a word that is related in meaning, e.g., eat. Similarly, they could resort to circumlocution.
A public language test would, therefore, be considered to bear a mark of quality if it takes into consideration the strategic competence of the learners. With respect to TOEFL, the test for strategic competence is realized through the listening skills section.
This section examines the learner’s aptitude in understanding English that the native speakers use in their day-to-day lives. The questions in the listening section are used to determine the student’s ability to understand the conversations and lectures in English.
The students being tested are further required to recognize why the speakers use specific statements. The students have to evaluate the speaker’s intentions in making given comments and identify their attitudes. Needless to say, this public language test satisfies the requirements of any language testing procedure.
Closely related to strategic competence is grammatical competence. Grammatical competence tests the words and the rules of a given language, in this case, English. TOEFL examines grammatical competence through its carefully structured paper that tests sounds, words and sentence structure. It consists of short multiple choice questions that require filling in the gaps with the most appropriate words.
This testing criterion is appropriate as it covers most aspects of the grammatical structure of the English language. However, the use of multiple-choice questions may compromise the reliability of the test.
This is because it leaves a margin of guesswork and this may not be a true reflection of a student’s grammatical ability. In order to achieve a quality and useful language test, it is important for the examiners to diversify the testing criteria so as to limit the margin of error resulting from guesswork.
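The size of the guesswork margin can be estimated with a simple binomial calculation. The sketch below is a rough illustration under stated assumptions: every item has the same number of options, a candidate guesses blindly and independently on each item, and the figures of 40 items and 4 options are hypothetical rather than taken from the TOEFL specification.

```python
from math import comb


def chance_score(n_items, n_options):
    """Expected number of correct answers from blind guessing."""
    return n_items / n_options


def prob_at_least(n_correct, n_items, n_options):
    """Probability of getting at least `n_correct` items right by blind
    guessing, using the binomial distribution."""
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_correct, n_items + 1)
    )


# Hypothetical section: 40 items with 4 options each.
print(chance_score(40, 4))
print(prob_at_least(20, 40, 4))
```

Under these assumptions a candidate who knows nothing still expects about a quarter of the items correct, which is the margin that more varied item types would reduce.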
Another important aspect of a public language test is the socio-linguistic competence. Socio-linguistic competence denotes the appropriate use of language in various contexts. It encompasses expressing, interpreting and negotiating meaning according to culturally defined norms and expectations. It lends credibility and reliability to the language test.
It is important for students to be able to apply various language techniques and appropriate vocabulary in relation to their social context. Being conversant with the grammatical structures of a language may not be enough if one cannot apply them in real-life situations.
Discourse competence is yet another crucial component of language competence. This is the ability to create and understand various forms of the language under study that are longer than sentences.
The forms include stories, conversations and letters. Mastery of some or all of these forms is considered a worthy goal of any language test. TOEFL as a public language test has been hugely successful as it takes discourse competence into consideration.
The fourth paradigm advanced by Bachman and Palmer (1996) is state and trait strategy use. States are situation-specific: they vary in intensity and change rapidly over time. State metacognition is a person’s transitory state in intellectual situations.
It is characterized by planning, self-checking, cognitive strategies and self-awareness (Spielberger, 1975). Traits, on the other hand, are predispositions of people. Trait metacognition is, therefore, a relatively stable individual difference variable to react to intellectual situations with varying degrees of state metacognition (Spielberger, 1975).
In a nutshell, metacognition is the conscious and periodic checking of whether one’s goal has been achieved and, if necessary, the selection and application of different strategies. This involves planning, self-monitoring, cognitive strategy and awareness.
The use of state and trait notions to categorize two aspects of strategy use is quite relevant. It has been proposed by Phakiti (2003) that empirical research into the relationship between perceived strategy use and actual strategy use may provide the necessary insight into an individual’s psychology.
Metacognition measures are generally acceptable, but the magnitude of their correlations is a major concern to many scholars (O’Neil et al., 1990). A careful review of the metacognition literature found few studies that demonstrated a relationship between metacognition and achievement (O’Neil et al., 1990).
With regard to construct validity, it is concluded that planning, self-checking, cognitive strategy and awareness are positively related. Moreover, state metacognition is more predictive than trait metacognition, and a higher level of state metacognition would lead to better performance. It is also concluded that difficult tasks would require higher levels of state metacognition (O’Neil et al., 1990).
This essay has brought to the reader’s attention the need for a framework or model on which evaluation can be based, as provided by Bachman and Palmer (1996). Bachman and Palmer (1996) created four structures of language testing, which are second language reading performance, state and trait strategy use, strategic competence and structural equation modelling.
This model is particularly suited to language testing, especially for TOEFL and others. TOEFL has proved to be a competent international language test following Bachman and Palmer’s model. However, more research needs to be conducted on this model in order to give it more empirical evidence. This would go a long way in enhancing the model presented by Bachman and Palmer (1996).
Bachman, L. & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Bollen, K. A. & Long, S. J. (1993). Testing structural equation models. New York, NY: Sage.
Cohen, A. D. (1998). Strategies in learning and using a second language. London and New York: Longman.
Day, R. & Bamford, J. (1998). Extensive reading in the second language classroom. Cambridge: Cambridge University Press.
Hornberger, N. H. & Shohamy, E. (2008). Language testing and assessment. In encyclopaedia of language and education, (Volume 7, pp. 390-400). Berlin: Springer.
Nunnally, J. C. (1967). Psychometric theory. New York: McGraw-Hill.
O’Neil, H. F., Jr., Baker, E. L., Jacoby, A., Ni, Y., & Wittrock, M. (1990). Human benchmarking studies of expert systems (Report to DARPA, Contract No. N00014-86-K-0395). Los Angeles: University of California, Centre for the Study of Evaluation/Centre for Technology Assessment.
Phakiti, A. (2003). A closer look at gender differences in strategy use in L2 reading. Language Learning, 53, 649–702.
Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge: Cambridge University Press.
Richards, J. C. & Schmidt, R. (Eds.). (2002). Longman dictionary of language teaching and applied linguistics (3rd ed.). London: Longman.
Spielberger, C. D. (1975). Anxiety: State-trait process. In C. D. Spielberger & I. G. Sarason (Eds.), Stress and anxiety (Vol. 1, pp. 115–143). Washington, DC: Hemisphere.
Westland, J. C. (2010). Lower bounds on sample size in structural equation modelling. Electronic Commerce Research and Applications, 9(6), 476–487.