Validity and Reliability in Education Research Paper

Table of Contents

Introduction
Analysis
Conclusion
References

Introduction

Reliability and validity are important characteristics of the test tools used in the research. Depending on the goals of the research team, different approaches can be utilized. The current paper reviews three research studies in the field of special education in order to identify the means of establishing the validity and reliability of the involved test instruments.

Analysis

The study by Kratz et al. (2015) evaluates a test tool proposed for use in the special education setting. The instrument used by the research team was the Classroom Cohesion Survey (CCS), intended to evaluate the interaction between classroom assistants and teachers who work with children with an autism spectrum disorder. The reliability and validity of this tool were determined by analyzing data obtained from a sample of teachers and classroom assistants participating in a two-year fee-for-service contract. It should be noted that the development process of the CCS included several precautionary measures aimed at increasing its validity.

More specifically, the survey design was based on the review of the literature on cohesion, which increases the relevance of the included information. The conclusions were also adjusted in accordance with the available classroom observations in order to increase the focus of the study. The survey was then tested by teachers outside the target sample. In this way, confusing and irrelevant points could be detected and modified to avoid inconsistencies in the obtained data.

A separate tool, the Maslach Burnout Inventory (MBI), was introduced to measure the level of burnout in the sample population. The tool in question was designed specifically for teachers, which made it suitable for determining the convergent validity of the survey by relating burnout to cohesion (Kratz et al., 2015). The fidelity of the tool can be further corroborated by the use of the STAR program for coding the video observations. The robustness of the program is ensured through a combination of three components grounded in the principles of applied behavior analysis. The resulting fidelity observations were used to establish the predictive validity of the survey (Kratz et al., 2015).

The criteria and metrics of validity were designed using the respective manuals and verified by experts in the field to confirm that the implementation was aligned with the original vision of the developers. Importantly, the research assistants who handled the data were unaware of the characteristics of the observed groups, the level of experience of the teachers, and the research hypotheses. In this way, part of the procedure was effectively blinded, which prevented the introduction of researcher bias into the process (Ary, Jacobs, Irvine, & Walker, 2014). The statistical analysis was performed in several steps.

First, exploratory analyses were conducted to determine the best factor solutions. Second, the total scores were calculated for each measure. Third, paired-sample t-tests were used to detect systematic differences in scores. Finally, the convergent validity of the survey was established by examining the correlations between the factors. The combination of the described precautionary measures allows us to conclude that the resulting survey is sufficiently reliable and valid for evaluating cohesion in a special education setting. This conclusion is consistent with the findings of the authors, who confirm the effectiveness of the CCS in the classroom.
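Two of the steps listed above, the paired-sample t-test and the correlation check, can be illustrated with a minimal sketch. The scores below are invented for illustration and are not data from Kratz et al. (2015); the function names `paired_t_statistic` and `pearson_r` are hypothetical helpers, and the authors' actual software and variable pairings are not described in this paper.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired-sample t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least two values")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores, invented for illustration only.
teacher_scores = [4, 5, 3, 4, 2]
assistant_scores = [3, 5, 2, 2, 1]
cohesion = [4.0, 3.5, 2.0, 4.5, 3.0]
burnout = [1.5, 2.0, 4.0, 1.0, 3.0]

t_value, df = paired_t_statistic(teacher_scores, assistant_scores)
r = pearson_r(cohesion, burnout)
print(f"t({df}) = {t_value:.2f}, r = {r:.2f}")
```

A large absolute t value points to a systematic difference between the paired scores. A correlation between the survey and a related measure, in the direction theory predicts, is the kind of evidence typically cited for convergent validity.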

The article by McIntyre et al. (2017) explored the relationship between the results of the language screener administered during early childhood and the occurrence of referrals to and placement in the special education setting later in middle childhood. The test tool used in the research was the Fluharty Preschool Speech and Language Screening Test. The test can be used for a brief measurement of several skills and proficiencies, such as expressive language, composite language, receptive language, and articulation (McIntyre et al., 2017). The second edition of the test was used by the authors.

All analyses used the General Language Quotient standard score as the measure of language skills. This choice ensured that the screening results were sufficiently uniform to be used in subsequent studies and allowed the outcomes to be applied in similar settings. Several tests were administered to analyze the instrument’s reliability. Specifically, the Woodcock-Johnson III Tests of Achievement was used to obtain data on five-year-old children. Performance was assessed in the areas of letter-word identification, calculation, and spelling and summarized using the Overall Academic Skills standard score (McIntyre et al., 2017).

The scores were divided by 15, so that a one-unit change in the rescaled score corresponds to one standard deviation (15 points) on the selected scale (McIntyre et al., 2017). Finally, an interview administered to the parents of the involved students was used to assess the use of special education. The interview was built on three dichotomous variables: the presence of an individualized education program, the occurrence of a referral for special education review, and the existence of setbacks in academic performance.
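The effect of dividing the scores by 15 can be shown with a short sketch. It assumes only what the text states, namely a scale with a standard deviation of 15. The example scores and the regression coefficient are hypothetical values chosen for illustration, not results from McIntyre et al. (2017).

```python
import math

SCALE_SD = 15  # standard deviation of the standard-score scale, as stated in the text


def to_sd_units(score):
    """Rescale a standard score so that one unit equals one standard deviation."""
    return score / SCALE_SD


def odds_ratio_per_sd(coef_per_point):
    """Convert a logistic coefficient per raw point into an odds ratio per SD."""
    return math.exp(coef_per_point * SCALE_SD)


# Two hypothetical children whose raw scores differ by exactly 15 points
# differ by one unit after rescaling.
difference = to_sd_units(100) - to_sd_units(85)
print(f"difference in SD units: {difference:.2f}")

# Hypothetical coefficient (log-odds per raw point), for illustration only.
print(f"odds ratio per SD: {odds_ratio_per_sd(-0.02):.2f}")
```

With the predictor expressed in standard deviation units, a coefficient from a later regression can be read directly as the change in the outcome per standard deviation of language score.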

The obtained data were then analyzed in several stages. First, logistic regression was used to determine the relationship between language skills at different points in time. Second, several demographic factors were included in the analysis to detect possible moderation. Third, the analysis was conducted to rule out the influence of child problem behavior and similar covariates on the later use of special education. Fourth, different encodings of the outcomes were used to conduct sensitivity tests, which were expected to detect inconsistencies in the results and further enhance the validity of the findings (Sullivan & Bal, 2013).

The authors concluded that the outcomes of the test instrument in question exhibited an inverse relationship with the likelihood of referral for special education and the presence of an individualized education program. Importantly, this relationship remained consistent regardless of the inclusion of the demographic covariates. These results, combined with the techniques used by the research team, allow us to conclude that the instrument can be used to reliably predict the use of special education in middle school and thus demonstrates sufficient validity.

The article by Fernández-López, Rodríguez-Fórtiz, Rodríguez-Almendros, and Martínez-Segura (2013) proposes a learning platform intended for use in the setting of special education. The platform covers all phases of the learning process and contains adjustable learning activities. The feasibility of the platform is tested using two testing instruments in the form of questionnaires. The first questionnaire was designed to assess student skills (social, autonomy, language, and math) at several levels.

Due to the diversity of skills targeted by the platform, standardized tools were reportedly incompatible with the goals of the research team. Thus, a custom questionnaire was created to measure abilities in all of the involved fields. The items included in the questionnaire were defined based on the core skills specified by the Spanish education system. The second questionnaire was meant to establish the patterns of use of the platform and included items on the frequency of use, the suitability of the activity, and students’ degree of motivation, among others.

No information was provided regarding the specifics of the design. However, the generic character of the areas of inquiry suggests that the instrument does not require a complex design and, therefore, its relevance can be inferred. The reliability of the instruments was established using Cronbach’s coefficient alpha. The coefficient was computed separately for each segment of data to determine its consistency. The resulting values exceeded the minimum recommended in the academic literature (Peterson & Kim, 2013). The authors thus estimated reliability through internal consistency.

Such an approach is relatively simple and can be performed based on the results of a single test (Dunn, Baguley, & Brunsden, 2014). The findings of the study indicate the potential of the suggested platform for the development of learning skills in the special education setting (Fernández-López et al., 2013). Thus, while a more comprehensive inquiry would be necessary to arrive at a definitive conclusion, the chosen instruments demonstrated validity and reliability sufficient for the purpose of the study.
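The internal-consistency estimate discussed above can be computed from a single administration of a questionnaire, as the following sketch shows. It uses the standard formula for Cronbach's alpha, α = k / (k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The response matrix is hypothetical and is not taken from Fernández-López et al. (2013); the function name `cronbach_alpha` is likewise an illustrative choice.

```python
import statistics


def cronbach_alpha(responses):
    """Compute Cronbach's alpha.

    responses: list of rows, one per respondent, each a list of item scores.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents (sample variance, n - 1).
    item_variances = [statistics.variance(column) for column in zip(*responses)]
    # Variance of the respondents' total scores.
    total_variance = statistics.variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to a three-item scale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 5, 4],
    [5, 5, 5],
]
print(f"alpha = {cronbach_alpha(answers):.2f}")
```

Values closer to 1 indicate that the items vary together, which is what the reviewed study relied on when comparing its coefficients with the minimum recommended in the literature.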

Conclusion

As can be seen from the information above, researchers commonly use standardized methods to establish the validity and reliability of their test instruments. Review and preliminary administration were used in one instance, and one article omitted minor details of instrument design. Nevertheless, reliability and validity were sufficient in all reviewed cases.

References

Ary, D., Jacobs, L. C., Irvine, C. K. S., & Walker, D. (2014). Introduction to research in education (9th ed.). Belmont, CA: Cengage Learning.

Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399-412.

Fernández-López, Á., Rodríguez-Fórtiz, M. J., Rodríguez-Almendros, M. L., & Martínez-Segura, M. J. (2013). Mobile learning technology based on iOS devices to support students with special education needs. Computers & Education, 61, 77-90.

Kratz, H. E., Locke, J., Piotrowski, Z., Ouellette, R. R., Xie, M., Stahmer, A. C., & Mandell, D. S. (2015). All together now: Measuring staff cohesion in special education classrooms. Journal of Psychoeducational Assessment, 33(4), 329-338.

McIntyre, L. L., Pelham, W. E., Kim, M. H., Dishion, T. J., Shaw, D. S., & Wilson, M. N. (2017). A brief measure of language skills at 3 years of age and special education use in middle childhood. The Journal of Pediatrics, 181, 189-194.

Peterson, R. A., & Kim, Y. (2013). On the relationship between coefficient alpha and composite reliability. The Journal of Applied Psychology, 98(1), 194-198.

Sullivan, A. L., & Bal, A. (2013). Disproportionality in special education: Effects of individual and school variables on disability risk. Exceptional Children, 79(4), 475-494.