Scientists have always relied on data collection, analysis, and interpretation of results to explore specific research questions. One technique useful for this purpose is hypothesis testing, a statistical method used for judging statements (Alkarkhi, 2020). Every test involves two types of hypotheses: null and alternative (Alkarkhi, 2020). The former states that there is no significant difference between two parameters, while the latter claims the opposite (Alkarkhi, 2020). The decision between these hypotheses is based on evaluating the data. However, if the data are poorly prepared or the calculations are made incorrectly, an erroneous conclusion becomes more likely. Therefore, every hypothesis testing procedure demands exploratory analysis and data preparation to attain maximum accuracy and reduce the risk of type I and type II errors.
The process of testing a hypothesis comprises five essential steps. First, the null and alternative hypotheses are stated, and second, the significance level (alpha) is selected (Alkarkhi, 2020). Frequently chosen alpha values are 0.01 and 0.05 (Alkarkhi, 2020). Third, the value of the test statistic is calculated from the sample data (Alkarkhi, 2020). Fourth, researchers identify the noncritical and critical regions, in which the null hypothesis is retained or rejected, respectively (Alkarkhi, 2020). Lastly, the results are interpreted, and the decision to reject or not reject the null hypothesis is made (Alkarkhi, 2020). Rejecting a null hypothesis that is actually true is called a type I error (Denis, 2018). A type II error means failing to reject a null hypothesis that is actually false, resulting in the inaccurate conclusion that the parameters do not differ (Denis, 2018). These two terms are critical in statistical hypothesis testing: the significance level caps the probability of a type I error, and scientists strive to reduce the probability of a type II error by increasing their studies' sample sizes.
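The five steps can be illustrated with a minimal Python sketch. It assumes a one-sample, two-sided z-test with a known population standard deviation, which is only one of many possible tests. The function name `two_sided_z_test`, the sample values, and the parameters `mu0` and `sigma` are hypothetical and are not taken from the cited sources.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """One-sample, two-sided z-test with a known population standard deviation."""
    # Step 1: H0 is "population mean == mu0"; H1 is "population mean != mu0".
    # Step 2: the significance level alpha is supplied by the caller.
    n = len(sample)

    # Step 3: compute the test statistic from the sample data.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))

    # Step 4: the critical region is |z| > z_crit; the rest is the noncritical region.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Step 5: decide whether to reject H0 and report a two-sided p-value.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"z": z, "z_crit": z_crit, "p_value": p_value, "reject_h0": abs(z) > z_crit}


# Illustrative (made-up) measurements; H0 claims the population mean is 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.4, 5.9, 5.2, 5.5]
result = two_sided_z_test(sample, mu0=5.0, sigma=0.5, alpha=0.05)
print(result)
```

With these made-up values the test statistic falls in the critical region, so the null hypothesis is rejected at alpha = 0.05. If the null hypothesis were in fact true, that rejection would be a type I error, and alpha is the long-run rate at which such false rejections occur.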
Before any statistical method is applied, data should be prepared and assessed. The sample population's characteristics need to be sorted and categorized. Exploratory data analysis (EDA) is an approach to evaluating datasets with the help of visual methods (Vrancich, 2019). The primary aim of this process is to check that the data are entered correctly, reducing the possibility of falsely rejecting or failing to reject the null hypothesis (Denis, 2018). Overall, EDA makes the collected information ready for hypothesis testing and enables visualizing particular features that cannot be quantified but still convey meaning for the research.
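A minimal sketch of such a data check is shown below, in Python. It is numeric rather than visual and assumes that missing entries are recorded as `None` and that the common 1.5 × IQR rule is an acceptable way to flag suspicious values. The function name `summarize` and the data are hypothetical.

```python
from statistics import mean, median, quantiles


def summarize(values):
    """Basic pre-test checks: missing entries, summary statistics, IQR outlier flags."""
    present = [v for v in values if v is not None]

    # Quartiles of the non-missing values (default "exclusive" method).
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    return {
        "n_missing": len(values) - len(present),
        "mean": mean(present),
        "median": median(present),
        # Values outside the 1.5 * IQR fences are flagged for manual review.
        "outliers": [v for v in present if v < low or v > high],
    }


# Made-up data with one missing entry and one likely typing error (57.0 vs 5.7).
raw = [5.1, 4.9, 5.6, None, 5.3, 57.0, 5.4, 5.9, 5.2, 5.5]
report = summarize(raw)
print(report)
```

Here the mean is pulled far above the median by a single mistyped value, which is the kind of entry error that could distort a test statistic if it were not caught before testing.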
In summary, hypothesis testing is an essential statistical method used in research. It is conducted in five steps: defining the null and alternative hypotheses, selecting the significance level (alpha), calculating the test statistic, determining the critical region, and interpreting the results. The hypothesis testing process requires proper data preparation using exploratory data analysis. This preparation ensures accurate information entry and minimizes the probability of type I and type II errors, which occur when the null hypothesis is falsely rejected or falsely retained, respectively.
References
Alkarkhi, A. F. (2020). Applications of hypothesis testing for environmental science. Elsevier.
Denis, D. J. (2018). SPSS data analysis for univariate, bivariate, and multivariate statistics. John Wiley & Sons.
Vrancich, M. (2019). Exploratory data analysis of stream data in sports medicine domain. ACM, 1-5.