Hypothesis Testing
As a statistical process for making decisions based on the significance of findings, hypothesis testing is central to fields such as business, science, and mathematics. A real-world example of hypothesis testing is a comparison of the battery life of Android and iOS smartphones with similar features. Android and iOS are the two major mobile operating systems, developed by Google and Apple, respectively.
Battery life is an essential feature of a mobile phone because it determines how useful and convenient the phone is on a daily basis. Smartphones with long battery life are desirable because users do not have to charge them frequently, which saves time. Mobile phone companies strive to develop phones with long battery life to satisfy the needs of customers, increase sales, compete in crowded markets, and build a distinctive brand.
Moreover, customers need to make informed choices about the mobile phones they purchase and get the desired value for their money. Mobile phone companies claim that their respective smartphones have long battery life, as indicated by the charge capacity and voltage. In this view, hypothesis testing would help determine whether a difference in battery life exists between smartphones with Android and iOS operating systems.
The claim is that Android and iOS smartphones do not have the same battery life because they use different mobile operating systems. Formulating a hypothesis from the claim is a critical step in hypothesis testing. Since the claim does not specify the direction of the difference, the test would be two-tailed. The null hypothesis, in this case, is that the mean battery life of Android smartphones equals that of iOS smartphones (H0: µ1 = µ2). In contrast, the alternative hypothesis is that the two means differ (Ha: µ1 ≠ µ2).
Bowerman, O’Connell, Murphree, and Orris refer to this form of the hypothesis as “a two-sided, not equal to the alternative hypothesis” because either mean can be greater than the other (330). To test the hypothesis, data would be collected on the battery life of smartphones with Android and iOS operating systems and similar features regarding charge capacity, voltage, and the number of installed applications. Random selection of an adequate sample (n) of each type of smartphone is necessary to reduce sampling bias and enhance external validity. The selected smartphones would be fully charged, and their standby battery life measured and recorded.
In data analysis, the sample means in the descriptive statistics would indicate whether Android or iOS smartphones have the longer battery life in the sample. A two-sample t-test would determine whether the apparent difference in means is statistically significant. If there is enough evidence at the 0.05 significance level (p < 0.05), the null hypothesis is rejected, but if there is insufficient evidence (p ≥ 0.05), the test fails to reject the null hypothesis.
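The decision rule above can be sketched in code. The following is a minimal illustration in Python using only the standard library, not a definitive implementation. It assumes Welch's unequal-variance form of the two-sample t-test (the essay does not say which variant is meant) and obtains the two-sided p-value by numerically integrating the t density. The function names `t_pdf`, `two_sided_p`, and `welch_t_test` are hypothetical, and the battery-life figures are invented purely for illustration.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, using Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)


def welch_t_test(a, b):
    """Return (t, df, p) for Welch's unequal-variance two-sample t-test."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df, two_sided_p(t_stat, df)


# Hypothetical standby battery life in hours (invented, not measured data).
android = [41.2, 38.5, 44.1, 39.8, 42.6, 40.3, 43.0, 37.9]
ios = [36.4, 39.1, 35.8, 38.2, 37.5, 34.9, 38.8, 36.0]

t_stat, df, p = welch_t_test(android, ios)
decision = "reject H0" if p < 0.05 else "fail to reject H0"
print(f"t = {t_stat:.3f}, df = {df:.1f}, p = {p:.4f}: {decision}")
```

In practice, a statistics package would normally supply the p-value; the numerical integration here only keeps the sketch self-contained.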
One benefit of hypothesis testing is that it replaces subjective estimation, which is prone to bias, with an objective test of claims. However, the concern is that hypothesis testing is prone to statistical errors. The rejection of a true null hypothesis is a Type I error, whereas failure to reject a false null hypothesis is a Type II error (Bowerman, O’Connell, Murphree, and Orris 331). A lower significance level reduces the risk of a Type I error, although on its own it raises the risk of a Type II error, while a larger sample size reduces the risk of a Type II error at a given significance level.
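The trade-off between the two error types can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not part of the original analysis: for simplicity it swaps the t-test for a two-sample z-test with a known standard deviation, draws normally distributed data, and uses an arbitrary true difference of 0.5 standard deviations. The function name `rejection_rate` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(true_diff, n, alpha, sigma=1.0, trials=2000, seed=0):
    """Share of simulated two-sample z-tests (known sigma) that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    se = sigma * (2 / n) ** 0.5  # standard error of the difference in means
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_diff, sigma) for _ in range(n)]
        z = (sum(b) / n - sum(a) / n) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


for alpha in (0.05, 0.01):
    for n in (10, 40):
        # H0 true (no difference): every rejection is a Type I error.
        type_1 = rejection_rate(0.0, n, alpha)
        # H0 false (difference of 0.5): every non-rejection is a Type II error.
        type_2 = 1 - rejection_rate(0.5, n, alpha)
        print(f"alpha={alpha:.2f} n={n:>2}  Type I rate={type_1:.3f}  "
              f"Type II rate={type_2:.3f}")
```

Under these assumptions, the simulated Type I rate should stay close to the chosen significance level regardless of sample size, while the Type II rate falls as the sample grows and rises as the significance level is lowered.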
Regression Analysis
Regression analysis is a statistical procedure used to predict a dependent variable from one or more independent variables. A real-world example is the influence of computer features on the preference of customers. Computer companies manufacture computers with different features with a view to meeting the unique needs of customers in various markets across the world. To understand how different features influence the preference of customers, computer companies need to perform multiple regression analysis. Major computer features are the operating system, hard disk memory size, speed, random access memory, and model.
Given that customers choose and purchase computers based on their unique preferences, computer companies have to understand these preferences and create customized computers. Regression analysis requires that the variables, both dependent and independent, be rated on a numeric scale. In this case, customer preference is the dependent variable, whereas the computer features are the independent variables.
The purpose of multiple regression analysis is to determine the extent to which each independent variable predicts the dependent variable. In this example, multiple regression analysis seeks to determine the extent to which operating system, hard disk memory, speed, random access memory, and model explain customer preference for computers in the market. The computer features would be rated on a scale of 1 to 10 and the preference for the computer on a scale of 1 to 5. The ratings quantify the variables and make quantitative analysis possible.
The claim is that operating system, hard disk memory size, speed, random access memory, and model are predictors of customer preference for computers. The claim forms the basis of hypothesis testing through multiple regression analysis. The null hypothesis is that operating system, hard disk memory size, speed, random access memory, and model are not statistically significant predictors of customer preference, that is, all of their regression coefficients equal zero (H0: β1 = β2 = β3 = β4 = β5 = 0). The alternative hypothesis is that at least one of these features is a statistically significant predictor of customer preference (Ha: at least one coefficient differs from zero). The regression equation comprises the y-intercept (β0), the coefficient of each predictor (β1 through β5), and the error term (ε) (Heiman 49). In this view, the following is the regression model of this multiple regression analysis:
- Customer preference = β0 + β1(operating system) + β2(hard disk memory) + β3(speed) + β4(random access memory) + β5(model) + ε.
Performing the multiple regression analysis would provide R and R-square values, which are central to the interpretation of the regression outcome. R, the multiple correlation coefficient, indicates the strength of the relationship between the predictors (operating system, hard disk memory, speed, random access memory, and model) and the criterion (customer preference). R-square shows the proportion of variation in the criterion variable that the predictors explain (Heiman 51).
Moreover, multiple regression analysis indicates the significance of the regression model in predicting the relationship between predictors and criterion. A strong relationship and a statistically significant regression model are necessary to elucidate the influence of predictors on the criterion. The p-value of the regression model determines whether to reject or fail to reject the null hypothesis that operating system, hard disk memory size, speed, random access memory, and model are not statistically significant predictors of customer preference for computers.
The regression coefficient of each predictor indicates the size and the significance of that predictor's contribution (Heiman 49). Therefore, computer companies would examine the strength of the relationship, the significance of the regression model, and the regression coefficient of each predictor. As a result, computer companies would be able to identify statistically significant predictors of customer preferences and design computers that match these predictors.
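The quantities discussed in this section (coefficients, R, R-square, and the overall F statistic of the model) can be sketched in code. The following is a minimal, standard-library illustration of ordinary least squares, assuming the five features are rated 1 to 10 as described. The ratings are simulated with an arbitrary rule chosen only to produce data, and the names `solve` and `fit_ols` are hypothetical. It stops at the F statistic and leaves the p-value, which comes from the F distribution, to a statistics package.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least squares with an intercept; returns (coefficients, R, R^2, F)."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    k = p - 1  # number of predictors
    if r2 >= 1:
        f_stat = float("inf")  # perfect fit
    else:
        f_stat = (r2 / k) / ((1 - r2) / (len(y) - k - 1))
    return beta, max(r2, 0.0) ** 0.5, r2, f_stat


names = ["intercept", "operating system", "hard disk memory", "speed",
         "random access memory", "model"]

# Simulated ratings (hypothetical): 40 respondents, five features rated 1-10.
rng = random.Random(42)
ratings = [[rng.randint(1, 10) for _ in range(5)] for _ in range(40)]
# Arbitrary rule used only to generate preference scores between 1 and 5.
preference = [min(5.0, max(1.0, 1.0 + 0.2 * r[2] + 0.15 * r[3]
                           + rng.gauss(0, 0.5))) for r in ratings]

beta, r, r2, f_stat = fit_ols(ratings, preference)
for name, b in zip(names, beta):
    print(f"{name:>22}: {b: .3f}")
print(f"R = {r:.3f}, R-square = {r2:.3f}, F = {f_stat:.2f}")
```

Solving the normal equations directly keeps the sketch short; numerical libraries typically use more stable decompositions for the same fit.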
Works Cited
Bowerman, Bruce, Richard O’Connell, Emily Murphree, and James Orris. Essentials of Business Statistics. 5th ed., New York: McGraw-Hill Education, 2015.
Heiman, Gary. Basic Statistics for the Behavioral Sciences. New York: Cengage Learning, 2013.