This paper analyzed two related metrics, namely the weight and height of individuals. The sample consisted of people acquainted with the author. Descriptive statistics, scatter plot visualization, and correlation analysis were used. Descriptive statistics provide general information about the data distribution, as shown in Table 1 below. The average individual in the sample was 174.6 cm tall and weighed 71.5 kg. The median, the value that divides the ordered distribution into two halves, was 174.5 cm for height and 71 kg for weight. Because the means and medians are close, neither distribution appears to be strongly skewed. The maximum height and weight values were 190 cm and 93 kg, respectively, and both belonged to the same respondent. In contrast, the minimum values, 156 cm and 48 kg, came from different respondents.
Table 1. Descriptive Statistics for the Height and Weight of the Sample
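The same summary measures can also be reproduced outside MS Excel. The following is a minimal Python sketch, not the procedure used in this paper: the function name `describe` is introduced here for illustration, and the two lists are hypothetical placeholder values rather than the actual sample.

```python
import statistics


def describe(values):
    """Return the mean, median, minimum, and maximum of one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical placeholder values, NOT the sample analyzed in this paper.
heights_cm = [160, 170, 175, 180, 190]
weights_kg = [55, 65, 70, 80, 90]

height_summary = describe(heights_cm)
weight_summary = describe(weights_kg)

print("Height:", height_summary)
print("Weight:", weight_summary)
```

Replacing the placeholder lists with the recorded measurements should give the same figures as the corresponding Excel functions, since the mean, median, minimum, and maximum are defined identically in both tools.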
Visualizing the data with a scatter plot helps to reveal the overall trend of the two variables relative to each other, which is useful for determining the nature of their relationship before any formal analysis. For the sample, the scatter plot is shown in Figure 1: as height increased, weight tended to increase as well. Correlation analysis confirmed this: the Pearson correlation coefficient was 0.75. This is a strong positive correlation, which shows that as one variable increases, so does the other (Fernando, 2021). In general, correlation is used to assess the strength and direction of the linear relationship between variables. Finally, the coefficient of determination R² was calculated as the square of the correlation coefficient. A value of 0.5562 means that 55.62% of the variance in one variable can be explained by a linear relationship with the other, while the remaining 44.38% is left unexplained. In other words, a linear regression fits the data moderately well.
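The relationship between the correlation coefficient and R² can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name `pearson_r` and the placeholder lists are hypothetical and are not the data of this paper. The final lines use only the figures reported above and check that R² = 0.5562 is consistent with a correlation of 0.75, taking the positive root because the reported correlation is positive.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder values, NOT the sample analyzed in this paper.
heights_cm = [160, 170, 175, 180, 190]
weights_kg = [55, 65, 70, 80, 90]

r = pearson_r(heights_cm, weights_kg)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")

# Consistency check using only the values reported in the text.
reported_r_squared = 0.5562
implied_r = math.sqrt(reported_r_squared)
print(round(implied_r, 2))  # 0.75
```

In a simple linear regression with one predictor, R² equals the square of the Pearson coefficient, which is why the two reported values can be checked against each other in this way.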
In conclusion, this activity consolidated my general knowledge of the features of MS Excel. I understand that the program offers far more than the functions used here, and there is still a lot to learn. However, I can already perform correlation and regression analyses for two variables, construct a scatter plot, and compute useful descriptive statistics to characterize distributions.
Reference
Fernando, J. (2021). Correlation coefficient. Investopedia.