Statistics. Exploring the Festival Data Coursework

Updated: Apr 6th, 2024

Introduction

Why do we do a normality test?

A normality test is usually carried out to check whether the data are normally distributed, because the test statistics used in the analysis assume a normal distribution.

Activity One

Data exploring

In this study, we explore whether the festival data are normally distributed, using the histogram and the normal probability plot of the data (Ball, 2001). The festival data consist of three variables: day 1, day 2, and day 3. We investigate whether all three variables follow a normal distribution, and we also present a frequency table of the three variables.

The histogram of day 1 variable

From the day 1 histogram, we can observe that the mean is 1.79, the standard deviation is 0.944, and the number of observations is 810. The histogram suggests that the day 1 festival data are approximately normally distributed and symmetrical about the mean. This means that the day 1 Download Festival data can be used in analyses that assume normality.

The normal probability plot of hygiene day 1 of download festival

The trended hygiene data for day 1 of Download Festival show that the data are normally distributed. This is indicated by the points of the residual P-P plot for the day 1 hygiene data, which lie close to the line, implying that the errors and the festival data are normally distributed.
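
The reading of a P-P plot described above can be sketched in code. This is a minimal illustration, not the procedure used to produce the plots in this paper: the plotting position `(rank - 0.5) / n`, the helper names `pp_points` and `max_deviation`, and both data lists are assumptions introduced for the example. Each point pairs the observed cumulative probability with the cumulative probability expected under a normal distribution fitted to the sample; points far from the diagonal suggest a departure from normality.

```python
from statistics import NormalDist, mean, stdev

def pp_points(values):
    """Return (observed, expected) cumulative-probability pairs for a normal P-P plot."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))  # normal fitted to the sample
    points = []
    for rank, x in enumerate(ordered, start=1):
        observed = (rank - 0.5) / n   # assumed plotting position for the empirical CDF
        expected = fitted.cdf(x)      # cumulative probability under the fitted normal
        points.append((observed, expected))
    return points

def max_deviation(points):
    """Largest vertical distance of any point from the diagonal line."""
    return max(abs(obs - exp) for obs, exp in points)

# Hypothetical scores for illustration only (not the festival data)
symmetric = [1.0, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.8]
with_outlier = [1.0, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 20.0]

print(round(max_deviation(pp_points(symmetric)), 3))
print(round(max_deviation(pp_points(with_outlier)), 3))
```

In this sketch the sample containing a single extreme value lies much further from the diagonal than the symmetric sample, which is the pattern the text describes as "far from the line".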

The exploring of hygiene day 1 of download festival

The normal probability plot shows that the hygiene festival data are not normally distributed when the data have not been trended. The plotted points lie very far from the line, implying that the data are not normally distributed.

The exploring of hygiene day 2 of download festival

From the day 2 histogram, we can observe that the mean is 0.96, the standard deviation is 0.721, and the number of observations is 264. The histogram shows that the day 2 festival data are not normally distributed and not symmetrical about the mean. This means that the day 2 Download Festival data cannot be used as they stand in analyses that assume normality (Kutner, Nachtsheim, Neter & Li, 2005).

The normal probability plot of hygiene day 2 of download festival

The trended hygiene data for day 2 of Download Festival show that the data are normally distributed. This is indicated by the points of the residual P-P plot for the day 2 hygiene data, which lie close to the line, implying that the errors and the festival data are normally distributed.

Normal P-P Plot

The normal probability plot shows that the hygiene festival data are not normally distributed when the data have not been trended. The plotted points lie very far from the line, implying that the data are not normally distributed.

Exploring day 3 of download festival

The histogram of the hygiene day 3 of download festival

From the day 3 histogram, we can observe that the mean is 0.98, the standard deviation is 0.71, and the number of observations is 123. The histogram shows that the day 3 festival data are not normally distributed and not symmetrical about the mean. This means that the day 3 Download Festival data cannot be used as they stand in analyses that assume normality. The day 3 hygiene data contain two outliers; before the data can be used in any analysis that requires the normality assumption, the outliers need to be removed and the data transformed towards normality.
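
One common way to flag candidate outliers such as those mentioned above is the interquartile-range rule. The sketch below is an assumption for illustration: the paper does not state how its two outliers were identified, the multiplier 1.5 is a conventional default, and the `scores` list is hypothetical rather than taken from the day 3 data.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)   # first, second, and third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical hygiene-like scores with two large values (illustration only)
scores = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 3.3, 3.4]
print(iqr_outliers(scores))
```

Values flagged this way would then be inspected before deciding whether to remove them or to transform the variable.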

The normal probability plot of the hygiene day 3 of download festival

The normal probability plot shows that the hygiene festival data are not normally distributed when the data have not been trended. The plotted points lie very far from the line, implying that the data are not normally distributed.

The descriptive statistics of the download festival

The trended hygiene data for day 3 of Download Festival show that the data are normally distributed. This is indicated by the points of the residual P-P plot for the day 3 hygiene data, which lie close to the line, implying that the errors and the festival data are normally distributed.

Table 1 (see Appendix) shows the descriptive statistics of the variables.

From the descriptive statistics, hygiene on day 1 of Download Festival had no missing values; its mean is 1.7934, the median is 1.79, the variance is 0.892, the skewness statistic is 8.865, and the kurtosis statistic is 170.45. Hygiene on day 2 had 546 missing values, a mean of 0.9609, a median of 0.79, a variance of 0.52, a skewness statistic of 1.095, and a kurtosis statistic of 0.822. Hygiene on day 3 had 687 missing values, a mean of 0.9765, a median of 0.76, a variance of 0.504, a skewness statistic of 1.033, and a kurtosis statistic of 0.732.
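
For readers who want to see how skewness and kurtosis statistics of this kind are obtained from raw scores, the following sketch uses the bias-adjusted sample formulas that statistics packages commonly report. It is an assumption that these are the formulas behind the table, the function names are introduced only for this example, and the two short data lists are hypothetical because the raw festival scores are not reproduced in this paper.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Bias-adjusted sample skewness; 0 indicates a symmetric sample."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def sample_kurtosis(values):
    """Bias-adjusted excess kurtosis; 0 corresponds to a normal distribution."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical samples for illustration only
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 9]

print(sample_skewness(symmetric), sample_kurtosis(symmetric))
print(sample_skewness(right_tailed), sample_kurtosis(right_tailed))
```

A positive skewness, as in the second sample, corresponds to a longer tail to the right of the mean.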

From the skewness value of 8.865, we can say that day 1 of Download Festival is skewed to the right of the mean; the standard error of the skewness statistic is 0.086. For day 2, the skewness value is 1.095.
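
The standard errors reported alongside the skewness and kurtosis statistics depend only on the sample size. The sketch below uses the usual large-sample formulas, which appear to reproduce the standard errors in Table 1 to three decimals, and then divides each skewness value by its standard error. Treating a ratio above 1.96 as significant skew is a common rule of thumb and an assumption here, not a step stated in this paper; the function names are also introduced only for illustration.

```python
import math

def se_skewness(n):
    """Standard error of the sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of the sample excess kurtosis for a sample of size n."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# Sample sizes and skewness statistics taken from Table 1
table1 = {"day 1": (810, 8.865), "day 2": (264, 1.095), "day 3": (123, 1.033)}

for day, (n, skew) in table1.items():
    se = se_skewness(n)
    z = skew / se   # skewness expressed in standard-error units
    print(day, round(se, 3), round(se_kurtosis(n), 3), round(z, 2), z > 1.96)
```

Under that rule of thumb, all three days show skewness well beyond what sampling error alone would explain, with day 1 the most extreme.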

The description of numeracy and computer literacy

From the descriptive statistics in Table 2, the numeracy sample at Duncetown University has 50 observations; the mean is 4.12, the median is 4, the standard deviation is 2.067, the skewness is 0.512, and the kurtosis statistic is -0.484. Because the skewness statistic is positive, the data are skewed to the right of the mean.

The descriptive statistics of numeracy and computer literacy at Sussex University

From the descriptive statistics, the numeracy sample at Sussex University has 50 observations; the mean is 5.58, the median is 5, the standard deviation is 3.071, the skewness is 0.793, and the kurtosis statistic is 0.26. Because the skewness statistic is positive, the data are skewed to the right of the mean.

The histogram of the numeracy

From the histogram, numeracy at Duncetown University is skewed to the right.

The numeracy data are not symmetrical about the mean, and they contain outliers.

The histogram of computer literacy

Computer literacy

From the histogram of computer literacy, we can see that the data are normally distributed about the mean. The computer literacy data contain three outliers.

Percentage

The test of homogeneity

Descriptive

Variable                         University            N    Mean    Std. Deviation  Std. Error  95% CI Lower Bound  95% CI Upper Bound  Minimum  Maximum
Computer literacy                Duncetown University  50   50.26   8.068           1.141       47.97               52.55               35       67
Computer literacy                Sussex University     50   51.16   8.505           1.203       48.74               53.58               27       73
Computer literacy                Total                 100  50.71   8.260           .826        49.07               52.35               27       73
Percentage of lectures attended  Duncetown University  50   56.260  23.7726         3.3619      49.504              63.016              8.0      100.0
Percentage of lectures attended  Sussex University     50   63.270  18.9697         2.6827      57.879              68.661              12.5     100.0
Percentage of lectures attended  Total                 100  59.765  21.6848         2.1685      55.462              64.068              8.0      100.0

Test of Homogeneity of Variances

Variable                         Levene Statistic  df1  df2  Sig.
Computer literacy                .064              1    98   .801
Percentage of lectures attended  1.731             1    98   .191
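
The standard error and confidence bounds in the Descriptive table above follow from the mean, standard deviation, and sample size. The sketch below recomputes them for computer literacy at Duncetown University. The critical value 2.0096 is the two-sided 95% point of Student's t with 49 degrees of freedom, hard-coded here as a looked-up assumption because the Python standard library has no t distribution; the function name `mean_ci` is likewise introduced only for this example.

```python
import math

# Two-sided 95% critical value of Student's t with 49 degrees of freedom
# (looked up and rounded; an assumption of this sketch)
T_CRIT_DF49 = 2.0096

def mean_ci(mean_value, sd, n, t_crit):
    """Return (standard error, lower bound, upper bound) for a sample mean."""
    se = sd / math.sqrt(n)
    return se, mean_value - t_crit * se, mean_value + t_crit * se

# Computer literacy, Duncetown University: mean 50.26, SD 8.068, N 50
se, lower, upper = mean_ci(50.26, 8.068, 50, T_CRIT_DF49)
print(round(se, 3), round(lower, 2), round(upper, 2))
```

The printed values can be compared with the Std. Error, Lower Bound, and Upper Bound columns of the table.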

From the test of homogeneity of variances, the Levene statistic for computer literacy is 0.064 with a significance value of 0.801, which is greater than 0.05. This means that we fail to reject the hypothesis that the variances are homogeneous, and we conclude that the variance of computer literacy is homogeneous across the two universities. For percentage of lectures attended, the Levene statistic is 1.731 with a significance value of 0.191, which is also greater than 0.05. Again we fail to reject the hypothesis of homogeneous variances, and we conclude that the variance is homogeneous.

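ANOVA

Variable                         Source          Sum of Squares  df  Mean Square  F      Sig.
Computer literacy                Between Groups  20.250          1   20.250       .295   .588
Computer literacy                Within Groups   6734.340        98  68.718
Computer literacy                Total           6754.590        99
Percentage of lectures attended  Between Groups  1228.503        1   1228.503     2.656  .106
Percentage of lectures attended  Within Groups   45324.225       98  462.492
Percentage of lectures attended  Total           46552.727       99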
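
The mean squares and F ratios in the ANOVA table above can be checked from the sums of squares and degrees of freedom. The following is a small arithmetic sketch using the tabulated values; the function name `f_ratio` is an assumption introduced for the example, and the significance values are not recomputed.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and degrees of freedom taken from the ANOVA table
computer_literacy = f_ratio(20.250, 1, 6734.340, 98)
lectures_attended = f_ratio(1228.503, 1, 45324.225, 98)

print([round(v, 3) for v in computer_literacy])
print([round(v, 3) for v in lectures_attended])
```

Each F value is the between-groups mean square divided by the within-groups mean square, so a small F, as for computer literacy, indicates that the group means differ little relative to the variation within groups.
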
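Test of Homogeneity of Variance

Variable                         Basis                                 Levene Statistic  df1  df2     Sig.
Computer literacy                Based on Mean                         .064              1    98      .801
Computer literacy                Based on Median                       .108              1    98      .743
Computer literacy                Based on Median and with adjusted df  .108              1    90.900  .743
Computer literacy                Based on trimmed mean                 .069              1    98      .793
Percentage of lectures attended  Based on Mean                         1.731             1    98      .191
Percentage of lectures attended  Based on Median                       1.422             1    98      .236
Percentage of lectures attended  Based on Median and with adjusted df  1.422             1    89.497  .236
Percentage of lectures attended  Based on trimmed mean                 1.714             1    98      .194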

We can observe from the test of homogeneity of variance table that all the significance values are greater than 0.05. This means that we fail to reject the hypothesis that the variances are homogeneous, and we conclude that both computer literacy and percentage of lectures attended have homogeneous variances, whether the test is based on the mean, the median, or the trimmed mean.

Assumption violation

When the assumption of homogeneity of variance is violated, the data fail to obey the normality assumptions. This leads to contradictory conclusions, and hence there is overdispersion of the parameters.

References

Ball, K. S. (2001). The Use of Human Resource Information Systems: a Survey. Personnel Review, 30(6), 667-693.

Kutner, M., Nachtsheim, C., Neter, J., & Li, W. (2005). Applied Linear Statistical Models (5th ed.). New York: McGraw-Hill/Irwin.

Appendix

Table 1: The descriptive statistic of the variables day 1, day 2, and day 3

Statistics

Statistic               Hygiene (Day 1 of Download Festival)  Hygiene (Day 2 of Download Festival)  Hygiene (Day 3 of Download Festival)
N Valid                 810                                   264                                   123
N Missing               0                                     546                                   687
Mean                    1.7934                                .9609                                 .9765
Median                  1.7900                                .7900                                 .7600
Mode                    2.00                                  .23                                   .44a
Std. Deviation          .94449                                .72078                                .71028
Variance                .892                                  .520                                  .504
Skewness                8.865                                 1.095                                 1.033
Std. Error of Skewness  .086                                  .150                                  .218
Kurtosis                170.450                               .822                                  .732
Std. Error of Kurtosis  .172                                  .299                                  .433
Range                   20.00                                 3.44                                  3.39

a. Multiple modes exist. The smallest value is shown.

Table 2: The descriptive statistic of the numeracy

Statistics

Duncetown University

Statistic               Computer literacy  Percentage of lectures attended  Numeracy  Percentage on SPSS exam
N Valid                 50                 50                               50        50
N Missing               0                  0                                0         0
Mean                    50.26              56.260                           4.12      40.18
Median                  49.00              60.500                           4.00      38.00
Mode                    48a                48.5a                            4         34a
Std. Deviation          8.068              23.7726                          2.067     12.589
Variance                65.094             565.135                          4.271     158.477
Skewness                .225               -.309                            .512      .309
Std. Error of Skewness  .337               .337                             .337      .337
Kurtosis                -.515              -.383                            -.484     -.567
Std. Error of Kurtosis  .662               .662                             .662      .662
Range                   32                 92.0                             8         51

Sussex University

Statistic               Computer literacy  Percentage of lectures attended  Numeracy  Percentage on SPSS exam
N Valid                 50                 50                               50        50
N Missing               0                  0                                0         0
Mean                    51.16              63.270                           5.58      76.02
Median                  54.00              65.750                           5.00      75.00
Mode                    54                 42.0a                            5         72a
Std. Deviation          8.505              18.9697                          3.071     10.205
Variance                72.341             359.849                          9.432     104.142
Skewness                -.538              -.365                            .793      .272
Std. Error of Skewness  .337               .337                             .337      .337
Kurtosis                1.379              -.221                            .260      -.264
Std. Error of Kurtosis  .662               .662                             .662      .662
Range                   46                 87.5                             13        43

a. Multiple modes exist. The smallest value is shown.

Reference

IvyPanda. (2024, April 6). Statistics. Exploring the Festival Data. https://ivypanda.com/essays/statistics-exploring-the-festival-data/
