Descriptive Statistics of Geysers’ Eruptions Coursework

Exclusively available on IvyPanda
Updated: Mar 12th, 2024

Introduction

Geysers are natural phenomena that attract tourists to Yellowstone National Park. Understanding the waiting time between eruptions and the duration of eruptions would support more accurate predictions of when eruptions occur. Statistical analysis of these two variables can reveal their trends and patterns. In statistical analysis, descriptive statistics play a central role in data exploration because they characterize and summarize a data set effectively. Therefore, the purpose of this coursework is to explore data on 299 eruptions of the Old Faithful geyser at Yellowstone National Park by examining descriptive statistics of the waiting time and the duration of eruptions.

Descriptive Statistics

Table 1 reports the means and modes of the waiting time for eruptions and the duration of eruptions in minutes; the medians are given in the text. The waiting time for eruptions has a mean of 72.314, a mode of 78, and a median of 76. Because the mean, mode, and median are not equal, the distribution of the waiting time appears to be skewed. Specifically, the waiting time data suggest a negative skew because the mean is less than both the mode and the median. Comparatively, the duration of eruptions has a mean of 3.461, a mode of 4, and a median of 4. Like the waiting time, the duration of eruptions suggests a negatively skewed distribution because the mean is less than both the mode and the median. Thus, the measures of central tendency indicate that the modes and medians of both the waiting time and the duration of eruptions do not coincide with their respective means.
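The comparison of mean, median, and mode described above can be sketched in Python. This is only an illustration under stated assumptions: the `waiting` list is a small made-up sample, not the 299 Old Faithful observations, and `skew_direction` is a hypothetical helper that applies the rule of thumb used in the text (mean below median suggests negative skew). The rule is a heuristic and can mislead for multimodal data.

```python
import statistics

# Hypothetical waiting times in minutes (illustrative only, not the coursework data).
waiting = [51, 54, 78, 78, 80, 82, 78, 60, 85, 74]

mean = statistics.mean(waiting)
median = statistics.median(waiting)
mode = statistics.mode(waiting)

def skew_direction(mean_value, median_value):
    """Rule-of-thumb skew direction from the mean and median (a heuristic only)."""
    if mean_value < median_value:
        return "negative"
    if mean_value > median_value:
        return "positive"
    return "none"

print(f"mean={mean}, median={median}, mode={mode}")
print("suggested skew:", skew_direction(mean, median))
```

Applied to the values reported in the text, `skew_direction(72.314, 76)` and `skew_direction(3.461, 4)` both give the negative direction that the coursework describes.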

Table 1. Descriptive Statistics for Geysers’ Waiting Time and Duration of Eruptions

Variable                    Mean    Std Dev  Variance  Minimum  Maximum  Mode    Range   N
Waiting Time for Eruptions  72.314  13.890   192.941   43.000   108.000  78.000  65.000  299
Duration of Eruptions       3.461   1.148    1.318     0.833    5.450    4.000   4.617   299

Table 1 also provides measures of dispersion, namely the standard deviation, variance, range, maximum value, and minimum value. The waiting time for eruptions has a standard deviation of 13.890 and a variance of 192.941, which means that observations deviate considerably from the mean (M = 72.314 ± 13.890 minutes). Its dispersion is high in absolute terms because it varies from 43 to 108 minutes, a range of 65. The duration of eruptions has a standard deviation of 1.148 and a variance of 1.318, so observations do not deviate markedly from the mean in absolute terms (M = 3.461 ± 1.148 minutes). Its dispersion is low because the data vary from 0.833 to 5.45 minutes, a range of 4.617. Therefore, measured in minutes, the waiting time for eruptions has a high level of dispersion, whereas the duration of eruptions has a low level of dispersion.
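As a quick consistency check on the dispersion figures, the sketch below recomputes the range and the variance (as the square of the standard deviation) from the values reported in Table 1. It also computes the coefficient of variation (standard deviation divided by the mean), which is an added measure that the original analysis does not report; it is included here only as one possible way to compare dispersion across variables measured on different scales. The dictionary layout and the `summarize` helper are assumptions made for illustration.

```python
# Values as reported in Table 1 (minutes); the dictionary layout is an assumption.
table1 = {
    "waiting": {"mean": 72.314, "sd": 13.890, "var": 192.941, "min": 43.000, "max": 108.000},
    "duration": {"mean": 3.461, "sd": 1.148, "var": 1.318, "min": 0.833, "max": 5.450},
}

def summarize(row):
    """Derive the range, sd squared, and coefficient of variation from one row."""
    return {
        "range": round(row["max"] - row["min"], 3),
        "var_from_sd": round(row["sd"] ** 2, 3),
        "cv": round(row["sd"] / row["mean"], 3),
    }

summary = {name: summarize(row) for name, row in table1.items()}
for name, stats in summary.items():
    print(name, stats)
```

The recomputed ranges and variances agree with the table up to rounding. The coefficient of variation comes out at roughly 0.19 for the waiting time and 0.33 for the duration, so relative to its mean the duration is the more variable of the two, even though its spread in minutes is smaller.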

Histogram and the Normality of the Distribution

The histogram for the waiting time for eruptions (Figure 1) indicates a bimodal distribution. The first cluster spans waiting times between 42 and 66 minutes, while the second spans 66 to 108 minutes. The distribution deviates from the normal distribution and has a negative skew because the first mode (54-60 minutes), at about 10%, is smaller than the second mode (78 minutes), at about 22%.

Figure 1. The Histogram for the Waiting Time for Eruptions
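The percentages read off a histogram like Figure 1 are relative frequencies per bin. The sketch below shows one way such a histogram could be tabulated in Python. The data are a small hypothetical sample rather than the coursework data, and the bin origin of 42 minutes and bin width of 6 minutes are assumptions chosen to resemble the intervals mentioned in the text; `relative_histogram` is a hypothetical helper.

```python
from collections import Counter

def relative_histogram(values, start, width):
    """Return {bin_lower_edge: percent of values in [edge, edge + width)}."""
    counts = Counter(start + width * int((v - start) // width) for v in values)
    total = len(values)
    return {edge: 100 * n / total for edge, n in sorted(counts.items())}

# Hypothetical waiting times in minutes (illustrative only).
waiting = [45, 52, 55, 58, 61, 70, 76, 78, 79, 81, 83, 90]

hist = relative_histogram(waiting, start=42, width=6)
for edge, pct in hist.items():
    # Text bar chart: one '#' per 5 percentage points.
    print(f"{edge:>3}-{edge + 6:<3} {'#' * round(pct / 5)} {pct:.1f}%")

modal_bin = max(hist, key=hist.get)
print("modal bin starts at", modal_bin)
```

With two clusters of unequal height, the taller bin dominates the mode while the smaller cluster on the low side pulls the mean down, which is the pattern the text describes for the waiting time.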

Figure 2 is a histogram that depicts a bimodal distribution in the duration of eruptions. The first mode, at 2 minutes, accounts for about 27% of eruptions, while the second mode, at 4 minutes, accounts for approximately 28%. The cluster of short eruptions around 2 minutes, treated here as outliers, pulls the mean down even though over 50% of data points exceed 3.5 minutes, which gives the distribution a negative skew. Therefore, the data do not follow the normal distribution because this cluster creates a second mode in the distribution.

Figure 2. Histogram for the Duration of Eruptions
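The direction of skew can also be checked numerically rather than read from the histogram. The sketch below computes the moment-based sample skewness (third central moment divided by the second central moment raised to the power 1.5), which is a standard measure but not one reported in the original analysis. The `durations` list is a made-up sample with a short cluster near 2 minutes and a longer cluster near 4 minutes, not the coursework data.

```python
def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical eruption durations in minutes (illustrative only).
durations = [2.0, 2.0, 1.8, 4.0, 4.2, 4.5, 4.0, 4.3, 4.6, 4.1]

g1 = skewness(durations)
print(f"skewness = {g1:.3f}")
print("negative skew" if g1 < 0 else "non-negative skew")
```

A negative value indicates that the low-duration cluster pulls the mean below the bulk of the data, consistent with the interpretation given for Figure 2.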

Learning Outcomes

In the data analysis, I have learned how to import data into SAS, compute descriptive statistics, and assess the normality of data using histograms. Figure 3 shows the process flow employed in the analysis of data using SAS. From a statistical point of view, I have learned that measures of central tendency, measures of dispersion, and the pattern of distributions offer an adequate characterization of data.

Figure 3. Process Flow of Data Analysis in SAS

Conclusion

Data analysis shows that eruptions of Old Faithful in Yellowstone National Park vary in both the waiting time before an eruption and the duration of the eruption. The average waiting time for eruptions is 72.314 minutes, while the average duration of eruptions is 3.461 minutes. The histograms depict negative skewness, and the distributions of both variables do not follow the normal distribution. The bimodal distribution of the waiting time for eruptions and the significant outliers in the duration of eruptions contribute to the skewness of the data.
