Descriptive Statistics Method: Household Income Analysis Essay

Introduction

What is the best way to understand what a dataset says about the matter of interest? Instead of inspecting raw data, it is easier to examine a summary of the values in the dataset. In statistics, such a summary is presented in the form of descriptive statistics, a quantitative description of the main features of a collection of information (Tanner, 2016). Descriptive statistics may include various calculations, among which are the mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, sum, and count. The present paper analyzes a dataset of the annual income of 20 households using descriptive statistics and reports the results of the analysis.

Dataset Overview and Central Tendency

The dataset under analysis includes 20 values with a mean of 31.85 and a standard deviation of 19.95. The dataset is presented in Table 1, while descriptive statistics are shown in Table 2.

Household   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
Income     22  25  28  18  24  25  20  26  29  23  25  31  27  25  32  26  99  26  28  78

Table 1. Dataset of Annual Household Income

Statistic             Income
Mean                   31.85
Standard Error          4.46
Median                 26.00
Mode                   25.00
Standard Deviation     19.95
Sample Variance       397.92
Kurtosis                7.85
Skewness                2.89
Range                  81.00
Minimum                18.00
Maximum                99.00
Sum                   637.00
Count                  20.00

Table 2. Dataset Descriptive Statistics
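
Most of the figures in Table 2 can be reproduced directly from the values in Table 1. The following is a minimal sketch using Python's standard statistics module. The variable names are illustrative, and the sketch assumes the sample (n - 1) formulas for the variance and the standard deviation, which agree with the reported values.

```python
import math
import statistics

# Annual household incomes from Table 1 (households 1 to 20).
incomes = [22, 25, 28, 18, 24, 25, 20, 26, 29, 23,
           25, 31, 27, 25, 32, 26, 99, 26, 28, 78]

count = len(incomes)
total = sum(incomes)
mean = statistics.mean(incomes)
median = statistics.median(incomes)
mode = statistics.mode(incomes)
sample_variance = statistics.variance(incomes)  # divides by n - 1
std_dev = statistics.stdev(incomes)             # square root of the sample variance
std_error = std_dev / math.sqrt(count)          # standard error of the mean
value_range = max(incomes) - min(incomes)

summary = {
    "Mean": mean,
    "Standard Error": std_error,
    "Median": median,
    "Mode": mode,
    "Standard Deviation": std_dev,
    "Sample Variance": sample_variance,
    "Range": value_range,
    "Minimum": min(incomes),
    "Maximum": max(incomes),
    "Sum": total,
    "Count": count,
}

# Print each statistic rounded to two decimals, as in Table 2.
for name, value in summary.items():
    print(f"{name:<20}{value:>10.2f}")
```

Skewness and kurtosis are left out here because the standard library has no built-in function for them.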

A central step in describing a dataset is evaluating its central tendency. There are three measures of central tendency: the mean, the median, and the mode. The mean is used most frequently; however, in some cases the median or the mode is preferred. The mode is used for describing datasets measured on a nominal scale, while the median is favored when a dataset has several extreme values (Tanner, 2016). As can be seen from Table 1, the dataset includes two extreme values, 99 and 78, so the median is the most appropriate measure of central tendency here.
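
To illustrate why the median is less sensitive to extreme values than the mean, the short sketch below compares both measures with and without the incomes of 78 and 99. It is an illustration only and not part of the original analysis; the variable names are assumptions.

```python
import statistics

# Annual household incomes from Table 1.
incomes = [22, 25, 28, 18, 24, 25, 20, 26, 29, 23,
           25, 31, 27, 25, 32, 26, 99, 26, 28, 78]

# The two extreme values discussed in the text.
extremes = {78, 99}
without_extremes = [x for x in incomes if x not in extremes]

mean_all = statistics.mean(incomes)
median_all = statistics.median(incomes)
mean_without = statistics.mean(without_extremes)
median_without = statistics.median(without_extremes)

print(f"All 20 values:    mean = {mean_all:.2f}, median = {median_all:.2f}")
print(f"Without 78 and 99: mean = {mean_without:.2f}, median = {median_without:.2f}")
```

Dropping the two values moves the mean from 31.85 to about 25.56, while the median only moves from 26 to 25.5.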

According to Table 2, the median (26) and the mode (25) are approximately the same, while the mean (31.85) differs considerably. Such differences between the measures of central tendency indicate an asymmetrical distribution of values. According to Tanner (2016), “when the measures of central tendency do not agree, it is because some scores on one side of the distribution are not counterbalanced by scores a similar distance from the mean on the other side of the distribution” (chapter 2.3). This imbalance indicates that the dataset has a skew significantly different from zero, which the skewness of 2.89 in Table 2 confirms. The only mode of the dataset is 25, as this value appears most frequently (4 times).
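
The skewness and kurtosis in Table 2 can be checked with the bias-adjusted sample formulas that common spreadsheet tools use. The source does not name the software behind Table 2, so the choice of these formulas is an assumption, and the function names below are hypothetical. A minimal sketch:

```python
import statistics

# Annual household incomes from Table 1.
incomes = [22, 25, 28, 18, 24, 25, 20, 26, 29, 23,
           25, 31, 27, 25, 32, 26, 99, 26, 28, 78]


def sample_skewness(values):
    """Bias-adjusted sample skewness: n / ((n-1)(n-2)) * sum(z^3)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    z3 = sum(((x - m) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * z3


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (zero for a normal distribution)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)
    z4 = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction


skewness = sample_skewness(incomes)
kurtosis = sample_excess_kurtosis(incomes)
print(f"Skewness: {skewness:.2f}")
print(f"Kurtosis: {kurtosis:.2f}")
```

A positive skewness means the long tail lies on the side of the high incomes, which is consistent with the mean exceeding the median.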

Interquartile Range and Outliers

As mentioned in the previous section, the dataset includes two extreme values, which may influence the mean considerably. In statistics, extreme values are called outliers, and they are often excluded because they may distort the results of the analysis, especially in small datasets. There are two common ways to identify outliers. The first method is to exclude values that lie more than two standard deviations below or above the mean. Since the mean is 31.85 and the standard deviation is 19.95, the limits are calculated as follows:

Lower limit = 31.85 - 2 × 19.95 = -8.05
Upper limit = 31.85 + 2 × 19.95 = 71.75

The calculations show that values below -8.05 or above 71.75 are outliers, which implies that the household incomes of 78 and 99 should be excluded from the analysis.
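
The two-standard-deviation rule can be sketched in a few lines of Python. The variable names are illustrative, and the sketch uses the unrounded sample standard deviation, so the limits match the reported ones only after rounding to two decimals.

```python
import statistics

# Annual household incomes from Table 1.
incomes = [22, 25, 28, 18, 24, 25, 20, 26, 29, 23,
           25, 31, 27, 25, 32, 26, 99, 26, 28, 78]

mean = statistics.mean(incomes)
std_dev = statistics.stdev(incomes)  # sample standard deviation (n - 1)

# Limits two standard deviations below and above the mean.
lower_limit = mean - 2 * std_dev
upper_limit = mean + 2 * std_dev

# Values outside the limits are flagged as outliers (kept in Table 1 order).
outliers = [x for x in incomes if x < lower_limit or x > upper_limit]

print(f"Lower limit: {lower_limit:.2f}")
print(f"Upper limit: {upper_limit:.2f}")
print(f"Outliers: {outliers}")
```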

The second method is to exclude all values that lie more than 1.5 interquartile ranges (IQRs) above the 75th percentile (Q3) or more than 1.5 IQRs below the 25th percentile (Q1). The IQR is calculated by subtracting Q1 from Q3. Given that Q1 = 24.5 and Q3 = 28.5, IQR = 28.5 - 24.5 = 4. Therefore, the limits are calculated as follows:

Lower limit = Q1 - 1.5 × IQR = 24.5 - 1.5 × 4 = 18.5
Upper limit = Q3 + 1.5 × IQR = 28.5 + 1.5 × 4 = 34.5

According to the second method, the dataset includes three outliers (18, 78, and 99), whereas the first method identified only two.
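
The IQR rule can be sketched as follows. The essay does not state how the quartiles were obtained. The sketch assumes that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, which reproduces the reported Q1 = 24.5 and Q3 = 28.5. The variable names are illustrative.

```python
import statistics

# Annual household incomes from Table 1.
incomes = [22, 25, 28, 18, 24, 25, 20, 26, 29, 23,
           25, 31, 27, 25, 32, 26, 99, 26, 28, 78]

ordered = sorted(incomes)
half = len(ordered) // 2

# Assumption: quartiles are the medians of the lower and upper halves.
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[-half:])
iqr = q3 - q1

# Limits 1.5 IQRs below Q1 and above Q3.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in ordered if x < lower_limit or x > upper_limit]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"Limits: {lower_limit} to {upper_limit}")
print(f"Outliers: {outliers}")
```

Quartile conventions differ between tools, so other software may report slightly different values of Q1 and Q3 for the same data, which can change which borderline values are flagged.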

Discussion

I believe that the most useful descriptive statistics are the mean, the count, and the standard deviation. The mean and the standard deviation allow the researcher to understand the average value of the data points and their dispersion. In other words, knowing only these two values allows the observer to understand roughly how the data is distributed. The total count is also vital, as it helps to judge whether the results are reliable or whether additional research may be required. However, for the present dataset it is vital to identify the median as well, to recognize that the dataset is skewed and that some extreme values may be present.

Conclusion

Descriptive statistics is a powerful method of summarizing a dataset with a small number of values. It conveys the central tendency and the distribution of the data points. However, researchers need to be aware that extreme values may distort the results of the analysis; therefore, every dataset should be checked for outliers before conducting the analysis.

Reference

Tanner, D. (2016). Statistics for the behavioral & social sciences (2nd ed.). Bridgepoint Education.

Cite This paper

IvyPanda. (2022, July 26). Descriptive Statistics Method: Household Income Analysis. https://ivypanda.com/essays/descriptive-statistics-method-household-income-analysis/