Housing Data Visualization and Descriptive Statistics Case Study


Introduction

This paper provides a descriptive analysis of four variables: the percentage of owner-occupied housing units (Pct Owner Occ), home value (Home Value), household income (HH Inc), and per capita income (Per Cap Inc). The variables were estimated for all 50 states and combined into a single dataset for this report. The purpose is to summarize each variable and highlight possible correlations among them before any in-depth inferential analysis is conducted. Descriptive statistics, correlation analysis, and data visualization methods (histograms, boxplots, and scatterplots) were used to achieve this purpose. This report supplements the analysis conducted in Microsoft Excel.


Descriptive Statistics

The purpose of descriptive statistics is to summarize a sample so that the distribution of each variable can be understood. Descriptive statistics usually include measures of central tendency (mean, median, and mode), measures of dispersion (such as standard deviation, variance, and range), and measures of shape (skewness and kurtosis). The four variables were described using Excel’s data analysis tool called “Descriptive Statistics.” The results of the analysis are provided in Table 1 below.

Table 1. Descriptive statistics by variable

| Statistic | Owner-occupied housing units (%) | Home Value ($) | Household income ($) | Per capita income ($) |
|---|---|---|---|---|
| Mean | 65.97 | 215,114.00 | 60,181.00 | 31,935.78 |
| Standard Error | 0.60 | 12,946.20 | 1,400.99 | 646.36 |
| Median | 66.30 | 186,100.00 | 58,848.00 | 30,921.00 |
| Mode | N/A | 166,800.00 | N/A | N/A |
| Standard Deviation | 4.28 | 91,543.45 | 9,906.47 | 4,570.49 |
| Sample Variance | 18.29 | 8,380,203,677.55 | 98,138,127.02 | 20,889,365.85 |
| Kurtosis | 1.13 | 5.22 | -0.62 | -0.22 |
| Skewness | -0.96 | 1.95 | 0.49 | 0.57 |
| Range | 19.04 | 473,200.00 | 38,301.00 | 19,622.00 |
| Minimum | 53.90 | 114,500.00 | 43,567.00 | 23,434.00 |
| Maximum | 72.94 | 587,700.00 | 81,868.00 | 43,056.00 |
| Sum | 3,298.48 | 10,755,700.00 | 3,009,050.00 | 1,596,789.00 |
| Count | 50 | 50 | 50 | 50 |

According to the analysis, the mean of Pct Owner Occ is 66%, with a standard deviation (SD) of 4.28 percentage points. The distribution is left-skewed (skewness = -0.96), and its tails are heavier than those of the normal distribution (kurtosis = 1.13). The mean Home Value is $215,114 with an SD of $91,543. The distribution of Home Value differs considerably from the normal distribution, as it is right-skewed with a large skewness coefficient (skewness = 1.95) and has very heavy tails (kurtosis = 5.22). The distributions of both HH Inc and Per Cap Inc are close to the normal distribution, as their skewness (0.49 and 0.57) and kurtosis (-0.62 and -0.22) are close to 0. The mean of HH Inc is $60,181 with an SD of $9,906, and the mean of Per Cap Inc is $31,936 with an SD of $4,570.
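For readers who want to reproduce statistics of this kind outside Excel, the following is a minimal Python sketch. It assumes the sample-based (n - 1) formulas that Excel documents for its STDEV.S, SKEW, and KURT functions, under which kurtosis is reported as excess kurtosis (about 0 for a normal distribution). The function name describe and the sample values are hypothetical and are not the state data analyzed in this report.

```python
import math
import statistics


def describe(values):
    """Summary statistics using sample (n - 1) formulas.

    Assumption: skewness and kurtosis follow the formulas documented
    for Excel's SKEW and KURT functions. Requires at least 4 values.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)            # sample standard deviation
    z = [(v - mean) / sd for v in values]    # standardized values
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "variance": sd ** 2,
        "standard_error": sd / math.sqrt(n),
        "skewness": skew,
        "kurtosis": kurt,                    # excess kurtosis
        "range": max(values) - min(values),
    }


# Hypothetical illustrative values with one large observation on the right.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 25.0]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

As a quick consistency check on Table 1, the standard error equals the SD divided by the square root of the count: for Home Value, 91,543.45 / √50 ≈ 12,946.2, which matches the reported value.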

Frequency Histograms and Boxplots

The purpose of the histograms and boxplots is to provide visualization for data distribution. Additionally, such visualizations help to identify outliers if there are any present. The present section will discuss the histograms and boxplots for Pct Owner Occ and Home Value only, as the analysis of these two variables generated valuable findings. The histograms and boxplots for HH Inc and Per Cap Inc revealed that the distributions of the variables were close to the normal distribution with no outliers.

Figures 1 and 2 below show the distribution of Pct Owner Occ in a histogram and a boxplot, respectively. The histogram shows that the left tail of the distribution is longer and suggests that there are outliers to the left of the mean. The boxplot confirms two low outliers for the percentage of owner-occupied housing units: New York (53.9%) and Nevada (55.8%).

Figure 1. Histogram for Pct Owner Occ
Figure 2. Boxplot for Pct Owner Occ
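The report does not state how the boxplot decides which points are outliers. A common convention, assumed here, is the 1.5 × IQR rule: a value is flagged if it lies more than 1.5 interquartile ranges below the first quartile or above the third quartile. The sketch below illustrates that rule. The function name iqr_outliers and the data are hypothetical, and the quartile method (Python's default "exclusive" method) is also an assumption that may differ slightly from the one used by a given charting tool.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: quartiles come from statistics.quantiles with its
    default 'exclusive' method; other tools may use other methods.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return flagged, (lower, upper)


# Hypothetical percentages with one unusually low value.
data = [10, 50, 52, 54, 55, 56, 57, 58, 60, 62, 64]
outliers, fences = iqr_outliers(data)
print(outliers, fences)
```

With these values the quartiles are 52 and 60, so the fences fall at 40 and 72 and only the value 10 is flagged.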

Figures 3 and 4 visualize the distribution of Home Value using the same methods. The histogram confirms that the distribution of Home Value by state is heavily right-skewed. It also suggests that outliers may be present, as some states have home values far above the mean. The boxplot shows two outliers: California ($475,900) and Hawaii ($587,700).

Figure 3. Histogram for Home Value
Figure 4. Boxplot for Home Value
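The frequency counts behind a histogram can be sketched in a few lines of Python. This is an illustration only: the function name histogram_counts, the choice of equal-width bins, and the sample values are assumptions, and Excel's own binning rules may place boundary values differently.

```python
def histogram_counts(values, bins):
    """Count values in equal-width bins spanning min(values)..max(values).

    Each bin includes its lower edge; the last bin also includes the
    maximum. This boundary convention is an assumption.
    """
    if bins < 1 or not values:
        raise ValueError("need at least one bin and one value")
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # If all values are equal, width is 0 and everything goes in bin 0.
        i = 0 if width == 0 else min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical right-skewed values: most are small, two are large.
edges, counts = histogram_counts([1, 2, 2, 3, 9, 10], bins=3)
print(edges, counts)
```

A right-skewed variable such as Home Value produces this kind of shape: most of the count sits in the lowest bins, with a few observations in the far right bins.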

Scatterplots and Correlations

Scatterplots and Pearson’s correlation analysis are used to assess the relationship between two variables. Scatterplots allow a visual assessment of the relationship, while Pearson’s correlation coefficient quantifies its strength and direction. The present section discusses scatterplots and correlation analysis for all variables. Table 2 below provides a correlation matrix for the variables.

Table 2. Correlation matrix

| | Owner-occupied housing units (%) | Home Value (median / dollars) | Household income (median / dollars) | Per capita income (median / dollars) |
|---|---|---|---|---|
| Owner-occupied housing units (%) | 1 | | | |
| Home Value (median / dollars) | -0.560276746 | 1 | | |
| Household income (median / dollars) | -0.292942749 | 0.776233137 | 1 | |
| Per capita income (median / dollars) | -0.271025204 | 0.640064149 | 0.915007192 | 1 |

The correlation analysis demonstrated that all the variables are correlated with each other to some degree. Home Value was found to have strong positive correlations with HH Inc (Pearson’s r = 0.78) and Per Cap Inc (Pearson’s r = 0.64). At the same time, Home Value and Pct Owner Occ had a medium negative correlation (Pearson’s r = -0.56). Pct Owner Occ had only weak negative correlations with HH Inc (Pearson’s r = -0.29) and Per Cap Inc (Pearson’s r = -0.27). The strongest correlation was found to be between HH Inc and Per Cap Inc (Pearson’s r = 0.92).
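For reference, Pearson’s r is the covariance of two variables divided by the product of their standard deviations. The following minimal Python sketch computes it from scratch and builds a small correlation matrix. The function names and the column values are hypothetical and are not the state data behind Table 2.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map each pair of column names to its Pearson's r."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical columns for illustration only.
columns = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 1, 4, 3, 5],
    "z": [10, 8, 6, 4, 2],
}
matrix = correlation_matrix(columns)
print(round(matrix["x"]["y"], 2), round(matrix["x"]["z"], 2))
```

The matrix is symmetric with ones on the diagonal, which is why Table 2 shows only the lower triangle.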

Figure 5 below shows the scatterplot of Pct Owner Occ against Home Value with a trendline. The points slope downward, which is consistent with the medium negative correlation. No curved or otherwise nonlinear pattern can be recognized, so the relationship appears to be approximately linear.

Figure 5. Scatterplot of Pct Owner Occ against Home Value
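The trendlines in the scatterplots are assumed here to be ordinary least-squares lines, which is what a linear trendline in a spreadsheet chart typically is. Under that assumption, the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means. The function name fit_trendline and the data below are hypothetical.

```python
def fit_trendline(x, y):
    """Least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sxy / sxx
    intercept = my - slope * mx   # line passes through (mean x, mean y)
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = fit_trendline([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The sign of the slope always matches the sign of Pearson’s r for the same pair of variables, so a negative correlation appears as a descending trendline.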

Figures 6 and 7 below show scatterplots of Pct Owner Occ against HH Inc and Pct Owner Occ against Per Cap Inc, respectively, with trendlines. The points in both scatterplots are widely scattered and do not form a clear pattern, which indicates weak correlations between the variables. The correlation coefficients nonetheless point to weak negative linear relationships.

Figure 6. Scatterplot of Pct Owner Occ against HH Inc
Figure 7. Scatterplot of Pct Owner Occ against Per Cap Inc

Figures 8 and 9 below demonstrate scatterplots of Home Value against HH Inc and Home Value against Per Cap Inc with trendlines. The data points form ascending linear patterns, which implies that there are strong positive correlations between the variables.

Figure 8. Scatterplot of Home Value against HH Inc
Figure 9. Scatterplot of Home Value against Per Cap Inc

Finally, Figure 10 visualizes the correlation between HH Inc and Per Cap Inc using a scatterplot with a trendline. The points lie close to an ascending straight line, which demonstrates a strong positive linear correlation between the two variables.

Figure 10. Scatterplot of HH Inc against Per Cap Inc

Conclusion

The descriptive analysis of the dataset produced several valuable findings. In particular, the descriptive statistics, boxplots, and histograms revealed that the distribution of Pct Owner Occ is left-skewed with heavy tails, and that the distribution of Home Value is right-skewed with even heavier tails. At the same time, the distributions of HH Inc and Per Cap Inc were found to be close to the normal distribution. The correlation analysis revealed that all the variables are correlated to some degree. The strongest correlation overall is between HH Inc and Per Cap Inc (r = 0.92). Home Value is most strongly correlated with HH Inc (r = 0.78), and its relationship with Per Cap Inc is also strong (r = 0.64).

Reference

IvyPanda. (2022, July 22). Housing Data Visualization and Descriptive Statistics. https://ivypanda.com/essays/housing-data-visualization-and-descriptive-statistics/
