In descriptive statistics, many calculated values and their graphical visualizations can help identify trends that are not obvious at first glance. Looking at a set of numerical data, it is usually not difficult to determine its mode, median, or even the approximate mean, since these measures of central tendency are almost intuitive. The median, however, is only the second quartile (Q2), which means there are other quartiles besides it. Among the most useful are the first and third quartiles, Q1 and Q3, which enclose the middle 50 percent of the data; the distance between them, Q3 − Q1, is called the interquartile range (IQR). With the IQR, Q1, and Q3, it becomes possible to identify outliers in the data set. This can be done using two simple formulas:
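Under the commonly used 1.5 × IQR convention (an assumption here, since the text does not state the multiplier), the two bounds can be written as:

```latex
\[
\mathrm{IQR} = Q_3 - Q_1, \qquad
\text{Lower bound} = Q_1 - 1.5 \times \mathrm{IQR}, \qquad
\text{Upper bound} = Q_3 + 1.5 \times \mathrm{IQR}
\]
```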
Substituting specific values into these formulas gives two bounds, a lower and an upper one, for any numerical data set. Outliers, therefore, are any values in the set that fall outside the interval between these bounds. Finally, the five main numbers (minimum, Q1, median, Q3, and maximum) are enough to draw a boxplot, which reflects the general trends in the data distribution.
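As a rough illustration of the steps above, the following Python sketch computes the five-number summary and the outlier bounds for a small made-up sample. The function names, the sample values, the multiplier k = 1.5, and the "inclusive" quartile method are all assumptions chosen for this example; other quartile conventions can give slightly different values.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sample of at least two values."""
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points.
    # The "inclusive" method is an assumption; the default "exclusive" method
    # may produce slightly different quartiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def outlier_bounds(q1, q3, k=1.5):
    """Return (lower, upper) bounds; k=1.5 is the conventional multiplier (assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values lying strictly outside the [lower, upper] interval."""
    _, q1, _, q3, _ = five_number_summary(values)
    lower, upper = outlier_bounds(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample with one suspiciously large value.
sample = [3, 5, 7, 8, 9, 11, 12, 13, 40]
summary = five_number_summary(sample)
outliers = find_outliers(sample)

print("Five-number summary:", summary)
print("Outliers:", outliers)
```

For this sample, Q1 = 7 and Q3 = 12, so the IQR is 5 and the bounds are −0.5 and 19.5; only the value 40 falls outside them.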
Each of the characteristics described finds useful applications in statistical analysis. More specifically, the IQR helps assess the dispersion of the data around the median: the higher the IQR, the more spread out the middle half of the set. Determining outliers makes it possible to assess how good the initial sample was and whether all factors have been considered. If outliers are found in the sample, they may point to incorrect data, but they may also be genuine unusual observations of interest. Finally, constructing a boxplot makes it possible to view the entire data set at a glance, examine its symmetry, and, if there are several boxplots, compare the sets with each other.
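The comparison a pair of boxplots offers can also be approximated numerically. The sketch below, which is only an illustration with made-up samples and hypothetical names, reports the IQR of each set together with the widths of the two halves of the box (median − Q1 and Q3 − median); roughly equal halves suggest a symmetric middle, while unequal halves suggest skew. The "inclusive" quartile method is again an assumption.

```python
from statistics import quantiles


def quartile_profile(values):
    """Summarize spread and box symmetry for a sample of at least two values."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "iqr": q3 - q1,                 # overall spread of the middle half
        "lower_half_box": median - q1,  # width of the box below the median
        "upper_half_box": q3 - median,  # width of the box above the median
    }


# Two hypothetical samples: one tightly clustered, one spread out and skewed.
tight = [10, 11, 12, 13, 14, 15, 16, 17, 18]
wide = [2, 6, 10, 11, 12, 20, 28, 36, 44]

tight_profile = quartile_profile(tight)
wide_profile = quartile_profile(wide)

print("tight:", tight_profile)
print("wide: ", wide_profile)
```

Here the first sample has an IQR of 4 with equal box halves, while the second has an IQR of 18 and a much wider upper half, which is the pattern a boxplot would show as a long box stretched above the median.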