The five-number summary is a set of descriptive statistics that gives a concise overview of a dataset’s distribution. It consists of five values: the minimum, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the maximum. The minimum is the smallest value in the dataset and the maximum is the largest. The median is the middle value of the ordered data, or the mean of the two middle values when the number of observations is even. Under the convention used here, the first quartile (Q1) is the median of the lower half of the ordered data and the third quartile (Q3) is the median of the upper half; when the number of observations is odd, the median itself is left out of both halves. Other quartile conventions exist and can give slightly different values for Q1 and Q3.
For example, consider the dataset: 3, 7, 8, 5, 12, 14, 21, 13, 18. After ordering, it becomes: 3, 5, 7, 8, 12, 13, 14, 18, 21. The minimum is 3 and the maximum is 21. With nine values, the median is the fifth value, 12. The lower half, excluding the median, is 3, 5, 7, 8, so Q1 = (5+7)/2 = 6. The upper half is 13, 14, 18, 21, so Q3 = (14+18)/2 = 16. Therefore, the five-number summary is: 3, 6, 12, 16, 21.
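The worked example above can be sketched in Python. This is a minimal illustration that assumes the median-of-halves convention described here; the function name `five_number_summary` is hypothetical, and statistical libraries often use other quartile conventions that may return slightly different values for Q1 and Q3.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")

    half = n // 2
    lower = data[:half]
    # For an odd count, skip the median itself when forming the upper half.
    upper = data[half + 1:] if n % 2 else data[half:]

    # With a single observation both halves are empty; fall back to that value.
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]


summary = five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18])
print(summary)  # (3, 6.0, 12, 16.0, 21)
```

The quartiles come back as floats because each is the average of two middle values, while the minimum, median, and maximum are taken directly from the data.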
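This summary offers several advantages in data analysis. It gives a quick view of a dataset's center (the median), its spread (the range and the interquartile range, Q3 − Q1), and its potential skewness (how unevenly the quartiles and extremes sit around the median). It is particularly useful for comparing datasets and for spotting possible outliers. The summary has its roots in exploratory data analysis, an approach popularized by John Tukey that emphasizes visualizing and understanding data before applying more complex statistical techniques. The median and quartiles are resistant to outliers, unlike the mean, which makes them robust for describing data with extreme values. The minimum and maximum are not resistant: they are the extreme values themselves, which is what allows the summary to expose unusual observations.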
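The robustness point can be checked with a small experiment. The sketch below takes the dataset from the worked example and, as a purely hypothetical change, replaces its largest value with 210. It uses `statistics.quantiles` with `method="exclusive"`, which happens to reproduce the median-of-halves quartiles for this dataset but is not the same convention in general.

```python
from statistics import mean, median, quantiles

data = [3, 5, 7, 8, 12, 13, 14, 18, 21]
# Hypothetical variant: the largest value is replaced by an extreme one.
with_outlier = data[:-1] + [210]

results = {}
for label, values in (("original", data), ("with outlier", with_outlier)):
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    results[label] = {
        "mean": round(mean(values), 2),
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
    }
    print(label, results[label])
```

The mean rises from about 11.22 to about 32.22, while the median and both quartiles stay at 12, 6, and 16. Only the maximum changes among the five summary values, and it changes in a way that flags the extreme observation.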