Measures of Central Tendency and Spread
The three main measures of central tendency tell you different stories about your data. The mean (arithmetic average) accounts for every value, so a few outliers can pull it away from the bulk of the data. The median (the middle value once the data is sorted) resists the influence of extreme values. The mode (most frequent value) shows what's most common in your dataset; a dataset can have more than one mode, or none if every value is unique.
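To see how differently the three measures react to an outlier, here is a minimal sketch using Python's standard `statistics` module. The dataset is made up for illustration; the value 40 plays the role of the outlier.

```python
from statistics import mean, median, mode

# Hypothetical data: six typical values plus one outlier (40).
data = [2, 3, 3, 4, 5, 6, 40]

data_mean = mean(data)      # sum is 63, divided by 7 values
data_median = median(data)  # middle value of the sorted list
data_mode = mode(data)      # most frequent value

print(data_mean)    # 9
print(data_median)  # 4
print(data_mode)    # 3
```

The single outlier lifts the mean to 9, well above most of the values, while the median and mode stay at 4 and 3.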
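When working with grouped data in a frequency distribution, you can't see the individual values, so calculating the mean requires using the midpoint of each class interval as a stand-in for every value in that class. Multiply each midpoint by its frequency, sum these products, and divide by the total frequency. The result is an estimate of the mean, not the exact mean of the raw data.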
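The grouped-mean steps can be sketched in a few lines of Python. The class intervals and frequencies below are invented for illustration, and the variable names are arbitrary.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [
    (0, 10, 4),
    (10, 20, 6),
    (20, 30, 8),
    (30, 40, 2),
]

# Total number of observations across all classes.
total_frequency = sum(freq for _, _, freq in classes)

# Midpoint of each class times its frequency, summed over all classes.
weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in classes)

grouped_mean = weighted_sum / total_frequency

print(total_frequency)  # 20
print(weighted_sum)     # 380.0
print(grouped_mean)     # 19.0
```

Here the midpoints are 5, 15, 25, and 35, so the products are 20, 90, 200, and 70, which sum to 380.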
Standard deviation measures how far your data values typically lie from the mean. A small standard deviation indicates values clustered near the mean, while a large one shows greater variability. Because it is expressed in the same units as the data, it gives you a direct sense of how consistent the values are.
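As a rough illustration, the sketch below compares two made-up datasets that share the same mean (5) but differ in spread. It assumes the population standard deviation (`pstdev`, dividing by n); the text above doesn't say which version is intended, so the sample version (`stdev`, dividing by n - 1) is shown as well.

```python
from statistics import pstdev, stdev

# Two hypothetical datasets, both with a mean of 5.
consistent = [4, 5, 5, 5, 5, 5, 5, 6]
variable = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: square root of the average squared deviation.
sd_consistent = pstdev(consistent)
sd_variable = pstdev(variable)

# Sample standard deviation divides by n - 1, so it comes out slightly larger.
sample_sd_variable = stdev(variable)

print(sd_consistent)  # 0.5
print(sd_variable)    # 2.0
print(sample_sd_variable > sd_variable)  # True
```

Use `stdev` when your data is a sample drawn from a larger population and `pstdev` when it is the whole population.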
Percentiles and quartiles divide sorted data into sections, helping you understand relative positions. The 33rd percentile, for example, is the value below which 33% of observations fall. Quartiles are the three cut points that split the data into four equal parts: Q1 (the 25th percentile), Q2 (the median, or 50th percentile), and Q3 (the 75th percentile). They are especially useful for creating box plots.
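Here is a small sketch of computing quartiles and a percentile with `statistics.quantiles`. The scores are invented, and the choice of `method="inclusive"` is an assumption: several interpolation conventions exist, and textbooks and software may give slightly different answers for the same data.

```python
from statistics import quantiles

# Hypothetical test scores.
scores = [55, 60, 62, 68, 70, 75, 80, 85, 95]

# n=4 returns the three quartile cut points.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

# n=100 returns 99 percentile cut points; index 32 is the 33rd percentile.
percentile_33 = quantiles(scores, n=100, method="inclusive")[32]

print(q1, q2, q3)  # 62.0 70.0 80.0
print(round(percentile_33, 2))  # 65.84
```

With this convention, about a third of the scores fall below 65.84, and Q2 matches the median of the list.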
Pro Tip: The five-number summary (minimum, Q1, median, Q3, maximum) gives you a compact picture of your data's center, spread, and range without requiring you to look at every value. It's the foundation for creating box plots.
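A five-number summary takes only a few lines to compute. The helper name `five_number_summary` and the sample scores are hypothetical, and the quartiles again assume the "inclusive" interpolation method, so other tools may report slightly different Q1 and Q3 values.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))

# Hypothetical test scores.
scores = [55, 60, 62, 68, 70, 75, 80, 85, 95]

summary = five_number_summary(scores)
print(summary)  # (55, 62.0, 70.0, 80.0, 95)
```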