Data Distribution and Dispersion
Data distributions can be symmetric or asymmetric. In a symmetric distribution with a single peak, the mode, median, and mean are all equal. In asymmetric (skewed) distributions, these three measures generally differ, and the way they differ tells us about the shape of our data.
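To make the comparison of mean, median, and mode concrete, here is a minimal sketch using Python's standard statistics module. The two samples and the helper name summarize are made up for illustration and do not come from the text above.

```python
from statistics import mean, median, mode

# Hypothetical samples, chosen only to illustrate the idea.
symmetric = [1, 2, 3, 3, 3, 4, 5]       # mirror image around 3
right_skewed = [1, 2, 2, 2, 3, 4, 14]   # one large value pulls the mean up

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(values), median(values), mode(values)

print("symmetric:   ", summarize(symmetric))
print("right-skewed:", summarize(right_skewed))
```

For the symmetric sample all three measures come out as 3. For the skewed sample the mean is 4 while the median and mode stay at 2, which hints at a long tail toward larger values.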
Measuring dispersion helps us understand how spread out our data is. The simplest measure is the range, which is just the difference between the maximum and minimum values. However, this only uses two data points and ignores everything in between.
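The limitation of the range can be seen in a short sketch. The function name value_range and both data sets are hypothetical, picked so that the extremes match while the values in between do not.

```python
def value_range(values):
    """Range: difference between the largest and smallest value."""
    return max(values) - min(values)

# Hypothetical data sets with identical extremes but different interiors.
clustered = [0, 5, 5, 5, 10]
spread_out = [0, 1, 5, 9, 10]

print(value_range(clustered))   # 10
print(value_range(spread_out))  # 10
```

Both sets report the same range even though the first is tightly packed around 5 and the second is not, because only the minimum and maximum enter the calculation.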
The standard deviation gives us a much better picture of dispersion by measuring how far values typically stray from the mean. A small standard deviation means data points cluster closely around the mean, while a large one indicates widely scattered values.
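One way to compute a standard deviation by hand is sketched below. It assumes the population form (dividing by n); the sample form divides by n - 1 instead, and the text above does not say which one is intended. The function name and the two small data sets are illustrative only.

```python
from math import sqrt

def population_std_dev(values):
    """Square root of the mean squared deviation from the mean.

    Assumption: population formula (divide by n, not n - 1).
    """
    n = len(values)
    mu = sum(values) / n
    squared_deviations = [(x - mu) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / n)

# Hypothetical data sets that share a mean of 10.
tight = [9, 10, 10, 11]
loose = [0, 5, 15, 20]

print(round(population_std_dev(tight), 2))
print(round(population_std_dev(loose), 2))
```

The tightly clustered set yields a much smaller result than the scattered one, even though both have the same mean.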
📏 Think of standard deviation as roughly the "average distance from average." The smaller this distance, the more consistent your data!
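For example, consider two sets with the same mean (10): Set A {8, 5, 7, 6, 35, 5, 4} and Set B {11, 8, 10, 9, 17, 8, 7}. Set A's values run from 4 to 35 (a range of 31), while Set B's run only from 7 to 17 (a range of 10). Set B therefore has less dispersion: its values are more consistent, which would suggest more repeatable measurements if both sets were readings of the same quantity. Note that most of Set A's spread comes from the single value 35.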
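The comparison of Set A and Set B can be checked with a few lines of Python. This sketch uses pstdev from the standard statistics module, which assumes the values are a whole population; statistics.stdev would give the sample version instead. The variable names are illustrative.

```python
from statistics import mean, pstdev

set_a = [8, 5, 7, 6, 35, 5, 4]
set_b = [11, 8, 10, 9, 17, 8, 7]

for name, data in (("A", set_a), ("B", set_b)):
    data_range = max(data) - min(data)
    # pstdev: population standard deviation (an assumption, see above)
    print(f"Set {name}: mean={mean(data)}, range={data_range}, "
          f"std dev={pstdev(data):.2f}")
```

Both means are 10, but Set A has a range of 31 and a population standard deviation of about 10.28, while Set B has a range of 10 and a standard deviation of about 3.12.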