The median is the middle value of a dataset when values are arranged in ascending or descending order; it divides the distribution into two equal halves. For an even number of values, it is the average of the two central values. The median is a robust measure of central tendency, unaffected by extreme outliers, making it preferable to the mean for skewed distributions such as income or house prices.
Problem
Find the median of the following 7 exam scores: 72, 45, 88, 91, 63, 55, 80.
Solution
Step 1: Sort in ascending order: 45, 55, 63, 72, 80, 88, 91. Step 2: n = 7 (odd), so the median is the (7+1)/2 = 4th value. Step 3: The 4th value in the sorted list is 72.
Answer
Median = 72
| Dataset Size | Method | Example Dataset | Median |
|---|---|---|---|
| Odd (n=5) | Middle value: position (n+1)/2 | 3, 7, 9, 12, 15 | 9 (3rd value) |
| Even (n=6) | Average of two middle values | 3, 7, 9, 12, 15, 20 | (9+12)/2 = 10.5 |
| Odd (n=7) | Middle value: position 4 | 1, 2, 4, 7, 9, 10, 14 | 7 (4th value) |
| Even (n=4) | Average of 2nd and 3rd | 5, 10, 15, 20 | (10+15)/2 = 12.5 |
Wikimedia Commons, CC BY-SA
The mean (arithmetic mean) is the sum of all values in a dataset divided by the number of values, and represents the central or typical value. It is the most commonly used measure of central tendency and is sensitive to extreme values (outliers). The mean is used extensively in data analysis, quality control, and as the foundation for more advanced statistical measures such as variance and standard deviation.
The mode is the value that appears most frequently in a dataset, making it the only measure of central tendency applicable to nominal (categorical) data. A dataset can be unimodal (one mode), bimodal (two modes), multimodal (many modes), or have no mode if all values are equally frequent. The mode is particularly useful in market research, fashion, and any context where the most common category or value is of interest.
Variance measures the average squared deviation of a random variable from its mean, quantifying how spread out the values in a distribution are. A low variance indicates values cluster tightly around the mean; a high variance indicates they are widely dispersed. Variance is the square of the standard deviation and is fundamental to ANOVA, regression analysis, and portfolio theory.
From Latin medianus (of the middle), derived from medius (middle). The statistical use of "median" was popularised by Francis Galton in the 1880s as he sought a measure of the "typical" value robust to extreme observations.