Correlation Coefficient

Also known as: Pearson's r, Pearson correlation, Product-moment correlation coefficient

The Pearson correlation coefficient (r) is a dimensionless statistic that measures the strength and direction of the linear relationship between two continuous variables, ranging from −1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear association. It is calculated as the covariance of the two variables divided by the product of their standard deviations. While correlation quantifies association, it does not imply causation — a fundamental principle in statistical reasoning.
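The definition above (covariance divided by the product of the standard deviations) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pearson_from_cov` and the sample data are assumptions made for the example, and the population (1/n) forms are used for both covariance and standard deviation, since the normalising factor cancels in the ratio. The second data set shows why r = 0 only rules out a linear association: y depends exactly on x, yet r is zero.

```python
from statistics import mean, pstdev


def pearson_from_cov(xs, ys):
    """Pearson's r as covariance / (std_x * std_y), using population (1/n) forms."""
    mx, my = mean(xs), mean(ys)
    # Population covariance: mean of the products of deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))


# Hypothetical data chosen for illustration only.
xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # exact straight line, so r should be +1
parabola = [x * x for x in xs]     # exact dependence, but not linear

r_linear = pearson_from_cov(xs, linear)
r_parabola = pearson_from_cov(xs, parabola)
print(round(r_linear, 6), round(r_parabola, 6))
```

The parabola case is the usual caution: a coefficient near zero says nothing about non-linear relationships, so a scatter plot is worth checking alongside r.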

Key Formula

r = Σ[(xᵢ−x̄)(yᵢ−ȳ)] / √[Σ(xᵢ−x̄)² × Σ(yᵢ−ȳ)²]

LaTeX: r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2}}

Symbol | Meaning | Unit
r | Pearson correlation coefficient | dimensionless
x_i, y_i | Individual data point values | same as data
\bar{x}, \bar{y} | Sample means of x and y | same as data
n | Number of data pairs | count
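The sum form of the key formula translates almost term by term into code. The sketch below is one possible rendering, not a canonical implementation; the function name `pearson_r`, the intermediate names `s_xy`, `s_xx`, `s_yy`, and the input checks are assumptions added for the example. The checks reflect two facts implied by the formula: the data must come in pairs, and r is undefined when either variable has zero variance, because the denominator is then zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the sum-of-deviations formula."""
    if len(xs) != len(ys):
        raise ValueError("x and y must contain the same number of values")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two data pairs are required")

    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator: sum of products of deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)

    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


print(pearson_r([1, 2, 3], [2, 4, 6]))   # 1.0
print(pearson_r([1, 2, 3], [3, 2, 1]))   # -1.0
```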

Worked Example

Problem

Two variables: x = [1, 2, 3, 4, 5], y = [2, 4, 5, 4, 5]. Calculate the Pearson correlation coefficient.

Solution

Step 1: Means: x̄ = 3, ȳ = 4.
Step 2: Deviations: (xᵢ−x̄) = −2, −1, 0, 1, 2; (yᵢ−ȳ) = −2, 0, 1, 0, 1.
Step 3: Products of deviations: 4, 0, 0, 0, 2 → Σ = 6.
Step 4: Sums of squares: Σ(xᵢ−x̄)² = 4+1+0+1+4 = 10; Σ(yᵢ−ȳ)² = 4+0+1+0+1 = 6.
Step 5: r = 6 / √(10 × 6) = 6 / √60 = 6 / 7.746 ≈ 0.775.

Answer

r ≈ 0.775, a strong positive linear correlation (|r| ≥ 0.70 on the interpretation scale below)
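The five steps of the worked example can be checked mechanically. The following sketch mirrors them one by one using the data from the problem statement; the variable names (`dx`, `dy`, `s_xy`, `s_xx`, `s_yy`) are chosen for this illustration and do not come from the source.

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Step 1: sample means.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Step 2: deviations from the means.
dx = [xi - x_bar for xi in x]
dy = [yi - y_bar for yi in y]

# Step 3: sum of the products of paired deviations.
s_xy = sum(a * b for a, b in zip(dx, dy))

# Step 4: sums of squared deviations.
s_xx = sum(a * a for a in dx)
s_yy = sum(b * b for b in dy)

# Step 5: the correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

print(x_bar, y_bar)        # 3.0 4.0
print(s_xy, s_xx, s_yy)    # 6.0 10.0 6.0
print(round(r, 3))         # 0.775
```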

Interpretation of Pearson Correlation Coefficient Values

r Value Range | Strength | Direction | Example
−1.00 | Perfect | Negative | Exact inverse relationship
−0.99 to −0.70 | Strong | Negative | Study time vs errors
−0.69 to −0.30 | Moderate | Negative | Stress vs sleep
−0.29 to 0.29 | Weak/None | n/a | Shoe size vs IQ
0.30 to 0.69 | Moderate | Positive | Height vs weight
0.70 to 0.99 | Strong | Positive | Temperature vs ice cream sales
1.00 | Perfect | Positive | Exact direct relationship
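The interpretation bands can be expressed as a small lookup. This is a sketch under stated assumptions: the function name `describe_r` is hypothetical, the cutoffs 0.30 and 0.70 are applied to |r| as continuous thresholds (the table lists two-decimal ranges, so a value such as 0.295 falls between rows and is treated here as weak), and the weak band returns "n/a" for direction because the table gives none. Cutoffs like these are conventions that vary by field, not fixed rules.

```python
import math


def describe_r(r):
    """Map r to a (strength, direction) pair using the table's cutoffs."""
    # This comparison also rejects NaN, since every comparison with NaN is False.
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    if math.isclose(size, 1.0):
        strength = "Perfect"
    elif size >= 0.70:
        strength = "Strong"
    elif size >= 0.30:
        strength = "Moderate"
    else:
        # The table assigns no direction to the weak band.
        return ("Weak/None", "n/a")

    direction = "Positive" if r > 0 else "Negative"
    return (strength, direction)


print(describe_r(0.775))   # ('Strong', 'Positive')
print(describe_r(-0.45))   # ('Moderate', 'Negative')
print(describe_r(0.1))     # ('Weak/None', 'n/a')
```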

Interactive Tools

Desmos — Correlation

Plot data, compute correlation, and visualise scatter plots interactively

Khan Academy — Correlation

Lessons on interpreting correlation and avoiding causation fallacies

Wolfram Alpha

Compute Pearson and Spearman correlations from data sets

Figure: Grid of scatter plots illustrating various correlation coefficient values (Wikimedia Commons, CC BY-SA)

Etymology

The term "correlation" comes from the Latin "correlatio" (mutual relation), popularised by Francis Galton in 1888. Karl Pearson formalised the product-moment correlation coefficient formula in 1895, hence the eponym "Pearson's r".

Tags: statistics, bivariate-analysis, association, linear-relationship, probability