
Hypothesis Testing

Also known as: Significance testing, Statistical hypothesis testing, Null hypothesis testing

Hypothesis testing is a formal statistical procedure for making decisions about a population parameter from sample data by weighing the evidence against a null hypothesis (H₀) in favour of an alternative hypothesis (H₁). A test statistic is computed from the sample and either compared with a critical value or converted to a p-value; if the result is statistically significant at the chosen level (p < α), the null hypothesis is rejected. It underpins scientific research, clinical trials, quality assurance, and data-driven decision-making across quantitative disciplines.

Worked Example

Problem

A manufacturer claims its bottles contain 500 mL on average. A quality-control inspector measures a sample of 36 bottles and finds a sample mean of x̄ = 497 mL; the population standard deviation is taken as known, σ = 9 mL. Test at α = 0.05 (two-tailed) whether the mean fill differs from 500 mL.

Solution

Step 1: State the hypotheses. H₀: μ = 500; H₁: μ ≠ 500; α = 0.05.

Step 2: Compute the test statistic. Z = (497 − 500) / (9 / √36) = −3 / 1.5 = −2.00.

Step 3: Find the critical values. ±Z₀.₀₂₅ = ±1.96.

Step 4: Compare. |−2.00| = 2.00 > 1.96 → reject H₀.

Answer

Z = −2.00; reject H₀ — evidence that the mean fill differs from 500 mL
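The four steps of the worked example can be reproduced with a short Python sketch that uses only the standard library. The function name `one_sample_z_test` and its parameters are illustrative assumptions, not part of the original entry, and the sketch assumes the population standard deviation is known, which is what justifies a Z-test here.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample Z-test, assuming sigma is the known population SD."""
    std_error = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / std_error              # Step 2: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # Step 3: upper critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-tailed p-value
    reject = abs(z) > z_crit                         # Step 4: decision
    return z, z_crit, p_value, reject

z, z_crit, p_value, reject = one_sample_z_test(497, 500, 9, 36)
print(f"Z = {z:.2f}, critical = ±{z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
# prints: Z = -2.00, critical = ±1.96, p = 0.0455, reject H0: True
```

The critical-value comparison and the p-value comparison lead to the same decision: |Z| exceeds 1.96 exactly when the two-tailed p-value falls below 0.05.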

Key Components and Decision Rules in Hypothesis Testing

Component	Symbol/Term	Description	Decision Rule
Null hypothesis	H₀	Default claim, assumed true	Reject if p < α
Alternative hypothesis	H₁ or Hₐ	Competing claim the test seeks evidence for	Supported when H₀ is rejected
Significance level	α	Threshold probability (e.g., 0.05)	Chosen before the test
Test statistic	Z, t, χ²	Standardised sample measure	Compared to critical value
p-value	p	Probability of a result at least as extreme as the one observed, if H₀ is true	Reject H₀ if p < α
Type I error	α	False positive (rejecting a true H₀), which occurs with probability α	Controlled by setting α
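The last row of the table says that α controls the Type I error rate. One way to see this is a small simulation in which H₀ is true by construction, so every rejection is a false positive. This is a minimal sketch under assumed settings: the function name `type_i_error_rate`, the number of trials, and the seed are hypothetical choices, and the population values reuse the bottle-filling figures only for concreteness.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=500.0, sigma=9.0, n=36, alpha=0.05, trials=5000, seed=0):
    """Estimate how often a two-tailed Z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)                        # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose true mean equals mu0 (H0 holds).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1                          # a false positive
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

With enough trials the observed rate should settle near the chosen α, here 0.05, apart from simulation noise.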

Interactive Tools

Khan Academy — Hypothesis Testing

Step-by-step video lessons on one-sample significance tests

Brilliant.org — Statistics

Interactive problems covering hypothesis testing and statistical inference

Wolfram Alpha

Automate hypothesis test computations with Wolfram Alpha

Related Terms

Mathematics

p-value

The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. A small p-value (typically p < 0.05) indicates that the observed data would be unlikely under H₀, providing evidence to reject it; it does not measure the probability that the null hypothesis is true. Correct interpretation of p-values is essential to avoid common statistical fallacies in research and data analysis.
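What counts as "at least as extreme" depends on the alternative hypothesis. The sketch below computes the p-value of a Z statistic for a two-sided, lower-tailed, or upper-tailed alternative. The function name `z_p_value` and the labels "two-sided", "less", and "greater" are assumptions made for illustration.

```python
from statistics import NormalDist

def z_p_value(z, alternative="two-sided"):
    """p-value of an observed Z statistic under a standard normal null distribution."""
    phi = NormalDist().cdf                 # standard normal CDF
    if alternative == "two-sided":
        return 2 * (1 - phi(abs(z)))       # both tails beyond |z|
    if alternative == "less":
        return phi(z)                      # lower tail only
    if alternative == "greater":
        return 1 - phi(z)                  # upper tail only
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("two-sided", "less", "greater"):
    print(f"{alt:>9}: p = {z_p_value(-2.0, alt):.4f}")
```

For Z = −2.00 the two-sided p-value is about 0.0455 and the lower-tailed p-value is about half of that. Neither number is the probability that H₀ is true.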

Mathematics

t-Distribution

The t-distribution (Student's t-distribution) is a continuous probability distribution that arises when estimating the mean of a normally distributed population from a small sample with an unknown population standard deviation. It has heavier tails than the normal distribution, reflecting greater uncertainty; as the degrees of freedom increase toward infinity, it converges to the standard normal distribution. It is the foundation of t-tests and is central to small-sample statistical inference.
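The heavier tails and the convergence to the normal distribution can be checked numerically. The sketch below evaluates the standard t density formula at x = 3 for several degrees of freedom and compares it with the standard normal density at the same point. The helper name `t_pdf` and the chosen evaluation point are illustrative assumptions. Log-gamma is used so that large degrees of freedom do not overflow.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # Log of the normalising constant: Gamma((df+1)/2) / (sqrt(df*pi) * Gamma(df/2))
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

normal_pdf_at_3 = NormalDist().pdf(3.0)
for df in (2, 5, 30, 1000):
    print(f"df = {df:>4}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"standard normal density at 3 = {normal_pdf_at_3:.5f}")
```

The tail density shrinks toward the normal value as the degrees of freedom grow, which is why t-based critical values approach 1.96 for large samples.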

Mathematics

Confidence Interval

A confidence interval (CI) is a range of plausible values for an unknown population parameter, constructed from sample data so that the procedure captures the true parameter with a specified probability (the confidence level, e.g., 95%). Crucially, the confidence level refers to the long-run success rate of the procedure — not the probability that a particular interval contains the parameter. Confidence intervals are used throughout science, medicine, and engineering to quantify estimation uncertainty.
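As a concrete illustration, the sketch below builds a 95% interval for a mean from the bottle-filling figures used in the worked example (x̄ = 497 mL, σ = 9 mL, n = 36), again assuming σ is the known population value. The function name `z_confidence_interval` is a hypothetical choice.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Z-based confidence interval for a mean, assuming a known population SD."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    margin = z_crit * sigma / sqrt(n)                     # margin of error
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(497, 9, 36)
print(f"95% CI: ({low:.2f}, {high:.2f}) mL")
# prints: 95% CI: (494.06, 499.94) mL
```

The interval excludes 500 mL, which matches the two-tailed test at α = 0.05 rejecting H₀: μ = 500. The two procedures agree in this way whenever they use the same standard error and critical value.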

The formal framework of hypothesis testing was developed independently by Ronald Fisher (significance testing, 1925) and by Jerzy Neyman and Egon Pearson (decision-theoretic approach, 1933). The term "null hypothesis" was coined by Fisher, from Latin "nullus" (none/nothing), denoting the hypothesis of no effect.

statistics, inference, scientific-method, probability, decision-making