Computer ScienceAlgorithms & Data StructuresAdvanced

Hash Table

Also known as:Hash MapDictionaryAssociative ArraySymbol Table

A hash table (also called a hash map) is a data structure that implements an associative array by using a hash function to map keys to array indices (buckets), enabling average-case O(1) time for insertion, deletion, and lookup operations. When multiple keys hash to the same index (a collision), the table resolves it via chaining (each bucket holds a linked list of entries) or open addressing (probing for an alternate empty slot). Hash tables underlie Python dictionaries, Java HashMaps, database indexing, caches, and symbol tables in compilers, making them one of the most widely used data structures in software engineering.

Key Formula

h(k) = k mod m

LaTeX: h(k) = k \bmod m

Symbol	Meaning	Unit
h(k)	Hash index (bucket) for key k	dimensionless
k	Key (or its integer representation)	dimensionless
m	Number of buckets in the hash table (ideally prime)	dimensionless

Worked Example

Problem

Insert keys {15, 27, 43, 36, 10} into a hash table with m = 7 buckets using h(k) = k mod 7. Use chaining for collision resolution. What is the load factor?

Solution

Step 1: Compute hash values: h(15) = 15 mod 7 = 1; h(27) = 27 mod 7 = 6; h(43) = 43 mod 7 = 1 (collision with 15); h(36) = 36 mod 7 = 1 (collision); h(10) = 10 mod 7 = 3. Step 2: Build table: Bucket 0: empty; Bucket 1: [15 → 43 → 36]; Bucket 2: empty; Bucket 3: [10]; Bucket 4: empty; Bucket 5: empty; Bucket 6: [27]. Step 3: Load factor α = n/m = 5/7 ≈ 0.714. Step 4: Average lookup in non-empty bucket = 1 + α/2 ≈ 1.357 comparisons.

Answer

Load factor α ≈ 0.714; Bucket 1 holds chain [15→43→36]; average lookup ≈ 1.36 comparisons

Hash Table Collision Resolution Strategies Compared

Strategy	Method	Load Factor Limit	Cache Performance	Worst-Case Lookup
Chaining	Linked list per bucket	Any (α can exceed 1)	Poor (pointer chasing)	O(n) if all keys collide
Linear Probing	Probe next slot	α < 1 (≤ 0.7 recommended)	Excellent (contiguous)	O(n) clustering
Quadratic Probing	Probe slot +1², +2², ...	α < 0.5 for full coverage	Good	O(n) secondary clustering
Double Hashing	Second hash for step size	α < 1	Moderate	O(n) theoretical
Robin Hood Hashing	Displace if current has lower psl	α < 0.9 practical	Good	O(log n) expected

Interactive Tools

VisuAlgo — Hash Table Visualiser

Open Tool

Brilliant — Hash Tables

Open Tool

WolframAlpha — Modular Arithmetic

Open Tool

Wikimedia Commons, CC BY-SA

Related Terms

Computer Science

Array (data structure)

An array is a fundamental data structure that stores elements of the same type in contiguous memory locations, enabling O(1) random access via an integer index computed as base_address + index × element_size. Arrays are the basis for many higher-level structures (stacks, queues, heaps, hash tables) and are cache-efficient due to spatial locality. In advanced usage, multi-dimensional arrays, dynamic arrays (vectors), and circular arrays extend the basic model to support resizing, matrix operations, and efficient queue implementations.

Computer Science

Linked List

A linked list is a linear data structure in which elements (nodes) are stored at arbitrary memory locations, with each node containing a data field and one or more pointers linking it to adjacent nodes. Unlike arrays, linked lists do not require contiguous memory, enabling O(1) insertion and deletion at known positions without shifting elements, but they sacrifice O(1) random access (requiring O(n) traversal) and incur pointer overhead per node. Variants include singly linked lists, doubly linked lists, circular linked lists, and skip lists, each trading memory for specific operation efficiency.

Computer Science

Binary Search Tree

A binary search tree (BST) is a binary tree in which every node satisfies the property that all keys in its left subtree are less than the node's key and all keys in its right subtree are greater. This ordering property makes search, insertion, and deletion operations efficient, averaging O(log n) for balanced trees. BSTs are foundational in database indexing, symbol tables, and dynamic set operations.

The term "hash" in computing derives from the culinary meaning "to chop and mix," metaphorically describing how a hash function mixes the bits of a key to produce an index. The concept of hash tables was independently discovered by several researchers around 1953–1956, including Hans Peter Luhn at IBM who described it in a 1953 internal memo. The term "hashing" was popularised by Arnold Dumey in 1956 and formalized by Robert Morris in 1968.

data-structureshash-tablekey-valuecollision-resolutionconstant-timeindexing