Memory Hierarchy (computer)

Also known as: Storage hierarchy, Memory pyramid

The memory hierarchy in computer systems is a structured pyramid of storage levels organized by speed, cost, and capacity, where faster and more expensive memory (registers, cache) sits close to the processor and slower, cheaper, larger storage (RAM, SSD, HDD) resides farther away. The hierarchy exploits the principle of locality — programs tend to reuse recently accessed data (temporal locality) and access nearby memory addresses (spatial locality) — to make the average memory access time approach that of the fastest level. Effective hierarchy design is critical to bridging the speed gap between the processor and main memory.
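The claim that locality pulls the average access time toward that of the fastest level can be checked with a small calculation. The sketch below applies the usual recursive average memory access time (AMAT) relation, in which each level costs its hit time plus its miss rate times the cost of going one level down. The function name, the hit rates, and the latencies are illustrative assumptions, not measurements of any real processor.

```python
def average_access_time(levels, backing_latency_ns):
    """Expected time of one memory access (AMAT), in nanoseconds.

    levels: list of (hit_time_ns, hit_rate) tuples, fastest level first.
    backing_latency_ns: latency of the final level, which always hits.
    """
    amat = backing_latency_ns
    # Work upward from the slowest cache level to the fastest one:
    # AMAT(level) = hit_time + miss_rate * AMAT(next level down).
    for hit_time_ns, hit_rate in reversed(levels):
        amat = hit_time_ns + (1.0 - hit_rate) * amat
    return amat


# Hypothetical three-level cache in front of DRAM (all values assumed).
levels = [
    (1.0, 0.95),   # L1: 1 ns hit time, 95% hit rate
    (5.0, 0.80),   # L2: 5 ns hit time, 80% of L1 misses hit here
    (20.0, 0.50),  # L3: 20 ns hit time, 50% of L2 misses hit here
]
amat = average_access_time(levels, backing_latency_ns=80.0)
print(f"{amat:.2f} ns")  # about 1.85 ns, versus 80 ns for DRAM alone
```

With these assumed hit rates the average access costs under 2 ns even though main memory is 80 times slower than L1, which is the effect the hierarchy is designed to produce.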

Memory Hierarchy Levels: Speed, Capacity, and Cost

Level | Type | Typical Size | Access Time | Cost per GB
L0 | CPU Registers | < 1 KB | < 1 ns | Extremely high
L1 | L1 Cache (on-chip) | 32–512 KB | 1–4 ns | Very high
L2 | L2 Cache (on/near chip) | 256 KB–4 MB | 4–12 ns | High
L3 | L3 Cache (shared) | 4–64 MB | 10–40 ns | Moderate-high
L4 | Main Memory (DRAM) | 4–512 GB | 50–100 ns | ~$3–8/GB
L5 | SSD / NVMe Storage | 256 GB–4 TB | 50–200 µs | ~$0.08–0.20/GB
L6 | HDD / Tape Archive | TB–PB | 5–20 ms (HDD; tape is far slower) | ~$0.02–0.05/GB

Figure: Pyramid diagram of the computer memory hierarchy, from registers at the top to tape storage at the base. Source: Wikimedia Commons, CC BY-SA.

Related Terms

Engineering

Cache Memory

Cache memory is a small, high-speed memory layer placed between the processor and main memory (RAM) that stores copies of frequently accessed data and instructions to reduce average memory access latency. Modern processors use a multi-level cache hierarchy (L1, L2, L3), each level larger and slower than the previous, organized around the principles of temporal locality (recently used data will likely be reused) and spatial locality (nearby data will likely be accessed soon). Cache performance is measured by the hit rate — the fraction of memory requests satisfied by the cache — and miss penalty — the extra time needed to fetch data from a lower level.
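To make hit rate and the two kinds of locality concrete, here is a minimal sketch of a direct-mapped cache simulator. It is a simplification chosen for brevity: real caches are usually set-associative and have replacement policies, and the function name, the 64-line size, and the 64-byte line length are assumptions made for this illustration.

```python
def simulate_direct_mapped(addresses, num_lines=64, line_size=64):
    """Return the hit rate of a direct-mapped cache for a list of byte addresses."""
    tags = [None] * num_lines  # tag held by each cache line; None means empty
    hits = 0
    for addr in addresses:
        block = addr // line_size   # memory block containing this byte
        index = block % num_lines   # the one cache line this block may occupy
        tag = block // num_lines    # distinguishes blocks that share that line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag       # miss: load the block, evicting the old one
    return hits / len(addresses)


# Spatial locality: consecutive 4-byte reads miss once per 64-byte line.
sequential = [4 * i for i in range(4096)]
# No locality: a 64-byte stride touches a new block on every access.
strided = [64 * i for i in range(4096)]
# Temporal locality: a 16-block working set read 10 times over.
repeated = [64 * i for i in range(16)] * 10

print(simulate_direct_mapped(sequential))  # 0.9375 (15 hits per 16 reads)
print(simulate_direct_mapped(strided))     # 0.0
print(simulate_direct_mapped(repeated))    # 0.9 (only the first pass misses)
```

Each miss in such a model would incur the miss penalty described above, so the average access cost is roughly the hit time plus the miss rate times the miss penalty.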

Engineering

Microprocessor Architecture

Microprocessor architecture describes the internal organization and design of a microprocessor, including the arrangement of its arithmetic logic unit (ALU), control unit, registers, cache, buses, and instruction set, which collectively determine how the processor fetches, decodes, and executes instructions. Architectures are broadly classified as RISC (Reduced Instruction Set Computer) or CISC (Complex Instruction Set Computer), each with distinct trade-offs in instruction complexity, pipeline depth, and energy efficiency. Modern processors incorporate multiple cores, branch prediction, out-of-order execution, and deep cache hierarchies to maximize performance.

Engineering

Computer Pipeline

A computer pipeline is a hardware technique that overlaps the execution of multiple instructions by dividing instruction processing into discrete sequential stages — typically fetch, decode, execute, memory access, and write-back — so that each stage operates on a different instruction simultaneously, analogous to an assembly line. Pipelining increases instruction throughput (instructions completed per second) without reducing the time to complete a single instruction (latency), ideally executing one instruction per clock cycle at steady state. Pipeline performance is limited by hazards: structural hazards (resource conflicts), data hazards (dependency between instructions), and control hazards (branches altering instruction flow).
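The throughput claim can be illustrated with the standard idealized cycle count: on a k-stage pipeline the first instruction needs k cycles and each later one completes one cycle after its predecessor, with any hazard stalls added on top. The sketch below assumes one cycle per stage and an unpipelined baseline that spends k cycles on every instruction. The function names and the stall count are hypothetical.

```python
def pipeline_cycles(num_instructions, num_stages=5, stall_cycles=0):
    """Cycles needed to run the instructions on an ideal in-order pipeline."""
    if num_instructions == 0:
        return 0
    # Fill the pipeline once (num_stages cycles for the first instruction),
    # then retire one instruction per cycle, plus any cycles lost to hazards.
    return num_stages + (num_instructions - 1) + stall_cycles


def speedup_over_unpipelined(num_instructions, num_stages=5, stall_cycles=0):
    """Ratio of unpipelined cycles to pipelined cycles."""
    unpipelined = num_instructions * num_stages
    return unpipelined / pipeline_cycles(num_instructions, num_stages, stall_cycles)


print(pipeline_cycles(1000))                             # 1004
print(speedup_over_unpipelined(1000))                    # just under 5
print(speedup_over_unpipelined(1000, stall_cycles=200))  # lower, due to stalls
```

A single instruction still takes all five cycles, so latency is unchanged. The gain comes only from overlap, and every stall cycle caused by a hazard reduces it.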

Etymology

"Hierarchy" comes from Latin "hierarchia" (rank, order), from Greek "hierarkhia" (rule of a high priest), and was later extended to any ordered system of levels. "Memory" comes from Latin "memoria" (recollection). The concept of a storage hierarchy was formally described in the 1940s and 1950s, as engineers sought to balance speed and cost in early computer designs.

Tags: memory-hierarchy, cache, dram, computer-architecture, storage, locality