The memory hierarchy in computer systems is a structured pyramid of storage levels organized by speed, cost, and capacity, where faster and more expensive memory (registers, cache) sits close to the processor and slower, cheaper, larger storage (RAM, SSD, HDD) resides farther away. The hierarchy exploits the principle of locality — programs tend to reuse recently accessed data (temporal locality) and access nearby memory addresses (spatial locality) — to make the average memory access time approach that of the fastest level. Effective hierarchy design is critical to bridging the speed gap between the processor and main memory.
| Level | Type | Typical Size | Access Time | Cost per GB |
|---|---|---|---|---|
| L0 | CPU Registers | < 1 KB | < 1 ns | Extremely high |
| L1 | L1 Cache (on-chip) | 32–512 KB | 1–4 ns | Very high |
| L2 | L2 Cache (on/near chip) | 256 KB–4 MB | 4–12 ns | High |
| L3 | L3 Cache (shared) | 4–64 MB | 10–40 ns | Moderate-high |
| L4 | Main Memory (DRAM) | 4–512 GB | 50–100 ns | ~$3–8/GB |
| L5 | SSD / NVMe Storage | 256 GB–4 TB | 50–200 µs | ~$0.08–0.20/GB |
| L6 | HDD / Tape Archive | TB–PB | 5–20 ms (HDD); seconds to minutes (tape) | ~$0.02–0.05/GB |
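
The claim that the average access time approaches that of the fastest level can be checked with a little arithmetic. The sketch below is illustrative only: the latencies are rough midpoints of the ranges in the table, the hit rates are hypothetical values chosen for the example, and the model assumes each level is probed only after the previous one misses. The names `LEVELS` and `average_access_time_ns` are made up for this example.

```python
# Each entry: (name, latency in ns, hit rate at that level).
# Latencies are rough midpoints of the table's ranges; hit rates are
# hypothetical and will differ widely between real workloads.
LEVELS = [
    ("L1 cache", 2.0, 0.95),
    ("L2 cache", 8.0, 0.80),
    ("L3 cache", 25.0, 0.70),
    ("DRAM", 75.0, 1.00),  # assumption: every remaining request hits in DRAM
]


def average_access_time_ns(levels):
    """Average latency when a level is probed only after the previous one misses."""
    total = 0.0
    reach_probability = 1.0  # fraction of requests that get as far as this level
    for _name, latency_ns, hit_rate in levels:
        total += reach_probability * latency_ns
        reach_probability *= 1.0 - hit_rate
    return total


if __name__ == "__main__":
    # With these assumed numbers the average stays close to the L1 latency,
    # far below the 75 ns DRAM latency.
    print(f"average access time: {average_access_time_ns(LEVELS):.3f} ns")
```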
Cache memory is a small, high-speed memory layer placed between the processor and main memory (RAM) that stores copies of frequently accessed data and instructions to reduce average memory access latency. Modern processors use a multi-level cache hierarchy (L1, L2, L3), each level larger and slower than the previous, organized around the principles of temporal locality (recently used data will likely be reused) and spatial locality (nearby data will likely be accessed soon). Cache performance is characterized by the hit rate (the fraction of memory requests satisfied by the cache), the hit time (the latency of an access that hits), and the miss penalty (the extra time needed to fetch data from the next level down). Together these give the average memory access time: AMAT = hit time + miss rate × miss penalty, where miss rate = 1 − hit rate.
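
To make hit rate and miss penalty concrete, here is a minimal sketch of a direct-mapped cache simulator. Direct mapping is only one possible organization and is chosen here for brevity. The cache geometry (64 lines of 64 bytes), the access patterns, and the latencies passed to `amat_ns` are all assumptions for illustration, as are the function names.

```python
def simulate_direct_mapped(addresses, num_lines=64, line_size=64):
    """Return the hit rate of a direct-mapped cache over a list of byte addresses."""
    tags = [None] * num_lines  # one stored tag per cache line; None means empty
    hits = 0
    for addr in addresses:
        block = addr // line_size   # which memory block the byte belongs to
        index = block % num_lines   # the single cache line that block may occupy
        tag = block // num_lines    # distinguishes blocks that share an index
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag       # miss: fetch the block, evicting the old one
    return hits / len(addresses)


def amat_ns(hit_time_ns, hit_rate, miss_penalty_ns):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + (1.0 - hit_rate) * miss_penalty_ns


# Consecutive 4-byte accesses: 16 accesses share each 64-byte line.
sequential = list(range(0, 4096, 4))
# One access per 64-byte line, never revisited: no spatial or temporal reuse.
strided = list(range(0, 1024 * 64, 64))

if __name__ == "__main__":
    for name, trace in (("sequential", sequential), ("strided", strided)):
        rate = simulate_direct_mapped(trace)
        # 2 ns hit time and 75 ns miss penalty are assumed example values.
        print(f"{name}: hit rate {rate:.4f}, AMAT {amat_ns(2.0, rate, 75.0):.2f} ns")
```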
Microprocessor architecture describes the internal organization and design of a microprocessor, including the arrangement of its arithmetic logic unit (ALU), control unit, registers, cache, buses, and instruction set, which collectively determine how the processor fetches, decodes, and executes instructions. Architectures are broadly classified as RISC (Reduced Instruction Set Computer) or CISC (Complex Instruction Set Computer), each with distinct trade-offs in instruction complexity, pipeline depth, and energy efficiency. Modern processors incorporate multiple cores, branch prediction, out-of-order execution, and deep cache hierarchies to maximize performance.
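
The fetch, decode, and execute cycle can be sketched as a toy interpreter. Everything here is hypothetical: the three-instruction set (`LOADI`, `ADD`, `HALT`), the four-register file, and the tuple encoding are invented for illustration and do not correspond to any real architecture.

```python
def run(program, num_registers=4):
    """Execute a toy program and return the final register values."""
    registers = [0] * num_registers
    pc = 0  # program counter: index of the next instruction to fetch
    while True:
        instruction = program[pc]        # fetch
        pc += 1
        opcode, *operands = instruction  # decode
        if opcode == "LOADI":            # execute: load an immediate value
            rd, imm = operands
            registers[rd] = imm
        elif opcode == "ADD":            # execute: ALU addition of two registers
            rd, rs1, rs2 = operands
            registers[rd] = registers[rs1] + registers[rs2]
        elif opcode == "HALT":
            return registers
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Hypothetical program: r0 = 5, r1 = 7, r2 = r0 + r1.
PROGRAM = [
    ("LOADI", 0, 5),
    ("LOADI", 1, 7),
    ("ADD", 2, 0, 1),
    ("HALT",),
]

if __name__ == "__main__":
    print(run(PROGRAM))
```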
A computer pipeline is a hardware technique that overlaps the execution of multiple instructions by dividing instruction processing into discrete sequential stages (typically fetch, decode, execute, memory access, and write-back) so that each stage operates on a different instruction at the same time, analogous to an assembly line. Pipelining increases instruction throughput (instructions completed per second) without reducing the time to complete a single instruction (latency); ideally, one instruction completes per clock cycle once the pipeline is full. Pipeline performance is limited by hazards: structural hazards (resource conflicts), data hazards (dependencies between instructions), and control hazards (branches altering instruction flow).
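
The throughput gain can be estimated with a simple cycle count. The sketch below assumes an idealized model in which every stage takes exactly one clock cycle, an unpipelined processor finishes one instruction before starting the next, and hazards are represented only as a total number of stall cycles. The function names and example numbers are illustrative.

```python
def unpipelined_cycles(num_instructions, num_stages):
    """Cycles when each instruction passes through all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions, num_stages, stall_cycles=0):
    """Cycles for an ideal pipeline: fill time, then one completion per cycle.

    stall_cycles is the total number of bubbles added by hazards (assumed input).
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


def speedup(num_instructions, num_stages, stall_cycles=0):
    """Ratio of unpipelined to pipelined cycle counts."""
    return unpipelined_cycles(num_instructions, num_stages) / pipelined_cycles(
        num_instructions, num_stages, stall_cycles
    )


if __name__ == "__main__":
    # 100 instructions on a 5-stage pipeline: the speedup approaches the
    # stage count as the instruction count grows, and stalls reduce it.
    print(pipelined_cycles(100, 5), f"{speedup(100, 5):.2f}")
    print(pipelined_cycles(100, 5, stall_cycles=20), f"{speedup(100, 5, 20):.2f}")
```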
From Latin "hierarchia" (rank, order), from Greek "hierarkhia" (rule of a high priest), later extended to any ordered system of levels. "Memory" from Latin "memoria" (recollection). The concept of a storage hierarchy was formally described in the 1940s and 1950s as engineers sought to balance speed and cost in early computer designs.