Cache memory is a small, high-speed memory layer placed between the processor and main memory (RAM) that stores copies of frequently accessed data and instructions to reduce average memory access latency. Modern processors use a multi-level cache hierarchy (L1, L2, L3) in which each level is larger and slower than the one before it. The hierarchy relies on two principles: temporal locality (recently used data will likely be reused) and spatial locality (data near a recently accessed address will likely be accessed soon). Cache performance is characterized by the hit rate (the fraction of memory requests satisfied by the cache) and the miss penalty (the extra time needed to fetch data from the next level down when a request misses).
T_avg = h × Tc + (1 − h) × Tm
| Symbol | Meaning | Unit |
|---|---|---|
| T_avg | Average memory access time | ns |
| h | Cache hit rate (fraction of accesses found in cache) | dimensionless |
| Tc | Cache access time | ns |
| Tm | Main memory access time on a miss | ns |
Problem
A processor has an L1 cache with access time Tc = 4 ns and main memory access time Tm = 80 ns. If the cache hit rate h = 0.92, calculate the average memory access time.
Solution
Step 1: Apply the formula: T_avg = h × Tc + (1 − h) × Tm.
Step 2: Substitute values: T_avg = 0.92 × 4 + (1 − 0.92) × 80.
Step 3: T_avg = 3.68 + 0.08 × 80 = 3.68 + 6.40 = 10.08 ns.
Answer
Average memory access time = 10.08 ns, much closer to cache speed (4 ns) than main memory (80 ns).
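The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `average_access_time` and its parameter names are chosen here for illustration and do not come from any library.

```python
def average_access_time(hit_rate: float, t_cache_ns: float, t_memory_ns: float) -> float:
    """Return T_avg = h * Tc + (1 - h) * Tm, in nanoseconds."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * t_cache_ns + (1.0 - hit_rate) * t_memory_ns


# Values from the worked problem: h = 0.92, Tc = 4 ns, Tm = 80 ns.
t_avg = average_access_time(0.92, 4.0, 80.0)
print(f"T_avg = {t_avg:.2f} ns")  # prints "T_avg = 10.08 ns"
```

Varying `hit_rate` shows how sensitive the result is: with these latencies, each percentage point of hit rate lost adds 0.76 ns to the average.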
| Cache Level | Typical Size | Access Latency | Shared? | Replacement Policy |
|---|---|---|---|---|
| L1-I (Instruction) | 32–64 KB per core | 4–5 cycles | No (per core) | LRU |
| L1-D (Data) | 32–64 KB per core | 4–5 cycles | No (per core) | LRU |
| L2 (Unified) | 256 KB–2 MB per core | 12–15 cycles | No (per core) | LRU or pseudo-LRU |
| L3 (Last Level Cache) | 4–64 MB total | 30–45 cycles | Yes (all cores) | Adaptive / QLRU |
| DRAM (Main Memory) | 4–512 GB | 200–300 cycles | Yes (system-wide) | N/A |
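The two-level formula T_avg = h × Tc + (1 − h) × Tm can be applied repeatedly to a hierarchy like the one in the table, treating the average time of the next level as the miss time of the current one. The sketch below assumes this recursive reading of the formula. The latencies are the lower bounds from the table; the L2 and L3 hit rates are hypothetical values chosen only for illustration, and `hierarchy_access_cycles` is not a standard function.

```python
def hierarchy_access_cycles(levels, memory_cycles):
    """Average access time in cycles for a multi-level cache hierarchy.

    levels: list of (local_hit_rate, latency_cycles) tuples, ordered L1 first.
    memory_cycles: access time of main memory on a miss in the last level.
    Applies T = h * T_level + (1 - h) * T_next, starting from the last level.
    """
    t = memory_cycles
    for hit_rate, latency in reversed(levels):
        t = hit_rate * latency + (1.0 - hit_rate) * t
    return t


# Latencies: lower bounds from the table. Hit rates: illustrative assumptions.
levels = [
    (0.92, 4),   # L1
    (0.80, 12),  # L2 (hypothetical local hit rate)
    (0.70, 30),  # L3 (hypothetical local hit rate)
]
print(f"{hierarchy_access_cycles(levels, 200):.2f} cycles")  # prints "5.74 cycles"
```

Under these assumptions the average stays close to the L1 latency even though DRAM is 50 times slower, because only a small fraction of accesses reach each successive level.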
The memory hierarchy in computer systems is a structured pyramid of storage levels organized by speed, cost, and capacity, where faster and more expensive memory (registers, cache) sits close to the processor and slower, cheaper, larger storage (RAM, SSD, HDD) resides farther away. The hierarchy exploits the principle of locality — programs tend to reuse recently accessed data (temporal locality) and access nearby memory addresses (spatial locality) — to make the average memory access time approach that of the fastest level. Effective hierarchy design is critical to bridging the speed gap between the processor and main memory.
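The effect of spatial locality can be illustrated with a toy simulation. The sketch below models a direct-mapped cache (one possible organization, chosen here for simplicity) and compares two traversal orders over the same matrix. The cache geometry, the matrix dimensions, and the function name `hit_rate` are all hypothetical.

```python
def hit_rate(addresses, num_lines=64, line_size=64):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    tags = [None] * num_lines        # one stored tag per cache line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // line_size    # memory block containing this byte
        index = block % num_lines    # cache line the block maps to
        tag = block // num_lines     # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag        # miss: load the block, evicting the old one
        total += 1
    return hits / total if total else 0.0


ROWS, COLS, ELEM = 256, 256, 8  # hypothetical matrix of 8-byte elements

# Row-major order touches consecutive addresses (good spatial locality).
row_major = [(r * COLS + c) * ELEM for r in range(ROWS) for c in range(COLS)]
# Column-major order jumps COLS * ELEM bytes between consecutive accesses.
col_major = [(r * COLS + c) * ELEM for c in range(COLS) for r in range(ROWS)]

print(f"row-major hit rate:    {hit_rate(row_major):.3f}")  # prints 0.875
print(f"column-major hit rate: {hit_rate(col_major):.3f}")  # prints 0.000
```

In row-major order each 64-byte line serves eight consecutive elements, so seven of every eight accesses hit. In column-major order every access lands in a different block, and in this small cache each block is evicted before it is reused.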
Microprocessor architecture describes the internal organization and design of a microprocessor, including the arrangement of its arithmetic logic unit (ALU), control unit, registers, cache, buses, and instruction set, which collectively determine how the processor fetches, decodes, and executes instructions. Architectures are broadly classified as RISC (Reduced Instruction Set Computer) or CISC (Complex Instruction Set Computer), each with distinct trade-offs in instruction complexity, pipeline depth, and energy efficiency. Modern processors incorporate multiple cores, branch prediction, out-of-order execution, and deep cache hierarchies to maximize performance.
A computer pipeline is a hardware technique that overlaps the execution of multiple instructions by dividing instruction processing into discrete sequential stages — typically fetch, decode, execute, memory access, and write-back — so that each stage operates on a different instruction simultaneously, analogous to an assembly line. Pipelining increases instruction throughput (instructions completed per second) without reducing the time to complete a single instruction (latency), ideally executing one instruction per clock cycle at steady state. Pipeline performance is limited by hazards: structural hazards (resource conflicts), data hazards (dependency between instructions), and control hazards (branches altering instruction flow).
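The throughput gain from pipelining can be estimated with the standard idealized cycle count: the first instruction needs one cycle per stage, and each later instruction completes one cycle after the previous one. The sketch below assumes every stage takes exactly one clock cycle; `stall_cycles` is a hypothetical aggregate standing in for cycles lost to hazards, and the function names are illustrative.

```python
def pipeline_cycles(num_instructions: int, num_stages: int = 5, stall_cycles: int = 0) -> int:
    """Cycles to complete num_instructions on a pipeline with num_stages stages."""
    if num_instructions == 0:
        return 0
    # Fill the pipeline once, then retire one instruction per cycle, plus stalls.
    return num_stages + (num_instructions - 1) + stall_cycles


def unpipelined_cycles(num_instructions: int, num_stages: int = 5) -> int:
    """Cycles when each instruction passes through all stages before the next starts."""
    return num_instructions * num_stages


n, k = 100, 5
ideal = pipeline_cycles(n, k)                         # 104 cycles
with_stalls = pipeline_cycles(n, k, stall_cycles=20)  # 124 cycles
baseline = unpipelined_cycles(n, k)                   # 500 cycles

print(f"ideal speedup:       {baseline / ideal:.2f}")        # prints 4.81
print(f"speedup with stalls: {baseline / with_stalls:.2f}")  # prints 4.03
```

As the instruction count grows, the ideal speedup approaches the number of stages, while stalls from hazards pull it below that limit.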
From French "cache" (a hiding place, storage), derived from "cacher" (to hide), from Latin "coacticare" (to compress, conceal). The term was applied to computer memory in the 1960s at IBM by Liptay and others who described a small, fast "buffer" hiding between the CPU and core memory. The word entered common computing vocabulary around 1968.