A computer pipeline is a hardware technique that overlaps the execution of multiple instructions by dividing instruction processing into discrete sequential stages (typically fetch, decode, execute, memory access, and write-back) so that each stage operates on a different instruction at the same time, analogous to an assembly line. Pipelining increases instruction throughput (instructions completed per second) without reducing the time to complete a single instruction (latency); at steady state an ideal pipeline completes one instruction per clock cycle. Pipeline performance is limited by hazards: structural hazards (two instructions need the same hardware resource in the same cycle), data hazards (an instruction depends on a result that an earlier instruction has not yet produced), and control hazards (branches change which instruction should be fetched next).
CPI_pipeline = 1 + stalls_per_instruction
LaTeX: CPI_{pipeline} = 1 + \text{stalls per instruction}
| Symbol | Meaning | Unit |
|---|---|---|
| CPI_pipeline | Average clock cycles per instruction in a pipelined processor | cycles/instruction |
| 1 | Ideal CPI for a fully pipelined processor with no hazards | cycles/instruction |
| stalls_per_instruction | Average pipeline stall cycles introduced by hazards, per instruction | cycles/instruction |
Problem
A 5-stage pipeline runs at 2 GHz. A program has 10% branch instructions, each causing a 3-cycle stall (no branch prediction). Data hazards add 0.1 stall cycles per instruction on average. Calculate the effective CPI and throughput.
Solution
Step 1: Stalls per instruction = branch stalls + data stalls = (0.10 × 3) + 0.1 = 0.30 + 0.10 = 0.40 cycles/instruction.
Step 2: CPI = 1 + 0.40 = 1.40 cycles/instruction.
Step 3: Throughput = clock frequency / CPI = 2×10⁹ / 1.40 ≈ 1.43×10⁹ instructions/second.
Answer
CPI = 1.40 cycles/instruction; effective throughput ≈ 1.43 billion instructions per second (1.43 GIPS), versus 2 GIPS for an ideal pipeline with CPI = 1.
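The arithmetic in the worked example can be checked with a short Python sketch. The function name `effective_cpi` and the representation of each hazard source as a `(fraction of instructions, stall cycles)` pair are illustrative choices, not part of any standard tool; the numbers are the ones given in the problem.

```python
def effective_cpi(stall_sources):
    """Return 1 + the average stall cycles per instruction.

    stall_sources is a list of (fraction_of_instructions, stall_cycles) pairs;
    each pair contributes fraction * stall_cycles to the average.
    """
    return 1.0 + sum(fraction * cycles for fraction, cycles in stall_sources)


clock_hz = 2e9  # 2 GHz clock from the problem statement

stall_sources = [
    (0.10, 3),    # branches: 10% of instructions, 3-cycle stall each
    (1.00, 0.1),  # data hazards: 0.1 stall cycles averaged over all instructions
]

cpi = effective_cpi(stall_sources)
throughput = clock_hz / cpi  # instructions per second

print(f"CPI = {cpi:.2f} cycles/instruction")
print(f"Throughput = {throughput / 1e9:.2f} GIPS")
```

Changing the branch penalty or branch frequency in `stall_sources` shows how sensitive throughput is to control hazards under these assumptions.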
| Stage | Abbreviation | Operation Performed | Key Hardware | Hazard Risk |
|---|---|---|---|---|
| Instruction Fetch | IF | Read next instruction from memory using PC | PC register, instruction memory | Control hazard |
| Instruction Decode | ID | Decode opcode, read register file, sign-extend immediates | Register file, decoder | Data hazard (RAW) |
| Execute | EX | Perform ALU operation or compute address | ALU, forwarding unit | Data hazard (RAW, usually resolved by forwarding) |
| Memory Access | MEM | Read/write data memory for load/store instructions | Data cache | Structural hazard (if instruction fetch and data access share one memory port) |
| Write Back | WB | Write ALU result or loaded data to register file | Register file write port | Data hazard (a later instruction may read the register before it is written) |
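As a rough illustration of how the five stages in the table overlap, the following Python sketch builds a cycle-by-cycle diagram for an ideal pipeline with no stalls. The helper names `pipeline_diagram` and `format_diagram` are hypothetical, and the model assumes every instruction spends exactly one cycle in each stage. Under that assumption, n instructions on a k-stage pipeline finish in k + n - 1 cycles.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; each row lists the stage occupied per cycle.

    Assumes an ideal pipeline: instruction i enters the first stage in cycle i
    (0-based) and advances one stage every cycle with no stalls.
    """
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = []
        for cycle in range(total_cycles):
            stage_index = cycle - i
            if 0 <= stage_index < len(stages):
                row.append(stages[stage_index])
            else:
                row.append("")  # instruction not in the pipeline this cycle
        rows.append(row)
    return rows


def format_diagram(rows):
    """Format the rows as fixed-width text with cycle and instruction labels."""
    header = "".join(f"{'c' + str(c + 1):<5}" for c in range(len(rows[0])))
    lines = ["     " + header.rstrip()]
    for i, row in enumerate(rows):
        cells = "".join(f"{cell:<5}" for cell in row)
        lines.append(f"{'I' + str(i + 1):<5}" + cells.rstrip())
    return "\n".join(lines)


rows = pipeline_diagram(3)
print(format_diagram(rows))
print("Total cycles:", len(rows[0]))
```

Each column of the printed diagram is one clock cycle; reading down a column shows which instruction occupies which stage at that moment.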
Microprocessor architecture describes the internal organization and design of a microprocessor, including the arrangement of its arithmetic logic unit (ALU), control unit, registers, cache, buses, and instruction set, which collectively determine how the processor fetches, decodes, and executes instructions. Architectures are broadly classified as RISC (Reduced Instruction Set Computer) or CISC (Complex Instruction Set Computer), each with distinct trade-offs in instruction complexity, pipeline depth, and energy efficiency. Modern processors incorporate multiple cores, branch prediction, out-of-order execution, and deep cache hierarchies to maximize performance.
Cache memory is a small, high-speed memory layer placed between the processor and main memory (RAM) that stores copies of frequently accessed data and instructions to reduce average memory access latency. Modern processors use a multi-level cache hierarchy (L1, L2, L3), each level larger and slower than the previous, organized around the principles of temporal locality (recently used data will likely be reused) and spatial locality (nearby data will likely be accessed soon). Cache performance is measured by the hit rate — the fraction of memory requests satisfied by the cache — and miss penalty — the extra time needed to fetch data from a lower level.
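Hit rate and miss penalty are commonly combined into an average memory access time (AMAT) using the standard textbook relation AMAT = hit time + miss rate × miss penalty. The Python sketch below applies that relation to a single cache level. The function name and the example numbers (1-cycle hit time, 95% hit rate, 100-cycle miss penalty) are assumptions chosen only for illustration and do not describe any particular processor.

```python
def average_memory_access_time(hit_time, hit_rate, miss_penalty):
    """Return AMAT = hit_time + (1 - hit_rate) * miss_penalty.

    All times must use the same unit (here, clock cycles).
    """
    miss_rate = 1.0 - hit_rate
    return hit_time + miss_rate * miss_penalty


# Hypothetical single-level cache parameters, for illustration only.
amat = average_memory_access_time(hit_time=1, hit_rate=0.95, miss_penalty=100)
print(f"AMAT = {amat:.2f} cycles")
```

Even a small change in hit rate moves the result noticeably when the miss penalty is large, which is why multi-level hierarchies are used to shrink the effective penalty.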
A Boolean logic circuit is a digital electronic circuit that performs logical operations on binary inputs (0 or 1) using combinations of fundamental logic gates — AND, OR, NOT, NAND, NOR, XOR, and XNOR — to produce a binary output defined by Boolean algebra. These circuits are the building blocks of all digital computers, forming the basis of arithmetic units, control logic, and memory elements. The behavior of any combinational logic circuit can be fully described by a Boolean expression or truth table.
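To make the link between a Boolean expression and its truth table concrete, the Python sketch below enumerates every input combination for a small combinational circuit. The half adder (sum = A XOR B, carry = A AND B) is used here only as a familiar example, and the helper name `truth_table` is hypothetical.

```python
from itertools import product


def truth_table(func, num_inputs):
    """Return [(inputs, output), ...] for every combination of 0/1 inputs."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=num_inputs)]


def half_adder_sum(a, b):
    return a ^ b  # XOR gate


def half_adder_carry(a, b):
    return a & b  # AND gate


sum_table = truth_table(half_adder_sum, 2)
carry_table = truth_table(half_adder_carry, 2)

print("A B | Sum Carry")
for (inputs, s), (_, c) in zip(sum_table, carry_table):
    a, b = inputs
    print(f"{a} {b} |  {s}    {c}")
```

Because a combinational circuit has no memory, the table produced this way describes its behavior completely.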
From Old English/Old High German "pipa" (pipe, tube), extended metaphorically to any assembly-line process. "Pipeline" as a computing concept was introduced by IBM in the 1960s for the Stretch (7030) and System/360 Model 91 processors. The 5-stage RISC pipeline was popularized by Hennessy and Patterson in their influential computer architecture textbook (1990).