Goal: increase instruction throughput by overlapping stages
Classic 5-stage pipeline:
IF: Instruction Fetch
ID: Instruction Decode & Register Read
EX: Execute / Address Calculation
MEM: Data Memory Access
WB: Write-Back to Register File
Ideal speedup: ≈ 5× over a single-cycle design, assuming balanced stages and no hazards (one instruction completes each cycle once the pipeline is full)
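The ≈ 5× figure comes from counting cycles. The sketch below assumes perfectly balanced stages and no hazards; the helper names `pipeline_cycles` and `pipeline_speedup` are illustrative, not part of the notes.

```python
def pipeline_cycles(n_instructions: int, n_stages: int = 5) -> int:
    """Cycles to finish n instructions on an ideal pipeline (no stalls)."""
    # The first instruction needs n_stages cycles to pass through the pipeline;
    # every later instruction completes one cycle after its predecessor.
    return n_stages + (n_instructions - 1)


def pipeline_speedup(n_instructions: int, n_stages: int = 5) -> float:
    """Speedup over a design that spends n_stages stage-times per instruction."""
    unpipelined = n_instructions * n_stages  # measured in stage-times
    return unpipelined / pipeline_cycles(n_instructions, n_stages)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, round(pipeline_speedup(n), 3))
```

For a single instruction there is no gain at all; the speedup only approaches the stage count as the instruction stream grows long relative to the fill time.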
2. Pipeline Registers & Datapath
Between adjacent stages, pipeline registers hold intermediate values and control signals
Registers: IF/ID, ID/EX, EX/MEM, MEM/WB
Data path: each stage reads its inputs from the pipeline register that precedes it
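One way to picture the four pipeline registers is as slots that all shift on the same clock edge. This is a toy model, assuming each register holds only an instruction label (real registers carry operands, results, and control bits); `clock_edge` is a hypothetical helper name.

```python
from typing import Dict, Optional

PIPELINE_REGS = ("IF/ID", "ID/EX", "EX/MEM", "MEM/WB")


def new_pipeline() -> Dict[str, Optional[str]]:
    """All pipeline registers start empty (None acts as a bubble)."""
    return {name: None for name in PIPELINE_REGS}


def clock_edge(regs: Dict[str, Optional[str]], fetched: Optional[str]) -> Optional[str]:
    """Advance every pipeline register by one stage.

    Returns the instruction that was in MEM/WB, i.e. the one whose
    write-back finished during the cycle that just ended.
    """
    retired = regs["MEM/WB"]
    regs["MEM/WB"] = regs["EX/MEM"]
    regs["EX/MEM"] = regs["ID/EX"]
    regs["ID/EX"] = regs["IF/ID"]
    regs["IF/ID"] = fetched  # result of the IF stage in the cycle that just ended
    return retired


if __name__ == "__main__":
    regs = new_pipeline()
    for cycle, instr in enumerate(["i1", "i2", "i3", None, None, None, None], start=1):
        done = clock_edge(regs, instr)
        print(f"end of cycle {cycle}: retired={done} regs={regs}")
```

An instruction fetched in cycle 1 is returned by the fifth call, which matches the five-stage latency.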
3. Pipeline Hazards
Structural Hazards: resource conflicts (e.g., single memory for IF and MEM)
Data Hazards: RAW, WAR, WAW dependencies
RAW (Read After Write): true data dependency
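WAR (Write After Read), WAW (Write After Write): name dependencies; they cause no hazards in a simple in-order 5-stage RISC-V pipeline (registers are read in ID and written in WB, in program order) but matter in out-of-order or multi-issue designs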
Control Hazards: branches and jumps change PC flow
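The three data-dependency kinds can be told apart purely from register names. A minimal sketch, assuming each instruction is reduced to one optional destination and a tuple of sources; the `Instr` type and `data_dependencies` function are illustrative names.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    dest: Optional[str]        # None for instructions that write no register (e.g. sw, beq)
    srcs: Tuple[str, ...]


def _is_real(reg: Optional[str]) -> bool:
    # x0 is hard-wired to zero in RISC-V, so it never carries a dependency.
    return reg is not None and reg != "x0"


def data_dependencies(first: Instr, second: Instr) -> Set[str]:
    """Classify dependencies of `second` on the earlier instruction `first`."""
    deps = set()
    if _is_real(first.dest) and first.dest in second.srcs:
        deps.add("RAW")   # second reads what first writes
    if _is_real(second.dest) and second.dest in first.srcs:
        deps.add("WAR")   # second overwrites what first still has to read
    if _is_real(first.dest) and first.dest == second.dest:
        deps.add("WAW")   # both write the same register
    return deps


if __name__ == "__main__":
    add = Instr("x1", ("x2", "x3"))   # add x1,x2,x3
    sub = Instr("x4", ("x1", "x5"))   # sub x4,x1,x5
    print(data_dependencies(add, sub))
```

Only the RAW case forces the in-order 5-stage pipeline to forward or stall; the function reports the other two so the same check can be reused when reasoning about reordering.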
4. Data Hazard Resolution
Forwarding (Data Bypassing): feed EX/MEM or MEM/WB outputs back to ALU inputs
Stalling (Pipeline Interlock): insert bubbles (NOPs) when forwarding alone is insufficient (e.g., a load followed immediately by a use of the loaded value)
Instruction Pair                 Hazard                Resolution
add x1,x2,x3 ; sub x4,x1,x5      RAW on x1             Forward result
lw x1,0(x2) ; add x3,x1,x4       Load-use RAW on x1    Stall one cycle (then forward from MEM/WB)
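The two rows of the table correspond to two small pieces of control logic. The sketch below follows the usual textbook conditions and assumes register numbers are plain integers with 0 meaning x0; the function and parameter names are made up for illustration.

```python
def forward_select(rs: int,
                   ex_mem_rd: int, ex_mem_regwrite: bool,
                   mem_wb_rd: int, mem_wb_regwrite: bool) -> str:
    """Choose where an ALU operand for source register `rs` should come from."""
    # The newest producer wins: check EX/MEM before MEM/WB.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == rs:
        return "EX/MEM"
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == rs:
        return "MEM/WB"
    return "REGFILE"


def load_use_stall(id_ex_memread: bool, id_ex_rd: int,
                   if_id_rs1: int, if_id_rs2: int) -> bool:
    """True when the instruction in ID needs a value a load in EX has not fetched yet."""
    return id_ex_memread and id_ex_rd != 0 and id_ex_rd in (if_id_rs1, if_id_rs2)


if __name__ == "__main__":
    # add x1,x2,x3 ; sub x4,x1,x5 -> sub is in EX while add sits in EX/MEM.
    print(forward_select(1, ex_mem_rd=1, ex_mem_regwrite=True,
                         mem_wb_rd=0, mem_wb_regwrite=False))
    # lw x1,0(x2) ; add x3,x1,x4 -> add is in ID while lw is in EX.
    print(load_use_stall(True, id_ex_rd=1, if_id_rs1=1, if_id_rs2=4))
```

After the one-cycle stall, the load's result is in MEM/WB when the dependent instruction reaches EX, so `forward_select` picks the MEM/WB path for it.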
5. Control Hazard Resolution
Flush / Squash: convert wrong-path instructions into NOPs after misprediction
Branch Prediction:
Static: always-not-taken or always-taken
Dynamic: 1-bit or 2-bit predictors, branch target buffer (BTB)
Delayed Branch (older RISC): execute fixed number of instructions before branch takes effect
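A 2-bit predictor is commonly built as a saturating counter. The following is a minimal sketch of one counter for a single branch, assuming states 0-1 predict not-taken and 2-3 predict taken; the class name and the initial state are arbitrary choices, and a real predictor would index a table of such counters by PC.

```python
from typing import Iterable


class TwoBitPredictor:
    """Saturating counter: 0 = strongly not-taken ... 3 = strongly taken."""

    def __init__(self, state: int = 1) -> None:
        self.state = state  # assumed starting point: weakly not-taken

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes: Iterable[bool], predictor: TwoBitPredictor) -> int:
    """Run the predictor over actual branch outcomes and count wrong guesses."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


if __name__ == "__main__":
    # A loop branch: taken 9 times, then falls through, executed twice.
    loop = ([True] * 9 + [False]) * 2
    print(count_mispredictions(loop, TwoBitPredictor()))
```

Because a single not-taken outcome only moves the counter from strongly to weakly taken, the predictor still guesses taken when the loop is re-entered, which is the advantage over a 1-bit scheme.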
6. Performance Metrics
Pipeline CPI: ideally 1, but hazards add stalls
$\displaystyle \text{CPI} = 1 + \text{Stall cycles per instruction}$
Speedup:
$\displaystyle \text{Speedup} = \frac{\text{Single-cycle time}}{\text{Pipeline time per instruction}}$
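A worked instance of the two formulas, taking pipeline time per instruction as clock period times CPI. Every number here is a made-up assumption for illustration (instruction mix, misprediction rate, 2-cycle branch penalty, 1000 ps single-cycle time, 200 ps pipeline clock), not a measurement.

```python
def pipeline_cpi(stall_cycles_per_instruction: float) -> float:
    """CPI = 1 + stall cycles per instruction."""
    return 1.0 + stall_cycles_per_instruction


def speedup(single_cycle_time_ps: float, pipeline_clock_ps: float, cpi: float) -> float:
    """Single-cycle time divided by pipeline time per instruction (clock * CPI)."""
    return single_cycle_time_ps / (pipeline_clock_ps * cpi)


if __name__ == "__main__":
    # Hypothetical workload:
    #   20% of instructions are loads followed by a dependent use (1 stall each)
    #   15% are branches, 40% of those mispredicted, 2-cycle penalty each
    stalls = 0.20 * 1 + 0.15 * 0.40 * 2
    cpi = pipeline_cpi(stalls)
    print(f"CPI     = {cpi:.2f}")
    print(f"Speedup = {speedup(1000.0, 200.0, cpi):.2f}")
```

With these assumed inputs the stalls pull the speedup noticeably below the ideal 5×, which is the point of tracking CPI rather than clock rate alone.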
7. Example: Hazard Walkthrough
    lw  x1,0(x2)
    add x3,x1,x4      # load-use RAW hazard on x1: stall one cycle, then forward
    sub x5,x3,x6
    beq x5,x0,L1      # control hazard
    ...               # speculatively fetched
L1:
    and x7,x1,x8
Cycle timing diagram: show each instruction moving through IF, ID, EX, MEM, WB (separated by the IF/ID, ID/EX, EX/MEM, MEM/WB registers) across cycles, marking stalls and forwarding paths
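A timing diagram for the straight-line part of the walkthrough can be generated rather than drawn by hand. The sketch below assumes an in-order pipeline with full forwarding, a one-cycle load-use stall, and no control hazards (the `beq` and everything after it are left out); `Instr`, `schedule`, and `render` are illustrative names.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...]
    is_load: bool = False


def schedule(program: List[Instr]) -> List[List[str]]:
    """One row per instruction; row[c] is the stage occupied in cycle c + 1."""
    rows = []
    prev_id = prev_ex = 0  # first ID cycle and EX cycle of the previous instruction
    for i, ins in enumerate(program):
        if i == 0:
            if_start, id_start, ex = 1, 2, 3
        else:
            prev = program[i - 1]
            load_use = (prev.is_load and prev.dest not in (None, "x0")
                        and prev.dest in ins.srcs)
            if_start = prev_id   # fetch while the predecessor starts decoding
            id_start = prev_ex   # decode once the predecessor has moved to EX
            ex = prev_ex + 1 + (1 if load_use else 0)
        row = [""] * (ex + 2)
        for c in range(if_start, id_start):
            row[c - 1] = "IF"
        for c in range(id_start, ex):
            row[c - 1] = "ID"    # a repeated ID (or IF) cell marks a stall
        row[ex - 1], row[ex], row[ex + 1] = "EX", "MEM", "WB"
        rows.append(row)
        prev_id, prev_ex = id_start, ex
    return rows


def render(program: List[Instr], rows: List[List[str]]) -> str:
    width = max(len(r) for r in rows)
    lines = [f"{'cycle':<16}" + "".join(f"{c:<4}" for c in range(1, width + 1))]
    for ins, row in zip(program, rows):
        cells = row + [""] * (width - len(row))
        lines.append(f"{ins.text:<16}" + "".join(f"{s:<4}" for s in cells))
    return "\n".join(lines)


if __name__ == "__main__":
    program = [
        Instr("lw x1,0(x2)", "x1", ("x2",), is_load=True),
        Instr("add x3,x1,x4", "x3", ("x1", "x4")),
        Instr("sub x5,x3,x6", "x5", ("x3", "x6")),
    ]
    print(render(program, schedule(program)))
```

In the output, `add` spends two cycles in ID (the load-use bubble) and `sub` spends two cycles in IF behind it; the `add`-to-`sub` dependency on x3 costs nothing because it is covered by forwarding.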