1. Overview of Single-Cycle CPU

  • Concept: entire instruction executes in one clock cycle
  • Implication: cycle time must accommodate the slowest instruction (typically a load, which uses every unit including write-back)
  • Stages (all in one cycle): Fetch → Decode/Register read → Execute → Memory → Write-back → PC update

2. Datapath Components

  1. Instruction Memory
    • Address = PC; outputs 32-bit instruction
  2. Register File
    • Two read ports (rs1, rs2), one write port (rd)
    • Read data → ALU inputs; rs2 data also → Data Memory write data (stores)
  3. Sign-Extend Unit
    • Sign-extends the instruction's immediate field to 32 bits
  4. ALU
    • Performs arithmetic/logic based on ALU control
    • Inputs: rs1_data, (rs2_data or imm)
  5. Data Memory
    • Address = ALU result; write data = rs2_data; read/write controlled by MemRead/MemWrite
  6. MUXes
    • ALUSrc: choose register vs. immediate
    • MemToReg: choose ALU result vs. memory data for write-back
    • PCSrc: choose next PC (PC+4, branch target, jump target)
  7. PC + 4 Adder
    • Computes sequential PC
  8. Branch Adder
    • Computes PC + sign-extended offset
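
The Sign-Extend Unit and the ALUSrc MUX above can be sketched in a few lines. This is a minimal behavioral model, not a hardware description; the function names are made up for illustration, and a 12-bit immediate is assumed (as in RISC-V I-type instructions).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1           # keep only the immediate field
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def alu_second_operand(rs2_data, imm, alu_src):
    """ALUSrc MUX: 0 selects the register value, 1 selects the immediate."""
    return imm if alu_src else rs2_data


imm = sign_extend(0xFFC, 12)           # 12-bit field holding -4
operand_b = alu_second_operand(rs2_data=7, imm=imm, alu_src=1)
```

With alu_src=1 the register value 7 is ignored and the ALU sees the sign-extended immediate.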

3. Control Signals (from Main & ALU Control)

Signal    | Action
----------|--------------------------------------------
RegWrite  | enables write to Register File
MemRead   | enables Data Memory read
MemWrite  | enables Data Memory write
MemToReg  | 0: ALU result → RF, 1: Data Mem → RF
ALUSrc    | 0: second ALU input = rs2, 1: = immediate
Branch    | 1: instruction is a branch; taken only if Zero=1
Jump      | 1: next PC = jump target (unconditional)
ALUOp[1:0]| 00: add (load/store), 01: subtract (branch), 10: use funct fields (R-type)
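
As a rough sketch, the main control unit can be modeled as a lookup from the 7-bit opcode to the signals in the table. The opcodes are the standard RV32I values; the names Control, CONTROL_TABLE and main_control are hypothetical, and don't-care signals are set to 0 here by assumption.

```python
from typing import NamedTuple


class Control(NamedTuple):
    reg_write: int
    mem_read: int
    mem_write: int
    mem_to_reg: int
    alu_src: int
    branch: int
    jump: int
    alu_op: int


# Don't-care entries (e.g. MemToReg for sw/beq, ALUOp for jal) are written as 0.
CONTROL_TABLE = {
    0b0110011: Control(1, 0, 0, 0, 0, 0, 0, 0b10),  # R-type
    0b0000011: Control(1, 1, 0, 1, 1, 0, 0, 0b00),  # lw
    0b0100011: Control(0, 0, 1, 0, 1, 0, 0, 0b00),  # sw
    0b1100011: Control(0, 0, 0, 0, 0, 1, 0, 0b01),  # beq
    0b1101111: Control(1, 0, 0, 0, 0, 0, 1, 0b00),  # jal
}


def main_control(instr):
    """Decode the low 7 bits (opcode) of a 32-bit instruction word."""
    return CONTROL_TABLE[instr & 0x7F]


signals = main_control(0x007302B3)     # add x5,x6,x7
```

An unknown opcode raises KeyError in this model; real hardware would need a defined behavior for that case.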

4. Data Flow per Instruction Type

R-Type (add x5,x6,x7)

  1. Fetch instruction
  2. Decode: rs1=x6, rs2=x7 → read register file
  3. ALUSrc=0, ALUOp=10 → ALU computes x6 + x7
  4. MemToReg=0, RegWrite=1 → write ALU result to x5
  5. PC←PC+4
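
The R-type steps above can be traced in software. The sketch below assumes the standard RV32I R-format field layout; encode_r, decode_r and step_add are illustrative helper names, and only add is modeled.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode=0b0110011):
    """Pack R-format fields into a 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def decode_r(instr):
    """Split an R-format word back into its register fields."""
    return {
        "rd": (instr >> 7) & 0x1F,
        "rs1": (instr >> 15) & 0x1F,
        "rs2": (instr >> 20) & 0x1F,
    }


def step_add(regs, pc, instr):
    f = decode_r(instr)                        # step 2: decode
    a, b = regs[f["rs1"]], regs[f["rs2"]]      # step 2: register read
    alu_result = (a + b) & 0xFFFFFFFF          # step 3: ALUSrc=0, ALU adds
    regs[f["rd"]] = alu_result                 # step 4: MemToReg=0, RegWrite=1
    return pc + 4                              # step 5: sequential PC


regs = [0] * 32
regs[6], regs[7] = 10, 32
instr = encode_r(funct7=0, rs2=7, rs1=6, funct3=0, rd=5)   # add x5,x6,x7
pc = step_add(regs, 0, instr)
```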

I-Type Load (lw x8,12(x9))

  1. Fetch
  2. Decode: rs1=x9, imm=12
  3. ALUSrc=1, ALUOp=00 → ALU computes address x9+12
  4. MemRead=1 → read Data Mem at address
  5. MemToReg=1, RegWrite=1 → write memory data to x8
  6. PC←PC+4
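
A small behavioral sketch of the load path, assuming data memory is modeled as a Python dict keyed by byte address (one word per entry). The function name step_lw and the example values are made up for illustration.

```python
def step_lw(regs, data_mem, pc, rd, rs1, imm):
    alu_result = (regs[rs1] + imm) & 0xFFFFFFFF   # step 3: ALUSrc=1, address = rs1 + imm
    mem_data = data_mem[alu_result]               # step 4: MemRead=1
    mem_to_reg = 1                                # step 5: select memory data
    regs[rd] = mem_data if mem_to_reg else alu_result
    return pc + 4                                 # step 6


regs = [0] * 32
regs[9] = 0x100
data_mem = {0x10C: 0xDEADBEEF}
pc = step_lw(regs, data_mem, pc=0, rd=8, rs1=9, imm=12)   # lw x8,12(x9)
```

The MemToReg selection is written out explicitly to mirror the MUX, even though it is constant for a load.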

Store (sw x10,8(x11))

  1. Fetch
  2. Decode: rs1=x11, rs2=x10, imm=8
  3. ALUSrc=1, ALUOp=00 → address x11+8
  4. MemWrite=1, RegWrite=0 → store x10 to Data Mem at the computed address
  5. PC←PC+4
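
One detail the store steps gloss over: in the RV32I S-format the 12-bit immediate is split across bits [31:25] and [11:7], so the decoder has to reassemble it before sign extension. The sketch below illustrates this; encode_s, s_type_imm and step_sw are hypothetical helper names, and memory is again assumed to be a dict keyed by byte address.

```python
def encode_s(imm, rs2, rs1, funct3=0b010, opcode=0b0100011):
    """Pack S-format fields; the immediate is split into imm[11:5] and imm[4:0]."""
    imm &= 0xFFF
    return ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode


def s_type_imm(instr):
    """Reassemble and sign-extend the 12-bit S-format immediate."""
    raw = (((instr >> 25) & 0x7F) << 5) | ((instr >> 7) & 0x1F)
    return raw - 0x1000 if raw & 0x800 else raw


def step_sw(regs, data_mem, pc, instr):
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    address = (regs[rs1] + s_type_imm(instr)) & 0xFFFFFFFF   # ALUSrc=1, ALU adds
    data_mem[address] = regs[rs2]                            # MemWrite=1
    return pc + 4                                            # no register write


regs = [0] * 32
regs[11], regs[10] = 0x200, 99
data_mem = {}
pc = step_sw(regs, data_mem, 0, encode_s(imm=8, rs2=10, rs1=11))   # sw x10,8(x11)
```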

Branch (beq x1,x2,offset)

  1. Fetch
  2. Decode: rs1=x1, rs2=x2, offset → read register file
  3. ALUSrc=0, ALUOp=01 → ALU computes x1 - x2 and sets the Zero flag
  4. If Branch=1 and Zero=1 → PC←PC+offset; otherwise PC←PC+4
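
The branch decision reduces to a two-input MUX on the next PC. A minimal sketch, assuming offset is already a sign-extended byte offset; alu_sub_zero and next_pc are illustrative names.

```python
def alu_sub_zero(a, b):
    """Zero flag of the ALU when it computes a - b (32-bit wraparound)."""
    return ((a - b) & 0xFFFFFFFF) == 0


def next_pc(pc, branch, zero, offset):
    pc_plus_4 = pc + 4                  # PC + 4 adder
    branch_target = pc + offset         # branch adder
    taken = bool(branch) and bool(zero)
    return (branch_target if taken else pc_plus_4) & 0xFFFFFFFF


taken_pc = next_pc(0x40, branch=1, zero=alu_sub_zero(5, 5), offset=16)
not_taken_pc = next_pc(0x40, branch=1, zero=alu_sub_zero(5, 6), offset=16)
```

Both adders compute their results every cycle; the Branch and Zero signals only decide which one reaches the PC.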

Jump (jal x3,label)

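  • Jump=1 → PCSrc MUX selects jump target (PC + sign-extended offset)
  • RegWrite=1 → PC+4 (return address) written to x3; this needs PC+4 as an extra write-back source alongside the MemToReg choices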
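
A short sketch of the jal behavior, under the assumption that the jump target is PC plus a sign-extended byte offset (as in RISC-V) and that x0 is hardwired to zero. The name step_jal is made up for illustration.

```python
def step_jal(regs, pc, rd, offset):
    link = (pc + 4) & 0xFFFFFFFF        # return address from the PC + 4 adder
    target = (pc + offset) & 0xFFFFFFFF
    if rd != 0:                         # assumption: writes to x0 are discarded
        regs[rd] = link                 # RegWrite=1, write-back source = PC + 4
    return target                       # Jump=1: next PC = jump target


regs = [0] * 32
new_pc = step_jal(regs, pc=0x100, rd=3, offset=0x20)   # jal x3,label
```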

5. Timing Considerations

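  • Critical path (load): PC → instruction memory → register file read → ALU → Data Mem → MemToReg MUX → register write; control decode runs in parallel with the register read
  • Single-cycle cost: cycle time = sum of delays along the critical path, and every instruction pays it, even those that finish sooner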
  • Trade-off: slow cycle time vs. simpler control; motivates pipelining (Lecture 8)
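
To make the timing argument concrete, the sketch below sums unit delays along each instruction's path and takes the maximum as the cycle time. The delay values are illustrative assumptions, not measurements from these notes, and MUX, control and adder delays are ignored.

```python
# Hypothetical unit delays in picoseconds (assumed for illustration only).
DELAY_PS = {
    "instr_mem": 200,
    "reg_read": 100,
    "alu": 200,
    "data_mem": 200,
    "reg_write": 100,
}

# Units each instruction type passes through, in order.
PATHS = {
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "lw": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw": ["instr_mem", "reg_read", "alu", "data_mem"],
    "beq": ["instr_mem", "reg_read", "alu"],
}


def path_delay(units):
    return sum(DELAY_PS[u] for u in units)


latency = {name: path_delay(units) for name, units in PATHS.items()}
cycle_time_ps = max(latency.values())      # the clock must fit the slowest path
slowest = max(latency, key=latency.get)
```

Under these assumed delays the load sets the clock period, so a branch that needs only 500 ps still occupies a full 800 ps cycle.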