## inst.eecs.berkeley.edu/~cs61c

#### **CS61C: Machine Structures**

# Lecture 25 – Single Cycle CPU Datapath



#### Lecturer PSOE Dan Garcia

www.cs.berkeley.edu/~ddgarcia

Paid to write a Mac virus?! ⇒
A Symantec employee claimed

"Macs aren't more secure, there are just fewer virus writers!". DVForge, believing that Macs WERE more secure, initially offered \$25K to anyone (\$50K if from Symantec) who could infect a honeypot mac. They later retracted the offer.





www.dvforge.com/virus.shtml

## **Anatomy: 5 components of any Computer**







## **Outline of Today's Lecture**

- Design a processor: step-by-step
- Requirements of the Instruction Set
- Hardware components that match the instruction set requirements



## How to Design a Processor: step-by-step

- 1. Analyze instruction set architecture (ISA)
   => datapath <u>requirements</u>
  - meaning of each instruction is given by the register transfers
  - datapath must include storage element for ISA registers
  - datapath must support each register transfer
- 2. Select set of datapath components and establish clocking methodology
- 3. <u>Assemble</u> datapath meeting requirements
- 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.

#### **Review: The MIPS Instruction Formats**

• All MIPS instructions are 32 bits long. 3 formats:



I-type

J-type

| 31      | 26     | 21      | 16             | 71     | 6          | 0       |  |
|---------|--------|---------|----------------|--------|------------|---------|--|
|         | op     | rs      | rt             | rd     | shamt      | funct   |  |
|         | 6 bits | 5 bits  | 5 bits         | 5 bits | 5 bits     | 6 bits  |  |
| 31      | 26     | 21      | 16             |        |            | C       |  |
|         | op     | rs      | rt             |        | address/im | mediate |  |
|         | 6 bits | 5 bits  | 5 bits         |        | 16 bits    |         |  |
| 31      | 26     |         |                |        |            | C       |  |
|         | op     |         | target address |        |            |         |  |
| <u></u> | 6 bits | 26 bits |                |        |            |         |  |

#### The different fields are:

- op: operation ("opcode") of the instruction
- · rs, rt, rd: the source and destination register specifiers
- shamt: shift amount
- funct: selects the variant of the operation in the "op" field
- address / immediate: address offset or immediate value
- target address: target address of jump instruction



## **Step 1a: The MIPS-lite Subset for today**

ADDU and SUBU<sub>31</sub>



•subu rd,rs,rt

OR Immediate:

| 31            | 26   | 21     | 16     | 0         |
|---------------|------|--------|--------|-----------|
| op            | )    | rs     | rt     | immediate |
| <b>a c</b> 61 | nite | 5 hits | 5 hits | 16 hits   |

•ori rt,rs,imm16<sup>6 bits</sup>

TO DITS

 LOAD and **STORE Word** 



•lw rt,rs,imm16

•sw rt,rs,imm16

• BRANCH:

| - | 31 26  | 21     | 16     | 0         |
|---|--------|--------|--------|-----------|
|   | op     | rs     | rt     | immediate |
|   | 6 bits | 5 bits | 5 bits | 16 bits   |

•beq rs,rt,imm16



## Register Transfer Language

## RTL gives the meaning of the instructions

```
{op, rs, rt, rd, shamt, funct} = MEM[PC]
{op, rs, rt, Imm16} = MEM[PC]
```

## All start by fetching the instruction

inst Register Transfers

```
ADDU R[rd] = R[rs] + R[rt]; PC = PC + 4

SUBU R[rd] = R[rs] - R[rt]; PC = PC + 4

ORI R[rt] = R[rs] | zero_ext(Imm16); PC = PC + 4

LOAD R[rt] = MEM[ R[rs] + sign_ext(Imm16)]; PC = PC + 4
```

STORE MEM[ 
$$R[rs] + sign_ext(Imm16)$$
 ] =  $R[rt]$ ; $PC = PC + 4$ 

BEQ if (R[rs] == R[rt]) then
$$PC = PC + 4 + (sign\_ext(Imm16) \parallel 00)$$

$$else PC = PC + 4$$

$$cselc L25 Single Cycle CPU Datapath (7)$$

## Step 1: Requirements of the Instruction Set

- Memory (MEM)
  - instructions & data
- Registers (R: 32 x 32)
  - read RS
  - read RT
  - Write RT or RD
- PC
- Extender (sign extend)
- Add and Sub register or extended immediate
- Add 4 or extended immediate to PC



## **Step 2: Components of the Datapath**

- Combinational Elements
- Storage Elements
  - Clocking methodology



#### **Combinational Logic Elements (Building Blocks)**





•MUX







#### **ALU Needs for MIPS-lite + Rest of MIPS**

Addition, subtraction, logical OR, ==:

```
ADDU R[rd] = R[rs] + R[rt]; ...
SUBU R[rd] = R[rs] - R[rt]; ...
ORI R[rt] = R[rs] |
zero_ext(Imm16)...
BEQ if ( R[rs] == R[rt] )...
```

- Test to see if output == 0 for any ALU operation gives == test. How?
- P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)</li>



#### **Administrivia**

- Final Exam location TBA (exam grp 5)
  - Sat, 2005-05-14, 12:30-3:30pm
  - ALL students are required to complete ALL of the exam (even if you aced the midterm)
  - Same format as the midterm
    - 3 Hours
    - Closed book, except for 2 study sheets + green
    - Leave your backpacks, books at home
- Homework 6 out today, due in a week (mon)



## **Storage Element: Idealized Memory**

- Memory (idealized)
  - One input bus: Data In
  - One output bus: Data Out



- Memory word is selected by:
  - Address selects the word to put on Data Out
  - Write Enable = 1: address selects the memory word to be written via the Data In bus
- Clock input (CLK)
  - The CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - Address valid => Data Out valid after "access time."



## Storage Element: Register (Building Block)

#### Similar to D Flip Flop except

- N-bit input and output
- Write Enable input
- Write Enable:
  - negated (or deasserted) (0):
     Data Out will not change
  - asserted (1):
     Data Out will become Data In





## **Storage Element: Register File**

- Register File consists of 32 registers:
  - Two 32-bit output busses: busA and busB
- - · RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
- Clock input (CLK)
  - The CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - RA or RB valid => busA or busB valid after "access time."



**RWRA RB** 

Write Enable 51

#### **Step 3: Assemble DataPath meeting requirements**

- Register Transfer Requirements
   ⇒ Datapath Assembly
- Instruction Fetch
- Read Operands and Execute Operation



#### 3a: Overview of the Instruction Fetch Unit

- The common RTL operations
  - Fetch the Instruction: mem[PC]
  - Update the program counter:
    - Sequential Code: PC = PC + 4
    - Branch and Jump: PC = "something else"





#### 3b: Add & Subtract

- •R[rd] = R[rs] op R[rt] Ex.: addU rd,rs,rt
  - Ra, Rb, and Rw come from instruction's Rs, Rt, and Rd fields 31 26 21 16 11 6

 op
 rs
 rt
 rd
 shamt
 funct

 6 bits
 5 bits
 5 bits
 5 bits
 5 bits
 5 bits

 ALUctr and RegWr: control logic after decoding the instruction





Already defined register file, ALU

#### **Peer Instruction**

- A. We should use the main ALU to compute PC=PC+4
- B. We're going to be able to read 2 registers and write a 3<sup>rd</sup> in 1 cycle
- C. Datapath is hard, Control is easy

ABC
1: FFF
2: FFT
3: FTF
4: FTT
5: TFF
6: TFT
7: TTF
8: TTT

## How to Design a Processor: step-by-step

- 1. Analyze instruction set architecture (ISA)
   => datapath <u>requirements</u>
  - meaning of each instruction is given by the register transfers
  - datapath must include storage element for ISA registers
  - datapath must support each register transfer
- 2. Select set of datapath components and establish clocking methodology
- 3. <u>Assemble</u> datapath meeting requirements
- 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
- 5. Assemble the control logic (hard part!)