



# N-Way Set Associative Cache (1/4)

#### • Memory address fields:

- Tag: same as before
- Offset: same as before
- Index: points us to the correct "row" (called a <u>set</u> in this case)

#### • So what's the difference?

CS61C L34 Caches IV (3)


- each set contains multiple blocks
- once we've found the correct set, we must compare with all tags in that set to find our data

# N-Way Set Associative Cache (3/4)

#### • Given memory address:

- Find the correct set using the Index value.
- Compare the Tag with all Tag values in that set.
- If a match occurs, it's a hit; otherwise, it's a miss.
- Finally, use the Offset field as usual to find the desired data within the block.
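
The lookup steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a hardware description: the `Line` class, the `split_address` and `lookup` helpers, and the toy geometry (4 sets, 2 ways, 4-byte blocks) are assumptions made up for this example. The loop checks the ways one after another, whereas real hardware performs the N tag comparisons in parallel.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One block slot in a set (hypothetical layout for illustration)."""
    valid: bool = False
    tag: int = 0
    data: bytes = b""


def split_address(addr, offset_bits, index_bits):
    """Split an address into its (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


def lookup(sets, addr, offset_bits, index_bits):
    """Return the addressed byte on a hit, or None on a miss."""
    tag, index, offset = split_address(addr, offset_bits, index_bits)
    for line in sets[index]:              # step 1: Index selects the set
        if line.valid and line.tag == tag:  # steps 2-3: compare every Tag
            return line.data[offset]      # step 4: Offset selects the byte
    return None


# Assumed toy cache: 4 sets, 2 ways, 4-byte blocks.
OFFSET_BITS, INDEX_BITS = 2, 2
sets = [[Line(), Line()] for _ in range(4)]
sets[1][1] = Line(valid=True, tag=11, data=bytes([10, 20, 30, 40]))

hit_addr = (11 << 4) | (1 << 2) | 2   # tag 11, index 1, offset 2
miss_addr = (5 << 4) | (1 << 2) | 2   # same set, different tag
print(lookup(sets, hit_addr, OFFSET_BITS, INDEX_BITS))   # 30
print(lookup(sets, miss_addr, OFFSET_BITS, INDEX_BITS))  # None
```

Note that the matching block sits in way 1 of set 1: the lookup has to check every way in the set, which is exactly the difference from a direct-mapped cache.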

# N-Way Set Associative Cache (2/4)

#### • Summary:


Garcia © UCB

- cache is direct-mapped w/respect to sets
- each set is fully associative
- basically N direct-mapped caches working in parallel: each has its own valid bit and data

# N-Way Set Associative Cache (4/4)

#### • What's so great about this?

- even a 2-way set assoc cache avoids a lot of conflict misses
- hardware cost isn't that bad: only need N comparators
- In fact, for a cache with M blocks,
  - it's Direct-Mapped if it's 1-way set assoc
  - it's Fully Assoc if it's M-way set assoc
  - so these two are just special cases of the more general set associative design
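
To see the two special cases concretely, here is a small sketch that computes the number of sets and the address field widths for a cache with M blocks and N ways. The function name `cache_geometry`, the 32-bit address width, and the 8-block, 16-byte-block example are assumptions chosen for illustration; sizes are assumed to be powers of two.

```python
def cache_geometry(num_blocks, ways, block_bytes, addr_bits=32):
    """Return (num_sets, tag_bits, index_bits, offset_bits).

    Assumes num_blocks, ways and block_bytes are powers of two.
    """
    for n in (num_blocks, ways, block_bytes):
        assert n > 0 and n & (n - 1) == 0, "sizes must be powers of two"
    assert ways <= num_blocks
    num_sets = num_blocks // ways
    offset_bits = block_bytes.bit_length() - 1   # log2(block size)
    index_bits = num_sets.bit_length() - 1       # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return num_sets, tag_bits, index_bits, offset_bits


M = 8  # blocks in the cache (assumed)
for n_ways in (1, 2, M):
    print(n_ways, cache_geometry(M, n_ways, block_bytes=16))
# 1 (8, 25, 3, 4)   direct-mapped: one block per set
# 2 (4, 26, 2, 4)   2-way set associative
# 8 (1, 28, 0, 4)   fully associative: a single set, no index bits
```

As N grows, index bits move into the tag: with N = M the Index field disappears entirely, which is the fully associative layout.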






# Block Replacement Policy: LRU

#### • LRU (Least Recently Used)


- Idea: evict (cache out) the block which has been accessed (read or write) least recently
- Pro: temporal locality ⇒ recent past use implies likely future use: in fact, this is a very effective policy
- Con: with 2-way set assoc, easy to keep track (one LRU bit); with 4-way or greater, requires complicated hardware and much time to keep track of this
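
The "one LRU bit" case for a 2-way set can be sketched as follows. The `TwoWaySet` class and the integer tags in the usage lines are hypothetical, made up for this illustration; the single `lru` field plays the role of the LRU bit by naming the way that was used least recently.

```python
class TwoWaySet:
    """One set of a 2-way set associative cache with a single LRU bit."""

    def __init__(self):
        self.tags = [None, None]  # None marks an invalid (empty) way
        self.lru = 0              # the LRU bit: which way to replace next

    def access(self, tag):
        """Access a tag; return True on a hit, False on a miss."""
        if tag in self.tags:
            way = self.tags.index(tag)
            hit = True
        else:
            way = self.lru        # miss: replace the least recently used way
            self.tags[way] = tag
            hit = False
        self.lru = 1 - way        # the other way is now least recently used
        return hit


s = TwoWaySet()
print([s.access(t) for t in (1, 2, 1, 3, 2)])
# [False, False, True, False, False]
print(s.tags)  # [2, 3]
```

In the trace, the hit on tag 1 makes tag 2 the least recently used block, so tag 3 replaces it; the final access to tag 2 then misses and replaces tag 1. With 4 or more ways a single bit no longer suffices, since the full recency order of the ways has to be tracked.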



# Block Replacement Example

#### • We have a 2-way set associative















- L1 Hit Time = 1 cycle
- L1 Miss rate = 5%
- L2 Hit Time = 5 cycles
- L2 Miss rate = 15% (% of L1 misses that also miss in L2)
- L2 Miss Penalty = 200 cycles
- L1 miss penalty = 5 + 0.15 × 200 = 35 cycles
- Avg mem access time = 1 + 0.05 × 35 = <u>2.75 cycles</u>
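
The arithmetic above applies the same formula twice, once per cache level: access time = hit time + miss rate × miss penalty. A minimal sketch, where the helper name `access_time` is an assumption for this example and the numbers are the ones from the list above:

```python
def access_time(hit_time, miss_rate, miss_penalty):
    """Average access time in cycles for one cache level."""
    return hit_time + miss_rate * miss_penalty


# L2 as seen from L1: an L1 miss costs the L2 hit time plus,
# for the 15% of L1 misses that also miss in L2, the 200-cycle penalty.
l1_miss_penalty = access_time(hit_time=5, miss_rate=0.15, miss_penalty=200)

# L1 as seen from the processor.
amat = access_time(hit_time=1, miss_rate=0.05, miss_penalty=l1_miss_penalty)

print(f"L1 miss penalty = {l1_miss_penalty:.2f} cycles")   # 35.00
print(f"Avg mem access time = {amat:.2f} cycles")          # 2.75
```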







# Peer Instruction

1. In the last 10 years, the gap between the access time of DRAMs & the cycle time of processors has decreased. (I.e., is closing)
2. A 2-way set-associative cache can be outperformed by a direct-mapped cache.
3. Larger block size ⇒ lower miss rate

|    | ABC |
|----|-----|
| 1: | FFF |
| 2: | FFT |
| 3: | FTF |
| 4: | FTT |
| 5: | TFF |
| 6: | TFT |
| 7: | TTF |
| 8: | TTT |

#### • Cache design choices:

- size of cache: speed v. capacity
- direct-mapped v. associative
- for N-way set assoc: choice of N
- block replacement policy
- 2nd level cache?
- 3rd level cache?
- Write through v. write back?

#### • Use performance model to pick between choices, depending on programs, technology, budget, ...