
Comparing Machine Performance & Memory Hierarchy in CS161 - Prof. Harry Hsieh, Assignments of Computer Science

An insight into the design and architecture of computer systems, focusing on comparing the performance of different machines and on the concept of memory hierarchy. Students will learn about single-cycle and multicycle machines, pipelined systems, and the benefits of SRAM and DRAM. The document also discusses the importance of caches in improving system performance.

Typology: Assignments

2009/2010

Uploaded on 03/28/2010

koofers-user-3gu


CS161: Design and Architecture of Computer Systems
November 16, 2007
© 2004 Morgan Kaufmann Publishers
COMPUTER SCIENCE & ENGINEERING

Administrative Matters
• Midterm #2
  – Monday, 11/19
  – Covers Chapters 5 and 6
  – Covers Homework 4 and 5
  – 15% of your grade
• Office Hours
  – Sunday, 4-5 PM (call me at 951-827-2030 if there is a problem with the door)
  – Monday: Edward 9-10, Harry 1-2

Comparing Performance
• Machine 1
  – Single cycle; 200 ps memory, 100 ps ALU, 50 ps register access
  – Clock cycle = 200 + 50 + 100 + 200 + 50 = 600 ps; CPI = 1
  – Time per instruction = 600 ps
• Machine 2
  – Multicycle; instruction mix: 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU
  – Loads take 5 cycles, stores 4, ALU 4, branches 3, jumps 3
  – CPI = 0.25*5 + 0.1*4 + 0.52*4 + 0.11*3 + 0.02*3 = 4.12
  – Clock = max stage delay = 200 ps; time per instruction = 824 ps
• Machine 3
  – Pipelined; half of loads take 2 cycles, 25% of branches are mispredicted and hence take 2 cycles, jumps always take 2 cycles
  – CPI = 0.25*1.5 + 0.1*1 + 0.52*1 + 0.11*1.25 + 0.02*2 ≈ 1.17
  – Clock = max stage delay = 200 ps; time per instruction = 234 ps

What if we are allowed a 50 ps clock?
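The CPI and time-per-instruction arithmetic above can be sketched as a short script. This is a minimal illustration using the slide's instruction mix and cycle counts; the function and variable names are my own, not from the course materials.

```python
# Time per instruction = CPI x clock period, using the slide's numbers.
# Instruction mix: 25% loads, 10% stores, 52% ALU, 11% branches, 2% jumps.
mix = {"load": 0.25, "store": 0.10, "alu": 0.52, "branch": 0.11, "jump": 0.02}

def cpi(cycles):
    """Weighted average cycles per instruction over the mix."""
    return sum(mix[kind] * cycles[kind] for kind in mix)

# Machine 2: multicycle, 200 ps clock.
m2 = cpi({"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3})
print(round(m2, 2), round(m2 * 200, 1))   # ~4.12 CPI, ~824 ps per instruction

# Machine 3: pipelined, 200 ps clock (stall-adjusted average cycles).
m3 = cpi({"load": 1.5, "store": 1, "alu": 1, "branch": 1.25, "jump": 2})
print(round(m3, 4), round(m3 * 200, 1))   # ~1.17 CPI, ~234 ps per instruction
```

Note the pipelined CPI is 1.1725 before rounding, so the slide's 234 ps figure comes from rounding CPI to 1.17 first.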
• Machine 1
  – Single cycle; 200 ps memory, 100 ps ALU, 50 ps register access
  – Clock cycle is still 200 + 50 + 100 + 200 + 50 = 600 ps, CPI = 1
  – Time per instruction = 600 ps
• Machine 2
  – Multicycle; 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU
  – With a 50 ps clock, each 200 ps memory stage takes 4 cycles, the 100 ps ALU takes 2, and each 50 ps register stage takes 1:
  – Load takes 4+1+2+4+1 = 12 cycles
  – Store takes 4+1+2+4 = 11 cycles
  – ALU takes 4+1+2+1 = 8 cycles
  – Branch takes 4+1+2 = 7 cycles
  – Jump takes 4+1+2 = 7 cycles
  – CPI = 0.25*12 + 0.1*11 + 0.52*8 + 0.11*7 + 0.02*7 = 9.17
  – Clock = 50 ps; time per instruction = 458.5 ps

Chapter Seven
Large and Fast: Exploiting Memory Hierarchy

Memory Hierarchy
• There are also power and area considerations
• Our initial focus: two levels (upper, lower)
  – Block: the minimum unit of data transferred (usually several words), a.k.a. a line
  – Hit: the requested data is found in the upper level
  – Miss: the requested data is not in the upper level

Memory Hierarchy (continued)
• Hit rate (to a level), a.k.a. hit ratio
  – Percentage of accesses that are found in that level
  – HIGHLY application dependent
• Miss rate (to a level), a.k.a. miss ratio
  – Percentage of accesses that are not found in that level
  – Equals 1 - hit_rate
• Hit time (to a level)
  – Time it takes for a single memory access to that level
• Miss penalty (to a level)
  – Time required to fetch a block from the lower level
  – Total time when there is a miss in the top level and a hit in the lower level
  – You should think of it as "miss time"

Cache
• Two issues:
  – How do we know if a data item is in the cache?
  – If it is, how do we find it?
• Our first example: direct mapped
  – Each block has exactly one possible location in the cache ("direct mapped")
  – Many blocks in the lower level share the same location in the cache

Direct Mapped Cache
• Cache index = block address modulo the number of blocks in the cache
  – E.g., with 8 cache lines, use the lower 3 bits of the block address
• How do we know whether a block is in the cache?
  – The upper bits of the block address (e.g., 2 bits) become the tag, which is stored and compared on each access
  – A valid bit is added to each cache line for initialization

Behavior of a Direct Mapped Cache
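The modulo mapping and tag comparison above can be sketched as follows, using the slide's 8-line example; the helper name is mine, not from the course materials.

```python
# Direct-mapped placement for a cache with 8 lines:
#   index = block_address mod 8  (the lower 3 bits: which cache line)
#   tag   = the remaining upper bits (stored and compared on lookup)

NUM_LINES = 8  # 3 index bits, as in the slide's example

def split_address(block_address):
    """Return (tag, index) for a block address."""
    index = block_address % NUM_LINES    # selects the one possible line
    tag = block_address // NUM_LINES     # distinguishes blocks sharing a line
    return tag, index

# Block addresses 1, 9, and 17 all map to line 1 but carry different tags,
# which is how the cache tells them apart (plus a valid bit after reset):
for addr in (1, 9, 17):
    print(addr, split_address(addr))
```

On a lookup, the cache reads the line at `index` and declares a hit only if the line's valid bit is set and its stored tag equals the address's tag.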