CS161: Design and Architecture of Computer Systems
November 16, 2007
2004 Morgan Kaufmann Publishers, COMPUTER SCIENCE & ENGINEERING

Administrative Matters
• Midterm #2
– Monday, 11/19
– Covers Chapters 5 and 6
– Covers Homeworks 4 and 5
– 15% of your grade
• Office Hours
– Sunday, 4-5 PM
• Call me at 951-827-2030 if there is a problem with the door
– Monday
• Edward: 9-10
• Harry: 1-2

Comparing Performance
• Machine 1
– Single cycle; 200 ps memory, 100 ps ALU, 50 ps register access
– Clock cycle = 200 + 50 + 100 + 200 + 50 = 600 ps, CPI = 1
– Time per instruction = 600 ps
• Machine 2
– Multicycle; 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU; 5-cycle loads, 4-cycle stores, 4-cycle ALU ops, 3-cycle branches, and 3-cycle jumps
– CPI = 0.25*5 + 0.1*4 + 0.52*4 + 0.11*3 + 0.02*3 = 4.12
– Clock = longest stage = 200 ps, time per instruction = 824 ps
• Machine 3
– Pipelined; half of loads take 2 cycles, 25% of branches are mispredicted and hence take 2 cycles, jumps always take 2 cycles
– CPI = 0.25*1.5 + 0.1*1 + 0.52*1 + 0.11*1.25 + 0.02*2 ≈ 1.17
– Clock = longest stage = 200 ps, time per instruction ≈ 234 ps

What if we are allowed a 50 ps clock?
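The CPI and time-per-instruction arithmetic for the three machines above can be checked with a short script (the variable names and dictionary layout are mine, not from the slides):

```python
# Instruction mix from the slide: fraction of each instruction class.
mix = {"load": 0.25, "store": 0.10, "alu": 0.52, "branch": 0.11, "jump": 0.02}

# Machine 2 (multicycle): cycles per instruction class.
cycles_m2 = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}
cpi_m2 = sum(mix[k] * cycles_m2[k] for k in mix)   # ≈ 4.12

# Machine 3 (pipelined): average cycles per class after stalls.
# Half of loads take 2 cycles -> 1.5 average; 25% of branches
# mispredict and take 2 cycles -> 1.25 average; jumps always take 2.
cycles_m3 = {"load": 1.5, "store": 1, "alu": 1, "branch": 1.25, "jump": 2}
cpi_m3 = sum(mix[k] * cycles_m3[k] for k in mix)   # ≈ 1.1725

clock_ps = 200  # clock period = longest stage
print(cpi_m2, cpi_m2 * clock_ps)   # Machine 2: CPI and ps per instruction
print(cpi_m3, cpi_m3 * clock_ps)   # Machine 3: CPI and ps per instruction
```

Note the slide rounds Machine 3 to CPI = 1.17 and 234 ps; the exact values are 1.1725 and 234.5 ps.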
• Machine 1
– Single cycle; 200 ps memory, 100 ps ALU, 50 ps register access
– Clock cycle = 200 + 50 + 100 + 200 + 50 = 600 ps, CPI = 1
– Time per instruction = 600 ps
• Machine 2
– Multicycle; 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU
– Load takes 4 + 1 + 2 + 4 + 1 = 12 cycles
– Store takes 4 + 1 + 2 + 4 = 11 cycles
– ALU takes 4 + 1 + 2 + 1 = 8 cycles
– Branch takes 4 + 1 + 2 = 7 cycles
– Jump takes 4 + 1 + 2 = 7 cycles
– CPI = 0.25*12 + 0.1*11 + 0.52*8 + 0.11*7 + 0.02*7 = 9.17
– Clock = 50 ps, time per instruction = 458.5 ps

Chapter Seven
Large and Fast: Exploiting Memory Hierarchy

Memory Hierarchy
• There are also power and area considerations
• Our initial focus: two levels (upper, lower)
– Block: the minimum unit of data (usually several words), a.k.a. a line
– Hit: the requested data is found in the upper level
– Miss: the requested data is not found in the upper level

Memory Hierarchy (continued)
• Hit rate (to a level), a.k.a. hit ratio
– Percentage of accesses that are found in that level
– HIGHLY application dependent
• Miss rate (to a level), a.k.a. miss ratio
– Percentage of accesses that are not found in that level
– Equals 1 - hit rate
• Hit time (to a level)
– Time it takes for a single memory access to that level
• Miss penalty (to a level)
– Time required to fetch a block from the lower level
– Total time when there is a miss in the top level and a hit in the lower level
– You should think of it as "miss time"

Cache
• Two issues:
– How do we know if a data item is in the cache?
– If it is, how do we find it?
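The hit time, miss rate, and miss penalty defined above combine into a single figure of merit, the average memory access time (AMAT). The formula is standard but not stated on the slides, and the numbers below are illustrative only:

```python
# Average memory access time: every access pays the hit time, and the
# fraction of accesses that miss additionally pays the miss penalty
# ("miss time" in the slide's terminology, measured beyond the hit time).
hit_time_ps = 200         # time for an access that hits the upper level
miss_rate = 0.05          # fraction of misses (= 1 - hit rate)
miss_penalty_ps = 4000    # time to fetch a block from the lower level

amat_ps = hit_time_ps + miss_rate * miss_penalty_ps
print(amat_ps)  # average time per memory access, in ps
```

With these example numbers, a 5% miss rate doubles the effective access time from 200 ps to 400 ps, which is why even small miss-rate improvements matter.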
• Our first example: a "direct-mapped" cache
– Each block has exactly one possible location in the cache
– Many blocks in the lower level share the same location in the cache

Direct Mapped Cache
• The cache location is
– the block address modulo the number of blocks in the cache
– e.g., with 8 cache lines, take the lower 3 bits of the block address

Direct Mapped Cache
• How do we know it is in the cache?
– The upper bits of the block address (e.g., 2 bits) become the tag, which is compared on each lookup
– A valid bit is added to each cache line for initialization

Behavior of a Direct Mapped Cache
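The index/tag split described above can be sketched in a few lines of Python (the cache size and example address are illustrative; the function name is mine):

```python
# Direct-mapped lookup arithmetic: with 8 cache lines, the index is the
# low 3 bits of the block address and the remaining upper bits are the tag.
NUM_LINES = 8                              # must be a power of two here
INDEX_BITS = NUM_LINES.bit_length() - 1    # log2(8) = 3

def split_block_address(block_addr):
    index = block_addr % NUM_LINES   # equivalently block_addr & 0b111
    tag = block_addr >> INDEX_BITS   # everything above the index bits
    return index, tag

# Example: 5-bit block address 0b10110 -> index 0b110 = 6, tag 0b10 = 2.
print(split_block_address(0b10110))  # (6, 2)
```

On a lookup, the hardware reads line 6, and the access is a hit only if that line's valid bit is set and its stored tag equals 2.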