Performance Measurement in Computer Architecture: Metrics, Trends, and Technologies, Assignments of Computer Architecture and Organization

An overview of a university course, CS/EE 6810, focusing on measuring performance, cost, and power in computer architecture. Topics include instruction level parallelism, memory hierarchy, multiprocessors, storage systems, networks, and processor technology trends. Students will learn about performance metrics, technology trends, and power consumption trends. The course includes lectures, homework assignments, and midterms.

Typology: Assignments

Pre 2010

Uploaded on 08/30/2009

koofers-user-6xz

Introduction

• Background: CS 3810 or equivalent, based on Hennessy and Patterson’s Computer Organization and Design
• Text for CS/EE 6810: Hennessy and Patterson’s Computer Architecture: A Quantitative Approach, 4th Edition
• Topics:
    Measuring performance/cost/power
    Instruction level parallelism, dynamic and static
    Memory hierarchy
    Multiprocessors
    Storage systems and networks

Organizational Issues

• Office hours: MEB 3414, by appointment
• TA: Kenneth Williams; TA office hours: TBA
• Special accommodations, add/drop policies (see class webpage)
• Class webpage and class mailing list at http://www.eng.utah.edu/~cs6810
• Grades:
    Two midterms, 25% each
    Homework assignments, 50%; you may skip one
    No tolerance for cheating

Where Are We Headed?

• Modern trends:
    Clock speed improvements are slowing
        power constraints
        already doing less work per stage
    Difficult to further optimize a single core for performance
    Multi-cores: each new processor generation will accommodate more cores

Processor Technology Trends

• Shrinking of transistor sizes: 250nm (1997) → 130nm (2002) → 65nm (2007) → 22nm
• Transistor density increases by 35% per year and die size increases by 10-20% per year… more cores!
• Transistor speed improves linearly as feature size shrinks (complex equation involving voltages, resistances, capacitances)… clock speed improvements!
• Wire delays do not scale down at the same rate as logic delays… the Pentium 4 has pipeline stages for wire delays

Technology Trends

• DRAM density increases by 40-60% per year, latency has reduced by 33% in 10 years (the memory wall!), bandwidth improves twice as fast as latency decreases
• Disk density improves by 100% every year, latency improvement similar to DRAM
• Networks: primary focus on bandwidth; 10Mb → 100Mb in 10 years; 100Mb → 1Gb in 5 years

Summarizing Performance

• Consider 25 programs from a benchmark set: how do we capture the behavior of all 25 programs with a single number?

            P1    P2    P3
    Sys-A   10     8    25
    Sys-B   12     9    20
    Sys-C    8     8    30

• Total (average) execution time
• Total (average) weighted execution time
• Average of normalized execution times
• Geometric mean of normalized execution times

AM Example

• We fixed a reference machine X and ran 4 programs A, B, C, D on it such that each program ran for 1 second
• The exact same workload (the four programs execute the same number of instructions that they did on machine X) is run on a new machine Y and the execution times for the programs are 0.8, 1.1, 0.5, and 2 seconds
• With AM of normalized execution times, we can conclude that Y is 1.1 times slower than X; perhaps not for all workloads, but definitely for one specific workload (where all programs run on the reference machine for an equal number of cycles)
• With GM, you may find inconsistencies

GM Example

          Computer-A   Computer-B   Computer-C
    P1    1 sec        10 secs      20 secs
    P2    1000 secs    100 secs     20 secs

Conclusion with GMs: (i) A = B, (ii) C is ~1.6 times faster
• For (i) to be true, P1 must occur 100 times for every occurrence of P2
• With the above assumption, (ii) is no longer true
Hence, GM can lead to inconsistencies

CPU Performance Equation

• CPU time = clock cycle time x cycles per instruction x number of instructions
• Influencing factors for each:
    clock cycle time: technology and organization
    CPI: organization and instruction set design
    instruction count: instruction set design and compiler
• CPI (cycles per instruction) or IPC (instructions per cycle) cannot be accurately estimated analytically

Measuring System CPI

• Assume that an architectural innovation only affects CPI
• For 3 programs, base CPIs: 1.2, 1.8, 2.5
  CPIs for proposed model: 1.4, 1.9, 2.3
• What is the best way to summarize performance with a single number? AM, HM, or GM of CPIs?

Example

• AM of CPI for base case = (1.2 cyc/instr + 1.8 cyc/instr + 2.5 cyc/instr) / 3
  The sum, 5.5 cycles, is the execution time if each program ran for one instruction; therefore, AM of CPI defines a workload where every program runs for an equal number of instructions
• HM of CPI = 1 / AM of IPC; defines a workload where every program runs for an equal number of cycles
• GM of CPI: warm fuzzy number, not necessarily representing any workload
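
As a rough check of the AM and HM claims above, the following Python sketch builds the two workloads explicitly, using the base and proposed CPIs from the Measuring System CPI slide. The function names and the per-program instruction and cycle counts are illustrative assumptions, not part of the course material.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# CPIs taken from the slides.
base_cpi = [1.2, 1.8, 2.5]
proposed_cpi = [1.4, 1.9, 2.3]


def cpi_equal_instructions(cpis, instrs_per_program=1_000_000):
    """Overall CPI when every program runs for the same number of instructions.

    instrs_per_program is an arbitrary, assumed count; the result does not depend on it.
    """
    total_cycles = sum(cpi * instrs_per_program for cpi in cpis)
    total_instrs = instrs_per_program * len(cpis)
    return total_cycles / total_instrs


def cpi_equal_cycles(cpis, cycles_per_program=1_000_000):
    """Overall CPI when every program runs for the same number of cycles.

    cycles_per_program is an arbitrary, assumed count; the result does not depend on it.
    """
    total_instrs = sum(cycles_per_program / cpi for cpi in cpis)
    total_cycles = cycles_per_program * len(cpis)
    return total_cycles / total_instrs


for name, cpis in (("base", base_cpi), ("proposed", proposed_cpi)):
    print(
        name,
        "AM:", round(fmean(cpis), 3),
        "HM:", round(harmonic_mean(cpis), 3),
        "GM:", round(geometric_mean(cpis), 3),
        "equal-instr workload:", round(cpi_equal_instructions(cpis), 3),
        "equal-cycle workload:", round(cpi_equal_cycles(cpis), 3),
    )
```

The equal-instruction workload should reproduce the AM of CPI and the equal-cycle workload should reproduce the HM, while the GM matches neither workload, which is the point made in the last bullet.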
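
The AM and GM examples from the summarizing-performance slides can be replayed in the same way. This is a minimal sketch: the dictionary layout, the helper names, and the encoding of the 100:1 workload mix are assumptions made for illustration; the timings are the ones given in the slides.

```python
from statistics import fmean, geometric_mean

# AM example: execution times on machine Y, normalized to reference machine X
# (every program ran for 1 second on X).
y_normalized = [0.8, 1.1, 0.5, 2.0]
am_slowdown = fmean(y_normalized)  # AM of normalized execution times

# GM example: execution times in seconds for programs P1 and P2.
times = {
    "A": {"P1": 1, "P2": 1000},
    "B": {"P1": 10, "P2": 100},
    "C": {"P1": 20, "P2": 20},
}


def gm(machine):
    """Geometric mean of the two execution times on one machine."""
    return geometric_mean(list(times[machine].values()))


def total_time(machine, mix):
    """Total execution time for a workload that runs each program mix[p] times."""
    return sum(mix[program] * t for program, t in times[machine].items())


# Workload under which A and B really are equal: P1 runs 100 times per run of P2.
mix = {"P1": 100, "P2": 1}

print("AM slowdown of Y vs X:", round(am_slowdown, 3))
for machine in times:
    print(machine, "GM:", round(gm(machine), 2), "total time:", total_time(machine, mix))
```

Under this mix A and B take the same total time, consistent with conclusion (i), but C takes longer than either, which contradicts conclusion (ii).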