CS 141 Chien: Performance Metrics and Analysis, Study notes of Computer Architecture and Organization

A collection of notes from a university computer science course, CS 141, taught by Professor Chien. The notes cover topics in performance metrics and analysis, including execution time, CPU clock cycles, instruction count, CPI, Amdahl's law, and relative performance. The professor also discusses who affects performance and how performance varies across programs and machines.


2009/2010

Uploaded on 03/28/2010


CS 141 Chien -- Jan 18, 2000

Performance II

Last Time
- Computer Architecture: definition and drivers
- Basic notions of Performance and Relative Performance

Today
- Time bases and Performance Metrics
- Amdahl's Law
- Comparing Performance

Reminders/Announcements
- Homework #1 posted to the web on 1/14/00, due 1/24/00 at the beginning of discussion section (no late homeworks)
- TA office hours: W 9-11am, F 1-3pm, both in 3337D APM
- Quiz on Thursday, as usual

Performance Metrics
- Execution Time
- Many metrics are units of work / units of time
- Work: instructions, floating point operations, polygons, *answers*, etc.
- Time: seconds, other units...

What is Time?
- CPU Execution Time = CPU clock cycles * Clock cycle time = CPU clock cycles / Clock rate
- Every conventional processor has a clock with an associated clock cycle time or clock rate.
- Every program runs in an integral number of clock cycles.
- x MHz = x million cycles/second (clock rate); 1/(x MHz) = cycle time, e.g. 1/(500 MHz) = 2 ns

How many clock cycles?
- Number of CPU cycles = Instructions executed * Average Clock Cycles per Instruction (CPI)
- or CPI = CPU clock cycles / Instruction count
- => A nice technology-scaled "architecture metric"

Amdahl's Law
- The impact of a performance improvement is limited by the fraction of execution time affected by the improvement
- Make the common case fast!!
- Execution time after improvement = (Execution Time Affected / Amount of Improvement) + Execution Time Unaffected

Comparing Computers using Metrics
- Run programs, record execution times
- How can we describe the relative performance of machines with such a metric?
  Runtime (secs): System A = 50, System B = 75

Relative Performance
- Can be confusing:
  - A runs in 12 seconds
  - B runs in 20 seconds
  - A/B = 0.6, so A is 40% faster, or 1.4X faster, or B is 40% slower
  - B/A = 1.67, so A is 67% faster, or 1.67X faster, or B is 67% slower
- Needs a precise definition

Relative Performance Statements
  Runtime (secs): System A = 50, System B = 75
- Performance Ratio (A/B): "A is 1.5 times faster than B"
    ExecTimeB / ExecTimeA = 75 / 50 = 1.5
    PerfA / PerfB = (1/ExecTimeA) / (1/ExecTimeB) = 75 / 50 = 1.5
- Performance Ratio (B/A): "B is 0.67 times the performance of A"
    ExecTimeA / ExecTimeB = 50 / 75 = 0.67

Performance
- Performance_X = 1 / Execution Time_X, for program X
- Only has meaning in the context of a program or workload
- Not very intuitive as an absolute measure

Defining Relative Performance
- Performance_X / Performance_Y = Execution Time_Y / Execution Time_X = n
- We can remove all ambiguity by always constraining n to be > 1 => machine X is n times faster than Y

Combining Performance
- If a workload consists of 10 programs, how do we combine the performance metric results?
- Incomparable results in a 10-dimensional space?
- Want a one-dimensional metric space of better-worse

Summarizing Performance
- Arithmetic mean is a consistent summary:
  - average execution time
  - uses equal measures of each application
  - contribution is based on the data sets actually run!!
  - same data mix => same performance; different mix => very different performance
- Arithmetic Mean = AM = (1/N) * Σ Time_i

Practical Performance Evaluation
- Is this how people really evaluate machines?
  - Some, but not most.
- Why not?
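The n > 1 convention above can be sketched as a small helper (the function name is illustrative; the 50 s / 75 s runtimes are from the example):

```python
def relative_performance(time_x, time_y):
    """Return (n, winner): the faster machine is n times faster, with n >= 1."""
    n = max(time_x, time_y) / min(time_x, time_y)
    winner = "X" if time_x <= time_y else "Y"
    return n, winner

n, winner = relative_performance(50.0, 75.0)
print(f"machine {winner} is {n:.2f} times faster")  # machine X is 1.50 times faster
```

Always reporting the ratio as a number greater than 1, together with which machine won, removes the "40% faster vs. 67% faster" ambiguity.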
  - Time, effort, expense, approximations of the workload and system
- Practically, there are several well-established "benchmark sets":
  - Whetstone and Dhrystone
  - Livermore Loops (Kernels)
  - Systems Performance Evaluation Cooperative (SPEC)
  - PERFECT Club, SPLASH
  - Transaction Processing Council (TPC)
  - Norton Index, WinMarks, Xstones
- Read and buy: areas, specialization, complications

Summarizing Performance (example)

  Benchmark:   1   2   3   4   5
  Machine A:   3   8  12   8   5
  Machine B:   4   5  14   4   6

- Arithmetic Mean:
  - AM(A) = 36/5 = 7.2 secs
  - AM(B) = 33/5 = 6.6 secs
- Machine A is faster on 3 of 5 applications; however, the arithmetic mean says it is inferior...
- What does this mean?

Arithmetic Mean
- Includes not only how often a machine was faster, but also how much faster.
- Results may be fragile:
  - Eliminating either of the 2 programs on which B outperforms A would change the summary outcome
- What if A = 100x B on one benchmark, but 0.8x on the rest?

Weighted Arithmetic Mean
- Benchmarks and data sets are not always available in convenient/appropriate sizes
- Arithmetic summaries are sensitive to the workload mix
- => weight each contribution by expected usage
- WAM = Σ w_i * T_i, where Σ w_i = 1

SPEC Benchmarks
- Popular performance suite
- Contains:
  - 8 "integer" programs (compilers, compression, lisp interpreter, perl, database)
  - 9 "floating point" benchmarks
  - Results are combined using the geometric mean (see textbook)
- Caveats:
  - This is a CPU benchmark (not a good test of the entire system...): no I/O, cpu-intensive programs
  - Some testing of the memory system
  - The compiler is a critical part of the test

SPEC89 Suite -- "Compiler Breakthrough"
- Measures the compiler as much as the architecture!
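The sensitivity of summaries to workload mix can be seen with a short sketch using the 5-benchmark table above; the weight vector is hypothetical:

```python
machine_a = [3, 8, 12, 8, 5]   # runtimes in seconds, benchmarks 1-5
machine_b = [4, 5, 14, 4, 6]

def am(times):
    """Arithmetic mean: (1/N) * sum of times."""
    return sum(times) / len(times)

def wam(times, weights):
    """Weighted arithmetic mean: sum of w_i * t_i, with weights summing to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * t for w, t in zip(weights, times))

print(am(machine_a), am(machine_b))  # 7.2 6.6 -> B looks better with equal weights

# Hypothetical mix that runs benchmark 1 (where A wins) 80% of the time:
weights = [0.8, 0.05, 0.05, 0.05, 0.05]
print(wam(machine_a, weights), wam(machine_b, weights))  # ~4.05 vs ~4.65 -> A wins
```

Changing only the assumed usage mix flips which machine looks better, which is exactly the fragility the slides warn about.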
[Figure: SPEC performance ratio (0-800) for the SPEC89 benchmarks gcc, espresso, spice, doduc, nasa7, li, eqntott, matrix300, fpppp, and tomcatv, comparing a baseline compiler against an enhanced compiler.]

SPEC Results -- Pentium and Pentium Pro
[Figure: SPECint versus clock rate (50-250 MHz) for the Pentium and Pentium Pro.]

Performance II Summary
- CPI metric
- Amdahl's Law
- Pitfalls of single metrics
- Summarizing performance: AM, WAM
- Pitfalls of summarizing performance
- SPEC benchmark suite
- Next Time: Instruction Sets -- the Software/Hardware Interface
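SPEC's geometric-mean summary, mentioned above, can be sketched as follows; the two per-benchmark ratios are hypothetical performance ratios against a reference machine:

```python
def geometric_mean(ratios):
    """Geometric mean: the n-th root of the product of n performance ratios."""
    product = 1.0
    for r in ratios:
        product *= r
    return product ** (1.0 / len(ratios))

ratios = [2.0, 8.0]            # hypothetical per-benchmark SPEC-style ratios
print(geometric_mean(ratios))  # 4.0 -- sqrt(2 * 8)
```

Unlike the arithmetic mean of ratios, the geometric mean produces the same machine ranking no matter which machine is chosen as the reference, which is one reason SPEC uses it.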