Computer Architecture: Course Objectives, Study notes of Computer Architecture and Organization

An overview of the course objectives of Computer Architecture (CA). It examines the role of computer architecture in system/program performance and explores the key components of CA and the architectures of today’s processors. The document also discusses the aspects of architecture design that affect the performance of applications, how to extract maximum performance out of today’s architectures, the role of software in architecture performance, and emerging trends in CA. It is a mix of lecture notes and study notes and could be useful for a university student preparing for an exam or an assignment.

Typology: Study notes

2021/2022

Uploaded on 05/11/2023

alley
CS 211: Computer Architecture
Instructor: Prof. Bhagi Narahari
Dept. of Computer Science
Course URL: www.seas.gwu.edu/~narahari/cs211/

CS 211: Computer Architecture, Bhagi Narahari

Computer Architecture: Course Objectives
• Examine the role of computer architecture (CA) in system/program performance
    - What are the key components of CA?
    - What are the architectures of today’s processors?
    - What aspects of architecture design affect the performance of applications, and how?
    - How to extract maximum performance out of today’s CAs?
    - Role of software in architecture performance
    - What are the emerging trends in CA?
• Quantitative approach to CA

What it is not..
• What the course is not
    - Detailed exposition on hardware design
    - Semiconductor technology details
    - Case studies
    - How to assemble/buy a new computer

Perspective
• Computer architecture design is directly linked to underlying technology
    - Semiconductor
    - Compiler technology
    - Computational models
• Goal of software designers is to run an application program efficiently on the architecture
    - Compiler plays a key role
    - Interplay between architecture features and application program properties
    - Bottom line is performance of the application

Let’s look at Architecture Trends, Technologies
• Interplay between hardware and software
• Implications of technology trends on emerging architecture designs

Today
• What is Computer Architecture
    - Architecture levels and our focus
• Technology Trends
    - Summary of what has happened in CA
    - Hardware performance trends and designs
    - Impact of current trends on new designs
• Performance models
    - What to measure and how
    - Models linking hardware and software
    - Thumb rules for CA design
• Read Chapter 1

An Important Idea: what are Computers meant to do?
• We will be solving problems that are describable in English (or Greek or French or Hindi or Chinese or ...) and using a box filled with electrons and magnetism to accomplish the task.
    - This is accomplished using a system of well-defined (sometimes) transformations that have been developed over the last 50+ years.
    - As a whole the process is complex; examined individually, the steps are simple and straightforward.

Hardware vs. Software
    - Hardware: the medium to compute functions
    - Software: the functions to compute
    - A computational model connects them

The hardware/software interface: Instruction Set Architecture (ISA)
[Figure: the instruction set sits between software and hardware]
Which is easier to change/design?

The Backdrop: Users
• Who will program these machines?
    - Programmers
• What do they expect?
    - Performance
    - Correctness
• How?
    - Write HLL program and compile
• Compilation is key to performance
    - Requires hardware/software interaction at the ISA level
    - Knowledge of architecture, application, algorithm

Architecture: Introduction
• What is Computer Architecture
    - Architecture levels and our focus
• Technology Trends
    - Summary of what has happened in CA
    - Hardware performance trends and designs
    - Impact of current trends on new designs
• Performance models
    - What to measure and how
    - Models linking hardware and software
    - Thumb rules for CA design

Trends in Technology, Applications, Architectures

Performance: Original Food Chain Picture
[Figure: big fishes eating little fishes]

Processor Performance Trends
[Figure: performance (log scale, 0.1 to 1000) versus year (1965 to 2000) for supercomputers, mainframes, minicomputers, and microprocessors]

1998 Computer Food Chain: Cost/Performance
[Figure: PC, workstation, server, minicomputer, mainframe, mini-supercomputer, supercomputer, massively parallel processors]
Now who is eating whom?

Computer Architecture: Over the years
• Microprocessors today (Intel, PowerPC, etc.) are faster than the first Cray supercomputer, the CRAY-1
• ENIAC filled a room; a microprocessor today fits in the palm of a hand
• Big increase in functionality
    - In the “old” days, one had to buy a separate math co-processor for Intel PCs
    - Now, even separate special-purpose engines (graphics co-processor, network processor, etc.) are standard

Why Such Change?
• Performance
    - Technology advances: Moore’s Law
        - CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance and is progressing rapidly
    - Computer architecture advances improve the low end
        - RISC, superscalar, RAID, …
• Price: lower costs due to …
    - Simpler development, volumes, lower margins
• Function
    - Rise of networking/local interconnection technology

Memory Capacity (Single Chip DRAM)
[Figure: single-chip DRAM size (log scale, 1000 to 1000000000) versus year (1970 to 2000)]

    Year    Size (Mb)    Cycle time
    1980    0.0625       250 ns
    1983    0.25         220 ns
    1986    1            190 ns
    1989    4            165 ns
    1992    16           145 ns
    1996    64           120 ns
    2000    256          100 ns

Technology Trends summary

             Capacity          Speed (latency)
    Logic    2x in 2 years     2x in 3 years
    DRAM     4x in 3 years     2x in 10 years
    Disk     4x in 3 years     2x in 10 years

Performance Trends: Summary
• Workstation performance (measured in SPECmarks) improves roughly 50% per year (2X every 18 months)
• Improvement in cost performance estimated at 70% per year

Sequential Processor
[Figure: sequential instructions feed a processor with one execution unit]

Instruction Level Parallelism: Shrinking of the Parallel Processor
• Put multiple processors into one chip
• Execute multiple instructions in each cycle
• Move from multiple-processor architectures to multiple-issue processors
• Two classes of Instruction Level Parallel (ILP) processors
    - Superscalar processors
    - Explicitly Parallel Instruction Computers (EPIC), also known as Very Long Instruction Word (VLIW)

ILP Processors: Superscalar
[Figure: sequential instructions pass through scheduling logic inside the superscalar processor]
    - Instruction scheduling/parallelism extraction done by hardware
    - Example: Intel IA-32/Pentium

ILP Processors: EPIC/VLIW
[Figure: a serial program (C code) passes through the compiler, which produces scheduled instructions for the EPIC processor]
    - Example: Intel IA-64; Itanium

Multi-Core Processors
[Figure: sequential instructions feed a multi-core processor containing two ILP “processors”, “core 1” and “core 2”]
    - Multi-processing on chip; multiple threads, for each core
    - Example: Intel Core 2 Duo

Who is doing what: Compiler vs. Processor
[Figure: division of work between compiler and hardware. Steps: frontend and optimizer, determine dependences, determine independences, bind operations to function units, bind transports to busses, execute. Architectures compared: Superscalar, Dataflow, Indep. Arch., VLIW, TTA]
Source: B. Ramakrishna Rau and Joseph A. Fisher. Instruction-level parallel processing: History, overview, and perspective. The Journal of Supercomputing, 7(1-2):9-50, May 1993.

Importance of Compilers in ILP Architectures
• Role of compiler more important than ever
    - Optimize code
    - Analyze dependencies between instructions
    - Extract parallelism
    - Schedule code onto processors
    - EPIC processors do not have any hardware utilities for scheduling, conflict resolution, etc.; this has to be done by the compiler

Another aspect: Quantifying Power Consumption
• What else is an issue in processor/system design/performance?
• Power consumption/heat dissipation
    - Limited energy source (battery) in embedded systems (or even laptops)
    - Apple switch to Intel chips in 2005?
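
The compiler tasks listed under "Importance of Compilers in ILP Architectures" above (analyze dependencies, extract parallelism, schedule code) can be illustrated with a small sketch. This is only a toy under stated assumptions, not how any real EPIC/VLIW compiler works: every instruction takes one cycle, each register is written at most once (so only read-after-write dependences matter), and instructions are considered in program order. The instruction names, the tuple format, and the `schedule` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (name, destination register, source registers).
Instr = Tuple[str, str, List[str]]


def schedule(instrs: List[Instr], issue_width: int) -> List[List[str]]:
    """Greedily pack instructions into cycles.

    Each instruction goes into the earliest cycle that comes after all of its
    producers and still has a free issue slot. Assumes unit latency and that
    each register is written once (only read-after-write dependences).
    """
    if issue_width < 1:
        raise ValueError("issue_width must be at least 1")
    produced_in: Dict[str, int] = {}   # register -> cycle that writes it
    cycles: List[List[str]] = []       # cycles[c] = names issued in cycle c
    for name, dest, srcs in instrs:
        # Registers never written in this block are treated as live-in values.
        earliest = max((produced_in[s] + 1 for s in srcs if s in produced_in),
                       default=0)
        c = earliest
        while c < len(cycles) and len(cycles[c]) >= issue_width:
            c += 1                     # slot full, try the next cycle
        while len(cycles) <= c:
            cycles.append([])
        cycles[c].append(name)
        produced_in[dest] = c
    return cycles


# Toy program: r3 = a + b; r5 = r3 * c; r6 = c + 1
program: List[Instr] = [
    ("i1", "r1", []),            # r1 = load a
    ("i2", "r2", []),            # r2 = load b
    ("i3", "r3", ["r1", "r2"]),  # r3 = r1 + r2
    ("i4", "r4", []),            # r4 = load c
    ("i5", "r5", ["r3", "r4"]),  # r5 = r3 * r4
    ("i6", "r6", ["r4"]),        # r6 = r4 + 1
]

for width in (1, 2):
    print(width, schedule(program, width))
```

With an issue width of 1 the six instructions need six cycles; with a width of 2 the independent loads and arithmetic pair up and the same program fits in three cycles. In a superscalar processor this kind of decision is made by hardware at run time; in an EPIC/VLIW design it is made ahead of time by the compiler.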

Power Equation

    $P_{AVG} = \frac{1}{2} N_G \, f_{clk} \, C_L \, V_{DD}^2$

• $P_{AVG}$ - the average dynamic power consumed by the gates
• $N_G$ - the number of gates that transition
    - This is usually dropped from the equation
• $f_{clk}$ - the frequency of the system clock
• $C_L$ - the average capacitive load per gate
• $V_{DD}$ - the supply voltage
• For mobile devices, energy is the better metric

    $\text{Energy}_{dynamic} = \text{CapacitiveLoad} \times \text{Voltage}^2$

Define and quantify power
• For CMOS chips, the traditional dominant energy consumption has been in switching transistors, called dynamic power
• For a fixed task, slowing the clock rate (frequency switched) reduces power, but not energy
• Capacitive load is a function of the number of transistors connected to an output and of the technology, which determines the capacitance of wires and transistors
• Dropping voltage helps both, so supply voltages went from 5V to 1V
• To save energy & dynamic power, most CPUs now turn off the clock of inactive modules (e.g. Fl. Pt. Unit)

Example of quantifying power
• Suppose a 15% reduction in voltage results in a 15% reduction in frequency. What is the impact on dynamic power?

    $\text{Power}_{dynamic} = \frac{1}{2} \times \text{CapacitiveLoad} \times \text{Voltage}^2 \times \text{FrequencySwitched}$
    $\text{NewPower}_{dynamic} = \frac{1}{2} \times \text{CapacitiveLoad} \times (0.85 \times \text{Voltage})^2 \times (0.85 \times \text{FrequencySwitched})$
    $= (0.85)^3 \times \text{OldPower}_{dynamic} \approx 0.6 \times \text{OldPower}_{dynamic}$

Power
• Because leakage current flows even when a transistor is off, static power is now important too

    $\text{Power}_{static} = \text{Current}_{static} \times \text{Voltage}$

• Leakage current increases in processors with smaller transistor sizes
• Increasing the number of transistors increases power even if they are turned off
• In 2006, the goal for leakage is 25% of total power consumption; high performance designs are at 40%
• Very low power systems even gate the voltage to inactive modules to control loss due to leakage

Course Information
• Textbook: Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann
    - If you have the 3rd Edition, that will work fine.
• Course topic to book chapter mapping placed on website
• Website will contain lecture materials and homeworks, as well as references
• Homework & Project submissions will use Blackboard

Course Requirements
• Prerequisites: data structures, discrete math, computer organization
• Requirements:
    - Exams: 65%
        - Midterm and Final
    - Homework assignments: 10%
        - Work individually
    - Projects: 15%
        - Work in teams of 3 persons
        - Students *may* be permitted to substitute term paper or project for some of the projects; will have to meet me before October 1.
        - Substitute different project for assigned project
    - Class discussions & presentations
        - Readings will be assigned to teams; present and lead discussion in class
• Academic Integrity Policy
    - Absolutely no collaboration of any kind on homeworks
        - No outside sources (people or content)
    - Programming projects can be done in 2-3 person teams; no collaboration between teams

Programming projects
• Projects require programming using the SimpleScalar simulator
    - Some homeworks may also require use of this
    - Students placed into teams (3 person teams; 2 also allowed) for programming projects; team selection target date is October 1.
• www.simplescalar.com
• Objective of using SimpleScalar
    - Connect concepts covered with ‘real’ implementations and study the impact of architecture techniques on actual applications.
• Machines in Academic Center, 7th Floor Terminal Room 724.
    - Linux machines
    - Grad student (part-time TA) will cover this in office hours
• No regular TA for course

Course Outline
• Computer Organization Review: mostly self study
• Architecture challenges, design objectives, thumb rules, emerging issues
• (I) Processor architectures: Instruction level parallel (ILP) processors
    - Pipelined, superscalar, and EPIC/VLIW..vector
    - Midterm: date to be decided…plan for 8th or 9th week
• (II) Components:
    - Compiler Optimization
    - Memory Design: cache optimizations
    - I/O system
• (III) Multi-core and Multiprocessors:
    - Multiprocessor Architectures overview
    - Introduction to Multi-core computing
• Other topics time permitting

Architecture: Introduction
• What is Computer Architecture
    - Architecture levels and our focus
• Technology Trends
    - Summary of what has happened in CA
    - Hardware performance trends and designs
    - Impact of current trends on new designs
• Performance models
    - What to measure and how
    - Models linking hardware and software
    - Thumb rules for CA design
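
As a quick numerical check of the dynamic power slides earlier in these notes, the following sketch evaluates the power and energy relations with normalized units. The function names and the choice of 1.0 as the baseline for capacitive load, voltage, and frequency are illustrative assumptions; only the formulas come from the notes.

```python
def dynamic_power(cap_load: float, voltage: float, freq: float) -> float:
    """Dynamic power = 1/2 x capacitive load x voltage^2 x frequency switched."""
    return 0.5 * cap_load * voltage ** 2 * freq


def dynamic_energy(cap_load: float, voltage: float) -> float:
    """Dynamic energy = capacitive load x voltage^2 (independent of frequency)."""
    return cap_load * voltage ** 2


# Normalized baseline (assumed units): load = voltage = frequency = 1.0.
old_power = dynamic_power(1.0, 1.0, 1.0)

# 15% lower voltage together with 15% lower frequency, as in the example.
scaled_power = dynamic_power(1.0, 0.85, 0.85)
power_ratio = scaled_power / old_power            # equals 0.85 ** 3

# Slowing only the clock: power drops, energy for the task does not.
slow_clock_power_ratio = dynamic_power(1.0, 1.0, 0.5) / old_power
energy_ratio_after_voltage_drop = dynamic_energy(1.0, 0.85) / dynamic_energy(1.0, 1.0)

print(f"power after 15% V and f reduction: {power_ratio:.3f} of original")
print(f"power at half clock rate:          {slow_clock_power_ratio:.3f} of original")
print(f"energy after 15% V reduction:      {energy_ratio_after_voltage_drop:.3f} of original")
```

The first ratio is 0.85 cubed, about 0.61, which the slide rounds to 0.6. The other two lines illustrate why the notes say that lowering the clock rate reduces power but not energy, while lowering the voltage helps both.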

Recurring Theme
Performance
    - Calculating & measuring performance
    - Designing & tuning software

Performance
• How do you measure performance?
    - Throughput
        - Number of tasks completed per time unit
    - Response time/latency
        - Time taken to complete the task
    - Metric chosen depends on user community
        - System admin vs. single user submitting homework

The Bottom Line: Performance (and Cost)

    Plane               Speed       DC to Paris    Passengers
    Boeing 747          610 mph     6.5 hours      470
    BAC/Sud Concorde    1350 mph    3 hours        132

Performance?

The Bottom Line: Performance (and Cost)
• Time to run the task (Execution Time/Response Time/Latency)
    - Time to travel from DC to Paris
• Tasks per unit time (Throughput/Bandwidth)
    - Passenger miles per hour; how many passengers transported per unit time

    Plane               Speed       DC to Paris    Passengers    Throughput (pmph)
    Boeing 747          610 mph     6.5 hours      470           286,700
    BAC/Sud Concorde    1350 mph    3 hours        132           178,200

The Bottom Line: Performance (and Cost)
"X is n times faster than Y" means

    $\frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = n$

• Speed of Concorde vs. Boeing 747
• Throughput of Boeing 747 vs. Concorde

How to Model Performance
• What are we trying to model?
    - Time taken to run an application program
• Why not just use the “time” function in Unix?
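
The plane comparison above can be checked numerically. The sketch below, with hypothetical helper names, computes throughput in passenger-miles per hour and applies the "X is n times faster than Y" definition to the trip times; all figures are taken from the table in these notes.

```python
# Figures from the plane table in the notes.
planes = {
    "Boeing 747": {"speed_mph": 610, "hours_dc_paris": 6.5, "passengers": 470},
    "Concorde": {"speed_mph": 1350, "hours_dc_paris": 3.0, "passengers": 132},
}


def throughput_pmph(plane: dict) -> float:
    """Passenger-miles per hour: passengers carried times speed."""
    return plane["passengers"] * plane["speed_mph"]


def times_faster(ex_time_y: float, ex_time_x: float) -> float:
    """'X is n times faster than Y' means n = ExTime(Y) / ExTime(X)."""
    return ex_time_y / ex_time_x


b747 = planes["Boeing 747"]
concorde = planes["Concorde"]

# Latency view: the Concorde finishes one trip sooner.
latency_ratio = times_faster(b747["hours_dc_paris"], concorde["hours_dc_paris"])

# Throughput view: the 747 moves more passenger-miles per hour.
throughput_ratio = throughput_pmph(b747) / throughput_pmph(concorde)

print(f"Concorde is {latency_ratio:.2f}x faster in trip time")
print(f"Boeing 747 has {throughput_ratio:.2f}x the throughput")
```

Which plane "performs better" therefore depends on the metric: the Concorde wins on response time by a factor of about 2.2, while the 747 wins on throughput by a factor of about 1.6.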

Aspects of CPU Performance

    $\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$

    CPU = IC * CPI * Clk

Holy grail of CS 211 ☺

How to Summarize Performance
• Arithmetic mean (weighted arithmetic mean) tracks execution time: Σ(Ti)/n or Σ(Wi*Ti)
• Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time: n/Σ(1/Ri) or 1/Σ(Wi/Ri), with the weights Wi summing to 1
• Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10)

Performance
• How do you measure performance?
    - Throughput, response time/latency
    - Metric chosen depends on user community
        - System admin vs. single user submitting homework
• Models for performance
    - CPU time equation
• What to measure
    - Benchmarks: SPEC, MIBench, etc.
• Next: How to improve performance (thumb rules)

Performance: The AAA rule for designers
• Application
• Algorithm
• Architecture

Quantitative Principles of Computer Architecture Design (Thumb Rules)
• Performance equation
• Common case fast
    - Focus on improving those instructions that are frequently used
• Amdahl’s Law
    - Fraction enhanced/optimized runs faster
    - Parts of program that cannot be enhanced
• Locality
    - Spatial
    - Temporal
• Concurrency/Parallelism: overlap instruction execution

Parallelism
• Increasing throughput of a server computer via multiple processors or multiple disks
• Detailed HW design
    - Carry lookahead adders use parallelism to speed up computing sums from linear to logarithmic in the number of bits per operand
    - Multiple memory banks searched in parallel in set-associative caches
• Pipelining: overlap instruction execution to reduce the total time to complete an instruction sequence.

The Principle of Locality
• The Principle of Locality:
    - Programs access a relatively small portion of the address space at any instant of time.
• Two Different Types of Locality:
    - Temporal Locality (Locality in Time): If you use something, then you will use it again soon
        - If an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
    - Spatial Locality (Locality in Space): If you use something, then you will use something nearby
        - If an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
• Last 30 years, HW relied on locality for memory perf.
[Figure: processor (P), cache ($), and memory (MEM)]

Focus on the Common Case
• Common sense guides computer design
    - Since it’s engineering, common sense is valuable
• In making a design trade-off, favor the frequent case over the infrequent case
    - E.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it 1st
    - E.g., if a database server has 50 disks / processor, storage dependability dominates system dependability, so optimize it 1st
• The frequent case is often simpler and can be done faster than the infrequent case
    - E.g., overflow is rare when adding 2 numbers, so improve performance by optimizing the more common case of no overflow
    - May slow down overflow, but overall performance is improved by optimizing for the normal case
• What is the frequent case, and how much is performance improved by making that case faster => Amdahl’s Law

Common Case
• 90% of time spent on 10% of code
• Examples: word processor, CAD
    - 80% of program instructions executed were from 3-5% of the code
    - 90% of instructions executed were from 9-12% of the code

Amdahl’s Law: Speedup
• Application takes X time
• How to run it faster
    - Enhance/optimize a portion of it
        - Which portion?
        - Can we enhance all of it?
    - Note that we are talking of solving the enhanced part in a different way, and possibly using different (more costly) resources
• E.g.: Getting from A to B, B to C.
    - Two portions to the task: (A-B) and (B-C)

Amdahl’s Law

    $\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right]$

    $\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$

Best you could ever hope to do:

    $\text{Speedup}_{maximum} = \frac{1}{1 - \text{Fraction}_{enhanced}}$

Amdahl’s Law example
• New CPU 10X faster
• I/O bound server, so 60% of time waiting for I/O
    - Implies can “enhance”/optimize only 40% of code

    $\text{Speedup}_{overall} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} = \frac{1}{(1 - 0.4) + \frac{0.4}{10}} = \frac{1}{0.64} = 1.56$

• Apparently, it’s human nature to be attracted by 10X faster, vs. keeping in perspective it’s just 1.6X faster ☺

Architecture Design: Summary
• Design to last through trends
• Understand the principles
    - Make common case fast
    - Amdahl’s law
    - Locality
    - Parallelism/concurrency
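
The Amdahl’s Law example above can be reproduced with a few lines of Python. This is a minimal sketch; the function names `amdahl_speedup` and `amdahl_limit` are illustrative and not part of the course material.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup = 1 / ((1 - F) + F / S)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def amdahl_limit(fraction_enhanced: float) -> float:
    """Best possible speedup, 1 / (1 - F), as the enhanced part takes no time."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if fraction_enhanced == 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction_enhanced)


# Example from the notes: CPU is 10x faster, but only 40% of the time is CPU work.
overall = amdahl_speedup(0.4, 10.0)
ceiling = amdahl_limit(0.4)

print(f"overall speedup with 10x CPU: {overall:.2f}")
print(f"upper bound for this workload: {ceiling:.2f}")
```

For the I/O-bound server the overall speedup is 1/0.64, about 1.56, and even an infinitely fast CPU could not push it past 1/0.6, about 1.67, because the 60% spent waiting for I/O is untouched.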