CS 61C: Great Ideas in Computer Architecture Course Summary & Review, Lecture notes of Computer Architecture and Organization

University of California - Berkeley Computer Architecture and Organization

This document summarizes and reviews CS 61C, Great Ideas in Computer Architecture. It covers topics such as number representation, higher-level language, assembly language, logic circuit description, Moore's Law, technology trends, memory, memory management, and caching details. It is likely to be useful as study notes, lecture notes, a summary, or exam review material for university students studying computer science or computer engineering.

Typology: Lecture notes

2021/2022

Uploaded on 05/11/2023

heathl 🇺🇸


Instructor: Justin Hsia 8/06/2012 Summer 2012 ‐‐ Lecture #28

CS 61C: Great Ideas in Computer Architecture
Course Summary & Review

Agenda
• Course Summary
• Administrivia
• What’s Next?
• Acknowledgements

Number Representation
• Anything can be represented as a number!
  – With n digits in base B, can represent B^n things
• IEC (vs. SI) prefixes (2^10 ≈ 10^3)
• Signed and unsigned integers
  – Addition, subtraction, overflow, sign extension
  – Two’s complement (better than 1’s and sign&mag)
• Floating point (sign, biased exp, significand)
  – Inf, NaN, 0, denorms
  – Precision and truncation

Higher‐Level Language (HLL)
• We studied C because it exposes more of the hardware (particularly memory)
  – Compiled language is machine‐dependent
• Arrays and strings
  – Don’t run off the end or forget null terminator
• Pointers hold addresses, used to pass by ref
  – Pointer arithmetic
  – Array vs. pointer syntax
• Structs are padded collections of variables

Assembly Language
• Close to the level that a machine understands
  – ISA in human‐readable format
  – TAL vs. MAL (pseudo‐instructions)
• RISC vs. CISC and effects
• MIPS Instruction Formats: R, I, J
  – Meaning and limitations of the fields
  – Relative (branch) vs. absolute (jump) addressing
  – Register conventions (saved/volatile; caller/callee)
• Assembler: instr translation, sym/rel tables

Logic Circuit Description
• Build Synchronous Digital Systems out of combinational and sequential logic
• Equivalence between Circuit Diagrams, Truth Tables, and Boolean Expressions
  – Can convert between all representations
• Boolean algebra allows for circuit simplification (Karnaugh maps, too)
• FSMs built with registers and CL
• In reality, everything is wires and transistors
  – Voltage‐controlled switches (1: high, 0: low)

Great Idea #2: Moore’s Law
Predicts: Transistor count per chip doubles every 2 years
Gordon Moore, Intel Cofounder, B.S. Cal 1950
[Figure: number of transistors on an integrated circuit (IC) vs. year]

Technology Trends
• Dynamic power = C × V^2 × f
  – Capacitance, voltage, switching frequency
• In WSC: Power Usage Effectiveness (PUE) = Total building power / IT equipment power
• Technology growth is slowing; processors have hit a power wall
  – Everywhere: transistor density, CPU speed, disk and memory capacity
  – Performance improvements now coming from parallelism and multicore processors

Memory
• Programmer treats as one long array
  – You know that this is just an illusion (VM)!
• Memory is byte‐addressed
  – Most data (including instructions) in words and word‐aligned, so all word addresses are multiples of 4 (end in 0b00)
• Multicore systems use shared memory
  – Synchronization/cache coherence necessary

Memory Management (Lecture #10, 7/03/2012)
• Program’s address space contains four regions:
  – Stack: local variables, grows downward
  – Heap: space requested for pointers via malloc(); resizes dynamically, grows upward
  – Static Data: global and static variables, does not grow or shrink
  – Code: loaded when program starts, does not change size
[Figure: address space from ~0hex to ~FFFF FFFFhex: code, static data, heap, stack]

• Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology

Typical Memory Hierarchy
Level                              Speed (cycles)   Size (bytes)
RegFile                            ½’s              100’s
Instr Cache / Data Cache           1’s              10K’s
Second Level Cache (SRAM)          10’s             M’s
Main Memory (DRAM)                 100’s            G’s
Secondary Memory (Disk or Flash)   1,000,000’s      T’s
Cost/bit: highest (top row) to lowest (bottom row)
[Figure labels also include: On‐Chip Components, Control, Datapath, I TLB, D TLB]

Caching Details (2/2)
• Cache parameters affect performance
  – Block size, cache size, set associativity
  – Write‐back/write‐through policies
  – Write allocate/no‐write allocate policies
  – Block replacement policy (Least Recently Used)
• Source of cache misses: The 3 C’s
  – Compulsory, capacity, conflict
• Multilevel caches reduce miss penalty

Virtual Memory Details (1/3)
• Give main memory effective size of disk without major penalty to performance
  – Move data in contiguous pages from disk to main memory
  – Assumption is that memory is small compared to both disk and virtual address space (or many processes)
• Also provide protection for multiple processes
  – Requires a lot of work by operating system

Virtual Memory Details (2/3) (Lecture #25, 7/31/2012)
• Paging requires address translation
  – Can run programs larger than main memory
  – Hides variable machine configurations (RAM/HDD)
  – Solves fragmentation problem
• Address mappings stored in page tables in memory
  – Additional memory access mitigated with TLB, which is a cache for page table
  – Management bits: Valid, Dirty, Ref, Access Rights

Great Idea #4: Parallelism (Lecture #1, 6/18/2012)
Leverage Parallelism & Achieve High Performance
• Parallel Requests: assigned to computer, e.g. search “Katz”
• Parallel Threads: assigned to core, e.g. lookup, ads
• Parallel Instructions: > 1 instruction @ one time, e.g. 5 pipelined instructions
• Parallel Data: > 1 data item @ one time, e.g. add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time
[Figure: software and hardware levels, with labels Smart Phone, Warehouse Scale Computer, Computer (Core, Memory, Input/Output), Core (Cache Memory, Instruction Unit(s), Functional Unit(s): A0+B0, A1+B1, A2+B2, A3+B3), Logic Gates]

Types of Parallelism (1/4)
• Request‐Level Parallelism (RLP)
  – Handling many requests per second (e.g. web search)
• Data‐Level Parallelism (DLP)
  – Operate on many pieces of data at once
  – SIMD: at the level of single instructions
  – MapReduce: at the level of programs (split into map and reduce)

Types of Parallelism (2/4)
• Thread‐Level Parallelism (TLP)
  – Have many processors, run either different programs or different parts of same program at same time
  – If same program, need to deal with shared memory (cache coherence and synchronization primitives to prevent data races)
  – Splitting up work properly is difficult!
    • Shared vs. private variables in OpenMP
    • Often requires re‐designing your algorithm

Great Idea #5: Performance Measurement and Improvement
• Allows direct comparisons of architectures and quantification of improvements
• It is all about time to finish (latency)
  – Includes both setup and execution.
• Match application and hardware to exploit:
  – Locality
  – Parallelism
  – Special hardware features, like specialized instructions (e.g. matrix manipulation)

Performance Measurements
• Execution time (latency) and work per time (throughput)
  – CPU Time = Instructions × CPI × Clock Cycle Time
• Memory Access:
  – AMAT, CPI_stall use hit time, miss rate, miss penalty
  – Definitions recursive back to last level in hierarchy
• Amdahl’s Law
  – Speedup = 1 / [ (1‐F) + F/S ]
  – Why we almost never get max possible speedup

Performance Programming
• Key challenge: Craft parallel programs that scale well (weak/strong scaling)
  – Scheduling, load balancing, time for synchronization, overhead for communication
• Some techniques:
  – Register/Cache Blocking
  – Data Parallelism & Loop Unrolling
  – Multithreading

Redundant Arrays of Inexpensive Disks
• Possible to simulate behavior of single larger disk with an array of smaller disks
  – Cheaper, higher bandwidth, more resistant to failure
• RAID 0 – No redundancy
• RAID 1 – Mirroring for redundancy
• RAID 2 – Bit‐level striping
• RAID 3 – Parity disks
• RAID 4 – Block‐level striping with parity disk
• RAID 5 – Striped parity

Error Detection & Correction
• Even parity using XOR
• Hamming Distance
  – Distance 2 can detect 1‐bit error
  – Distance 3 can correct 1‐bit error
  – Distance 4 can correct 1‐bit error and detect 2‐bit error
• Hamming ECC
  – Introduce extra parity bits (one per group)
  – Sum of group errors indicates corrupted bit
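The Error Detection & Correction ideas above (even parity via XOR, Hamming distance, and a Hamming ECC where the sum of the failing parity groups points at the corrupted bit) can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: it assumes the common Hamming(7,4) layout with parity bits at positions 1, 2 and 4 (the slide does not fix a code size), and all function names are hypothetical.

```python
from functools import reduce
from operator import xor


def even_parity_bit(bits):
    """Return the bit that makes the total number of 1s even (XOR of all bits)."""
    return reduce(xor, bits, 0)


def hamming_distance(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("bit lists must have equal length")
    return sum(x != y for x, y in zip(a, b))


def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7).

    Assumed layout: parity bits at positions 1, 2 and 4; each parity bit
    covers the positions whose index has that bit set (its "group").
    """
    d1, d2, d3, d4 = data
    code = [0] * 8  # index 0 unused so list indices match bit positions
    code[3], code[5], code[6], code[7] = d1, d2, d3, d4
    for p in (1, 2, 4):
        group = [code[i] for i in range(1, 8) if i & p]
        code[p] = even_parity_bit(group)  # code[p] is still 0 here
    return code[1:]


def hamming74_correct(codeword):
    """Return (corrected codeword, error position); position 0 means no error."""
    code = [0] + list(codeword)
    syndrome = 0
    for p in (1, 2, 4):
        group = [code[i] for i in range(1, 8) if i & p]
        if even_parity_bit(group) != 0:  # this group fails its parity check
            syndrome += p                # sum of failing groups = bad position
    if syndrome:
        code[syndrome] ^= 1              # flip the corrupted bit back
    return code[1:], syndrome
```

For example, encoding the data bits 1, 0, 1, 1 and then flipping position 5 of the codeword makes parity groups 1 and 4 fail, so the syndrome is 1 + 4 = 5 and the flipped bit is restored. A single-error-correcting code like this has minimum distance 3, matching the "distance 3 can correct 1‐bit error" bullet.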
Agenda
• Course Summary
• Administrivia
• What’s Next?
• Acknowledgements

Project 2: sgemm‐small.c
[Figure: histogram of average speed on 36 x 36 matrices; x‐axis Gflops/s, y‐axis frequency. Mean: 11.1, Std Dev: 2.2]

Project 2: sgemm‐openmp.c
[Figure: histogram of average speed on large matrices, m=[1000,10000] by n=[32,100]; x‐axis Gflops/s, y‐axis frequency. Mean: 51.3, Std Dev: 17.5]

Project 2 Fastest Submissions
• sgemm‐small (small):
  1) 12.4 Gflop/s Harkiran Bolaria, Andrew Cai
  2) 11.0 Gflop/s Yun Jae Cho, Duc Nguyen
  3) 10.4 Gflop/s Shawn Park, Tananun Songdechakraiwut
• sgemm‐small (36×36):
  1) 16.4 Gflop/s Luis De Pombo, Steven Roger
  2) 16.4 Gflop/s Bryan Cote, Myron Chen
  3) 16.1 Gflop/s Chris Buonocore, Ali Jishi

What’s Next?
• Take classes from great teachers! (teacher > class)
  – Distinguished Teaching Award (very hard to get)
  – HKN Course evaluations (≥ 6 is very good)
  – Upcoming instructors for classes: (CS / EE)
• Classes related to CS 61C
  – CS169 Software Engineering (for SaaS, Fox/Patterson Fall 12)
  – CS194‐15 Engineering Parallel Software
  – CS164 Programming Languages and Compilers
  – CS162 Operating Systems and Systems Programming
  – CS152 Computer Architecture and Engineering (Sp13)
  – CS150 Components and Design Techniques for Digital Systems

Opportunities in Teaching
• Interest in joining the CS staff?
  – Applies for CS 10, 61A, 61B, 61C
  – Usual path: Lab Assistant → Reader → TA
  – Also: Self‐Paced Center Tutor
• Requirements:
  – Interest in teaching
  – Stricter grade requirements based on where you want to jump in
• Applying:
  – Application form (for TA, Reader, or Lab Assistant)
  – Doesn’t hurt to e‐mail professor as well

Opportunities at Cal
• Why are we a top university in the WORLD?
  – Research, research, research!
  – Classes are just the tip of the iceberg
  – Whether you want to go to grad school or industry, you need someone to vouch for you
  – Won’t know if you like it or not until you try
• Find out what you like, do lots of web research (read published papers), hit OH of professor, show enthusiasm & initiative
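
As a closing worked example for the Performance Measurements formulas reviewed in this lecture (CPU time, AMAT, Amdahl’s Law), the following is a minimal Python sketch. The function names and sample numbers are illustrative assumptions rather than anything from the slides, and the multilevel AMAT assumes each level is described by its own hit time and local miss rate.

```python
def cpu_time(instructions, cpi, clock_cycle_time):
    """CPU Time = Instructions x CPI x Clock Cycle Time (in seconds)."""
    return instructions * cpi * clock_cycle_time


def amat(levels, last_level_time):
    """Average memory access time, defined recursively down the hierarchy.

    levels: list of (hit_time, local_miss_rate), first-level cache first.
    last_level_time: access time of the final level (e.g. main memory).
    AMAT(level) = hit_time + miss_rate * AMAT(next level).
    """
    if not levels:
        return last_level_time
    hit_time, miss_rate = levels[0]
    return hit_time + miss_rate * amat(levels[1:], last_level_time)


def amdahl_speedup(f, s):
    """Speedup = 1 / [(1 - F) + F / S], where fraction F is sped up by factor S."""
    return 1.0 / ((1.0 - f) + f / s)
```

With an assumed 1-cycle L1 hit and 10% miss rate, a 10-cycle L2 hit and 20% local miss rate, and a 100-cycle main memory, `amat([(1, 0.1), (10, 0.2)], 100)` works out to 1 + 0.1 × (10 + 0.2 × 100) = 4 cycles. For Amdahl’s Law, speeding up 90% of a program by 10× gives only about 5.3× overall, and no value of S can push the speedup past 1 / (1 − F) = 10×, which is why the maximum possible speedup is almost never reached.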
