Midterm Exam for Computer Architecture | CS 2410, Exams of Computer Architecture and Organization

Material Type: Exam; Class: COMPUTER ARCHITECTURE; Subject: Computer Science; University: University of Pittsburgh; Term: Fall 2004;

Typology: Exams

Pre 2010

Uploaded on 09/02/2009

koofers-user-5v4

Name ___________________________________________ 1 of 11

CS 2410 Computer Architecture
Fall 2004
Midterm Exam

Directions: This exam is closed book; put all books and notes under your desk on the floor. There are five questions with several subquestions. All questions are marked with their point value. There should be plenty of workspace provided in the exam booklet, but if you need extra pages, you may use a blank piece of paper. Be sure to show all work, otherwise partial credit will not be given.

STATE ALL OF YOUR ASSUMPTIONS

There are 11 pages to the exam. Do as much as you can. But don’t panic: several of the pages are blank for work space or have long problem descriptions. All questions are weighted the same: 20 points each. The point value for each subquestion is indicated. Write your name on each exam page!

Topics covered by each question:
Question 1 - Quick Answer
Question 2 - Performance Evaluation
Question 3 - Branch Prediction
Question 4 - Tomasulo’s Algorithm and Speculative Execution
Question 5 - Dynamic Scheduling

Name ___________________________________________ 2 of 11

Question 1 - Instruction Set Design, Static Scheduling, Classic 5-stage Pipelines

2 points each. Circle “True” or “False”. A question should be strictly true when you select “True”.

1. Software pipelining and loop unrolling improve performance in similar ways. Hence, it is rarely beneficial to apply both to the same loop. True False

2. It is better (performance-wise) to execute the controlling branch when its cost is less than the cost of executing the predicated code sequence. True False

3. Annulling predicated instructions early in the pipeline works better than late annulment because the predicates are usually available when an instruction issues to a reservation station. True False

4. Boosting is essentially a form of predication because the branch outcomes can be considered to be predicates. Boosting and predication have the same restrictions on moving instructions. True False

5. A statically scheduled multiple-issue superscalar processor is essentially the same as a strict VLIW, except the binary format of instruction operations is different. True False

6. Assuming there are no stores and data dependences involving registers are not violated, the compiler can interchange the order of loads and speculatively move them across branches without hardware support. True False

7. If-conversion is a transformation that turns a piece of code into a predicated version. Software pipelining should be done before if-conversion. True False

8. In a classic 5-stage RISC pipeline (Fetch, Decode, Execute, Memory, Writeback), RAW and WAR hazards are possible, while WAW hazards are not. True False

9. In a classic 5-stage RISC pipeline (Fetch, Decode, Execute, Memory, Writeback), store instructions need forwarding paths to only the ALU (in Execute). True False

10. In a dynamically scheduled processor without speculation that uses Tomasulo’s algorithm, reservation stations help to avoid write-after-write hazards. True False

Name ___________________________________________ 5 of 11

(b) 14 points. Consider the following pseudo-code:

Assume that the above code is inside of a loop and the loop executes 9 iterations. On each iteration, the value of x changes. The values are:

Suppose a gselect branch predictor is used with the properties:
• The predictor uses 1 address bit and 1 history bit.
• An address bit of 0 is branch B1 and an address bit of 1 is branch B2.
• The history bit is kept in a 1-bit register GHR. A 0 means the last branch was not taken and 1 means the last branch was taken.
• The GHR is initially 0.
• Each entry in the BHT is 1-bit. 0 is predict not-taken (NT) and 1 is predict taken (T).
• All entries in the BHT are initialized to 0 (NT).
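The predictor properties listed above can be sketched in Python as a way to check a hand-filled table. This is a minimal sketch under stated assumptions, not an official solution: it assumes the two branches are sequential (B2 is evaluated on every iteration, not nested inside B1), that the BHT is indexed by the address bit together with the GHR value seen just before the branch, that each branch updates the GHR as soon as it resolves, and that the GHR value recorded for an iteration is the one left after B2. The x values are the nine listed in the question; the names `simulate_gselect`, `X_VALUES`, and the row keys are hypothetical.

```python
# x values for the 9 loop iterations, as listed in the question.
X_VALUES = [8, 9, 10, 11, 12, 20, 29, 30, 31]


def simulate_gselect(xs):
    """Simulate a gselect predictor with 1 address bit and 1 history bit.

    Assumptions (not stated explicitly in the exam): B1 and B2 both execute
    on every iteration, and the GHR is updated right after each branch.
    """
    # bht[address_bit][history_bit]; 1-bit entries, all initialized to 0 (NT).
    bht = [[0, 0], [0, 0]]
    ghr = 0  # global history register, initially 0
    rows = []
    for x in xs:
        # B1 is taken when x is even; B2 is taken when x is a multiple of 10.
        outcomes = [x % 2 == 0, x % 10 == 0]
        row = {"x": x}
        for addr, name in enumerate(("B1", "B2")):
            taken = int(outcomes[addr])
            # Predict using the entry selected by (address bit, current GHR).
            row[name + "_pred"] = "T" if bht[addr][ghr] else "NT"
            row[name + "_actual"] = "T" if taken else "NT"
            # A 1-bit entry simply remembers the last outcome seen.
            bht[addr][ghr] = taken
            # The history bit records the outcome of the most recent branch.
            ghr = taken
        row["GHR"] = ghr  # assumption: GHR as left at the end of the iteration
        rows.append(row)
    return rows


if __name__ == "__main__":
    for i, row in enumerate(simulate_gselect(X_VALUES)):
        print(i, row)
```

If the intended reading is different (for example, the GHR row showing the value at the start of each iteration), only the line that records `row["GHR"]` needs to move.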
Fill in the following table with the predictions and outcomes for the branches. Use “T” for taken and “NT” for not taken.

    if (x is even) then                // this is branch B1
        a = a + 1;                     // B1 is taken
    if (x is a multiple of 10) then    // this is branch B2
        b = b + 1;                     // B2 is taken

Iteration       0    1    2    3    4    5    6    7    8
x’s value       8    9   10   11   12   20   29   30   31
B1 predicted
B1 actual
B2 predicted
B2 actual
GHR value

Name ___________________________________________ 6 of 11

Question 4 - Tomasulo’s Algorithm & Speculative Execution

In this problem, consider a single-issue processor that has dynamic scheduling, branch prediction, and speculative execution. This processor uses Tomasulo’s algorithm to schedule instructions and a reorder buffer to allow speculation. Assume the following information:
• Execution latency is how long an operation spends in a functional unit once it has all source operands.
• An operation can read its source operands from the register file during issue, assuming the source operands are available in the register file.
• An operation that is pending (i.e., waiting) for source operands can begin execution on the cycle after the last pending operand is broadcast on the CDB.
• ADD.D takes 2 execution cycles, LD.D takes 2 execution cycles, MUL.D takes 4 execution cycles, DIV.D takes 6 execution cycles.
• One of the execution cycles for the load operation computes the effective address.
• There are 2 load buffers, 6 reorder buffer entries, and 1 reservation station per functional unit; there is 1 add functional unit, 1 multiply functional unit, and 1 divide functional unit.

Consider the code:

    LD.D  F1,0(R2)
    LD.D  F2,0(R3)
    MUL.D F5,F2,F3
    ADD.D F1,F1,F4
    DIV.D F1,F1,F5

There are four parts to this question, starting on the next page.

Name ___________________________________________ 7 of 11

(a) 8 points. Fill in the table below with the state of each instruction on each clock cycle until the DIV.D instruction commits its result. You may not need all of the listed cycles. Write “Issue”, “Execute”, “Write Result”, or “Commit” in each box to indicate the state of the instruction on a given clock cycle. If an instruction is not active in the pipeline, then leave the box blank. Parts (b), (c), and (d) are on the next page.

Cycle   LD.D F9,..   LD.D F10,..   MUL.D   ADD.D   DIV.D
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
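Before filling in a cycle table like the one above, it can help to list the register dependences among the five instructions in Question 4, since these decide which operations must wait for a CDB broadcast. The following is a rough sketch only: it looks at register names alone, reports each RAW and WAW dependence against the most recent earlier writer, and does no timing. The helper names `parse` and `find_dependences` are hypothetical, and the register-matching pattern assumes operands are written as F<n> or R<n>, as in the question's code.

```python
import re

# Instruction sequence copied from Question 4.
PROGRAM = [
    "LD.D F1,0(R2)",
    "LD.D F2,0(R3)",
    "MUL.D F5,F2,F3",
    "ADD.D F1,F1,F4",
    "DIV.D F1,F1,F5",
]


def parse(line):
    """Split an instruction into (opcode, destination, [sources])."""
    op, operands = line.split(None, 1)
    regs = re.findall(r"[FR]\d+", operands)
    # Assumption: the first register named is the destination.
    return op, regs[0], regs[1:]


def find_dependences(program):
    """Return (kind, producer_index, consumer_index, register) tuples."""
    last_writer = {}  # register -> index of the most recent writer
    readers = {}      # register -> indices that read it since its last write
    deps = []
    for j, line in enumerate(program):
        _, dest, srcs = parse(line)
        # RAW: a source was produced by an earlier instruction.
        for s in srcs:
            if s in last_writer:
                deps.append(("RAW", last_writer[s], j, s))
        for s in srcs:
            readers.setdefault(s, []).append(j)
        # WAW: the destination was already written by an earlier instruction.
        if dest in last_writer:
            deps.append(("WAW", last_writer[dest], j, dest))
        # WAR: an earlier instruction still reads the old value of dest.
        for i in readers.get(dest, []):
            if i != j:
                deps.append(("WAR", i, j, dest))
        last_writer[dest] = j
        readers[dest] = []
    return deps


if __name__ == "__main__":
    for kind, i, j, reg in find_dependences(PROGRAM):
        print(f"{kind} on {reg}: {PROGRAM[i]}  ->  {PROGRAM[j]}")
```

The RAW entries mark the operands that would have to arrive over the CDB; the WAW entries on F1 are name dependences, which register renaming through reservation stations and the reorder buffer is meant to remove.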