ELEC 5200-001/6200-001 Computer Architecture and Design
Fall 2007
Homework 5 Solution
Assigned 10/12/07, due 10/19/07

Problem 1: Consider a single-cycle datapath as a single-stage pipeline. Assume that it consists of combinational logic with a maximum delay of T and a register (program counter) with delay q. Show that an upper bound on the performance ratio of multi-stage pipeline to single-cycle datapaths is 1+T/q.

Answer:

Pipeline of n stages: We divide the asynchronous logic into n equal-delay slices, each with delay T/n. One register of delay q is added to each slice. Thus, the cycle time for the n-stage pipeline is q+T/n. We assume no hazards in the pipeline, because hazards can only reduce the performance and hence our upper bound will remain valid. The execution rate for the pipeline is then 1 cycle per instruction (CPI), neglecting the latency, which can only reduce the rate.

Single-cycle: Cycle time is q+T and CPI is 1.

Performance ratio = (single-cycle time per instruction)/(pipeline time per instruction) = (q+T)/(q+T/n)

This increases monotonically with n and approaches 1+T/q as n → ∞.

Problem 2: An n-stage pipeline on average requires αn bubbles per instruction to resolve hazards, where 0.0 ≤ α ≤ 1.0. Assume that the entire datapath logic delay T is equally divided among n stages with an additional register delay q per stage. Show that for optimum performance the number of stages should be √[T/(αq)]. If T = 10ns, q = 2ns, and α = 0.1, find the optimum number of stages. What is the maximum clock frequency for the optimum pipeline and what is its throughput ratio with respect to a single-cycle datapath using similar hardware?

Answer: Considering the hardware delay of a pipeline stage, the clock cycle time for the pipeline is q+T/n. The average CPI is 1+αn.
Thus the average time per instruction is

TPI = (q+T/n)(1+αn) = q + T/n + qαn + Tα

To optimize, we set ∂TPI/∂n = -T/n² + qα = 0, i.e., n = √[T/(αq)]. Because ∂²TPI/∂n² = 2T/n³ > 0, this value of n minimizes TPI.
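
The formulas above can be checked numerically for the values given in Problem 2 (T = 10 ns, q = 2 ns, α = 0.1). The following Python sketch is only an illustration, not part of the original solution. The function names are hypothetical, and it assumes that the stage count is rounded to whichever neighboring integer gives the lower TPI and that the single-cycle reference has cycle time q+T and CPI 1, as in Problem 1.

```python
import math

def cycle_time(T, q, n):
    """Clock period of an n-stage pipeline: register delay plus one logic slice."""
    return q + T / n

def time_per_instruction(T, q, alpha, n):
    """Average time per instruction, TPI = (q + T/n)(1 + alpha*n)."""
    return cycle_time(T, q, n) * (1 + alpha * n)

def optimal_stages(T, q, alpha):
    """Real-valued minimizer of TPI, n = sqrt(T / (alpha*q))."""
    return math.sqrt(T / (alpha * q))

# Values from Problem 2: delays in ns, alpha as given.
T, q, alpha = 10.0, 2.0, 0.1

n_real = optimal_stages(T, q, alpha)

# Assumption: a real pipeline needs an integer stage count, so compare
# the two integers around the real-valued optimum and keep the better one.
n_best = min((math.floor(n_real), math.ceil(n_real)),
             key=lambda n: time_per_instruction(T, q, alpha, n))

# Cycle time is in ns, so 1000 / period gives the frequency in MHz.
f_max_mhz = 1e3 / cycle_time(T, q, n_best)

# Assumption: the single-cycle datapath has cycle time q + T and CPI 1.
throughput_ratio = (q + T) / time_per_instruction(T, q, alpha, n_best)

print(f"real-valued optimum n = {n_real:.3f}")
print(f"integer stage count   = {n_best}")
print(f"max clock frequency   = {f_max_mhz:.1f} MHz")
print(f"throughput ratio      = {throughput_ratio:.3f}")
```

Under these assumptions the script gives a real-valued optimum of about 7.07, hence 7 stages, a maximum clock frequency of roughly 292 MHz, and a throughput ratio of about 2.06 relative to the single-cycle datapath.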