Computer Organization: Homework Solutions for Chapter 5 - Prof. Jiang Li, Assignments of Computer Architecture and Organization

Solutions to homework 4 for the computer organization course, covering topics such as datapath modifications, instruction execution, and exception handling.

Typology: Assignments

Pre 2010

Uploaded on 08/19/2009

koofers-user-2y0
SYCS 201: Computer Organization
Homework 4 Solutions

5.2 (12 pts)

According to page 300 in the textbook, the datapath under discussion covers only lw, sw, beq, and the arithmetic-logical instructions add, sub, and, or, and slt (set on less than). Therefore, only these instructions are considered in the answers to this problem. "R-format instructions" below refers only to these arithmetic-logical instructions, i.e. add, sub, and, or, and slt.

a. RegWrite = 0: All the arithmetic-logical instructions (add/sub/and/or/slt) and lw will not work, because they cannot write their results to the register file.
b. ALUOp0 = 0: The beq instruction will not work, because the ALU will perform addition instead of subtraction (see Figure 5.12), so the branch outcome may be wrong.
c. ALUOp1 = 0: All R-format instructions except add will not work correctly, because with ALUOp0 = 0 the ALU will always perform addition instead of the required operation (see Figure 5.12).
d. Branch (or PCSrc) = 0: The beq instruction will not execute correctly. The branch will never be taken, even when it should be.
e. MemRead = 0: lw will not execute correctly, because it cannot read data from memory.
f. MemWrite = 0: sw will not work correctly, because it cannot write to the data memory.

5.8 (11 pts)

The datapath must be modified so that the new PC can come from a register (the Read data 1 port), with a new signal (e.g., JumpReg) controlling the selection through a multiplexor, as shown in the following figure.

[Figure: datapath modified with the JumpReg multiplexor]

A new line should be added to the truth table in Figure 5.18 on page 308 to implement the jr instruction, and a new column to produce the JumpReg signal. The modified table is presented as follows.
(‘X’ in the table means it does not matter whether the signal is 0 or 1.)

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  JumpReg
R-format     1       0       0         1         0        0         0       1       0       0
lw           0       1       1         1         1        0         0       0       0       0
sw           X       1       X         0         0        1         0       0       0       0
beq          X       0       X         0         0        0         1       0       1       0
jr           X       X       X         0         0        0         X       X       X       1

5.9 (11 pts)

The datapath must be modified (see Figure 5.43) to feed the shamt field (instruction[10:6]) to the ALU so that it can determine the shift amount. The instruction is in R-format and is controlled according to the first line of Figure 5.18 on page 308, so the table in Figure 5.18 need not be modified. The ALU control identifies the sll operation from the ALUOp signals and the funct field (instruction[5:0]). Figure 5.13 on page 302 should be modified to recognize the funct code of sll. With the inputs listed in the order ALUOp1, ALUOp0, F5 to F0, the third line should be changed to 1X1X0000 with operation 0010 (to distinguish add from sll), and a new line should be inserted, for example 1X0X0000 with operation 0011 (defining sll by the ALU operation code 0011). Since add has funct 100000 and sll has funct 000000, bit F5 is what tells the two apart. The modified table is presented as follows.

ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  Operation
0       0       X   X   X   X   X   X   0010
X       1       X   X   X   X   X   X   0110
[The remaining rows of the table are missing from the source.]

5.30 (11 pts)

This solution modifies the datapath to extract and shift the immediate field outside the ALU. Once the instruction is recognized as lui (in cycle 2), the immediate field is ready to be stored into the register file in the next cycle. This way the instruction takes 3 cycles.

1. Instruction fetch step: Unchanged.
2. Instruction decode: Also unchanged, except that the immediate field is extracted and shifted in this cycle as well.
3. The final form of the immediate value is now ready to be loaded into the register file. The MemtoReg control signal has to be modified so that its multiplexor can select the immediate upper field as the write data source. We can assume that this signal becomes a 2-bit control signal and that the value 2 selects the immediate upper field.
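As a rough illustration of the extraction and shifting described in step 2, the following Python sketch forms the value that lui would write to the register file. The function names `lui_value` and `rt_field` and the example instruction word are assumptions made for this illustration and are not taken from the textbook; the field positions follow the standard MIPS I-format (immediate in instruction[15:0], rt in instruction[20:16]).

```python
def lui_value(instruction: int) -> int:
    """Return the 32-bit value lui writes: immediate in the upper half, zeros below.

    Hypothetical helper for illustration; in the datapath this is pure wiring
    (a 16-bit left shift), not an ALU operation.
    """
    imm16 = instruction & 0xFFFF        # instruction[15:0]
    return (imm16 << 16) & 0xFFFFFFFF   # shift left by 16, keep 32 bits


def rt_field(instruction: int) -> int:
    """Return rt (instruction[20:16]), the destination selected when RegDst = 0."""
    return (instruction >> 16) & 0x1F


# Assumed example word: lui $t0, 0x1234 (opcode 0x0F, rs = 0, rt = 8)
word = 0x3C081234
print(hex(lui_value(word)))  # 0x12340000
print(rt_field(word))        # 8
```

Because the shift is by a fixed amount, it needs no control input, which is why it can be done alongside the unchanged decode step.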
The following figure plots the modified datapath. The first two cycles are identical to the FSM of Figure 5.38. By the end of the second cycle, the FSM will recognize the opcode. We add Op = 'lui' as a new transition condition from state 1 to a new state 10. In this state we store the immediate upper field into the register file with these signals: RegDst = 0, RegWrite asserted, MemtoReg = 2. State 10 transitions back to state 0 after its completion. The modified FSM is in the following figure.

[Figure: modified FSM with the new state 10]

5.34 (10 pts)

On M2, since states 3 and 4 are combined, the CPI of load instructions is reduced to 4; since states 6 and 7 are combined, the CPI of R-type instructions is reduced to 3. On M3, since states 2, 3 and 4 are combined, the CPI of load instructions is reduced to 3; since states 2 and 5 are combined, the CPI of store instructions is reduced to 3; since states 6 and 7 are combined, the CPI of R-type instructions is reduced to 3.

We can compare the number of instructions executed per second by each processor.

Number of instructions executed per second = Clock rate / Average CPI

$$\text{Average CPI} = \sum_{i=1}^{n} \left(\text{CPI of Instruction Type}_i\right) \times \left(\text{Frequency of Instruction Type}_i\right)$$

For example, on M1 the average CPI is 0.26 × 5 + 0.10 × 4 + 0.49 × 4 + 0.15 × 3 = 4.11, so M1 executes 4×10^9 / 4.11 ≈ 9.73×10^8 instructions per second.

Instruction Type                  Frequency  CPI on M1  CPI on M2  CPI on M3
Load                              26%        5          4          3
Store                             10%        4          4          3
R-type                            49%        4          3          3
Branch/Jump                       15%        3          3          3
Average CPI                                  4.11       3.36       3
Clock rate (Hz)                              4×10^9     3.2×10^9   2.8×10^9
Instructions executed per second             9.73×10^8  9.52×10^8  9.33×10^8

The table shows that M1 executes the most instructions per second and is therefore the fastest. M3 can be the fastest if the frequency of load and store instructions is very high. An example is shown in the following table.

Instruction Type                  Frequency  CPI on M1  CPI on M2  CPI on M3
Load                              49%        5          4          3
Store                             49%        4          4          3
R-type                            1%         4          3          3
Branch/Jump                       1%         3          3          3
Average CPI                                  4.48       3.98       3
Clock rate (Hz)                              4×10^9     3.2×10^9   2.8×10^9
Instructions executed per second             8.93×10^8  8.04×10^8  9.33×10^8

5.43 (12 pts)

a.
A divide-by-zero exception can be detected in the ALU in cycle 3, before the divide instruction is executed.
b. Overflow can be detected in hardware after the ALU operation completes. This is done in cycle 4 (see Figure 5.40).
c. An invalid opcode can be detected by the end of cycle 2 (see Figure 5.40).
d. This is an asynchronous exception event that can occur in any cycle. We can design the machine to test for this condition either in one specific cycle (so the exception can be taken only in a specific stage) or in every cycle (so the exception can be taken at any processor stage).
e. The instruction memory address can be checked at the time we update the PC. This can be done in cycle 1.
f. The data memory address can be checked after the address calculation, at the end of cycle 3.
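The checks in parts (a) and (b) can be sketched in Python as follows. This is only a software illustration under stated assumptions: the helper names `add_overflows` and `divide_by_zero` are hypothetical, operands are treated as 32-bit two's-complement words, and real hardware would implement the same conditions with a few gates on the operand and result sign bits.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def add_overflows(a: int, b: int) -> bool:
    """True if the 32-bit two's-complement sum a + b overflows.

    Overflow happens exactly when both operands have the same sign and the
    result has the opposite sign, so it is known only after the add completes.
    """
    a &= MASK32
    b &= MASK32
    result = (a + b) & MASK32
    same_sign_inputs = (a & SIGN_BIT) == (b & SIGN_BIT)
    result_sign_flipped = (result & SIGN_BIT) != (a & SIGN_BIT)
    return same_sign_inputs and result_sign_flipped


def divide_by_zero(divisor: int) -> bool:
    """True if the divisor word is zero; needs only the operand, not the result."""
    return (divisor & MASK32) == 0


print(add_overflows(0x7FFFFFFF, 1))   # True: largest positive value plus 1
print(add_overflows(0xFFFFFFFF, 1))   # False: -1 + 1 = 0
print(divide_by_zero(0))              # True
```

The contrast matches the answers above: the divide-by-zero test depends only on an input operand, so it can be made before the operation, while the overflow test needs the result and therefore comes one step later.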