4/10/2006 ISA 1

Instruction set architectures
• Last week we built a simple, but complete, datapath.
• The datapath is ultimately controlled by a programmer, so today we’ll look at several aspects of programming in more detail.
  – How programs are executed on processors
  – An introduction to instruction set architectures
  – Example instructions and programs
• Next, we’ll see how programs are encoded in a processor. Following that, we’ll finish our processor by designing a control unit, which converts our programs into signals for the datapath.

Programming and CPUs
• Programs written in a high-level language like C++ must be compiled to produce an executable program.
• The result is a CPU-specific machine language program. This can be loaded into memory and executed by the processor.
• CS231 focuses on stuff below the dotted blue line, but machine language serves as the interface between hardware and software.

[Figure: High-level program → Compiler → Executable file → Control Unit → Control words → Datapath, with a dotted line separating software from hardware]

Compiling
• Processors can’t execute programs written in high-level languages directly, so a special program called a compiler is needed to translate high-level programs into low-level machine code.
• In the “good” old days, people often wrote machine language programs by hand to make their programs faster, smaller, or both.
• Now, compilers almost always do a better job than people.
  – Programs are becoming more complex, and it’s hard for humans to write and maintain large, efficient machine language code.
  – CPUs are becoming more complex. It’s difficult to write code that takes full advantage of a processor’s features.
• Some languages, like Perl or Lisp, are usually interpreted instead of compiled.
  – Programs are translated into an intermediate format.
  – This is a “middle ground” between efficiency and portability.

Assembly and machine languages
• Machine language instructions are sequences of bits in a specific order.
• To make things simpler, people typically use assembly language.
  – We assign “mnemonic” names to operations and operands.
  – There is (almost) a one-to-one correspondence between these mnemonics and machine instructions, so it is very easy to convert assembly programs to machine language.
• We’ll use assembly code today to introduce the basic ideas, and switch to machine language tomorrow.

Data manipulation instructions
• Data manipulation instructions correspond to ALU operations.
• For example, here is a possible addition instruction, and its equivalent using our register transfer notation:

    ADD R0, R1, R2
    (operation: ADD; operands: destination R0, sources R1 and R2)

    Register transfer instruction: R0 ← R1 + R2

• This is similar to a high-level programming statement like R0 = R1 + R2
• Here, all of the operands are registers.

What about RAM?
• Recall that our ALU has direct access only to the register file.
• RAM contents must be copied to the registers before they can be used as ALU operands.
• Similarly, ALU results must go through the registers before they can be stored into memory.
• We rely on data movement instructions to transfer data between the RAM and the register file.

[Figure: datapath with register file, ALU, and RAM, showing control signals WR, DA, AA, BA, MB, FS, MD, and MW]

Loading a register from RAM
• A load instruction copies data from a RAM address to one of the registers.

    LD R1,(R3)        R1 ← M[R3]

• Remember in our datapath, the RAM address must come from one of the registers; in the example above, R3.
• The parentheses help show which register operand holds the memory address.
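The register-transfer notation for ADD and LD can be mimicked in a few lines of Python. This is only a rough sketch, not the course's datapath: the `Machine` class, the eight-register file, and the sparse RAM that reads as 0 when unwritten are all assumptions made for illustration.

```python
class Machine:
    """Toy model of a register file and RAM (names and sizes are assumptions)."""

    def __init__(self, num_regs=8):
        self.R = [0] * num_regs   # register file R0..R7
        self.M = {}               # sparse RAM: address -> word

    def add(self, rd, ra, rb):
        # ADD Rd, Ra, Rb   :   Rd <- Ra + Rb  (all operands are registers)
        self.R[rd] = self.R[ra] + self.R[rb]

    def ld(self, rd, ra):
        # LD Rd, (Ra)      :   Rd <- M[Ra]  (the address comes from a register)
        self.R[rd] = self.M.get(self.R[ra], 0)


m = Machine()
m.R[1], m.R[2] = 5, 7
m.add(0, 1, 2)        # ADD R0, R1, R2

m.M[1000] = 42        # pretend RAM already holds 42 at address 1000
m.R[3] = 1000
m.ld(1, 3)            # LD R1, (R3)

print(m.R[0], m.R[1])  # prints 12 42
```

Note that `ld` never takes a literal address: as in the datapath, the address has to be placed in a register first.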
[Figure: datapath with register file, ALU, and RAM]

Storing a register to RAM
• A store instruction copies data from a register to an address in RAM.

    ST (R3),R1        M[R3] ← R1

• One register specifies the RAM address to write to; in the example above, R3.
• The other operand specifies the actual data to be stored into RAM; R1 above.

[Figure: datapath with register file, ALU, and RAM]

The # and ( ) are important!
• We’ve seen several statements containing the # or ( ) symbols. These are ways of specifying different addressing modes.
• The addressing mode we use determines which data are actually used as operands:

    LD R0, #1000      // R0 ← 1000
    LD R0, 1000       // R0 ← M[1000]
    LD R3, R0         // R3 ← R0
    LD R3, (R0)       // R3 ← M[R0]

• The design of our datapath determines which addressing modes we can use.
  – The second example above wouldn’t work in our datapath. Why not?
• We’ll talk about addressing modes in more detail next week.

A small example
• Here’s an example register-transfer operation.

    M[1000] ← M[1000] + 1

• This is the assembly-language equivalent:

    LD  R0, #1000     // R0 ← 1000
    LD  R3, (R0)      // R3 ← M[1000]
    ADD R3, R3, #1    // R3 ← R3 + 1
    ST  (R0), R3      // M[1000] ← R3

• An awful lot of assembly instructions are needed!
  – For instance, we have to load the memory address 1000 into a register first, and then use that register to access the RAM.
  – This is due to our relatively simple datapath design, which only allows register and constant operands to the ALU.
  – Later on, mostly in CS232, you’ll see why this can be a good thing.

• Programs consist of a lot of sequential instructions, which are meant to be executed one after another.
• Thus, programs are stored in memory so that:
  – Each program instruction occupies one address.
  – Instructions are stored one after another.
• A program counter (PC) keeps track of the current instruction address.
  – Ordinarily, the PC just increments after executing each instruction.
  – But sometimes we need to change this normal sequential behavior, with special control flow instructions.

Control flow instructions

    768: LD  R0, #1000     // R0 ← 1000
    769: LD  R3, (R0)      // R3 ← M[1000]
    770: ADD R3, R3, #1    // R3 ← R3 + 1
    771: ST  (R0), R3      // M[1000] ← R3

Types of branches
• Branch conditions are often based on the ALU result.
• This is what the ALU status bits V, C, N and Z are used for. With them we can implement various branch instructions like the ones below.

    Condition                Mnemonic    ALU status bit
    Branch on overflow       BV          V = 1
    Branch on no overflow    BNV         V = 0
    Branch if carry set      BC          C = 1
    Branch if carry clear    BNC         C = 0
    Branch if negative       BN          N = 1
    Branch if not negative   BNN         N = 0
    Branch if zero           BZ          Z = 1
    Branch if non-zero       BNZ         Z = 0

• Other branch conditions (e.g., branch if greater, equal or less) can be derived from these, along with the right ALU operation.

High-level control flow
• These jumps and branches are much simpler than the control flow constructs provided by high-level languages.
• Conditional statements execute only if some Boolean value is true.

    // Find the absolute value of *X
    R1 = *X;
    if (R1 < 0)
        R1 = -R1;          // This might not be executed
    R3 = R1 + R1;

• Loops cause some statements to be executed many times.

    // Sum the integers from 1 to 5
    R1 = 0;
    for (R2 = 1; R2 <= 5; R2++)
        R1 = R1 + R2;      // This is executed five times
    R3 = R1 + R1;

• We can use branch instructions to translate high-level conditional statements into assembly code.
• Sometimes it’s easier to invert the original condition. Here, we effectively changed the R1 < 0 test into R1 >= 0.
Translating the C if-then statement

    R1 = *X;
    if (R1 < 0)
        R1 = -R1;
    R3 = R1 + R1;

        LD  R1, (X)        // R1 = *X
        BNN R1, L          // Skip MUL if R1 is not negative
        MUL R1, R1, #-1    // R1 = -R1
    L   ADD R3, R1, R1     // R3 = R1 + R1
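To see the program counter and the inverted branch condition at work, the assembly above can be traced with a tiny interpreter. This is a hedged sketch rather than the course's control unit: the `run` function, the tuple encoding of instructions, the `labels` table, and the address chosen for X are all hypothetical, and only the four instructions used here are modeled.

```python
def run(program, regs, mem, labels):
    """Execute (op, *args) tuples, one instruction per address, starting at 0."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1                          # default: move on to the next instruction
        if op == "LD":                   # LD Rd, (addr)    : Rd <- M[addr]
            rd, addr = args
            regs[rd] = mem[addr]
        elif op == "BNN":                # BNN Rs, label    : branch if Rs >= 0
            rs, label = args
            if regs[rs] >= 0:
                pc = labels[label]       # override the normal PC increment
        elif op == "MUL":                # MUL Rd, Rs, #imm : Rd <- Rs * imm
            rd, rs, imm = args
            regs[rd] = regs[rs] * imm
        elif op == "ADD":                # ADD Rd, Ra, Rb   : Rd <- Ra + Rb
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


X = 1000                                 # assumed address of the variable X
program = [
    ("LD",  "R1", X),                    # 0: R1 = *X
    ("BNN", "R1", "L"),                  # 1: skip MUL if R1 is not negative
    ("MUL", "R1", "R1", -1),             # 2: R1 = -R1
    ("ADD", "R3", "R1", "R1"),           # 3: L: R3 = R1 + R1
]
labels = {"L": 3}


def abs_twice(x):
    """Run the program with M[X] = x and return the final R3."""
    regs = {"R1": 0, "R3": 0}
    run(program, regs, {X: x}, labels)
    return regs["R3"]


print(abs_twice(-4), abs_twice(6))       # prints 8 12
```

With a negative input the branch falls through and the MUL executes; with a non-negative input the PC jumps straight to L, which is the R1 >= 0 test described above.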