Introduction to the MIPS 32 Architecture - Architecture for Programmers Volume II | CS 1541, Papers of Computer Science

University of Pittsburgh (Pitt) - Medical Center-Health System Computer Science

Prof. Sangyeun Cho

Material Type: Paper; Professor: Cho; Class: INTRO TO COMPUTER ARCHITECTURE; Subject: Computer Science; University: University of Pittsburgh; Term: Summer 2003;

Typology: Papers

Pre 2010

Uploaded on 09/02/2009

koofers-user-agj 🇺🇸

10 documents


Document Number: MD00082
Revision 2.00
June 8, 2003

MIPS Technologies, Inc.
1225 Charleston Road
Mountain View, CA 94043-1353

Copyright © 2001-2003 MIPS Technologies Inc. All rights reserved.

MIPS32™ Architecture For Programmers
Volume I: Introduction to the MIPS32™ Architecture

Copyright © 2001-2003 MIPS Technologies, Inc. All rights reserved.

Unpublished rights (if any) reserved under the copyright laws of the United States of America and other countries.

This document contains information that is proprietary to MIPS Technologies, Inc. ("MIPS Technologies"). Any copying, reproducing, modifying or use of this information (in whole or in part) that is not expressly permitted in writing by MIPS Technologies or an authorized third party is strictly prohibited. At a minimum, this information is protected under unfair competition and copyright laws. Violations thereof may result in criminal penalties and fines.

Any document provided in source format (i.e., in a modifiable form such as in FrameMaker or Microsoft Word format) is subject to use and distribution restrictions that are independent of and supplemental to any and all confidentiality restrictions. UNDER NO CIRCUMSTANCES MAY A DOCUMENT PROVIDED IN SOURCE FORMAT BE DISTRIBUTED TO A THIRD PARTY IN SOURCE FORMAT WITHOUT THE EXPRESS WRITTEN PERMISSION OF MIPS TECHNOLOGIES, INC.

MIPS Technologies reserves the right to change the information contained in this document to improve function, design or otherwise. MIPS Technologies does not assume any liability arising out of the application or use of this information, or of any error or omission in such information. Any warranties, whether express, statutory, implied or otherwise, including but not limited to the implied warranties of merchantability or fitness for a particular purpose, are excluded. Except as expressly provided in any written license agreement from MIPS Technologies or an authorized third party, the furnishing of this document does not give recipient any license to any intellectual property rights, including any patent rights, that cover the information in this document.

The information contained in this document shall not be exported or transferred for the purpose of reexporting in violation of any U.S. or non-U.S. regulation, treaty, Executive Order, law, statute, amendment or supplement thereto.

The information contained in this document constitutes one or more of the following: commercial computer software, commercial computer software documentation or other commercial items. If the user of this information, or any related documentation of any kind, including related technical data or manuals, is an agency, department, or other entity of the United States government ("Government"), the use, duplication, reproduction, release, modification, disclosure, or transfer of this information, or any related documentation of any kind, is restricted in accordance with Federal Acquisition Regulation 12.212 for civilian agencies and Defense Federal Acquisition Regulation Supplement 227.7202 for military agencies. The use of this information by the Government is further restricted in accordance with the terms of the license agreement(s) and/or applicable contract terms and conditions covering this information from MIPS Technologies or an authorized third party.

MIPS, R3000, R4000, R5000 and R10000 are among the registered trademarks of MIPS Technologies, Inc.
in the United States and other countries, and MIPS16, MIPS16e, MIPS32, MIPS64, MIPS-3D, MIPS-based, MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPSsim, SmartMIPS, MIPS Technologies logo, 4K, 4Kc, 4Km, 4Kp, 4KE, 4KEc, 4KEm, 4KEp, 4KS, 4KSc, 4KSd, M4K, 5K, 5Kc, 5Kf, 20Kc, 25Kf, ASMACRO, ATLAS, At the Core of the User Experience., BusBridge, CoreFPGA, CoreLV, EC, JALGO, MALTA, MDMX, MGB, PDtrace, Pipeline, Pro, Pro Series, SEAD, SEAD-2, SOC-it and YAMON are among the trademarks of MIPS Technologies, Inc.

All other trademarks referred to herein are the property of their respective owners.

MIPS32™ Architecture For Programmers Volume I, Revision 2.00
Copyright © 2001-2003 MIPS Technologies Inc. All rights reserved.
Template: B1.08, Built with tags: 2B ARCH MIPS32

List of Figures

Figure 2-1: Relationship between the MIPS32 and MIPS64 Architectures
Figure 2-2: One-Deep Single-Completion Instruction Pipeline
Figure 2-3: Four-Deep Single-Completion Pipeline
Figure 2-4: Four-Deep Superpipeline
Figure 2-5: Four-Way Superscalar Pipeline
Figure 2-6: CPU Registers
Figure 2-7: FPU Registers for a 32-bit FPU
Figure 2-8: FPU Registers for a 64-bit FPU if Status_FR is 1
Figure 2-9: FPU Registers for a 64-bit FPU if Status_FR is 0
Figure 2-10: Big-Endian Byte Ordering
Figure 2-11: Little-Endian Byte Ordering
Figure 2-12: Big-Endian Data in Doubleword Format
Figure 2-13: Little-Endian Data in Doubleword Format
Figure 2-14: Big-Endian Misaligned Word Addressing
Figure 2-15: Little-Endian Misaligned Word Addressing
Figure 3-1: MIPS ISAs and ASEs
Figure 3-2: User-Mode MIPS ISAs and Optional ASEs
Figure 4-1: Immediate (I-Type) CPU Instruction Format
Figure 4-2: Jump (J-Type) CPU Instruction Format
Figure 4-3: Register (R-Type) CPU Instruction Format
Figure 5-1: Single-Precision Floating Point Format (S)
Figure 5-2: Double-Precision Floating Point Format (D)
Figure 5-3: Paired Single Floating Point Format (PS)
Figure 5-4: Word Fixed Point Format (W)
Figure 5-5: Longword Fixed Point Format (L)
Figure 5-6: FPU Word Load and Move-to Operations
Figure 5-7: FPU Doubleword Load and Move-to Operations
Figure 5-8: Single Floating Point or Word Fixed Point Operand in an FPR
Figure 5-9: Double Floating Point or Longword Fixed Point Operand in an FPR
Figure 5-10: Paired-Single Floating Point Operand in an FPR
Figure 5-11: FIR Register Format
Figure 5-12: FCSR Register Format
Figure 5-13: FCCR Register Format
Figure 5-14: FEXR Register Format
Figure 5-15: FENR Register Format
Figure 5-16: Effect of FPU Operations on the Format of Values Held in FPRs
Figure 5-17: I-Type (Immediate) FPU Instruction Format
Figure 5-18: R-Type (Register) FPU Instruction Format
Figure 5-19: Register-Immediate FPU Instruction Format
Figure 5-20: Condition Code, Immediate FPU Instruction Format
Figure 5-21: Formatted FPU Compare Instruction Format
Figure 5-22: FP Register Move, Conditional Instruction Format
Figure 5-23: Four-Register Formatted Arithmetic FPU Instruction Format
Figure 5-24: Register Index FPU Instruction Format
Figure 5-25: Register Index Hint FPU Instruction Format
Figure 5-26: Condition Code, Register Integer FPU Instruction Format
Figure A-1: Sample Bit Encoding Table

List of Tables

Table 1-1: Symbols Used in Instruction Operation Statements
Table 2-1: MIPS32 Instructions
Table 2-2: MIPS64 Instructions
Table 2-3: Unaligned Load and Store Instructions
Table 4-1: Load and Store Operations Using Register + Offset Addressing Mode
Table 4-2: Aligned CPU Load/Store Instructions
Table 4-3: Unaligned CPU Load and Store Instructions
Table 4-4: Atomic Update CPU Load and Store Instructions
Table 4-5: Coprocessor Load and Store Instructions
Table 4-6: FPU Load and Store Instructions Using Register + Register Addressing
Table 4-7: ALU Instructions With an Immediate Operand
Table 4-8: Three-Operand ALU Instructions
Table 4-9: Two-Operand ALU Instructions
Table 4-10: Shift Instructions
Table 4-11: Multiply/Divide Instructions
Table 4-12: Unconditional Jump Within a 256 Megabyte Region
Table 4-13: PC-Relative Conditional Branch Instructions Comparing Two Registers
Table 4-14: PC-Relative Conditional Branch Instructions Comparing With Zero
Table 4-15: Deprecated Branch Likely Instructions
Table 4-16: Serialization Instruction
Table 4-17: System Call and Breakpoint Instructions
Table 4-18: Trap-on-Condition Instructions Comparing Two Registers
Table 4-19: Trap-on-Condition Instructions Comparing an Immediate Value
Table 4-20: CPU Conditional Move Instructions
Table 4-21: Prefetch Instructions
Table 4-22: NOP Instructions
Table 4-23: Coprocessor Definition and Use in the MIPS Architecture
Table 4-24: CPU Instruction Format Fields
Table 5-1: Parameters of Floating Point Data Types
Table 5-2: Value of Single or Double Floating Point Data Type Encoding
Table 5-3: Value Supplied When a New Quiet NaN Is Created
Table 5-4: FIR Register Field Descriptions
Table 5-5: FCSR Register Field Descriptions
Table 5-6: Cause, Enable, and Flag Bit Definitions
Table 5-7: Rounding Mode Definitions
Table 5-8: FCCR Register Field Descriptions
Table 5-9: FEXR Register Field Descriptions
Table 5-10: FENR Register Field Descriptions
Table 5-11: Default Result for IEEE Exceptions Not Trapped Precisely
Table 5-12: FPU Data Transfer Instructions
Table 5-13: FPU Loads and Stores Using Register+Offset Address Mode
Table 5-14: FPU Loads and Stores Using Register+Register Address Mode
Table 5-15: FPU Move To and From Instructions
Table 5-16: FPU IEEE Arithmetic Operations
Table 5-17: FPU-Approximate Arithmetic Operations
Table 5-18: FPU Multiply-Accumulate Arithmetic Operations
Table 5-19: FPU Conversion Operations Using the FCSR Rounding Mode
Table 5-20: FPU Conversion Operations Using a Directed Rounding Mode
Table 5-21: FPU Formatted Operand Move Instructions
Table 5-22: FPU Conditional Move on True/False Instructions
Table 5-23: FPU Conditional Move on Zero/Nonzero Instructions
Table 5-24: FPU Conditional Branch Instructions
Table 5-25: Deprecated FPU Conditional Branch Likely Instructions
Table 5-26: CPU Conditional Move on FPU True/False Instructions
Table 5-27: FPU Operand Format Field (fmt, fmt3) Encoding
Table 5-28: Valid Formats for FPU Operations
Table 5-29: FPU Instruction Format Fields
Table A-1: Symbols Used in the Instruction Encoding Tables
Table A-2: MIPS32 Encoding of the Opcode Field
Table A-3: MIPS32 SPECIAL Opcode Encoding of Function Field
Table A-4: MIPS32 REGIMM Encoding of rt Field
Table A-5: MIPS32 SPECIAL2 Encoding of Function Field
Table A-6: MIPS32 SPECIAL3 Encoding of Function Field for Release 2 of the Architecture
Table A-7: MIPS32 MOVCI Encoding of tf Bit
Table A-8: MIPS32 SRL Encoding of Shift/Rotate
Table A-9: MIPS32 SRLV Encoding of Shift/Rotate
Table A-10: MIPS32 BSHFL Encoding of sa Field
Table A-11: MIPS32 COP0 Encoding of rs Field
Table A-12: MIPS32 COP0 Encoding of Function Field When rs=CO
Table A-13: MIPS32 COP1 Encoding of rs Field
Table A-14: MIPS32 COP1 Encoding of Function Field When rs=S
Table A-15: MIPS32 COP1 Encoding of Function Field When rs=D
Table A-16: MIPS32 COP1 Encoding of Function Field When rs=W or L
Table A-17: MIPS64 COP1 Encoding of Function Field When rs=PS
Table A-18: MIPS32 COP1 Encoding of tf Bit When rs=S, D, or PS, Function=MOVCF
Table A-19: MIPS32 COP2 Encoding of rs Field
Table A-20: MIPS64 COP1X Encoding of Function Field
Table A-21: Floating Point Unit Instruction Format Encodings

Chapter 1 About This Book

1.2 UNPREDICTABLE and UNDEFINED

The terms UNPREDICTABLE and UNDEFINED are used throughout this book to describe the behavior of the processor in certain cases. UNDEFINED behavior or operations can occur only as the result of executing instructions in a privileged mode (i.e., in Kernel Mode or Debug Mode, or with the CP0 usable bit set in the Status register). Unprivileged software can never cause UNDEFINED behavior or operations. Conversely, both privileged and unprivileged software can cause UNPREDICTABLE results or operations.

1.2.1 UNPREDICTABLE

UNPREDICTABLE results may vary from processor implementation to implementation, instruction to instruction, or as a function of time on the same implementation or instruction. Software can never depend on results that are UNPREDICTABLE. UNPREDICTABLE operations may cause a result to be generated or not. If a result is generated, it is UNPREDICTABLE. UNPREDICTABLE operations may cause arbitrary exceptions.
UNPREDICTABLE results or operations have several implementation restrictions:

• Implementations of operations generating UNPREDICTABLE results must not depend on any data source (memory or internal state) which is inaccessible in the current processor mode
• UNPREDICTABLE operations must not read, write, or modify the contents of memory or internal state which is inaccessible in the current processor mode. For example, UNPREDICTABLE operations executed in user mode must not access memory or internal state that is only accessible in Kernel Mode or Debug Mode or in another process
• UNPREDICTABLE operations must not halt or hang the processor

1.2.2 UNDEFINED

UNDEFINED operations or behavior may vary from processor implementation to implementation, instruction to instruction, or as a function of time on the same implementation or instruction. UNDEFINED operations or behavior may vary from nothing to creating an environment in which execution can no longer continue. UNDEFINED operations or behavior may cause data loss.

UNDEFINED operations or behavior has one implementation restriction:

• UNDEFINED operations or behavior must not cause the processor to hang (that is, enter a state from which there is no exit other than powering down the processor). The assertion of any of the reset signals must restore the processor to an operational state

1.3 Special Symbols in Pseudocode Notation

In this book, algorithmic descriptions of an operation are described as pseudocode in a high-level language notation resembling Pascal. Special symbols used in the pseudocode notation are listed in Table 1-1.

Table 1-1 Symbols Used in Instruction Operation Statements

Symbol
    Meaning

←
    Assignment

=, ≠
    Tests for equality and inequality

||
    Bit string concatenation

x^y
    A y-bit string formed by y copies of the single-bit value x
b#n
    A constant value n in base b. For instance 10#100 represents the decimal value 100, 2#100 represents the binary value 100 (decimal 4), and 16#100 represents the hexadecimal value 100 (decimal 256). If the "b#" prefix is omitted, the default base is 10.

x_{y..z}
    Selection of bits y through z of bit string x. Little-endian bit notation (rightmost bit is 0) is used. If y is less than z, this expression is an empty (zero length) bit string.

+, −
    2’s complement or floating point arithmetic: addition, subtraction

∗, ×
    2’s complement or floating point multiplication (both used for either)

div
    2’s complement integer division

mod
    2’s complement modulo

/
    Floating point division

<
    2’s complement less-than comparison

>
    2’s complement greater-than comparison

≤
    2’s complement less-than or equal comparison

≥
    2’s complement greater-than or equal comparison

nor
    Bitwise logical NOR

xor
    Bitwise logical XOR

and
    Bitwise logical AND

or
    Bitwise logical OR

GPRLEN
    The length in bits (32 or 64) of the CPU general-purpose registers

GPR[x]
    CPU general-purpose register x. The content of GPR[0] is always zero.

SGPR[s,x]
    In Release 2 of the Architecture, multiple copies of the CPU general-purpose registers may be implemented. SGPR[s,x] refers to GPR set s, register x. GPR[x] is a short-hand notation for SGPR[SRSCtl_CSS, x].

FPR[x]
    Floating Point operand register x

FCC[CC]
    Floating Point condition code CC. FCC[0] has the same value as COC[1].

FPR[x]
    Floating Point (Coprocessor unit 1), general register x

CPR[z,x,s]
    Coprocessor unit z, general register x, select s

CP2CPR[x]
    Coprocessor unit 2, general register x

CCR[z,x]
    Coprocessor unit z, control register x

CP2CCR[x]
    Coprocessor unit 2, control register x

COC[z]
    Coprocessor unit z condition signal

Xlat[x]
    Translation of the MIPS16e GPR number x into the corresponding 32-bit GPR number

BigEndianMem
    Endian mode as configured at chip reset (0 → Little-Endian, 1 → Big-Endian). Specifies the endianness of the memory interface (see LoadMemory and StoreMemory pseudocode function descriptions), and the endianness of Kernel and Supervisor mode execution.

BigEndianCPU
    The endianness for load and store instructions (0 → Little-Endian, 1 → Big-Endian). In User mode, this endianness may be switched by setting the RE bit in the Status register. Thus, BigEndianCPU may be computed as (BigEndianMem XOR ReverseEndian).

ReverseEndian
    Signal to reverse the endianness of load and store instructions. This feature is available in User mode only, and is implemented by setting the RE bit of the Status register. Thus, ReverseEndian may be computed as (SR_RE and User mode).

LLbit
    Bit of virtual state used to specify operation for instructions that provide atomic read-modify-write. LLbit is set when a linked load occurs; it is tested and cleared by the conditional store. It is cleared, during other CPU operation, when a store to the location would no longer be atomic. In particular, it is cleared by exception return instructions.

I:, I+n:, I-n:
    This occurs as a prefix to Operation description lines and functions as a label. It indicates the instruction time during which the pseudocode appears to “execute.” Unless otherwise indicated, all effects of the current instruction appear to occur during the instruction time of the current instruction. No label is equivalent to a time label of I. Sometimes effects of an instruction appear to occur either earlier or later, that is, during the instruction time of another instruction. When this happens, the instruction operation is written in sections labeled with the instruction time, relative to the current instruction I, in which the effect of that pseudocode appears to occur. For example, an instruction may have a result that is not available until after the next instruction. Such an instruction has the portion of the instruction operation description that writes the result register in a section labeled I+1.
    The effect of pseudocode statements for the current instruction labeled I+1 appears to occur “at the same time” as the effect of pseudocode statements labeled I for the following instruction. Within one pseudocode sequence, the effects of the statements take place in order. However, between sequences of statements for different instructions that occur “at the same time,” there is no defined order. Programs must not depend on a particular order of evaluation between such sections.

PC
    The Program Counter value. During the instruction time of an instruction, this is the address of the instruction word. The address of the instruction that occurs during the next instruction time is determined by assigning a value to PC during an instruction time. If no value is assigned to PC during an instruction time by any pseudocode statement, it is automatically incremented by either 2 (in the case of a 16-bit MIPS16e instruction) or 4 before the next instruction time. A taken branch assigns the target address to the PC during the instruction time of the instruction in the branch delay slot.

PABITS
    The number of physical address bits implemented is represented by the symbol PABITS. As such, if 36 physical address bits were implemented, the size of the physical address space would be 2^PABITS = 2^36 bytes.

FP32RegistersMode
    Indicates whether the FPU has 32-bit or 64-bit floating point registers (FPRs). In MIPS32, the FPU has 32 32-bit FPRs in which 64-bit data types are stored in even-odd pairs of FPRs. In MIPS64, the FPU has 32 64-bit FPRs in which 64-bit data types are stored in any FPR.
    In MIPS32 implementations, FP32RegistersMode is always a 0. MIPS64 implementations have a compatibility mode in which the processor references the FPRs as if it were a MIPS32 implementation. In such a case FP32RegistersMode is computed from the FR bit in the Status register. If this bit is a 0, the processor operates as if it had 32 32-bit FPRs. If this bit is a 1, the processor operates with 32 64-bit FPRs.
    The value of FP32RegistersMode is computed from the FR bit in the Status register.

InstructionInBranchDelaySlot
    Indicates whether the instruction at the Program Counter address was executed in the delay slot of a branch or jump. This condition reflects the dynamic state of the instruction, not the static state. That is, the value is false if a branch or jump occurs to an instruction whose PC immediately follows a branch or jump, but which is not executed in the delay slot of a branch or jump.

SignalException(exception, argument)
    Causes an exception to be signaled, using the exception parameter as the type of exception and the argument parameter as an exception-specific argument. Control does not return from this pseudocode function; the exception is signaled at the point of the call.

1.4 For More Information

Various MIPS RISC processor manuals and additional information about MIPS products can be found at the MIPS URL: http://www.mips.com

Chapter 2 The MIPS Architecture: An Introduction

2.1 MIPS32 and MIPS64 Overview

2.1.1 Historical Perspective

The MIPS® Instruction Set Architecture (ISA) has evolved over time from the original MIPS I™ ISA, through the MIPS V™ ISA, to the current MIPS32™ and MIPS64™ Architectures. As the ISA evolved, all extensions have been backward compatible with previous versions of the ISA. In the MIPS III™ level of the ISA, 64-bit integers and addresses were added to the instruction set.
The MIPS IV™ and MIPS V™ levels of the ISA added improved floating point operations, as well as a set of instructions intended to improve the efficiency of generated code and of data movement. Because of the strict backward-compatible requirement of the ISA, such changes were unavailable to 32-bit implementations of the ISA which were, by definition, MIPS I™ or MIPS II™ implementations. While the user-mode ISA was always backward compatible, the privileged environment was allowed to change on a per-implementation basis. As a result, the R3000® privileged environment was different from the R4000® privileged environment, and subsequent implementations, while similar to the R4000 privileged environment, included subtle differences. Because the privileged environment was never part of the MIPS ISA, an implementation had the flexibility to make changes to suit that particular implementation. Unfortunately, this required kernel software changes to every operating system or kernel environment on which that implementation was intended to run. Many of the original MIPS implementations were targeted at computer-like applications such as workstations and servers. In recent years MIPS implementations have had significant success in embedded applications. Today, most of the MIPS parts that are shipped go into some sort of embedded application. Such applications tend to have different trade-offs than computer-like applications including a focus on cost of implementation, and performance as a function of cost and power. The MIPS32 and MIPS64 Architectures are intended to address the need for a high-performance but cost-sensitive MIPS instruction set. The MIPS32 Architecture is based on the MIPS II ISA, adding selected instructions from MIPS III, MIPS IV, and MIPS V to improve the efficiency of generated code and of data movement. The MIPS64 Architecture is based on the MIPS V ISA and is backward compatible with the MIPS32 Architecture. 
Both the MIPS32 and MIPS64 Architectures bring the privileged environment into the Architecture definition to address the needs of operating systems and other kernel software. Both also include provision for adding MIPS Application Specific Extensions (ASEs), User Defined Instructions (UDIs), and custom coprocessors to address the specific needs of particular markets.

The MIPS32 and MIPS64 Architectures provide a substantial cost/performance advantage over microprocessor implementations based on traditional architectures. This advantage is a result of improvements made in several contiguous disciplines: VLSI process technology, CPU organization, system-level architecture, and operating system and compiler design.

2.1.2 Architectural Evolution

The evolution of an architecture is a dynamic process that takes into account both the need to provide a stable platform for implementations and the new market and application areas that demand new capabilities. Enhancements to an architecture are appropriate when they:

• are applicable to a wide market
• provide long-term benefit
• maintain architectural scalability
• are standardized to prevent fragmentation
• are a superset of the existing architecture

The MIPS Architecture community constantly evaluates suggestions for architectural changes and enhancements against these criteria. New releases of the architecture, while infrequent, are made at appropriate points, following these criteria. At present, there are two releases of the MIPS Architecture: Release 1 (the original version of the MIPS32 Architecture) and Release 2, which was added in 2002.
2.1.2.1 Release 2 of the MIPS32 Architecture

Enhancements included in Release 2 of the MIPS32 Architecture are:

• Vectored interrupts: This enhancement provides the ability to vector interrupts directly to a handler for that interrupt. Vectored interrupts are an option in Release 2 implementations and the presence of that option is denoted by the Config3_VInt bit.
• Support for an external interrupt controller: This enhancement reconfigures the on-core interrupt logic to take full advantage of an external interrupt controller. This support is an option in Release 2 implementations and the presence of that option is denoted by the Config3_EIC bit.
• Programmable exception vector base: This enhancement allows the base address of the exception vectors to be moved for exceptions that occur when Status_BEV is 0. Doing so allows multi-processor systems to have separate exception vectors for each processor, and allows any system to place the exception vectors in memory that is appropriate to the system environment. This enhancement is required in a Release 2 implementation.
• Atomic interrupt enable/disable: Two instructions have been added to atomically enable or disable interrupts, and return the previous value of the Status register. These instructions are required in a Release 2 implementation.
• The ability to disable the Count register for highly power-sensitive applications. This enhancement is required in a Release 2 implementation.
• GPR shadow registers: This enhancement adds GPR shadow registers and the ability to bind these registers to a vectored interrupt or exception. Shadow registers are an option in Release 2 implementations and the presence of that option is denoted by a non-zero value in SRSCtl_HSS. If shadow registers are implemented, either vectored interrupts or support for an external interrupt controller must also be implemented.
• Field, Rotate and Shuffle instructions: These instructions add additional capability in processing bit fields in registers. These instructions are required in a Release 2 implementation.
• Explicit hazard management: This enhancement provides a set of instructions to explicitly manage hazards, in place of the cycle-based SSNOP method of dealing with hazards. These instructions are required in a Release 2 implementation.
• Access to a new class of hardware registers and state from an unprivileged mode. This enhancement is required in a Release 2 implementation.
• Coprocessor 0 Register changes: These changes add or modify CP0 registers to indicate the existence of new and optional state, provide L2 and L3 cache identification, add trigger bits to the Watch registers, and add support for 64-bit performance counter count registers. This enhancement is required in a Release 2 implementation.
• Support for 64-bit coprocessors with 32-bit CPUs: These changes allow a 64-bit coprocessor (including an FPU) to be attached to a 32-bit CPU. This enhancement is optional in a Release 2 implementation.
• New Support for Virtual Memory: These changes provide support for a 1KByte page size. This change is optional in Release 2 implementations, and support is denoted by Config3_SP.

2.1.3 Architectural Changes Relative to the MIPS I through MIPS V Architectures

In addition to the MIPS32 Architecture described in this document set, the following changes were made to the architecture relative to the earlier MIPS RISC Architecture Specification, which describes the MIPS I through MIPS V Architectures.
• The MIPS IV ISA added a restriction to the load and store instructions which have natural alignment requirements (all but load and store byte and load and store left and right) in which the base register used by the instruction must also be naturally aligned (the restriction expressed in the MIPS RISC Architecture Specification is that the offset be aligned, but the implication is that the base register is also aligned, and this is more consistent with the indexed load/store instructions which have no offset field). The restriction that the base register be naturally-aligned is eliminated by the MIPS32 Architecture, leaving the restriction that the effective address be naturally-aligned.
• Early MIPS implementations required two instructions separating a mflo or mfhi from the next integer multiply or divide operation. This hazard was eliminated in the MIPS IV ISA, although the MIPS RISC Architecture Specification does not clearly explain this fact. The MIPS32 Architecture explicitly eliminates this hazard and requires that the hi and lo registers be fully interlocked in hardware for all integer multiply and divide instructions (including, but not limited to, the madd, maddu, msub, msubu, and mul instructions introduced in this specification).
• The Implementation and Programming Notes included in the instruction descriptions for the madd, maddu, msub, msubu, and mul instructions should also be applied to all integer multiply and divide instructions in the MIPS RISC Architecture Specification.

2.2 Compliance and Subsetting

To be compliant with the MIPS32 Architecture, designs must implement a set of required features, as described in this document set. To allow flexibility in implementations, the MIPS32 Architecture does provide subsetting rules.
An implementation that follows these rules is compliant with the MIPS32 Architecture as long as it adheres strictly to the rules, and fully implements the remaining instructions. Supersetting of the MIPS32 Architecture is only allowed by adding functions to the SPECIAL2 major opcode, by adding control for co-processors via the COP2, LWC2, SWC2, LDC2, and/or SDC2, and/or COP3 opcodes, or via the addition of approved Application Specific Extensions. Note, however, that a decision to use the COP3 opcode in an implementation of the MIPS32 Architecture precludes a compatible upgrade to the MIPS64 Architecture because the COP3 opcode is used as part of the floating point ISA in the MIPS64 Architecture.

The instruction set subsetting rules are as follows:

• All CPU instructions must be implemented; no subsetting is allowed.
• The FPU and related support instructions, including the MOVF and MOVT CPU instructions, may be omitted. Software may determine if an FPU is implemented by checking the state of the FP bit in the Config1 CP0 register. If the FPU is implemented, it must include S, D, and W formats, operate instructions, and all supporting instructions. Software may determine which FPU data types are implemented by checking the appropriate bit in the FIR CP1 register. The following allowable FPU subsets are compliant with the MIPS32 architecture:
    – No FPU
    – FPU with S, D, and W formats and all supporting instructions
• Coprocessor 2 is optional and may be omitted. Software may determine if Coprocessor 2 is implemented by checking the state of the C2 bit in the Config1 CP0 register. If Coprocessor 2 is implemented, the Coprocessor 2 interface instructions (BC2, CFC2, COP2, CTC2, LDC2, LWC2, MFC2, MTC2, SDC2, and SWC2) may be omitted on an instruction-by-instruction basis.
• Supervisor Mode is optional. If Supervisor Mode is not implemented, bit 3 of the Status register must be ignored on write and read as zero.
• The standard TLB-based memory management unit may be replaced with a simpler MMU (e.g., a Fixed Mapping MMU). If this is done, the rest of the interface to the Privileged Resource Architecture must be preserved. If a …

2.6 Instructions, Sorted by ISA

This section lists the instructions that are a part of the MIPS32 and MIPS64 ISAs.

2.6.1 List of MIPS32 Instructions

Table 2-1 lists those instructions included in the MIPS32 ISA.

Table 2-1 MIPS32 Instructions

ABS.D  ABS.PS[1]  ABS.S  ADD  ADD.D  ADD.PS[1]  ADD.S  ADDI  ADDIU  ADDU  ALNV.PS[1]  AND  ANDI
BC1F  BC1FL  BC1T  BC1TL  BC2F  BC2FL  BC2T  BC2TL  BEQ  BEQL  BGEZ  BGEZAL  BGEZALL  BGEZL
BGTZ  BGTZL  BLEZ  BLEZL  BLTZ  BLTZAL  BLTZALL  BLTZL  BNE  BNEL  BREAK
C.cond.D  C.cond.PS[1]  C.cond.S  CACHE  CEIL.L.D[1]  CEIL.L.S[1]  CEIL.W.D  CEIL.W.S
CFC1  CFC2  CLO  CLZ  COP2  CTC1  CTC2
CVT.D.L[1]  CVT.D.S  CVT.D.W  CVT.L.D[1]  CVT.L.S[1]  CVT.PS.S[1]  CVT.S.D  CVT.S.L[1]
CVT.S.PL[1]  CVT.S.PU[1]  CVT.S.W  CVT.W.D  CVT.W.S
DERET  DI[2]  DIV  DIV.D  DIV.S  DIVU  EHB[2]  EI[2]  ERET  EXT[2]
FLOOR.L.D[1]  FLOOR.L.S[1]  FLOOR.W.D  FLOOR.W.S  INS[2]
J  JAL  JALR  JALR.HB[2]  JR  JR.HB[2]
LB  LBU  LDC1  LDC2  LDXC1[1]  LH  LHU  LL  LUI  LUXC1[1]  LW  LWC1  LWC2  LWL  LWR  LWXC1[1]
MADD  MADD.D[1]  MADD.PS[1]  MADD.S[1]  MADDU  MFC0  MFC1  MFC2  MFHC1[2]  MFHC2[2]  MFHI  MFLO
MOV.D  MOV.PS[1]  MOV.S  MOVF  MOVF.D  MOVF.PS[1]  MOVF.S  MOVN  MOVN.D  MOVN.PS[1]  MOVN.S
MOVT  MOVT.D  MOVT.PS[1]  MOVT.S  MOVZ  MOVZ.D  MOVZ.PS[1]  MOVZ.S
MSUB  MSUB.D[1]  MSUB.PS[1]  MSUB.S[1]  MSUBU  MTC0  MTC1  MTC2  MTHC1[2]  MTHC2[2]  MTHI  MTLO
MUL  MUL.D  MUL.PS[1]  MUL.S  MULT  MULTU
NEG.D  NEG.PS[1]  NEG.S  NMADD.D[1]  NMADD.PS[1]  NMADD.S[1]  NMSUB.D[1]  NMSUB.PS[1]  NMSUB.S[1]
NOR  OR  ORI  PLL.PS[1]  PLU.PS[1]  PREF  PREFX[1]  PUL.PS[1]  PUU.PS[1]
RDHWR[2]  RDPGPR[2]  RECIP.D[1]  RECIP.S[1]  ROTR[2]  ROTRV[2]
ROUND.L.D[1]  ROUND.L.S[1]  ROUND.W.D  ROUND.W.S  RSQRT.D[1]  RSQRT.S[1]
SB  SC  SDBBP  SDC1  SDC2  SDXC1[1]  SEB[2]  SEH[2]  SH  SLL  SLLV  SLT  SLTI  SLTIU  SLTU
SQRT.D  SQRT.S  SRA  SRAV  SRL  SRLV  SSNOP  SUB  SUB.D  SUB.PS[1]  SUB.S  SUBU  SUXC1[1]
SW  SWC1  SWC2  SWL  SWR  SWXC1[1]  SYNC  SYNCI[2]  SYSCALL
TEQ  TEQI  TGE  TGEI  TGEIU  TGEU  TLBP  TLBR  TLBWI  TLBWR  TLT  TLTI  TLTIU  TLTU  TNE  TNEI
TRUNC.L.D[1]  TRUNC.L.S[1]  TRUNC.W.D  TRUNC.W.S  WAIT  WRPGPR[2]  WSBH[2]  XOR  XORI

1. In Release 1 of the Architecture, these instructions are legal only with a MIPS64 processor with 64-bit operations enabled (they are, in effect, actually MIPS64 instructions). In Release 2 of the Architecture, these instructions are legal with either a MIPS32 or MIPS64 processor which includes a 64-bit floating point unit.
2. These instructions are legal only in an implementation of Release 2 of the Architecture.

2.6.2 List of MIPS64 Instructions

Table 2-2 lists those instructions introduced in the MIPS64 ISA.

Table 2-2 MIPS64 Instructions

DADD  DADDI  DADDIU  DADDU  DCLO  DCLZ  DDIV  DDIVU
DEXT[1]  DEXTM[1]  DEXTU[1]  DINS[1]  DINSM[1]  DINSU[1]
DMFC0  DMFC1  DMFC2  DMTC0  DMTC1  DMTC2  DMULT  DMULTU
DROTR[1]  DROTR32[1]  DROTRV[1]  DSBH[1]  DSHD[1]
DSLL  DSLL32  DSLLV  DSRA  DSRA32  DSRAV  DSRL  DSRL32  DSRLV  DSUB  DSUBU
LD  LDL  LDR  LLD  LWU  SCD  SD  SDL  SDR

1. These instructions are legal only in an implementation of Release 2 of the Architecture.

2.7 Pipeline Architecture

This section describes the basic pipeline architecture, along with two types of improvements: superpipelines and superscalar pipelines. (Pipelining and multiple issuing are not defined by the ISA, but are implementation dependent.)

2.7.1 Pipeline Stages and Execution Rates

MIPS processors all use some variation of a pipeline in their architecture. A pipeline is divided into the following discrete parts, or stages, shown in Figure 2-2:

• Fetch
• Arithmetic operation
• Memory access
• Write back

Figure 2-2 One-Deep Single-Completion Instruction Pipeline
(Diagram: Instruction 1 passes through Stages 1 to 4 (Fetch, ALU, Memory, Write) in Cycles 1 to 4; Instruction 2 passes through the same stages in Cycles 5 to 8. Labels: Execution Rate, Instruction completion.)

In the example shown in Figure 2-2, each stage takes one processor clock cycle to complete. Thus it takes four clock cycles (ignoring delays or stalls) for the instruction to complete. In this example, the execution rate of the pipeline is one instruction every four clock cycles. Conversely, because only a single execution can be fetched before completion, only one stage is active at any time.

2.7.2 Parallel Pipeline

Figure 2-3 illustrates a remedy for the latency (the time it takes to execute an instruction) inherent in the pipeline shown in Figure 2-2. Instead of waiting for an instruction to be completed before the next instruction can be fetched (four clock cycles), a new instruction is fetched each clock cycle. There are four stages to the pipeline so the four instructions can be executed simultaneously, one at each stage of the pipeline.
It still takes four clock cycles for the first instruction to be completed; however, in this theoretical example, a new instruction is completed every clock cycle thereafter. Instructions in Figure 2-3 are executed at a rate four times that of the pipeline shown in Figure 2-2.

Figure 2-3 Four-Deep Single-Completion Pipeline
(Diagram: Instructions 1 to 4 each pass through Fetch, ALU, Memory, and Write, each starting one cycle after the previous instruction, across Cycles 1 to 7.)

2.7.3 Superpipeline

Figure 2-4 shows a superpipelined architecture. Each stage is designed to take only a fraction of an external clock cycle; in this case, half a clock. Effectively, each stage is divided into more than one substage. Therefore more than one instruction can be completed each cycle.

Figure 2-4 Four-Deep Superpipeline
(Diagram: eight instructions, each passing through Fetch, ALU, Mem, and Write, staggered by clock phase across Cycles 1 to 8, with phases 1 and 2 in each cycle.)

2.7.4 Superscalar Pipeline

A superscalar architecture also allows more than one instruction to be completed each clock cycle. Figure 2-5 shows a four-way, five-stage superscalar pipeline.

2.9 Programming Model

• a special-purpose program counter (PC), which is affected only indirectly by certain instructions; it is not an architecturally-visible register.

2.9.4.1 CPU General-Purpose Registers

Two of the CPU general-purpose registers have assigned functions:

• r0 is hard-wired to a value of zero, and can be used as the target register for any instruction whose result is to be discarded. r0 can also be used as a source when a zero value is needed.
• r31 is the destination register used by JAL, BLTZAL, BLTZALL, BGEZAL, and BGEZALL without being explicitly specified in the instruction word. Otherwise r31 is used as a normal register.

The remaining registers are available for general-purpose use.

2.9.4.2 CPU Special-Purpose Registers

The CPU contains three special-purpose registers:

• PC: Program Counter register
• HI: Multiply and Divide register higher result
• LO: Multiply and Divide register lower result
    – During a multiply operation, the HI and LO registers store the product of integer multiply.
    – During a multiply-add or multiply-subtract operation, the HI and LO registers store the result of the integer multiply-add or multiply-subtract.
    – During a division, the HI and LO registers store the quotient (in LO) and remainder (in HI) of integer divide.
    – During a multiply-accumulate, the HI and LO registers store the accumulated result of the operation.

Figure 2-6 shows the layout of the CPU registers.

2.9.5 FPU Registers

The MIPS32 Architecture defines the following FPU registers:

• 32 floating point registers (FPRs). These registers are 32 bits wide in a 32-bit FPU and 64 bits wide on a 64-bit FPU.
• Five FPU control registers are used to identify and control the FPU.
• Eight floating point condition codes that are part of the FCSR register

Figure 2-6 CPU Registers
(Diagram: General Purpose Registers r0 (hardwired to zero) through r31, and Special Purpose Registers HI, LO, and PC, each with bits numbered 31 to 0.)
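The HI and LO conventions listed in 2.9.4.2 can be sketched in a few lines. This is only an illustration, not the architecture's pseudocode: the function names multu and divu are borrowed from the MULTU and DIVU mnemonics in Table 2-1, only the unsigned cases are modeled, and the treatment of a zero divisor is an assumption made for the sketch.

```python
MASK32 = 0xFFFFFFFF  # width of a 32-bit register


def multu(rs, rt):
    """Unsigned 32 x 32 multiply; returns (HI, LO), the two halves of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    hi = (product >> 32) & MASK32   # upper 32 bits of the product go to HI
    lo = product & MASK32           # lower 32 bits of the product go to LO
    return hi, lo


def divu(rs, rt):
    """Unsigned divide; returns (HI, LO) = (remainder, quotient)."""
    rs, rt = rs & MASK32, rt & MASK32
    if rt == 0:
        # Assumption: this sketch rejects a zero divisor instead of modeling hardware behavior.
        raise ValueError("divide by zero is not modeled in this sketch")
    return rs % rt, rs // rt        # remainder in HI, quotient in LO


hi, lo = multu(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(hi), hex(lo))
print(divu(7, 2))
```

A caller would read the two halves separately, which mirrors how the text describes the quotient landing in LO and the remainder in HI.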
In Release 1 of the Architecture, 64-bit floating point units were supported only by implementations of the MIPS64 Architecture. Similarly, implementations of the MIPS32 Architecture supported only 32-bit floating point units. In Release 2 of the Architecture, a 64-bit floating point unit is supported on implementations of both the MIPS32 and MIPS64 Architectures.

A 32-bit floating point unit contains 32 32-bit FPRs, each of which is capable of storing a 32-bit data type. Double-precision (type D) data types are stored in even-odd pairs of FPRs, and the long-integer (type L) and paired single (type PS) data types are not supported. Figure 2-7 shows the layout of these registers.

A 64-bit floating point unit contains 32 64-bit FPRs, each of which is capable of storing any data type. For compatibility with 32-bit FPUs, the FR bit in the CP0 Status register is used by a MIPS64 Release 1 processor, or any Release 2 processor that supports a 64-bit FPU, to configure the FPU in a mode in which the FPRs are treated as 32 32-bit registers, each of which is capable of storing only 32-bit data types. In this mode, the double-precision floating point (type D) data type is stored in even-odd pairs of FPRs, and the long-integer (type L) and paired single (type PS) data types are not supported.

Figure 2-8 shows the layout of the FPU Registers when the FR bit in the CP0 Status register is 1; Figure 2-9 shows the layout of the FPU Registers when the FR bit in the CP0 Status register is 0.
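The storage rules above can be restated as a small lookup. The sketch below is an illustration under stated assumptions: the function name fpr_slots and its arguments are hypothetical, the format letters S, W, D, L, and PS are those used in this book, and requiring an even starting register for type D is this sketch's reading of "even-odd pairs".

```python
def fpr_slots(fmt, reg, fr):
    """Return the FPR numbers that hold one value of format fmt starting at register reg.

    fr is the FR bit of the CP0 Status register:
    0 = FPRs treated as 32 32-bit registers, 1 = 32 64-bit registers.
    """
    if fmt not in ("S", "W", "D", "L", "PS"):
        raise ValueError("unknown format: " + fmt)
    if not 0 <= reg <= 31:
        raise ValueError("FPR number must be in 0..31")
    if fr == 1:
        return (reg,)                # a 64-bit FPR can store any data type
    if fmt in ("S", "W"):
        return (reg,)                # a 32-bit data type fits in one 32-bit FPR
    if fmt == "D":
        if reg % 2 != 0:
            # Assumption: a pair starts at the even register.
            raise ValueError("type D uses an even-odd pair, so reg must be even")
        return (reg, reg + 1)
    # Types L and PS are not supported when the FPRs are 32 bits wide.
    raise ValueError("type " + fmt + " is not supported with 32-bit FPRs")


print(fpr_slots("D", 2, 0))
print(fpr_slots("D", 3, 1))
```

The same function covers a true 32-bit FPU by passing fr as 0, since the text describes both cases with the same pairing rule.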
Figure 2-9 FPU Registers for a 64-bit FPU if Status_FR is 0
(Diagram: General Purpose Registers f0 through f31, with bits 63 to 32 marked UNPREDICTABLE and bits 31 to 0 holding data; Special Purpose Registers FCR0, FCR25, FCR26, FCR28, and FCSR, bits 31 to 0.)

Bytes within larger CPU data formats (halfword, word, and doubleword) can be configured in either big-endian or little-endian order, as described in the following subsections:

• “Big-Endian Order”
• “Little-Endian Order”
• “MIPS Bit Endianness”

Endianness defines the location of byte 0 within a larger data structure (in this book, bits are always numbered with 0 on the right). Figures 2-10 and 2-11 show the ordering of bytes within words and the ordering of words within multiple-word structures for both big-endian and little-endian configurations.

2.9.6.1 Big-Endian Order

When configured in big-endian order, byte 0 is the most-significant (left-hand) byte. Figure 2-10 shows this configuration.

Figure 2-10 Big-Endian Byte Ordering

2.9.6.2 Little-Endian Order

When configured in little-endian order, byte 0 is always the least-significant (right-hand) byte. Figure 2-11 shows this configuration.

Figure 2-11 Little-Endian Byte Ordering

2.9.6.3 MIPS Bit Endianness

In this book, bit 0 is always the least-significant (right-hand) bit. Although no instructions explicitly designate bit positions within words, MIPS bit designations are always little-endian.

Figure 2-12 shows big-endian and Figure 2-13 shows little-endian byte ordering in doublewords.
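The two byte orderings can be seen by laying out one 32-bit word in ascending address order. The helper name store_word and the sample value are illustrative only; the sketch relies on Python's built-in int.to_bytes rather than on anything defined by the architecture.

```python
def store_word(value, big_endian):
    """Return the 4 bytes of a 32-bit word in ascending address order (byte 0 first)."""
    order = "big" if big_endian else "little"
    return (value & 0xFFFFFFFF).to_bytes(4, order)


word = 0x11223344
big = store_word(word, big_endian=True)
little = store_word(word, big_endian=False)

print(hex(big[0]))     # prints 0x11: byte 0 is the most-significant byte
print(hex(little[0]))  # prints 0x44: byte 0 is the least-significant byte
```

In both cases the bit numbering inside the word is unchanged; only the assignment of byte addresses to the four bytes differs.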
(Diagrams for Figure 2-10 and Figure 2-11: word addresses 0, 4, 8, and 12 running from lower to higher address, bit numbers 31 to 0, and byte numbers 0 to 15 placed within each 4-byte word.)

Figure 2-12 Big-Endian Data in Doubleword Format
Figure 2-13 Little-Endian Data in Doubleword Format
(Diagrams: bit numbers 63 to 0 of a doubleword, byte numbers 0 to 7, the word and halfword subdivisions, the most-significant and least-significant bytes, and the bits within a byte.)

2.9.6.4 Addressing Alignment Constraints

The CPU uses byte addressing for halfword, word, and doubleword accesses with the following alignment constraints:

• Halfword accesses must be aligned on an even byte boundary (0, 2, 4...).
• Word accesses must be aligned on a byte boundary divisible by four (0, 4, 8...).
• Doubleword accesses must be aligned on a byte boundary divisible by eight (0, 8, 16...).

2.9.6.5 Unaligned Loads and Stores

The following instructions load and store words that are not aligned on word (W) or doubleword (D) boundaries:

Table 2-3 Unaligned Load and Store Instructions

Alignment     Instructions          Instruction Set
Word          LWL, LWR, SWL, SWR    MIPS32 ISA
Doubleword    LDL, LDR, SDL, SDR    MIPS64 ISA

Figure 2-14 shows a big-endian access of a misaligned word that has byte address 3, and Figure 2-15 shows a little-endian access of a misaligned word that has byte address 1.
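The alignment constraints in 2.9.6.4 amount to a divisibility test on the byte address, and Table 2-3 names the instructions to fall back on when the test fails. The sketch below assumes hypothetical helper names (is_aligned, unaligned_instructions); it does not emulate what LWL or LWR actually do to a register.

```python
# Access sizes in bytes, taken from the alignment constraints listed above.
ACCESS_SIZES = {"halfword": 2, "word": 4, "doubleword": 8}

# Unaligned load and store instructions, as listed in Table 2-3.
UNALIGNED = {
    "word": ("LWL", "LWR", "SWL", "SWR"),
    "doubleword": ("LDL", "LDR", "SDL", "SDR"),
}


def is_aligned(address, kind):
    """True when a byte address satisfies the alignment constraint for this access size."""
    return address % ACCESS_SIZES[kind] == 0


def unaligned_instructions(kind):
    """Instructions that Table 2-3 lists for accesses not aligned to this size."""
    return UNALIGNED[kind]


print(is_aligned(3, "word"))           # byte address 3 is the misaligned big-endian example
print(is_aligned(8, "doubleword"))
print(unaligned_instructions("word"))
```

Note that Table 2-3 has no halfword row, so unaligned_instructions raises KeyError for "halfword" in this sketch.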
Chapter 3 Application Specific Extensions

This section gives an overview of the Application Specific Extensions that are supported by the MIPS32 Architecture.

3.1 Description of ASEs

As the MIPS architecture is adopted into a wider variety of markets, the need to extend this architecture in different directions becomes more and more apparent. Therefore various optional application-specific extensions are provided for use with the base ISAs (MIPS32 and MIPS64). The ASEs are optional, so the architecture is not permanently bound to support them and the ASEs are used only as needed.

Extensions to the ISA are driven by the requirements of the computer segment, or by customers whose focus is primarily on performance. An ASE can be used with the appropriate ISA to meet the needs of a specific application or an entire class of applications.

Figure 3-1 shows how ASEs interrelate with ISAs.

Figure 3-1 MIPS ISAs and ASEs
Figure 3-2 User-Mode MIPS ISAs and Optional ASEs
(Diagram labels: MIPS-3D ASE, MIPS16e ASE, MDMX ASE, Next Generation ASE, MIPS32 Architecture, MIPS64 Architecture, SmartMIPS ASE; Code Compaction, Smart Cards, Enhanced Geometry Processing, Media Processing.)

The MIPS32 Architecture is a strict subset of the MIPS64 Architecture. ASEs are applicable to one or both of the base architectures as dictated by market need and the requirements placed on the base architecture by the ASE definition.

3.2 List of Application Specific Instructions

As of the publishing date of this document, the following Application Specific Extensions were supported by the architecture.

3.2.1 The MIPS16e Application Specific Extension to the MIPS32 Architecture

The MIPS16e ASE is composed of 16-bit compressed code instructions, designed for the embedded processor market and situations with tight memory constraints.
The core can execute both 16- and 32-bit instructions intermixed in the same program, and is compatible with both the MIPS32 and MIPS64 Architectures. Volume IV-a of this document set describes the MIPS16e ASE.

3.2.2 The MDMX Application Specific Extension to the MIPS64 Architecture

The MIPS Digital Media Extension (MDMX) provides video, audio, and graphics pixel processing through vectors of small integers. Although not a part of the MIPS ISA, this extension is included for informational purposes. Because the MDMX ASE requires the MIPS64 Architecture, it is not discussed in this document set.

3.2.3 The MIPS-3D Application Specific Extension to the MIPS64 Architecture

The MIPS-3D ASE provides enhanced performance of geometry processing calculations by building on the paired single floating point data type, and adding specific instructions to accelerate computations on these data types. Volume IV-c of this document set describes the MIPS-3D ASE. Because the MIPS-3D ASE requires a 64-bit floating point unit, it is only available with a Release 1 MIPS64 processor, or a Release 2 MIPS32 or MIPS64 processor that includes a 64-bit FPU.

3.2.4 The SmartMIPS Application Specific Extension to the MIPS32 Architecture

The SmartMIPS ASE extends the MIPS32 Architecture with a set of new and modified instructions designed to improve the performance and reduce the memory consumption of MIPS-based smart card or smart object systems. Volume IV-d of this document set describes the SmartMIPS ASE.

ASE           Base Architecture Requirement    Use
MIPS16e™      MIPS32 or MIPS64                 Code Compaction
MDMX™         MIPS64                           Digital Media
MIPS-3D™      MIPS64                           Geometry Processing
SmartMIPS™    MIPS32                           Smart Cards and Smart Objects

Chapter 4 Overview of the CPU Instruction Set

This chapter gives an overview of the CPU instructions, including a description of CPU instruction formats.
An overview of the FPU instructions is given in Chapter 5.

4.1 CPU Instructions, Grouped By Function

CPU instructions are organized into the following functional groups:

• Load and store
• Computational
• Jump and branch
• Miscellaneous
• Coprocessor

Each instruction is 32 bits long.

4.1.1 CPU Load and Store Instructions

MIPS processors use a load/store architecture; all operations are performed on operands held in processor registers, and main memory is accessed only through load and store instructions.

4.1.1.1 Types of Loads and Stores

There are several different types of load and store instructions, each designed for a different purpose:

• Transferring variously-sized fields (for example, LB, SW)
• Treating loaded data as signed or unsigned integers (for example, LHU)
• Accessing unaligned fields (for example, LWR, SWL)
• Selecting the addressing mode (for example, SDXC1, in the FPU)
• Atomic memory update (read-modify-write: for instance, LL/SC)

Regardless of the byte ordering (big- or little-endian), the address of a halfword, word, or doubleword is the lowest byte address among the bytes forming the object:

• For big-endian ordering, this is the most-significant byte.
• For little-endian ordering, this is the least-significant byte.

Refer to “Byte Ordering and Endianness” on page 21 for more information on big-endian and little-endian data ordering.

Table 4-6 lists the specific FPU load and store instructions [1]; it also lists the MIPS ISA within which an instruction was first defined.

4.1.2 Computational Instructions

This section describes the following:

• “ALU Immediate and Three-Operand Instructions”
• “ALU Two-Operand Instructions”
• “Shift Instructions”
• “Multiply and Divide Instructions”

2’s complement arithmetic is performed on integers represented in 2’s complement notation.
These are signed versions of the following operations:

• Add
• Subtract
• Multiply
• Divide

The add and subtract operations labelled “unsigned” are actually modulo arithmetic without overflow detection. There are also unsigned versions of multiply and divide, as well as a full complement of shift and logical operations. Logical operations are not sensitive to the width of the register. MIPS32 provides 32-bit integers and 32-bit arithmetic.

Table 4-6 FPU Load and Store Instructions Using Register + Register Addressing

Mnemonic  Instruction                                              Defined in MIPS ISA
LWXC1     Load Word Indexed to Floating Point                      MIPS64, MIPS32 Release 2
SWXC1     Store Word Indexed from Floating Point                   MIPS64, MIPS32 Release 2
LDXC1     Load Doubleword Indexed to Floating Point                MIPS64, MIPS32 Release 2
SDXC1     Store Doubleword Indexed from Floating Point             MIPS64, MIPS32 Release 2
LUXC1     Load Doubleword Indexed Unaligned to Floating Point      MIPS64, MIPS32 Release 2
SUXC1     Store Doubleword Indexed Unaligned from Floating Point   MIPS64, MIPS32 Release 2

[1] FPU loads and stores are listed here with the other coprocessor loads and stores for convenience.

4.1.2.1 ALU Immediate and Three-Operand Instructions

Table 4-7 lists those arithmetic and logical instructions that operate on one operand from a register and the other from a 16-bit immediate value supplied by the instruction word. This table also lists the MIPS ISA within which an instruction is defined.

The immediate operand is treated as a signed value for the arithmetic and compare instructions, and treated as a logical value (zero-extended to register length) for the logical instructions.

Table 4-8 describes ALU instructions that use three operands, along with the MIPS ISA within which an instruction is defined.
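The two treatments of the 16-bit immediate described above, and the modulo behavior of the "unsigned" add, can be sketched in a few lines of Python. This is an illustration only: the helper names (`sign_extend16`, `zero_extend16`, `addiu`, `ori`) are hypothetical, and registers are modeled as 32-bit unsigned Python integers rather than real GPRs.

```python
MASK32 = 0xFFFFFFFF  # assumption: registers are modeled as 32-bit unsigned values


def sign_extend16(imm):
    """Sign-extend a 16-bit immediate (arithmetic and compare instructions)."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm


def zero_extend16(imm):
    """Zero-extend a 16-bit immediate (logical instructions)."""
    return imm & 0xFFFF


def addiu(rs, imm):
    """Hypothetical model of ADDIU: modulo 2**32 add, no overflow trap."""
    return (rs + sign_extend16(imm)) & MASK32


def ori(rs, imm):
    """Hypothetical model of ORI: OR with the zero-extended immediate."""
    return (rs | zero_extend16(imm)) & MASK32
```

With the same immediate field 0xFFFF, `addiu(5, 0xFFFF)` treats it as -1 and yields 4, while `ori(5, 0xFFFF)` treats it as 0x0000FFFF. `addiu(0xFFFFFFFF, 1)` wraps to 0 without signaling anything, which is the sense in which the "unsigned" label is a misnomer.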
Table 4-7 ALU Instructions With an Immediate Operand

Mnemonic   Instruction                          Defined in MIPS ISA
ADDI       Add Immediate Word                   MIPS32
ADDIU [1]  Add Immediate Unsigned Word          MIPS32
ANDI       And Immediate                        MIPS32
LUI        Load Upper Immediate                 MIPS32
ORI        Or Immediate                         MIPS32
SLTI       Set on Less Than Immediate           MIPS32
SLTIU      Set on Less Than Immediate Unsigned  MIPS32
XORI       Exclusive Or Immediate               MIPS32

[1] The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow.

Table 4-8 Three-Operand ALU Instructions

Mnemonic   Instruction                Defined in MIPS ISA
ADD        Add Word                   MIPS32
ADDU [1]   Add Unsigned Word          MIPS32
AND        And                        MIPS32
NOR        Nor                        MIPS32
OR         Or                         MIPS32
SLT        Set on Less Than           MIPS32
SLTU       Set on Less Than Unsigned  MIPS32
SUB        Subtract Word              MIPS32
SUBU [1]   Subtract Unsigned Word     MIPS32
XOR        Exclusive Or               MIPS32

[1] The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow.

4.1.2.2 ALU Two-Operand Instructions

Table 4-9 describes ALU instructions that use two operands, along with the MIPS ISA within which an instruction is defined.

4.1.2.3 Shift Instructions

The ISA defines two types of shift instructions:

• Those that take a fixed shift amount from a 5-bit field in the instruction word (for instance, SLL, SRL)
• Those that take a shift amount from the low-order bits of a general register (for instance, SRAV, SRLV)

Shift instructions are listed in Table 4-10, along with the MIPS ISA within which an instruction is defined.

4.1.2.4 Multiply and Divide Instructions

The multiply and divide instructions produce twice as many result bits as is typical with other processors. With one exception, they deliver their results into the HI and LO special registers.
The MUL instruction delivers the lower half of the result directly to a GPR.

• Multiply produces a full-width product twice the width of the input operands; the low half is loaded into LO and the high half is loaded into HI.
• Multiply-Add and Multiply-Subtract produce a full-width product twice the width of the input operands and add the product to, or subtract it from, the concatenated value of HI and LO. The low half of the result is loaded into LO and the high half is loaded into HI.
• Divide produces a quotient that is loaded into LO and a remainder that is loaded into HI.

The results are accessed by instructions that transfer data between HI/LO and the general registers.

Table 4-9 Two-Operand ALU Instructions

Mnemonic  Instruction                  Defined in MIPS ISA
CLO       Count Leading Ones in Word   MIPS32
CLZ       Count Leading Zeros in Word  MIPS32

Table 4-10 Shift Instructions

Mnemonic  Instruction                           Defined in MIPS ISA
ROTR      Rotate Word Right                     MIPS32 Release 2
ROTRV     Rotate Word Right Variable            MIPS32 Release 2
SLL       Shift Word Left Logical               MIPS32
SLLV      Shift Word Left Logical Variable      MIPS32
SRA       Shift Word Right Arithmetic           MIPS32
SRAV      Shift Word Right Arithmetic Variable  MIPS32
SRL       Shift Word Right Logical              MIPS32
SRLV      Shift Word Right Logical Variable     MIPS32

4.1.4 Miscellaneous Instructions

Miscellaneous instructions include:

• “Instruction Serialization (SYNC and SYNCI)”
• “Exception Instructions”
• “Conditional Move Instructions”
• “Prefetch Instructions”
• “NOP Instructions”

4.1.4.1 Instruction Serialization (SYNC and SYNCI)

In normal operation, the order in which load and store memory accesses appear to a viewer outside the executing processor (for instance, in a multiprocessor system) is not specified by the architecture.
The SYNC instruction can be used to create a point in the executing instruction stream at which the relative order of some loads and stores can be determined: loads and stores executed before the SYNC are completed before loads and stores after the SYNC can start.

The SYNCI instruction synchronizes the processor caches with previous writes or other modifications to the instruction stream.

Table 4-16 lists the synchronization instructions, along with the MIPS ISA within which each is defined.

Table 4-14 PC-Relative Conditional Branch Instructions Comparing With Zero

Mnemonic  Instruction                                       Defined in MIPS ISA
BGEZ      Branch on Greater Than or Equal to Zero           MIPS32
BGEZAL    Branch on Greater Than or Equal to Zero and Link  MIPS32
BGTZ      Branch on Greater Than Zero                       MIPS32
BLEZ      Branch on Less Than or Equal to Zero              MIPS32
BLTZ      Branch on Less Than Zero                          MIPS32
BLTZAL    Branch on Less Than Zero and Link                 MIPS32

Table 4-15 Deprecated Branch Likely Instructions

Mnemonic  Instruction                                              Defined in MIPS ISA
BEQL      Branch on Equal Likely                                   MIPS32
BGEZALL   Branch on Greater Than or Equal to Zero and Link Likely  MIPS32
BGEZL     Branch on Greater Than or Equal to Zero Likely           MIPS32
BGTZL     Branch on Greater Than Zero Likely                       MIPS32
BLEZL     Branch on Less Than or Equal to Zero Likely              MIPS32
BLTZALL   Branch on Less Than Zero and Link Likely                 MIPS32
BLTZL     Branch on Less Than Zero Likely                          MIPS32
BNEL      Branch on Not Equal Likely                               MIPS32

4.1.4.2 Exception Instructions

Exception instructions transfer control to a software exception handler in the kernel. There are two types of exceptions, conditional and unconditional.
These are caused by the following instructions:

• Trap instructions, which cause conditional exceptions based upon the result of a comparison
• System call and breakpoint instructions, which cause unconditional exceptions

Table 4-17 lists the system call and breakpoint instructions. Table 4-18 lists the trap instructions that compare two registers. Table 4-19 lists the trap instructions that compare a register value with an immediate value. Each table also lists the MIPS ISA within which an instruction is defined.

Table 4-16 Serialization Instruction

Mnemonic  Instruction                                              Defined in MIPS ISA
SYNC      Synchronize Shared Memory                                MIPS32
SYNCI     Synchronize Caches to Make Instruction Writes Effective  MIPS32 Release 2

Table 4-17 System Call and Breakpoint Instructions

Mnemonic  Instruction  Defined in MIPS ISA
BREAK     Breakpoint   MIPS32
SYSCALL   System Call  MIPS32

Table 4-18 Trap-on-Condition Instructions Comparing Two Registers

Mnemonic  Instruction                             Defined in MIPS ISA
TEQ       Trap if Equal                           MIPS32
TGE       Trap if Greater Than or Equal           MIPS32
TGEU      Trap if Greater Than or Equal Unsigned  MIPS32
TLT       Trap if Less Than                       MIPS32
TLTU      Trap if Less Than Unsigned              MIPS32
TNE       Trap if Not Equal                       MIPS32

Table 4-19 Trap-on-Condition Instructions Comparing an Immediate Value

Mnemonic  Instruction                                       Defined in MIPS ISA
TEQI      Trap if Equal Immediate                           MIPS32
TGEI      Trap if Greater Than or Equal Immediate           MIPS32
TGEIU     Trap if Greater Than or Equal Immediate Unsigned  MIPS32
TLTI      Trap if Less Than Immediate                       MIPS32
TLTIU     Trap if Less Than Immediate Unsigned              MIPS32
TNEI      Trap if Not Equal Immediate                       MIPS32

4.1.4.3 Conditional Move Instructions

MIPS32 includes instructions to conditionally move one CPU general register to another, based on the value in a third general register. For floating point conditional moves, refer to Chapter 5.

Table 4-20 lists conditional move instructions, along with the MIPS ISA within which an instruction is defined.

Table 4-20 CPU Conditional Move Instructions

Mnemonic  Instruction                               Defined in MIPS ISA
MOVF      Move Conditional on Floating Point False  MIPS32
MOVN      Move Conditional on Not Zero              MIPS32
MOVT      Move Conditional on Floating Point True   MIPS32
MOVZ      Move Conditional on Zero                  MIPS32

4.1.4.4 Prefetch Instructions

There are two prefetch advisory instructions:

• One with register+offset addressing (PREF)
• One with register+register addressing (PREFX)

These instructions advise that memory is likely to be used in a particular way in the near future and should be prefetched into the cache. The PREFX instruction is encoded in the FPU opcode space, along with the other operations using register+register addressing.

Table 4-21 Prefetch Instructions

Mnemonic  Instruction       Addressing Mode    Defined in MIPS ISA
PREF      Prefetch          Register+Offset    MIPS32
PREFX     Prefetch Indexed  Register+Register  MIPS64

4.1.4.5 NOP Instructions

The NOP instruction is actually encoded as an all-zero instruction. MIPS processors special-case this encoding as performing no operation, and optimize execution of the instruction. In addition, the SSNOP instruction takes up one issue cycle on any processor, including super-scalar implementations of the architecture.

Table 4-24 describes the fields used in the CPU instruction formats shown in Figure 4-1 through Figure 4-3.
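The field boundaries of the three CPU instruction formats (Figure 4-1 through Figure 4-3, Table 4-24) can be made concrete with a small decoder. The following Python sketch is an illustration, not part of the architecture definition: the function names `decode_fields` and `jump_target_low28` are hypothetical, and the decoder extracts every field from every word without checking which format the opcode actually selects.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the fields named in Table 4-24.

    All fields are extracted regardless of format; the caller decides which
    ones are meaningful for a given opcode (an assumption of this sketch).
    """
    word &= 0xFFFFFFFF
    return {
        "opcode": (word >> 26) & 0x3F,      # bits 31:26, all formats
        "rs": (word >> 21) & 0x1F,          # bits 25:21, I-Type and R-Type
        "rt": (word >> 16) & 0x1F,          # bits 20:16, I-Type and R-Type
        "rd": (word >> 11) & 0x1F,          # bits 15:11, R-Type
        "sa": (word >> 6) & 0x1F,           # bits 10:6,  R-Type
        "function": word & 0x3F,            # bits 5:0,   R-Type
        "immediate": word & 0xFFFF,         # bits 15:0,  I-Type
        "instr_index": word & 0x03FFFFFF,   # bits 25:0,  J-Type
    }


def jump_target_low28(word):
    """Low-order 28 bits of a jump target: instr_index shifted left two bits."""
    return (word & 0x03FFFFFF) << 2
```

As a usage example with purely illustrative field values, `decode_fields((9 << 26) | (8 << 16) | 0xFFFF)` returns opcode 9, rs 0, rt 8, and immediate 0xFFFF. Decoding the all-zero word returns zero in every field, which matches the statement that NOP is encoded as an all-zero instruction.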
Figure 4-1 Immediate (I-Type) CPU Instruction Format

Bits   31:26   25:21  20:16  15:0
Field  opcode  rs     rt     immediate
Width  6       5      5      16

Figure 4-2 Jump (J-Type) CPU Instruction Format

Bits   31:26   25:0
Field  opcode  instr_index
Width  6       26

Figure 4-3 Register (R-Type) CPU Instruction Format

Bits   31:26   25:21  20:16  15:11  10:6  5:0
Field  opcode  rs     rt     rd     sa    function
Width  6       5      5      5      5     6

Table 4-24 CPU Instruction Format Fields

Field        Description
opcode       6-bit primary operation code
rd           5-bit specifier for the destination register
rs           5-bit specifier for the source register
rt           5-bit specifier for the target (source/destination) register or used to specify functions within the primary opcode REGIMM
immediate    16-bit signed immediate used for logical operands, arithmetic signed operands, load/store address byte offsets, and PC-relative branch signed instruction displacement
instr_index  26-bit index shifted left two bits to supply the low-order 28 bits of the jump target address
sa           5-bit shift amount
function     6-bit function field used to specify functions within the primary opcode SPECIAL

Chapter 5 Overview of the FPU Instruction Set

This chapter describes the instruction set architecture (ISA) for the floating point unit (FPU) in the MIPS32 architecture. In the MIPS architecture, the FPU is implemented via Coprocessor 1 and Coprocessor 3, an optional processor implementing IEEE Standard 754 [1] floating point operations. The FPU also provides a few additional operations not defined by the IEEE standard.
This chapter provides an overview of the following FPU architectural details:

• Section 5.1, "Binary Compatibility"
• Section 5.2, "Enabling the Floating Point Coprocessor"
• Section 5.3, "IEEE Standard 754"
• Section 5.4, "FPU Data Types"
• Section 5.5, "Floating Point Register Types"
• Section 5.6, "Floating Point Control Registers (FCRs)"
• Section 5.7, "Formats of Values Used in FP Registers"
• Section 5.8, "FPU Exceptions"
• Section 5.9, "FPU Instructions"
• Section 5.10, "Valid Operands for FPU Instructions"
• Section 5.11, "FPU Instruction Formats"

The FPU instruction set is summarized by functional group. Each instruction is also described individually in alphabetical order in Volume II.

5.1 Binary Compatibility

In addition to an Instruction Set Architecture, the MIPS architecture definition includes processing resources such as the set of coprocessor general registers. In Release 1 of the Architecture, the 32-bit registers in MIPS32 were enlarged to 64 bits in MIPS64; however, these 64-bit FPU registers are not backwards compatible. Instead, processors implementing the MIPS64 Architecture provide a mode bit to select either the 32-bit or 64-bit register model. In Release 2 of the Architecture, a 32-bit CPU may include a full 64-bit coprocessor, including a floating point unit which implements the same mode bit to select the 32-bit or 64-bit FPU register model.

Any processor implementing MIPS64 can also run MIPS32 binary programs, built for the same or a lower release of the Architecture, without change.

[1] In this chapter, references to “IEEE standard” and “IEEE Standard 754” refer to IEEE Standard 754-1985, “IEEE Standard for Binary Floating Point Arithmetic.” For more information about this standard, see the IEEE web page at http://stdsbbs.ieee.org/.
5.2 Enabling the Floating Point Coprocessor

Enabling the Floating Point Coprocessor is done by enabling Coprocessor 1, and is a privileged operation provided by the System Control Coprocessor. If Coprocessor 1 is not enabled, an attempt to execute a floating point instruction causes a Coprocessor Unusable exception. Every system environment either enables the FPU automatically or provides a means for an application to request that it be enabled.

5.3 IEEE Standard 754

IEEE Standard 754 defines the following:

• Floating point data types
• The basic arithmetic, comparison, and conversion operations
• A computational model

The IEEE standard does not define specific processing resources nor does it define an instruction set.

The MIPS architecture includes non-IEEE FPU control and arithmetic operations (multiply-add, reciprocal, and reciprocal square root) which may not supply results that match the IEEE precision rules.

5.4 FPU Data Types

The FPU provides both floating point and fixed point data types, which are described in the next two sections.

• The single and double precision floating point data types are those specified by the IEEE standard.
• The fixed point types are signed integers provided by the CPU architecture.

5.4.1 Floating Point Formats

The following three floating point formats are provided by the FPU:

• 32-bit single precision floating point (type S, shown in Figure 5-1)
• 64-bit double precision floating point (type D, shown in Figure 5-2)
• 64-bit paired single floating point, combining two single precision data types (type PS, shown in Figure 5-3)

The floating point data types represent numeric values as well as other special entities, such as the following:

• Two infinities, +∞ and -∞
• Signaling non-numbers (SNaNs)
• Quiet non-numbers (QNaNs)
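To make the special entities listed above more tangible, the following Python sketch classifies a 32-bit single precision bit pattern. It assumes the standard IEEE 754 single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits), which this excerpt does not spell out, and it runs on the host's floating point rather than on a MIPS FPU. The names `single_bits` and `classify_single` are hypothetical.

```python
import struct


def single_bits(x):
    """Return the 32-bit IEEE 754 single precision encoding of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify_single(bits):
    """Classify a 32-bit pattern as '+inf', '-inf', 'nan', or 'finite'."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF   # all ones marks infinities and NaNs
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        if fraction == 0:
            return "-inf" if sign else "+inf"
        return "nan"                 # nonzero fraction: SNaN or QNaN
    return "finite"
```

For example, `single_bits(float("inf"))` is 0x7F800000 and classifies as "+inf", while any pattern with an all-ones exponent and a nonzero fraction classifies as "nan". Distinguishing signaling from quiet NaNs requires the per-format representations that the architecture lists separately, so this sketch deliberately stops short of that.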
... indicating that an exceptional condition arose during the computation. To permit this, each floating point format defines representations, listed in Table 5-2, for plus infinity (+∞), minus infinity (-∞), quiet non-numbers (QNaN), and signaling non-numbers (SNaN).

5.4.1.3 Infinity and Beyond

Infinity represents a number with magnitude too large to be represented in the format; in essence it exists to represent a magnitude overflow during a computation. A correctly signed ∞ is generated as the default result in division by zero and some cases of overflow; details are given with the IEEE exception conditions (see Section 5.8, "FPU Exceptions").

Once created as a default result, ∞ can become an operand in a subsequent operation. The infinities are interpreted such that -∞ < (every finite number) < +∞. Arithmetic with ∞ is the limiting case of real arithmetic with operands of arbitrarily large magnitude, when such limits exist. In these cases, arithmetic on ∞ is regarded as exact and exception conditions do not arise. The out-of-range indication represented by ∞ is propagated through subsequent computations. For some cases there is no meaningful limiting case in real arithmetic for operands of ∞, and these cases raise the Invalid Operation exception condition (see “Invalid Operation Exception” on page 60).

5.4.1.4 Signalling Non-Number (SNaN)

SNaN operands cause the Invalid Operation exception for arithmetic operations. SNaNs are useful values to put in uninitialized variables. An SNaN is never produced as a result value.

IEEE Standard 754 states that “Whether copying a signaling NaN without a change of format signals the Invalid Operation exception is the implementor’s option.” The MIPS architecture has chosen to make the formatted operand move instructions (MOV.fmt, MOVT.fmt, MOVF.fmt, MOVN.fmt, MOVZ.fmt) non-arithmetic, and they do not signal IEEE 754 exceptions.
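The rules for arithmetic on ∞ in Section 5.4.1.3 can be observed on any IEEE 754 host. The sketch below uses Python floats, which are IEEE double precision on common platforms (an assumption about the host, not a statement about the MIPS FPU), and relies on Python leaving the Invalid Operation condition untrapped so that a NaN is delivered. The variable names are illustrative.

```python
import math
import sys

inf = math.inf
largest = sys.float_info.max  # largest finite double on this host

# -inf < (every finite number) < +inf
ordered = -inf < -largest < 0.0 < largest < inf

# Cases where a limit exists: the result is regarded as exact.
exact_sum = inf + largest        # stays +inf
exact_quotient = 1.0 / inf       # limit is zero

# Cases with no meaningful limit: Invalid Operation, delivered here as NaN
# because the host does not trap the condition.
no_limit_difference = inf - inf
no_limit_product = 0.0 * inf
```

The first group shows ∞ propagating through further computation, and the second shows the combinations for which real arithmetic has no limiting value.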
5.4.1.5 Quiet Non-Number (QNaN)

QNaNs are intended to afford retrospective diagnostic information inherited from invalid or unavailable data and results. Propagation of the diagnostic information requires information contained in a QNaN to be preserved through arithmetic operations and floating point format conversions.

QNaN operands do not cause arithmetic operations to signal an exception. When a floating point result is to be delivered, a QNaN operand causes an arithmetic operation to supply a QNaN result. When possible, this QNaN result is one of the operand QNaN values. QNaNs do have effects similar to SNaNs on operations that do not deliver a floating point result, specifically comparisons. (For more information, see the detailed description of the floating point compare instruction, C.cond.fmt.)

When certain invalid operations not involving QNaN operands are performed but do not trap (because the trap is not enabled), a new QNaN value is created. Table 5-3 shows the QNaN value generated when no input operand QNaN value can be copied. The values listed for the fixed point formats are the values supplied to satisfy the IEEE standard when a QNaN or infinite floating point value is converted to fixed point. There is no other feature of the architecture that detects or makes use of these “integer QNaN” values.

Table 5-3 Value Supplied When a New Quiet NaN Is Created

Format                 New QNaN value
Single floating point  16#7fbf ffff
Double floating point  16#7ff7 ffff ffff ffff
Word fixed point       16#7fff ffff
Longword fixed point   16#7fff ffff ffff ffff
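The values in Table 5-3 are easier to read once split into fields. The Python sketch below does this under the assumption of the standard IEEE 754 field widths (8 exponent and 23 fraction bits for single, 11 and 52 for double), which are not restated in this excerpt. The constant and function names are hypothetical.

```python
# Values copied from Table 5-3.
NEW_QNAN_SINGLE = 0x7FBFFFFF
NEW_QNAN_DOUBLE = 0x7FF7FFFFFFFFFFFF
NEW_QNAN_WORD = 0x7FFFFFFF
NEW_QNAN_LONGWORD = 0x7FFFFFFFFFFFFFFF


def split_single(bits):
    """Return (sign, exponent, fraction) of a 32-bit single precision pattern."""
    return (bits >> 31) & 0x1, (bits >> 23) & 0xFF, bits & ((1 << 23) - 1)


def split_double(bits):
    """Return (sign, exponent, fraction) of a 64-bit double precision pattern."""
    return (bits >> 63) & 0x1, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)
```

`split_single(NEW_QNAN_SINGLE)` gives sign 0, exponent 0xFF, and fraction 0x3FFFFF: an all-ones exponent with a nonzero fraction whose most significant bit is clear. The double precision value has the same shape, and the two fixed point values are simply the largest positive 32-bit and 64-bit 2's complement integers.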
5.4.1.6 Paired Single Exceptions

Exception conditions that arise while executing the two halves of a floating point vector operation are ORed together, and the instruction is treated as having caused all the exceptional conditions arising from both operations. The hardware makes no effort to determine which of the two operations encountered the exceptional condition.

5.4.1.7 Paired Single Condition Codes

The c.cond.PS instruction compares the upper and lower halves of FPR fs and FPR ft independently and writes the results into condition codes CC+1 and CC respectively. The CC number must be even. If the number is not even, the operation of the instruction is UNPREDICTABLE.

5.4.2 Fixed Point Formats

The FPU provides two fixed point data types:

• 32-bit Word fixed point (type W), shown in Figure 5-4
• 64-bit Longword fixed point (type L), shown in Figure 5-5

The fixed point values are held in the 2’s complement format used for signed integers in the CPU. Unsigned fixed point data types are not provided by the architecture; application software may synthesize computations for unsigned integers from the existing instructions and data types.

Figure 5-4 Word Fixed Point Format (W)

Bits   31  30:0
Field  S   Integer
Width  1   31

Figure 5-5 Longword Fixed Point Format (L)

Bits   63  62:0
Field  S   Integer
Width  1   63

5.5 Floating Point Register Types

This section describes the organization and use of the two types of FPU register sets:

• Floating Point registers (FPRs) are 32 or 64 bits wide. In Release 1 of the Architecture, 64-bit floating point units were supported only by implementations of the MIPS64 Architecture. Similarly, implementations of the MIPS32 Architecture supported only 32-bit floating point units. In Release 2 of the Architecture, a 64-bit floating point unit is supported on implementations of both the MIPS32 and MIPS64 Architectures.

  A 32-bit floating point unit contains 32 32-bit FPRs, each of which is capable of storing a 32-bit data type. Double-precision (type D) data types are stored in even-odd pairs of FPRs, and the long-integer (type L) and paired single (type PS) data types are not supported.

  A 64-bit floating point unit contains 32 64-bit FPRs, each of which is capable of storing any data type. For compatibility with 32-bit FPUs, the FR bit in the CP0 Status register is used by a MIPS64 Release 1 processor, or any Release 2 processor that supports a 64-bit FPU, to configure the FPU in a mode in which the FPRs are treated as 32 32-bit registers, each of which is capable of storing only 32-bit data types. In this mode, the double-precision floating point (type D) data type is stored in even-odd pairs of FPRs, and the long-integer (type L) and paired single (type PS) data types are not supported.

  These registers transfer binary data between the FPU and the system, and are also used to hold formatted FPU operand values. Refer to Volume III, The MIPS Privileged Architecture Manual, for more information on the CP0 Registers.

• Floating Point Control registers (FCRs) are 32 bits wide. There are five FPU control registers, used to identify and control the FPU. These registers are indicated by the fs field of the instruction word. Three of these registers, FCCR, FEXR, and FENR, select subsets of the floating point Control/Status register, the FCSR.

5.5.1 FPU Register Models

There are separate FPU register models in Release 1 of the Architecture:

• MIPS32 defines 32 32-bit registers, with D-format values stored in even-odd pairs of registers.
• MIPS64 defines 32 64-bit registers, with all formats supported in a register.

To support MIPS32 programs, MIPS64 processors also provide the MIPS32 register model, which is available as a mode selection through the FR bit of the CP0 Status Register.
In Release 2 of the Architecture, both FPU register models are supported in MIPS32 (as well as MIPS64) implementations, and the FR bit of the CP0 Status Register selects between them.

5.5.2 Binary Data Transfers (32-Bit and 64-Bit)

The data transfer instructions move words and doublewords between the FPU FPRs and the remainder of the system. The operations of the word and doubleword load and move-to instructions are shown in Figure 5-6 and Figure 5-7. The store and move-from instructions operate in reverse, reading data from the location which the corresponding load or move-to instruction wrote.

Figure 5-6 FPU Word Load and Move-to Operations
(Diagram: contents of the FPRs, for FR bit = 1 and FR bit = 0, starting from initial values 1 and 2, after LWC1 f0, 0(r0) / MTC1 f0, r0 and then LWC1 f1, 4(r0) / MTC1 f1, r4. Each load replaces the addressed register contents with data word (0) or data word (4); the remaining bits are shown as undefined/unused.)

Table 5-4 FIR Register Field Descriptions

W (bit 20). Read/Write: R. Reset State: Preset or Externally Set. Compliance: Required (Release 2).
  Indicates that the word fixed point (W) data type and instructions are implemented:
    0  W fixed point not implemented
    1  W fixed point implemented

3D (bit 19). Read/Write: R. Reset State: Preset. Compliance: Required.
  In Release 1 of the Architecture, this bit is used by MIPS64 processors to indicate that the MIPS-3D ASE is implemented. It is not used by MIPS32 processors and reads as zero. In Release 2 of the Architecture, the MIPS-3D ASE is supported on both MIPS32 and MIPS64 processors with a 64-bit floating point unit, and this bit indicates that the MIPS-3D ASE is implemented:
    0  MIPS-3D ASE not implemented
    1  MIPS-3D ASE implemented

PS (bit 18). Read/Write: R. Reset State: Preset. Compliance: Required.
  In Release 1 of the Architecture, this bit is used by MIPS64 processors to indicate that the paired single floating point data type is implemented. It is not used by MIPS32 processors and reads as zero. In Release 2 of the Architecture, the paired single floating point data type is supported on both MIPS32 and MIPS64 processors with a 64-bit floating point unit, and this bit indicates that the paired single floating point data type is implemented:
    0  PS floating point not implemented
    1  PS floating point implemented

D (bit 17). Read/Write: R. Reset State: Preset. Compliance: Required.
  Indicates that the double-precision (D) floating point data type and instructions are implemented:
    0  D floating point not implemented
    1  D floating point implemented

S (bit 16). Read/Write: R. Reset State: Preset. Compliance: Required.
  Indicates that the single-precision (S) floating point data type and instructions are implemented:
    0  S floating point not implemented
    1  S floating point implemented

ProcessorID (bits 15:8). Read/Write: R. Reset State: Preset. Compliance: Required.
  Identifies the floating point processor.

Revision (bits 7:0). Read/Write: R. Reset State: Preset. Compliance: Optional.
  Specifies the revision number of the floating point unit. This field allows software to distinguish between one revision and another of the same floating point processor type. If this field is not implemented, it must read as zero.

5.6.2 Floating Point Control and Status Register (FCSR, CP1 Control Register 31)

Compliance Level: Required if floating point is implemented.

The Floating Point Control and Status Register (FCSR) is a 32-bit register that controls the operation of the floating point unit and reports its status. The register:

• Selects the default rounding mode for FPU arithmetic operations
• Selectively enables traps of FPU exception conditions
• Controls some denormalized number handling options
• Reports any IEEE exceptions that arose during the most recently executed instruction
• Reports IEEE exceptions that arose, cumulatively, in completed instructions
• Indicates the condition code result of FP compare instructions

Access to FCSR is not privileged; it can be read or written by any program that has access to the floating point unit (via the coprocessor enables in the Status register). Figure 5-12 shows the format of the FCSR register; Table 5-5 describes the FCSR register fields.

Figure 5-12 FCSR Register Format

Bits   31:25  24  23   22:21  20:18  17:12        11:7       6:2        1:0
Field  FCC    FS  FCC  Impl   0      Cause        Enables    Flags      RM
                                     E V Z O U I  V Z O U I  V Z O U I

FCC bits 31:25 hold condition codes 7 through 1; bit 23 holds condition code 0.

Table 5-5 FCSR Register Field Descriptions

FCC (bits 31:25, 23). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Floating point condition codes. These bits record the result of floating point compares and are tested for floating point conditional branches and conditional moves. The FCC bit to use is specified in the compare, branch, or conditional move instruction. For backward compatibility with previous MIPS ISAs, the FCC bits are separated into two non-contiguous fields.

FS (bit 24). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Flush to Zero. When FS is one, denormalized results are flushed to zero instead of causing an Unimplemented Operation exception. It is implementation dependent whether denormalized operand values are flushed to zero before the operation is carried out.

Impl (bits 22:21). Read/Write: R/W. Reset State: Undefined. Compliance: Optional.
  Available to control implementation dependent features of the floating point unit. If these bits are not implemented, they must be ignored on write and read as zero.

0 (bits 20:18). Read/Write: 0. Reset State: 0. Compliance: Reserved.
  Reserved for future use; must be written as zero; returns zero on read.

Cause (bits 17:12). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Cause bits. These bits indicate the exception conditions that arise during execution of an FPU arithmetic instruction. A bit is set to 1 if the corresponding exception condition arises during the execution of an instruction and is set to 0 otherwise. By reading the register, the exception condition caused by the preceding FPU arithmetic instruction can be determined. Refer to Table 5-6 for the meaning of each bit.

Enables (bits 11:7). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Enable bits. These bits control whether or not an exception is taken when an IEEE exception condition occurs for any of the five conditions. The exception occurs when both an Enable bit and the corresponding Cause bit are set, either during an FPU arithmetic operation or by moving a value to FCSR or one of its alternative representations. Note that Cause bit E has no corresponding Enable bit; the non-IEEE Unimplemented Operation exception is defined by MIPS as always enabled. Refer to Table 5-6 for the meaning of each bit.

Flags (bits 6:2). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Flag bits. This field shows any exception conditions that have occurred for completed instructions since the flag was last reset by software. When an FPU arithmetic operation raises an IEEE exception condition that does not result in a Floating Point Exception (i.e., the Enable bit was off), the corresponding bit(s) in the Flag field are set, while the others remain unchanged. Arithmetic operations that result in a Floating Point Exception (i.e., the Enable bit was on) do not update the Flag bits. This field is never reset by hardware and must be explicitly reset by software. Refer to Table 5-6 for the meaning of each bit.

RM (bits 1:0). Read/Write: R/W. Reset State: Undefined. Compliance: Required.
  Rounding mode. This field indicates the rounding mode used for most floating point operations (some operations use a specific rounding mode). Refer to Table 5-7 for the meaning of the encodings of this field.

5.7 Formats of Values Used in FP Registers

Unlike the CPU, the FPU does not interpret the binary encoding of source operands nor produce a binary encoding of results for every operation. The value held in a floating point operand register (FPR) has a format, or type, and it may be used only by instructions that operate on that format. The format of a value is either uninterpreted, unknown, or one of the valid numeric formats: single and double floating point, and word and long fixed point.

The format of the value in an FPR is always set when a value is written to the register:

• When a data transfer instruction writes binary data into an FPR (a load), the FPR receives a binary value that is uninterpreted.
• A computational or FP register move instruction that produces a result of type fmt puts a value of type fmt into the result register.
When an FPR with an uninterpreted value is used as a source operand by an instruction that requires a value of format fmt, the binary contents are interpreted as an encoded value in format fmt and the value in the FPR changes to a value of format fmt. The binary contents cannot be reinterpreted in a different format.

If an FPR contains a value of format fmt, a computational instruction must not use the FPR as a source operand of a different format. If this occurs, the value in the register becomes unknown and the result of the instruction is also a value that is unknown. Using an FPR containing an unknown value as a source operand produces a result that has an unknown value.

The format of the value in the FPR is unchanged when it is read by a data transfer instruction (a store). A data transfer instruction produces a binary encoding of the value contained in the FPR. If the value in the FPR is unknown, the encoded binary value produced by the operation is not defined.

The state diagram in Figure 5-16 illustrates the manner in which the formatted value in an FPR is set and changed.

Table 5-10 FENR Register Field Descriptions
Name | Bits | Description | Read/Write | Reset State | Compliance
0 | 31:12, 6:3 | Must be written as zero; returns zero on read. | 0 | 0 | Reserved
Enables | 11:7 | Enable bits. Refer to the description of this field in the FCSR register. | R/W | Undefined | Required
FS | 2 | Flush to Zero bit. Refer to the description of this field in the FCSR register. | R/W | Undefined | Required
RM | 1:0 | Rounding mode. Refer to the description of this field in the FCSR register. | R/W | Undefined | Required
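The format rules for values held in FPRs (load, source use, result, store) can be read as a small state machine. The sketch below is illustrative only: the names Fmt, after_load, after_result, after_source_use, and after_store are hypothetical and not part of the architecture, and the model tracks only the format tag, not the register contents.

```python
from enum import Enum


class Fmt(Enum):
    """Format tag of the value held in one FPR (hypothetical model)."""
    UNINTERPRETED = "uninterpreted"
    UNKNOWN = "unknown"
    S = "single"
    D = "double"
    W = "word"
    L = "long"


NUMERIC = {Fmt.S, Fmt.D, Fmt.W, Fmt.L}


def after_load(_current):
    # A data transfer into the FPR always leaves an uninterpreted binary value.
    return Fmt.UNINTERPRETED


def after_result(fmt):
    # A computational or FP move instruction producing type fmt sets that format.
    if fmt not in NUMERIC:
        raise ValueError("result format must be a numeric format")
    return fmt


def after_source_use(current, wanted):
    # Format of the FPR after a computational instruction reads it as `wanted`.
    if current is Fmt.UNINTERPRETED:
        return wanted          # binary contents are interpreted as `wanted`
    if current is wanted:
        return current         # matching format: nothing changes
    return Fmt.UNKNOWN         # mismatch, or already unknown


def after_store(current):
    # A store reads the register without changing its format.
    return current
```

For example, a register that was loaded and then read as single stays single, and reading it afterwards as double makes its value unknown.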
Figure 5-16 Effect of FPU Operations on the Format of Values Held in FPRs
[State diagram. States: value in format A, value in format B, value uninterpreted (binary encoding), value unknown.]
A, B: Example formats.
Load: Destination of LWC1, LDC1, or MTC1 instructions.
Store: Source operand of SWC1, SDC1, or MFC1 instructions.
Src fmt: Source operand of computational instruction expecting format "fmt."
Rslt fmt: Result of computational instruction producing value of format "fmt."

5.8 FPU Exceptions

This section provides the following information about FPU exceptions:
• Precise exception mode
• Descriptions of the exceptions

FPU exceptions are implemented in the MIPS FPU architecture with the Cause, Enable, and Flag fields of the Control/Status register. The Flag bits implement IEEE exception status flags, and the Cause and Enable bits control exception trapping. Each field has a bit for each of the five IEEE exception conditions, and the Cause field has an additional exception bit, Unimplemented Operation, used to trap for software emulation assistance.

5.8.0.1 Precise Exception Mode

In precise exception mode, a trap occurs before the instruction that causes the trap, or any following instruction, can complete and write its results. If desired, the software trap handler can resume execution of the interrupted instruction stream after handling the exception.

The Cause field reports per-bit instruction exception conditions. The Cause bits are written during each floating point arithmetic operation to show any exception conditions that arise during the operation.
The bit is set to 1 if the corresponding exception condition arises; otherwise it is set to 0. A floating point trap is generated any time both a Cause bit and its corresponding Enable bit are set. This occurs either during the execution of a floating point operation or by moving a value into the FCSR. There is no Enable for Unimplemented Operation; this exception always generates a trap. In a trap handler, exception conditions that arise during any trapped floating point operations are reported in the Cause field. Before returning from a floating point interrupt or exception, or before setting Cause bits with a move to the FCSR, software must first clear the enabled Cause bits by executing a move to FCSR to prevent the trap from being erroneously retaken. User-mode programs cannot observe enabled Cause bits being set. If this information is required in a User-mode handler, it must be available someplace other than through the Status register. If a floating point operation sets only non-enabled Cause bits, no trap occurs and the default result defined by the IEEE standard is stored (see Table 5-11). When a floating point operation does not trap, the program can monitor the exception conditions by reading the Cause field. The Flag field is a cumulative report of IEEE exception conditions that arise as instructions complete; instructions that trap do not update the Flag bits. The Flag bits are set to 1 if the corresponding IEEE exception is raised, otherwise the bits are unchanged. There is no Flag bit for the MIPS Unimplemented Operation exception. The Flag bits are never cleared as a side effect of floating point operations, but may be set or cleared by moving a new value into the FCSR. Addressing exceptions are precise. 
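The interaction of the Cause, Enable, and Flag fields described above can be sketched as follows. This is an illustrative model, not an implementation: the function name apply_arithmetic_result and the letter keys are hypothetical, the bit positions follow Figure 5-12 (Cause 17:12, Enables 11:7, Flags 6:2, with I in the lowest bit of each group), and the model ignores traps caused by moving a value into the FCSR.

```python
# Position of each IEEE condition inside a 5-bit group (V Z O U I, I lowest).
IEEE_BIT = {"I": 0, "U": 1, "O": 2, "Z": 3, "V": 4}
CAUSE_SHIFT, ENABLE_SHIFT, FLAG_SHIFT = 12, 7, 2
CAUSE_E = 1 << 17  # Unimplemented Operation: Cause bit with no Enable or Flag


def apply_arithmetic_result(fcsr, raised, unimplemented=False):
    """Model one FPU arithmetic operation; return (new_fcsr, trapped)."""
    mask = 0
    for name in raised:
        mask |= 1 << IEEE_BIT[name]

    # Cause bits are rewritten by every arithmetic operation.
    fcsr &= ~(0x3F << CAUSE_SHIFT)
    fcsr |= mask << CAUSE_SHIFT
    if unimplemented:
        fcsr |= CAUSE_E

    enables = (fcsr >> ENABLE_SHIFT) & 0x1F
    trapped = unimplemented or (mask & enables) != 0

    # Flag bits accumulate only when the operation completes without a trap.
    if not trapped:
        fcsr |= mask << FLAG_SHIFT
    return fcsr, trapped
```

With all Enable bits clear, an operation that raises Overflow and Inexact sets both the Cause and the Flag bits for O and I; a later exception-free operation clears the Cause bits but leaves the Flag bits set.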
5.8.1 Exception Conditions

The following five exception conditions defined by the IEEE standard are described in this section:
• "Invalid Operation Exception"
• "Division By Zero Exception"
• "Underflow Exception"
• "Overflow Exception"
• "Inexact Exception"

This section also describes a MIPS-specific exception condition, Unimplemented Operation, that is used to signal a need for software emulation of an instruction. Normally an IEEE arithmetic operation can cause only one exception condition; the only cases in which two exceptions can occur at the same time are Inexact With Overflow and Inexact With Underflow.

At the program's direction, an IEEE exception condition can either cause a trap or not cause a trap. The IEEE standard specifies the result to be delivered in case the exception is not enabled and no trap is taken. The MIPS architecture supplies these results whenever the exception condition does not result in a precise trap (that is, no trap or an imprecise

There is no Enable bit for this condition; it always causes a trap. After the appropriate emulation or other operation is done in a software exception handler, the original instruction stream can be continued.

5.9 FPU Instructions

The FPU instructions comprise the following functional groups:
• "Data Transfer Instructions"
• "Arithmetic Instructions"
• "Conversion Instructions"
• "Formatted Operand-Value Move Instructions"
• "Conditional Branch Instructions"
• "Miscellaneous Instructions"

5.9.1 Data Transfer Instructions

The FPU has two separate register sets: coprocessor general registers and coprocessor control registers. The FPU has a load/store architecture; all computations are done on data held in coprocessor general registers. The control registers are used to control FPU operation.
Data is transferred between registers and the rest of the system with dedicated load, store, and move instructions. The transferred data is treated as unformatted binary data; no format conversions are performed, and therefore no IEEE floating point exceptions can occur. The supported transfer operations are listed in Table 5-12.

Table 5-12 FPU Data Transfer Instructions
Transfer Direction | Data Transferred
FPU general reg ↔ Memory | Word/doubleword load/store
FPU general reg ↔ CPU general reg | Word move
FPU control reg ↔ CPU general reg | Word move

5.9.1.1 Data Alignment in Loads, Stores, and Moves

All coprocessor loads and stores operate on naturally-aligned data items. An attempt to load or store to an address that is not naturally aligned for the data item causes an Address Error exception. Regardless of byte-ordering (the endianness), the address of a word or doubleword is the smallest byte address in the object. For a big-endian machine, this is the most-significant byte; for a little-endian machine, this is the least-significant byte (endianness is described in "Byte Ordering and Endianness" on page 21).

5.9.1.2 Addressing Used in Data Transfer Instructions

The FPU has loads and stores using the same register+offset addressing as that used by the CPU. Moreover, for the FPU only, there are load and store instructions using register+register addressing.

Tables 5-13 through 5-15 list the FPU data transfer instructions.

5.9.2 Arithmetic Instructions

Arithmetic instructions operate on formatted data values. The results of most floating point arithmetic operations meet the IEEE standard specification for accuracy: a result is identical to an infinite-precision result that has been rounded to the specified format, using the current rounding mode.
The rounded result differs from the exact result by less than one unit in the least-significant place (ULP).

Table 5-13 FPU Loads and Stores Using Register+Offset Address Mode
Mnemonic | Instruction | Defined in MIPS ISA
LDC1 | Load Doubleword to Floating Point | MIPS32
LWC1 | Load Word to Floating Point | MIPS32
SDC1 | Store Doubleword from Floating Point | MIPS32
SWC1 | Store Word from Floating Point | MIPS32

Table 5-14 FPU Loads and Stores Using Register+Register Address Mode
Mnemonic | Instruction | Defined in MIPS ISA
LDXC1 | Load Doubleword Indexed to Floating Point | MIPS64, MIPS32 Release 2
LUXC1 | Load Doubleword Indexed Unaligned to Floating Point | MIPS64, MIPS32 Release 2
LWXC1 | Load Word Indexed to Floating Point | MIPS64, MIPS32 Release 2
SDXC1 | Store Doubleword Indexed from Floating Point | MIPS64, MIPS32 Release 2
SUXC1 | Store Doubleword Indexed Unaligned from Floating Point | MIPS64, MIPS32 Release 2
SWXC1 | Store Word Indexed from Floating Point | MIPS64, MIPS32 Release 2

Table 5-15 FPU Move To and From Instructions
Mnemonic | Instruction | Defined in MIPS ISA
CFC1 | Move Control Word From Floating Point | MIPS32
CTC1 | Move Control Word To Floating Point | MIPS32
MFC1 | Move Word From Floating Point | MIPS32
MFHC1 | Move Word From High Half of Floating Point Register | MIPS32 Release 2
MTC1 | Move Word To Floating Point | MIPS32
MTHC1 | Move Word To High Half of Floating Point Register | MIPS32 Release 2

FPU IEEE arithmetic operations are listed in Table 5-16.

Two operations, Reciprocal Approximation (RECIP) and Reciprocal Square Root Approximation (RSQRT), may be less accurate than the IEEE specification:
• The result of RECIP differs from the exact reciprocal by no more than one ULP.
• The result of RSQRT differs from the exact reciprocal square root by no more than two ULPs.
Within these error limits, the results of these instructions are implementation specific. A list of FPU-approximate arithmetic operations is given in Table 5-17.

Four compound-operation instructions perform variations of multiply-accumulate: multiply two operands, accumulate the result to a third operand, and produce a result. These instructions are listed in Table 5-18. The product is rounded according to the current rounding mode prior to the accumulation. This model meets the IEEE accuracy specification; the result is numerically identical to an equivalent computation using multiply, add, subtract, or negate instructions.

Table 5-16 FPU IEEE Arithmetic Operations
Mnemonic | Instruction | Defined in MIPS ISA
ABS.fmt | Floating Point Absolute Value | MIPS32
ABS.fmt (PS) | Floating Point Absolute Value (Paired Single) | MIPS64, MIPS32 Release 2
ADD.fmt | Floating Point Add | MIPS32
ADD.fmt (PS) | Floating Point Add (Paired Single) | MIPS64, MIPS32 Release 2
C.cond.fmt | Floating Point Compare | MIPS32
C.cond.fmt (PS) | Floating Point Compare (Paired Single) | MIPS64, MIPS32 Release 2
DIV.fmt | Floating Point Divide | MIPS32
MUL.fmt | Floating Point Multiply | MIPS32
MUL.fmt (PS) | Floating Point Multiply (Paired Single) | MIPS64, MIPS32 Release 2
NEG.fmt | Floating Point Negate | MIPS32
NEG.fmt (PS) | Floating Point Negate (Paired Single) | MIPS64, MIPS32 Release 2
SQRT.fmt | Floating Point Square Root | MIPS32
SUB.fmt | Floating Point Subtract | MIPS32
SUB.fmt (PS) | Floating Point Subtract (Paired Single) | MIPS64, MIPS32 Release 2

Table 5-17 FPU-Approximate Arithmetic Operations
Mnemonic | Instruction | Defined in MIPS ISA
RECIP.fmt | Floating Point Reciprocal Approximation | MIPS64, MIPS32 Release 2
RSQRT.fmt | Floating Point Reciprocal Square Root Approximation | MIPS64, MIPS32 Release 2
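The statement that the product is rounded before the accumulation can be made concrete with a small numeric sketch. The helper names round_to_single and madd_unfused are hypothetical, the sketch assumes single-precision operands and the round-to-nearest-even mode, and it emulates single-precision rounding on the host by packing a Python float into four bytes.

```python
import struct


def round_to_single(x):
    # Round a Python float (a double) to the nearest single-precision value.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def madd_unfused(fr, fs, ft):
    # Multiply-add in the style described above: fs * ft is rounded first,
    # then the accumulation with fr is rounded again.
    product = round_to_single(fs * ft)
    return round_to_single(product + fr)


a = 1.0 + 2.0 ** -12          # exactly representable in single precision
c = -(1.0 + 2.0 ** -11)       # exactly representable in single precision

two_step = round_to_single(round_to_single(a * a) + c)   # MUL then ADD
compound = madd_unfused(c, a, a)
single_rounding = a * a + c   # what one rounding at the end would preserve
```

Here two_step and compound are both 0.0, matching the equivalence with separate multiply and add instructions, while single_rounding keeps the 2^-24 term that the intermediate rounding discards.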
5.9.5 Conditional Branch Instructions

The FPU has PC-relative conditional branch instructions that test condition codes set by FPU compare instructions (C.cond.fmt).

All branches have an architectural delay of one instruction. When a branch is taken, the instruction immediately following the branch instruction is said to be in the branch delay slot, and it is executed before the branch to the target instruction takes place. Conditional branches come in two versions, depending upon how they handle an instruction in the delay slot when the branch is not taken and execution falls through:
• Branch instructions execute the instruction in the delay slot.
• Branch likely instructions do not execute the instruction in the delay slot if the branch is not taken (they are said to nullify the instruction in the delay slot).

Although the Branch Likely instructions are included in this specification, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

The MIPS32 Architecture defines eight condition codes for use in compare and branch instructions. For backward compatibility with previous revisions of the ISA, condition code bit 0 and condition code bits 1 through 7 are in discontiguous fields in FCSR.

Table 5-24 lists the conditional branch (branch and branch likely) FPU instructions; Table 5-25 lists the deprecated conditional branch likely instructions.
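Because the eight condition codes occupy two discontiguous FCSR fields (bit 23 for condition code 0, bits 31:25 for condition codes 7 through 1), software that inspects a saved FCSR value has to map a condition code number to its bit position. A minimal sketch, with hypothetical helper names fcc_bit_position, read_fcc, and all_fcc:

```python
def fcc_bit_position(cc):
    # Condition code 0 sits at FCSR bit 23; codes 1..7 sit at bits 25..31.
    if not 0 <= cc <= 7:
        raise ValueError("condition code must be in 0..7")
    return 23 if cc == 0 else 24 + cc


def read_fcc(fcsr, cc):
    # Return the value (0 or 1) of one condition code in an FCSR image.
    return (fcsr >> fcc_bit_position(cc)) & 1


def all_fcc(fcsr):
    # Gather the eight condition codes into one byte, code n in bit n.
    return ((fcsr >> 23) & 1) | (((fcsr >> 25) & 0x7F) << 1)
```

Bit 24, which lies between the two fields, is the FS bit and is not a condition code.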
Table 5-23 FPU Conditional Move on Zero/Nonzero Instructions
Mnemonic | Instruction | Defined in MIPS ISA
MOVN.fmt | Floating Point Move Conditional on Nonzero | MIPS32
MOVN.fmt (PS) | Floating Point Move Conditional on Nonzero (Paired Single) | MIPS64, MIPS32 Release 2
MOVZ.fmt | Floating Point Move Conditional on Zero | MIPS32
MOVZ.fmt (PS) | Floating Point Move Conditional on Zero (Paired Single) | MIPS64, MIPS32 Release 2

Table 5-24 FPU Conditional Branch Instructions
Mnemonic | Instruction | Defined in MIPS ISA
BC1F | Branch on FP False | MIPS32
BC1T | Branch on FP True | MIPS32

Table 5-25 Deprecated FPU Conditional Branch Likely Instructions
Mnemonic | Instruction | Defined in MIPS ISA
BC1FL | Branch on FP False Likely | MIPS32
BC1TL | Branch on FP True Likely | MIPS32

5.9.6 Miscellaneous Instructions

The MIPS ISA defines various miscellaneous instructions that conditionally move one CPU general register to another, based on an FPU condition code. It also defines an instruction to align a misaligned pair of paired-single values (ALNV.PS) and a quartet of instructions that merge a pair of paired-single values (PLL.PS, PLU.PS, PUL.PS, PUU.PS). Table 5-26 lists these conditional move instructions.

5.10 Valid Operands for FPU Instructions

The floating point unit arithmetic, conversion, and operand move instructions operate on formatted values with different precision and range limits and produce formatted values for results. Each representable value in each format has a binary encoding that is read from or stored to memory. The fmt or fmt3 field of the instruction encodes the operand format required for the instruction. A conversion instruction specifies the result type in the function field; the result of other operations is given in the same format as the operands. The encodings of the fmt and fmt3 fields are shown in Table 5-27.
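The fmt and fmt3 encodings of Table 5-27 lend themselves to a simple lookup. The sketch below is illustrative; the names FMT_TABLE, decode_fmt, and fmt_from_fmt3 are hypothetical, and the relation fmt = fmt3 + 16 is read off the table rows rather than stated by the architecture text.

```python
# fmt value -> (mnemonic, name, size in bits, data type), per Table 5-27.
FMT_TABLE = {
    16: ("S", "single", 32, "Floating point"),
    17: ("D", "double", 64, "Floating point"),
    20: ("W", "word", 32, "Fixed point"),
    21: ("L", "long", 64, "Fixed point"),
    22: ("PS", "paired single", 64, "Floating point"),
}


def decode_fmt(fmt):
    # Return the table entry for a 5-bit fmt value, or None if reserved.
    if not 0 <= fmt <= 31:
        raise ValueError("fmt is a 5-bit field")
    return FMT_TABLE.get(fmt)


def fmt_from_fmt3(fmt3):
    # Map the 3-bit fmt3 field onto the matching 5-bit fmt value.
    if not 0 <= fmt3 <= 7:
        raise ValueError("fmt3 is a 3-bit field")
    return fmt3 + 16
```

A reserved encoding such as fmt 18 returns None, which a decoder would typically treat as a Reserved Instruction case.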
Table 5-26 CPU Conditional Move on FPU True/False Instructions
Mnemonic | Instruction | Defined in MIPS ISA
ALNV.PS | FP Align Variable | MIPS64, MIPS32 Release 2
MOVF | Move Conditional on FP False | MIPS32
MOVT | Move Conditional on FP True | MIPS32
PLL.PS | Pair Lower Lower | MIPS64, MIPS32 Release 2
PLU.PS | Pair Lower Upper | MIPS64, MIPS32 Release 2
PUL.PS | Pair Upper Lower | MIPS64, MIPS32 Release 2
PUU.PS | Pair Upper Upper | MIPS64, MIPS32 Release 2

Table 5-27 FPU Operand Format Field (fmt, fmt3) Encoding
fmt | fmt3 | Instruction Mnemonic | Size Name | Size Bits | Data Type
0-15 | - | Reserved | | |
16 | 0 | S | single | 32 | Floating point
17 | 1 | D | double | 64 | Floating point
18-19 | 2-3 | Reserved | | |
20 | 4 | W | word | 32 | Fixed point
21 | 5 | L | long | 64 | Fixed point
22 | 6 | PS | paired single | 64 | Floating point
23-31 | 7 | Reserved | | |

The result of an instruction using operand formats marked U in Table 5-28 is not currently specified by this architecture and causes a Reserved Instruction exception.

Table 5-28 Valid Formats for FPU Operations
Mnemonic | Operation | S | D | PS | W | L | COP1 Function Value | COP1X op4 Value
ABS | Absolute value | • | • | • | U | U | 5 |
ADD | Add | • | • | • | U | U | 0 |
C.cond | Floating Point compare | • | • | • | U | U | 48-63 |
CEIL.L, (CEIL.W) | Convert to longword (word) fixed point, round toward +∞ | • | • | U | U | U | 10 (14) |
CVT.D | Convert to double floating point | • | U | U | • | • | 33 |
CVT.L | Convert to longword fixed point | • | • | U | U | U | 37 |
CVT.S | Convert to single floating point | U | • | U | • | • | 32 |
CVT.S.PU, CVT.S.PL | Convert to single floating point (paired upper, paired lower) | U | U | • | U | U | 32, 40 |
CVT.W | Convert to 32-bit fixed point | • | • | U | U | U | 36 |
DIV | Divide | • | • | U | U | U | 3 |
FLOOR.L, (FLOOR.W) | Convert to longword (word) fixed point, round toward -∞ | • | • | U | U | U | 11 (15) |
MADD | Multiply-Add | • | • | • | U | U | | 4
MOV | Move Register | • | • | • | U | U | 6 |
MOVC | FP Move conditional on condition | • | • | • | U | U | 17 |
MOVN | FP Move conditional on GPR≠zero | • | • | • | U | U | 19 |
MOVZ | FP Move conditional on GPR=zero | • | • | • | U | U | 18 |
MSUB | Multiply-Subtract | • | • | • | U | U | | 5
MUL | Multiply | • | • | • | U | U | 2 |
NEG | Negate | • | • | • | U | U | 7 |
NMADD | Negative Multiply-Add | • | • | • | U | U | | 6
NMSUB | Negative Multiply-Subtract | • | • | • | U | U | | 7
PLL, PLU, PUL, PUU | Pair (Lower Lower, Lower Upper, Upper Lower, Upper Upper) | U | U | • | U | U | 44-47 |
RECIP | Reciprocal Approximation | • | • | U | U | U | 21 |
ROUND.L, (ROUND.W) | Convert to longword (word) fixed point, round to nearest/even | • | • | U | U | U | 8 (12) |

Figure 5-23 Four-Register Formatted Arithmetic FPU Instruction Format
Register-4: Four-register formatted arithmetic operations
Bits | 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:3 | 2:0
Field | COP1X | fr | ft | fs | fd | op4 | fmt3
Width | 6 | 5 | 5 | 5 | 5 | 3 | 3

Figure 5-24 Register Index FPU Instruction Format
Register Index: Load and Store using register + register addressing
Bits | 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0
Field | COP1X | base | index | 0 | fd | function
Width | 6 | 5 | 5 | 5 | 5 | 6

Figure 5-25 Register Index Hint FPU Instruction Format
Register Index Hint: Prefetch using register + register addressing
Bits | 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0
Field | COP1X | base | index | hint | 0 | PREFX
Width | 6 | 5 | 5 | 5 | 5 | 6

Figure 5-26 Condition Code, Register Integer FPU Instruction Format
Condition Code, Register Integer: CPU register move-conditional on FP, cc
Bits | 31:26 | 25:21 | 20:18 | 17 | 16 | 15:11 | 10:6 | 5:0
Field | SPECIAL | rs | cc | 0 | tf | rd | 0 | MOVCI
Width | 6 | 5 | 3 | 1 | 1 | 5 | 5 | 6

Table 5-29 FPU Instruction Format Fields
Field | Description
BC1 | Branch Conditional instruction subcode (op=COP1).
5.11 FPU Instruction Formats: Table 5-29 FPU Instruction Format Fields (continued)
Field | Description
base | CPU register: base address for address calculations.
COP1 | Coprocessor 1 primary opcode value in op field.
COP1X | Coprocessor 1 eXtended primary opcode value in op field.
cc | Condition Code specifier; for architectural levels prior to MIPS IV, this must be set to zero.
fd | FPU register: destination (arithmetic, loads, move-to) or source (stores, move-from).
fmt | Destination and/or operand type (format) specifier.
fr | FPU register: source.
fs | FPU register: source.
ft | FPU register: source (for stores, arithmetic) or destination (for loads).
function | Field specifying a function within a particular op operation code.
function: op4 + fmt3 | op4 is a 3-bit function field specifying a 4-register arithmetic operation for COP1X. fmt3 is a 3-bit field specifying the format of the operands and destination. The combinations are shown as distinct instructions in the opcode tables.
hint | Hint field made available to cache controller for prefetch operation.
index | CPU register that holds the index address component for address calculations.
MOVC | Value in function field for a conditional move. There is one value for the instruction when op=COP1, another value for the instruction when op=SPECIAL.
nd | Nullify delay. If set, the branch is Likely, and the delay slot instruction is not executed.
offset | Signed offset field used in address calculations.
op | Primary operation code (see COP1, COP1X, LWC1, SWC1, LDC1, SDC1, SPECIAL).
PREFX | Value in function field for prefetch instruction when op=COP1X.
rd | CPU register: destination.
rs | CPU register: source.
rt | CPU register: can be either source or destination.
SPECIAL | SPECIAL primary opcode value in op field.
sub | Operation subcode field for COP1 register immediate-mode instructions.
tf | True/False. The condition from an FP compare that is tested for equality with the tf bit.
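The Register-4 layout (op in bits 31:26, then fr, ft, fs, fd, op4, and fmt3) can be exercised with a short encode/decode sketch. The function names are hypothetical; the COP1X opcode value 0b010011 is read from row 2, column 3 of the opcode table in Appendix A, and the op4 names follow the COP1X op4 values 4 through 7 listed for MADD, MSUB, NMADD, and NMSUB.

```python
COP1X = 0b010011  # primary opcode: bits 31..29 = 010, bits 28..26 = 011
OP4_NAMES = {4: "MADD", 5: "MSUB", 6: "NMADD", 7: "NMSUB"}
FMT3_NAMES = {0: "S", 1: "D", 6: "PS"}


def encode_register4(fr, ft, fs, fd, op4, fmt3):
    # Pack the seven Register-4 fields into a 32-bit instruction word.
    for reg in (fr, ft, fs, fd):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers are 5-bit fields")
    if not (0 <= op4 <= 7 and 0 <= fmt3 <= 7):
        raise ValueError("op4 and fmt3 are 3-bit fields")
    return ((COP1X << 26) | (fr << 21) | (ft << 16) | (fs << 11)
            | (fd << 6) | (op4 << 3) | fmt3)


def decode_register4(word):
    # Return the fields of a Register-4 instruction, or None if not one.
    if (word >> 26) & 0x3F != COP1X:
        return None
    op4 = (word >> 3) & 0x7
    fmt3 = word & 0x7
    if op4 not in OP4_NAMES or fmt3 not in FMT3_NAMES:
        return None
    return {
        "mnemonic": OP4_NAMES[op4] + "." + FMT3_NAMES[fmt3],
        "fr": (word >> 21) & 0x1F,
        "ft": (word >> 16) & 0x1F,
        "fs": (word >> 11) & 0x1F,
        "fd": (word >> 6) & 0x1F,
    }
```

Encoding a multiply-add in double format with encode_register4(4, 8, 6, 2, 4, 1) and decoding the result yields the mnemonic MADD.D with the same four register numbers.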
Appendix A Instruction Bit Encodings

A.2 Instruction Bit Encoding Tables

Table A-1 Symbols Used in the Instruction Encoding Tables
Symbol | Meaning
θ | Operation or field codes marked with this symbol are available to licensed MIPS partners. To avoid multiple conflicting instruction definitions, MIPS Technologies will assist the partner in selecting appropriate encodings if requested by the partner. The partner is not required to consult with MIPS Technologies when one of these encodings is used. If no instruction is encoded with this value, executing such an instruction must cause a Reserved Instruction Exception (SPECIAL2 encodings or coprocessor instruction encodings for a coprocessor to which access is allowed) or a Coprocessor Unusable Exception (coprocessor instruction encodings for a coprocessor to which access is not allowed).
σ | Field codes marked with this symbol represent an EJTAG support instruction and implementation of this encoding is optional for each implementation. If the encoding is not implemented, executing such an instruction must cause a Reserved Instruction Exception. If the encoding is implemented, it must match the instruction encoding as shown in the table.
ε | Operation or field codes marked with this symbol are reserved for MIPS Application Specific Extensions. If the ASE is not implemented, executing such an instruction must cause a Reserved Instruction Exception.
φ | Operation or field codes marked with this symbol are obsolete and will be removed from a future revision of the MIPS32 ISA. Software should avoid using these operation or field codes.
⊕ | Operation or field codes marked with this symbol are valid for Release 2 implementations of the architecture. Executing such an instruction in a Release 1 implementation must cause a Reserved Instruction Exception.

Table A-2 MIPS32 Encoding of the Opcode Field
(rows: opcode bits 31..29; columns: opcode bits 28..26)
bits 31..29 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (000) | SPECIAL δ | REGIMM δ | J | JAL | BEQ | BNE | BLEZ | BGTZ
1 (001) | ADDI | ADDIU | SLTI | SLTIU | ANDI | ORI | XORI | LUI
2 (010) | COP0 δ | COP1 δ | COP2 θδ | COP1X δ (note 1) | BEQL φ | BNEL φ | BLEZL φ | BGTZL φ
3 (011) | β | β | β | β | SPECIAL2 δ | JALX ε | ε | SPECIAL3 δ⊕ (note 2)
4 (100) | LB | LH | LWL | LW | LBU | LHU | LWR | β
5 (101) | SB | SH | SWL | SW | β | β | SWR | CACHE
6 (110) | LL | LWC1 | LWC2 θ | PREF | β | LDC1 | LDC2 θ | β
7 (111) | SC | SWC1 | SWC2 θ | * | β | SDC1 | SDC2 θ | β
1. In Release 1 of the Architecture, the COP1X opcode was called COP3, and was available as another user-available coprocessor. In Release 2 of the Architecture, a full 64-bit floating point unit is available with 32-bit CPUs, and the COP1X opcode is reserved for that purpose on all Release 2 CPUs. 32-bit implementations of Release 1 of the architecture are strongly discouraged from using this opcode for a user-available coprocessor as doing so will limit the potential for an upgrade path to a 64-bit floating point unit.
2. Release 2 of the Architecture added the SPECIAL3 opcode. Implementations of Release 1 of the Architecture signaled a Reserved Instruction Exception for this opcode.

Table A-3 MIPS32 SPECIAL Opcode Encoding of Function Field
(rows: function bits 5..3; columns: function bits 2..0)
bits 5..3 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (000) | SLL (note 1) | MOVCI δ | SRL δ | SRA | SLLV | * | SRLV δ | SRAV
1 (001) | JR (note 2) | JALR (note 2) | MOVZ | MOVN | SYSCALL | BREAK | * | SYNC
2 (010) | MFHI | MTHI | MFLO | MTLO | β | * | β | β
3 (011) | MULT | MULTU | DIV | DIVU | β | β | β | β
4 (100) | ADD | ADDU | SUB | SUBU | AND | OR | XOR | NOR
5 (101) | * | * | SLT | SLTU | β | β | β | β
6 (110) | TGE | TGEU | TLT | TLTU | TEQ | * | TNE | *
7 (111) | β | * | β | β | β | * | β | β
1. Specific encodings of the rt, rd, and sa fields are used to distinguish among the SLL, NOP, SSNOP and EHB functions.
2. Specific encodings of the hint field are used to distinguish JR from JR.HB and JALR from JALR.HB.

Table A-4 MIPS32 REGIMM Encoding of rt Field
(rows: rt bits 20..19; columns: rt bits 18..16)
bits 20..19 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (00) | BLTZ | BGEZ | BLTZL φ | BGEZL φ | * | * | * | *
1 (01) | TGEI | TGEIU | TLTI | TLTIU | TEQI | * | TNEI | *
2 (10) | BLTZAL | BGEZAL | BLTZALL φ | BGEZALL φ | * | * | * | *
3 (11) | * | * | * | * | * | * | * | SYNCI ⊕

Table A-5 MIPS32 SPECIAL2 Encoding of Function Field
(rows: function bits 5..3; columns: function bits 2..0)
bits 5..3 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (000) | MADD | MADDU | MUL | θ | MSUB | MSUBU | θ | θ
1 (001) | θ | θ | θ | θ | θ | θ | θ | θ
2 (010) | θ | θ | θ | θ | θ | θ | θ | θ
3 (011) | θ | θ | θ | θ | θ | θ | θ | θ
4 (100) | CLZ | CLO | θ | θ | β | β | θ | θ
5 (101) | θ | θ | θ | θ | θ | θ | θ | θ
6 (110) | θ | θ | θ | θ | θ | θ | θ | θ
7 (111) | θ | θ | θ | θ | θ | θ | θ | SDBBP σ

Table A-6 MIPS32 SPECIAL3 (note 1) Encoding of Function Field for Release 2 of the Architecture
(rows: function bits 5..3; columns: function bits 2..0)
bits 5..3 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (000) | EXT ⊕ | β | β | β | INS ⊕ | β | β | β
1 (001) | * | * | * | * | * | * | * | *
2 (010) | * | * | * | * | * | * | * | *
3 (011) | * | * | * | * | * | * | * | *
4 (100) | BSHFL ⊕δ | * | * | * | β | * | * | *
5 (101) | * | * | * | * | * | * | * | *
6 (110) | * | * | * | * | * | * | * | *
7 (111) | * | * | * | RDHWR ⊕ | * | * | * | *
1. Release 2 of the Architecture added the SPECIAL3 opcode. Implementations of Release 1 of the Architecture signaled a Reserved Instruction Exception for this opcode and all function field values shown above.

Table A-7 MIPS32 MOVCI Encoding of tf Bit
tf (bit 16) | 0 | 1
Instruction | MOVF | MOVT

Table A-8 MIPS32 (note 1) SRL Encoding of Shift/Rotate
R (bit 21) | 0 | 1
Instruction | SRL | ROTR
1. Release 2 of the Architecture added the ROTR instruction. Implementations of Release 1 of the Architecture ignored bit 21 and treated the instruction as an SRL.

Table A-9 MIPS32 (note 1) SRLV Encoding of Shift/Rotate
R (bit 6) | 0 | 1
Instruction | SRLV | ROTRV
1. Release 2 of the Architecture added the ROTRV instruction. Implementations of Release 1 of the Architecture ignored bit 6 and treated the instruction as an SRLV.

Table A-10 MIPS32 BSHFL Encoding of sa Field (note 1)
(rows: sa bits 10..9; columns: sa bits 8..6)
bits 10..9 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (00) | | | WSBH | | | | |
1 (01) | | | | | | | |
2 (10) | SEB | | | | | | |
3 (11) | SEH | | | | | | |
1. The sa field is sparsely decoded to identify the final instructions. Entries in this table with no mnemonic are reserved for future use by MIPS Technologies and may or may not cause a Reserved Instruction exception.

Table A-11 MIPS32 COP0 Encoding of rs Field
(rows: rs bits 25..24; columns: rs bits 23..21)
bits 25..24 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (00) | MFC0 | β | * | * | MTC0 | β | * | *
1 (01) | * | * | RDPGPR ⊕ | MFMC0 δ⊕ | * | * | WRPGPR ⊕ | *
2 (10), 3 (11) | C0 δ (all columns)

Table A-19 MIPS32 COP2 Encoding of rs Field
(rows: rs bits 25..24; columns: rs bits 23..21)
bits 25..24 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (00) | MFC2 θ | β | CFC2 θ | MFHC2 θ⊕ | MTC2 θ | β | CTC2 θ | MTHC2 θ⊕
1 (01) | BC2 θ | * | * | * | * | * | * | *
2 (10), 3 (11) | C2 θδ (all columns)

Table A-20 MIPS64 COP1X Encoding of Function Field (note 1)
(rows: function bits 5..3; columns: function bits 2..0)
bits 5..3 | 0 (000) | 1 (001) | 2 (010) | 3 (011) | 4 (100) | 5 (101) | 6 (110) | 7 (111)
0 (000) | LWXC1 ∇ | LDXC1 ∇ | * | * | * | LUXC1 ∇ | * | *
1 (001) | SWXC1 ∇ | SDXC1 ∇ | * | * | * | SUXC1 ∇ | * | PREFX ∇
2 (010) | * | * | * | * | * | * | * | *
3 (011) | * | * | * | * | * | * | ALNV.PS ∇ | *
4 (100) | MADD.S ∇ | MADD.D ∇ | * | * | * | * | MADD.PS ∇ | *
5 (101) | MSUB.S ∇ | MSUB.D ∇ | * | * | * | * | MSUB.PS ∇ | *
6 (110) | NMADD.S ∇ | NMADD.D ∇ | * | * | * | * | NMADD.PS ∇ | *
7 (111) | NMSUB.S ∇ | NMSUB.D ∇ | * | * | * | * | NMSUB.PS ∇ | *
1. COP1X instructions are legal only if 64-bit floating point operations are enabled.

A.3 Floating Point Unit Instruction Format Encodings

Instruction format encodings for the floating point unit are presented in this section. This information is a tabular presentation of the encodings described in Table A-13 and Table A-20 above.

Table A-21 Floating Point Unit Instruction Format Encodings
(fmt field: bits 25..21 of COP1 opcode; fmt3 field: bits 2..0 of COP1X opcode)
fmt Decimal | fmt Hex | fmt3 Decimal | fmt3 Hex | Mnemonic | Name | Bit Width | Data Type
0..15 | 00..0F | - | - | Used to encode Coprocessor 1 interface instructions (MFC1, CTC1, etc.). Not used for format encoding.
16 | 10 | 0 | 0 | S | Single | 32 | Floating Point
17 | 11 | 1 | 1 | D | Double | 64 | Floating Point
18..19 | 12..13 | 2..3 | 2..3 | Reserved for future use by the architecture.
20 | 14 | 4 | 4 | W | Word | 32 | Fixed Point
21 | 15 | 5 | 5 | L | Long | 64 | Fixed Point
22 | 16 | 6 | 6 | PS | Paired Single | 2 × 32 | Floating Point
23 | 17 | 7 | 7 | Reserved for future use by the architecture.
24..31 | 18..1F | - | - | Reserved for future use by the architecture. Not available for fmt3 encoding.
