- MIPS architecture
MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages) is a RISC microprocessor architecture developed by MIPS Technologies. By the late 1990s it was estimated that one in three RISC chips produced were MIPS-based designs.
MIPS designs are currently used primarily in embedded systems such as the Series2 TiVo, Windows CE devices, Cisco routers, Foneras, Avaya, and video game consoles like the Nintendo 64 and Sony PlayStation, PlayStation 2, and PlayStation Portable handheld system. Until late 2006 they were also used in many of SGI's computer products.
The early MIPS architectures were 32-bit implementations (generally 32-bit wide registers and data paths), while later versions were 64-bit implementations. Multiple revisions of the MIPS instruction set exist, including MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPS32, and MIPS64. The current revisions are MIPS32 (for 32-bit implementations) and MIPS64 (for 64-bit implementations). MIPS32 and MIPS64 define a control register set as well as the instruction set. Several "add-on" extensions are also available, including MIPS-3D, a simple set of floating-point SIMD instructions dedicated to common 3D tasks; MDMX (MaDMaX), a more extensive integer SIMD instruction set using the 64-bit floating-point registers; MIPS16e, which adds compression to the instruction stream to make programs take up less room (allegedly a response to the Thumb encoding in the ARM architecture); and the recent addition of MIPS MT, new multithreading additions to the system similar to HyperThreading in Intel's Pentium 4 processors. Computer architecture courses in universities and technical schools often study the MIPS architecture. The design of the MIPS CPU family greatly influenced later RISC architectures such as DEC Alpha.
In 1981, a team led by John L. Hennessy at Stanford University started work on what would become the first MIPS processor. The basic concept was to dramatically increase performance through the use of deep instruction pipelines, a technique that was well known but difficult to implement. CPUs are built up from a number of dedicated subunits known as modules or units. Typical modules include the load/store unit, which handles external memory; the ALU, which handles basic integer math and logic; and the FPU, which handles floating point math. In a traditional design, each instruction flows from unit to unit until it is complete, at which point the next instruction is read in and the cycle continues. In a pipelined architecture, successive instructions in a program sequence overlap in execution. Instead of waiting for the instruction to complete, each unit inside the CPU will fetch and start executing an instruction before the preceding instruction is complete. For instance, as soon as a math instruction is fed into the floating point module, the load/store unit can start loading up the data needed by the next instruction.
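The benefit of overlapping instructions can be sketched with an idealised cycle count. The following is a minimal model, not a description of any real MIPS part: it assumes every stage takes exactly one cycle and that nothing ever stalls, and the five-stage, 100-instruction figures are illustrative assumptions that do not come from the text above.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; each later one completes one cycle after."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


if __name__ == "__main__":
    n, stages = 100, 5  # assumed values, for illustration only
    print(cycles_unpipelined(n, stages))  # 500
    print(cycles_pipelined(n, stages))    # 104
```

Under these assumptions the pipelined machine approaches one completed instruction per cycle as the instruction count grows.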
One major barrier to pipelining was that not all instructions can be handed off in this fashion. Some instructions, like a floating point division, take longer to complete, and the CPU has to wait before passing the next instruction into the system. The normal solution to this problem was to use a series of interlocks that allowed the modules to indicate they were still busy, pausing the other modules upstream. Hennessy's team viewed these interlocks as a major performance barrier: since they had to communicate with all the modules in the CPU, communication time was an issue, and this appeared to limit increases in clock speed. A major aspect of the MIPS design was to fit every sub-phase (including memory access) of all instructions into one cycle, thereby removing any need for interlocking and permitting single-cycle throughput.
Although this design eliminated a number of useful instructions, notably things like multiply and divide, which would take multiple execution steps, it was felt that the overall performance of the system would be dramatically improved because the chips could run at much higher clock rates. This ramping of the speed would be difficult with interlocking involved, as the time needed to set up locks is as much a function of die size as clock rate: adding the needed hardware might actually slow down the overall speed. The elimination of these instructions became a contentious point. Many observers claimed the design (and RISC in general) would never live up to its hype. If one simply replaces the complex multiply instruction with many simpler additions, where is the speed increase? This overly simple analysis ignored the fact that the speed of the design was in the pipelines, not the instructions.
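To make the "multiply replaced by simpler instructions" argument concrete, here is a sketch of the classic shift-and-add technique, which builds a product from bit tests, shifts, and additions only. It is an illustration of the general idea, not the routine the Stanford team or any MIPS compiler actually used; the function name and the 32-bit default width are assumptions.

```python
def multiply_shift_add(a: int, b: int, bits: int = 32) -> int:
    """Multiply two unsigned integers using only shifts, adds, and bit tests.

    The result is truncated to `bits` bits, as a fixed-width register would be.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    product = 0
    while b:
        if b & 1:                      # lowest bit of the multiplier is set
            product = (product + a) & mask
        a = (a << 1) & mask            # next power-of-two multiple of a
        b >>= 1                        # move on to the next multiplier bit
    return product
```

The loop runs at most once per multiplier bit, so a 32-bit product costs a bounded number of simple steps, each of which fits the single-cycle model described above.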
The other difference between the MIPS design and the competing Berkeley RISC involved the handling of subroutine calls. Berkeley RISC used a technique called register windows to improve performance of these very common tasks, but in using hardware to do this it locked in the number of calls that could be supported. Each subroutine call required its own set of registers, which in turn required more real estate on the CPU and more complexity in its design. Hennessy felt that a careful compiler could find free registers without resorting to a hardware implementation, and that simply increasing the number of registers would not only make this simple, but increase the performance of all tasks.
In other ways the MIPS design was very much in keeping with the overall RISC design philosophy. To improve overall performance, RISC designs reduce the number of instructions in order to use fewer bits to encode them; in the MIPS design the opcode normally requires only 6 bits of the 32-bit word. The rest of the space in the instruction word is used as storage, either for pointers to addresses in main memory or as direct storage for small numbers. This allows a RISC CPU to load up the instruction and the data it needs in a single operation, whereas older designs, the MOS Technology 6502 for instance, would require separate cycles to load the instructions and data. This change is one of the major performance improvements that RISC offers.
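How an instruction and a small constant share one 32-bit word can be sketched as follows. The example assumes the standard MIPS immediate (I-type) layout of a 6-bit opcode, two 5-bit register fields, and a 16-bit immediate; the helper names are hypothetical and exist only for this illustration.

```python
from typing import Tuple

ADDI = 0x08  # opcode of the "add immediate" instruction


def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack opcode (6 bits), rs (5), rt (5), and immediate (16) into one word."""
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    if not (0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not -0x8000 <= imm <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_i_type(word: int) -> Tuple[int, int, int, int]:
    """Split a 32-bit word back into (opcode, rs, rt, raw 16-bit immediate)."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF


if __name__ == "__main__":
    word = encode_i_type(ADDI, rs=0, rt=1, imm=100)  # addi $1, $0, 100
    print(hex(word))  # 0x20010064
```

The constant 100 travels inside the instruction word itself, so no second memory access is needed to fetch it.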
In 1984 Hennessy was convinced of the future commercial potential of the design, and left Stanford to form MIPS Computer Systems. They released their first design, the R2000, in 1985, improving the design as the R3000 in 1988. These 32-bit CPUs formed the basis of their company through the 1980s, used primarily in SGI's series of workstations. These commercial designs deviated from the Stanford academic research by implementing most of the interlocks in hardware and supplying full multiply and divide instructions (among others).
In 1991 MIPS released the first 64-bit microprocessor, the R4000. However, MIPS had financial difficulties while bringing it to market. The design was so important to SGI, at the time one of MIPS' few major customers, that SGI bought the company outright in 1992 in order to guarantee the design would not be lost. As a subsidiary of SGI, the company became known as MIPS Technologies.
In the early 1990s MIPS started licensing their designs to third-party vendors. This proved fairly successful due to the simplicity of the core, which allowed it to be used in a number of applications that would formerly have used much less capable CISC designs of similar gate count and price. The two are strongly related: the price of a CPU is generally related to the number of gates and the number of external pins. Sun Microsystems attempted to enjoy similar success by licensing their SPARC core but was not nearly as successful. By the late 1990s MIPS was a powerhouse in the embedded processor field, and in 1997 the 48-millionth MIPS-based CPU shipped, making it the first RISC CPU to outship the famous 68k family. MIPS was so successful that SGI spun off MIPS Technologies in 1998. Fully half of MIPS' income today comes from licensing their designs, while much of the rest comes from contract design work on cores that will then be produced by third parties.
In 1999 MIPS formalized their licensing system around two basic designs, the 32-bit MIPS32 (based on MIPS II with some additional features from MIPS III, MIPS IV, and MIPS V) and the 64-bit MIPS64 (based on MIPS V). NEC, Toshiba and SiByte (later acquired by Broadcom) each obtained licenses for the MIPS64 as soon as it was announced. Philips and IDT have since joined them. Success followed success, and today the MIPS cores are among the most-used "heavyweight" cores in the marketplace for computer-like devices (hand-held computers, set-top boxes, etc.), with other designers fighting it out for other niches. Some indication of their success is the fact that Freescale (spun off by Motorola) uses MIPS cores in their set-top box designs, instead of their own PowerPC-based cores.
Since the MIPS architecture is licensable, it has attracted several processor start-up companies over the years. One of the first start-ups to design MIPS processors was Quantum Effect Devices (see next section). The MIPS design team that designed the R4300 started the company SandCraft, which designed the R5432 for NEC and later produced the SR71000, one of the first out-of-order execution processors for the embedded market. The original DEC StrongARM team eventually split into two MIPS-based start-ups: SiByte, which produced the SB-1250, one of the first high-performance MIPS-based systems-on-a-chip (SoC), and Alchemy Semiconductor (later acquired by AMD), which produced the Au-1000 SoC for low-power applications. Lexra used a MIPS-like architecture and added DSP extensions for the audio chip market and multithreading support for the networking market. Because Lexra did not license the architecture, two lawsuits were started between the two companies. The first was quickly resolved when Lexra promised not to advertise their processors as MIPS-compatible. The second (about MIPS patent 4814976 for handling unaligned memory access) was protracted, hurt both companies' business, and culminated in MIPS Technologies giving Lexra a free license and a large cash payment.
Two companies have emerged that specialize in building multi-core devices using the MIPS architecture. Raza Microelectronics Inc purchased the product line from the failing SandCraft and later produced devices that contained 8 CPU cores and were targeted at the telecom and networking markets. Cavium Networks, originally a security processor vendor, also produced devices with 8 CPU cores for the same markets. Both of these companies designed their cores in-house, licensing just the architecture instead of purchasing cores from MIPS.
Losing the Desktop
Among the manufacturers which have made computer workstation systems using MIPS processors are SGI, MIPS Computer Systems, Inc., Whitechapel Workstations, Olivetti, Siemens-Nixdorf, Acer, Digital Equipment Corporation, NEC, and DeskStation. Operating systems ported to the architecture include SGI's IRIX, Microsoft's Windows NT (until v4.0), Windows CE, Linux, BSD, UNIX System V, SINIX and MIPS Computer Systems' own RISC/os.
There was speculation in the early 1990s that MIPS and other powerful RISC processors would overtake the Intel IA32 architecture. This was encouraged by the support of the first two versions of Microsoft's Windows NT for DEC Alpha, MIPS and PowerPC, and to a lesser extent the Clipper architecture and SPARC. However, as Intel quickly released faster versions of their Pentium class CPUs, Microsoft Windows NT v4.0 dropped support for anything but Intel and Alpha. With SGI's decision to transition to the Itanium and IA32 architectures, use of MIPS processors on the desktop has now disappeared almost completely [http://www.sgi.com/support/mips_irix.html SGI announcing the end of MIPS]. See main article: Advanced Computing Environment.
Through the 1990s, the MIPS architecture was widely adopted by the embedded market, including for use in computer networking and telecommunications, video arcade games, home video game consoles, computer printers, digital set-top boxes, digital televisions, DSL and cable modems, and personal digital assistants.
The low power consumption and heat characteristics of embedded MIPS implementations, the wide availability of embedded development tools, and knowledge about the architecture mean that use of MIPS microprocessors in embedded roles is likely to remain common.
Synthesizable Cores for Embedded Markets
In recent years most of the technology used in the various MIPS generations has been offered as IP cores (building blocks) for embedded processor designs. Both 32-bit and 64-bit basic cores are offered, known as the 4K and 5K respectively, and the design itself can be licensed as MIPS32 and MIPS64. These cores can be mixed with add-in units such as FPUs, SIMD systems, various input/output devices, etc.
MIPS cores have been commercially successful, now being used in many consumer and industrial applications. MIPS cores can be found in newer Cisco, Linksys and Mikrotik's routerboard routers, cable modems and ADSL modems, smartcards, laser printer engines, set-top boxes, robots, handheld computers, the Sony PlayStation 2 and the Sony PlayStation Portable. In cellphone/PDA applications, the MIPS core has been unable to displace the incumbent, competing ARM core.
Examples of MIPS-powered devices:
Broadcom BCM5352E - WiFi router processor with 54g WLAN, fast Ethernet, 200 MHz, 16 KB instruction cache, 8 KB data cache, 256 B prefetch cache, MMU, 16-bit 100 MHz SDRAM controller, serial/parallel flash, 5-port 100 Mbit/s Ethernet (switch), 16 GPIO, JTAG, 2x UART, 336-ball BGA.
BCM 11xx, 12xx, 14xx - 64-bit "SiByte" MIPS line.
MIPS architecture processors include: IDT RC32438; ATI Xilleon; Alchemy Au1000, 1100, 1200; Broadcom Sentry5; RMI XLR7xx; Cavium Octeon CN30xx, CN31xx, CN36xx, CN38xx and CN5xxx; Infineon Technologies EasyPort, Amazon, Danube, ADM5120, WildPass, INCA-IP, INCA-IP2; NEC EMMA and EMMA2, NEC VR4181A, VR4121, VR4122, VR5432, VR5500; Oak Technologies Generation; PMC-Sierra RM11200; QuickLogic QuickMIPS ESP; Toshiba "Donau", Toshiba TMPR492x, TX4925, TX9956, TX7901.
MIPS-based Supercomputers
One of the more interesting applications of the MIPS architecture is its use in massive processor count supercomputers. Silicon Graphics (SGI) refocused its business from desktop graphics workstations to the high performance computing (HPC) market in the early 1990s. The success of the company's first foray into server systems, the Challenge series based on the R4400 and R8000, and later R10000, motivated SGI to create a vastly more powerful system. The introduction of the integrated R10000 allowed SGI to produce a system, the Origin 2000, eventually scalable to 1024 CPUs using its NUMAlink cc-NUMA interconnect. The Origin 2000 begat the Origin 3000 series, which topped out at the same maximum of 1024 CPUs but used the R14000 and R16000 chips at up to 700 MHz. Its MIPS-based supercomputers were withdrawn in 2005 when SGI made the strategic decision to move to Intel's IA-64 architecture.
An HPC startup introduced a radical MIPS-based supercomputer in 2007. SiCortex, Inc. has created a tightly integrated Linux cluster supercomputer based on the MIPS64 architecture and a high performance interconnect based on the Kautz digraph topology. The system is very power efficient and computationally powerful. The most distinctive aspect of the system is its multicore processing node, which integrates six MIPS64 cores, a crossbar memory controller, an interconnect DMA engine, and Gigabit Ethernet and PCI Express controllers on a single chip that consumes only 10 watts of power yet has a peak floating point performance of 6 GFLOPS. The most powerful configuration, the SC5832, is a single-cabinet supercomputer consisting of 972 such node chips for a total of 5832 MIPS64 processor cores and 5.8 teraFLOPS of peak performance.
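The SC5832 totals follow directly from the per-node figures quoted above, as this short check shows. The constant and function names are made up for the illustration, and the power figure covers the node chips only, since the text gives no number for the rest of the cabinet.

```python
NODES = 972              # node chips in the SC5832, from the text
CORES_PER_NODE = 6       # MIPS64 cores per node chip
GFLOPS_PER_NODE = 6.0    # peak floating point performance per node chip
WATTS_PER_NODE = 10.0    # power drawn by one node chip


def total_cores() -> int:
    return NODES * CORES_PER_NODE


def peak_teraflops() -> float:
    return NODES * GFLOPS_PER_NODE / 1000.0


def node_chip_power_watts() -> float:
    """Power of the node chips alone; memory, fans, and so on are not included."""
    return NODES * WATTS_PER_NODE


if __name__ == "__main__":
    print(total_cores())                # 5832
    print(round(peak_teraflops(), 1))   # 5.8
```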
The first commercial MIPS CPU model, the R2000, was announced in 1985. It added multiple-cycle multiply and divide instructions in a somewhat independent on-chip unit. New instructions were added to retrieve the results from this unit back to the execution core; these result-retrieving instructions were interlocked.
The R2000 could be booted either big-endian or little-endian. It had thirty-two 32-bit general purpose registers, but no condition code register (the designers considered it a potential bottleneck), a feature it shares with the AMD 29000 and the DEC Alpha. Unlike other registers, the program counter is not directly accessible.
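The difference between the two boot modes is only the order in which the bytes of a word are laid out in memory. The sketch below uses Python's standard `struct` module to show both layouts for the same 32-bit value; the function names are hypothetical and the example does not model the R2000 itself.

```python
import struct


def store_word(value: int, big_endian: bool) -> bytes:
    """Return the four bytes of a 32-bit word in the chosen byte order."""
    return struct.pack(">I" if big_endian else "<I", value & 0xFFFFFFFF)


def load_word(data: bytes, big_endian: bool) -> int:
    """Rebuild a 32-bit word from four bytes in the chosen byte order."""
    return struct.unpack(">I" if big_endian else "<I", data)[0]


if __name__ == "__main__":
    print(store_word(0x12345678, big_endian=True).hex())   # 12345678
    print(store_word(0x12345678, big_endian=False).hex())  # 78563412
```

Data written in one mode and read back in the other comes out byte-swapped, which is why the mode has to be fixed at boot.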
The R2000 also had support for up to four co-processors, one of which was built into the main CPU and handled exceptions, traps and memory management, while the other three were left for other uses. One of these could be filled by the optional R2010 FPU, which had thirty-two 32-bit registers that could be used as sixteen 64-bit registers for double-precision.
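Pairing two 32-bit registers to hold one double-precision value amounts to splitting the 64-bit IEEE 754 bit pattern into two words. The following sketch shows that split and its inverse; which register of a pair holds the low word and which the high word is treated here as an assumption, and the function names are hypothetical.

```python
import struct
from typing import Tuple


def split_double(x: float) -> Tuple[int, int]:
    """Return (low_word, high_word) of the 64-bit IEEE 754 encoding of x."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits & 0xFFFFFFFF, bits >> 32


def join_double(low: int, high: int) -> float:
    """Rebuild a double from the two 32-bit halves of its encoding."""
    bits = ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


if __name__ == "__main__":
    low, high = split_double(1.0)
    print(hex(low), hex(high))  # 0x0 0x3ff00000
```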
The R3000 succeeded the R2000 in 1988, adding 32 KB (soon increased to 64 KB) caches for instructions and data, along with cache coherency support for multiprocessor use. While there were flaws in the R3000's multiprocessor support, it still managed to be a part of several successful multiprocessor designs. The R3000 also included a built-in MMU, a common feature on CPUs of the era. The R3000, like the R2000, could be paired with an R3010 FPU. The R3000 was the first successful MIPS design in the marketplace, and eventually over one million were made. A speed-bumped version of the R3000 running at up to 40 MHz, the R3000A, delivered a performance of 32 VUPs (VAX Units of Performance). The R3000A was the processor used in the extremely successful Sony PlayStation. Third-party designs include Performance Semiconductor's R3400 and IDT's R3500, both of which were R3000As with an integrated R3010 FPU. Toshiba's R3900 was virtually the first SoC for the early handheld PCs based on Windows CE. A radiation-hardened variant for space applications, the Mongoose-V, is an R3000 with an integrated R3010 FPU.
The R4000 series, released in 1991, extended the MIPS instruction set to a full 64-bit architecture, moved the FPU onto the main die to create a single-chip microprocessor, and operated at a radically high internal clock speed (it was introduced at 100 MHz). However, in order to achieve the clock speed the caches were reduced to 8 KB each and they took three cycles to access. The high operating frequencies were achieved through the technique of deep pipelining (called super-pipelining at the time). A number of improved versions soon followed the R4000, including the R4400 (1993), which included 16 KB caches, largely bug-free 64-bit operation, and support for a larger external level 2 cache.
MIPS, now a division of SGI called MTI, designed the lower-cost R4200, and later the even lower cost R4300, which was the R4200 with a 32-bit external bus. The Nintendo 64 used a NEC VR4300 CPU that was based upon the low-cost MIPS R4300i. [http://www.nec.co.jp/press/en/9801/2002.html NEC Offers Two High Cost Performance 64-bit RISC Microprocessors]
Quantum Effect Devices (QED), a separate company started by former MIPS employees, designed the R4600 "Orion", the R4700 "Orion", the R4650 and the R5000. Where the R4000 had pushed clock frequency and sacrificed cache capacity, the QED designs emphasized large caches which could be accessed in just two cycles and efficient use of silicon area. The R4600 and R4700 were used in low-cost versions of the SGI Indy workstation as well as the first MIPS-based Cisco routers, such as the 36x0 and 7x00-series routers. The R4650 was used in the original WebTV set-top boxes (now Microsoft TV). The R5000 FPU had more flexible single precision floating-point scheduling than the R4000, and as a result, R5000-based SGI Indys had much better graphics performance than similarly clocked R4400 Indys with the same graphics hardware. SGI gave the old graphics board a new name when it was combined with the R5000 in order to emphasize the improvement.
QED later designed the RM7000 and RM9000 family of devices for embedded markets like networking and laser printers. QED was acquired by the semiconductor manufacturer PMC-Sierra in August 2000, the latter company continuing to invest in the MIPS architecture. The RM7000 included an on-board 256 kB level 2 cache and a controller for an optional level 3 cache. The RM9xx0 were a family of SoC devices which included northbridge peripherals such as a memory controller, PCI controller, and gigabit Ethernet controller, and fast I/O such as a HyperTransport port.
The R8000 (1994) was the first superscalar MIPS design, able to execute two integer or floating point and two memory instructions per cycle. The design was spread over six chips: an integer unit (with 16 KB instruction and 16 KB data caches), a floating-point unit, three full-custom secondary cache tag RAMs (two for secondary cache accesses, one for bus snooping), and a cache controller ASIC. The design had two fully pipelined double precision multiply-add units, which could stream data from the 4 MB off-chip secondary cache. The R8000 powered SGI's POWER Challenge servers in the mid 1990s and later became available in the POWER Indigo2 workstation. Although its FPU performance fit scientific users quite well, its limited integer performance and high cost dampened appeal for most users, and the R8000 was in the marketplace for only a year and remains fairly rare.
In 1995, the R10000 was released. This processor was a single-chip design, ran at a faster clock speed than the R8000, and had larger 32 KB primary instruction and data caches. It was also superscalar, but its major innovation was out-of-order execution. Even with a single memory pipeline and simpler FPU, the vastly improved integer performance, lower price, and higher density made the R10000 preferable for most customers.
Recent designs have all been based upon the R10000 core. The R12000 used improved manufacturing to shrink the chip and operate at higher clock rates. The revised R14000 allowed higher clock rates with additional support for DDR SRAM in the off-chip cache, and a faster front side bus clocked at 200 MHz for better throughput. Later iterations are named the R16000 and the R16000A and feature increased clock speed, additional L1 cache, and a smaller manufacturing process than their predecessors.
Other members of the MIPS family include the R6000, an ECL implementation of the MIPS architecture which was produced by Bipolar Integrated Technology. The R6000 microprocessor introduced the MIPS II instruction set. Its TLB and cache architecture are different from all other members of the MIPS family. The R6000 did not deliver the promised performance benefits, and although it saw some use in Control Data machines, it quickly disappeared from the mainstream market.
NOTE: in the branching and jump instructions, the offset can be replaced by a label present somewhere in the code.
NOTE: there is no corresponding "load lower immediate" instruction; this can be done by using addi (add immediate, see below) or ori (or immediate) with the register $0 (whose value is always zero). For example, both addi $1, $0, 100 and ori $1, $0, 100 load the decimal value 100 into register $1.
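The behaviour of the two instructions against the always-zero register can be sketched with a toy register file. This is a simplified model written for this note, not an emulator: the class and function names are hypothetical, and the overflow exception that addi can raise is deliberately left out. It also shows where the two instructions part ways, since addi sign-extends its 16-bit immediate while ori zero-extends it.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


class Registers:
    """Thirty-two 32-bit registers; register 0 always reads as zero."""

    def __init__(self) -> None:
        self._regs = [0] * 32

    def read(self, n: int) -> int:
        return self._regs[n]

    def write(self, n: int, value: int) -> None:
        if n != 0:  # writes to $0 are silently discarded
            self._regs[n] = value & MASK32


def addi(regs: Registers, rt: int, rs: int, imm: int) -> None:
    # Overflow trapping is omitted in this sketch.
    regs.write(rt, regs.read(rs) + sign_extend16(imm))


def ori(regs: Registers, rt: int, rs: int, imm: int) -> None:
    regs.write(rt, regs.read(rs) | (imm & 0xFFFF))
```

For a small positive constant such as 100 both calls leave the same value in the target register; for an immediate with its top bit set, such as 0xFFFF, addi yields 0xFFFFFFFF and ori yields 0x0000FFFF.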
NOTE: An arithmetic operation on unsigned immediates (for example addiu) differs from its signed counterpart (addi) in that it does not raise an exception on overflow. Subtracting an immediate can be done by adding the negation of that value as the immediate.
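The distinction can be sketched in a few lines. In MIPS it is the signed forms (add, addi) that raise an exception on two's-complement overflow, while the "unsigned" forms (addu, addiu) wrap around silently. The names below, including the `OverflowTrap` exception, are hypothetical stand-ins for the hardware behaviour.

```python
MASK32 = 0xFFFFFFFF


class OverflowTrap(Exception):
    """Stand-in for the integer overflow exception raised by add/addi."""


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def add_trapping(a: int, b: int) -> int:
    """Model of add/addi: trap if the signed result does not fit in 32 bits."""
    result = to_signed32(a) + to_signed32(b)
    if not -(1 << 31) <= result < (1 << 31):
        raise OverflowTrap("signed 32-bit overflow")
    return result & MASK32


def add_wrapping(a: int, b: int) -> int:
    """Model of addu/addiu: the result simply wraps modulo 2**32."""
    return (a + b) & MASK32
```

Subtraction by an immediate falls out of the same functions: `add_trapping(10, -3 & MASK32)` adds the negated constant and returns 7.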
These instructions are accepted by the MIPS assembler, however they are not real instructions within the MIPS instruction set. Instead, the assembler translates them into sequences of real instructions.
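A toy expander illustrates what "translating into sequences of real instructions" means. It handles only two well-known pseudo-instructions, `move` and `li`, and the expansions chosen are one common possibility; real assemblers differ in detail, and the function name is hypothetical.

```python
from typing import List


def expand_pseudo(line: str) -> List[str]:
    """Expand `move` and `li` into real instructions; pass anything else through."""
    op, _, rest = line.strip().partition(" ")
    args = [a.strip() for a in rest.split(",")] if rest else []
    if op == "move" and len(args) == 2:
        dst, src = args
        return [f"addu {dst}, {src}, $0"]
    if op == "li" and len(args) == 2:
        dst, value = args[0], int(args[1], 0)
        if 0 <= value <= 0xFFFF:            # fits a zero-extended immediate
            return [f"ori {dst}, $0, {value}"]
        if -0x8000 <= value < 0:            # fits a sign-extended immediate
            return [f"addiu {dst}, $0, {value}"]
        value &= 0xFFFFFFFF                 # otherwise build it in two halves
        high, low = value >> 16, value & 0xFFFF
        out = [f"lui {dst}, 0x{high:04x}"]
        if low:
            out.append(f"ori {dst}, {dst}, 0x{low:04x}")
        return out
    return [line.strip()]


if __name__ == "__main__":
    for text in ("li $t0, 100", "li $t0, 0x12345678", "move $t1, $t0"):
        print(text, "->", expand_pseudo(text))
```

A constant too wide for a 16-bit immediate needs two real instructions, lui for the upper half and ori for the lower, which is why a single `li` may occupy two words in the assembled program.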
Read = 0x0, Write = 0x1, Read/Write = 0x2
OR Create = 0x100, Truncate = 0x200, Append = 0x8
OR Text = 0x4000, Binary = 0x8000
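The values above are combined with bitwise OR: one access mode, plus any of the optional flags. The sketch below shows the combination using the numbers exactly as listed; the constant and function names are hypothetical, and which call consumes the resulting word is not stated in the surviving text, so none is assumed.

```python
# Access modes (choose one)
READ, WRITE, READ_WRITE = 0x0, 0x1, 0x2
# Optional flags (OR in any number)
CREATE, TRUNCATE, APPEND = 0x100, 0x200, 0x8
TEXT, BINARY = 0x4000, 0x8000


def open_flags(access: int, *options: int) -> int:
    """Combine one access mode with any number of option flags."""
    flags = access
    for option in options:
        flags |= option
    return flags


if __name__ == "__main__":
    print(hex(open_flags(WRITE, CREATE, TRUNCATE, BINARY)))  # 0x8301
```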
* "Mips" the rabbit in Super Mario 64 is named after the technology, which was used by the Nintendo 64.
* David A. Patterson and John L. Hennessy. Computer Organization and Design: The Hardware/Software Interface. Morgan Kaufmann Publishers. ISBN 1-55860-604-1.
* Dominic Sweetman. See MIPS Run. Morgan Kaufmann Publishers. ISBN 1-55860-410-3.
* Erin Farquhar and Philip Bunce. MIPS Programmer's Handbook. Morgan Kaufmann Publishers. ISBN 1-55860-297-6.
DLX, a very similar architecture designed by John L. Hennessy (creator of MIPS) for teaching purposes
Loongson, a MIPS-like processor architecture developed at the Chinese Academy of Sciences
MIPS-X, developed as a follow-on project to the MIPS architecture
Mongoose-V, a radiation-hardened version of the MIPS R3000 used in spacecraft
* [http://www.langens.eu/tim/ea/mips_en.php Full overview of MIPS architecture.]
* [http://www.cs.wisc.edu/~larus/HP_AppA.pdf Patterson & Hennessy - Appendix A (PDF)]
* [http://logos.cs.uic.edu/366/notes/MIPS%20Quick%20Tutorial.htm summary of MIPS assembly language]
* [http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html MIPS Instruction reference]
* [http://www.cpu-collection.de/?tn=1&l0=cl&l1=MIPS%20Rx000 MIPS processor images and descriptions at cpu-collection.de]
* [http://chortle.ccsu.edu/AssemblyTutorial/TutorialContents.html A programmed introduction to MIPS assembly]
* [http://www.cs.umd.edu/class/spring2003/cmsc311/Notes/Mips/bitshift.html mips bitshift operators]
* [http://www.it.uu.se/edu/course/homepage/datsystDV/ht04/Project/tools/machinedata/4KcProgMan.pdf MIPS software user's manual]
Wikimedia Foundation. 2010.