- IBM POWER
POWER is a RISC instruction set architecture designed by IBM. The name is a backronym for "Performance Optimization With Enhanced RISC".
POWER is also the name of a series of microprocessors that implement the instruction set architecture (ISA). The POWER series microprocessors are used as the main CPU in many of IBM's servers, minicomputers, workstations, and supercomputers.
The POWER3 and subsequent microprocessors in the POWER series all implement the full 64-bit PowerPC architecture. The POWER3 and later processors do not implement any of the old POWER instructions that were removed from the ISA when the PowerPC ISA was introduced, nor any of the POWER2 extensions such as lfq or stfq.
IBM also encourages other developers and manufacturers to use the POWER architecture, or any derivative of it, through the Power.org community; this includes all of PowerPC and Cell.
Appendix E of [ftp://www6.software.ibm.com/software/developer/library/es-ppcbook1.zip Book I: PowerPC User Instruction Set Architecture] of [http://www-128.ibm.com/developerworks/eserver/library/es-archguide-v2.html PowerPC Architecture Book, Version 2.02] describes the differences between the POWER and POWER2 instruction set architectures and the version of the PowerPC instruction set architecture implemented by the POWER5.
History
The 801 project
In 1974, IBM started a project with the design objective of creating a large telephone-switching network with the potential capacity to handle at least 300 calls per second. It was projected that 20,000 machine instructions would be required to handle each call while maintaining a real-time response, so a processor speed of 12 MIPS was deemed necessary. This requirement was extremely ambitious for the time, but it was realised that much of the complexity of contemporary CPUs could be dispensed with, since this machine would need only to perform I/O, branches, register-register adds, and moves of data between registers and memory, and would have no need for special instructions to perform heavy arithmetic.
This simple design philosophy, whereby each step of a complex operation is specified explicitly by a single machine instruction, and all instructions are required to complete in the same constant time, would later come to be known as RISC.
By 1975 the telephone switch project had been canceled without producing a prototype. Estimates from simulations produced in the project's first year, however, suggested that the processor being designed for the project could be a very promising general-purpose processor, so work continued at Thomas J. Watson Research Center building #801, on the "801" project.
1982 Research Project “Cheetah”
For two years at the Watson Research Center, the superscalar limits of the “801” design were explored, such as the feasibility of implementing the design using multiple functional units to improve performance, similar to what had been done in the IBM System/360 Model 91 and the CDC 6600 (although the Model 91 had been based on a CISC design). The aim was to determine whether a RISC machine could sustain multiple instructions per cycle, and what changes to the “801” design would be needed to support multiple execution units.
To increase performance, “Cheetah” had separate branch, fixed-point, and floating-point execution units. Many changes were made to the “801” design to allow for a multiple-execution-unit design. “Cheetah” was originally planned to be manufactured using bipolar ECL technology, but by 1984 CMOS afforded an increase in the level of circuit integration while improving transistor-logic performance.
The America Project
In 1985, research on a second-generation RISC architecture started at the IBM Thomas J. Watson Research Center, producing the "AMERICA architecture"; in 1986, IBM Austin started developing the RS/6000 series, based on that architecture.
POWER and RS/6000
The first computers from IBM to incorporate the POWER Architecture ("Performance Optimization With Enhanced RISC"), introduced in February 1990, were called the "RISC System/6000" or RS/6000. These RS/6000 computers were divided into two classes, workstations and servers, and hence introduced as the POWERstation and POWERserver. The RS/6000 CPU had 2 configurations, called the "RIOS-1" and "RIOS.9" (or more commonly the "POWER1" CPU). A RIOS-1 configuration had a total of 10 discrete chips: an instruction cache chip, fixed-point chip, floating-point chip, 4 data cache chips, storage control chip, input/output chip, and a clock chip. The lower-cost RIOS.9 configuration had 8 discrete chips: an instruction cache chip, fixed-point chip, floating-point chip, 2 data cache chips, storage control chip, input/output chip, and a clock chip.
A single-chip implementation of RIOS, RSC (for "RISC Single Chip"), was developed for lower-end RS/6000s; the first machines using RSC were released in 1992.
Amazon
In 1990 the Amazon project was started to create a common architecture that would host both AIX and OS/400. The AS/400 engineering team at IBM was designing a RISC instruction set to replace the CISC instruction set of the existing AS/400 computers. Their original design was a variant of the existing "IMPI" instruction set, extended to 64 bits and given some RISC instructions to speed up the more computationally intensive commercial applications that were being put on AS/400s. IBM management wanted them to use PowerPC, but they resisted, arguing that the existing 32/64-bit PowerPC instruction set would not enable a viable transition for OS/400 software and that the existing instruction set required extensions for the commercial applications on the AS/400. Eventually, an extension to the PowerPC instruction set, called "Amazon", was developed.
At the same time, the RS/6000 developers were broadly expanding their product line to include systems ranging from low-end workstations, through large enterprise SMP systems that competed with mainframes, to clustered RS/6000-SP2 supercomputing systems. PowerPC processors developed in the AIM alliance suited the low-end RISC workstation and small server space well. However, mainframe-class and large clustered supercomputing systems required more performance and more RAS (reliability, availability, and serviceability) features than processors designed for Apple PowerMacs provided. Multiple processor designs were required to simultaneously meet the requirements of the cost-focused Apple PowerMac, the high-performance, high-RAS RS/6000 systems, and the AS/400 transition to PowerPC.
Amazon was extended to support those features as well, so that processors could be designed for use in both high-end RS/6000 and AS/400 machines.
The project to develop the first such processor was "Bellatrix" (the name of a star in the Orion constellation, also called the "Amazon Star"). The Bellatrix project was extremely ambitious in its pervasive use of self-timed and pulse-based circuits and in the EDA tools required to support this design strategy, and it was eventually terminated. To address the technical workstation, supercomputer, and engineering/scientific markets, IBM Austin (the home of the RS/6000s) then started developing a time-to-market single-chip version of the POWER2 (P2SC) in parallel with the development of a sophisticated 64-bit PowerPC processor with the POWER2 extensions and twin sophisticated MAF floating-point units (the POWER3/630). To address RS/6000 commercial applications and AS/400 systems, IBM Rochester (the home of the AS/400s) started developing the first of the high-end 64-bit PowerPC processors with AS/400 extensions, and IBM Endicott started developing a low-end single-chip PowerPC processor with AS/400 extensions.
The A25/30 "Muskie" high-end multi-chip AS/400 processor and A10 "Cobra" single-chip AS/400 processor came out in 1995.
In 1997, the "Apache" processor, developed at IBM Endicott, was released. It was used in RS/6000s under the name RS64, and in AS/400s as well, as were its RS64 successors.
POWER2
IBM started the POWER2 processor effort as a successor to the POWER1 two years before the creation of the 1991 Apple/IBM/Motorola alliance in Austin, Texas. The effort was affected by the diversion of resources to jump-start the Apple/IBM/Motorola alliance, and the POWER2 took 5 years from start to system shipment. With a second fixed-point unit, a second floating-point unit, and other performance enhancements added to the design, the POWER2 had leadership performance when it was announced in November 1993.
New instructions were also added to the instruction set:
*Quad-word storage instructions. The quad-word load instruction moves two adjacent double-precision values into two adjacent floating-point registers.
*Hardware square root instruction.
*Floating-point to integer conversion instructions.
To support the RS/6000 and RS/6000 SP2 product lines in 1996, IBM had its own design team, outside the Apple/IBM/Motorola alliance, build a single-chip implementation of POWER2, the P2SC ("POWER2 Super Chip"), in IBM's most advanced and dense CMOS-6s process. P2SC combined all of the separate POWER2 instruction cache, fixed-point, floating-point, storage control, and data cache CPU chips onto a single huge die. At the time of its introduction, P2SC was the largest processor in the industry and had the highest transistor count. It was one of the first processors, if not the first, to have an integrated memory controller on the CPU. Despite the challenge of its size, complexity, and advanced CMOS process, the first tape-out version of the processor could be shipped, and it had leadership floating-point performance at the time it was announced. P2SC was the processor used in the 1997 IBM Deep Blue chess-playing supercomputer, which beat chess grandmaster Garry Kasparov. With its twin sophisticated MAF floating-point units and very wide, low-latency memory interfaces, P2SC was primarily targeted at engineering and scientific applications. P2SC was eventually succeeded by the POWER3/630, which added 64-bit operation, SMP capability, L2 cache support, and a full transition to PowerPC to P2SC's sophisticated twin MAF floating-point units.
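The quad-word load listed among the POWER2 additions can be illustrated with a small model. The following is a minimal sketch rather than a description of the hardware: the function name `lfq`, the flat `bytes` memory, and the Python list standing in for the floating-point register file are all assumptions made for illustration, and alignment rules, addressing modes, and exceptions are not modeled.

```python
import struct

def lfq(memory: bytes, address: int, fpr: list, frt: int) -> None:
    """Model a quad-word load: copy the two adjacent doubles at `address`
    into the adjacent floating-point registers frt and frt + 1."""
    # Assumption of this sketch: the register pair must not run past
    # register 31, so wrap-around behaviour is deliberately left out.
    if not 0 <= frt < 31:
        raise ValueError("sketch requires 0 <= frt < 31")
    # ">2d" reads two big-endian IEEE 754 double-precision values (16 bytes).
    first, second = struct.unpack_from(">2d", memory, address)
    fpr[frt] = first
    fpr[frt + 1] = second

# Hypothetical memory image holding four doubles, and 32 zeroed registers.
memory = struct.pack(">4d", 1.5, -2.25, 3.0, 4.75)
fpr = [0.0] * 32

# One quad-word load fills two registers from bytes 8..23.
lfq(memory, 8, fpr, 4)
print(fpr[4], fpr[5])
```

A single call replaces what would otherwise be two separate double-precision loads, which is the point of the instruction for floating-point-heavy code.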
PowerPC
In 1991 IBM realized that it might be able to make POWER a high-volume architecture by making and selling chips to other system manufacturers. It approached Apple with the goal of collaborating on the development of a family of single-chip microprocessors based on the POWER architecture. Soon after, Apple, as one of Motorola's largest customers of desktop-class microprocessors, asked Motorola to join the discussions because of their long relationship, because Motorola had more extensive experience than IBM in manufacturing high-volume microprocessors, and so that Motorola could serve as a second source for the microprocessors. This three-way collaboration, based in Austin, Texas, became known as the AIM alliance, for Apple, IBM, Motorola.
The result, in 1993 after 2 years of development, was the PowerPC architecture, a modified version of the POWER architecture. The PowerPC architecture added single-precision floating-point instructions and general register-to-register multiply and divide instructions, and removed some POWER features such as the specialized multiply and divide instructions using the MQ register. It also added a 64-bit version of the architecture and support for SMP.
The first PowerPC chip was the PowerPC 601. See the PowerPC page for more information on PowerPC.
POWER3
IBM introduced the POWER3 processor in 1998. It implemented the 64-bit POWER instruction set, including all of the optional instructions of the ISA (at the time), and had two floating-point units, three fixed-point units, and two load-store units. All subsequent POWER processors implemented the full 64-bit PowerPC and POWER instruction sets, so that there were no longer any IBM processors that implemented only POWER or only POWER2.
POWER4
IBM introduced the POWER4 processor, the first in the GIGA-Series, in 2001. It was, again, a full 64-bit processor, implementing the full 64-bit PowerPC instruction set; it also had the AS/400 extensions, and was used in both RS/6000 and AS/400 systems, replacing both the POWER3 and the RS64 processors. A new ISA release at this point, called the PowerPC 2.00 ISA, added a couple of extensions, such as a form of mfcr that also takes a field argument.
POWER5
IBM introduced the POWER5 processor in 2004. It is a dual-core processor with support for simultaneous multithreading with two threads per core, so it implements 4 logical processors. Using ViVA ("Virtual Vector Architecture"), several POWER5 processors can act together as a single vector processor. The POWER5 added more instructions to the ISA.
The POWER5+ added even more instructions, and there was a new release of the ISA, version 2.02.
POWER6
POWER6 was announced on May 21, 2007. It adds VMX to the POWER series. It also introduces the second generation of ViVA, ViVA-2, which is the biggest change to the POWER series of processors since the transition from POWER3 to POWER4. It is a dual-core design, reaching 4.7 GHz at 65 nm. It has very advanced interchip communication technology. Its power consumption is nearly the same as that of the preceding POWER5, while it offers doubled performance.
POWER7
Currently in development at IBM, POWER7 will be the first of the Peta-Series. It is projected for release around 2010 and has been selected by DARPA as a potential processor to be used in its petaFLOPS supercomputer. In the early 2000s IBM submitted its proposal and received $53 million from DARPA to continue to participate in the challenge; in 2006 IBM received $244 million to build a petaFLOPS computer for DARPA.
The architecture
The POWER design is descended directly from the earlier 801 CPU, widely considered to be the first true RISC processor design. The 801 was used in a number of applications inside IBM hardware, but did not become publicly known until IBM released the poorly performing IBM PC/RT in the mid-1980s.
At about the same time the PC/RT was being released, IBM started the "America Project", to design the most powerful CPU on the market. They were interested primarily in fixing two problems in the 801 design:
*the 801 required all instructions to complete in one clock cycle, which ruled out floating-point instructions
*although the decoder was pipelined as a side effect of these single-cycle operations, the design did not exploit superscalar execution
Floating point became a focus for the America Project, and IBM was able to use new algorithms developed in the early 1980s that could support 64-bit double-precision multiplies and divides in a single cycle. The FPU portion of the design was separate from the instruction decoder and integer parts, allowing the decoder to send instructions to both the FPU and ALU (integer) execution units at the same time. IBM complemented this with a complex instruction decoder which could be fetching one instruction, decoding another, and sending one to the ALU and FPU at the same time, resulting in one of the first superscalar CPU designs in use.
The system used thirty-two 32-bit integer registers and another thirty-two 64-bit floating-point registers, each in their own unit. The branch unit also included a number of "private" registers for its own use, including the program counter.
The 801 was a simple design, and an overcorrection to its simplicity resulted in the POWER design being more complex than most RISC CPUs. For instance, the POWER (and PowerPC) instruction set includes over 100 op-codes of variable length, many of which are variations on others. This compares (for instance) with the ARM, which has only 34 instructions.
Another interesting feature of the architecture is a "virtual address" system which maps all addresses into a 52-bit space. In this way applications can share memory in a "flat" 32-bit space, and all of the programs can have different blocks of 32 bits each.
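The mapping from a 32-bit program address into the 52-bit virtual space can be sketched in a few lines. This is an illustrative model under stated assumptions, not a description taken from this article: it assumes the segmented scheme in which the top 4 bits of a 32-bit effective address select one of 16 segment registers, each holding a 24-bit segment ID that is concatenated with the remaining 28-bit offset (24 + 28 = 52 bits). The function name, the constants, and the segment ID values are hypothetical, and the later page-translation step is omitted.

```python
OFFSET_BITS = 28        # low bits of the effective address kept as the offset
SEGMENT_ID_BITS = 24    # width of the ID held in each segment register
EFFECTIVE_BITS = 32     # width of a program-visible address

def virtual_address(effective_address: int, segment_registers: list) -> int:
    """Form a 52-bit virtual address from a 32-bit effective address."""
    if not 0 <= effective_address < (1 << EFFECTIVE_BITS):
        raise ValueError("effective address must fit in 32 bits")
    # The top 4 bits pick one of the 16 segment registers.
    index = effective_address >> OFFSET_BITS
    offset = effective_address & ((1 << OFFSET_BITS) - 1)
    segment_id = segment_registers[index]
    if not 0 <= segment_id < (1 << SEGMENT_ID_BITS):
        raise ValueError("segment ID must fit in 24 bits")
    # Concatenate the segment ID with the offset.
    return (segment_id << OFFSET_BITS) | offset

# Two hypothetical programs, each with its own 16 segment registers.
program_a = [0x000100 + i for i in range(16)]
program_b = [0x000200 + i for i in range(16)]
program_b[3] = program_a[3]   # both programs map segment 3 to the same ID

shared_a = virtual_address(0x30001234, program_a)
shared_b = virtual_address(0x30001234, program_b)
private_a = virtual_address(0x10000000, program_a)
private_b = virtual_address(0x10000000, program_b)
print(hex(shared_a), hex(shared_b), hex(private_a), hex(private_b))
```

In this model the same 32-bit address reaches shared storage when two programs load the same segment ID, and distinct storage otherwise, which is how each program can see its own flat 32-bit space inside the larger virtual space.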
Implementations
The first POWER1 CPUs consisted of three units: branch, integer and floating point. These were wired together on a largish motherboard to produce a single system. POWER1 was used primarily in the RS/6000 series of workstations. The RSC was a single-chip version of POWER1 (the "SC" stands for "Single Chip"), also used in RS/6000s.
POWER2 was a product-improved POWER1 and was the longest-lived of the POWER series, released in 1993 and still used five years later. It added a second floating-point unit, 256 KiB of cache and 128-bit floating-point math.
POWER3 followed in 1998, moving to a full 64-bit implementation while remaining completely compatible with the POWER instruction set. This had been one of the goals of the POWER project, and the POWER3 was the first of the IBM processors to take advantage of it. It also added a third ALU and a second instruction decoder, for a total of eight functional units.
The POWER4 series places two complete CPU cores (otherwise similar to the POWER3) on a single chip, speeds it up, and adds high-speed connections to up to three additional pairs of POWER4 CPUs. They can be placed together on a motherboard to produce an 8-CPU SMP building block. When processing requires high throughput instead of high code complexity, one of a pair of cores can be turned off so that the remaining cores have the entire bus and L3 cache to themselves. The POWER4, even in single-core form, was considered by many to be the most powerful CPU available at the time. [http://ask.slashdot.org/article.pl?sid=01/12/16/221237&mode=thread&tid=137] [http://www.mdronline.com/publications/mpw/issues/mpw091.html] [http://www.theinquirer.net/en/inquirer/news/2002/07/04/the-64-bit-saga-power4-vs-itanium2]
IBM rolled out the POWER5 processor in 2004. The 1.9 GHz version posted the highest uniprocessor SPECfp score of any shipping chip. The POWER5 powers the i5 and p5 eServers. Improvements in the POWER5 over the POWER4 include: a larger L2 cache, a memory controller on the chip, simultaneous multithreading which appears to the operating system as multiple CPUs, advanced power management, dedicated single-tasking mode, Hypervisor (virtualization technology), and eFuse (hardware re-routing around faults).
Ravi Arimilli, IBM's chief microprocessor designer, has said: "The POWER5 chip is more of a midrange design that can drive up to the high end and then down to things like blades."
IBM servers built with the POWER5 processor offer hardware virtualization in the form of logical partitioning (LPAR). With the micro-partitioning feature, up to ten logical partitions (LPARs) can be created for each CPU; the biggest 64-way system can run 256 independent operating systems. Dynamic LPAR capability allows memory, CPU power and I/O devices to be dynamically moved between partitions. See also Linux on Power.
In 2007, POWER6 was formally announced.
Development of POWER7 is underway.
Derivative CPUs
The first PowerPC processor, the PowerPC 601, was essentially an RSC CPU with some of the more basic instructions emulated in microcode, using a bus interface based on the Motorola 88000 design. This allowed IBM to use the CPU in a number of workstation machines, changing only the motherboard. Since then the PowerPC and POWER architectures have diverged somewhat, but remain mostly compatible at the instruction level.
The radiation-hardened RAD6000 processor used in space-based applications is a derivative of the POWER / RSC CPU architecture.
The IBM RS64 family of processors is based on PowerPC (and thus POWER) and has been used in the RS/6000 and AS/400 product lines. It is optimized for commercial workloads, and does not have the floating-point power expected in the POWER line. It was replaced by the POWER4.
The IBM "Gekko" processor is a modified PowerPC 750CXe, used in the Nintendo GameCube. Broadway is an updated Gekko, made for Nintendo's Wii.
The Cell processor is also derived from the POWER architecture. It features a single in-order, multithreaded superscalar core, coupled to eight independent vector processors or "Synergistic Processing Elements". The processor powers the Sony PlayStation 3 as well as digital TV systems from Toshiba and high-performance computers from IBM.
The Xbox 360, the latest generation of Microsoft's gaming console, uses an in-order triple-core PowerPC "Xenon" processor with modified vector units, clocked at 3.2 GHz [http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/ IBM Developerworks - Xenon processor reference].
References
* [http://domino.research.ibm.com/tchjr/journalindex.nsf/ResVolumes?OpenView&Start=1&Count=1000&Expand=16.1#16.1 IBM Journal of R&D, Volume 34, Issue 1 (1990)] - IBM Journal of Research and Development issue on the original RS/6000
* http://www.research.ibm.com/journal/rd/341/ibmrd3401C.pdf (accessed 2006-07-21)
* - gives more information about POWER1, POWER2, and POWER3
External links
* [http://www.ibm.com/chips/power/ IBM Power Architecture] - Official IBM website
* [http://www.ibm.com/systems/linux/power/ Linux on Power]
* [http://www.pseriestech.org/forum/linux-for-power-systems/ Linux on Power Support]
* [http://www.ibm.com/collaboration/wiki/display/LinuxP/ Linux on Power WIKI]
* [http://www.ibm.com/developerworks/power IBM Power Architecture weekly magazine]
* [http://www.power.org/ Power.org]
* [http://www.pseriestech.org/forum/ Power-Admin.org]
* [http://www.ibm.com/developerworks/power/library/pa-powerppl/ POWER to the people] - an IBM history of POWER and PowerPC
* [http://www.the400squadron.com/amug/200406/NotPowerPC.htm When Is PowerPC Not PowerPC?] - History of the POWER Architecture by Frank Soltis
* [http://www.nersc.gov/vendor_docs/ibm/asm/migrating_source.htm#be6c5d1351jeff Migrating Source Programs]
* [http://www-128.ibm.com/developerworks/library/pa-expert1.html Meet the experts: John McCalpin] - discussion of POWER5 and beyond
* [http://www.rootvg.net/column_risc_.htm 27 years of IBM RISC]
Wikimedia Foundation. 2010.