POWER is a RISC instruction set architecture designed by IBM. The name is a backronym for "Performance Optimization With Enhanced RISC".

POWER is also the name of a series of microprocessors that implement the instruction set architecture (ISA). The POWER series microprocessors are used as the main CPU in many of IBM's servers, minicomputers, workstations, and supercomputers. The POWER3 and subsequent microprocessors in the POWER series all implement the full 64-bit PowerPC architecture. The POWER3 and later processors do not implement any of the old POWER instructions that were removed from the ISA when the PowerPC ISA was introduced, nor any of the POWER2 extensions such as lfq or stfq.

IBM is also encouraging other developers and manufacturers to use the POWER architecture, or any derivative of it, through the community; this includes all of PowerPC and Cell.

Appendix E of Book I: PowerPC User Instruction Set Architecture, part of the PowerPC Architecture Book, Version 2.02, describes the differences between the POWER and POWER2 instruction set architectures and the version of the PowerPC instruction set architecture implemented by the POWER5.


=The 801 project=

In 1974, IBM started a project with the design objective of creating a large telephone-switching network with the capacity to handle at least 300 calls per second. It was projected that 20,000 machine instructions would be required to handle each call while maintaining a real-time response, so a processor speed of 12 MIPS was deemed necessary. This requirement was extremely ambitious for the time, but it was realised that much of the complexity of contemporary CPUs could be dispensed with, since this machine would need only to perform I/O, branches, register-register addition, and data movement between registers and memory, and would have no need for special instructions to perform heavy arithmetic.
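The arithmetic behind the performance target can be checked with a short sketch. The call rate and instruction count are the figures quoted above; the `headroom` factor of 2 is an assumption, inferred only from the ratio between the stated 12 MIPS target and the raw product of the two figures, and is not explained in the text.

```python
# Figures quoted in the text.
CALLS_PER_SECOND = 300
INSTRUCTIONS_PER_CALL = 20_000


def required_mips(calls_per_second, instructions_per_call, headroom=1.0):
    """Return the required processor speed in millions of instructions per second.

    `headroom` is a hypothetical multiplier for overhead and safety margin;
    it is not a figure taken from the text.
    """
    return calls_per_second * instructions_per_call * headroom / 1_000_000


raw = required_mips(CALLS_PER_SECOND, INSTRUCTIONS_PER_CALL)
# Assumption: a 2x margin reproduces the 12 MIPS target stated in the text.
with_headroom = required_mips(CALLS_PER_SECOND, INSTRUCTIONS_PER_CALL, headroom=2.0)

print(raw)            # 6.0
print(with_headroom)  # 12.0
```

The raw product of the two quoted figures is 6 MIPS, so the 12 MIPS requirement corresponds to roughly a factor of two on top of the bare call-processing load.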

This simple design philosophy, whereby each step of a complex operation is specified explicitly by a single machine instruction, and all instructions are required to complete in the same constant time, would later come to be known as RISC.

By 1975 the telephone switch project was canceled without a prototype. Estimates from simulations produced in the project's first year, however, suggested that the processor being designed could be a very promising general-purpose processor, so work continued at Thomas J. Watson Research Center building #801, on the "801" project.

1982 Research Project “Cheetah”

For two years at the Watson Research Center, the superscalar limits of the “801” design were explored, such as the feasibility of implementing the design with multiple functional units to improve performance, similar to what had been done in the IBM System/360 Model 91 and the CDC 6600 (although the Model 91 had been based on a CISC design). The goal was to determine whether a RISC machine could sustain multiple instructions per cycle, and what changes the “801” design would need in order to support multiple execution units.

To increase performance, “Cheetah” had separate branch, fixed-point, and floating-point execution units. Many changes were made to the “801” design to allow for a multiple-execution-unit design. “Cheetah” was originally planned to be manufactured using bipolar ECL technology, but by 1984 CMOS afforded an increase in the level of circuit integration while improving transistor-logic performance.

The America Project

In 1985, research on a second-generation RISC architecture started at the IBM Thomas J. Watson Research Center, producing the "AMERICA architecture"; in 1986, IBM Austin started developing the RS/6000 series, based on that architecture.

POWER and RS/6000

In February 1990, IBM introduced its first computers to incorporate the POWER Architecture ("Performance Optimization With Enhanced RISC"), called the "RISC System/6000" or RS/6000. These RS/6000 computers were divided into two classes, workstations and servers, and hence introduced as the POWERstation and POWERserver. The RS/6000 CPU had two configurations, called the "RIOS-1" and "RIOS.9" (or more commonly the "POWER1" CPU). A RIOS-1 configuration had a total of 10 discrete chips: an instruction cache chip, a fixed-point chip, a floating-point chip, 4 data cache chips, a storage control chip, an input/output chip, and a clock chip. The lower-cost RIOS.9 configuration had 8 discrete chips: an instruction cache chip, a fixed-point chip, a floating-point chip, 2 data cache chips, a storage control chip, an input/output chip, and a clock chip.

A single-chip implementation of RIOS, RSC (for "RISC Single Chip"), was developed for lower-end RS/6000s; the first machines using RSC were released in 1992.


In 1990 the Amazon project was started to create a common architecture that would host both AIX and OS/400. The AS/400 engineering team at IBM was designing a RISC instruction set to replace the CISC instruction set of the existing AS/400 computers. Their original design was a variant of the existing "IMPI" instruction set, extended to 64 bits and given some RISC instructions to speed up the more computationally intensive commercial applications that were being put on AS/400s. IBM management wanted them to use PowerPC, but they resisted, arguing that the existing 32/64-bit PowerPC instruction set would not enable a viable transition for OS/400 software and that it required extensions for the commercial applications on the AS/400. Eventually, an extension to the PowerPC instruction set, called "Amazon", was developed.

At the same time, the RS/6000 developers were broadly expanding their product line to include systems ranging from low-end workstations, to large enterprise SMP systems that competed with mainframes, to clustered RS/6000 SP2 supercomputing systems. PowerPC processors developed in the AIM alliance suited the low-end RISC workstation and small server space well, but mainframe-class and large clustered supercomputing systems required more performance and RAS features than processors designed for Apple PowerMacs could offer. Multiple processor designs were required to simultaneously meet the requirements of the cost-focused Apple PowerMac, the high-performance and RAS-oriented RS/6000 systems, and the AS/400 transition to PowerPC.

Amazon was extended to support those features as well, so that processors could be designed for use in both high-end RS/6000 and AS/400 machines.

The project to develop the first such processor was "Bellatrix" (the name of a star in the Orion constellation, also called the "Amazon Star"). The Bellatrix project was extremely ambitious in its pervasive use of self-timed and pulse-based circuits and the EDA tools required to support this design strategy, and was eventually terminated. To address the technical workstation, supercomputer, and engineering/scientific markets, IBM Austin (the home of the RS/6000s) then started developing a time-to-market single-chip version of the POWER2 (P2SC) in parallel with the development of a sophisticated 64-bit PowerPC processor with the POWER2 extensions and twin sophisticated MAF floating-point units (the POWER3/630). To address RS/6000 commercial applications and AS/400 systems, IBM Rochester (the home of the AS/400s) started developing the first of the high-end 64-bit PowerPC processors with AS/400 extensions, and IBM Endicott started developing a low-end single-chip PowerPC processor with AS/400 extensions.

The A25/30 "Muskie" high-end multi-chip AS/400 processor and A10 "Cobra" single-chip AS/400 processor came out in 1995.

In 1997, the "Apache" processor, developed at IBM Endicott, was released. It was used in RS/6000s under the name RS64, and in AS/400s as well, as were its RS64 successors.


IBM started the POWER2 processor effort as a successor to the POWER1 two years before the creation of the 1991 Apple/IBM/Motorola alliance in Austin, Texas. Despite the diversion of resources to jumpstart the Apple/IBM/Motorola effort, the POWER2 took five years from start to system shipment. With a second fixed-point unit, a second floating-point unit and other performance enhancements added to the design, the POWER2 had leadership performance when it was announced in November 1993.

New instructions were also added to the instruction set:
*Quad-word storage instructions. The quad-word load instruction moves two adjacent double-precision values into two adjacent floating-point registers.
*Hardware square root instruction.
*Floating-point to integer conversion instructions.
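The effect of the quad-word load described in the list above can be pictured with a toy model. This is a minimal sketch, not the architectural definition: the function name `load_quad`, the flat `bytes` memory, and the refusal to load into the last register are all simplifying assumptions made for illustration, and alignment and addressing modes are ignored.

```python
import struct

NUM_FPRS = 32  # number of floating-point registers in the toy model


def load_quad(memory: bytes, address: int, fprs: list, target: int) -> None:
    """Toy model of a quad-word load (hypothetical helper, not a real API).

    Reads two adjacent big-endian double-precision values starting at
    `address` and places them in fprs[target] and fprs[target + 1].
    """
    if not 0 <= target < NUM_FPRS - 1:
        # Simplification: this sketch does not model a pair starting at the last register.
        raise ValueError("target register must leave room for the adjacent register")
    first, second = struct.unpack_from(">dd", memory, address)
    fprs[target] = first
    fprs[target + 1] = second


# Two adjacent doubles (16 bytes, one quad word) in memory.
memory = struct.pack(">dd", 1.5, -2.25)
fprs = [0.0] * NUM_FPRS
load_quad(memory, 0, fprs, 4)
print(fprs[4], fprs[5])  # 1.5 -2.25
```

A single instruction of this kind moves 16 bytes, where two separate double-precision loads would otherwise be needed.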

To support the RS/6000 and RS/6000 SP2 product lines in 1996, IBM had its own design team, outside the Apple/IBM/Motorola alliance, implement a single-chip version of POWER2, the P2SC ("POWER2 Super Chip"), in IBM's most advanced and dense CMOS-6s process. P2SC combined all of the separate POWER2 instruction cache, fixed-point, floating-point, storage control, and data cache chips onto a single huge die. At the time of its introduction, P2SC was the largest processor in the industry, with the highest transistor count. It was one of the first processors, if not the first, to have an integrated memory controller on the CPU. Despite the challenges of its size, complexity, and advanced CMOS process, the first tape-out version of the processor could be shipped, and it had leadership floating-point performance at the time it was announced. P2SC was the processor used in the 1997 IBM Deep Blue chess-playing supercomputer, which beat chess grandmaster Garry Kasparov. With its twin sophisticated MAF floating-point units and very wide, low-latency memory interfaces, P2SC was primarily targeted at engineering and scientific applications. P2SC was eventually succeeded by the POWER3/630, which added 64-bit support, SMP capability, L2 cache support, and a full transition to PowerPC to P2SC's sophisticated twin MAF floating-point units.


In 1991 IBM realized that they might be able to make POWER a high-volume architecture by making and selling chips to other system manufacturers. They approached Apple with the goal of collaborating on the development of a family of single-chip microprocessors based on the POWER architecture. Soon after, Apple, as one of Motorola's largest customers of desktop-class microprocessors, asked Motorola to join the discussions because of their long relationship, their more extensive experience with manufacturing high-volume microprocessors than IBM and to serve as a second source for the microprocessors. This three-way collaboration based in Austin, Texas became known as the AIM alliance, for Apple, IBM, Motorola.

The result in 1993, after two years of development, was the PowerPC architecture, a modified version of the POWER architecture. The PowerPC architecture added single-precision floating-point instructions and general register-to-register multiply and divide instructions, and removed some POWER features such as the specialized multiply and divide instructions using the MQ register. It also added a 64-bit version of the architecture and support for SMP.

The first PowerPC chip was the PowerPC 601. See the PowerPC page for more information on PowerPC.


IBM introduced the POWER3 processor in 1998. It implemented the 64-bit PowerPC instruction set, including all of the optional instructions of the ISA (at the time), and had two floating-point units, three fixed-point units, and two load-store units. All subsequent POWER processors implemented the full 64-bit PowerPC and POWER instruction sets, so that there were no longer any IBM processors that implemented only POWER or only POWER2.


IBM introduced the POWER4 processor, the first in the GIGA-Series, in 2001. It was, again, a full 64-bit processor, implementing the full 64-bit PowerPC instruction set; it also had the AS/400 extensions, and was used in both RS/6000 and AS/400 systems, replacing both the POWER3 and the RS64 processors. A new ISA release at this point, the PowerPC 2.00 ISA, added a few extensions, such as a form of mfcr that takes a field argument.


IBM introduced the POWER5 processor in 2004. It is a dual-core processor with support for two-way simultaneous multithreading, so it implements 4 logical processors. Using ViVA ("Virtual Vector Architecture"), several POWER5 processors can act together as a single vector processor. The POWER5 added more instructions to the ISA.

The POWER5+ added even more instructions, and a new release of the ISA, version 2.02, was published.


POWER6 was announced on May 21, 2007. It adds VMX to the POWER series. It also introduces the second generation of ViVA, ViVA-2, which is the biggest change to the POWER series of processors since the transition from POWER3 to POWER4. It is a dual-core design, reaching 4.7 GHz at 65 nm, with advanced interchip communication technology. Its power consumption is nearly the same as that of the preceding POWER5, whilst offering double the performance.


Currently in development at IBM, POWER7 will be the first of the Peta-Series. It is projected for release around 2010 and has been selected by DARPA as a potential processor for its petaFLOPS supercomputer. In the early 2000s IBM submitted its proposal and received $53 million from DARPA to continue participating in the challenge; in 2006 IBM received $244 million to build a petaFLOPS computer for DARPA.

The architecture

The POWER design is descended directly from the earlier 801 CPU, widely considered to be the first true RISC processor design. The 801 was used in a number of applications inside IBM hardware, but did not become publicly known until IBM released the poorly performing IBM PC/RT in the mid-1980s.

At about the same time the PC/RT was being released, IBM started the "America Project", to design the most powerful CPU on the market. They were interested primarily in fixing two problems in the 801 design:

*the 801 required all instructions to complete in one clock cycle, which ruled out floating-point instructions
*although the decoder was pipelined as a side effect of these single-cycle operations, the design did not exploit superscalar execution

Floating point became a focus for the America Project, and IBM was able to use new algorithms developed in the early 1980s that could support 64-bit double-precision multiplies and divides in a single cycle. The FPU portion of the design was separate from the instruction decoder and integer parts, allowing the decoder to send instructions to both the FPU and ALU (integer) execution units at the same time. IBM complemented this with a complex instruction decoder which could be fetching one instruction, decoding another, and sending one to the ALU and FPU at the same time, resulting in one of the first superscalar CPU designs in use.

The system used thirty-two 32-bit integer registers and another thirty-two 64-bit floating point registers, each in their own unit. The branch unit also included a number of "private" registers for its own use, including the program counter.

The 801 was a simple design, and an overcorrection to its simplicity resulted in the POWER design being more complex than most RISC CPUs. For instance, the POWER (and PowerPC) instruction set includes over 100 op-codes of variable length, many of which are variations on others. This compares (for instance) with the ARM which has only 34 instructions.

Another notable feature of the architecture is a "virtual address" system which maps all addresses into a 52-bit space. In this way each application sees a "flat" 32-bit address space, while different programs can be given different 32-bit blocks within the larger space.
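One way to picture how a 32-bit address can be mapped into a 52-bit space is the segmented scheme sketched below. The split used here (the top 4 bits of the address select one of 16 segment registers, each holding a 24-bit segment ID that is concatenated with the remaining 28 bits) is stated as an assumption about the mechanism rather than taken from the text above, and the function and variable names are hypothetical.

```python
OFFSET_BITS = 28        # low bits of the 32-bit address kept as the offset (assumed split)
SEGMENT_ID_BITS = 24    # width of each segment ID (assumed); 24 + 28 = 52
NUM_SEGMENT_REGISTERS = 16


def effective_to_virtual(effective_address: int, segment_registers: list) -> int:
    """Map a 32-bit effective address to a 52-bit virtual address (illustrative only)."""
    if not 0 <= effective_address < 1 << 32:
        raise ValueError("effective address must fit in 32 bits")
    index = effective_address >> OFFSET_BITS                  # top 4 bits pick a register
    offset = effective_address & ((1 << OFFSET_BITS) - 1)     # low 28 bits pass through
    segment_id = segment_registers[index]
    if not 0 <= segment_id < 1 << SEGMENT_ID_BITS:
        raise ValueError("segment ID must fit in 24 bits")
    return (segment_id << OFFSET_BITS) | offset


# Two programs using the same 32-bit address land in different parts of
# the 52-bit space because their segment registers hold different IDs.
program_a = [0] * NUM_SEGMENT_REGISTERS
program_b = [0] * NUM_SEGMENT_REGISTERS
program_a[3] = 0xABCDEF
program_b[3] = 0x000123

print(hex(effective_to_virtual(0x30001234, program_a)))  # 0xabcdef0001234
print(hex(effective_to_virtual(0x30001234, program_b)))  # 0x1230001234
```

If two programs are instead given the same segment ID in one of their registers, the corresponding 256 MiB region is shared between them, which is how memory sharing fits into this picture.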


The first POWER1 CPUs consisted of three units: branch, integer and floating point. These were wired together on a large motherboard to produce a single system. POWER1 was used primarily in the RS/6000 series of workstations. The RSC was a single-chip version of POWER1 (the "SC" stands for "Single Chip"), also used in RS/6000s.

POWER2 was a product-improved POWER1 and was the longest-lived of the POWER series, released in 1993 and still used five years later. It added a second floating-point unit, 256 KiB of cache and 128-bit floating-point math.

POWER3 followed in 1998, moving to a full 64-bit implementation, while remaining completely compatible with the POWER instruction set. This had been one of the goals of the POWER project and the POWER3 was the first of the IBM processors to take advantage of it. It also added a third ALU and a second instruction decoder, for a total of eight functional units.

The POWER4 series places two complete CPU cores (otherwise similar to the POWER3) on a single chip, speeds it up, and adds high-speed connections to up to three additional pairs of POWER4 CPUs. They can be placed together on a motherboard to produce an 8-CPU SMP building block. When processing requires high throughput instead of high code complexity, one of a pair of cores can be turned off so that the remaining cores have the entire bus and L3 cache to themselves. The POWER4, even in single core form, was considered by many to be the most powerful CPU available at the time.

IBM rolled out the POWER5 processor in 2004. The 1.9 GHz version posted the highest uniprocessor SPECfp score of any shipping chip. The POWER5 powers the i5 and p5 eServers. Improvements in the POWER5 over the POWER4 include a larger L2 cache, an on-chip memory controller, simultaneous multithreading (which appears to the operating system as multiple CPUs), advanced power management, a dedicated single-tasking mode, a hypervisor (virtualization technology), and eFuse (hardware re-routing around faults). Ravi Arimilli, IBM's chief microprocessor designer, has said: "The POWER5 chip is more of a midrange design that can drive up to the high end and then down to things like blades."

IBM servers built with the POWER5 processor offer hardware virtualization in the form of logical partitioning (LPAR). With the micro-partitioning feature, up to ten logical partitions (LPARs) can be created for each CPU, and the biggest 64-way system can run 256 independent operating systems. Dynamic LPAR capability allows memory, CPU power and I/O devices to be moved dynamically between partitions. See also Linux on Power.

In 2007, POWER6 was formally announced.

Development of POWER7 is underway.

Derivative CPUs

The first PowerPC processor, the PowerPC 601, was essentially an RSC CPU with some of the more basic instructions emulated in microcode, using a bus interface based on the Motorola 88000 design. This allowed IBM to use the CPU in a number of workstation machines, changing only the motherboard. Since then the PowerPC and POWER architectures have diverged somewhat, but remain mostly compatible at the instruction level.

The radiation-hardened RAD6000 processor used in space-based applications is a derivative of the POWER / RSC CPU architecture.

The IBM RS64 family of processors is based on PowerPC (and thus POWER) and has been used in the RS/6000 and AS/400 product lines. It is optimized for commercial workloads, and does not have the floating point power expected in the POWER line. It was replaced by the POWER4.

The IBM "Gekko" processor is a modified PowerPC 750CXe, used in the Nintendo GameCube. Broadway is an updated Gekko, made for Nintendo's Wii.

The Cell processor is also derived from the POWER architecture. It features a single in-order, multithreaded superscalar core, coupled to eight independent vector processors or "Synergistic Processing Elements". The processor powers the Sony PlayStation 3 as well as digital TV systems from Toshiba and high-performance computers from IBM.

The Xbox 360, the latest generation of Microsoft's gaming console, uses an in-order, triple-core PowerPC processor, "Xenon", clocked at 3.2 GHz, with modified vector units (IBM developerWorks, Xenon processor reference).


