- AltiVec
AltiVec is a floating point and integer SIMD instruction set designed and owned by Apple, IBM and Freescale Semiconductor, formerly the Semiconductor Products Sector of Motorola (the AIM alliance), and implemented on versions of the PowerPC including Motorola's G4, IBM's G5 and POWER6 processors, and P.A. Semi's PWRficient PA6T. AltiVec is a trade name owned solely by Freescale, so the system is also referred to as Velocity Engine by Apple and VMX by IBM and P.A. Semi, although IBM has recently begun using AltiVec as well.
While AltiVec refers to an instruction set, the implementations in CPUs produced by IBM and Motorola are separate in terms of logic design. To date, no IBM core has included an AltiVec logic design licensed from Motorola, or vice versa.
AltiVec is a standard part of the new Power ISA v.2.03 specification ([http://www.power.org/resources/downloads/PowerISA_203.Public.pdf Power ISA v.2.03], Power.org). It was never formally a part of the PowerPC architecture until this specification, although it used PowerPC instruction formats and syntax and occupied the opcode space expressly allocated for such purposes.
Features and similarities
Both AltiVec and SSE feature 128-bit vector registers that can represent sixteen 8-bit signed or unsigned chars, eight 16-bit signed or unsigned shorts, four 32-bit ints or four 32-bit floating point variables. Both provide cache-control instructions intended to minimize cache pollution when working on streams of data.
They also exhibit important differences. Unlike SSE2, AltiVec supports a special RGB "pixel" data type, but it does not operate on 64-bit double precision floats, and there is no way to move data directly between scalar and vector registers. In keeping with the "load/store" model of the PowerPC's RISC design, the vector registers, like the scalar registers, can only be loaded from and stored to memory. However, AltiVec provides a much more complete set of "horizontal" operations that work across all the elements of a vector; the allowable combinations of data type and operations are much more complete. Thirty-two 128-bit vector registers are provided, compared to 8 for SSE and SSE2 (extended to 16 in x86-64), and most AltiVec instructions take three register operands, compared to only two register/register or register/memory operands on IA-32.
AltiVec is also unique in its support for a flexible vector permute instruction, in which each byte of a resulting vector value can be taken from any byte of either of two other vectors, parametrized by yet another vector. This allows for sophisticated manipulations in a single instruction.
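As a rough illustration of the permute semantics described above, the following portable C sketch models a two-source byte permute in scalar code. It is not AltiVec code and does not use the real intrinsics. The `vec128` type and the `permute_model`, `interleave_first_halves` and `permute_model_demo` names are hypothetical, and the sketch assumes that the low five bits of each control byte index the 32-byte concatenation of the two source vectors, with element 0 stored first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector: sixteen bytes, element 0 first. */
typedef struct { uint8_t b[16]; } vec128;

/* Scalar model of a two-source byte permute (an assumption-laden sketch).
   The low five bits of each control byte select one byte from the 32-byte
   concatenation of a and b: 0..15 pick from a, 16..31 pick from b. */
vec128 permute_model(vec128 a, vec128 b, vec128 ctl)
{
    vec128 r;
    for (int i = 0; i < 16; i++) {
        unsigned idx = ctl.b[i] & 0x1Fu;
        r.b[i] = (idx < 16) ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Example use: interleave the first eight bytes of a and b.
   Only the control vector changes from one rearrangement to another. */
vec128 interleave_first_halves(vec128 a, vec128 b)
{
    vec128 ctl;
    for (int i = 0; i < 8; i++) {
        ctl.b[2 * i]     = (uint8_t)i;        /* byte i of a */
        ctl.b[2 * i + 1] = (uint8_t)(16 + i); /* byte i of b */
    }
    return permute_model(a, b, ctl);
}

/* Small self-check with a = 0x00..0x0F and b = 0x10..0x1F. */
void permute_model_demo(void)
{
    vec128 a, b;
    for (int i = 0; i < 16; i++) {
        a.b[i] = (uint8_t)i;
        b.b[i] = (uint8_t)(0x10 + i);
    }
    vec128 r = interleave_first_halves(a, b);
    for (int i = 0; i < 8; i++) {
        assert(r.b[2 * i] == a.b[i]);
        assert(r.b[2 * i + 1] == b.b[i]);
    }
}
```

On hardware the loop in `permute_model` would correspond to a single permute instruction. The interleave is just one of many byte rearrangements that can be expressed by changing the control vector.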
Recent versions of the GNU Compiler Collection, IBM Visual Age Compiler and other compilers provide intrinsics to access AltiVec instructions directly from C and C++ programs. As of version 4, GCC also includes auto-vectorisation capabilities that attempt to create AltiVec-accelerated binaries without the need for the programmer to use intrinsics directly. The "vector" type keyword is introduced to permit the declaration of native vector types, e.g., "vector unsigned char foo;" declares a 128-bit vector variable named "foo" containing sixteen 8-bit unsigned chars. The full complement of arithmetic and binary operators is defined on vector types so that the normal C expression language can be used to manipulate vector variables. There are also overloaded intrinsic functions such as "vec_add" that emit the appropriate opcode based on the type of the elements within the vector, and very strong type checking is enforced. In contrast, the Intel-defined data types for IA-32 SIMD registers declare only the size of the vector register (128 or 64 bits) and, in the case of a 128-bit register, whether it contains integers or floating point values. The programmer must select the appropriate intrinsic for the data types in use, e.g., "_mm_add_epi16(x,y)" for adding two vectors containing eight 16-bit integers.
Development history
AltiVec was developed between 1996 and 1998 in a collaborative project between Apple, IBM, and Motorola. Apple was the primary customer for AltiVec until it switched to Intel-made, x86-based CPUs in 2006. It used AltiVec to accelerate multimedia applications such as QuickTime and iTunes, and key parts of Apple's Mac OS X, including the Quartz graphics compositor. Other companies such as Adobe use it to optimize their image-processing programs such as Adobe Photoshop. Motorola was the first to supply AltiVec-enabled processors, starting with their G4 line. AltiVec was also used in some embedded systems for high-performance digital signal processing.
IBM consistently left VMX out of their earlier POWER systems, which were intended for mainframe and server applications where it was not very useful. However, the last desktop CPU from IBM, the PowerPC 970 (dubbed the G5 by Apple), did include an AltiVec unit similar to that of the original PowerPC 7400. The core included a multiplier/adder unit and a full VMX unit.
AltiVec is the standard "Category.VEC" part of the Power ISA v.2.03 specification.
The Cell Broadband Engine, used in (amongst other things) the PlayStation 3, is also AltiVec enabled.
The POWER6, introduced in 2007, also includes AltiVec; the implementation is similar to the one in the 970 and Cell.
VMX128
IBM enhanced VMX for use in Xenon (Xbox 360) and called this enhancement VMX128. The enhancements comprise new routines targeted at gaming (accelerating 3D graphics and game physics) ([http://www-128.ibm.com/developerworks/power/library/pa-fpfxbox/?ca=dgr-lnxw09XBoxDesign The Microsoft Xbox 360 CPU story], IBM) and a total of 128 registers. VMX128 is not entirely compatible with VMX/AltiVec, as a number of integer operations were removed to make space for the larger register file and additional application-specific operations.
Issues
In C++, AltiVec support is mutually exclusive with use of the Standard Template Library "vector<>" class template due to the treatment of "vector" as a reserved word.
External links
* [http://www-128.ibm.com/developerworks/library/pa-unrollav1/ Introducing the PowerPC SIMD unit]
* [http://www.freescale.com/webapp/sps/site/overview.jsp?nodeId=0162468rH3bTdGmKqW5Nf2 Freescale's AltiVec page]
* [http://domino.research.ibm.com/comm/research.nsf/pages/r.arch.simd.html Using data-parallel SIMD architecture in video games and supercomputers]
* [http://developer.apple.com/hardware/ve/ Apple's Velocity Engine page]
* [http://www.simdtech.org/altivec Simdtech.org mailing list]
* [http://noisymime.org/blogimages/SIMD.pdf SIMD history and performance comparison]
Wikimedia Foundation. 2010.