Original author(s) Apple Inc.
Developer(s) Khronos Group
Stable release 1.2 / November 15, 2011
Operating system Cross-platform
Type API
License Royalty Free

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors. OpenCL includes a language (based on C99) for writing kernels (functions that execute on OpenCL devices), plus APIs that are used to define and then control the platforms. OpenCL provides parallel computing using task-based and data-based parallelism. It has been adopted by Intel, AMD, Nvidia, and ARM.

OpenCL gives any application access to the graphics processing unit (GPU) for non-graphical computing, extending the power of the GPU beyond graphics (general-purpose computing on graphics processing units). Academic researchers have investigated automatically compiling OpenCL programs into application-specific processors running on FPGAs,[1] and commercial FPGA vendors are developing tools to translate OpenCL to run on their FPGA devices.[2]

OpenCL is analogous to the open industry standards OpenGL and OpenAL, for 3D graphics and computer audio, respectively. OpenCL is managed by the non-profit technology consortium Khronos Group.



OpenCL was initially developed by Apple Inc., which holds trademark rights, and refined into an initial proposal in collaboration with technical teams at AMD, IBM, Intel, and Nvidia. Apple submitted this initial proposal to the Khronos Group. On June 16, 2008 the Khronos Compute Working Group was formed[3] with representatives from CPU, GPU, embedded-processor, and software companies. This group worked for five months to finish the technical details of the specification for OpenCL 1.0 by November 18, 2008.[4] This technical specification was reviewed by the Khronos members and approved for public release on December 8, 2008.[5]

OpenCL 1.0 was released with Mac OS X Snow Leopard. According to an Apple press release:[6]

Snow Leopard further extends support for modern hardware with Open Computing Language (OpenCL), which lets any application tap into the vast gigaflops of GPU computing power previously available only to graphics applications. OpenCL is based on the C programming language and has been proposed as an open standard.

AMD has decided to support OpenCL (and DirectX 11) instead of the now deprecated Close to Metal in its Stream framework.[7][8] RapidMind announced their adoption of OpenCL underneath their development platform, in order to support GPUs from multiple vendors with one interface.[9] On December 9, 2008, Nvidia announced its intention to add full support for the OpenCL 1.0 specification to its GPU Computing Toolkit.[10] On October 30, 2009, IBM released its first OpenCL implementation as a part of the XL compilers.[11]

OpenCL 1.1 was ratified by the Khronos Group on June 14, 2010,[12] and adds significant functionality for enhanced parallel programming flexibility, functionality and performance, including:

  • New data types including 3-component vectors and additional image formats;
  • Handling commands from multiple host threads and processing buffers across multiple devices;
  • Operations on regions of a buffer including read, write and copy of 1D, 2D or 3D rectangular regions;
  • Enhanced use of events to drive and control command execution;
  • Additional OpenCL built-in C functions such as integer clamp, shuffle and asynchronous strided copies;
  • Improved OpenGL interoperability through efficient sharing of images and buffers by linking OpenCL and OpenGL events.
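
The rectangular-region operations listed above address a sub-block of a buffer through an origin, a region size, and row and slice pitches. The following is a minimal host-side sketch in plain C, not OpenCL API code: the function names rect_offset and copy_rect are hypothetical, and the offset rule (origin[2] * slice_pitch + origin[1] * row_pitch + origin[0], in bytes) is assumed to match the one used by commands such as clEnqueueCopyBufferRect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of a region's first element inside a pitched buffer.
   origin[0] is in bytes, origin[1] in rows, origin[2] in slices.
   (Hypothetical helper; assumed to mirror the OpenCL 1.1 offset rule.) */
size_t rect_offset(const size_t origin[3], size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}

/* Host-side emulation of a rectangular copy between two pitched buffers.
   region[0] is the row width in bytes, region[1] the number of rows,
   region[2] the number of slices. */
void copy_rect(const unsigned char *src, unsigned char *dst,
               const size_t src_origin[3], const size_t dst_origin[3],
               const size_t region[3],
               size_t src_row_pitch, size_t src_slice_pitch,
               size_t dst_row_pitch, size_t dst_slice_pitch)
{
    /* A row of the region must fit inside one row of each buffer. */
    assert(src_row_pitch >= region[0] && dst_row_pitch >= region[0]);

    size_t src_base = rect_offset(src_origin, src_row_pitch, src_slice_pitch);
    size_t dst_base = rect_offset(dst_origin, dst_row_pitch, dst_slice_pitch);

    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            const unsigned char *s =
                src + src_base + z * src_slice_pitch + y * src_row_pitch;
            unsigned char *d =
                dst + dst_base + z * dst_slice_pitch + y * dst_row_pitch;
            memcpy(d, s, region[0]);   /* copy one row of the region */
        }
    }
}
```

For example, copying a 2×2 region starting at column 1, row 1 of a 4×4 byte matrix (row pitch 4) into a tightly packed 2×2 destination (row pitch 2) moves the four interior elements and leaves the border untouched.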

On November 15, 2011, the Khronos Group announced the OpenCL 1.2 specification,[13] which added significant functionality over the previous versions in terms of performance and features for parallel programming. The most notable features include:

  • Device partitioning: the ability to partition a device into sub-devices so that work assignments can be allocated to individual compute units. This is useful for reserving areas of the device in order to reduce latency for time-critical tasks.
  • Separate compilation and linking of objects: the functionality to compile OpenCL into external libraries for inclusion into other programs.
  • Enhanced image support: 1.2 adds support for 1D images and 1D/2D image arrays. Furthermore, the OpenGL sharing extensions now allow for OpenGL 1D textures and 1D/2D texture arrays to be used to create OpenCL images.
  • Built-in kernels: custom devices that contain specific unique functionality are now integrated more closely into the OpenCL framework. Kernels can be called to use specialised or non-programmable aspects of underlying hardware. Examples include video encoding/decoding and digital signal processors.
  • DirectX functionality: DX9 media surface sharing allows for efficient sharing between OpenCL and DX9 or DXVA media surfaces. Equally, for DX11 seamless sharing between OpenCL and DX11 surfaces is enabled.
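
As a rough illustration of device partitioning, the arithmetic of an equal split can be sketched in plain C. This is not OpenCL API code: the function names are hypothetical, and the behaviour shown (as many equal sub-devices as fit, with any remaining compute units left unused) is assumed to correspond to an equal partition requested through clCreateSubDevices with CL_DEVICE_PARTITION_EQUALLY.

```c
#include <assert.h>

/* Number of sub-devices obtained when a device with `total_units` compute
   units is split into equal parts of `units_per_sub` compute units each.
   (Hypothetical helper for illustration.) */
unsigned equal_partition_count(unsigned total_units, unsigned units_per_sub)
{
    assert(units_per_sub > 0);
    return total_units / units_per_sub;
}

/* Compute units left over, and therefore unused, by such a split. */
unsigned equal_partition_leftover(unsigned total_units, unsigned units_per_sub)
{
    assert(units_per_sub > 0);
    return total_units % units_per_sub;
}
```

Under these assumptions, a device with 14 compute units split into parts of 4 yields 3 sub-devices and leaves 2 compute units idle, which is why the partition size is usually chosen to divide the unit count evenly.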

The OpenCL specification is under development at Khronos, which any interested company may join.


  • On December 10, 2008, AMD and Nvidia held the first public OpenCL demonstration, a 75-minute presentation at Siggraph Asia 2008. AMD showed a CPU-accelerated OpenCL demo explaining the scalability of OpenCL on one or more cores while Nvidia showed a GPU-accelerated demo.[14][15]
  • On March 16, 2009, at the 4th Multicore Expo, Imagination Technologies announced the PowerVR SGX543MP, the first GPU of this company to feature OpenCL support.[16]
  • On March 26, 2009, at GDC 2009, AMD and Havok demonstrated the first working implementation for OpenCL accelerating Havok Cloth on AMD Radeon HD 4000 series GPU.[17]
  • On April 20, 2009, Nvidia announced the release of its OpenCL driver and SDK to developers participating in its OpenCL Early Access Program.[18]
  • On August 5, 2009, AMD unveiled the first development tools for its OpenCL platform as part of its ATI Stream SDK v2.0 Beta Program.[19]
  • On August 28, 2009, Apple released Mac OS X Snow Leopard, which contains a full implementation of OpenCL.[20] OpenCL in Snow Leopard is supported on the NVIDIA GeForce 320M, GeForce GT 330M, GeForce 9400M, GeForce 9600M GT, GeForce 8600M GT, GeForce GT 120, GeForce GT 130, GeForce GTX 285, GeForce 8800 GT, GeForce 8800 GS, Quadro FX 4800, Quadro FX 5600, ATI Radeon HD 4670, ATI Radeon HD 4850, Radeon HD 4870, ATI Radeon HD 5670, ATI Radeon HD 5750, ATI Radeon HD 5770 and ATI Radeon HD 5870.[21]
  • On September 28, 2009, NVIDIA released its own OpenCL drivers and SDK implementation.
  • On October 13, 2009, AMD released the fourth beta of the ATI Stream SDK 2.0, which provides a complete OpenCL implementation on both R700/R800 GPUs and SSE3 capable CPUs. The SDK is available for both Linux and Windows.[22]
  • On November 26, 2009, NVIDIA released drivers for OpenCL 1.0 (rev 48).
The Apple,[23] Nvidia,[24] RapidMind[25] and Gallium3D[26] implementations of OpenCL are all based on the LLVM compiler technology and use the Clang compiler as their frontend.
  • On October 27, 2009, S3 released their first product supporting native OpenCL 1.0, the Chrome 5400E embedded graphics processor.[27]
  • On December 10, 2009, VIA released their first product supporting OpenCL 1.0, the ChromotionHD 2.0 video processor included in the VN1000 chipset.[28]
  • On December 21, 2009, AMD released the production version of the ATI Stream SDK 2.0,[29] which provides OpenCL 1.0 support for R800 GPUs and beta support for R700 GPUs.
  • On June 1, 2010, ZiiLABS released details of their first OpenCL implementation for the ZMS processor for handheld, embedded and digital home products.[30]
  • On June 30, 2010, IBM released a fully conformant version of OpenCL 1.0.[31]
  • On September 13, 2010, Intel released details of their first OpenCL implementation for the Sandy Bridge chip architecture. Sandy Bridge will integrate Intel's newest graphics chip technology directly onto the central processing unit.[32]
  • On November 15, 2010, Wolfram Research released Mathematica 8 with OpenCLLink package.
  • On March 3, 2011, Khronos Group announced the formation of the WebCL working group to explore defining a JavaScript binding to OpenCL. This creates the potential to harness GPU and multi-core CPU parallel processing from a Web browser.[33][34]
  • On March 31, 2011, IBM released a fully conformant version of OpenCL 1.1.[31][35]
  • On April 25, 2011, IBM released OpenCL Common Runtime v0.1 for Linux on x86 Architecture.[36]
  • On May 4, 2011, Nokia Research released an open source WebCL extension for the Firefox web browser, providing a JavaScript binding to OpenCL.[37]
  • On July 1, 2011, Samsung Electronics released an open source prototype implementation of WebCL for WebKit, providing a JavaScript binding to OpenCL.[38]
  • On August 8, 2011, AMD released the OpenCL-driven AMD Accelerated Parallel Processing (APP) Software Development Kit (SDK) v2.5, replacing the ATI Stream SDK as technology and concept.[39]

OpenCL language

The programming language used to write computation kernels is based on C99 with some limitations and additions. It omits function pointers, recursion, bit fields, variable-length arrays, and the standard C99 header files.[40] The language is extended for parallelism with vector types and operations, synchronization, and functions for working with work-items and work-groups.[41] It has memory region qualifiers: __global, __local, __constant, and __private. Many built-in functions are also added.
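
To make the kernel model concrete, the sketch below writes a small kernel in the OpenCL C style and runs it through a serial emulation shim so that it compiles as ordinary C. The shim is an assumption made purely for illustration: the empty __kernel and __global macros, the emu_global_id variable, the stand-in get_global_id and the run_saxpy driver are not part of OpenCL, and a real program would pass the kernel source to the OpenCL runtime instead.

```c
#include <assert.h>
#include <stddef.h>

/* --- Emulation shim (hypothetical, for illustration only) --- */
#define __kernel   /* in OpenCL C: marks a function callable from the host */
#define __global   /* in OpenCL C: pointer into device global memory */

static size_t emu_global_id;   /* id of the work-item currently "running" */

/* Stand-in for the OpenCL C built-in of the same name (1-D ranges only). */
static size_t get_global_id(unsigned dim)
{
    assert(dim == 0);
    return emu_global_id;
}

/* --- Kernel body, written as it would appear in OpenCL C --- */
/* Each work-item updates exactly one element: y[i] = a * x[i] + y[i]. */
__kernel void saxpy(float a, __global const float *x, __global float *y)
{
    size_t i = get_global_id(0);
    y[i] = a * x[i] + y[i];
}

/* Serial stand-in for launching the kernel over a 1-D range:
   a real device would run the work-items in parallel. */
void run_saxpy(size_t global_size, float a, const float *x, float *y)
{
    for (emu_global_id = 0; emu_global_id < global_size; ++emu_global_id)
        saxpy(a, x, y);
}
```

The kernel contains no loop over the data: the data-parallel loop is implied by the range of work-items, which is the main structural difference from the equivalent sequential C function.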


This example loads a fast Fourier transform (FFT) implementation and executes it:[42]

  // create a compute context with GPU device
  context = clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU, NULL, NULL, NULL);
  // get the list of GPU devices associated with the context
  clGetContextInfo(context, CL_CONTEXT_DEVICES, 0, NULL, &cb);
  devices = malloc(cb);
  clGetContextInfo(context, CL_CONTEXT_DEVICES, cb, devices, NULL);
  // create a command queue on the first device (a queue needs a valid device)
  queue = clCreateCommandQueue(context, devices[0], 0, NULL);
  // allocate the buffer memory objects
  memobjs[0] = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float)*2*num_entries, srcA, NULL);
  memobjs[1] = clCreateBuffer(context, CL_MEM_READ_WRITE, sizeof(float)*2*num_entries, NULL, NULL);
  // create the compute program
  program = clCreateProgramWithSource(context, 1, &fft1D_1024_kernel_src, NULL, NULL);
  // build the compute program executable
  clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
  // create the compute kernel
  kernel = clCreateKernel(program, "fft1D_1024", NULL);
  // choose the work-item dimensions first: the local size is needed to size the local buffers
  global_work_size[0] = num_entries;
  local_work_size[0] = 64;
  // set the args values (a NULL value with a size reserves local memory)
  clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobjs[0]);
  clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&memobjs[1]);
  clSetKernelArg(kernel, 2, sizeof(float)*(local_work_size[0]+1)*16, NULL);
  clSetKernelArg(kernel, 3, sizeof(float)*(local_work_size[0]+1)*16, NULL);
  // execute the kernel over the 1-D range
  clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, NULL);
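
The work sizes in the host code determine both how many work-groups are launched and how much local memory each of the two scratch buffers requests. The plain C sketch below reproduces that arithmetic; the helper names are hypothetical, and it assumes that the global size is a whole multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in a 1-D range. The global size is assumed to be
   a whole multiple of the local size. (Hypothetical helper.) */
size_t num_work_groups(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Bytes requested for each __local scratch buffer in the host code:
   sizeof(float) * (local_size + 1) * 16. */
size_t local_scratch_bytes(size_t local_size)
{
    return sizeof(float) * (local_size + 1) * 16;
}
```

With a local size of 64, each scratch buffer holds (64 + 1) * 16 = 1040 floats, and a global size of 1024 would be divided into 16 work-groups.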

The actual calculation: (Based on Fitting FFT onto the G80 Architecture)[43]

  // This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
  // calls to a radix 16 function, another radix 16 function and then a radix 4 function
  __kernel void fft1D_1024 (__global float2 *in, __global float2 *out,
                          __local float *sMemx, __local float *sMemy) {
    int tid = get_local_id(0);
    int blockIdx = get_group_id(0) * 1024 + tid;
    float2 data[16];
    // starting index of data to/from global memory
    in = in + blockIdx;  out = out + blockIdx;
    globalLoads(data, in, 64); // coalesced global reads
    fftRadix16Pass(data);      // in-place radix-16 pass
    twiddleFactorMul(data, tid, 1024, 0);
    // local shuffle using local memory
    localShuffle(data, sMemx, sMemy, tid, (((tid & 15) * 65) + (tid >> 4)));
    fftRadix16Pass(data);               // in-place radix-16 pass
    twiddleFactorMul(data, tid, 64, 4); // twiddle factor multiplication
    localShuffle(data, sMemx, sMemy, tid, (((tid >> 4) * 64) + (tid & 15)));
    // four radix-4 function calls
    fftRadix4Pass(data);      // radix-4 function number 1
    fftRadix4Pass(data + 4);  // radix-4 function number 2
    fftRadix4Pass(data + 8);  // radix-4 function number 3
    fftRadix4Pass(data + 12); // radix-4 function number 4
    // coalesced global writes
    globalStores(data, out, 64);
  }
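
The kernel's helper functions are not shown, so the following plain C sketch rests on an assumption: that globalLoads(data, in, 64) has each work-item read its 16 values at a stride of 64 elements, starting from the per-work-item base computed in the kernel. Under that assumption the 64 work-items of a group touch each of the group's 1024 elements exactly once, and neighbouring work-items read neighbouring addresses, which is what makes the reads coalesced. The names element_index and covers_block_once are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { GROUP_POINTS = 1024, LOCAL_SIZE = 64, PER_ITEM = 16 };

/* Global element read by work-item `tid` of work-group `group` as its
   i-th value, assuming a stride of LOCAL_SIZE between successive values. */
size_t element_index(size_t group, size_t tid, size_t i)
{
    return group * GROUP_POINTS + tid + i * LOCAL_SIZE;
}

/* Returns 1 if the work-items of `group` touch every element of the
   group's 1024-point block exactly once, 0 otherwise. */
int covers_block_once(size_t group)
{
    unsigned char hits[GROUP_POINTS] = {0};

    for (size_t tid = 0; tid < LOCAL_SIZE; ++tid) {
        for (size_t i = 0; i < PER_ITEM; ++i) {
            size_t idx = element_index(group, tid, i) - group * GROUP_POINTS;
            if (idx >= GROUP_POINTS || hits[idx]++)
                return 0;   /* out of range or visited twice */
        }
    }
    /* 64 * 16 = 1024 distinct in-range hits, so nothing is missed. */
    return 1;
}
```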

A full, open source implementation of an OpenCL FFT can be found on Apple's website.[44]

OpenCL conformant products

The Khronos Group maintains an extended list of OpenCL conformant products; see OpenCL Conformant Products.

Synopsis of OpenCL conformant products[45]

  • AMD APP SDK (supports OpenCL CPU and accelerated processing unit devices): x86 CPUs with SSE2 or higher, 64-bit and 32-bit;[46] Linux 2.6 PC, Windows Vista/7 PC. Devices: AMD Fusion E-350, E-240, C-50, C-30 with HD 6310/HD 6250; AMD Radeon/Mobility HD 6800, HD 5x00 series GPU, iGPU HD 6310/HD 6250; ATI FirePro Vx800 series GPU.
  • Intel OpenCL SDK 1.1[47] (supports only the OpenCL Intel Core based CPU device): Intel CPUs with SSE 4.1, SSE 4.2 or AVX support;[48][49] Microsoft Windows, Linux. Devices: Intel Core i7, i5, i3; 2nd Generation Intel Core i7/5/3; Intel Core 2 Solo, Duo, Quad, Extreme; Intel Xeon 7x00, 5x00, 3x00 (Core based).
  • IBM Servers with OpenCL Development Kit for Linux on Power running on Power VSX:[50][51] IBM Power 755 (PERCS), 750; IBM BladeCenter PS70x Express; IBM BladeCenter JS2x, JS43; IBM BladeCenter QS22.
  • IBM OpenCL Common Runtime (OCR): x86 CPUs with SSE2 or higher, 64-bit and 32-bit;[53] Linux 2.6 PC. Devices: AMD Fusion, NVIDIA ION and Intel Core i7, i5, i3; 2nd Generation Intel Core i7/5/3; AMD Radeon, NVIDIA GeForce and Intel Core 2 Solo, Duo, Quad, Extreme; ATI FirePro, NVIDIA Quadro and Intel Xeon 7x00, 5x00, 3x00 (Core based).

References


  1. ^ Jääskeläinen, Pekka O.; de La Lama, Carlos S.; Huerta, Pablo; Takala, Jarmo H. (July 2010). "OpenCL-based design methodology for application-specific processors". 2010 International Conference on Embedded Computer Systems (SAMOS) (IEEE): 223–230. doi:10.1109/ICSAMOS.2010.5642061. Retrieved February 17, 2011.
  2. ^ Jobs at Altera: Senior Design Engineer
  3. ^ "Khronos Launches Heterogeneous Computing Initiative" (Press release). Khronos Group. 2008-06-16. Retrieved 2008-06-18. 
  4. ^ "OpenCL gets touted in Texas". MacWorld. 2008-11-20. Retrieved 2009-06-12. 
  5. ^ "The Khronos Group Releases OpenCL 1.0 Specification" (Press release). Khronos Group. 2008-12-08. Retrieved 2009-06-12. 
  6. ^ "Apple Previews Mac OS X Snow Leopard to Developers" (Press release). Apple Inc.. 2008-06-09. Retrieved 2008-06-09. 
  7. ^ "AMD Drives Adoption of Industry Standards in GPGPU Software Development" (Press release). AMD. 2008-08-06. Retrieved 2008-08-14.
  8. ^ "AMD Backs OpenCL, Microsoft DirectX 11". eWeek. 2008-08-06. Retrieved 2008-08-14. 
  9. ^ "HPCWire: RapidMind Embraces Open Source and Standards Projects". HPCWire. 2008-11-10. Retrieved 2008-11-11. 
  10. ^ "NVIDIA Adds OpenCL To Its Industry Leading GPU Computing Toolkit" (Press release). Nvidia. 2008-12-09. Retrieved 2008-12-10. 
  11. ^ "OpenCL Development Kit for Linux on Power". alphaWorks. 2009-10-30. Retrieved 2009-10-30. 
  12. ^ Khronos Drives Momentum of Parallel Computing Standard with Release of OpenCL 1.1 Specification
  13. ^ Khronos Releases OpenCL 1.2 Specification
  14. ^ "OpenCL Demo, AMD CPU". 2008-12-10. Retrieved 2009-03-28. 
  15. ^ "OpenCL Demo, NVIDIA GPU". 2008-12-10. Retrieved 2009-03-28. 
  16. ^ "Imagination Technologies launches advanced, highly-efficient POWERVR™ SGX543MP multi-processor graphics IP family". Imagination Technologies. 2009-03-19. Retrieved 2011-01-30. 
  17. ^ "AMD and Havok demo OpenCL accelerated physics". PC Perspective. 2009-03-26. Retrieved 2009-03-28. 
  18. ^ "NVIDIA Releases OpenCL Driver To Developers". NVIDIA. 2009-04-20. Retrieved 2009-04-27. 
  19. ^ "AMD does reverse GPGPU, announces OpenCL SDK for x86". Ars Technica. 2009-08-05. Retrieved 2009-08-06. 
  20. ^ Dan Moren; Jason Snell (2009-06-08). "Live Update: WWDC 2009 Keynote". MacWorld. Retrieved 2009-06-12. 
  21. ^ "Mac OS X Snow Leopard – Technical specifications and system requirements". Apple Inc. 2011-03-23. Retrieved 2011-03-23. 
  22. ^ "ATI Stream Software Development Kit (SDK) v2.0 Beta Program". Retrieved 2009-10-14. [dead link]
  23. ^ "Apple entry on LLVM Users page". Retrieved 2009-08-29. 
  24. ^ "Nvidia entry on LLVM Users page". Retrieved 2009-08-06. 
  25. ^ "Rapidmind entry on LLVM Users page". Retrieved 2009-10-01. 
  26. ^ "Zack Rusin's blog post about the Gallium3D OpenCL implementation". Retrieved 2009-10-01. 
  27. ^ "S3 Graphics launched the Chrome 5400E embedded graphics processor". Retrieved 2009-10-27. 
  28. ^ "VIA Brings Enhanced VN1000 Graphics Processor". Retrieved 2009-12-10.
  29. ^ "ATI Stream SDK v2.0 with OpenCL™ 1.0 Support". Retrieved 2009-10-23. 
  30. ^
  31. ^ a b "Khronos Group Conformant Products". 
  32. ^ "Intel discloses new Sandy Bridge technical details". Retrieved 2010-09-13. 
  33. ^ WebCL related stories
  34. ^ Khronos Releases Final WebGL 1.0 Specification
  35. ^ "OpenCL Development Kit for Linux on Power". 
  36. ^ "About the OpenCL Common Runtime for Linux on x86 Architecture". 
  37. ^ Nokia Research releases WebCL prototype
  38. ^ Samsung's WebCL Prototype for WebKit
  39. ^ [1]
  40. ^ AMD. Introduction to OpenCL Programming 201005, page 89-90
  41. ^ AMD. Introduction to OpenCL Programming 201005, page 89-90
  42. ^ "OpenCL". SIGGRAPH2008. 2008-08-14. Retrieved 2008-08-14. 
  43. ^ "Fitting FFT onto G80 Architecture" (PDF). Vasily Volkov and Brian Kazian, UC Berkeley CS258 project report. May 2008. Retrieved 2008-11-14. 
  44. ^ "OpenCL on FFT". Apple. 16 Nov 2009. Retrieved 2009-12-07.
  45. ^ "Conformant Products". Retrieved 11 August 2011. 
  46. ^ "OpenCL™ and the AMD APP SDK". AMD Developer Central. Retrieved 11 August 2011. 
  47. ^ "About Intel® OpenCL SDK 1.1". Retrieved 11 August 2011. 
  48. ^ "Product Support". Retrieved 11 August 2011. 
  49. ^ "Intel® OpenCL SDK - Release Notes". Retrieved 11 August 2011. 
  50. ^ "Announcing OpenCL Development Kit for Linux on Power v0.3". Retrieved 11 August 2011. 
  51. ^ "IBM releases OpenCL Development Kit for Linux on Power v0.3 - OpenCL 1.1 conformant release available". OpenCL Lounge. Retrieved 11 August 2011. 
  52. ^ "IBM releases OpenCL Common Runtime for Linux on x86 Architecture". Retrieved 10 September 2011. 
  53. ^ "OpenCL™ and the AMD APP SDK". AMD Developer Central. Retrieved 10 September 2011. 
  54. ^ "Nvidia Releases OpenCL Driver". Retrieved 11 August 2011.

Wikimedia Foundation. 2010.
