Automatic parallelization

Automatic parallelization

Automatic parallelization, also auto parallelization, autoparallelization, parallelization, or //ization (shorthand), the last two of which imply automation when used in context, refers to converting sequential code into multi-threaded or vectorized (or even both) code in order to utilize multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. The goal of automatic parallelization is to relieve programmers from the tedious and error-prone manual parallelization process. Though the quality of automatic parallelization has improved in the past several decades, fully automatic parallelization of sequential programs by compilers remains a grand challenge due to its need for complex program analysis and the unknown factors (such as input data range) during compilation.

The programming control structures on which autoparallelization places the most focus are loops, because, in general, most of the execution time of a program takes place inside some form of loop. A parallelizing compiler tries to split up a loop so that its iterations can be executed on separate processors concurrently.

Compiler parallelization analysis

The compiler usually conducts two passes of analysis before actual parallelization in order to determine the following:
*Is it safe to parallelize the loop? Answering this question needs accurate dependence analysis and alias analysis
*Is it worthwhile to parallelize it? This answer requires a reliable estimation (modeling) of the program workload and the capacity of the parallel system.

The first pass of the compiler performs a data dependence analysis of the loop to determine whether each iteration of the loop can be executed independently of the others. Data dependence can sometimes be dealt with, but it may incur additional overhead in the form of message passing, synchronization of shared memory, or some other method of processor communication.

The second pass attempts to justify the parallelization effort by comparing the theoretical execution time of the code after parallelization to the code's sequential execution time. Somewhat counterintuitively, code does not always benefit from parallel execution. The extra overhead that can be associated with using multiple processors can eat into the potential speedup of parallelized code.

Example

The Fortran code below can be auto-parallelized by a compiler because each iteration is independent of the others, and the final result of array z will be correct regardless of the execution order of the other iterations.

do i=1, n z(i) = x(i) + y(i) enddo

On the other hand, the following code cannot be auto-parallelized, because the value of z(i) depends on the result of the previous iteration, z(i-1).

do i=2, n z(i) = z(i-1)*2 enddo

This does not mean that the code cannot be parallelized. Indeed, it is equivalent to

do i=2, n z(i) = z(1)*2**(i-1) enddo

However, current parallelizing compilers are not usually capable of bringing out these parallelisms automatically, and it is questionable whether this code would benefit from parallelization in the first place.

Difficulties

Automatic parallelization by compilers or tools is very difficult due to the following reasons:
* dependence analysis is hard for code using indirect addressing, pointers, recursion, and indirect function calls;
* loops have an unknown number of iterations;
* and accesses to global resources are difficult to coordinate in terms of memory allocation, I/O, and shared variables.

Workaround

Due to the inherent difficulties in full automatic parallelization, several easier approaches exist to get a parallel program in higher quality. They are:
* Allow programmers to add "hints" to their programs to guide compiler parallelization, such as HPF for distributed memory systems and OpenMP for shared memory systems.
* Build an interactive system between programmers and parallelizing tools/compilers. Notable examples are SUIF Explorer (The Stanford University Intermediate Format compiler, http://suif.stanford.edu/), the Polaris compiler, and ParaWise (formally CAPTools).
* Hardware-supported speculative multithreading.

Historical parallelizing compilers

Most research compilers for automatic parallelization consider Fortran programs,Fact|date=July 2007 because Fortran makes stronger guarantees about aliasing than languages such as C. Typical examples are:
* Vienna Fortran compiler
* Paradigm compiler
* Polaris compiler
* SUIF compiler

ee also

*Loop nest optimization
*Polytope model


Wikimedia Foundation. 2010.

Игры ⚽ Нужна курсовая?

Look at other dictionaries:

  • Parallel computing — Programming paradigms Agent oriented Automata based Component based Flow based Pipelined Concatenative Concurrent computing …   Wikipedia

  • Sieve C++ Parallel Programming System — The Sieve C++ Parallel Programming System is a C++ compiler and parallel runtime designed and released by Codeplay that aims to simplify the parallelization of code so that it may run efficiently on multi processor or multi core systems. It is an …   Wikipedia

  • Compiler optimization — is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the… …   Wikipedia

  • Loop optimization — In compiler theory, loop optimization plays an important role in improving cache performance, making effective use of parallel processing capabilities, and reducing overheads associated with executing loops. Most execution time of a scientific… …   Wikipedia

  • Parallel programming model — A parallel programming model is a set of software technologies to express parallel algorithms and match applications with the underlying parallel systems. It encloses the areas of applications, programming languages, compilers, libraries,… …   Wikipedia

  • OpenFOAM — Developer(s) OpenCFD Ltd. Initial release 10 December 2004 Stable release 2.0.1 / 4 August 2011 Operating system Unix/Linux …   Wikipedia

  • Normalized loop — In computer science, a normalized loop (sometimes called well behaved loop), is a loop which the loop variable starts at 0 (or any constant) and get incremented by one at every iteration until the exit condition is met. Normalized loops are very… …   Wikipedia

  • Compiler — This article is about the computing term. For the anime, see Compiler (anime). A diagram of the operation of a typical multi language, multi target compiler A compiler is a computer program (or set of programs) that transforms source code written …   Wikipedia

  • GNU Compiler Collection — Cc1 redirects here. For other uses of CC1 or CC 1, see CC1 (disambiguation). GNU Compiler Collection Developer(s) GNU Project Initial release May 23, 1987 ( …   Wikipedia

  • Dominator (graph theory) — For Dominating set problem, see Dominating set. In computer science, in control flow graphs, a node d dominates a node n if every path from the start node to n must go through d. Notationally, this is written as d dom n (or sometimes d n). By… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”