- Systolic array
In computer architecture, a **systolic array** is a pipe network arrangement of processing units called cells. It is a specialized form of parallel computing, where cells (i.e. processors) compute data and store it independently of each other.

**Description**

A systolic array is composed of matrix-like rows of data processing units called cells. Data processing units (DPUs) are similar to central processing units (CPUs), except that they lack a program counter, since operation is transport-triggered, i.e., triggered by the arrival of a data object (see transport triggered architectures). Each cell shares the information with its neighbours immediately after processing.

The systolic array is often rectangular, with data flowing across the array between neighbour DPUs, often with different data flowing in different directions. The data streams entering and leaving the ports of the array are generated by auto-sequencing memory units (ASMs). Each ASM includes a data counter. In embedded systems a data stream may also be input from and/or output to an external source.

An example of a systolic algorithm might be designed for matrix multiplication. One matrix is fed in a row at a time from the top of the array and is passed down the array; the other matrix is fed in a column at a time from the left-hand side of the array and passes from left to right. Dummy values are then passed in until each processor has seen one whole row and one whole column. At this point, the result of the multiplication is stored in the array and can be output a row or a column at a time, flowing down or across the array.

Systolic arrays are arrays of DPUs which are connected to a small number of nearest-neighbour DPUs in a mesh-like topology. DPUs perform a sequence of operations on data that flows between them. Because the traditional systolic array synthesis methods are based on algebraic algorithms, only uniform arrays with only linear pipes can be obtained, so that the architecture is the same in all DPUs. The consequence is that only applications with regular data dependencies can be implemented on classical systolic arrays.

Like SIMD machines, clocked systolic arrays compute in "lock-step", with each processor undertaking alternate compute and communicate phases. Systolic arrays with asynchronous handshake between DPUs, by contrast, are called "wavefront arrays". One well-known systolic array is CMU's iWarp processor, which was manufactured by Intel. An iWarp system has a linear array of processors connected by data buses going in both directions.

**History**

The systolic array paradigm, data-stream-driven by data counters, is the counterpart of the von Neumann paradigm, instruction-stream-driven by a program counter (see von Neumann architecture). Because a systolic array usually sends and receives multiple data streams, and multiple data counters are needed to generate these data streams, it supports data parallelism. The name derives from analogy with the regular pumping of blood by the heart.

H. T. Kung and Charles E. Leiserson published the first paper describing systolic arrays in 1978; however, the first machine known to have used a similar technique was the Colossus Mark II in 1944.

**Applications**

An application example is polynomial evaluation. Horner's rule for evaluating a polynomial is:

$$y = (\cdots(((a_n x + a_{n-1})x + a_{n-2})x + a_{n-3})x + \cdots + a_1)x + a_0$$

This can be computed by a linear systolic array in which the processors are arranged in pairs: one multiplies its input by x and passes the result to the right, and the next adds the coefficient a_j and passes the result to the right.
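
The pairing described above can be simulated in software. The following is a minimal Python sketch, not a description of any particular hardware: the function name `horner_systolic`, the token format `(x, acc)`, and the one-latch-per-cell timing model are all assumptions made for illustration. Each token enters the array carrying its own `x` and the leading coefficient, then moves one cell to the right per clock tick, so several evaluation points are in flight at once.

```python
def horner_systolic(coeffs, xs):
    """Evaluate a polynomial at each x in xs on a simulated linear array.

    coeffs is [a_n, ..., a_1, a_0] (highest degree first).
    Hypothetical model: one multiplier cell and one adder cell per
    remaining coefficient, each cell with a single input latch.
    """
    a_n, rest = coeffs[0], coeffs[1:]

    # Build the array: for every coefficient after a_n, a multiplier
    # cell followed by an adder cell holding that coefficient.
    cells = []
    for a_j in rest:
        cells.append(lambda x, acc: (x, acc * x))          # multiply by x
        cells.append(lambda x, acc, a=a_j: (x, acc + a))   # add a_j
    if not cells:                      # degree-0 polynomial: nothing to pipe
        return [a_n for _ in xs]

    inputs = list(xs)
    latches = [None] * len(cells)      # latches[i]: token waiting at cell i
    results = []
    tick = 0
    while len(results) < len(inputs):
        # Compute phase: every cell fires on the token in its latch.
        outputs = [cell(*tok) if tok is not None else None
                   for cell, tok in zip(cells, latches)]
        # The rightmost cell's output leaves the array.
        if outputs[-1] is not None:
            results.append(outputs[-1][1])
        # Communicate phase: each output moves to the right neighbour,
        # and the next x (if any) enters on the left together with a_n.
        entering = (inputs[tick], a_n) if tick < len(inputs) else None
        latches = [entering] + outputs[:-1]
        tick += 1
    return results


if __name__ == "__main__":
    print(horner_systolic([2, 3, 4], [1, 2]))
```

With coefficients `[2, 3, 4]` (that is, 2x² + 3x + 4) and inputs `[1, 2]`, the sketch yields `[9, 18]`. The first result appears once the pipeline has filled, and later results follow one per tick.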

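Matrix multiplication, described earlier as a typical systolic algorithm, can be simulated in the same spirit. The sketch below is one possible model under stated assumptions, not a definitive implementation: the name `systolic_matmul` is hypothetical, zeros stand in for the dummy values, and the inputs are staggered by one tick per row and per column (an assumption of this sketch) so that matching operands meet in the right cell. One matrix streams in from the left and moves right, the other streams in from the top and moves down, and each cell accumulates its own entry of the product.

```python
def systolic_matmul(A, B):
    """Multiply A (n x k) by B (k x m) on a simulated n x m cell grid.

    Assumed model: cell (i, j) keeps a running sum; values of A move one
    cell to the right per tick, values of B move one cell down per tick.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]      # result stays in the array
    a_reg = [[0] * m for _ in range(n)]    # operand travelling rightwards
    b_reg = [[0] * m for _ in range(n)]    # operand travelling downwards

    for t in range(n + m + k - 2):
        # Values entering at the left and top edges on this tick.
        # Row i of A and column j of B are delayed by i and j ticks;
        # outside the valid window a dummy zero is fed in.
        a_in = [A[i][t - i] if 0 <= t - i < k else 0 for i in range(n)]
        b_in = [B[t - j][j] if 0 <= t - j < k else 0 for j in range(m)]

        # Communicate phase: shift A values right and B values down.
        a_reg = [[a_in[i]] + a_reg[i][:-1] for i in range(n)]
        b_reg = [b_in] + b_reg[:-1]

        # Compute phase: every cell multiplies and accumulates in lock-step.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

For the 2 x 2 inputs shown, the accumulators end up holding `[[19, 22], [43, 50]]`. The loop runs for n + m + k - 2 ticks, which is how long it takes the last operands to reach the bottom-right cell in this model.
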
**Advantages and Disadvantages**

Pros

* Faster

* Scalable

Cons

* Expensive

* Highly specialized for particular applications

* Difficult to build

**Super Systolic Array**

The **super systolic array** is a generalization of the systolic array. Because the classical synthesis methods (algebraic, i.e. projection-based synthesis) yield only uniform DPU arrays permitting only linear pipes, systolic arrays could be used only to implement applications with regular data dependencies. By using simulated annealing instead, Rainer Kress introduced the generalized systolic array: the **super systolic array**. Its application is not restricted to applications with regular data dependencies.

**KressArray**

The **KressArray** is the reconfigurable version of the super systolic array. More information about the background may be obtained from the articles about Systolic array, Reconfigurable Computing, Configware Compiler, super systolic array and Configware/Software Co-Compiler.

Because of the wide applicability of the super systolic array, its reconfigurability makes sense: the KressArray was pioneered by Rainer Kress for reconfigurable computing.

**See also**

* SISAL

* KressArray - Reconfigurable version of Super systolic array

**Literature**

* H. T. Kung, C. E. Leiserson: Algorithms for VLSI processor arrays; in: C. Mead, L. Conway (eds.): Introduction to VLSI Systems; Addison-Wesley, 1979

* S. Y. Kung: VLSI Array Processors; Prentice-Hall, Inc., 1988

* N. Petkov: Systolic Parallel Processing; North Holland Publishing Co, 1992

**External links**

* [Instruction Systolic Array (ISA)](http://www.iti.fh-flensburg.de/lang/papers/isa/index.htm)

* [KressArray](http://kressarray.de/)

* [Generalization of the systolic array (Super systolic array)](http://xputers.informatik.uni-kl.de/staff/hartenstein/lot/GeneralizationOfTheSystolicArray.pdf)

* [A Systolic Array Implementation Using FPGAs](http://www.cotsjournalonline.com/home/article.php?id=100249), COTS Journal, January 2005

*Wikimedia Foundation. 2010.*