TransAgg

TransAgg

TransAgg or "Transform and Aggregate" is a model of distributed computation for "Internet-scale" computations. It is especially relevant for Cloud computing. Origins for TransAgg model could be found in Bulk Synchronous Parallel computations, which divide stages of parallel computations into a transformation of input data, followed by shuffling of the results of these computations, and the second stage of computation base don the aggregated results at each compute node. TransAgg can be thought of as modularizing Bulk Synchronous Parallel, because BSP is an iterative scheme, while TransAgg splits each of the iterations of BSP into different parallel jobs.

TransAgg computations divide parallel computations into two stages : Transformation, and Aggregation. Many common parallel algorithms can easily be transformed into chains of these stages.

MapReduce computations, originally conceived in the functional programming paradigm as implemented in Lisp, is a special subset of TransAgg computations, since MapReduce implies transformations in the map stage that do not have any persistent state. TransAgg computations' transform stages are free from these limitations.

Hadoop, an Apache distributed computing framework, implements TransAgg computation model.


Wikimedia Foundation. 2010.

Игры ⚽ Нужна курсовая?

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”