Fold (higher-order function)

Fold (higher-order function)

In functional programming, fold, also known variously as reduce, accumulate, compress or inject, is a family of higher-order functions that process a data structure in some order and build up a return value. Typically, a fold deals with two things: a combining function, and a data structure, typically a list of elements. The fold then proceeds to combine elements of the data structure using the function in some systematic way.

Folds are in a sense dual to unfolds, which take a starting value and apply a function to it repeatedly to generate a data structure, whereas a fold applies a function repeatedly to a data structure and generates a starting value (anamorphism as opposed to catamorphism).

Folds on lists

The folding of the list [1,2,3,4,5] with the addition operator would result in 15, the sum of the elements of the list [1,2,3,4,5] . To a rough approximation, one can think of the fold as replacing the commas in the list with the + operation, giving 1+2+3+4+5.

In the example above, + is an associative operation, so it is irrelevant how the addition is parenthesized. In the general case of non-associative binary functions the order in which the elements are combined matters. On lists, there are two obvious ways to carry this out: either by recursively combining the first element with the results of combining the rest (called a right fold) or by recursively combining the results of combining all but the last element with the last one (called a left fold). With a right fold, the sum would be parenthesized as 1 + (2 + (3 + (4 + 5))), whereas with a left fold it would be parenthesized as (((1 + 2) + 3) + 4) + 5.

In practice, it is convenient and natural to have an initial value which in the case of a right fold is used when one reaches the end of the list, and in the case of a left fold is what is initially combined with the first element of the list. In the example above, the value 0 (the additive identity) would be chosen as an initial value, giving 1 + (2 + (3 + (4 + (5 + 0)))) for the right fold, and ((((0 + 1) + 2) + 3) + 4) + 5 for the left fold.

List folds as structural transformations

Folds can be viewed as a mechanism for replacing the structural components of a data structure with functions and values in a regular way. In many languages, lists are built up from two primitives: any list is either the empty list, commonly called "nil", or it is a list "cons"tructed by appending an element to the start of some other list, which we call a "cons". The empty list and the cons operation are written as [] and (:) (colon) in Haskell. One can view a right fold as "substituting" the nil at the end of the list with a specific value, and each cons with a specific other function. Hence, one gets a diagram which looks something like this:

In the case of a left fold, the structural transformation being performed is somewhat less natural, but is still quite regular:

These pictures illustrate the names "left" and "right" fold visually. They also highlight the fact that foldr (:) [] is the identity function on lists, as replacing cons with cons and nil with nil will not change the result. The left fold diagram suggests an easy way to reverse a list, foldl (flip (:)) [] . Note that the parameters to cons must be flipped, because the element to add is now the right hand parameter of the combining function. Another easy result to see from this vantage-point is to write the higher-order map function in terms of foldr, by composing the function to act on the elements with cons, as:

map f = foldr ((:) . f) []

where the period (.) is an operator denoting function composition.

This way of looking at things provides a simple route to designing fold-like functions on other algebraic data structures, like various sorts of trees. One writes a function which recursively replaces the constructors of the datatype with provided functions, and any constant values of the type with provided values. Such a function is generally referred to as a catamorphism.

Implementation

Using Haskell as an example, foldr and foldl can be formulated in a few equations.

foldr f z [] = z -- if the list is empty, the result is the initial value z foldr f z (x:xs) = f x (foldr f z xs) -- if not, apply f to the first element and the result of folding the rest

foldl f z [] = z -- if the list is empty, the result is the initial value foldl f z (x:xs) = foldl f (f z x) xs -- if not, we recurse immediately, making the new initial value the result -- of combining the old initial value with the first element.

In Scheme and other Lisps, the empty list and the list construction operator are written as () and cons. Using Scheme, right and left fold can be written as:

(define (foldr f z xs) (if (null? xs) z (f (car xs) (foldr f z (cdr xs)))))(define (foldl f z xs) (if (null? xs) z (foldl f (f z (car xs)) (cdr xs))))

Here null? denotes a predicate function that returns a true value if given the empty list as argument.

Evaluation order considerations

In the presence of lazy, or normal-order evaluation, foldr will immediately return the application of "f" to the recursive case of folding over the rest of the list. Thus, if "f" is able to produce some part of its result without reference to the recursive case, and the rest of the result is never demanded, then the recursion will stop. This allows right folds to operate on infinite lists. By contrast, foldl will immediately call itself with new parameters until it reaches the end of the list. This tail recursion can be efficiently compiled as a loop, but can't deal with infinite lists at all — it will recurse forever in an infinite loop.

Reversing a list is also tail-recursive. (It can be implemented using rev = foldl (ys x -> x : ys) [] .) On finite lists, that means that left-fold and reverse can be composed to perform a right fold in a tail-recursive way (with a modification to the function f so it reverses the order of its arguments).

Another technical point to be aware of in the case of left folds using lazy evaluation is that the new initial parameter is not being evaluated before the recursive call is made. This can lead to stack overflows when one reaches the end of the list and tries to evaluate the resulting potentially gigantic expression. For this reason, such languages often provide a stricter variant of left folding which forces the evaluation of the initial parameter before making the recursive call, in Haskell, this is the foldl' (note the apostrophe, pronounced 'prime') function in the Data.List library. Combined with the speed of tail recursion, such folds are very efficient when lazy evaluation of the final result is impossible or undesirable.

One often wants to choose the identity element of the operation "f" as the initial value "z". When no initial value seems appropriate, for example, when one wants to fold the function which computes the maximum of its two parameters over a list in order to get the maximum element of the list, there are variants of foldr and foldl which use the last and first element of the list respectively as the initial value. In Haskell and several other languages, these are called foldr1 and foldl1, the 1 making reference to the automatic provision of an initial element, and the fact that the lists they are applied to must have at least one element.

Examples

Using a Haskell interpreter, we can show the structural transformation which foldr and foldl perform by constructing a string as follows:

Prelude> foldr (x y -> concat ["(f ",x," ",y,")"] ) "z" (map show [1..5] ) "(f 1 (f 2 (f 3 (f 4 (f 5 z)))))" Prelude> foldl (x y -> concat ["(f ",x," ",y,")"] ) "z" (map show [1..5] ) "(f (f (f (f (f z 1) 2) 3) 4) 5)"

ee also

*Homomorphism
*Map (higher-order function)

Notes

External links

* [http://www.cse.unsw.edu.au/~en1000/haskell/hof.html "Higher order functions — map, fold and filter"]
* [http://www.cs.bham.ac.uk/~ard/modules/swh06.html "Unit 6: The Higher-order fold Functions"]
* [http://wiki.tcl.tk/14726 "Fold"]
* A pair of cute websites for the difference between [http://foldr.com/ foldr] and [http://foldl.com/ foldl] .
* [http://www.iis.sinica.edu.tw/~scm/2008/constructing-list-homomorphism/ "Constructing List Homomorphism from Left and Right Folds"]
* [http://caos.di.uminho.pt/~ulisses/blog/2007/11/20/foldr-the-magic-function/ "The magic foldr"]


Wikimedia Foundation. 2010.

Игры ⚽ Поможем решить контрольную работу

Look at other dictionaries:

  • Map (higher-order function) — In many programming languages, map is the name of a higher order function that applies a given function to each element of a list, returning a list of results. They are examples of both catamorphisms and anamorphisms. This is often called apply… …   Wikipedia

  • Higher-order function — In mathematics and computer science, higher order functions or functionals are functions which do at least one of the following: *take one or more functions as an input *output a function.In mathematics these are also known as operators or… …   Wikipedia

  • Function composition — For function composition in computer science, see function composition (computer science). g ∘ f, the composition of f and g. For example, (g ∘ f)(c) = #. In mathematics, function composition is the application of one function to the resul …   Wikipedia

  • Ackermann function — In recursion theory, the Ackermann function or Ackermann Péter function is a simple example of a general recursive function that is not primitive recursive. General recursive functions are also known as computable functions. The set of primitive… …   Wikipedia

  • Folding — Fold may refer to:*Above the fold, a graphic design concept originating in newspapers *Fold (from the Saxon falod ) meaning a staked off pasture area (e.g. Sheepfold, an enclosure for sheep) and used in place names such as the Fold Villages.… …   Wikipedia

  • Language Integrated Query — LINQ redirects here. For the card game, see Linq (card game). Language Integrated Query Influenced by SQL, Haskell Language Integrated Query (LINQ, pronounced link ) is a Microsoft .NET Framework component that adds native data querying… …   Wikipedia

  • Contact order — The contact order of a protein is a measure of the locality of the inter amino acid contacts in the protein s native state tertiary structure. It is calculated as the average sequence distance between residues that form native contacts in the… …   Wikipedia

  • Iterated binary operation — In mathematics, an iterated binary operation is an extension of a binary operation on a set S to a function on finite sequences of elements of S through repeated application. Common examples include the extension of the addition operation to the… …   Wikipedia

  • Prefix sum — The prefix sum (also known as the scan, prefix reduction, or partial sum) is an operation on lists in which each element in the result list is obtained from the sum of the elements in the operand list up to its index. There are two types of… …   Wikipedia

  • Church encoding — In mathematics, Church encoding is a means of embedding data and operators into the lambda calculus, the most familiar form being the Church numerals, a representation of the natural numbers using lambda notation. The method is named for Alonzo… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”