- Fold (higher-order function)
functional programming, fold, also known variously as reduce, accumulate, compress or inject, is a family of higher-order functions that process a data structurein some order and build up a return value. Typically, a fold deals with two things: a combining function, and a data structure, typically a list of elements. The fold then proceeds to combine elements of the data structure using the function in some systematic way.
Folds are in a sense
dualto unfolds, which take a starting value and apply a function to it repeatedly to generate a data structure, whereas a fold applies a function repeatedly to a data structure and generates a starting value ( anamorphismas opposed to catamorphism).
Folds on lists
The folding of the list
[1,2,3,4,5]with the addition operator would result in 15, the sum of the elements of the list
[1,2,3,4,5]. To a rough approximation, one can think of the fold as replacing the commas in the list with the + operation, giving 1+2+3+4+5.
In the example above, + is an
associative operation, so it is irrelevant how the addition is parenthesized. In the general case of non-associative binary functions the order in which the elements are combined matters. On lists, there are two obvious ways to carry this out: either by recursively combining the first element with the results of combining the rest (called a right fold) or by recursively combining the results of combining all but the last element with the last one (called a left fold). With a right fold, the sum would be parenthesized as 1 + (2 + (3 + (4 + 5))), whereas with a left fold it would be parenthesized as (((1 + 2) + 3) + 4) + 5.
In practice, it is convenient and natural to have an initial value which in the case of a right fold is used when one reaches the end of the list, and in the case of a left fold is what is initially combined with the first element of the list. In the example above, the value 0 (the
additive identity) would be chosen as an initial value, giving 1 + (2 + (3 + (4 + (5 + 0)))) for the right fold, and ((((0 + 1) + 2) + 3) + 4) + 5 for the left fold.
List folds as structural transformations
Folds can be viewed as a mechanism for replacing the structural components of a data structure with functions and values in a regular way. In many languages, lists are built up from two primitives: any list is either the empty list, commonly called "nil", or it is a list "cons"tructed by appending an element to the start of some other list, which we call a "cons". The empty list and the cons operation are written as
(:)(colon) in Haskell. One can view a right fold as "substituting" the nil at the end of the list with a specific value, and each cons with a specific other function. Hence, one gets a diagram which looks something like this:
In the case of a left fold, the structural transformation being performed is somewhat less natural, but is still quite regular:
These pictures illustrate the names "left" and "right" fold visually. They also highlight the fact that
foldr (:) is the identity function on lists, as replacing cons with cons and nil with nil will not change the result. The left fold diagram suggests an easy way to reverse a list,
foldl (flip (:)) . Note that the parameters to cons must be flipped, because the element to add is now the right hand parameter of the combining function. Another easy result to see from this vantage-point is to write the higher-order map function in terms of foldr, by composing the function to act on the elements with cons, as:
map f = foldr ((:) . f) 
where the period (.) is an operator denoting function composition.
This way of looking at things provides a simple route to designing fold-like functions on other algebraic data structures, like various sorts of trees. One writes a function which recursively replaces the constructors of the datatype with provided functions, and any constant values of the type with provided values. Such a function is generally referred to as a
Using Haskell as an example,
foldlcan be formulated in a few equations.
foldr f z  = z -- if the list is empty, the result is the initial value z foldr f z (x:xs) = f x (foldr f z xs) -- if not, apply f to the first element and the result of folding the rest
foldl f z  = z -- if the list is empty, the result is the initial value foldl f z (x:xs) = foldl f (f z x) xs -- if not, we recurse immediately, making the new initial value the result -- of combining the old initial value with the first element.
null?denotes a predicate function that returns a true value if given the empty list as argument.
Evaluation order considerations
In the presence of lazy, or
foldrwill immediately return the application of "f" to the recursive case of folding over the rest of the list. Thus, if "f" is able to produce some part of its result without reference to the recursive case, and the rest of the result is never demanded, then the recursion will stop. This allows right folds to operate on infinite lists. By contrast,
foldlwill immediately call itself with new parameters until it reaches the end of the list. This
tail recursioncan be efficiently compiled as a loop, but can't deal with infinite lists at all — it will recurse forever in an infinite loop.
Reversing a list is also tail-recursive. (It can be implemented using
rev = foldl (ys x -> x : ys) .) On finite lists, that means that left-fold and reverse can be composed to perform a right fold in a tail-recursive way (with a modification to the function f so it reverses the order of its arguments).
Another technical point to be aware of in the case of left folds using lazy evaluation is that the new initial parameter is not being evaluated before the recursive call is made. This can lead to stack overflows when one reaches the end of the list and tries to evaluate the resulting potentially gigantic expression. For this reason, such languages often provide a stricter variant of left folding which forces the evaluation of the initial parameter before making the recursive call, in Haskell, this is the
foldl'(note the apostrophe, pronounced 'prime') function in the
Data.Listlibrary. Combined with the speed of tail recursion, such folds are very efficient when lazy evaluation of the final result is impossible or undesirable.
One often wants to choose the
identity elementof the operation "f" as the initial value "z". When no initial value seems appropriate, for example, when one wants to fold the function which computes the maximum of its two parameters over a list in order to get the maximum element of the list, there are variants of foldr and foldl which use the last and first element of the list respectively as the initial value. In Haskell and several other languages, these are called foldr1 and foldl1, the 1 making reference to the automatic provision of an initial element, and the fact that the lists they are applied to must have at least one element.
Using a Haskell interpreter, we can show the structural transformation which foldr and foldl perform by constructing a string as follows:
Prelude> foldr (x y -> concat ["(f ",x," ",y,")"] ) "z" (map show [1..5] ) "(f 1 (f 2 (f 3 (f 4 (f 5 z)))))" Prelude> foldl (x y -> concat ["(f ",x," ",y,")"] ) "z" (map show [1..5] ) "(f (f (f (f (f z 1) 2) 3) 4) 5)"
Map (higher-order function)
* [http://www.cse.unsw.edu.au/~en1000/haskell/hof.html "Higher order functions — map, fold and filter"]
* [http://www.cs.bham.ac.uk/~ard/modules/swh06.html "Unit 6: The Higher-order fold Functions"]
* [http://wiki.tcl.tk/14726 "Fold"]
* A pair of cute websites for the difference between [http://foldr.com/ foldr] and [http://foldl.com/ foldl] .
* [http://www.iis.sinica.edu.tw/~scm/2008/constructing-list-homomorphism/ "Constructing List Homomorphism from Left and Right Folds"]
* [http://caos.di.uminho.pt/~ulisses/blog/2007/11/20/foldr-the-magic-function/ "The magic foldr"]
Wikimedia Foundation. 2010.