Hindley–Milner

In type theory, Hindley–Milner (HM), also known as Damas–Milner or Damas–Hindley–Milner, is a classical type inference method with parametric polymorphism for the lambda calculus, first described by J. Roger Hindley^[1] and later rediscovered by Robin Milner.^[2] Luis Damas contributed a close formal analysis and proof of the method in his PhD thesis.^[3]^[4] Among the properties that make HM so outstanding are its completeness and its ability to deduce the most general type of a given source without any type annotations or other hints. HM is also fast: in practice it computes a type in almost linear time with respect to the size of the source, which makes it usable for typing large programs. HM is used mainly for functional languages. It was first implemented as part of the type system of the programming language ML. Since then, HM has been extended in various ways, most notably by constrained types as used in Haskell.

Introduction

In organizing their original paper, Damas and Milner^[4] clearly separated two very different tasks: describing what types an expression can have, and presenting an algorithm that actually computes a type. Keeping the two aspects apart makes it possible to focus separately on the logic (i.e. the meaning) behind the algorithm and to establish a benchmark for the algorithm's properties.

How expressions and types fit together is described by means of a deductive system. Like any proof system, it allows different ways of reaching a conclusion, and since one and the same expression may have different types, different conclusions about an expression are possible. In contrast, the type inference method itself (Algorithm W) is defined as a step-by-step procedure that leaves no choice about what to do next. Decisions not present in the logic must therefore have been made in constructing the algorithm. These demand a closer look and justification, but would perhaps remain non-obvious without the above separation.

Syntax

Expressions
$\begin{array}{lrll} e & = & x & \textrm{variable}\\ & \vert & e\ e & \textrm{application}\\ & \vert & \lambda\ x\ .\ e & \textrm{abstraction} \\ & \vert & \texttt{let}\ x = e\ \texttt{in}\ e \\ \end{array}$
Types
$\begin{array}{llrll} \textrm{mono} & \tau &= & \alpha & \ \textrm{variable} \\ & &\vert & D\ \tau\dots\tau & \ \textrm{application} \\ \textrm{poly} & \sigma &= & \tau \\ & &\vert& \forall\ \alpha\ .\ \sigma & \ \textrm{qualifier}\\ \\ \end{array}$

Logic and algorithm share the notions of "expression" and "type", whose form is made precise by the syntax.

The expressions to be typed are exactly those of the lambda calculus, enhanced by a let-expression.

Readers unfamiliar with the lambda calculus may be puzzled by the syntax, but it is quickly translated: the application $e_1\ e_2$ represents function application, often written $e_1(e_2)$, and the abstraction $\lambda\ x\ .\ e$ is an anonymous function or function literal, common in most contemporary programming languages, where it is perhaps only spelled more verbosely as $\texttt{function}\,(x)\ \texttt{return}\ e\ \texttt{end}$.
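The expression syntax can be mirrored directly as a small data type. The following Python sketch is one possible encoding; the class names `Var`, `App`, `Abs`, `Let` and the helper `show` are illustrative assumptions, not part of HM itself.

```python
from dataclasses import dataclass

# Hypothetical names, chosen for this illustration only.

@dataclass(frozen=True)
class Var:                 # x
    name: str

@dataclass(frozen=True)
class App:                 # e1 e2
    fn: object
    arg: object

@dataclass(frozen=True)
class Abs:                 # λ x . e
    param: str
    body: object

@dataclass(frozen=True)
class Let:                 # let x = e1 in e2
    name: str
    bound: object
    body: object

def show(e) -> str:
    """Render an expression in the notation of the syntax box."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, App):
        return f"({show(e.fn)} {show(e.arg)})"
    if isinstance(e, Abs):
        return f"(λ{e.param}. {show(e.body)})"
    return f"(let {e.name} = {show(e.bound)} in {show(e.body)})"

identity = Abs("x", Var("x"))
example = Let("id", identity, App(Var("id"), Var("id")))
print(show(example))       # (let id = (λx. x) in (id id))
```

Each of the four classes corresponds to exactly one production of the grammar, which is what later allows one typing rule per syntactic form.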

Types as a whole are split into two groups, called mono- and polytypes.^{[note 1]}

Monotypes $\tau$, syntactically terms, always designate a particular type, in the sense that a monotype is equal only to itself and different from all others. The most typical representatives of monotypes are type constants like $int$ or $string$. Types can also be parametric, like $Map\ (Set\ string)\ int$. All these types are examples of applications of type functions $D$, i.e. $\left\{int^0, string^0, Map^2, Set^1\right\} \subset D$ in the examples just mentioned, where the superscript indicates the number of type parameters. While the choice of $D$ is completely arbitrary, in the context of HM it must contain at least $\rightarrow^2$, the type of functions, which is written infix for convenience; e.g. a function mapping integers to strings has type $int\rightarrow string$.^{[note 2]}

Perhaps somewhat surprisingly, type variables are monotypes, too. Standing alone, a type variable $\alpha$ is meant to be as concrete as $int$ or $\beta$, and clearly different from both. Type variables occurring as monotypes behave as if they were type constants about which no further information is available. Correspondingly, a function typed $\alpha\rightarrow\alpha$ only maps values of the particular type $\alpha$ to values of that same type. Such a function can only be applied to values of type $\alpha$ and to no others.

A function with polytype $\forall\alpha.\alpha\rightarrow\alpha$, by contrast, can map a value of any type to a value of that same type, and the identity function is a value of this type. As another example, $\forall\alpha.(Set\ \alpha)\rightarrow int$ is the type of a function mapping all finite sets to integers; the function counting the members of a set is a value of this type. Note that qualifiers can only appear at the top level, i.e. a type such as $\forall\alpha.\alpha\rightarrow\forall\alpha.\alpha$ is excluded by the syntax of types, and that monotypes are included in the polytypes. A type thus has the general form $\forall\alpha_1\dots\forall\alpha_n.\tau$.
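The split into monotypes and polytypes can be made concrete with three small classes. This is a minimal sketch under assumed names (`TVar`, `TCon`, `Forall`, `fn`, `show`); the restriction that qualifiers appear only at the top level is reflected by `Forall` wrapping a monotype body rather than appearing inside `TCon`.

```python
from dataclasses import dataclass

# Hypothetical encoding, for illustration only.

@dataclass(frozen=True)
class TVar:                # type variable α
    name: str

@dataclass(frozen=True)
class TCon:                # application D τ1 ... τn of a type function D
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:              # ∀ α1 ... αn . τ  (body is a monotype)
    vars: tuple
    body: object

def fn(a, b):
    """The function type a -> b, i.e. the type function '->' of arity 2."""
    return TCon("->", (a, b))

def show(t) -> str:
    if isinstance(t, TVar):
        return t.name
    if isinstance(t, Forall):
        return f"forall {' '.join(t.vars)}. {show(t.body)}"
    if t.name == "->":
        return f"({show(t.args[0])} -> {show(t.args[1])})"
    parts = [t.name]
    for a in t.args:
        s = show(a)
        # Parenthesize nested applications such as (Set string).
        nested = isinstance(a, TCon) and a.args and a.name != "->"
        parts.append(f"({s})" if nested else s)
    return " ".join(parts)

int_t, string_t = TCon("int"), TCon("string")
map_t = TCon("Map", (TCon("Set", (string_t,)), int_t))
id_t = Forall(("a",), fn(TVar("a"), TVar("a")))
print(show(map_t))         # Map (Set string) int
print(show(id_t))          # forall a. (a -> a)
```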

Free type variables

Free Type Variables
$\begin{array}{ll} \text{free}(\ \alpha\ ) &=\ \left\{\alpha\right\}\\ \text{free}(\ D\ \tau_1\dots\tau_n\ ) &=\ \bigcup\limits_{i=1}^n{\text{free}(\ \tau_i\ )} \\ \text{free}(\ \forall\ \alpha\ .\ \sigma\ ) &=\ \text{free}(\ \sigma\ )\ -\ \left\{\alpha\right\}\\ \end{array}$

In a type $\forall\alpha_1\dots\forall\alpha_n.\tau$, the symbol $\forall$ is the qualifier binding the type variables $\alpha_i$ in the monotype $\tau$. The variables $\alpha_i$ are called qualified; any occurrence of a qualified type variable in $\tau$ is called bound, and all unbound type variables in $\tau$ are called free. As in the lambda calculus, the notions of free and bound variables are essential for understanding the meaning of types.
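The three equations for free type variables translate almost literally into a recursive function. The sketch below assumes a hypothetical encoding of types as the classes `TVar`, `TCon` and `Forall`; only the function `free` corresponds to the definition in the box.

```python
from dataclasses import dataclass

# Hypothetical encoding of types, for illustration only.

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: tuple
    body: object

def free(t) -> set:
    """Names of the free type variables of a mono- or polytype."""
    if isinstance(t, TVar):                      # free(α) = {α}
        return {t.name}
    if isinstance(t, TCon):                      # union over all parameters
        return set().union(*(free(a) for a in t.args))
    return free(t.body) - set(t.vars)            # remove the qualified variables

alpha, beta = TVar("alpha"), TVar("beta")
foo_type = Forall(("beta",), TCon("->", (beta, alpha)))   # ∀β. β → α
print(free(foo_type))                            # {'alpha'}
```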

This is certainly the hardest part of HM, perhaps because polytypes containing free variables are not represented in programming languages like Haskell. Likewise, one does not have clauses with free variables in Prolog. In particular, developers experienced with both languages, and who actually know all the prerequisites of HM, are likely to miss this point. In Haskell, for example, all type variables are implicitly qualified, i.e. a Haskell type $\texttt{a -> a}$ means $\forall\alpha.\alpha\rightarrow\alpha$. Because a type like $\alpha\rightarrow\alpha$, though it may occur in practice in a Haskell program, cannot be expressed there, it is easily confused with its qualified version.

So what function can have a type like $\forall\beta.\beta\rightarrow\alpha$, i.e. a mixture of both bound and unbound type variables, and what could the free type variable $\alpha$ therein mean?

Example 1
$\begin{array}{l} \textbf{let}\ bar\ [\forall\alpha.\forall\beta.\alpha\rightarrow(\beta\rightarrow\alpha)] = \lambda\ x.\\ \quad\textbf{let}\ foo\ [\forall\beta.\beta\rightarrow\alpha] = \lambda\ y.x\\ \quad\textbf{in}\ foo\\ \textbf{in}\ bar \end{array}$

Consider $foo$ in Example 1, with type annotations in brackets. Its parameter $y$ is not used in the body, but the variable $x$ bound in the outer context of $foo$ is. As a consequence, $foo$ accepts every value as argument, while returning a value bound outside, and with it that value's type. $bar$, by contrast, has type $\forall\alpha.\forall\beta.\alpha\rightarrow(\beta\rightarrow\alpha)$, in which all occurring type variables are bound. Evaluating, for instance, $bar\ 1$ results in a function of type $\forall\beta.\beta\rightarrow int$, perfectly reflecting that foo's monotype $\alpha$ in $\forall\beta.\beta\rightarrow\alpha$ has been refined by this call.

In this example, the free monotype variable $\alpha$ in foo's type becomes meaningful by being qualified in the outer scope, namely in bar's type. That is, in the context of the example, the same type variable $\alpha$ appears both bound and free in different types. As a consequence, without knowing the context, a free type variable cannot be interpreted as anything more than some monotype. Turning the statement around, a typing is in general not meaningful without a context.

Context and typing

Syntax
$\begin{array}{llrl} \text{Context} & \Gamma & = & \epsilon\ \texttt{(empty)}\\ & & \vert& \Gamma,\ x : \sigma\\ \text{Typing} & & = & \Gamma \vdash e : \sigma\\ \\ \end{array}$
Free Type Variables
$\begin{array}{ll} \text{free}(\ \Gamma\ ) &=\ \bigcup\limits_{x:\sigma \in \Gamma}\text{free}(\ \sigma\ ) \end{array}$

Consequently, to bring the so far disjoint parts of the syntax, expressions and types, together meaningfully, a third part is needed: the context. Syntactically, it is a list of pairs $x:\sigma$, called assignments or assumptions, stating for each value variable $x_i$ therein a type $\sigma_i$. All three parts combined give a typing of the form $\Gamma\ \vdash\ e:\sigma$, stating that under the assumptions $\Gamma$, the expression $e$ has type $\sigma$.

With the complete syntax at hand, one can finally make a meaningful statement about the type of $foo$ in Example 1 above, namely $x:\alpha \vdash \lambda\ y.x : \forall\beta.\beta\rightarrow\alpha$. In contrast to the formulations above, the monotype variable $\alpha$ no longer appears unbound, i.e. meaningless, but bound in the context as the type of the value variable $x$. Whether a type variable is bound or free in the context evidently plays a significant role for a type as part of a typing, so $free(\ \Gamma\ )$ is made precise in the side box.
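The role of the context in the typing of $foo$ can be checked mechanically. In the following sketch the context is represented as a list of (variable, type) pairs, as in the syntax box; the class names and the helper `free_ctx` are assumptions made for this illustration.

```python
from dataclasses import dataclass

# Hypothetical encoding, for illustration only.

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: tuple
    body: object

def free(t) -> set:
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TCon):
        return set().union(*(free(a) for a in t.args))
    return free(t.body) - set(t.vars)

def free_ctx(ctx) -> set:
    """free(Γ): union of the free variables of all assigned types."""
    return set().union(*(free(sigma) for _, sigma in ctx))

alpha, beta = TVar("alpha"), TVar("beta")
ctx = [("x", alpha)]                                      # Γ = x : α
foo_type = Forall(("beta",), TCon("->", (beta, alpha)))   # ∀β. β → α

# α is free in foo's type, but it is accounted for by the context.
print(free(foo_type) - free_ctx(ctx))                     # set()
```

The empty difference expresses that every free variable of foo's type is explained by an assumption in the context, which is what makes the typing meaningful.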

Note on expressiveness

Since the expression syntax might appear far too inexpressive to readers unfamiliar with the lambda calculus, and because the examples given below will likely support this misconception, it may help to note that HM does not deal with toy languages. As a central result in research on computability, the expression syntax defined above (even without the let-variant) is able to express any computable function. Moreover, all other programming language constructs can be transformed relatively directly, by syntactic means, into expressions of the lambda calculus. This simple expression syntax is therefore used as a model for programming languages in research. A method known to work well for the lambda calculus can easily be extended to all, or at least many, other syntactic constructs of a particular programming language by using the syntactic transformations just mentioned.

As an example, the additional expression variant $\textbf{let}\ x = e_1\ \textbf{in}\ e_2$ can be transformed to $(\lambda x.e_2)\ e_1$ . It is added to the expression syntax in HM only to support generalization during type inference, not because the syntax lacks computational strength. Thus HM deals with the inference of types in programs in general, and the various functional languages using this method demonstrate how well a result formulated only for the syntax of the lambda calculus can be extended to syntactically complex languages.

Contrary to the impression that the expressions might be too inexpressive for practical application, they are actually far too expressive to be meaningfully typed at all. This is a consequence of the decision problem being undecidable for anything as expressive as the expressions of the lambda calculus. Consequently, computing typings is a hopeless venture in general. Depending on the nature of the type system, it will either never terminate or otherwise refuse to work.

HM belongs to the latter group of type systems. A collapse of the type system then presents itself as a more subtle situation, in that suddenly only one and the same type is yielded for the expressions of interest. This is not a fault in HM, but inherent in the problem of typing itself, and can easily be created within any strongly typed programming language, e.g. by coding an evaluator (the universal function) for the "too simple" expressions. One then has a single concrete type that represents the universal data type, as usual in untyped languages. The type system of the host programming language has then collapsed and can no longer differentiate between the various types of values handed to or produced by the evaluator. In this context, it still delivers or checks types, but always the same one, just as if the type system were no longer present at all.

Polymorphic type order

While the equality of monotypes is purely syntactical, polytypes offer a richer structure by being related to other types through a specialization relation $\sigma \sqsubseteq \sigma'$, expressing that $\sigma'$ is more special than $\sigma$.

When applied to a value, a polymorphic function has to change its shape, specializing to deal with this particular type of value. During this process, it also changes its type to match that of the parameter. If, for instance, the identity function having type $\forall\alpha.\alpha\rightarrow\alpha$ is to be applied to a number having type $int$, the two simply cannot work together, because the types differ and nothing fits. What is needed is a function of type $int\rightarrow int$. Thus, during application, the polymorphic identity is specialized to a monomorphic version of itself. In terms of the specialization relation, one writes $\forall\alpha.\alpha\rightarrow\alpha \sqsubseteq\ int\rightarrow int$.

The shape shifting of polymorphic values is not fully arbitrary, but limited by their pristine polytype. Following what happened in the example, one could paraphrase the rule of specialization by saying that a polymorphic type $\forall\alpha.\tau$ is specialized by consistently replacing each occurrence of $\alpha$ in $\tau$ and dropping the qualifier. While this rule works well for any monotype used as replacement, it fails when a polytype, say $\forall\beta.\beta$, is tried as a replacement, resulting in the non-syntactical type $\forall\beta.\beta\rightarrow\forall\beta.\beta$. But not only that. Even if types with nested qualifiers were allowed in the syntax, the result of the substitution would no longer preserve the property of the pristine type that the parameter and the result of the function have the same type. The two are now only seemingly equal: both subtypes have become independent of each other, which allows the parameter and the result to be specialized with different types, resulting in, e.g., $string\rightarrow Set\ int$, hardly the right type for an identity function.

The syntactic restriction that allows qualification only at the top level is imposed to prevent generalization while specializing. Instead of $\forall\beta.\beta\rightarrow\forall\beta.\beta$, the more special type $\forall\beta.\beta\rightarrow\beta$ must be produced in this case.

One could undo the former specialization by specializing on some value of type $\forall\alpha.\alpha$ again. In terms of the relation, one obtains $\forall\alpha.\alpha\rightarrow\alpha \sqsubseteq \forall\beta.\beta\rightarrow\beta \sqsubseteq\forall\alpha.\alpha\rightarrow\alpha$ as a summary, meaning that syntactically different polytypes are equal up to renaming of their qualified variables.

Specialization Rule
$\displaystyle\frac{\tau' = \left[\alpha_i := \tau_i\right] \tau \quad \beta_i \not\in \textrm{free}(\forall \alpha_1...\forall\alpha_n . \tau)}{\forall \alpha_1...\forall\alpha_n . \tau \sqsubseteq \forall \beta_1...\forall\beta_m . \tau'}$

Focusing now only on the question of whether one type is more special than another, and no longer on what the specialized type is used for, one can summarize specialization as in the box above. Paraphrasing it clockwise, a type $\forall\alpha_1\dots\forall\alpha_n.\tau$ is specialized by consistently replacing any of the qualified variables $\alpha_i$ by arbitrary monotypes $\tau_i$, yielding a monotype $\tau'$. Finally, type variables in $\tau'$ that do not occur free in the pristine type can optionally be qualified.
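For the common case in which the more special type is a monotype, the rule can be tested by one-way matching: the qualified variables may be bound to arbitrary monotypes, but consistently, while free variables must stay as they are. The sketch below is an illustration under assumed names (`subst`, `match_type`, `specializes_to`) and does not cover the optional re-qualification of new variables.

```python
from dataclasses import dataclass

# Hypothetical encoding, for illustration only.

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: tuple
    body: object

def fn(a, b):
    return TCon("->", (a, b))

def subst(t, m):
    """Apply the substitution [α_i := τ_i] given by the dict m."""
    if isinstance(t, TVar):
        return m.get(t.name, t)
    return TCon(t.name, tuple(subst(a, m) for a in t.args))

def match_type(pattern, target, bound, m):
    """Extend m so that subst(pattern, m) == target, if possible."""
    if isinstance(pattern, TVar):
        if pattern.name not in bound:        # free variable: must not change
            return pattern == target
        if pattern.name in m:                # already replaced: be consistent
            return m[pattern.name] == target
        m[pattern.name] = target
        return True
    return (isinstance(target, TCon) and pattern.name == target.name
            and len(pattern.args) == len(target.args)
            and all(match_type(p, t, bound, m)
                    for p, t in zip(pattern.args, target.args)))

def specializes_to(sigma, tau):
    """True if sigma ⊑ tau for a polytype sigma and a monotype tau."""
    return match_type(sigma.body, tau, set(sigma.vars), {})

a = TVar("a")
int_t, string_t = TCon("int"), TCon("string")
id_type = Forall(("a",), fn(a, a))

print(subst(id_type.body, {"a": int_t}) == fn(int_t, int_t))   # True
print(specializes_to(id_type, fn(int_t, int_t)))               # True
print(specializes_to(id_type, fn(string_t, int_t)))            # False
```

The last line fails for the reason given in the text: the single variable would have to be replaced by two different types.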

Thus the specialization rule makes sure that no free variable, i.e. monotype, in the pristine type becomes unintentionally bound by a qualifier, while an originally qualified variable can be replaced with anything, even with types introducing new qualified or unqualified type variables.

Starting with a polytype $\forall\alpha.\alpha$, the specialization could either replace the body by another qualified variable, which is actually a renaming, or by some type constant (including the function type), which may or may not have parameters filled either with monotypes or with qualified type variables. Once a qualified variable is replaced by a type application, this specialization cannot be undone through another substitution, as was possible for qualified variables. The type application is thus there to stay. Only if it contains another qualified type variable can the specialization continue further by replacing that variable.

So specialization introduces no further equivalence on polytypes besides the already known renaming. Polytypes are syntactically equal up to renaming of their qualified variables. Taking equality in this sense, the relation $\sqsubseteq$ is reflexive and antisymmetric, and since successive specializations compose, it is also transitive. The relation $\sqsubseteq$ is therefore an order.

Deductive system

The Syntax of Rules
$\begin{array}{lrl} \text{Predicate} & = &\sigma\sqsubseteq\sigma'\\ & \vert\ &\alpha\not\in free(\Gamma)\\ & \vert\ &x:\sigma\in \Gamma\\ \\ \text{Judgment} & = &\text{Typing}\\ \text{Premise} & = &\text{Judgment}\ \vert\ \text{Predicate}\\ \text{Conclusion} & = &\text{Judgment}\\ \\ \text{Rule} & = &\displaystyle\frac{\textrm{Premise}\ \dots}{\textrm{Conclusion}}\quad [\texttt{Name}] \end{array}$

The syntax of HM is carried forward to the syntax of the inference rules that form the body of the formal system by using the typings as judgments. Each rule defines what conclusion can be drawn from what premises. In addition to the judgments, some of the extra conditions introduced above may be used as premises, too.

A proof using the rules is a sequence of judgments such that all premises are listed before a conclusion. See Examples 2 and 3 for a possible format of proofs. From left to right, each line shows the conclusion, the $[\texttt{Name}]$ of the rule applied and the premises, either by referring to an earlier line (number) if the premise is a judgment or by making the predicate explicit.

Typing rules
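
For orientation, the declarative rule system, as HM is commonly presented, can be summarized as follows. This is a sketch of the usual formulation: the rule names match those used in this article, while the particular choice of metavariables is an assumption.

$\begin{array}{cl} \displaystyle\frac{x:\sigma \in \Gamma}{\Gamma \vdash x:\sigma}&[\texttt{Var}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\tau \rightarrow \tau' \quad\quad \Gamma \vdash e_1 : \tau }{\Gamma \vdash e_0\ e_1 : \tau'}&[\texttt{App}]\\ \\ \displaystyle\frac{\Gamma,\;x:\tau\vdash e:\tau'}{\Gamma \vdash \lambda\ x\ .\ e : \tau \rightarrow \tau'}&[\texttt{Abs}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\sigma \quad\quad \Gamma,\,x:\sigma \vdash e_1:\tau'}{\Gamma \vdash \texttt{let}\ x = e_0\ \texttt{in}\ e_1 : \tau'}&[\texttt{Let}]\\ \\ \displaystyle\frac{\Gamma \vdash e:\sigma' \quad\quad \sigma' \sqsubseteq \sigma}{\Gamma \vdash e:\sigma}&[\texttt{Inst}]\\ \\ \displaystyle\frac{\Gamma \vdash e:\sigma \quad\quad \alpha \not\in \textrm{free}(\Gamma)}{\Gamma \vdash e:\forall\ \alpha\ .\ \sigma}&[\texttt{Gen}] \end{array}$

The first four rules follow the expression syntax, one rule per syntactic form, while $[\texttt{Inst}]$ and $[\texttt{Gen}]$ specialize and generalize the type of an arbitrary expression.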

Principal type

As mentioned in the introduction, the rules allow different types to be deduced for one and the same expression. See, for instance, Example 2, steps 1 and 2, and Example 3, steps 2 and 3, for three different typings of the same expression. Clearly, the different results are not fully unrelated, but connected by the type order. It is an important property of the rule system and this order that whenever more than one type can be deduced for an expression, among them is (modulo alpha-renaming of the type variables) a unique most general type, in the sense that all others are specializations of it. Though the rule system must allow specialized types to be derived, a type inference algorithm should deliver this most general or principal type as its result.

Let-polymorphism

Though not immediately visible, the rule set encodes a regulation of the circumstances under which a type may be generalized, through a slightly varying use of mono- and polytypes in the rules $[\texttt{Abs}]$ and $[\texttt{Let}]$ .

In rule $[\texttt{Abs}]$ , the value variable of the parameter of the function $\lambda x.e$ is added to the context with a monomorphic type through the premise $\Gamma,\ x:\tau \vdash e:\tau'$ , while in rule $[\texttt{Let}]$ , the variable enters the environment in polymorphic form, $\Gamma,\ x:\sigma \vdash e_1:\tau'$ . Though in both cases the presence of $x$ in the context prevents the use of the generalization rule for any monotype variable in the assignment, this regulation forces the parameter $x$ in a $\lambda$-expression to remain monomorphic, while in a let-expression the variable can already be introduced as polymorphic, making specializations possible.

As a consequence of this regulation, no type can be inferred for $\lambda f.(f\, \textrm{true}, f\, \textrm{0})$, since the parameter $f$ is in a monomorphic position, while $\textbf{let}\ f = \lambda x . x\, \textbf{in}\, (f\, \textrm{true}, f\, \textrm{0})$ yields the type $(bool, int)$, because $f$ has been introduced in a let-expression and is therefore treated as polymorphic. Note that this behaviour is in strong contrast to the usual definition $\textbf{let}\ x = e_1\ \textbf{in}\ e_2\ ::= (\lambda\ x.e_2)\ e_1$, and it is the reason why the let-expression appears in the syntax at all. This distinction is called let-polymorphism or let-generalization and is a concept peculiar to HM.

Towards an algorithm

Now that the deduction system of HM is at hand, one could present an algorithm and validate it with respect to the rules. Alternatively, it might be possible to derive it by taking a closer look at how the rules interact and how proofs are formed. This is done in the remainder of this article, focusing on the possible decisions one can make while proving a typing.

Degrees of freedom choosing the rules

Isolating the points in a proof where no decision is possible at all, the first group of rules, centered around the syntax, leaves no choice, since to each syntactical rule corresponds a unique typing rule, which determines a part of the proof. Between the conclusion and the premises of these fixed parts, chains of $[\texttt{Inst}]$ and $[\texttt{Gen}]$ could occur. Such a chain could also exist between the conclusion of the proof and the rule for the topmost expression. All proofs must have the shape so sketched.

Because the only choice in a proof with respect to rule selection lies in the $[\texttt{Inst}]$ and $[\texttt{Gen}]$ chains, the form of the proof raises the question of whether it can be made more precise where these chains might be needed. This is in fact possible and leads to a variant of the rule system without either rule.

Syntax-directed rule system

Syntactical Rule System
$\begin{array}{cl} \displaystyle\frac{x:\sigma \in \Gamma \quad \sigma \sqsubseteq \tau}{\Gamma \vdash x:\tau}&[\texttt{Var}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\tau \rightarrow \tau' \quad\quad \Gamma \vdash e_1 : \tau }{\Gamma \vdash e_0\ e_1 : \tau'}&[\texttt{App}]\\ \\ \displaystyle\frac{\Gamma,\;x:\tau\vdash e:\tau'}{\Gamma \vdash \lambda\ x\ .\ e : \tau \rightarrow \tau'}&[\texttt{Abs}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\tau \quad\quad \Gamma,\,x:\bar{\Gamma}(\tau) \vdash e_1:\tau'}{\Gamma \vdash \texttt{let}\ x = e_0\ \texttt{in}\ e_1 : \tau'}&[\texttt{Let}] \end{array}$
Generalization
$\bar{\Gamma}(\tau) = \forall\ \hat{\alpha}\ .\ \tau \quad\quad \hat{\alpha} = \textrm{free}(\tau) - \textrm{free}(\Gamma)$

A contemporary treatment of HM uses a purely syntax-directed rule system due to Clement^[5] as an intermediate step. In this system, the specialization is located directly after the original $[\texttt{Var}]$ rule and merged into it, while the generalization becomes part of the $[\texttt{Let}]$ rule. There, the generalization is also fixed to always produce the most general type by introducing the function $\bar{\Gamma}(\tau)$ , which qualifies all monotype variables not bound in $\Gamma$ .
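The generalization function $\bar{\Gamma}(\tau)$ is short enough to be written out. The sketch below assumes the context to be a list of (variable, type) pairs and uses hypothetical class and function names; the variables are sorted only to make the result deterministic.

```python
from dataclasses import dataclass

# Hypothetical encoding, for illustration only.

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TCon:
    name: str
    args: tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: tuple
    body: object

def free(t) -> set:
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TCon):
        return set().union(*(free(a) for a in t.args))
    return free(t.body) - set(t.vars)

def free_ctx(ctx) -> set:
    return set().union(*(free(sigma) for _, sigma in ctx))

def generalize(ctx, tau):
    """Qualify every variable that is free in tau but not free in the context."""
    return Forall(tuple(sorted(free(tau) - free_ctx(ctx))), tau)

alpha, beta = TVar("alpha"), TVar("beta")
arrow = TCon("->", (beta, alpha))                 # β → α

# Under Γ = x : α only β may be qualified, as for foo in Example 1.
print(generalize([("x", alpha)], arrow).vars)     # ('beta',)
# Under the empty context both variables are qualified.
print(generalize([], arrow).vars)                 # ('alpha', 'beta')
```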

Formally, to validate that this new rule system $\vdash_S$ is equivalent to the original $\vdash_D$ , one has to show that $\Gamma \vdash_D\ e:\sigma \Leftrightarrow \Gamma \vdash_S\ e:\sigma$ , which splits into two sub-proofs:

$\Gamma \vdash_D\ e:\sigma \Leftarrow \Gamma \vdash_S\ e:\sigma$ (Consistency)
$\Gamma \vdash_D\ e:\sigma \Rightarrow \Gamma \vdash_S\ e:\sigma$ (Completeness)

While consistency can be seen by decomposing the rules $[\texttt{Let}]$ and $[\texttt{Var}]$ of $\vdash_S$ into proofs in $\vdash_D$ , it is readily seen that $\vdash_S$ is incomplete, as one cannot show $\lambda\ x.x:\forall\alpha.\alpha\rightarrow\alpha$ in $\vdash_S$ , for instance, but only $\lambda\ x.x:\alpha\rightarrow\alpha$ . An only slightly weaker version of completeness is provable,^[6] though, namely

$\Gamma \vdash_D\ e:\sigma \Rightarrow \Gamma \vdash_S\ e:\tau \wedge \bar{\Gamma}(\tau)\sqsubseteq\sigma$

implying that one can derive the principal type for an expression in $\vdash_S$ , provided the result of the proof is generalized at the end.

Comparing $\vdash_D$ and $\vdash_S$ , note that only monotypes now appear in the judgments of all rules.

Degrees of freedom instantiating the rules

Within the rules themselves, assuming a given expression, one is free to pick the instances for (rule) variables not occurring in this expression. These are the instances for the type variables in the rules. Working towards finding the most general type, this choice can be limited to picking suitable types for $\tau$ in $[\texttt{Var}]$ and $[\texttt{Abs}]$ . The decision on a suitable choice cannot be made locally; its quality becomes apparent in the premises of $[\texttt{App}]$ , the only rule in which two different types, namely the function's formal and actual parameter types, have to come together as one.

Therefore, the general strategy for finding a proof would be to make the most general assumption ( $\alpha \not\in free(\Gamma)$ ) for $\tau$ in $[\texttt{Abs}]$ and to refine this and the choice to be made in $[\texttt{Var}]$ until all side conditions imposed by the $[\texttt{App}]$ rules are finally met. Fortunately, no trial and error is needed, since an effective method is known to compute all the choices: Robinson's unification in combination with the so-called union-find algorithm.

To briefly summarize the union-find algorithm: given the set of all types in a proof, it allows them to be grouped into equivalence classes by means of a $\texttt{union}$ procedure and a representative to be picked for each such class using a $\texttt{find}$ procedure. Emphasizing the word procedure in the sense of side effect, we are clearly leaving the realm of logic to prepare an effective algorithm. The representative of a $\texttt{union}(a,b)$ is determined such that if both $a$ and $b$ are type variables, the representative is arbitrarily one of them, while in uniting a variable and a term, the term becomes the representative. Assuming an implementation of union-find at hand, one can formulate the unification of two monotypes as follows:

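unify(ta,tb):
  # work on the representatives of the two equivalence classes
  ta = find(ta)
  tb = find(tb)
  if both ta,tb are terms of the form D p1..pn with identical D,n then
    # same type function: the parameters have to agree pairwise
    unify(ta[i],tb[i]) for each corresponding ith parameter
  else if at least one of ta,tb is a type variable then
    # a term, if present, becomes the representative
    union(ta,tb)
  else
    # two terms with different D or n
    error 'types do not match'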
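
The pseudocode can be turned into a small runnable sketch. The version below is one possible reading, not a reference implementation: type variables carry a `parent` link that serves as the union-find structure, the names `TVar`, `TCon` and `resolve` are assumptions, and, like the pseudocode, it omits the occurs check.

```python
# Hypothetical names; a sketch of unification on top of union-find.

class TVar:
    def __init__(self, name):
        self.name = name
        self.parent = None          # None means: this node is a representative

class TCon:
    def __init__(self, name, *args):
        self.name, self.args = name, args

def find(t):
    """Return the representative of t, compressing the path on the way."""
    if isinstance(t, TVar) and t.parent is not None:
        t.parent = find(t.parent)
        return t.parent
    return t

def union(a, b):
    """Merge two classes given by their representatives; a term always wins."""
    if isinstance(a, TVar):
        if a is not b:
            a.parent = b
    else:
        b.parent = a

def unify(ta, tb):
    ta, tb = find(ta), find(tb)
    if isinstance(ta, TCon) and isinstance(tb, TCon):
        if ta.name != tb.name or len(ta.args) != len(tb.args):
            raise TypeError("types do not match")
        for a, b in zip(ta.args, tb.args):
            unify(a, b)
    else:
        union(ta, tb)               # at least one side is a type variable

def resolve(t):
    """Print a type in prefix form with all variables replaced by representatives."""
    t = find(t)
    if isinstance(t, TVar) or not t.args:
        return t.name
    return "(" + " ".join([t.name] + [resolve(a) for a in t.args]) + ")"

a, b = TVar("a"), TVar("b")
unify(TCon("->", a, TCon("int")), TCon("->", TCon("string"), b))
print(resolve(a), resolve(b))       # string int
```

After the call, no substitution has to be applied anywhere: every later `find` on `a` or `b` simply yields the term the variable was united with.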

Algorithm W

Algorithm W
$\begin{array}{cl} \displaystyle\frac{x:\sigma \in \Gamma \quad \tau = inst(\sigma)}{\Gamma \vdash x:\tau}&[\texttt{Var}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\tau_0 \quad \Gamma \vdash e_1 : \tau_1 \quad \tau'=newvar \quad unify(\tau_0,\ \tau_1 \rightarrow \tau') }{\Gamma \vdash e_0\ e_1 : \tau'}&[\texttt{App}]\\ \\ \displaystyle\frac{\tau = newvar \quad \Gamma,\;x:\tau\vdash e:\tau'}{\Gamma \vdash \lambda\ x\ .\ e : \tau \rightarrow \tau'}&[\texttt{Abs}]\\ \\ \displaystyle\frac{\Gamma \vdash e_0:\tau \quad\quad \Gamma,\,x:\bar{\Gamma}(\tau) \vdash e_1:\tau'}{\Gamma \vdash \texttt{let}\ x = e_0\ \texttt{in}\ e_1 : \tau'}&[\texttt{Let}] \end{array}$

The presentation of Algorithm W shown in the side box not only deviates significantly from the original,^[4] but is also a gross abuse of the notation of logical rules, since it includes side effects. It is legitimized here because it allows a direct comparison with $\vdash_S$ while expressing an efficient implementation at the same time. The rules now specify a procedure with parameters $\Gamma, e$ yielding $\tau$ in the conclusion, where the execution of the premises proceeds from left to right. As an alternative to a procedure, it could be viewed as an attribution of the expression.

The procedure $inst(\sigma)$ specializes the polytype $\sigma$ by copying the term and consistently replacing the bound type variables by new monotype variables. $newvar$ produces a new monotype variable. Likewise, $\bar{\Gamma}(\tau)$ has to copy the type, introducing new variables for the qualification, to avoid unwanted captures. Overall, the algorithm now proceeds by always making the most general choice, leaving the specialization to the unification, which by itself produces the most general result. As noted above, the final result $\tau$ has to be generalized to $\bar{\Gamma}(\tau)$ at the end to obtain the most general type for a given expression.
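The four rules, together with $inst$, $newvar$, unification and generalization, fit into a compact program. The following Python sketch is an illustration under several assumptions: expressions are plain tuples, the context is a dictionary, all names are hypothetical, and an occurs check is included in `unify`. It returns the monotype $\tau$ and leaves the final generalization to the caller.

```python
# Hypothetical names; expressions are tuples:
# ("var", x), ("app", e0, e1), ("abs", x, e), ("let", x, e0, e1).

class TVar:
    def __init__(self):
        self.parent = None                  # union-find link

class TCon:
    def __init__(self, name, *args):
        self.name, self.args = name, args

class Forall:
    def __init__(self, vars, body):
        self.vars, self.body = vars, body

def fn(a, b):
    return TCon("->", a, b)

def find(t):
    if isinstance(t, TVar) and t.parent is not None:
        t.parent = find(t.parent)           # path compression
        return t.parent
    return t

def free(t):
    t = find(t)
    if isinstance(t, TVar):
        return {t}
    if isinstance(t, TCon):
        return set().union(*(free(a) for a in t.args))
    return free(t.body) - set(t.vars)

def unify(a, b):
    a, b = find(a), find(b)
    if a is b:
        return
    if isinstance(a, TVar) or isinstance(b, TVar):
        var, other = (a, b) if isinstance(a, TVar) else (b, a)
        if var in free(other):              # occurs check
            raise TypeError("recursive type")
        var.parent = other
    elif a.name != b.name or len(a.args) != len(b.args):
        raise TypeError("types do not match")
    else:
        for x, y in zip(a.args, b.args):
            unify(x, y)

def inst(sigma):
    """Copy a polytype, replacing its bound variables by new ones."""
    if not isinstance(sigma, Forall):
        return sigma
    fresh = {v: TVar() for v in sigma.vars}
    def copy(t):
        t = find(t)
        if isinstance(t, TVar):
            return fresh.get(t, t)
        return TCon(t.name, *(copy(a) for a in t.args))
    return copy(sigma.body)

def generalize(ctx, tau):
    ctx_free = set().union(*(free(s) for s in ctx.values()))
    return Forall(tuple(free(tau) - ctx_free), tau)

def infer(ctx, e):
    tag = e[0]
    if tag == "var":                        # [Var]
        return inst(ctx[e[1]])
    if tag == "app":                        # [App]
        t0, t1, res = infer(ctx, e[1]), infer(ctx, e[2]), TVar()
        unify(t0, fn(t1, res))
        return res
    if tag == "abs":                        # [Abs]
        arg = TVar()
        return fn(arg, infer({**ctx, e[1]: arg}, e[2]))
    if tag == "let":                        # [Let]
        t0 = infer(ctx, e[2])
        return infer({**ctx, e[1]: generalize(ctx, t0)}, e[3])
    raise ValueError("unknown expression")

def show(t, names=None):
    names = {} if names is None else names
    t = find(t)
    if isinstance(t, TVar):
        return names.setdefault(t, chr(ord("a") + len(names)))
    if t.name == "->":
        return f"({show(t.args[0], names)} -> {show(t.args[1], names)})"
    return " ".join([t.name] + [show(a, names) for a in t.args])

identity = ("abs", "x", ("var", "x"))
print(show(infer({}, identity)))            # (a -> a)
```

Constants such as booleans or pairs are not part of the expression syntax; in this sketch they would be supplied through the initial context, for example a variable `pair` with a polytype over two qualified variables.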

Because the procedures used in the algorithm have nearly O(1) cost, the overall cost of the algorithm is, for programs as they occur in practice, close to linear in the size of the expression for which a type is to be inferred. (In the worst case, deeply nested let-expressions can make the inferred types grow exponentially.) This is in strong contrast to many other attempts to derive type inference algorithms, which often turned out to be NP-hard, if not undecidable with respect to termination. Thus HM in practice performs as well as the best fully informed type-checking algorithms can. Type-checking here means that an algorithm does not have to find a proof, but only to validate a given one.

The efficiency is slightly lowered for two reasons. First, the binding of type variables in the context has to be maintained to allow the computation of $\bar{\Gamma}(\tau)$ . Second, an occurs check has to be made to prevent recursive types from being built during $union(\alpha,\tau)$ . An example of such a case is $\lambda\ x.(x\ x)$ , for which no type can be derived using HM. Because in practice types are only small terms and do not build up expanding structures, one can treat them in complexity analysis as being smaller than some constant, retaining O(1) costs.

Original presentation of Algorithm W

In the original paper,^[4] the algorithm is presented more formally, using a substitution style instead of the side effects of the method above. In the latter form, the side effect invisibly takes care of all places where a type variable is used. Explicitly using substitutions not only makes the algorithm hard to read, because the side effect occurs virtually everywhere, but also gives the false impression that the method might be costly. When the algorithm is implemented by purely functional means, or when it is to be proved basically equivalent to the deduction system, full explicitness is of course needed, and the original formulation is a necessary refinement.

Further topics

Recursive definitions

A central property of the lambda calculus is that recursive definitions are not primitive, but can instead be expressed by a fixed point combinator. The original paper^[4] notes that recursion can be realized by giving this combinator the type $fix:\forall\alpha.(\alpha\rightarrow\alpha)\rightarrow\alpha$ . A possible recursive definition could thus be formulated as $\texttt{rec}\ v = e_1\ \texttt{in}\ e_2\ ::=\texttt{let}\ v = fix(\lambda v.e_1)\ \texttt{in}\ e_2$ .
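How a definition by $fix$ behaves at run time can be illustrated in a few lines. Python is untyped and evaluates strictly, so the sketch below only shows the operational side of the translation, not its typing; the extra lambda in `fix` is an assumption needed for strict evaluation, and `factorial` is a hypothetical example.

```python
def fix(f):
    # Variant for strict evaluation: the wrapping lambda delays the unfolding
    # of fix(f) until the recursive function is actually applied.
    return lambda x: f(fix(f))(x)

# rec factorial = (λn. ...) written as  fix(λfactorial. λn. ...)
factorial = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(factorial(5))    # 120
```

In the typing $fix:\forall\alpha.(\alpha\rightarrow\alpha)\rightarrow\alpha$ , the variable $\alpha$ would here be specialized to the type of the recursive function itself, e.g. $int\rightarrow int$.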

Alternatively, an extension of the expression syntax and an extra typing rule are possible:

$\displaystyle\frac{ \Gamma, \Gamma' \vdash e_1:\tau_1\quad\dots\quad\Gamma, \Gamma' \vdash e_n:\tau_n\quad\Gamma, \Gamma'' \vdash e:\tau }{ \Gamma\ \vdash\ \texttt{rec}\ v_1 = e_1\ \texttt{and}\ \dots\ \texttt{and}\ v_n = e_n\ \texttt{in}\ e:\tau }\quad[\texttt{Rec}]$

where

$\Gamma' = v_1:\tau_1,\ \dots,\ v_n:\tau_n$
$\Gamma'' = v_1:\bar\Gamma(\ \tau_1\ ),\ \dots,\ v_n:\bar\Gamma(\ \tau_n\ )$

basically merging $[\texttt{Abs}]$ and $[\texttt{Let}]$ , while including the recursively defined variables in monotype positions where they occur to the left of the $\texttt{in}$ , but as polytypes to the right of it. This formulation perhaps best summarizes the essence of let-polymorphism.

Notes

[note 1] Polytypes are called "type schemes" in the original article.
[note 2] The parametric types $D\ \tau\dots\tau$ were not present in the original paper on HM and are not needed to present the method. None of the inference rules below take care of or even mention them. The same holds for the non-parametric "primitive types" in said paper. All the machinery for polymorphic type inference can be defined without them. They have been included here for the sake of the examples, but also because the nature of HM is all about parametric types. This comes from the function type $\tau\rightarrow\tau$ , hard-wired in the inference rules below, which already has two parameters and has been presented here as only a special case.

References

^ Hindley, R. (1969). The Principal Type-Scheme of an Object in Combinatory Logic. Transactions of the American Mathematical Society, Vol. 146, pp. 29–60.
^ Milner, R. (1978). A Theory of Type Polymorphism in Programming. Journal of Computer and System Sciences (JCSS), Vol. 17, pp. 348–374.
^ Damas, L. (1985). Type Assignment in Programming Languages. PhD thesis, University of Edinburgh (CST-33-85).
^ Damas, L.; Milner, R. (1982). Principal type-schemes for functional programs. 9th Symposium on Principles of Programming Languages (POPL'82), pp. 207–212, ACM.
^ Clement (1987). The Natural Dynamic Semantics of Mini-Standard ML. TAPSOFT'87, Vol. 2. LNCS, Vol. 250, pp. 67–81.
^ Vaughan, J. A proof of correctness for the Hindley–Milner type inference algorithm.

Wikimedia Foundation. 2010.
