Abstract interpretation

Abstract interpretation

In computer science, abstract interpretation is a theory of sound approximation of the semantics of computer programs, based on monotonic functions over ordered sets, especially lattices. It can be viewed as a partial execution of a computer program which gains information about its semantics (e.g. control structure, flow of information) without performing all the calculations.

Its main concrete application is formal static analysis, the automatic extraction of information about the possible executions of computer programs; such analyses have two main usages:
* inside compilers, to analyse programs in order to decide whether certain optimisations or transformations are applicable;
* for debugging or even the certification of programs against classes of bugs.

Abstract interpretation was formalized by Patrick Cousot and Radhia Cousot.

Intuition

This article now illustrates what abstract interpretation means, by using real-world, non-computing, examples.

Consider the people in a conference room. To prove that some persons were not present, one concrete method is to look up a list of the names and some identifiers unique to them, like a social security number in the United States, of all participants. Since no two persons have the same number, it is possible to prove or disprove the presence of a participant simply by looking up his or her number in the list.

However it is possible that only the names of attendees were registered. If the name of a person is not found in the list, we may safely conclude that that person was not present; but if it is, we cannot conclude definitely without further enquiries, due to the possibility of homonyms. Note that this imprecise information will still be adequate for most purposes, because homonyms are rare in practice. However, in all rigor, we cannot say for sure that somebody was present in the room; all we can say is that he or she was "possibly" here. If the person we are looking up is a criminal, we will issue an "alarm"; but there is of course the possibility of issuing a "false alarm". Similar phenomena will occur in the analysis of programs.

If we are only interested in some specific information, say, "was there a person of age "n" in the room", keeping a list of all names and dates of births is unnecessary. We may safely and without loss of precision restrict ourselves to keeping a list of the participants' ages. If this is already too much to handle, we might keep only the minimal "m" and maximal "M" ages. If the question is about an age strictly lower than "m" or strictly higher than "M", then we may safely respond that no such participant was present. Otherwise, we may only be able to say that we do not know.

In the case of computing, concrete, precise information is in general not computable within finite time and memory (see Rice's theorem and the halting problem). Abstraction is used to simplify problems up to problems amenable to automatic solutions. One crucial requirement is to diminish precision so as to make problems manageable while still retaining enough precision for answering the important questions (such as "may the program crash?").

Abstract interpretation of computer programs

Given a programming or specification language, abstract interpretation consists in giving several semantics linked by relations of abstraction. A semantics is a mathematical characterization of a possible behavior of the program. The most precise semantics, describing very closely the actual execution of the program, are called the concrete semantics. For instance, the concrete semantics of an imperative programming language may associate to each program the set of execution traces it may produce – an execution trace being a sequence of possible consecutive states of the execution of the program; a state typically consists of the value of the program counter and the memory locations (globals, stack and heap). More abstract semantics are then derived; for instance, one may consider only the set of reachable states in the executions (which amounts to considering the last states in finite traces).

The goal of static analysis is to derive a computable semantic interpretation at some point. For instance, one may choose to represent the state of a program manipulating integer variables by forgetting the actual values of the variables and only keeping their signs (+, - or 0). For some elementary operations, such as multiplication, such an abstraction does not lose any precision: to get the sign of a product, it is sufficient to know the sign of the operands. For some other operations, the abstraction may lose precision: for instance, it is impossible to know the sign of a sum whose operands are respectively positive and negative.

Sometimes a loss of precision is necessary to make the semantics decidable (see Rice's theorem, halting problem). In general, there is a compromise to be made between the precision of the analysis and its decidability (computability), or tractability (complexity).

In practice the abstractions that are defined are tailored to both the program properties one desires to analyse, and to the set of target programs.

Formalization

Let "L" be an ordered set, called a concrete set and let "L"′ be another ordered set, called an abstract set. These two sets are related to each other by defining total functions that map elements from one to the other.

A function α is called an abstraction function if it maps an element "x" in the concrete set "L" to an element α("x") in the abstract set "L"′. That is, element α("x") in "L"′ is the abstraction of "x" in "L".

A function γ is called a concretization function if it maps an element "x"′ in the abstract set "L"′ to an element γ("x"′) in the concrete set "L". That is, element γ("x"′) in "L" is a concretization of "x"′ in "L"′.

Let "L"1, "L"2, "L"′1 and "L"′2 be ordered sets. The concrete semantics "f" is a monotonic function from "L"1 to "L"2. A function "f"′ from "L"′1 to "L"′2 is said to be a valid abstraction of "f" if for all "x"′ in "L"′1, ("f" ∘ γ)("x"′) ≤ (γ ∘ "f"′)("x"′).

Program semantics are generally described using fixed points in the presence of loops or recursive procedures. Let us suppose that "L" is a complete lattice and let "f" be a monotonic function from "L" into "L". Then, any "x"′ such that "f"′("x"′) ≤ "x"′ is an abstraction of the least fixed-point of "f", which exists, according to the Knaster-Tarski theorem.

The difficulty is now to obtain such an "x"′. If L' is of finite height, or at least verifies the "ascending chain condition" (all ascending sequences are ultimately stationary), then such an "x"′ may be obtained as the stationary limit of the ascending sequence "x"′"n" defined by induction as follows: "x"′0=⊥ (the least element of "L"′) and "x"′"n"+1="f"′("x"′"n").

In other cases, it is still possible to obtain such an "x"′ through a widening operator ∇: for all "x" and "y", "x" ∇ "y" should be greater or equal than both "x" and "y", and for any sequence "y"′"n", the sequence defined by "x"′0=⊥ and "x"′"n"+1="x"′"n" ∇ "y"′"n" is ultimately stationary. We can then take "y"′"n"="f"′("x"′"n").

In some cases, it is possible to define abstractions using Galois connections (α, γ) where α is from "L" to "L"′ and γ is from "L"′ to "L". This supposes the existence of best abstractions, which is not necessarily the case. For instance, if we abstract sets of couples ("x","y") of real numbers by enclosing convex polyhedra, there is no optimal abstraction to the disc defined by "x"2+"y"2 ≤ 1.

Examples of abstract domains

One can assign to each variable "x" available at a given program point an interval [lx,hx] . A state assigning the value "v"("x") to variable "x" will be a concretization of these intervals if for all "x", then "v"("x") is in [lx,hx] . From the intervals [lx,hx] and [ly,hy] for variables "x" and "y", one can easily obtain intervals for "x"+"y" ( [lx+ly,hx+hy] ) and for "x"-"y" ( [lx-hy,hx-ly] ); note that these are "exact" abstractions, since the set of possible outcomes for, say, "x"+"y", is precisely the interval ( [lx+ly,hx+hy] ). More complex formulas can be derived for multiplication, division, etc., yielding so-called interval arithmetics.

Let us now consider the following very simple program:

y = x;z = x - y;
With reasonable arithmetic types, the result for z should be zero. But if we do interval arithmetics starting from x in [0,1] , one gets z in [-1,1] . While each of the operations taken individually was exactly abstracted, their composition isn't.

The problem is evident: we did not keep track of the equality relationship between x and y; actually, this domain of intervals does not take into account any relationships between variables, and is thus a non-relational domain. Non-relational domains tend to be fast and simple to implement, but imprecise.

Some examples of relational numerical abstract domains are:
* congruence relations on integers
* convex polyhedra – with some high computational costs
* "octagons"
* difference-bound matrices
* linear equalitiesand combinations thereof.

When one chooses an abstract domain, one typically has to strike a balance between keeping fine-grained relationships, and high computational costs.

Tools

* [http://www.astree.ens.fr ASTRÉE]
* [http://www.polyspace.com PolySpace Technologies]
* [http://www.absint.com/pag PAG] and [http://www.program-analysis.com PAG/WWW]
* [http://spa-arrow.com Sparrow]
* [http://www.grammatech.com/products/codesonar/ CodeSonar]
* [http://www.coverity.com/html/prod_prevent.html Coverity Prevent]
* [http://www.klocwork.com/products/insight.asp Klocwork Insight]
* [http://www.automationsquare.com/plc-checker.html PLC Checker]

ee also

*DAEDALUS
*Standard interpretation
*Model checking
*Symbolic simulation

External links

* [http://santos.cis.ksu.edu/schmidt/Escuela03/home.html David Schmidt's lecture notes on abstract interpretation]
* [http://www.brics.dk/~mis/static.pdf Michael Schwarzbach's lecture notes on static analysis]


Wikimedia Foundation. 2010.

Игры ⚽ Поможем решить контрольную работу

Look at other dictionaries:

  • Interpretation — Distinguish from interpenetration. Interpreter, interpreting, interpretation can mean:The term interpretation may refer to: *Formal interpretation an abstract model. * Logical interpretation the assignment of meaning to the symbols used in formal …   Wikipedia

  • Abstract management — is the process of accepting and preparing abstracts for presentation at an academic conference. The process consists of either invited or proffered submissions of the abstract or summary of work. The abstract typically states the hypothesis,… …   Wikipedia

  • Interpretation of quantum mechanics — An interpretation of quantum mechanics is a statement which attempts to explain how quantum mechanics informs our understanding of nature. Although quantum mechanics has received thorough experimental testing, many of these experiments are open… …   Wikipedia

  • Abstract expressionism — Although the term abstract expressionism was first applied to American art in 1946 by the art critic Robert Coates, it had been first used in Germany in 1919 in the magazine Der Sturm , regarding German Expressionism. In the USA, Alfred Barr was… …   Wikipedia

  • Abstract labour and concrete labour — Part of a series on Marxism …   Wikipedia

  • INTERPRETATION — This article is arranged according to the following outline: definition of terms bible exegesis substance of bible exegesis in jewish creative interpretation and integrative interpretation …   Encyclopedia of Judaism

  • Interpretation (logic) — An interpretation is an assignment of meaning to the symbols of a formal language. Many formal languages used in mathematics, logic, and theoretical computer science are defined in solely syntactic terms, and as such do not have any meaning until …   Wikipedia

  • Abstract structure — An abstract structure in mathematics is a formal object that is defined by a set of laws, properties, and relationships in a way that is logically if not always historically independent of the structure of contingent experiences, for example,… …   Wikipedia

  • Abstract — Ein Abstract ist eine prägnante Inhaltsangabe, ein Abriss ohne Interpretation und Wertung einer wissenschaftlichen Arbeit. In DIN 1426 wird das (oder auch der) Abstract als Kurzreferat zur Inhaltsangabe beschrieben. Die Definition des American… …   Deutsch Wikipedia

  • Abstract State Machine — Eine abstrakte Zustandsmaschine (englisch Abstract State Machine (ASM), ehemals auch Evolving Algebra (EVA) genannt), ist in der Informatik ein Modell zur formalen, operationellen Beschreibung von Algorithmen. Anders als bei endlichen Automaten,… …   Deutsch Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”