Shape analysis (software)

Shape analysis (software)

Shape analysis is a static code analysis technique that discovers and verifies properties of linked, dynamically allocated data structures in (usually imperative) computer programs. It is typically used at compile time to find software bugs or to verify high-level correctness properties of programs. In Java programs, it can be used to ensure that a sort method correctly sorts a list. For C programs, it might look for places where a block of memory is not properly freed. Although shape analyses are very powerful, they usually take a long time to run. For this reason, they have not seen widespread acceptance outside of universities and research labs (where they are only used experimentally).


Shape analysis has been applied to a variety of problems:
* Finding memory leaks, including Java-style leaks where a pointer to an unused object is not nulled out
* Discovering cases where a block of memory is freed more than once (in C)
* Finding dereferences of "dangling pointers" (pointers to freed memory in C)
* Finding array out-of-bounds errors
* Checking type-state properties (for example, ensuring that a file is open() before it is read())
* Ensuring that a method to reverse a linked list does not introduce cycles into the list
* Verifying that a sort method returns a result that is in sorted order


Shape analysis is a form of pointer analysis, although it is more precise than typical pointer analyses. Pointer analyses attempt to determine the set of objects to which a pointer can point (called the points-to set of the pointer). Unfortunately, these analyses are necessarily approximate (since a perfectly precise static analysis could solve the halting problem). Shape analyses can determine smaller (more precise) points-to sets.

Consider the following simple C++ program.

Item *items [10] ;for (int i=0; i<10; i++) { items [i] = new Item(...); // line [1] }process_items(items); // line [2] for (int i=0; i<10; i++) { delete items [i] ; // line [3] }

This program builds an array of objects, processes them in some arbitrary way, and then deletes them. Assuming that the process_items function is free of errors, it is clear that the program is safe: it never references freed memory, and it deletes all the objects that it has constructed.

Unfortunately, most pointer analyses have difficulty analyzing this program precisely. In order to determine points-to sets, a pointer analysis must be able to "name" a program's objects. In general, programs can allocate an unbounded number of objects; but in order to terminate, a pointer analysis can only use a finite set of names. A typical approximation is to give all the objects allocated on a given line of the program the same name. In the example above, all the objects constructed at line [1] would have the same name. Therefore, when the delete statement is analyzed for the first time, the analysis determines that one of the objects named [1] is being deleted. The second time the statement is analyzed (since it is in a loop) the analysis warns of a possible error: since it is unable to distinguish the objects in the array, it may be that the second delete is deleting the same object as the first delete. This warning is spurious, and the goal of shape analysis is to avoid such warnings.

ummarization and materialization

Shape analysis overcomes the problems of pointer analysis by using a more flexible naming system for objects. Rather than giving an object the same name throughout a program, objects can change names depending on the program's actions. Sometimes, several distinct objects with different names may be "summarized," or merged, so that they have the same name. Then, when a summarized object is about to be used by the program, it can be "materialized"--that is, the summarized object is split into two objects with distinct names, one representing a single object and the other representing the remaining summarized objects. The basic heuristic of shape analysis is that objects that are being used by the program are represented using unique materialized objects, while objects not in use are summarized.

The array of objects in the example above is summarized in separate ways at lines [1] , [2] , and [3] . At line [1] , the array has been only partly constructed. The array elements 0..i-1 contain constructed objects. The array element i is about to be constructed, and the following elements are uninitialized. A shape analysis can approximate this situation using a summary for the first set of elements, a materialized memory location for element i, and a summary for the remaining uninitialized locations, as follows:

Notice that in this case, the analysis recognizes that the pointer at index i has not been deleted yet. Therefore, it doesn't warn of a double deletion.

ee also

* Alias analysis
* Escape analysis
* Pointer analysis


*cite journal
author = Mooly Sagiv
authorlink =
coauthors = Thomas Reps, Reinhard Wilhelm
title = Parametric shape analysis via 3-valued logic
journal = ACM Transactions on Programming Languages and Systems (TOPLAS)
volume = 24
issue = 3
pages = 217&ndash;298
publisher = ACM
date = May 2002
url =
doi = 10.1145/292540.292552
id =
accessdate =

Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • Software tools for molecular microscopy — There are a large number of software tools or software applications that have been specifically developed for the field sometimes referred to as molecular microscopy or cryo electron microscopy or cryoEM. Several special issues of the Journal of… …   Wikipedia

  • Static program analysis — This article is about certain software quality assessment methods. For the statistical method, see Static analysis. Static program analysis (also Static code analysis or SCA) is the analysis of computer software that is performed without actually …   Wikipedia

  • Static code analysis — is the analysis of computer software that is performed without actually executing programs built from that software (analysis performed on executing programs is known as dynamic analysis). In most cases the analysis is performed on some version… …   Wikipedia

  • Pointer analysis — In computer science pointer analysis, or points to analysis, is a static code analysis technique that establishes which pointers, or heap references, can point to which variables or storage locations. It is often a component of more complex… …   Wikipedia

  • Shape context — is the term given by Serge Belongie and Jitendra Malik to the feature descriptor they first proposed in their paper Matching with Shape Contexts in 2000cite conference author = S. Belongie and J. Malik title = Matching with Shape Contexts url =… …   Wikipedia

  • Escape analysis — In programming language compiler optimization theory, escape analysis is a method for determining the dynamic scope of pointers. It is related to pointer analysis and shape analysis.When a variable (or an object) is allocated in a subroutine, a… …   Wikipedia

  • Alias analysis — is a technique in compiler theory, used to determine if a storage location may be accessed in more than one way. Two pointers are said to be aliased if they point to the same location. Alias analysis techniques are usually classified by flow… …   Wikipedia

  • analysis — /euh nal euh sis/, n., pl. analyses / seez /. 1. the separating of any material or abstract entity into its constituent elements (opposed to synthesis). 2. this process as a method of studying the nature of something or of determining its… …   Universalium

  • Mass spectrometry software — is software used for data acquisition, analysis, or representation in mass spectrometry. Contents 1 MS/MS peptide identification 1.1 Database search algorithms 1.1.1 SEQUEST 1.1.2 …   Wikipedia

  • List of numerical analysis topics — This is a list of numerical analysis topics, by Wikipedia page. Contents 1 General 2 Error 3 Elementary and special functions 4 Numerical linear algebra …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”