# Topological data analysis

Topological data analysis is a relatively new area of study with applications in areas such as data mining and computer vision. Its main problems are (1) how to infer high-dimensional structure from low-dimensional representations; and (2) how to assemble discrete points into global structure.

The human brain can easily extract global structure from representations in a strictly lower dimension: for example, we infer a 3D environment from the 2D image received by each eye. The inference of global structure also occurs when converting discrete data into continuous images: for example, dot-matrix printers and televisions communicate images via arrays of discrete points.

The main method used by topological data analysis has three steps:

(1) Replace a set of data points with a family of simplicial complexes, indexed by a proximity parameter.

(2) Analyse these topological complexes via algebraic topology, specifically via the theory of persistent homology.

(3) Encode the persistent homology of a data set in the form of a parameterized version of a Betti number, called a barcode.
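The three steps above can be sketched for the simplest case, dimension 0, where the barcode records how connected components merge as the proximity parameter grows. This is a minimal illustration, not a reference implementation: the function name `h0_barcode`, the sample points, and the convention that an edge enters the complex once its length is at most the parameter are all assumptions made for the example.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Dimension-0 barcode of a proximity filtration on `points`.

    Every point starts its own component at parameter 0. When the
    parameter reaches the length of an edge joining two different
    components, one of those components dies. One component never dies.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (the proximity parameter).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j       # two components merge
            bars.append((0.0, length))    # one bar ends at this parameter
    bars.append((0.0, math.inf))          # the surviving component
    return bars


# Hypothetical data: two well-separated clusters in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(points)
for birth, death in bars:
    print(f"[{birth}, {death})")
```

In this example the three short bars reflect merging within each cluster, the long finite bar reflects the gap between the two clusters, and the infinite bar is the single component that remains. Long bars are read as features of the data, short bars as noise.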

Point cloud data

Data is often represented as points in a Euclidean n-dimensional space E^n. The global "shape" of the data may provide information about the phenomena that the data represent.

One type of data set for which global features are certainly present is the so-called point cloud data coming from physical objects in 3D. For example, a laser can scan an object at a set of discrete points, and the cloud of such points can be used in a computer representation of the object. Point cloud data refers to any collection of points in E^n, or a (perhaps noisy) sample of points on a lower-dimensional subset.
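As a small illustration of a noisy sample on a lower-dimensional subset, the sketch below draws points near a circle (a one-dimensional subset) in the plane. The function name `noisy_circle`, its parameters, and the uniform radial noise model are assumptions chosen for the example, not something prescribed by the text.

```python
import math
import random


def noisy_circle(n_points, radius=1.0, noise=0.05, seed=0):
    """Sample points near a circle of the given radius in the plane.

    Each point is placed at a random angle, and its distance from the
    origin is perturbed by at most `noise` in either direction.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    cloud = []
    for _ in range(n_points):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = radius + rng.uniform(-noise, noise)
        cloud.append((r * math.cos(theta), r * math.sin(theta)))
    return cloud


cloud = noisy_circle(100)
print(len(cloud), "points; first point:", cloud[0])
```

The points carry no explicit record of the circle they came from; recovering that global shape from the coordinates alone is the kind of problem described above.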

For point clouds in low-dimensional spaces, computer graphics and statistics offer numerous approaches for inferring features based on planar projections. Topological data analysis is needed when the spaces are high-dimensional or too twisted to allow planar projections.

To convert a point cloud in a metric space into a global object, use the point cloud as the vertices of a graph whose edges are determined by proximity, then turn the graph into a simplicial complex and use algebraic topology to study it.
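One common way to turn a proximity graph into a simplicial complex is to add a simplex for every set of vertices that are pairwise joined by edges (a construction usually called the Vietoris-Rips complex). The text does not name a specific construction, so this choice is an assumption; the function name `rips_complex`, the parameter `epsilon`, and the sample square are hypothetical as well.

```python
import math
from itertools import combinations


def rips_complex(points, epsilon, max_dim=2):
    """Simplices of the proximity complex at scale `epsilon`.

    Vertices are indices into `points`. A set of k+1 vertices spans a
    k-simplex when every pair in the set is at distance <= epsilon.
    Returns a dict mapping dimension -> list of vertex tuples.
    This brute-force enumeration is only practical for small inputs.
    """
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= epsilon

    simplices = {0: [(i,) for i in range(n)]}
    for k in range(1, max_dim + 1):
        simplices[k] = [
            s for s in combinations(range(n), k + 1)
            if all(close(i, j) for i, j in combinations(s, 2))
        ]
    return simplices


# Hypothetical data: the four corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

small = rips_complex(square, epsilon=1.0)   # sides only
large = rips_complex(square, epsilon=1.5)   # sides and diagonals

print(len(small[1]), "edges,", len(small[2]), "triangles at epsilon = 1.0")
print(len(large[1]), "edges,", len(large[2]), "triangles at epsilon = 1.5")
```

At the smaller scale the complex is a hollow loop of four edges; at the larger scale the diagonals appear and the triangles fill the loop in. Tracking such changes across scales is what the persistent homology step is meant to capture.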

Persistent homology

See also

*Dimensionality reduction
*Data mining
*Computer vision
*Computational topology
*Digital topology
*Digital Morse theory
*Shape analysis
*Structured data analysis (statistics)

References

* [http://www.ams.org/bull/2008-45-01/S0273-0979-07-01191-3/S0273-0979-07-01191-3.pdf BARCODES: THE PERSISTENT TOPOLOGY OF DATA]
* [http://www.informatics.bangor.ac.uk/%7Etporter/TDA/TDA.html Topological Data Analysis: the algebraic topology of point data clouds?]

Wikimedia Foundation. 2010.
