Formal concept analysis

Formal concept analysis

Formal concept analysis is a principled way of automatically deriving an ontology from a collection of objects and their properties. The term was introduced by Rudolf Wille in 1984, and builds on applied lattice and order theory that was developed by Birkhoff and others in the 1930s.

Contents

Intuitive description

Formal concept analysis refers to both an unsupervised machine learning technique and, more broadly, a method of data analysis. The approach takes as input a matrix specifying a set of objects and the properties thereof, called attributes, and finds both all the "natural" clusters of attributes and all the "natural" clusters of objects in the input data, where

  • a "natural" object cluster is the set of all objects that share a common subset of attributes, and
  • a "natural" property cluster is the set of all attributes shared by one of the natural object clusters.

Natural property clusters correspond one-for-one with natural object clusters, and a concept is a pair containing both a natural property cluster and its corresponding natural object cluster. The family of these concepts obeys the mathematical axioms defining a lattice, and is called a concept lattice (in French this is called a Treillis de Galois because the relation between the sets of concepts and attributes is a Galois connection).

Note the strong parallel between "natural" property clusters and definitions in terms of individually necessary and jointly sufficient conditions, on one hand, and between "natural" object clusters and the extensions of such definitions, on the other. Provided that the input objects and input concepts provide a complete description of the world (never true in practice, but perhaps a reasonable approximation), then the set of attributes in each concept can be interpreted as a set of singly necessary and jointly sufficient conditions for defining the set of objects in the concept. Conversely, if a set of attributes is not identified as a concept in this framework, then those attributes are not singly necessary and jointly sufficient for defining any non-empty subset of objects in the world.

Example

A concept lattice for objects consisting of the integers from 1 to 10, and attributes composite (c), square (s), even (e), odd (o) and prime (p). The lattice is drawn as a Hasse diagram.

Consider O = {1,2,3,4,5,6,7,8,9,10}, and A = {composite, even, odd, prime, square}. The smallest concept including the number 3 is the one with objects {3,5,7}, and attributes {odd, prime}, for 3 has both of those attributes and {3,5,7} is the set of objects having that set of attributes. The largest concept involving the attribute of being square is the one with objects {1,4,9} and attributes {square}, for 1, 4 and 9 are all the square numbers and all three of them have that set of attributes. It can readily be seen that both of these example concepts satisfy the formal definitions below

The full set of concepts for these objects and attributes is shown in the illustration. It includes a concept for each of the original attributes: the composite numbers, square numbers, even numbers, odd numbers, and prime numbers. Additionally it includes concepts for the even composite numbers, composite square numbers (that is, all square numbers except 1), even composite squares, odd squares, odd composite squares, even primes, and odd primes.

Contexts and concepts

A (formal) context consists of a set of objects O, a set of attributes A, and an indication of which objects have which attributes. Formally it can be regarded as a bipartite graph I ⊆ O × A.

composite even odd prime square
1
2
3
4
5
6
7
8
9
10

A (formal) concept for a context is defined to be a pair (Oi, Ai) such that

  1. OiO
  2. AiA
  3. every object in Oi has every attribute in Ai
  4. for every object in O that is not in Oi, there is an attribute in Ai that the object does not have
  5. for every attribute in A that is not in Ai, there is an object in Oi that does not have that attribute

Oi is called the extent of the concept, Ai the intent.

A context may be described as a table, with the objects corresponding to the rows of the table, the attributes corresponding to the columns of the table, and a Boolean value (in the example represented graphically as a checkmark) in cell (x, y) whenever object x has value y:

A concept, in this representation, forms a maximal subarray (not necessarily contiguous) such that all cells within the subarray are checked. For instance, the concept highlighted with a different background color in the example table is the one describing the odd prime numbers, and forms a 3 × 2 subarray in which all cells are checked.[1]

Concept lattice of a context

The concepts (Oi, Ai) defined above can be partially ordered by inclusion: if (Oi, Ai) and (Oj, Aj) are concepts, we define a partial order ≤ by saying that (Oi, Ai) ≤ (Oj, Aj) whenever OiOj. Equivalently, (Oi, Ai) ≤ (Oj, Aj) whenever AjAi.

Every pair of concepts in this partial order has a unique greatest lower bound (meet). The greatest lower bound of (Oi, Ai) and (Oj, Aj) is the concept with objects OiOj; it has as its attributes the union of Ai, Aj, and any additional attributes held by all objects in OiOj. Symmetrically, every pair of concepts in this partial order has a unique least upper bound (join). The least upper bound of (Oi, Ai) and (Oj, Aj) is the concept with attributes AiAj; it has as its objects the union of Oi, Oj, and any additional objects that have all attributes in AiAj.

These meet and join operations satisfy the axioms defining a lattice. In fact, by considering infinite meets and joins, analogously to the binary meets and joins defined above, one sees that this is a complete lattice. It may be viewed as the Dedekind–MacNeille completion of a partially ordered set of height two in which the elements of the partial order are the objects and attributes of A and in which two elements x and y satisfy x ≤ y exactly when x is an object that has attribute y.

Any finite lattice may be generated as the concept lattice for some context. For, let L be a finite lattice, and form a context in which the objects and the attributes both correspond to elements of L. In this context, let object x have attribute y exactly when x and y are ordered as xy in the lattice. Then, the concept lattice of this context is isomorphic to L itself.[2] This construction may be interpreted as forming the Dedekind–MacNeille completion of L, which is known to produce an isomorphic lattice from any finite lattice.

Concept algebra of a context

Modelling negation in a formal context is somewhat problematic because the complement (O\Oi, A\Ai) of a concept (Oi, Ai) is in general not a concept. However, since the concept lattice is complete one can consider the join (Oi, Ai)Δ of all concepts (Oj, Aj) that satisfy Oj ⊆ G\Oi; or dually the meet (Oi, Ai)


Wikimedia Foundation. 2010.

Игры ⚽ Нужен реферат?

Look at other dictionaries:

  • Concept mining — is an activity that results in the extraction of concepts from artifacts. Solutions to the task typically involve aspects of artificial intelligence and statistics, such as data mining and text mining. Because artifacts are typically a loosely… …   Wikipedia

  • Concept — For other uses, see Concept (disambiguation). A concept (substantive term: conception) is a cognitive unit of meaning an abstract idea or a mental symbol sometimes defined as a unit of knowledge, built from other units which act as a concept s… …   Wikipedia

  • Concept-oriented model — Example of a concept oriented model. The concept oriented model (COM) is a data model based on the following three principles: Duality principle postulates that any element consists of two parts, called identity and entity. Accordingly, data… …   Wikipedia

  • Analysis — (from Greek ἀνάλυσις , a breaking up ) is the process of breaking a complex topic or substance into smaller parts to gain a better understanding of it. The technique has been applied in the study of mathematics and logic since before Aristotle,… …   Wikipedia

  • analysis — /euh nal euh sis/, n., pl. analyses / seez /. 1. the separating of any material or abstract entity into its constituent elements (opposed to synthesis). 2. this process as a method of studying the nature of something or of determining its… …   Universalium

  • Concept learning — Concept learning, also known as category learning, concept attainment, and concept formation, is largely based on the works of the cognitive psychologist Jerome Bruner. Bruner, Goodnow, Austin (1967) defined concept attainment (or concept… …   Wikipedia

  • Boolean analysis — was introduced by Flament (1976). The goal of a Boolean analysis is to detect deterministic dependencies between the items of a questionnaire in observed response patterns. These deterministic dependencies have the form of logical formulas… …   Wikipedia

  • Structured data analysis (statistics) — Structured data analysis is the statistical data analysis of structured data. Either in the form of a priori structure such as multiple choice questionnaires or in situations with the need to search for structure that fits the given data, either… …   Wikipedia

  • Formal proof — See also: mathematical proof, proof theory, and axiomatic system A formal proof or derivation is a finite sequence of sentences (called well formed formulas in the case of a formal language) each of which is an axiom or follows from the… …   Wikipedia

  • Concept map — For concept maps in generic programming, see Concept (generic programming). Example concept map, created using the IHMC CmapTools computer program. A concept map is a diagram showing the relationships among concepts. It is a graphical tool for… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”