# Dempster-Shafer theory

Dempster-Shafer theory

The Dempster-Shafer theory is a mathematical theory of evidenceShafer, Glenn; "A Mathematical Theory of Evidence", Princeton University Press, 1976, ISBN 0-608-02508-9] based on "belief functions" and "plausible reasoning", which is used to combine separate pieces of information (evidence) to calculate the probability of an event. The theory was developed by Arthur P. Dempster and Glenn Shafer.

Consider two possible gambles

The first gamble is that we bet on a head turning up when we toss a coin that is known to be fair. Now consider the second gamble, in which we bet on the outcome of a fight between the world's greatest boxer and the world's greatest wrestler. Assume we are fairly ignorant about martial arts and would have great difficulty making a choice of who to bet on.

Many people would feel more unsure about taking the second gamble, in which the probabilities are unknown, rather than the first gamble, in which the probabilities are easily seen to be one half for each outcome. Dempster-Shafer theory allows one to consider the confidence one has in the probabilities assigned to the various outcomes.

Formalism

Let "X" be the "universal set": the set of all states under consideration. The power set, $mathbb P\left(X\right),!$, is the set of all possible sub-sets of "X", including the empty set. For example, if:

:$X = left \left\{ a, b ight \right\} ,!$

then

:$mathbb P\left(X\right) = left \left\{ varnothing, left \left\{ a ight \right\}, left \left\{ b ight \right\}, X ight \right\}. ,!$

The elements of the power set can be taken to represent propositions that one might be interested in, by containing all and only the states in which this proposition is true.

The theory of evidence assigns a belief mass to each subset of the power set. Formally, a function $m: mathbb P\left(X\right) ightarrow \left[0,1\right]$, is called a "basic belief assignment" (BBA), when it verifies two axioms. First, the mass of the empty set is zero:

:$m\left(varnothing\right) = 0. ,!$

Second, the masses of the remaining members of the power set add up to a total of 1:

:$1 = sum_\left\{A in mathbb P\left(X\right)\right\} m\left(A\right). ,!$

The mass "m"("A") of a given member of the power set, "A", expresses the proportion of all relevant and available evidence that supports the claim that the actual state belongs to "A" but to no particular subset of "A". The value of "m"("A") pertains "only" to the set "A" and makes no additional claims about any subsets of "A", each of which has, by definition, its own mass.

From the mass assignments, the upper and lower bounds of a probability interval can be defined. This interval contains the precise probability of a set of interest (in the classical sense), and is bounded by two non-additive continuous measures called belief (or support) and plausibility:

:$operatorname\left\{bel\right\}\left(A\right) le P\left(A\right) le operatorname\left\{pl\right\}\left(A\right).,!$

The belief bel("A") for a set "A" is defined as the sum of all the masses of (not necessarily proper) subsets of the set of interest:

:$operatorname\left\{bel\right\}\left(A\right) = sum_\left\{B mid B subseteq A\right\} m\left(B\right).$

The plausibility pl("A") is the sum of all the masses of the sets "B" that intersect the set of interest "A":

:$operatorname\left\{pl\right\}\left(A\right) = sum_\left\{B mid B cap A e varnothing\right\} m\left(B\right).$

The two measures are related to each other as follows:

:$operatorname\left\{pl\right\}\left(A\right) = 1 - operatorname\left\{bel\right\}\left(overline\left\{A\right\}\right).,!$

It follows from the above that you need know but one of the three (mass, belief, or plausibility) to deduce the other two, though you may need to know the values for many sets in order to calculate one of the other values for a particular set.

Dempster's rule of combination

The problem we now face is how to combine two independent sets of mass assignments. The original combination rule, known as Dempster's rule of combination, is a generalization of Bayes' rule. This rule strongly emphasises the agreement between multiple sources and ignores "all" the conflicting evidence through a normalization factor. Use of that rule has come under serious criticism when significant conflict in the information is encountered.

Specifically, the combination (called the joint mass) is calculated from the two sets of masses $m_1,!$ and $m_2,!$ in the following manner:

:$m_\left\{1,2\right\}\left(varnothing\right) = 0 ,!$

:$m_\left\{1,2\right\}\left(A\right) = \left(m_1 oplus m_2\right) \left(A\right) = frac \left\{1\right\}\left\{1 - K\right\} sum_\left\{B cap C = A e varnothing\right\} m_1\left(B\right) m_2\left(C\right) ,!$

where

:$K = sum_\left\{B cap C = varnothing\right\} m_1\left(B\right) m_2\left(C\right). ,!$

$K,!$ is a measure of the amount of conflict between the two mass sets. The normalization factor, $1-K,!$, has the effect of completely ignoring conflict and attributing "any" mass associated with conflict to the null set. Consequently, this operation yields counterintuitive results in the face of significant conflict in certain contexts.

Discussion

Dempster-Shafer theory is a generalization of the Bayesian theory of subjective probability; whereas the latter requires probabilities for each question of interest, belief functions base degrees of belief (or confidence, or trust) for one question on the probabilities for a related question. These degrees of belief may or may not have the mathematical properties of probabilities; how much they differ depends on how closely the two questions are relatedShafer, Glenn; [http://www.glennshafer.com/assets/downloads/articles/article48.pdf "Dempster-Shafer theory"] , 2002] . Put another way, it is a way of representing epistemic plausibilities but it can yield answers which contradict those arrived at using probability theory.

Often used as a method of sensor fusion, Dempster-Shafer theory is based on two ideas: obtaining degrees of belief for one question from subjective probabilities for a related question, and Dempster's ruleDempster, Arthur P.; "A generalization of Bayesian inference", Journal of the Royal Statistical Society, Series B, Vol. 30, pp. 205-247, 1968] for combining such degrees of belief when they are based on independent items of evidence. In essence, the degree of belief in a proposition depends primarily upon the number of answers (to the related questions) containing the proposition, and the subjective probability of each answer. Also contributing are the rules of combination that reflect general assumptions about the data.

In this formalism a degree of belief (also referred to as a mass) is represented as a belief function rather than a Bayesian probability distribution. Probability values are assigned to "sets" of possibilities rather than single events: their appeal rests on the fact they naturally encode evidence in favor of propositions.

Dempster-Shafer theory assigns its masses to all of the subsets of the entities that comprise a system. Suppose for example that a system has five members, that is to say five independent states, exactly one of which is actual. If the original set is called S, $|S|=5$, then the set of all subsets &mdash;the "power set"&mdash; is called 2S. Since you can express each possible subset as a binary vector (describing whether any particular member is present or not by writing a “1” or a “0” for that member's slot), it can be seen that there are 25 subsets possible ($2^$

Although these are rather bad examples, as events of that kind would not be modeled as disjoint sets in the probability space, rather would the event "red or blue" be considered as the union of the events "red" and "blue", thereby (see the axioms of probability theory) p(red or white)>= p(white) = 0.25 and p(any)=1.Only the three disjoint events "Blue" "Red" and "White" would need to add up to 1. In fact one could model a probability measure on the space linear proportional to "plausibility" (normalized so that p(red)+p(white)+p(blue) = 1, and with the exception that still all probabilities are <=1)

Combining beliefs

Beliefs corresponding to independent pieces of information are combined using "Dempster's rule of combination" which is a generalisation of the special case of Bayes' theorem where events are independent. Note that the probability masses from propositions that contradict each other can also be used to obtain a measure of how much conflict there is in a system. This measure has been used as a criterion for clustering multiple pieces of seemingly conflicting evidence around competing hypotheses.

In addition, one of the computational advantages of the Dempster-Shafer framework is that priors and conditionals need not be specified, unlike Bayesian methods which often use a symmetry (minimax error) argument to assign prior probabilities to random variables (e.g. assigning 0.5 to binary values for which no information is available about which is more likely). However, any information contained in the missing priors and conditionals is not used in the Dempster-Shafer framework unless it can be obtained indirectly - and arguably is then available for calculation using Bayes equations.

Dempster-Shafer theory allows one to specify a degree of ignorance in this situation instead of being forced to supply prior probabilities which add to unity. This sort of situation, and whether there is a real distinction between "risk" and "ignorance", has been extensively discussed by statisticians and economists. See, for example, the contrasting views of Daniel Ellsberg, Howard Raiffa, Kenneth Arrow and Frank Knight.

Critics

Judea Pearl (1988a, chapter 9Pearl, J. (1988a), "Probabilistic Reasoning in Intelligent Systems," (Revised Second Printing) San Mateo, CA: Morgan Kaufmann.] ; 1988bPearl, J. (1988b) "On Probability Intervals," "International Journal of Approximate Reasoning," 2(3):211-216.] and 1990)Pearl, J. (1990) Reasoning with Belief Functions: An Analysis of Compatibility. "The International Journal of Approximate Reasoning," 4(5/6):363-389.] ; has argued that it is misleading to interpret belief functions as representing either "probabilities of an event," or "the confidence one has in the probabilities assigned to various outcomes," or "degrees of belief (orconfidence, or trust) in a proposition," or "degree ofignorance in a situation." Instead, belieffunctions represent the probability that a given propositionis "provable" from a set of other propositions, to whichprobabilities are assigned. Confusing probabilitiesof "truth" with probabilities of "provability" may lead tocounterintuitive results in reasoning tasks such as(1) representing incomplete knowledge, (2) belief-updatingand (3) evidence pooling. He further demonstrated that, ifpartial knowledge is encoded and updated by belieffunction methods, the resulting beliefscannot serve as a basis for rational decisions.

Kłopotek and Wierzchoń: M.A. Kłopotek, S.T. Wierzchoń: A New Qualitative Rough-Set Approach to Modeling Belief Functions. [in:] L. Polkowski, A, Skowron eds: Rough Sets And Current Trends In Computing. Proc. 1st International Conference RSCTC'98, Warsaw, June 22 - 26 1998, Lecture Notes in Artificial Intelligence 1424, Springer-Verlag, pp. 346-353. ] proposed to interpret the Dempster-Shafer theory in terms of statistics of decision tables (of the rough set theory), whereby the operator of combining evidence should be seen as relational join of decision tables.In another interpretation M.A.Kłopotek, S.T.Wierzchoń: Empirical Models for the Dempster-Shafer Theory. in: Srivastava, R.P., Mock, T.J., (Eds.). Belief Functions in Business Decisions. Series: Studies in Fuzziness and Soft Computing. VOL. 88 Springer-Verlag. March 2002. ISBN 3-7908-1451-2, pp. 62-112 ] they propose to view this theory as describing destructive material processing (under loss of properties), e.g. like in some semiconductor production processes. Under both interpretations reasoning in DST gives correct results, contrary to the earlier probabilistic interpretations, criticized by Pearl in the cited papers and by other researches.

ee also

*Possibility theory
*Probability theory
*Bayes' theorem
*Bayesian network
*G.L.S. Shackle
*Transferable belief model
*Info-gap decision theory
*Subjective logic

References

* Joseph C. Giarratano and Gary D. Riley (2005); "Expert Systems: principles and programming", ed. Thomson Course Tech., ISBN 0-534-38447-1
* Kari Sentz and Scott Ferson (2002); [http://www.sandia.gov/epistemic/Reports/SAND2002-0835.pdf "Combination of Evidence in Dempster-Shafer Theory"] , Sandia National Laboratories SAND 2002-0835

* Yager, R. R., & Liu, L. (2008). "Classic works of the Dempster-Shafer theory of belief functions." Studies in fuzziness and soft computing, v. 219. Berlin: Springer. ISBN 9783540253815.

Wikimedia Foundation. 2010.

### Look at other dictionaries:

• Dempster–Shafer theory — Prof Arthur P. Dempster at the workshop on Belief Function Theory (Brest 1 april 2010). The Dempster–Shafer theory (DST) is a mathematical theory of evidence.[1] It allows …   Wikipedia

• Dempster-Shafer-Theorie — Die Evidenztheorie von Dempster und Shafer ist eine mathematische Theorie aus dem Gebiet der Wahrscheinlichkeitstheorie. Sie wird benutzt, um Informationen unterschiedlicher Quellen zu einer Gesamtaussage zusammenzusetzen, wobei die… …   Deutsch Wikipedia

• Theorie Dempster-Shafer — Théorie Dempster Shafer La Théorie Dempster Shafer est une théorie mathématique basée sur la notion de preuves[1] utilisant les fonctions de croyance et le raisonnement plausible. Le but de cette théorie est de permettre de combiner des preuves… …   Wikipédia en Français

• Théorie dempster-shafer — La Théorie Dempster Shafer est une théorie mathématique basée sur la notion de preuves[1] utilisant les fonctions de croyance et le raisonnement plausible. Le but de cette théorie est de permettre de combiner des preuves distinctes pour calculer… …   Wikipédia en Français

• Théorie Dempster-Shafer — La Théorie Dempster Shafer est une théorie mathématique basée sur la notion de preuves[1] utilisant les fonctions de croyance et le raisonnement plausible. Le but de cette théorie est de permettre de combiner des preuves distinctes pour calculer… …   Wikipédia en Français

• Dempster — can refer to: Maple Leaf Foods Dempster Brand Bread Dempster Street, a major east west artery north of Chicago, Illinois, part of which is part of U.S. Route 14 in Illinois Dempster (CTA), stations on the Chicago Transit Authority s L system, all …   Wikipedia

• Decision theory — in economics, psychology, philosophy, mathematics, and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision. It is closely related to …   Wikipedia

• Arthur P. Dempster — Arthur Pentland Dempster is a Professor Emeritus in the Harvard University Department of Statistics. He was one of four faculty when the department was founded in 1957.He was a Putnam Fellow in 1951. He obtained his Ph.D. from Princeton… …   Wikipedia

• Possibility theory — is a mathematical theory for dealing with certain types of uncertainty and is an alternative to probability theory. Professor Lotfi Zadeh first introduced possibility theory in 1978 as an extension of his theory of fuzzy sets and fuzzy logic. D.… …   Wikipedia

• Evidenztheorie — Die Evidenztheorie von Dempster und Shafer ist eine mathematische Theorie aus dem Gebiet der Wahrscheinlichkeitstheorie. Sie wird benutzt, um Informationen unterschiedlicher Quellen zu einer Gesamtaussage zusammenzusetzen, wobei die… …   Deutsch Wikipedia