Probability metric

A **probability metric** is a function defining a distance between random variables or vectors. In particular, a probability metric does not satisfy the identity of indiscernibles condition required of the metric of a metric space.

**Probability metric of random variables**

A probability metric "D" between two

random variables "X" and "Y" may be defined as

:$D(X, Y) = E(|X - Y|).$

If the joint probability distribution is absolutely continuous, this is the same as

:$D(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x-y| F(x, y) \, dx \, dy,$

where "F"("x", "y") denotes the joint probability density function of the random variables "X" and "Y". If "X" and "Y" are independent of each other, the equation above reduces to

:$D(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x-y| f(x) g(y) \, dx \, dy,$

where "f"("x") and "g"("y") are the probability density functions of "X" and "Y" respectively.
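As an illustration (a sketch of ours, not part of the original definition), the independent-case double integral can be approximated by a weighted sum on a finite grid; the normal densities and their parameters below are arbitrary choices.

```python
import numpy as np

# Normal density, used here only as an example pair of marginals.
def normal_pdf(t, mu, sigma):
    return np.exp(-(t - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Approximate D(X, Y) = ∫∫ |x-y| f(x) g(y) dx dy on a finite grid.
def prob_metric(f, g, lo=-10.0, hi=10.0, n=2001):
    x = np.linspace(lo, hi, n)
    fx = f(x)                                # density of X sampled on the grid
    gy = g(x)                                # density of Y sampled on the grid
    diff = np.abs(x[:, None] - x[None, :])   # |x - y| on the product grid
    dx = x[1] - x[0]
    # Because the joint density factorises as f(x) g(y), the double
    # integral becomes a quadratic form in the two sampled densities.
    return float(fx @ diff @ gy) * dx * dx

D = prob_metric(lambda t: normal_pdf(t, 0.0, 1.0),
                lambda t: normal_pdf(t, 1.0, 1.0))
print(D)  # ≈ 1.399
```

Passing the same density twice, e.g. `prob_metric(f, f)`, still yields a strictly positive value, consistent with the failure of the identity of indiscernibles discussed below.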

One may easily show that such probability metrics do not satisfy the identity of indiscernibles condition of the metric, or satisfy it if and only if both of their arguments "X", "Y" are certain events described by Dirac delta density probability distribution functions. In this case

:$D_{\delta\delta}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x-y| \delta(x-\mu_x) \delta(y-\mu_y) \, dx \, dy = |\mu_x-\mu_y|,$

the probability metric simply transforms into the metric between the expected values $\mu_x$, $\mu_y$ of the variables "X" and "Y", and obviously

:$D_{\delta\delta}(X, X) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |x-x'| \delta(x-\mu_x) \delta(x'-\mu_x) \, dx \, dx' = |\mu_x-\mu_x| = 0.$

For all other cases:

:$D(X, X) > 0.$
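A brief Monte Carlo sketch of ours (parameters arbitrary) illustrates this: the distance between two independent copies of the same distribution is positive unless the distribution is a point mass.

```python
import random

random.seed(0)
n = 100_000

# Two independent copies of the same non-degenerate distribution: D > 0.
gauss = sum(abs(random.gauss(5, 1) - random.gauss(5, 1)) for _ in range(n)) / n

# A "certain event" (point mass at 5): the distance collapses to zero.
const = sum(abs(5.0 - 5.0) for _ in range(n)) / n

print(gauss)  # ≈ 1.13 (this is 2σ/√π with σ = 1), strictly positive
print(const)  # 0.0
```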

[Figure: probability metric between two random variables "X" and "Y", both having normal distributions and the same standard deviation $\sigma = 0$, $\sigma = 0.2$, $\sigma = 0.4$, $\sigma = 0.6$, $\sigma = 0.8$, $\sigma = 1$ (beginning with the bottom curve). $m_{xy} = |\mu_x-\mu_y|$ denotes the distance between the means of "X" and "Y".]

**Example: two continuous random variables with normal distributions (NN)**

If both

probability distribution functions of the random variables "X" and "Y" are normal distributions (N) having the same standard deviation $\sigma$, and moreover "X" and "Y" are independent, then evaluating "D"("X", "Y") yields

:$D_{NN}(X, Y) = \mu_{xy} + \frac{2\sigma}{\sqrt{\pi}} \exp\left(-\frac{\mu_{xy}^2}{4\sigma^2}\right) - \mu_{xy} \operatorname{erfc}\left(\frac{\mu_{xy}}{2\sigma}\right),$

where

:$\mu_{xy} = \left|\mu_x-\mu_y\right|,$

erfc("x") is the complementary error function, and the subscript NN indicates the type of the metric. In this case, the "zero value" of the probability metric $D_{NN}(X, Y)$ is

:$\lim_{\mu_{xy} \to 0} D_{NN}(X, Y) = D_{NN}(X, X) = \frac{2\sigma}{\sqrt{\pi}}.$
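The closed form above can be checked against a direct Monte Carlo estimate; the sketch below (our own, with arbitrary values of σ and μ_xy) uses `math.erfc` for the complementary error function.

```python
import math
import random

# Closed-form D_NN(X, Y) for two independent normals with equal sigma.
def d_nn(mu_xy, sigma):
    return (mu_xy
            + (2 * sigma / math.sqrt(math.pi)) * math.exp(-mu_xy ** 2 / (4 * sigma ** 2))
            - mu_xy * math.erfc(mu_xy / (2 * sigma)))

sigma = 0.5
# "Zero value" of the metric: D_NN(X, X) = 2σ/√π.
print(d_nn(0.0, sigma), 2 * sigma / math.sqrt(math.pi))

# Monte Carlo estimate of E|X - Y| for mu_xy = 1.
random.seed(1)
mc = sum(abs(random.gauss(0.0, sigma) - random.gauss(1.0, sigma))
         for _ in range(100_000)) / 100_000
print(d_nn(1.0, sigma), mc)  # the two values agree to a few decimals
```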

**Example: two continuous random variables with uniform distributions (RR)**

If both random variables "X" and "Y" are characterized by uniform distributions ("R") of the same standard deviation $\sigma$, integrating "D"("X", "Y") yields

:$D_{RR}(X, Y) = \begin{cases} \dfrac{24\sqrt{3}\sigma^3-\mu_{xy}^3+6\sqrt{3}\sigma\mu_{xy}^2}{36\sigma^2}, & \mu_{xy} < 2\sqrt{3}\sigma, \\ \mu_{xy}, & \mu_{xy} \ge 2\sqrt{3}\sigma. \end{cases}$

The minimal value of this kind of probability metric is

:$D_{RR}(X, X) = \frac{2\sigma}{\sqrt{3}}.$
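Likewise the piecewise RR formula can be sanity-checked numerically; the sketch below (ours, with arbitrary parameters) uses the fact that a uniform distribution with standard deviation σ has width 2√3·σ.

```python
import math
import random

# Piecewise closed form D_RR(X, Y) for two independent uniforms with equal sigma.
def d_rr(mu_xy, sigma):
    c = 2 * math.sqrt(3) * sigma       # width of each uniform distribution
    if mu_xy >= c:
        return mu_xy
    return (24 * math.sqrt(3) * sigma ** 3 - mu_xy ** 3
            + 6 * math.sqrt(3) * sigma * mu_xy ** 2) / (36 * sigma ** 2)

sigma = 1.0
w = 2 * math.sqrt(3) * sigma
print(d_rr(0.0, sigma), 2 * sigma / math.sqrt(3))  # minimal value, both ≈ 1.1547

# Monte Carlo check for means separated by mu_xy = 0.5.
random.seed(2)
mc = sum(abs(random.uniform(0.0, w) - random.uniform(0.5, 0.5 + w))
         for _ in range(200_000)) / 200_000
print(d_rr(0.5, sigma), mc)  # both ≈ 1.223
```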

**Probability metric of discrete random variables**

If the random variables "X" and "Y" are characterized by discrete probability distributions, the probability metric "D" may be defined as

:$D(X, Y) = \sum_{i} \sum_{j} |x_i-y_j| P(X=x_i) P(Y=y_j).$

For example, for two discrete Poisson-distributed random variables "X" and "Y" the equation above transforms into

:$D_{PP}(X, Y) = \sum_{x=0}^{\infty} \sum_{y=0}^{\infty} |x-y| \frac{\lambda_x^x e^{-\lambda_x}}{x!} \frac{\lambda_y^y e^{-\lambda_y}}{y!},$

where $\lambda_x$ and $\lambda_y$ are the parameters of the Poisson distributions of "X" and "Y" respectively.
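The double sum can be evaluated directly by truncating it once the remaining Poisson tail mass is negligible; the sketch below (ours, with arbitrary parameters λ and truncation point) illustrates this.

```python
import math

# Poisson probability mass function.
def poisson_pmf(k, lam):
    return lam ** k * math.exp(-lam) / math.factorial(k)

# D for two independent Poisson variables, truncating the infinite sums
# at n_max terms (chosen so the neglected tail mass is negligible).
def d_pp(lam_x, lam_y, n_max=60):
    return sum(abs(x - y) * poisson_pmf(x, lam_x) * poisson_pmf(y, lam_y)
               for x in range(n_max) for y in range(n_max))

print(d_pp(3.0, 3.0))  # strictly positive even for identical parameters
print(d_pp(2.0, 8.0))  # ≈ the mean separation 6 when the pmfs barely overlap
```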

**Example: distance between two particles in a potential well**

The distance between two particles "X" and "Y" confined in a one-dimensional potential well of length "L", whose positions are described by the wavefunctions

:$\psi_m(x) = \sqrt{\frac{2}{L}} \sin\left(\frac{m \pi x}{L}\right),$

:$\psi_n(y) = \sqrt{\frac{2}{L}} \sin\left(\frac{n \pi y}{L}\right),$

may be defined in terms of the probability metric of independent random variables as

:$\begin{align} D(X, Y) &= \int\limits_{0}^{L} \int\limits_{0}^{L} |x-y| \, |\psi_m(x)|^2 |\psi_n(y)|^2 \, dx \, dy \\ &= L\left(\frac{1}{3}(m+n)^2 - \frac{m^4 + n^4 + 2m^3n + 2mn^3 + 2m^2n^2}{2m^2n^2\pi^2}\right). \end{align}$

The distance between the particles "X" and "Y" is obviously minimal for "m" = 1 and "n" = 1, that is, for the minimum energy levels of these particles, and amounts to

:$\min(D(X, Y)) = L\left(\frac{4}{3}-\frac{4}{\pi^2}\right) \approx 0.93L.$

By the properties of the probability metric, the minimum distance is nonzero; in fact it is close to the length "L" of the potential well. For other energy levels it is even greater than the length of the well.

**External references**

* [http://www.springerlink.com/content/y4fbdb0m0r12701p/ A new concept of probability metric and its applications in approximation of scattered data sets]

*Wikimedia Foundation.
2010.*