- Probability metric
A probability metric is a function defining a distance between random variables or vectors. In particular, the probability metric does not satisfy the identity of indiscernibles condition that the metric of a metric space is required to satisfy.
Probability metric of random variables
A probability metric "D" between two random variables "X" and "Y" may be defined as:
:D(X, Y) = E(|X - Y|).
If the joint probability distribution is absolutely continuous, this is the same as
:\int_{-\infty}^\infty \int_{-\infty}^\infty |x-y| F(x, y) \, dx \, dy,
where "F"("x", "y") denotes the joint probability density function of the random variables "X" and "Y". If "X" and "Y" are independent of each other, the joint density factorizes and the equation above becomes:
:D(X, Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y| f(x) g(y) \, dx \, dy
where "f"("x") and "g"("y") are the probability density functions of "X" and "Y" respectively.
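The definition for independent variables can be illustrated with a simple Monte Carlo estimate of E(|X - Y|). This is only a minimal sketch: the function name `estimate_metric`, the sampler callables, and the sample size are assumptions made for illustration and are not part of the original definition.

```python
import random

def estimate_metric(sample_x, sample_y, n=200_000, seed=0):
    """Monte Carlo estimate of D(X, Y) = E|X - Y| for independent X and Y.

    sample_x and sample_y are hypothetical callables that take a
    random.Random instance and return one draw of X or Y.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # Separate draws for X and Y make the pair independent.
        total += abs(sample_x(rng) - sample_y(rng))
    return total / n

# Two constant (certain) variables at 1 and 3: the estimate is |1 - 3|.
d_const = estimate_metric(lambda rng: 1.0, lambda rng: 3.0, n=1_000)

# Two independent copies of a uniform variable on [0, 1].
d_uniform = estimate_metric(lambda rng: rng.random(), lambda rng: rng.random())

print(d_const, d_uniform)
```

The second estimate should be close to 1/3, the exact value of E(|X - Y|) for two independent uniform variables on [0, 1], which shows that the distance between two identically distributed variables need not be zero.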
One may show that such probability metrics do not, in general, satisfy the identity of indiscernibles condition of a metric; they satisfy it if and only if both arguments "X" and "Y" are certain events described by Dirac delta probability density functions. In this case:
:D_{\delta\delta}(X, Y) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-y| \delta(x-\mu_x) \delta(y-\mu_y) \, dx \, dy = |\mu_x-\mu_y|
the probability metric reduces to the metric between the expected values \mu_x, \mu_y of the variables "X" and "Y", and:
:D_{\delta\delta}(X, X) = \int_{-\infty}^\infty \int_{-\infty}^\infty |x-x'| \delta(x-\mu_x) \delta(x'-\mu_x) \, dx \, dx' = |\mu_x-\mu_x| = 0.
For all other cases:
:D\left(X, X\right) > 0.
[Figure: probability metric between two random variables "X" and "Y", both having normal distributions with the same standard deviation \sigma = 0, 0.2, 0.4, 0.6, 0.8, 1 (beginning with the bottom curve). m_{xy} = |\mu_x-\mu_y| denotes the distance between the means of "X" and "Y".]
Example: two continuous random variables with normal distributions (NN)
If the probability distribution functions of both random variables "X" and "Y" are normal distributions (N) having the same standard deviation σ, and moreover "X" and "Y" are independent, then evaluating "D"("X", "Y") yields
:D_{NN}(X, Y) = \mu_{xy} + \frac{2\sigma}{\sqrt\pi}\operatorname{exp}\left(-\frac{\mu_{xy}^2}{4\sigma^2}\right)-\mu_{xy} \operatorname{erfc} \left(\frac{\mu_{xy}}{2\sigma}\right)
where
:\mu_{xy} = \left|\mu_x-\mu_y\right|,
erfc("x") is the complementary error function, and the subscript NN indicates the type of the metric.
In this case the "zero value" of the probability metric D_{NN}(X, Y) is:
:\lim_{\mu_{xy}\to 0} D_{NN}(X, Y) = D_{NN}(X, X) = \frac{2\sigma}{\sqrt\pi}.
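The closed form for two independent normal variables with a common standard deviation can be evaluated directly with the standard library. The sketch below assumes the formula exactly as stated above; the function name `d_nn` and the special handling of σ = 0 (where the variables are constants) are illustrative choices.

```python
import math

def d_nn(mu_x, mu_y, sigma):
    """D_NN(X, Y) for independent normal X, Y with equal standard deviation."""
    mu_xy = abs(mu_x - mu_y)
    if sigma == 0:
        # Degenerate case: both variables are constants, so D = |mu_x - mu_y|.
        return mu_xy
    return (mu_xy
            + 2.0 * sigma / math.sqrt(math.pi)
            * math.exp(-mu_xy ** 2 / (4.0 * sigma ** 2))
            - mu_xy * math.erfc(mu_xy / (2.0 * sigma)))

# "Zero value" of the metric: equal means, sigma = 1.
zero_value = d_nn(0.0, 0.0, 1.0)

# Widely separated means: the metric approaches the distance between means.
far_value = d_nn(0.0, 10.0, 1.0)

print(zero_value, far_value)
```

For equal means the function returns 2σ/√π, and for means that are many standard deviations apart it approaches the plain distance between the means.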
Example: two continuous random variables with uniform distributions (RR)
If both random variables "X" and "Y" are characterized by uniform distributions ("R") with the same standard deviation σ, integrating "D"("X", "Y") yields:
:D_{RR}(X, Y) = \begin{cases} \frac{24\sqrt{3}\sigma^3-\mu_{xy}^3+6\sqrt{3}\sigma\mu_{xy}^2}{36\sigma^2}, & \mu_{xy}<2\sqrt{3}\sigma, \\ \mu_{xy}, & \mu_{xy} \ge 2\sqrt{3}\sigma. \end{cases}
The minimal value of this kind of probability metric is:
:D_{RR}(X, X) = \frac{2\sigma}{\sqrt{3}}.
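The piecewise formula for two uniform distributions can be written as a short function. This is a sketch under the assumption that both variables are independent and share the standard deviation σ, so that each distribution has width 2√3σ; the name `d_rr` is illustrative.

```python
import math

def d_rr(mu_x, mu_y, sigma):
    """D_RR(X, Y) for independent uniform X, Y with equal standard deviation."""
    mu_xy = abs(mu_x - mu_y)
    sqrt3 = math.sqrt(3.0)
    width = 2.0 * sqrt3 * sigma  # width of a uniform distribution with std sigma
    if mu_xy >= width:
        # The supports do not overlap: the metric equals the distance of means.
        return mu_xy
    return (24.0 * sqrt3 * sigma ** 3
            - mu_xy ** 3
            + 6.0 * sqrt3 * sigma * mu_xy ** 2) / (36.0 * sigma ** 2)

# Minimal value (equal means) for sigma = 1.
minimal_value = d_rr(0.0, 0.0, 1.0)

# Just below the switching point the two branches agree.
width = 2.0 * math.sqrt(3.0)
near_switch = d_rr(0.0, width - 1e-9, 1.0)

print(minimal_value, near_switch)
```

The two branches meet continuously at μ_xy = 2√3σ, and with σ = 0 the first branch is always taken, so the function returns the plain distance between the means without dividing by zero.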
Probability metric of discrete random variables
If the random variables "X" and "Y" are independent and characterized by discrete probability distributions, the probability metric "D" may be defined as:
:D(X, Y) = \sum_{i} \sum_{j} |x_i-y_j| P(X=x_i) P(Y=y_j).
For example, for two discrete Poisson-distributed random variables "X" and "Y" with means \lambda_x and \lambda_y, the equation above transforms into:
:D_{PP}(X, Y) = \sum_{x=0}^n \sum_{y=0}^n |x-y| \frac{\lambda_x^x \lambda_y^y e^{-(\lambda_x+\lambda_y)}}{x! \, y!}.
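For two independent Poisson variables the double sum over |x - y| weighted by the two probability mass functions can be evaluated directly. The sketch below is an assumption-laden illustration: the rate parameters `lam_x` and `lam_y`, the function names, and the truncation of both sums at `n` are choices made here, with `n` picked large enough that the neglected tail is negligible for small rates.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def d_pp(lam_x, lam_y, n=60):
    """Truncated double sum of |x - y| P(X = x) P(Y = y) for x, y = 0..n."""
    px = [poisson_pmf(k, lam_x) for k in range(n + 1)]
    py = [poisson_pmf(k, lam_y) for k in range(n + 1)]
    return sum(abs(x - y) * px[x] * py[y]
               for x in range(n + 1)
               for y in range(n + 1))

# Identically distributed, independent Poisson variables: D is nonzero.
same_rate = d_pp(2.0, 2.0)

# If X is the constant 0 (rate 0), D reduces to the mean of Y.
against_zero = d_pp(0.0, 3.0)

print(same_rate, against_zero)
```

With `lam_x = 0` the variable "X" is the constant 0, so the sum reduces to the mean of "Y", which gives a simple sanity check of the implementation.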
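As a physical example, the distance between two particles "X" and "Y" confined to a one-dimensional potential well of length "L", with wave functions
:\psi_m(x) = \sqrt{\frac{2}{L}} \sin{\left(\frac{m \pi x}{L}\right)}, \,
:\psi_n(y) = \sqrt{\frac{2}{L}} \sin{\left(\frac{n \pi y}{L}\right)}, \,
may be defined in terms of the probability metric of independent random variables as: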
:\begin{align}&{} D(X, Y) = \int\limits_{0}^L \int\limits_{0}^L |x-y| |\psi_m(x)|^2 |\psi_n(y)|^2 \, dx \, dy \\&{} = \begin{cases} L\left(\frac{1}{3} - \frac{m^2+n^2}{2\pi^2 m^2 n^2}\right), & m \ne n, \\ L\left(\frac{1}{3} - \frac{5}{4\pi^2 m^2}\right), & m = n. \end{cases}\end{align}
The distance between the particles "X" and "Y" is smallest for "m" = 1 and "n" = 1, that is, for the minimum energy levels of these particles, where it equals:
:\min(D(X, Y)) = L\left(\frac{1}{3}-\frac{5}{4\pi^2}\right) \approx 0.21L.
In accordance with the properties of the probability metric, the minimum distance is nonzero: it is about one fifth of the length "L" of the potential well. For higher energy levels the distance grows but stays below "L"/3, the value for two independent variables distributed uniformly over the well.
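The double integral of |x - y| against the two position densities |ψ_m(x)|² and |ψ_n(y)|² can also be evaluated numerically. The following is a minimal sketch that assumes a midpoint rule on a uniform grid; the function names and the `grid` parameter are illustrative and not taken from the source.

```python
import math

def box_density(k, x, length):
    """Position density |psi_k(x)|^2 = (2/L) sin^2(k pi x / L) on [0, L]."""
    return 2.0 / length * math.sin(k * math.pi * x / length) ** 2

def box_distance(m, n, length=1.0, grid=400):
    """Midpoint-rule estimate of the double integral of |x - y| f_m(x) f_n(y)."""
    h = length / grid
    points = [(i + 0.5) * h for i in range(grid)]
    f_m = [box_density(m, x, length) for x in points]
    f_n = [box_density(n, y, length) for y in points]
    total = 0.0
    for x, fx in zip(points, f_m):
        for y, fy in zip(points, f_n):
            total += abs(x - y) * fx * fy
    return total * h * h

# Both particles in the lowest energy level, well of unit length.
ground_state = box_distance(1, 1)

# One particle in the first excited level.
mixed_levels = box_distance(1, 2)

print(ground_state, mixed_levels)
```

The result scales linearly with the well length, so evaluating the function for a unit-length well and multiplying by "L" is sufficient.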
External references
* [http://www.springerlink.com/content/y4fbdb0m0r12701p/ A new concept of probability metric and its applications in approximation of scattered data sets]
Wikimedia Foundation. 2010.