Empirical distribution function
In statistics, an empirical distribution function is a cumulative distribution function that concentrates probability $1/n$ at each of the $n$ numbers in a sample.
Let $X_1,\ldots,X_n$ be i.i.d. random variables in $\mathbb{R}$ with the cdf $F(x)$.
The empirical distribution function $F_n(x)$ based on the sample $X_1,\ldots,X_n$ is a step function defined by
:$F_n(x) = \frac{\text{number of elements in the sample} \le x}{n} = \frac{1}{n} \sum_{i=1}^n I(X_i \le x),$
where $I(A)$ is the indicator of the event $A$.
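The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the helper name `make_ecdf` is hypothetical and only the standard library is assumed.

```python
from bisect import bisect_right


def make_ecdf(sample):
    """Return a function x -> F_n(x) for the given sample (hypothetical helper)."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def ecdf(x):
        # bisect_right returns the number of sorted elements that are <= x,
        # i.e. the sum of the indicators I(X_i <= x).
        return bisect_right(xs, x) / n

    return ecdf


F_n = make_ecdf([3.0, 1.0, 2.0, 2.0])
print(F_n(0.5), F_n(2.0), F_n(3.0))  # 0.0 0.75 1.0
```

Sorting once and using binary search makes each evaluation cost O(log n); a repeated value in the sample simply produces a larger jump at that point.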
For fixed $x$, $I(X_i \le x)$ is a Bernoulli random variable with parameter $p = F(x)$; hence $nF_n(x)$ is a binomial random variable with mean $nF(x)$ and variance $nF(x)(1 - F(x))$.
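Dividing the binomial count by $n$ gives the mean and variance of $F_n(x)$ itself. The short derivation below uses only the binomial moments stated above:

```latex
\begin{align*}
E[F_n(x)] &= \frac{1}{n}\,E[nF_n(x)] = \frac{1}{n}\cdot nF(x) = F(x), \\
\operatorname{Var}(F_n(x)) &= \frac{1}{n^2}\,\operatorname{Var}(nF_n(x))
  = \frac{nF(x)(1-F(x))}{n^2} = \frac{F(x)(1-F(x))}{n}.
\end{align*}
```

Since $p(1-p) \le 1/4$ for every $p \in [0,1]$, the variance is at most $1/(4n)$ whatever the value of $F(x)$.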
Asymptotic properties
* By the strong law of large numbers,
:: $F_n(x) \to F(x)$ almost surely for fixed $x$.
:In other words, $F_n(x)$ is a consistent estimator of the cumulative distribution function $F(x)$. It is also unbiased, since $E[F_n(x)] = F(x)$ for every $n$.
* By the central limit theorem,
:: $\sqrt{n}\,(F_n(x) - F(x))$
:converges in distribution to a normal distribution $N(0, F(x)(1 - F(x)))$ for fixed $x$.
:The Berry–Esseen theorem provides the rate of this convergence.
* By the Glivenko–Cantelli theorem, $F_n(x) \to F(x)$ uniformly over $x$, that is,
:: $\|F_n - F\|_\infty \to 0$ with probability 1.
:The Dvoretzky–Kiefer–Wolfowitz inequality provides the rate of this convergence.
* Kolmogorov showed that
:: $\sqrt{n}\,\|F_n - F\|_\infty$
:converges in distribution to the Kolmogorov distribution, provided that $F(x)$ is continuous.
:The Kolmogorov–Smirnov test for goodness of fit is based on this fact.
* By Donsker's theorem,
:: $\sqrt{n}\,(F_n - F)$,
:as a process indexed by $x$, converges weakly in $\ell^\infty(\mathbb{R})$ to a Brownian bridge $B(F(x))$.
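The uniform results in the list above can be explored numerically. The sketch below computes the sup-norm distance $\|F_n - F\|_\infty$ (the Kolmogorov–Smirnov statistic) against a known continuous cdf, and the half-width of a confidence band derived from the Dvoretzky–Kiefer–Wolfowitz inequality in the form $P(\|F_n - F\|_\infty > \varepsilon) \le 2e^{-2n\varepsilon^2}$. The function names `sup_distance` and `dkw_epsilon`, the uniform test distribution, the seed, and the sample size are all assumptions made for illustration.

```python
import math
import random


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous cdf F (hypothetical helper)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps at each order statistic: it is at most i/n just before
        # xs[i] and at least (i + 1)/n at xs[i]; the supremum is attained
        # at one of these jump points.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def dkw_epsilon(n, alpha=0.05):
    """Half-width eps with P(sup|F_n - F| > eps) <= alpha, from the DKW bound."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def uniform_cdf(x):
    # cdf of the Uniform(0, 1) distribution, used here as the "true" F
    return min(max(x, 0.0), 1.0)


random.seed(0)  # arbitrary seed, for reproducibility only
n = 1000
sample = [random.random() for _ in range(n)]
print("sup distance:", sup_distance(sample, uniform_cdf))
print("95% DKW half-width:", dkw_epsilon(n))
```

With probability at least $1 - \alpha$, the true cdf lies within `dkw_epsilon(n, alpha)` of $F_n$ at every $x$ simultaneously, and the half-width shrinks at the rate $1/\sqrt{n}$, matching the $\sqrt{n}$ scaling in the Kolmogorov and Donsker results.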
See also
* Càdlàg functions
* Empirical probability
* Empirical process
Wikimedia Foundation.
2010.