Probability plot

The probability plot is a graphical technique for assessing whether or not a data set follows a given distribution such as the normal or Weibull, and for visually estimating the location and scale parameters of the chosen distribution. The data are plotted against a theoretical distribution in such a way that the points should form approximately a straight line. Departures from this straight line indicate departures from the specified distribution.

The probability plot correlation coefficient (PPCC) is the correlation coefficient associated with the linear fit to the data in the probability plot; it measures the goodness of the fit. Estimates of the location and scale parameters of the distribution are given by the intercept and slope of the fitted line, respectively. Probability plots can be generated for several competing distributions to see which provides the best fit; the distribution whose plot yields the highest correlation coefficient is the best choice, since its probability plot is the straightest.
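As a minimal sketch of comparing competing distributions, the following Python code computes the probability plot correlation coefficient of one sample against two candidates, using only the standard library. The function names, the candidate set, and the sample values are illustrative assumptions; the plotting positions are the uniform order statistic medians given in the Definition section.

```python
import math
from statistics import NormalDist


def uniform_medians(n):
    """Approximate medians of the order statistics of n uniform(0, 1) values."""
    top = 0.5 ** (1.0 / n)
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[0], m[-1] = 1.0 - top, top
    return m


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, quantile):
    """Correlation between the ordered data and the order statistic medians."""
    ordered = sorted(data)
    theoretical = [quantile(u) for u in uniform_medians(len(ordered))]
    return pearson_r(theoretical, ordered)


# Candidate quantile functions in standard form (location 0, scale 1).
candidates = {
    "normal": NormalDist().inv_cdf,
    "exponential": lambda p: -math.log(1.0 - p),
}

# Hypothetical right-skewed sample, made up for illustration only.
sample = [0.1, 0.3, 0.4, 0.7, 1.1, 1.6, 2.4, 3.9]

scores = {name: ppcc(sample, q) for name, q in candidates.items()}
best = max(scores, key=scores.get)  # candidate with the straightest plot
```

With only eight points the coefficients are rough guides, so a small gap between two candidates should not be over-interpreted.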

For distributions with shape parameters (not counting location and scale parameters), the shape parameters must be known in order to generate the probability plot. For distributions with a single shape parameter, the probability plot correlation coefficient plot (PPCC plot) provides an excellent method for estimating the shape parameter.
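The idea behind the PPCC plot can be sketched as a sweep over candidate shape values, recording the correlation coefficient for each. The example below assumes a Weibull family, chosen because its standard quantile function has a closed form. The function names and the grid of shape values are hypothetical, and picking the grid maximum is a crude stand-in for reading the peak off an actual PPCC plot.

```python
import math


def uniform_medians(n):
    """Approximate medians of the order statistics of n uniform(0, 1) values."""
    top = 0.5 ** (1.0 / n)
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[0], m[-1] = 1.0 - top, top
    return m


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weibull_quantile(p, shape):
    """Quantile function of the standard Weibull (location 0, scale 1)."""
    return (-math.log(1.0 - p)) ** (1.0 / shape)


def ppcc_curve(data, shapes):
    """Return (shape, PPCC) pairs: the points a PPCC plot would display."""
    ordered = sorted(data)
    u = uniform_medians(len(ordered))
    return [
        (k, pearson_r([weibull_quantile(p, k) for p in u], ordered))
        for k in shapes
    ]


def best_shape(data, shapes):
    """Shape value on the grid with the highest PPCC."""
    return max(ppcc_curve(data, shapes), key=lambda pair: pair[1])[0]
```

Once a shape value has been chosen this way, the ordinary probability plot for that shape can be drawn and its intercept and slope read off as location and scale estimates.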

The special case of the normal probability plot is covered separately due to its importance in many statistical applications.

Definition

The probability plot is formed by:
* Vertical axis: Ordered response values
* Horizontal axis: Order statistic medians for the given distribution

The order statistic medians are defined as:
:N(i) = G(U(i))
where "U(i)" are the uniform order statistic medians (defined below) and "G" is the quantile function for the desired distribution. The quantile function is the inverse of the cumulative distribution function (the probability that "X" is less than or equal to some value). That is, given a probability, we want the corresponding quantile of the distribution.
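To make the role of the quantile function concrete, here is a small sketch in Python. It assumes the standard exponential distribution as an example because both its cumulative distribution function and its inverse have closed forms. The bisection helper is a hypothetical fallback for distributions whose quantile function is not available in closed form; it assumes a continuous, increasing CDF and a bracketing interval supplied by the caller.

```python
import math


def exponential_cdf(x):
    """P(X <= x) for the standard exponential distribution."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def exponential_quantile(p):
    """Closed-form inverse of exponential_cdf for 0 <= p < 1."""
    return -math.log(1.0 - p)


def quantile_by_bisection(cdf, p, lo, hi, tol=1e-12):
    """Numerically invert a continuous, increasing CDF on [lo, hi]."""
    if not cdf(lo) <= p <= cdf(hi):
        raise ValueError("p is not bracketed by cdf(lo) and cdf(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return (lo + hi) / 2.0
```

For instance, `exponential_quantile(0.5)` is the median of the distribution, and `quantile_by_bisection(exponential_cdf, 0.5, 0.0, 50.0)` should agree with it to within the tolerance.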

The uniform order statistic medians are defined as:
:U(i) = \begin{cases} 1 - U(n) & i = 1 \\ \dfrac{i - 0.3175}{n + 0.365} & i = 2, 3, \ldots, n-1 \\ 0.5^{1/n} & i = n \end{cases}
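The piecewise definition above translates directly into code. The following Python function is a minimal sketch; its name and the handling of the degenerate case n = 1 (where both end-point rules give 0.5) are assumptions made for illustration.

```python
def uniform_order_statistic_medians(n):
    """Return the n uniform order statistic medians, in increasing order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    last = 0.5 ** (1.0 / n)  # rule for i = n
    # Interior rule, applied to every index first for simplicity.
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = last       # overwrite the i = n entry
    medians[0] = 1.0 - last  # overwrite the i = 1 entry
    return medians
```

The values are symmetric about 0.5, so for odd n the middle entry is 0.5. Passing each value through the quantile function of the chosen distribution gives the horizontal-axis coordinates of the plot.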

In addition, a straight line can be fitted to the points and added as a reference line. The farther the points deviate from this line, the stronger the indication of a departure from the specified distribution.
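One way to quantify how far the points stray from the reference line is to fit the line by ordinary least squares and inspect the residuals. The sketch below does this for a normal probability plot; the choice of least squares, the function names, and the use of the normal distribution are assumptions for illustration rather than requirements of the method.

```python
from statistics import NormalDist


def uniform_medians(n):
    """Approximate medians of the order statistics of n uniform(0, 1) values."""
    top = 0.5 ** (1.0 / n)
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[0], m[-1] = 1.0 - top, top
    return m


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def reference_line_residuals(data, quantile=NormalDist().inv_cdf):
    """Fit the reference line and return (slope, intercept, residuals)."""
    y = sorted(data)                                    # vertical axis
    x = [quantile(u) for u in uniform_medians(len(y))]  # horizontal axis
    slope, intercept = fit_line(x, y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return slope, intercept, residuals
```

Residuals that change sign in a systematic pattern (for example positive at both ends and negative in the middle) suggest curvature in the plot, which is the visual signature of a poorly matched distribution.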

This definition implies that a probability plot can be easily generated for any distribution for which the quantile function can be computed. One advantage of this method of computing probability plots is that the intercept and slope estimates of the fitted line are in fact estimates for the location and scale parameters of the distribution. Although this is not too important for the normal distribution (the location and scale are estimated by the mean and standard deviation, respectively), it can be useful for many other distributions.
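As an illustration for a non-normal case, the sketch below reads location and scale estimates off the fitted line for a Gumbel (maximum) distribution, whose standard quantile function is -ln(-ln p). The choice of the Gumbel distribution, the least-squares fit, and all function names are assumptions made for this example.

```python
import math


def uniform_medians(n):
    """Approximate medians of the order statistics of n uniform(0, 1) values."""
    top = 0.5 ** (1.0 / n)
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[0], m[-1] = 1.0 - top, top
    return m


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def gumbel_quantile(p):
    """Quantile function of the standard Gumbel (maximum) distribution."""
    return -math.log(-math.log(p))


def estimate_location_scale(data, quantile):
    """Return (location, scale) as (intercept, slope) of the fitted line."""
    y = sorted(data)
    x = [quantile(u) for u in uniform_medians(len(y))]
    slope, intercept = fit_line(x, y)
    return intercept, slope
```

This works because a location-scale family satisfies X = location + scale * Z, where Z follows the standard form, so the ordered data are approximately a linear function of the standard order statistic medians.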

Relation with Q-Q plots

Q-Q plots are similar to probability plots. The difference lies in the x-axis: a Q-Q plot uses quantiles of the theoretical distribution, whereas a probability plot uses a measure of the center of the "k"th order statistic (its expected value or, as in the definition above, its median). The two plots differ substantially only when "n" is small.
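The effect of sample size can be checked numerically on the probability scale. The sketch below compares the uniform order statistic medians with the positions (k - 0.5)/n, which is one common convention for Q-Q plots and is assumed here purely for illustration; other conventions exist, and the function names are hypothetical.

```python
def uniform_medians(n):
    """Approximate medians of the order statistics of n uniform(0, 1) values."""
    top = 0.5 ** (1.0 / n)
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[0], m[-1] = 1.0 - top, top
    return m


def midpoint_positions(n):
    """Plotting positions (k - 0.5) / n, assumed here for the Q-Q plot."""
    return [(k - 0.5) / n for k in range(1, n + 1)]


def largest_gap(n):
    """Largest absolute difference between the two sets of positions."""
    return max(abs(a - b)
               for a, b in zip(uniform_medians(n), midpoint_positions(n)))


# The gap shrinks as the sample size grows.
gaps = {n: largest_gap(n) for n in (5, 50, 500)}
```

Under these assumptions the largest gap is a few hundredths for n = 5 and well under one thousandth for n = 500, consistent with the statement that the two plots differ noticeably only for small samples.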

External links

* [http://www.itl.nist.gov/div898/handbook/eda/section3/probplot.htm Probability plot]



Wikimedia Foundation. 2010.
