- Probability plot
The probability plot is a graphical technique for assessing whether or not a data set follows a given distribution such as the normal or Weibull, and for visually estimating the location and scale parameters of the chosen distribution. The data are plotted against a theoretical distribution in such a way that the points should form approximately a straight line. Departures from this straight line indicate departures from the specified distribution.
The probability plot correlation coefficient is the correlation coefficient associated with the linear fit to the data in the probability plot; it is a measure of the goodness of the fit. Estimates of the location and scale parameters of the distribution are given by the intercept and slope. Probability plots can be generated for several competing distributions to see which provides the best fit, and the probability plot generating the highest correlation coefficient is the best choice since it generates the straightest probability plot.
For distributions with shape parameters (not counting location and scale parameters), the shape parameters must be known in order to generate the probability plot. For distributions with a single shape parameter, the probability plot correlation coefficient plot (PPCC plot) provides an excellent method for estimating the shape parameter.
The special case of the normal probability plot is covered separately due to its importance in many statistical applications.
Definition
The probability plot is formed by:
* Vertical axis: Ordered response values
* Horizontal axis: Order statistic medians for the given distribution
The order statistic medians are defined as:
    N(i) = G(U(i))
where "U(i)" are the uniform order statistic medians (defined below) and "G" is the quantile function for the desired distribution. The quantile function is the inverse of the cumulative distribution function (the probability that "X" is less than or equal to some value). That is, given a probability, we want the corresponding quantile of the cumulative distribution function.
The uniform order statistic medians are defined as:
    U(i) = 1 - U(n)                      for i = 1
    U(i) = (i - 0.3175) / (n + 0.365)    for i = 2, 3, ..., n - 1
    U(i) = 0.5^(1/n)                     for i = n
In addition, a straight line can be fit to the points and added as a reference line. The further the points vary from this line, the greater the indication of a departure from the specified distribution.
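The construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes Filliben's approximation for the uniform order statistic medians and uses the standard normal quantile function from the standard library as "G". The function names (`uniform_order_statistic_medians`, `probability_plot`) and the sample values are hypothetical, chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist, mean


def uniform_order_statistic_medians(n):
    """Approximate medians of the n uniform order statistics.

    Assumption: Filliben's approximation, (i - 0.3175) / (n + 0.365) for the
    interior points, with the two end points set to 1 - 0.5**(1/n) and 0.5**(1/n).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    return medians


def probability_plot(data, quantile=NormalDist().inv_cdf):
    """Return plot coordinates, fitted line, and correlation coefficient.

    x: order statistic medians G(U(i)) for the chosen distribution
    y: ordered response values
    The least-squares intercept and slope estimate location and scale.
    """
    y = sorted(data)
    if len(y) < 2:
        raise ValueError("need at least two observations to fit a line")
    x = [quantile(u) for u in uniform_order_statistic_medians(len(y))]

    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    if syy == 0:
        raise ValueError("data are constant; correlation is undefined")

    slope = sxy / sxx                  # scale estimate
    intercept = y_bar - slope * x_bar  # location estimate
    ppcc = sxy / sqrt(sxx * syy)       # probability plot correlation coefficient
    return x, y, intercept, slope, ppcc


# Hypothetical sample, used only to show the call.
sample = [4.9, 5.6, 5.1, 6.3, 4.4, 5.0, 5.8, 4.7, 5.3, 5.2]
x, y, intercept, slope, ppcc = probability_plot(sample)
print(f"location ~ {intercept:.3f}, scale ~ {slope:.3f}, PPCC = {ppcc:.4f}")
```

Plotting `y` against `x` with any plotting tool, together with the line `intercept + slope * x`, gives the probability plot with its reference line. Passing a different quantile function as `quantile` produces the plot for another distribution, provided any shape parameters are fixed beforehand.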
This definition implies that a probability plot can be easily generated for any distribution for which the quantile function can be computed. One advantage of this method of computing probability plots is that the intercept and slope estimates of the fitted line are in fact estimates for the location and scale parameters of the distribution. Although this is not too important for the normal distribution (the location and scale are estimated by the mean and standard deviation, respectively), it can be useful for many other distributions.
Relation with Q-Q plots
Q-Q plots are similar to probability plots; the difference is that in a Q-Q plot, one uses the quantile of the distribution as the x-axis, whereas in a probability plot, one uses the expected value of the "k"th order statistic. Only when "n" is small is there a substantial difference between a Q-Q plot and a probability plot.
External links
* [http://www.itl.nist.gov/div898/handbook/eda/section3/probplot.htm Probability plot]
References
* Chambers, John; Cleveland, William; Kleiner, Beat; Tukey, Paul (1983). Graphical Methods for Data Analysis. Wadsworth.
Wikimedia Foundation. 2010.