- Anscombe's quartet
Anscombe's quartet comprises four datasets which have nearly identical simple statistical properties, yet which are revealed to be very different when inspected graphically. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician F.J. Anscombe to demonstrate the importance of graphing data before analyzing it, and the effect of outliers on the statistical properties of a dataset.
For all four datasets, the mean and variance of x, the mean and variance of y, the correlation between x and y (about 0.816), and the fitted linear regression line are identical or nearly so.
The first one (top left) seems to be distributed normally, and corresponds to what one would expect when considering two correlated variables that follow the assumption of normality. The second one (top right) is not distributed normally; while an obvious relationship between the two variables can be observed, it is not linear, and the Pearson correlation coefficient is not relevant. In the third case (bottom left), the linear relationship is perfect, except for one outlier which exerts enough influence to lower the correlation coefficient from 1 to 0.816. Finally, the fourth example (bottom right) shows another case in which one outlier is enough to produce a high correlation coefficient, even though the relationship between the two variables is not linear.
Edward Tufte uses the quartet to emphasize the importance of looking at one's data before analyzing it, on the first page of the first chapter of his book, "The Visual Display of Quantitative Information".
The datasets are as follows. The x values are the same for the first three datasets.
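The shared summary statistics can be checked directly. The following is a minimal sketch in Python, assuming the data values as they are commonly reproduced from Anscombe's 1973 paper; the names `QUARTET`, `summarize`, `SUMMARIES` and `III_TRIMMED` are illustrative and do not come from the original sources. It computes the means, sample variances, Pearson correlation and least-squares line for each dataset, then repeats the calculation for the third dataset with its single outlier left out.

```python
from math import sqrt
from statistics import mean, variance

# Values as commonly reproduced from Anscombe (1973); treat them as an
# assumption of this sketch. The first three datasets share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson r and the least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)  # sum of squared deviations of x
    syy = sum((b - my) ** 2 for b in y)  # sum of squared deviations of y
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # cross deviations
    slope = sxy / sxx
    return {
        "mean_x": mx, "var_x": variance(x),
        "mean_y": my, "var_y": variance(y),
        "r": sxy / sqrt(sxx * syy),
        "slope": slope, "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

# Dataset III again, with its single outlier (the point at x = 13) left out.
x3, y3 = QUARTET["III"]
kept = [(a, b) for a, b in zip(x3, y3) if a != 13]
III_TRIMMED = summarize([a for a, _ in kept], [b for _, b in kept])

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(name, {k: round(v, 3) for k, v in s.items()})
    print("III without outlier: r =", round(III_TRIMMED["r"], 4))
```

Run as a script, this prints one line of rounded statistics per dataset. The four lines are almost indistinguishable, which is the point of the quartet: only a scatter plot of each dataset shows how different they are.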
References
* F.J. Anscombe, [http://links.jstor.org/sici?sici=0003-1305%28197302%2927%3A1%3C17%3AGISA%3E2.0.CO%3B2-J "Graphs in Statistical Analysis,"] American Statistician, 27 (February 1973), 17-21.
* Tufte, Edward R. (2001). "The Visual Display of Quantitative Information," 2nd Edition, Cheshire, CT: Graphics Press. ISBN 0961392142
See also
* Exploratory data analysis
External links
* [http://www.upscale.utoronto.ca/GeneralInterest/Harrison/Visualisation/Visualisation.html Department of Physics, University of Toronto]
* [http://www.soi.city.ac.uk/~dcd/ig/s2viscom/lb_datg/l03.htm Department of Computing, City University, London]
* [http://exploringdata.cqu.edu.au/curv_fit.htm Curve fitting, Central Queensland University, Australia]
Wikimedia Foundation. 2010.