# Pythagorean expectation


Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs it scored and allowed. Comparing a team's actual winning percentage with its Pythagorean winning percentage gives a measure of how lucky the team was. The term is derived from the formula's resemblance to the Pythagorean theorem.

The basic formula is:

:$\mathrm{Win\%} = \frac{\mathrm{Runs\ Scored}^2}{\mathrm{Runs\ Scored}^2 + \mathrm{Runs\ Allowed}^2} = \frac{1}{1+(\mathrm{Runs\ Allowed}/\mathrm{Runs\ Scored})^2}$

where Win% is the winning percentage generated by the formula. The expected number of wins would be the expected winning percentage multiplied by the number of games played.
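As a rough illustration, the basic formula and the expected-wins calculation can be sketched in Python. The function names and the sample season (800 runs scored, 700 allowed, 162 games, 95 actual wins) are hypothetical and chosen only for this example.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins = expected winning percentage * games played."""
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games


# Hypothetical season, not real data.
wins = expected_wins(800, 700, 162)
luck = 95 - wins  # actual wins minus expected wins
print(f"expected wins: {wins:.1f}, luck: {luck:+.1f}")
```

A positive difference between actual and expected wins suggests the team won more games than its run totals alone would predict.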

Empirical origin

Empirically, this formula correlates fairly well with how baseball teams actually perform, although an exponent of 1.81 is slightly more accurate. This correlation is one justification for using runs as a unit of measurement for player performance. Efforts have been made to find the ideal exponent for the formula, the most widely known being the Pythagenport formula [http://www.baseballprospectus.com/article.php?articleid=342 Baseball Prospectus | Articles | Revisiting the Pythagorean Theorem] developed by Clay Davenport of Baseball Prospectus, in which the exponent is $1.5 \log((r + ra)/g) + 0.45$, and the less well known but equally (if not more) effective Pythagenpat formula, developed by David Smyth, in which the exponent is $((r + ra)/g)^{0.287}$. [http://gosu02.tripod.com/id69.html W% Estimators] Here $r$ is runs scored, $ra$ is runs allowed, and $g$ is games played. Davenport expressed his support for the latter of the two, saying:

"After further review, I (Clay) have come to the conclusion that the so-called Smyth/Patriot method, aka Pythagenpat, is a better fit. In that, X=((rs+ra)/g)^.285, although there is some wiggle room for disagreement in the exponent. Anyway, that equation is simpler, more elegant, and gets the better answer over a wider range of runs scored than Pythagenport, including the mandatory value of 1 at 1 rpg." [http://baseballprospectus.com/glossary/index.php?mode=viewstat&stat=136 Baseball Prospectus | Glossary]

These formulas are only necessary when dealing with extreme situations in which the average number of runs scored per game is either very high or very low. For most situations, simply squaring each variable yields accurate results.
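The two variable-exponent formulas can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the sample team is invented, and the logarithm in Pythagenport is assumed to be base 10, which the text above does not state explicitly.

```python
import math


def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Davenport's exponent: 1.5 * log(runs per game) + 0.45.

    Assumption: the logarithm is base 10 (the text only writes "log").
    """
    return 1.5 * math.log10((runs_scored + runs_allowed) / games) + 0.45


def pythagenpat_exponent(runs_scored, runs_allowed, games, power=0.287):
    """Smyth's exponent: (runs per game) raised to the power 0.287."""
    return ((runs_scored + runs_allowed) / games) ** power


def win_pct(runs_scored, runs_allowed, exponent):
    """Pythagorean winning percentage with an arbitrary exponent."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over 162 games.
for name, fn in [("Pythagenport", pythagenport_exponent),
                 ("Pythagenpat", pythagenpat_exponent)]:
    x = fn(800, 700, 162)
    print(f"{name}: exponent {x:.3f}, win% {win_pct(800, 700, x):.3f}")
```

At exactly one run per game the Pythagenpat exponent equals 1, which is the "mandatory value of 1 at 1 rpg" mentioned in the quotation.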

Actual winning percentage deviates from expected winning percentage in some systematic ways; contributing factors include bullpen quality and luck. In addition, the formula's estimates tend to regress toward the mean: teams that win a lot of games tend to be underrepresented by the formula (meaning they "should" have won fewer games), and teams that lose a lot of games tend to be overrepresented (they "should" have won more).

"Second-order" and "third-order" wins

In their [http://www.baseballprospectus.com/statistics/standings.php Adjusted Standings Report], Baseball Prospectus refers to different "orders" of wins for a team. The basic order of wins is simply the number of games the team has won. However, because a team's record may not reflect its true talent due to luck, different measures of a team's talent were developed.

First-order wins, based on pure run differential, are the number of expected wins generated by the Pythagenport formula (see above). In addition, to further filter out the distortions of luck, sabermetricians can also calculate a team's "expected" runs scored and allowed via a runs created-type equation (the most accurate at the team level being Base Runs). These formulas give the team's expected number of runs given its total singles, doubles, walks, etc., which helps to eliminate the luck factor of the order in which the team's hits and walks came within an inning.

By plugging these expected runs scored and allowed into the Pythagorean formula, one can generate second-order wins, the number of wins a team deserves based on the number of runs it should have scored and allowed given its component offensive and defensive statistics. Third-order wins are second-order wins that have been adjusted for strength of schedule (the quality of the opponent's pitching and hitting). Second- and third-order winning percentages have been shown to predict future actual team winning percentage better than both actual winning percentage and first-order winning percentage.

Theoretical explanation

Initially the correlation between the formula and actual winning percentage was simply an experimental observation; however, Professor Steven J. Miller provided a [http://arxiv.org/abs/math/0509698v4 statistical derivation of the formula] under some assumptions about baseball games: if runs for each team follow a Weibull distribution and the runs scored and allowed per game are statistically independent, then the formula gives the probability of winning. [http://www.math.brown.edu/~sjmiller/math/papers/PythagWonLoss_Paper.pdf Steven J. Miller, "A Derivation of the Pythagorean Won-Loss Formula in Baseball" (PDF)]
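The idea behind this result can be checked numerically with a small Monte Carlo sketch. It uses a simplified setting that may differ from the exact setup in Miller's paper: runs are treated as continuous, both distributions share one shape parameter, and no location shift is applied. The parameter values and function names are hypothetical. Under these assumptions the probability of outscoring the opponent has the Pythagorean form, with the shape parameter playing the role of the exponent.

```python
import random


def closed_form(scale_for, scale_against, shape):
    """Pythagorean-style win probability for two Weibull scale parameters."""
    return scale_for ** shape / (scale_for ** shape + scale_against ** shape)


def simulate(scale_for, scale_against, shape, games=200_000, seed=0):
    """Fraction of simulated games in which the team outscores its opponent.

    Runs for and against are drawn independently from Weibull
    distributions that share one shape parameter.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(games):
        scored = rng.weibullvariate(scale_for, shape)
        allowed = rng.weibullvariate(scale_against, shape)
        if scored > allowed:
            wins += 1
    return wins / games


# Hypothetical parameters: scales 5.0 and 4.4, shared shape 1.8.
print(simulate(5.0, 4.4, 1.8), closed_form(5.0, 4.4, 1.8))
```

Because the mean of a Weibull distribution is proportional to its scale parameter when the shape is fixed, the same expression can be written in terms of average runs scored and allowed.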

When noted basketball analyst Dean Oliver applied James's Pythagorean formula to his own sport, the result was similar, except for the exponents:

:$\mathrm{Win\%} = \frac{\mathrm{Points\ For}^{14}}{\mathrm{Points\ For}^{14} + \mathrm{Points\ Against}^{14}}.$

Another noted basketball statistician, John Hollinger, uses a similar Pythagorean formula except with 16.5 as the exponent.
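To see how much the choice of exponent matters, the following sketch evaluates the same scoring ratio under exponents of 2, 14, and 16.5. The per-game point averages and the function name are hypothetical and serve only as an example.

```python
def pythagorean_win_pct(points_for, points_against, exponent):
    """Winning percentage in the ratio form 1 / (1 + (against / for) ** exponent)."""
    return 1.0 / (1.0 + (points_against / points_for) ** exponent)


# Hypothetical per-game averages, not real data.
points_for, points_against = 107.3, 103.7

for label, exponent in [("exponent 2", 2.0),
                        ("Oliver, exponent 14", 14.0),
                        ("Hollinger, exponent 16.5", 16.5)]:
    pct = pythagorean_win_pct(points_for, points_against, exponent)
    print(f"{label}: {pct:.3f}")
```

For a team that outscores its opponents, a larger exponent maps the same scoring ratio to a winning percentage further from .500.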

See also

* Baseball statistics
* Sabermetrics

Notes

* [http://arxiv.org/abs/math.ST/0509698 A Derivation of the Pythagorean Won-Loss Formula in Baseball, by Steven Miller] [http://arxiv.org/pdf/math.ST/0509698 PDF]

Wikimedia Foundation. 2010.
