Score (statistics)
In statistics, the score or score function is the partial derivative, with respect to some parameter θ, of the logarithm (commonly the natural logarithm) of the likelihood function. If the observation is X and its likelihood is L(θ; X), then the score V can be found through the chain rule:
:V = \frac{\partial}{\partial\theta} \log L(\theta; X) = \frac{1}{L(\theta; X)}\,\frac{\partial L(\theta; X)}{\partial\theta}.
Note that V is a function of θ and the observation X, so that, in general, it is not a statistic. Note also that V indicates the "sensitivity" of L(θ; X) (its variation normalized by its value).
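For concreteness, here is a minimal numerical sketch (not part of the original article) that assumes a hypothetical N(θ, 1) model for a single observation and evaluates the score as the derivative of the log-likelihood by a central finite difference; the closed form in this case is simply x − θ:

```python
import numpy as np

def log_likelihood(theta, x, sigma=1.0):
    """Log-likelihood of one observation x under a normal model N(theta, sigma^2)."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - theta)**2 / (2 * sigma**2)

def score(theta, x, eps=1e-6):
    """Score V = d/dtheta log L(theta; x), approximated by a central finite difference."""
    return (log_likelihood(theta + eps, x) - log_likelihood(theta - eps, x)) / (2 * eps)

x = 1.3          # a single observation
theta = 0.5      # parameter value at which the score is evaluated
print(score(theta, x))   # ~0.8, matching the closed form (x - theta)
```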
Mean
The expected value of V with respect to the observation x, given θ, written \mathbb{E}(V \mid \theta), is zero. To see this, rewrite the definition of expectation, using the fact that the probability density function is just L(θ; x), which is conventionally denoted by f(x; θ) (in which the dependence on x is more explicit). The corresponding cumulative distribution function is denoted by F(x; θ). With this change of notation, and writing f'_{\theta}(x; \theta) for the partial derivative with respect to θ,
:\mathbb{E}(V \mid \theta) = \int_X \frac{f'_{\theta}(x;\theta)}{f(x;\theta)} \, dF(x;\theta) = \int_X \frac{f'_{\theta}(x;\theta)}{f(x;\theta)}\, f(x;\theta) \, dx = \int_X \frac{\partial f(x;\theta)}{\partial\theta} \, dx
where the integral runs over the whole of the probability space of X and a prime denotes partial differentiation with respect to θ. If certain differentiability conditions are met, the integral may be rewritten as
:\frac{\partial}{\partial\theta} \int_X f(x;\theta) \, dx = \frac{\partial}{\partial\theta} 1 = 0.
It is worth restating the above result in words: the expected value of the score is zero. Thus, if one were to repeatedly sample from some distribution, and repeatedly calculate the score with the true θ, then the mean value of the scores would tend to zero as the number of repeat samples approached infinity.
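As an illustration (again using the hypothetical N(θ, 1) model from the sketch above, for which the score of one observation is x − θ), a simulation of repeated sampling at the true parameter shows the average score settling near zero:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.5
x = rng.normal(loc=theta_true, scale=1.0, size=100_000)   # repeated samples from the model

scores = x - theta_true   # score of each observation under N(theta, 1), evaluated at the true theta
print(scores.mean())      # close to 0, as derived above
```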
Variance
The variance of the score is known as the Fisher information and is written \mathcal{I}(\theta). Because the expectation of the score is zero, this may be written as
:\mathcal{I}(\theta) = \mathbb{E}\left\{ \left. \left[ \frac{\partial}{\partial\theta} \log L(\theta; X) \right]^2 \,\right|\, \theta \right\}.
Note that the Fisher information, as defined above, is not a function of any particular observation, as the random variable X has been averaged out. This concept of information is useful when comparing two methods of observation of some random process.
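Continuing the same hypothetical N(θ, 1) sketch, the Fisher information of a single observation is 1/σ² = 1, and because the score has mean zero it can be estimated as the average of the squared scores:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 0.5
x = rng.normal(loc=theta_true, scale=1.0, size=100_000)

scores = x - theta_true                # score of one observation under N(theta, 1)
fisher_estimate = np.mean(scores**2)   # E[V^2 | theta], valid since E[V | theta] = 0
print(fisher_estimate)                 # close to 1, the Fisher information per observation
```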
Example
Consider a Bernoulli process, with "A" successes and "B" failures; the probability of success is θ. Then the likelihood "L" is
:L(\theta; A, B) = \frac{(A+B)!}{A!\,B!}\,\theta^A (1-\theta)^B,
so the score "V" is given by taking the partial derivative of the log likelihood function as follows:
:V = \frac{\partial}{\partial\theta} \log\left[ L(\theta; A, B) \right] = \frac{1}{L}\frac{\partial L}{\partial\theta}.
This is a standard calculus problem: "A" and "B" are treated as constants. Then
:V = \frac{A}{\theta} - \frac{B}{1-\theta}.
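As a quick symbolic check (a sketch using sympy, not part of the original derivation), differentiating the log-likelihood above with respect to θ reproduces this expression:

```python
import sympy as sp

theta, A, B = sp.symbols('theta A B', positive=True)
# Log of the likelihood L(theta; A, B); the binomial coefficient does not depend on theta.
log_L = sp.log(sp.binomial(A + B, A)) + A * sp.log(theta) + B * sp.log(1 - theta)

V = sp.diff(log_L, theta)
expected = A / theta - B / (1 - theta)
print(sp.simplify(V - expected) == 0)   # True: the derivative equals A/theta - B/(1 - theta)
```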
So if the score is zero, θ = "A"/("A" + "B"). We can now verify that the expectation of the score is zero. Writing "n" = "A" + "B" for the number of trials, and noting that the expectation of "A" is "n"θ and the expectation of "B" is "n"(1 − θ), we can see that the expectation of "V" is
:\mathbb{E}(V) = \frac{n\theta}{\theta} - \frac{n(1-\theta)}{1-\theta} = n - n = 0.
We can also check the variance of V. We know that "A" + "B" = "n" (so "B" = "n" − "A") and the variance of "A" is "n"θ(1 − θ), so the variance of "V" is
:\operatorname{var}(V) = \operatorname{var}\!\left( \frac{A}{\theta} - \frac{n-A}{1-\theta} \right) = \operatorname{var}\!\left( A\left( \frac{1}{\theta} + \frac{1}{1-\theta} \right) \right) = \left( \frac{1}{\theta} + \frac{1}{1-\theta} \right)^2 \operatorname{var}(A) = \frac{n}{\theta(1-\theta)}.
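The following simulation sketch (with hypothetical values n = 50 and θ = 0.3) checks both results empirically, i.e. that the score at the true θ has mean near 0 and variance near n/(θ(1 − θ)) ≈ 238.1:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 50, 0.3
A = rng.binomial(n, theta, size=200_000)   # number of successes in each simulated run of n trials
B = n - A

V = A / theta - B / (1 - theta)   # score evaluated at the true theta
print(V.mean())                   # close to 0
print(V.var())                    # close to n / (theta * (1 - theta))
```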
See also
*Fisher information
*Information theory
*Score test
*Support curve