Minimum mean square error

In statistics and signal processing, a minimum mean square error (MMSE) estimator is an estimator that minimizes the mean square error (MSE), a common measure of estimator quality.

The term MMSE specifically refers to estimation in a Bayesian setting, since in the alternative frequentist setting there does not exist a single estimator having minimal MSE. A somewhat similar concept can be obtained within the frequentist point of view if one requires unbiasedness, since an estimator may exist that minimizes the variance (and hence the MSE) among unbiased estimators. Such an estimator is then called the minimum-variance unbiased estimator (MVUE).

Definition

Let X be an unknown random variable, and let Y be a known random variable (the measurement). An estimator \hat{X}(y) is any function of the measurement Y, and its MSE is given by

\mathrm{MSE} = E \left\{ (\hat{X} - X)^2 \right\}

where the expectation is taken over both X and Y.

The MMSE estimator is then defined as the estimator achieving minimal MSE.

In many cases, it is not possible to determine a closed form for the MMSE estimator. In these cases, one possibility is to seek the technique minimizing the MSE within a particular class, such as the class of linear estimators. The linear MMSE estimator is the estimator achieving minimum MSE among all estimators of the form AY + b. If the measurement Y is a random vector, A is a matrix and b is a vector. (Such an estimator would more correctly be termed an affine MMSE estimator, but the term linear estimator is widely used.)
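
As a rough illustration of the linear MMSE idea, the following sketch (not part of the original article; the simulated measurement model and variable names are illustrative assumptions) builds an affine estimator of the form AY + b from second-order statistics and checks its empirical MSE by simulation, assuming NumPy is available.

    # Sketch: a linear (affine) MMSE estimator x_hat = A*y + b built from
    # second-order statistics. The simulated model below is an illustrative
    # assumption, not taken from the article.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Scalar unknown X and a 2-dimensional measurement Y = H*X + noise.
    x = rng.normal(0.0, 2.0, size=n)
    H = np.array([1.0, 0.5])
    y = x[:, None] * H + rng.normal(0.0, 1.0, size=(n, 2))

    # Second-order statistics (estimated from samples here; in practice they
    # may be known in closed form).
    mx, my = x.mean(), y.mean(axis=0)
    joint = np.cov(np.column_stack([x, y]).T)   # 3x3 covariance of (X, Y1, Y2)
    C_xy = joint[0, 1:]                         # Cov(X, Y)
    C_yy = joint[1:, 1:]                        # Cov(Y, Y)

    # Affine MMSE estimate: x_hat = E[X] + C_xy C_yy^{-1} (y - E[Y]).
    A = np.linalg.solve(C_yy, C_xy)
    x_hat = mx + (y - my) @ A

    print("MSE of linear MMSE estimate:", np.mean((x_hat - x) ** 2))
    print("MSE of using E[X] alone:    ", np.mean((mx - x) ** 2))

The same second-order formula is applied, with the given population moments rather than sample moments, in the numerical example later in this article.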

Properties

  • Under some weak regularity assumptions,[1] the MMSE estimator is uniquely defined, and is given by
\hat{X}_{\mathrm{MMSE}}(y) = E \left\{ X | Y=y \right\}.
In other words, the MMSE estimator is the conditional expectation of X given the observed value of the measurements.
  • If X and Y are jointly Gaussian, then the MMSE estimator is linear, i.e., it has the form aY + b for constants a and b. As a consequence, to find the MMSE estimator, it is sufficient to find the linear MMSE estimator. Such a situation occurs in the example presented in the next section.
  • The orthogonality principle: an estimator \hat{X} is the MMSE estimator if and only if
E \{ (\hat{X}-X) f(Y) \} = 0
for all functions f(Y) of the measurements. A different version of the orthogonality principle exists for linear MMSE estimators. A numerical check of these two properties is sketched below.
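
As a hedged numerical illustration of the two properties above (the joint distribution, correlation, and sample size below are assumptions made for the sketch, not part of the article), one can simulate a jointly Gaussian pair, form the conditional-mean estimator, and check that the resulting error is uncorrelated with several functions of Y:

    # Sketch: for a jointly Gaussian scalar pair (X, Y), the MMSE estimator
    # E[X | Y] is linear in Y, and its error is orthogonal to any f(Y).
    # The particular correlation chosen below is an illustrative assumption.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 500_000

    x = rng.normal(size=n)
    y = 0.8 * x + 0.6 * rng.normal(size=n)     # jointly Gaussian with X

    # E[X | Y = y] = E[X] + Cov(X, Y)/Var(Y) * (y - E[Y])  (linear in y).
    cov = np.cov(x, y)
    a = cov[0, 1] / cov[1, 1]
    x_hat = x.mean() + a * (y - y.mean())
    err = x_hat - x

    # Orthogonality principle: E[(x_hat - X) f(Y)] ~ 0 for any f(Y).
    for f in (lambda t: t, lambda t: t ** 2, np.sin):
        print(np.mean(err * f(y)))             # all close to 0 (Monte Carlo noise)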

Example

As an example, consider using a linear combination of the random variables X1, X2, and X3 to estimate another random variable X4; the estimate is denoted \hat X_{4}. If the random variables X = [X1,X2,X3,X4]T are real Gaussian random variables with zero mean and covariance matrix given by

\operatorname{cov}(X)=E[XX^{T}]=\left[\begin{array}{cccc}
1 & 2 & 3 & 4\\
2 & 5 & 8 & 9\\
3 & 8 & 6 & 10\\
4 & 9 & 10 & 15\end{array}\right],

we will estimate X4 and find coefficients ai such that \hat X_{4}=\sum_{i=1}^{3}a_{i}X_{i} is an optimal estimate of X4. We will use the autocorrelation matrix, R, and the cross-correlation vector, C, to find the vector A of coefficients that minimizes the mean square error of the estimate. The autocorrelation matrix R is defined as

R=\left[\begin{array}{ccc}
E[X_{1}X_{1}] & E[X_{1}X_{2}] & E[X_{1}X_{3}]\\
E[X_{2}X_{1}] & E[X_{2}X_{2}] & E[X_{2}X_{3}]\\
E[X_{3}X_{1}] & E[X_{3}X_{2}] & E[X_{3}X_{3}]\end{array}\right]=\left[\begin{array}{ccc}
1 & 2 & 3\\
2 & 5 & 8\\
3 & 8 & 6\end{array}\right].

The cross-correlation vector C is defined as

C=\left[\begin{array}{c}
E[X_{4}X_{1}]\\
E[X_{4}X_{2}]\\
E[X_{4}X_{3}]\end{array}\right]=\left[\begin{array}{c}
4\\
9\\
10\end{array}\right].
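
For completeness, the matrix equation solved in the next step follows from the orthogonality principle: requiring the estimation error to be uncorrelated with each of the observed variables gives, for j = 1, 2, 3,

E\left[\left(\hat X_{4}-X_{4}\right)X_{j}\right]=0
\quad\Longrightarrow\quad
\sum_{i=1}^{3}a_{i}E[X_{i}X_{j}]=E[X_{4}X_{j}],

and stacking these three equations yields the matrix equation RA = C used next.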

To find the optimal coefficients prescribed by the orthogonality principle, we solve the equation RA = C by inverting R and multiplying to get

R^{-1}C=\left[\begin{array}{ccc}
4.8571 & -1.7143 & -0.1429\\
-1.7143 & 0.4286 & 0.2857\\
-0.1429 & 0.2857 & -0.1429\end{array}\right]\left[\begin{array}{c}
4\\
9\\
10\end{array}\right]=\left[\begin{array}{c}
2.5714\\
-0.1429\\
0.5714\end{array}\right]=A.

So we have a1 = 2.5714, a2 = −0.1429, and a3 = 0.5714 as the optimal coefficients for \hat X_{4}. Computing the minimum mean square error then gives \left\Vert e\right\Vert _{\min}^{2}=E[X_{4}X_{4}]-C^{T}A=15-14.7143=0.2857.[2]
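
The computation above can be reproduced in a few lines; the sketch below (assuming NumPy is available; it is not part of the cited references) extracts R and C from the given covariance matrix, solves RA = C, and evaluates the minimum mean square error.

    # Sketch: reproduce the numerical example with NumPy.
    import numpy as np

    cov = np.array([[ 1.,  2.,  3.,  4.],
                    [ 2.,  5.,  8.,  9.],
                    [ 3.,  8.,  6., 10.],
                    [ 4.,  9., 10., 15.]])

    R = cov[:3, :3]              # autocorrelation matrix of X1, X2, X3
    C = cov[:3, 3]               # cross-correlations E[X4 Xi]

    A = np.linalg.solve(R, C)    # optimal coefficients a1, a2, a3
    mmse = cov[3, 3] - C @ A     # E[X4 X4] - C^T A

    print(np.round(A, 4))        # [ 2.5714 -0.1429  0.5714]
    print(round(mmse, 4))        # 0.2857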

A shorter, non-numerical example can be found in orthogonality principle.

Notes

  1. ^ Lehmann and Casella, Corollary 4.1.2.
  2. ^ Moon and Stirling.

Further reading

  • Johnson, D. (22 November 2004). Minimum Mean Squared Error Estimators. Connexions.
  • Bibby, J.; Toutenburg, H. (1977). Prediction and Improved Estimation in Linear Models. Wiley. This book looks almost exclusively at minimum mean-square error estimation and inference.
  • Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.
  • Lehmann, E. L.; Casella, G. (1998). Theory of Point Estimation (2nd ed.), ch. 4. Springer. ISBN 0-387-98502-6.
  • Kay, S. M. (1993). Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall. pp. 344–350. ISBN 0-13-042268-1.
  • Moon, T. K.; Stirling, W. C. (2000). Mathematical Methods and Algorithms for Signal Processing. Prentice Hall.
