Invariant estimator

In statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities. Strictly speaking, "invariant" would mean that the estimates themselves are unchanged when both the measurements and the parameters are transformed in a compatible way, but the meaning has been extended to allow the estimates to change in appropriate ways with such transformations. The term equivariant estimator is used in formal mathematical contexts that include a precise description of the relation of the way the estimator changes in response to changes to the dataset and parameterisation.

General setting

Background

In statistical inference, there are several approaches to estimation theory that can be used to decide immediately what estimators should be used. For example, ideas from Bayesian inference would lead directly to Bayesian estimators. Similarly, the theory of classical statistical inference can sometimes lead to strong conclusions about what estimator should be used. However, the usefulness of these theories depends on having a fully prescribed statistical model and may also depend on having a relevant loss function to determine the estimator. Thus a Bayesian analysis might be undertaken, leading to a posterior distribution for relevant parameters, but the choice of a specific utility or loss function may be unclear: ideas of invariance can then be applied to the task of summarising the posterior distribution. In other cases, statistical analyses are undertaken without a fully defined statistical model, or the classical theory of statistical inference cannot be readily applied because the family of models being considered is not amenable to such treatment. In addition to these cases where general theory does not prescribe an estimator, the concept of invariance of an estimator can be applied when seeking estimators of alternative forms, either for the sake of simplicity of application of the estimator or so that the estimator is robust.

The concept of invariance is sometimes used on its own as a way of choosing between estimators, but this is not necessarily definitive. For example, the requirement of unbiasedness is often imposed on an estimator, but this can conflict with the requirement that the estimator be invariant to transformations of the parameter being estimated.

One use of the concept of invariance is where a class or family of estimators is proposed and a particular formulation must be selected amongst these. One procedure is to impose relevant invariance properties and then to find the formulation within this class that has the best properties, leading to what is called the optimal invariant estimator.

Invariant estimators

There are several types of transformations that are usefully considered when dealing with invariant estimators:
*Shift invariance. Notionally, estimates of a location parameter should be invariant to simple shifts of the data values: if all data values are increased by a given amount, the estimate should change by the same amount. When considering estimation using a weighted average, this invariance requirement immediately implies that the weights should sum to one. While the same result is often derived from a requirement for unbiasedness, the use of "invariance" does not require that a mean value exists and makes no use of any probability distribution at all.
*Scale invariance. Notionally, estimates of a scale parameter should change in a compatible way when the data values are rescaled: if all data values are multiplied by a given positive constant, the estimate should be multiplied by the same constant. Note that this topic is not directly covered in the article on scale invariance.
*Parameter-transformation invariance. Here the transformation applies to the parameters alone. The concept is that essentially the same inference should be made from data and a model involving a parameter θ as would be made from the same data if the model used a parameter φ, where φ is a one-to-one transformation of θ, φ = h(θ). According to this type of invariance, results from transformation-invariant estimators should also be related by φ = h(θ): if an estimator yields the estimate θ̂ for θ, it should yield h(θ̂) for φ. Maximum likelihood estimators have this property; a sketch illustrating it follows this list.
*Permutation invariance. Where a set of data values can be represented by a statistical model in which they are outcomes from independent and identically distributed random variables, it is reasonable to impose the requirement that any estimator of any property of the common distribution should be permutation-invariant: specifically, that the estimator, considered as a function of the set of data-values, should not change if items of data are swapped within the dataset.
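
As promised above, here is a minimal sketch of the parameter-transformation invariance of maximum likelihood, using the exponential distribution with rate λ and mean φ = 1/λ (the data and parameter values are illustrative, not from the article): transforming the maximum likelihood estimate of the rate gives exactly the maximum likelihood estimate of the mean.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=1000)  # sample with true mean 2, rate 0.5

    # The MLE of the rate parameter lambda for an exponential sample is 1 / mean(x).
    lambda_hat = 1.0 / data.mean()

    # Reparameterise: phi = h(lambda) = 1 / lambda is the distribution's mean.
    # The MLE of phi computed directly from the data is mean(x), which is
    # exactly h(lambda_hat) -- the invariance property of maximum likelihood.
    phi_hat_direct = data.mean()
    phi_hat_transformed = 1.0 / lambda_hat
    assert np.isclose(phi_hat_direct, phi_hat_transformed)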

The combination of permutation invariance and location invariance for estimating a location parameter from an independent and identically distributed dataset using a weighted average implies that the weights should be identical and sum to one (see the sketch below). Of course, estimators other than a weighted average may be preferable.
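
The following sketch checks this numerically with hypothetical data: a weighted average with identical weights summing to one is both shift-invariant (in the sense above) and permutation-invariant, while an unequal-weight average keeps the first property but loses the second.

    import numpy as np

    x = np.array([3.1, 4.7, 2.2, 5.0])
    w = np.full(len(x), 1.0 / len(x))   # identical weights summing to one
    avg = lambda x, w: np.dot(w, x)

    # Shift invariance: adding c to every data value adds c to the estimate.
    c = 10.0
    assert np.isclose(avg(x + c, w), avg(x, w) + c)

    # Permutation invariance: reordering the data leaves the estimate unchanged.
    assert np.isclose(avg(x[::-1], w), avg(x, w))

    # Unequal weights (still summing to one) remain shift-invariant
    # but are no longer permutation-invariant.
    w2 = np.array([0.4, 0.3, 0.2, 0.1])
    assert np.isclose(avg(x + c, w2), avg(x, w2) + c)
    assert not np.isclose(avg(x[::-1], w2), avg(x, w2))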

Optimal invariant estimators

In this setting, we are given a measurement x which contains information about an unknown parameter \theta. The measurement x is modeled as a random variable having a probability density function f(x|\theta) which depends on \theta.

We would like to estimate \theta given x. The estimate, denoted by a, is a function of the measurements and belongs to a set A. The quality of the result is defined by a loss function L = L(a, \theta), which determines a risk function R = R(a, \theta) = E[L(a, \theta) | \theta].

We denote the sets of possible values of x, \theta, and a by X, \Theta, and A, respectively.
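
Since the risk function is just a conditional expectation, it can be approximated by simulation. A hedged sketch (the sampler, estimator, loss, and sample size below are illustrative choices, not part of the article's setup):

    import numpy as np

    def risk(estimator, loss, theta, sampler, n_mc=50_000, seed=0):
        # Monte Carlo approximation of R(a, theta) = E[L(a, theta) | theta],
        # where a = estimator(x) and sampler(theta, rng) draws x from f(x | theta).
        rng = np.random.default_rng(seed)
        return np.mean([loss(estimator(sampler(theta, rng)), theta)
                        for _ in range(n_mc)])

    # Example: five observations x ~ N(theta, 1), squared-error loss, sample mean.
    sampler = lambda theta, rng: rng.normal(theta, 1.0, size=5)
    sq_loss = lambda a, theta: (a - theta) ** 2
    print(risk(np.mean, sq_loss, theta=2.0, sampler=sampler))  # approx 1/5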

Mathematical setting

Definition

An invariant estimator is an estimator which obeys the following two rules:
# Principle of Rational Invariance: the action taken in a decision problem should not depend on the transformation applied to the measurement used.
# Invariance Principle: if two decision problems have the same formal structure (in terms of X, \Theta, f(x|\theta) and L), then the same decision rule should be used in each problem.

To define an invariant estimator formally we will first set some definitions about groups of transformations:

A group of transformations of X, to be denoted by G, is a set of (measurable) one-to-one transformations of X onto itself which satisfies the following conditions:

# If g_1 \in G and g_2 \in G then g_1 g_2 \in G (the group is closed under composition);
# If g \in G then g^{-1} \in G, where g^{-1}(g(x)) = x (that is, each transformation has an inverse within the group);
# e \in G (i.e. there is an identity transformation e(x) = x).
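
For instance, the translations g_c(x) = x + c form such a group: composition gives closure (g_{c_1} g_{c_2} = g_{c_1 + c_2}), g_{-c} is the inverse of g_c, and g_0 is the identity. A quick numerical check of the three conditions at an arbitrary point:

    import math

    def g(c):
        # The translation g_c(x) = x + c.
        return lambda x: x + c

    x = 1.7
    # 1. Closure: composing two translations is the translation by the summed shift.
    assert math.isclose(g(0.5)(g(-2.0)(x)), g(0.5 + (-2.0))(x))
    # 2. Inverse: g_{-c} undoes g_c.
    assert math.isclose(g(-0.5)(g(0.5)(x)), x)
    # 3. Identity: g_0 leaves every point unchanged.
    assert math.isclose(g(0.0)(x), x)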

Datasets x_1 and x_2 in X are equivalent if x_1 = g(x_2) for some g \in G. All the equivalent points form an equivalence class. Such an equivalence class is called an orbit (in X). The orbit of x_0, X(x_0), is the set X(x_0) = \{g(x_0) : g \in G\}. If X consists of a single orbit then G is said to be transitive.

A family of densities F is said to be invariant under the group G if, for every g \in G and \theta \in \Theta, there exists a unique \theta^* \in \Theta such that Y = g(x) has density f(y|\theta^*). \theta^* will be denoted \bar{g}(\theta).

If F is invariant under the group G, then the loss function L(\theta, a) is said to be invariant under G if for every g \in G and a \in A there exists an a^* \in A such that L(\theta, a) = L(\bar{g}(\theta), a^*) for all \theta \in \Theta. a^* will be denoted \tilde{g}(a).

\bar{G} = \{\bar{g} : g \in G\} is a group of transformations from \Theta to itself and \tilde{G} = \{\tilde{g} : g \in G\} is a group of transformations from A to itself.

An estimation problem is invariant under G if there exist three such groups G, \bar{G}, \tilde{G} as defined above.

For an estimation problem that is invariant under G, an estimator \delta(x) is an invariant estimator under G if, for all x \in X and g \in G, \delta(g(x)) = \tilde{g}(\delta(x)).
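
For example (a sketch with an arbitrary dataset), take G to be the translations of the earlier example, acting coordinate-wise on a vector of measurements, with \tilde{g}_c(a) = a + c. The sample mean satisfies the defining equation, while the mean of squares does not:

    import numpy as np

    x = np.array([0.3, -1.2, 2.5, 0.0])
    c = 3.25

    # Sample mean: delta(g_c(x)) == tilde{g}_c(delta(x)), so it is invariant under G.
    assert np.isclose(np.mean(x + c), np.mean(x) + c)

    # Mean of squares: fails the defining equation, so it is not invariant under G.
    assert not np.isclose(np.mean((x + c) ** 2), np.mean(x ** 2) + c)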

Properties

# The risk function of an invariant estimator \delta is constant on orbits of \Theta. Equivalently, R(\theta, \delta) = R(\bar{g}(\theta), \delta) for all \theta \in \Theta and \bar{g} \in \bar{G}.
# The risk function of an invariant estimator with transitive \bar{G} is constant.

For a given problem, the invariant estimator with the lowest risk is termed the "best invariant estimator". A best invariant estimator cannot always be achieved; a special case in which it can be achieved is when \bar{G} is transitive.
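
The second property above can be seen numerically in the location setting, where \bar{G} (the translations) is transitive on \Theta = \mathbb{R}: the Monte Carlo risk of the sample mean under squared error is, up to simulation noise, the same for every \theta. A sketch with illustrative values:

    import numpy as np

    rng = np.random.default_rng(1)
    for theta in (0.0, 1.0, 5.0):
        x = rng.normal(theta, 1.0, size=(200_000, 5))         # x ~ N(theta, 1), n = 5
        print(theta, np.mean((x.mean(axis=1) - theta) ** 2))  # all approx 1/5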

Example: Location parameter

\theta is a location parameter if the density of X is f(x - \theta). For \Theta = A = \mathbb{R}^1 and L = L(a - \theta), the problem is invariant under G = \bar{G} = \tilde{G} = \{g_c : g_c(x) = x + c, c \in \mathbb{R}\}. An invariant estimator in this case must satisfy \delta(x + c) = \delta(x) + c for all c \in \mathbb{R}, and thus it is of the form \delta(x) = x + K (K \in \mathbb{R}). \bar{G} is transitive on \Theta, so the risk here is constant: R(\theta, \delta) = R(0, \delta) = E[L(X + K) | \theta = 0]. The best invariant estimator is the one that minimizes this risk.

In the case that L is squared error, \delta(x) = x - E[X | \theta = 0].
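
A small sketch of this result, under the assumption of standard-exponential noise (so E[X | \theta = 0] = 1 and the best invariant estimator is \delta(x) = x - 1); the Monte Carlo risk is constant in \theta and equals Var(X) = 1:

    import numpy as np

    rng = np.random.default_rng(2)
    theta = 2.0
    x = theta + rng.exponential(1.0, size=200_000)  # X has density f(x - theta)
    delta = x - 1.0                                 # subtract E[X | theta = 0]
    print(np.mean((delta - theta) ** 2))            # approx Var(X) = 1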

Pitman estimator

Given the estimation problem: X = (X_1, \dots, X_n) has density f(x_1 - \theta, \dots, x_n - \theta) and loss L(|a - \theta|). This problem is invariant under G = \{g_c : g_c(x) = (x_1 + c, \dots, x_n + c), c \in \mathbb{R}^1\}, \bar{G} = \{\bar{g}_c : \bar{g}_c(\theta) = \theta + c, c \in \mathbb{R}^1\} and \tilde{G} = \{\tilde{g}_c : \tilde{g}_c(a) = a + c, c \in \mathbb{R}^1\} (the additive groups).

The best invariant estimator \delta(x) is the one that minimizes

\frac{\int_{-\infty}^{\infty} L(\delta(x) - \theta) \, f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}{\int_{-\infty}^{\infty} f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}

(Pitman's estimator, 1939).

For the squared error loss case, we get that

\delta(x) = \frac{\int_{-\infty}^{\infty} \theta \, f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}{\int_{-\infty}^{\infty} f(x_1 - \theta, \dots, x_n - \theta) \, d\theta}.
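
The squared-error Pitman estimator is a ratio of two one-dimensional integrals, so it can be computed by direct quadrature. A sketch (the noise density, data, and integration grid are illustrative choices; np.trapezoid is called np.trapz on NumPy versions before 2.0):

    import numpy as np

    def pitman_estimate(x, f, grid=np.linspace(-50.0, 50.0, 100_001)):
        # f is the noise density (the density of each X_i when theta = 0).
        # Joint density of the sample at shift theta: prod_i f(x_i - theta).
        joint = np.prod(f(np.subtract.outer(x, grid)), axis=0)
        return np.trapezoid(grid * joint, grid) / np.trapezoid(joint, grid)

    # Gaussian noise: the Pitman estimator coincides with the sample mean.
    normal_pdf = lambda z: np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)
    x = np.array([1.2, -0.3, 0.8, 2.1])
    print(pitman_estimate(x, normal_pdf), x.mean())  # should agree closely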

If x \sim N(\theta 1_n, I) (normal distribution) then \delta_{Pitman} = \delta_{ML} = \frac{\sum x_i}{n}.

If x \sim C(\theta 1_n, \sigma^2 I) (Cauchy distribution) then \delta_{Pitman} \ne \delta_{ML}, and for n > 1,

\delta_{Pitman} = \sum_{k=1}^{n} x_k \left[ \frac{\Re\{w_k\}}{\sum_{m=1}^{n} \Re\{w_m\}} \right],

where w_k = \prod_{j \ne k} \left[ \frac{1}{(x_k - x_j)^2 + 4\sigma^2} \right] \left[ 1 - \frac{2\sigma}{(x_k - x_j)} i \right].
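
Reusing the pitman_estimate sketch above with a standard Cauchy density (σ = 1, hypothetical data) gives the Pitman estimate of a Cauchy location parameter numerically; unlike the Gaussian case, it does not coincide with the maximum likelihood estimate:

    import numpy as np

    # Assumes pitman_estimate from the previous sketch is in scope.
    cauchy_pdf = lambda z: 1.0 / (np.pi * (1.0 + z ** 2))
    x = np.array([1.2, -0.3, 0.8, 2.1, 10.0])
    print(pitman_estimate(x, cauchy_pdf))  # differs from the Cauchy location MLE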

References

* James O. Berger (1980). Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics. ISBN 0-387-90471-9.
* Gabriela V. Cohen Freue (2007). "The Pitman estimator of the Cauchy location parameter". Journal of Statistical Planning and Inference 137: 1900–1913.

