Confidence distribution

This article is about the confidence distribution. For Confidence interval, see Confidence interval.

In statistics, the concept of a confidence distribution (CD) has often been loosely referred to as a distribution function on the parameter space that can represent confidence intervals of all levels for a parameter of interest. Historically, it has typically been constructed by inverting the upper limits of lower-sided confidence intervals of all levels, and it was commonly associated with a fiducial^[1] interpretation (fiducial distribution). A confidence distribution is not a probability distribution of the parameter of interest.^[2]

In recent years, there has been a surge of renewed interest in confidence distributions.^{[citation needed]} In the more recent developments, the concept of confidence distribution has emerged as a purely frequentist concept, without any fiducial interpretation or reasoning. Conceptually, a confidence distribution is no different from a point estimator or an interval estimator (confidence interval), but it uses a sample-dependent distribution function on the parameter space (instead of a point or an interval) to estimate the parameter of interest.

A simple example of a confidence distribution that has been broadly used in statistical practice is a bootstrap distribution.^[3] The development and interpretation of a bootstrap distribution do not involve any fiducial reasoning; the same is true for the concept of a confidence distribution. But the notion of a confidence distribution is much broader than that of a bootstrap distribution. In particular, recent research suggests that it encompasses and unifies a wide range of examples, from regular parametric cases (including most examples of the classical development of Fisher's fiducial distribution) to bootstrap distributions, p-value functions,^[4] normalized likelihood functions and, in some cases, Bayesian priors and Bayesian posteriors.^[5]

Just as a Bayesian posterior distribution contains a wealth of information for any type of Bayesian inference, a confidence distribution contains a wealth of information for constructing almost all types of frequentist inferences, including point estimates, confidence intervals and p-values, among others. Some recent developments have highlighted the promise of the CD concept as an effective inferential tool.^{[citation needed]}

1 The history of CD concept
2 Definition
3 Using CD to make inference
4 References
5 Bibliography

The history of CD concept

Neyman (1937)^[6] introduced the idea of "confidence" in his seminal paper on confidence intervals, which clarified the frequentist repetition property. According to Fraser,^[7] the seed (idea) of the confidence distribution can even be traced back to Bayes (1763)^[8] and Fisher (1930).^[1] Some researchers view the confidence distribution as "the Neymanian interpretation of Fisher's fiducial distribution",^[9] which was "furiously disputed by Fisher".^[10] It is also believed that these "unproductive disputes" and Fisher's "stubborn insistence"^[10] might be the reason that the concept of the confidence distribution has long been misconstrued as a fiducial concept and has not been fully developed under the frequentist framework.^[5]^[11] Indeed, the confidence distribution is a purely frequentist concept with a purely frequentist interpretation, and it also has ties to Bayesian inference concepts and the fiducial arguments.

Definition

Classical definition

Classically, a confidence distribution is defined by inverting the upper limits of a series of lower-sided confidence intervals.^[12]^{[page needed]}^[13]^{[page needed]} In particular,

Definition (classical definition): For every α in (0, 1), let (−∞, ξ_n(α)] be a 100α% lower-side confidence interval for θ, where ξ_n(α) = ξ_n(X_n, α) is continuous and increasing in α for each sample X_n. Then, H_n(•) = ξ_n⁻¹(•) is a confidence distribution for θ.

Efron stated that this distribution "assigns probability 0.05 to θ lying between the upper endpoints of the 0.90 and 0.95 confidence interval, etc." and "it has powerful intuitive appeal".^[12] In the classical literature,^{[citation needed]} the confidence distribution function is interpreted as a distribution function of the parameter θ, which is impossible unless fiducial reasoning is involved since, in a frequentist setting, the parameters are fixed and nonrandom.

Interpreting the CD function entirely from a frequentist viewpoint, rather than as a distribution function of a (fixed, nonrandom) parameter, is one of the major departures of recent developments from the classical approach. The advantage of treating the confidence distribution as a purely frequentist concept (similar to a point estimator) is that it is free from the restrictive, if not controversial, constraints set forth by Fisher on fiducial distributions.^[5]^[11]

The modern definition

The following definition applies.^[9]^[14]^[15] In the definition, Θ is the parameter space of the unknown parameter of interest θ, and χ is the sample space corresponding to data X_n={X₁,...,X_n}.

Definition

A function H_n(•) = H_n(X_n, •) from χ × Θ to [0, 1] is called a confidence distribution (CD) for a parameter θ if it satisfies two requirements:

(R1) For each given X_n ∈ χ, H_n(•) = H_n(X_n, •) is a continuous cumulative distribution function on Θ;
(R2) At the true parameter value θ = θ₀, H_n(θ₀) ≡ H_n(X_n, θ₀), as a function of the sample X_n, follows the uniform distribution U[0, 1].

Also, the function H is an asymptotic CD (aCD), if the U[0, 1] requirement is true only asymptotically and the continuity requirement on H_n(•) is dropped.

In nontechnical terms, a confidence distribution is a function of both the parameter and the random sample, with two requirements. The first requirement (R1) simply requires that a CD should be a distribution on the parameter space. The second requirement (R2) sets a restriction on the function so that inferences (point estimators, confidence intervals and hypothesis testing, etc.) based on the confidence distribution have desired frequentist properties. This is similar to the restrictions in point estimation to ensure certain desired properties, such as unbiasedness, consistency, efficiency, etc.^[5]^[16]

A confidence distribution derived by inverting the upper limits of confidence intervals (classical definition) also satisfies the requirements in the above definition and this version of the definition is consistent with the classical definition.^[15]
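Requirement (R2) can be checked empirically by simulation. The sketch below is only an illustration under stated assumptions: it takes a normal sample with known σ and the candidate CD Φ(√n(θ − X̄)/σ), and the true mean, σ, sample size, number of replications and seed are all made-up values. If (R2) holds, the CD evaluated at the true parameter value should behave like a U[0, 1] draw across repeated samples.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def h_phi(theta, sample, sigma):
    """Candidate CD for a normal mean with known sigma (illustrative choice)."""
    n = len(sample)
    return STD_NORMAL.cdf(sqrt(n) * (theta - fmean(sample)) / sigma)

def cd_values_at_truth(theta0, sigma, n, reps, seed=0):
    """Evaluate the CD at the true parameter for `reps` independent samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(theta0, sigma) for _ in range(n)]
        values.append(h_phi(theta0, sample, sigma))
    return values

# All numbers here are assumptions chosen only for the demonstration.
values = cd_values_at_truth(theta0=2.0, sigma=1.5, n=10, reps=20000)

# Under (R2) the values are U[0, 1]: mean near 1/2, and P(value <= beta) near beta.
mean_value = fmean(values)
coverage = {beta: sum(v <= beta for v in values) / len(values)
            for beta in (0.1, 0.5, 0.9)}
print(mean_value, coverage)
```

The empirical fractions in `coverage` are the simulated coverage rates of the one-sided intervals (−∞, H_n⁻¹(β)], which is why (R2) delivers the frequentist guarantees mentioned above.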

Examples

Example 1: Normal Mean and Variance

Suppose a normal sample X_i ~ N(μ, σ²), i = 1, 2, ..., n is given.

(1) Variance σ² is known

Both the functions $H_{\Phi}(\mu)$ and $H_t(\mu)$ given by

$H_{\Phi}(\mu) = \Phi\left(\frac{\sqrt{n}(\mu-\bar{X})}{\sigma}\right) , \quad\text{and}\quad H_{t}(\mu) = F_{t_{n-1}}\left(\frac{\sqrt{n}(\mu-\bar{X})}{s}\right) ,$

satisfy the two requirements in the CD definition, and they are confidence distribution functions for μ.^{[citation needed]} Here, Φ is the cumulative distribution function of the standard normal distribution, and $F_{t_{n-1}}$ is the cumulative distribution function of the Student $t_{n-1}$ distribution. Furthermore,

$H_A(\mu) = \Phi\left(\frac{\sqrt{n}(\mu-\bar{X})}{s}\right)$

satisfies the definition of an asymptotic confidence distribution as n→∞, and it is an asymptotic confidence distribution for μ. Using $H_{\Phi}(\mu)$ and $H_A(\mu)$ is equivalent to stating that we use $N(\bar{X},\sigma^2/n)$ and $N(\bar{X},s^2/n)$ to estimate $\mu$, respectively.
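As a rough numerical illustration, the functions Φ(√n(μ − X̄)/σ) and Φ(√n(μ − X̄)/s) can be evaluated on a small data set. The sample and the "known" σ below are made-up values, and the Student t version is left out only because the Python standard library has no t distribution. The sketch also checks that the first function coincides with the cumulative distribution function of a normal distribution centred at X̄ with standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

STD_NORMAL = NormalDist()

def h_phi(mu, sample, sigma):
    """CD for mu when sigma is known: Phi(sqrt(n) * (mu - xbar) / sigma)."""
    n = len(sample)
    return STD_NORMAL.cdf(sqrt(n) * (mu - fmean(sample)) / sigma)

def h_a(mu, sample):
    """Asymptotic CD for mu: Phi(sqrt(n) * (mu - xbar) / s)."""
    n = len(sample)
    return STD_NORMAL.cdf(sqrt(n) * (mu - fmean(sample)) / stdev(sample))

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 5.2]  # hypothetical data
sigma = 0.6                                         # assumed known for the sketch
n = len(sample)
xbar = fmean(sample)

# The known-sigma CD viewed as an ordinary normal CDF centred at the sample mean.
as_normal = NormalDist(mu=xbar, sigma=sigma / sqrt(n))
grid = [4.0 + 0.1 * k for k in range(21)]
max_gap = max(abs(h_phi(m, sample, sigma) - as_normal.cdf(m)) for m in grid)

h_a_values = [h_a(m, sample) for m in grid]
print(max_gap, h_phi(xbar, sample, sigma), h_a(xbar, sample))
```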

(2) Variance σ² is unknown

For the parameter μ, since $H_{\Phi}(\mu)$ involves the unknown parameter σ, it violates the two requirements in the CD definition and is no longer a "distribution estimator" or a confidence distribution for μ. However, $H_t(\mu)$ is still a CD for μ and $H_A(\mu)$ is an aCD for μ.

For the parameter σ², the sample-dependent cumulative distribution function

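$H_{\chi^2}(\theta)=1-F_{\chi^2_{n-1}}\left((n-1)s^2/\theta\right)$

is a confidence distribution function for σ². Here, $F_{\chi^2_{n-1}}$ is the cumulative distribution function of the chi-squared distribution with n − 1 degrees of freedom, and the factor n − 1 appears because $(n-1)s^2/\sigma^2$ follows that distribution.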
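The variance CD can be sketched numerically using the pivot (n − 1)s²/σ², which follows a chi-squared distribution with n − 1 degrees of freedom. To stay within the Python standard library, the sketch below uses the closed-form upper-tail probability of the chi-squared distribution, which is available only for an even number of degrees of freedom, so it assumes an odd sample size. The helper names and the data are hypothetical.

```python
from math import exp
from statistics import variance

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(chi2_df > x), valid only for even df.

    For df = 2k the closed form is exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even, positive df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # builds (x/2)^j / j! incrementally
        total += term
    return exp(-half) * total

def h_chi2(theta, sample):
    """CD for sigma^2 at theta > 0: 1 - F_{chi2_{n-1}}((n - 1) s^2 / theta)."""
    n = len(sample)
    s2 = variance(sample)         # sample variance with divisor n - 1
    return chi2_sf_even_df((n - 1) * s2 / theta, n - 1)

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6]   # hypothetical data, n = 7 (odd)
thetas = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
cd_values = [h_chi2(t, sample) for t in thetas]
print(list(zip(thetas, cd_values)))
```

For general sample sizes one would replace `chi2_sf_even_df` with a chi-squared distribution function from a numerical library.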

Example 2: Bivariate normal correlation

Let ρ denote the correlation coefficient of a bivariate normal population. It is well known that Fisher's z, defined by the Fisher transformation

$z = {1 \over 2}\ln{1+r \over 1-r}$

has the limiting distribution $N({1 \over 2}\ln{{1+\rho}\over{1-\rho}}, {1 \over n-3})$ with a fast rate of convergence, where r is the sample correlation and n is the sample size.

The function

$H_n(\rho) = 1 - \Phi\left(\sqrt{n-3} \left({1 \over 2}\ln{1+r \over 1-r} -{1 \over 2}\ln{{1+\rho}\over{1-\rho}} \right)\right)$

is an asymptotic confidence distribution for ρ.^{[citation needed]}
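A minimal sketch of this asymptotic CD follows, using the identity atanh(r) = ½ ln((1 + r)/(1 − r)). The sample correlation r and sample size n are made-up values, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def h_fisher_z(rho, r, n):
    """Asymptotic CD for rho: 1 - Phi(sqrt(n - 3) * (z(r) - z(rho)))."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 - STD_NORMAL.cdf(sqrt(n - 3) * (atanh(r) - atanh(rho)))

r, n = 0.6, 50                                   # hypothetical sample summary
rhos = [-0.9 + 0.1 * k for k in range(19)]       # grid from -0.9 to 0.9
cd_values = [h_fisher_z(rho, r, n) for rho in rhos]
print(h_fisher_z(r, r, n))                       # the aCD puts half its mass below r
```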

Using CD to make inference

Confidence interval

From the CD definition, it is evident that the intervals $(-\infty, H_n^{-1}(1-\alpha)]$, $[H_n^{-1}(\alpha), \infty)$ and $[H_n^{-1}(\alpha/2), H_n^{-1}(1-\alpha/2)]$ provide 100(1 − α)%-level confidence intervals of different kinds for θ, for any α ∈ (0, 1). Also, $[H_n^{-1}(\alpha_1), H_n^{-1}(1-\alpha_2)]$ is a level 100(1 − α₁ − α₂)% confidence interval for the parameter θ for any α₁ > 0, α₂ > 0 and α₁ + α₂ < 1. Here, $H_n^{-1}(\beta)$ is the 100β% quantile of $H_n(\theta)$, that is, it solves for θ in the equation $H_n(\theta) = \beta$. The same holds for an aCD, where the confidence level is achieved in the limit.
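The quantile construction can be sketched for the normal mean with known σ, where the CD is Φ(√n(μ − X̄)/σ). The code below solves H_n(θ) = β by bisection and compares the resulting two-sided interval with the familiar X̄ ± z·σ/√n interval. The data, σ, the bracketing range and the helper name `cd_quantile` are assumptions made for the illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def h_phi(mu, sample, sigma):
    """CD for a normal mean with known sigma."""
    n = len(sample)
    return STD_NORMAL.cdf(sqrt(n) * (mu - fmean(sample)) / sigma)

def cd_quantile(cd, beta, lo, hi, tol=1e-10):
    """Solve cd(theta) = beta by bisection; cd must be increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cd(mid) < beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 5.2]  # hypothetical data
sigma, alpha = 0.6, 0.05                            # assumed known sigma, 95% level
n, xbar = len(sample), fmean(sample)

def cd(mu):
    return h_phi(mu, sample, sigma)

lo, hi = xbar - 10 * sigma, xbar + 10 * sigma       # generous bracket for the roots
two_sided = (cd_quantile(cd, alpha / 2, lo, hi),
             cd_quantile(cd, 1 - alpha / 2, lo, hi))
upper_limit = cd_quantile(cd, 1 - alpha, lo, hi)    # interval (-inf, upper_limit]
lower_limit = cd_quantile(cd, alpha, lo, hi)        # interval [lower_limit, inf)

# Textbook z-interval for comparison.
z = STD_NORMAL.inv_cdf(1 - alpha / 2)
classic = (xbar - z * sigma / sqrt(n), xbar + z * sigma / sqrt(n))
print(two_sided, classic)
```

Bisection is used here only because it works for any continuous, increasing CD; when a closed-form quantile is available, as in this normal case, it can be used directly.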

Point estimation

Point estimators can also be constructed given a confidence distribution estimator for the parameter of interest. For example, given H_n(θ) the CD for a parameter θ, natural choices of point estimators include the median M_n = H_n⁻¹(1/2), the mean $\bar{\theta}_n=\int_{-\infty}^\infty t \, dH_n(t)$ , and the maximum point of the CD density

$\widehat{\theta}_n=\arg\max_\theta h_n(\theta), \quad h_n(\theta)=H'_n(\theta).$

Under some modest conditions, among other properties, one can prove that these point estimators are all consistent.^[5]^[17]
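The three point estimators can be computed numerically from any CD. The sketch below does so for the Fisher z asymptotic CD of a correlation coefficient, with made-up values r = 0.6 and n = 50; it finds the median by bisection and approximates the mean and the mode on a grid of cells. The grid size, the truncation of (−1, 1) to (−0.999, 0.999) and all function names are assumptions of the sketch.

```python
from math import atanh, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
R, N = 0.6, 50                     # hypothetical sample correlation and sample size

def h_n(rho):
    """Fisher z asymptotic CD for a correlation coefficient."""
    return 1.0 - STD_NORMAL.cdf(sqrt(N - 3) * (atanh(R) - atanh(rho)))

def cd_median(cd, lo, hi, tol=1e-10):
    """Median M_n = H_n^{-1}(1/2), found by bisection on an increasing CD."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cd(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cd_mean_and_mode(cd, lo, hi, cells=4000):
    """Approximate the CD mean (integral of t dH(t)) and the CD density mode."""
    width = (hi - lo) / cells
    mean, best_mass, mode = 0.0, -1.0, lo
    prev = cd(lo)
    for k in range(cells):
        right = lo + (k + 1) * width
        cur = cd(right)
        mass = cur - prev                  # CD mass falling in this cell
        midpoint = right - 0.5 * width
        mean += midpoint * mass
        if mass > best_mass:               # cell with the highest average density
            best_mass, mode = mass, midpoint
        prev = cur
    return mean, mode

rho_median = cd_median(h_n, -0.999, 0.999)
rho_mean, rho_mode = cd_mean_and_mode(h_n, -0.999, 0.999)
print(rho_median, rho_mean, rho_mode)
```

In this skewed example the three estimators differ slightly, whereas for a symmetric CD such as the normal-mean case they all coincide with the sample mean.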

Hypothesis testing

One can derive a p-value for a test, either one-sided or two-sided, concerning the parameter θ, from its confidence distribution H_n(θ).^[5]^[17] Denote the probability mass of a set C under the confidence distribution function by

$p_s(C) = H_n(C) = \int_C dH_n(\theta).$

This p_s(C) is called "support" in the CD inference and also known as "belief" in the fiducial literature.^[18] We have

(1) For the one-sided test K₀: θ ∈ C vs. K₁: θ ∈ C^c, where C is of the type (−∞, b] or [b, ∞), one can show from the CD definition that $\sup_{\theta \in C} P_\theta(p_s(C) \le \alpha) = \alpha$. Thus, p_s(C) = H_n(C) is the corresponding p-value of the test.

(2) For the singleton test K₀: θ = b vs. K₁: θ ≠ b, one can show from the CD definition that $P_{\{K_0:\, \theta = b\}}(2 \min\{p_s(C_{lo}), p_s(C_{up})\} \le \alpha) = \alpha$. Thus, 2 min{p_s(C_lo), p_s(C_up)} = 2 min{H_n(b), 1 − H_n(b)} is the corresponding p-value of the test. Here, C_lo = (−∞, b] and C_up = [b, ∞).
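Both p-value rules can be sketched for the normal mean with known σ, where the CD is Φ(√n(μ − X̄)/σ). The data, σ, the null value b and the function names below are assumptions made for the illustration. For this CD the two-sided rule reproduces the usual two-sided z-test p-value, which the sketch computes for comparison.

```python
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def h_phi(mu, sample, sigma):
    """CD for a normal mean with known sigma."""
    n = len(sample)
    return STD_NORMAL.cdf(sqrt(n) * (mu - fmean(sample)) / sigma)

def one_sided_p_value(cd, b, kind):
    """Support p_s(C) of C = (-inf, b] ('lower') or C = [b, inf) ('upper')."""
    if kind == "lower":
        return cd(b)
    if kind == "upper":
        return 1.0 - cd(b)       # a continuous CD puts no mass on the point b
    raise ValueError("kind must be 'lower' or 'upper'")

def two_sided_p_value(cd, b):
    """p-value 2 * min{H_n(b), 1 - H_n(b)} for K0: theta = b."""
    return 2.0 * min(cd(b), 1.0 - cd(b))

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 5.2]  # hypothetical data
sigma, b = 0.6, 4.5                                 # assumed known sigma, null value

def cd(mu):
    return h_phi(mu, sample, sigma)

p_lower = one_sided_p_value(cd, b, "lower")         # K0: theta <= b
p_upper = one_sided_p_value(cd, b, "upper")         # K0: theta >= b
p_two = two_sided_p_value(cd, b)                    # K0: theta = b

# Classical two-sided z-test p-value for comparison.
z = sqrt(len(sample)) * (fmean(sample) - b) / sigma
p_classic = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
print(p_lower, p_upper, p_two, p_classic)
```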

See Figure 1 from Xie and Singh (2011)^[5] for a graphical illustration of the CD inference.

References

[1] Fisher, R.A. (1930). "Inverse probability." Proc. Cambridge Philos. Soc. 26, 528–535.
[2] Cox, D.R. (2006). Principles of Statistical Inference, CUP. ISBN 0-521-68567-2. (page 66)
[3] Efron, B. (1998). "R.A. Fisher in the 21st Century." Statistical Science. 13 95–122.
[4] Fraser, D.A.S. (1991). "Statistical inference: Likelihood to significance." J. Amer. Statist. Assoc., 86, 258–265.
[5] Xie, M. and Singh, K. (2011). "On Confidence Distribution, the Frequentist Distribution Estimator of a Parameter." Draft review article. (invited)
[6] Neyman, J. (1937). "Outline of a theory of statistical estimation based on the classical theory of probability." Phil. Trans. Roy. Soc. A237 333–380.
[7] Fraser, D.A.S. (2011). "Is Bayes posterior just quick and dirty confidence?" Statistical Science. In press.
[8] Bayes, T. (1763). "An Essay towards solving a Problem in the Doctrine of Chances." Phil. Trans. Roy. Soc., London 53 370–418 54 296–325. Reprinted in Biometrika 45 (1958) 293–315.
[9] Schweder, T. and Hjort, N.L. (2002). "Confidence and likelihood", Scandinavian Journal of Statistics. 29 309–332. doi:10.1111/1467-9469.00285
[10] Zabell, S.L. (1992). "R.A. Fisher and fiducial argument", Stat. Sci., 7, 369–387.
[11] Singh, K. and Xie, M. (2011). "Discussion on Professor Fraser's article on 'Is Bayes posterior just quick and dirty confidence?'" Statistical Science. In press.
[12] Efron, B. (1993). "Bayes and likelihood calculations from confidence intervals." Biometrika, 80 3–26.
[13] Cox, D.R. (2006). Principles of Statistical Inference, Cambridge University Press.
[14] Singh, K., Xie, M. and Strawderman, W.E. (2001). "Confidence distributions–concept, theory and applications". Technical report, Dept. Statistics, Rutgers Univ. Revised 2004.
[15] Singh, K., Xie, M. and Strawderman, W.E. (2005). "Combining Information from Independent Sources Through Confidence Distribution" Ann. Statist., 33, 159–183.
[16] Xie, M., Liu, R.Y., Damaraju, C.V., and Olson, W.H. (2009). "Incorporating expert opinions with information from binomial clinical trials", Technical report, Dept. Statistics, Rutgers Univ. Submitted for publication.
[17] Singh, K., Xie, M. and Strawderman, W.E. (2007). "Confidence Distribution (CD)-Distribution Estimator of a Parameter", in Complex Datasets and Inverse Problems, IMS Lecture Notes–Monograph Series, 54 (R. Liu, et al. Eds) 132–150.
[18] Kendall, M., & Stuart, A. (1974). The Advanced Theory of Statistics, Volume ?. (Chapter 21). Wiley.
[19] Fisher, R.A. (1973). Statistical Methods and Scientific Inference, 3rd edition. Hafner Press, New York.
[20] Neyman, J. (1941). "Fiducial argument and the theory of confidence intervals." Biometrika. 32 128–150.
[21] Parzen, E. (2005). "All Statistical Methods, Parameter Confidence Quantiles". Noether Award Lecture at the Joint Statistical Meeting.
[22] Schweder, T. and Hjort, N.L. (2003). "Frequentist analogues of priors and posteriors." In B.P. Stigum, Econometrics and the philosophy of economics. Princeton University Press 285–317.
[23] Schweder, T. and Hjort, N.L. (2009). Confidence, Likelihood and Probability. Cambridge University Press. (forthcoming)
[24] Xie, M., Singh, K. and Strawderman, W.E. (2011). "Confidence distributions and a unified framework for meta-analysis". J. Amer. Statist. Assoc., 106, 320–333.

Bibliography

Fisher, R A (1956). Statistical Methods and Scientific Inference. New York: Hafner. ISBN 0028447409.
Fisher, R. A. (1955). "Statistical methods and scientific induction" J. Roy. Statist. Soc. Ser. B. 17, 69–78. (criticism of statistical theories of Jerzy Neyman and Abraham Wald from a fiducial perspective)
Hannig, J. (2009). "On generalized fiducial inference". Statistica Sinica, 19, 491–544.
Lawless, F. and Fredette, M. (2005). "Frequentist prediction intervals and predictive distributions." Biometrika. 92(3) 529–542.
Lehmann, E.L. (1993). "The Fisher, Neyman–Pearson theories of testing hypotheses: one theory or two?" J. Amer. Statist. Assoc. 88 1242–1249.
Neyman, Jerzy (1956). "Note on an Article by Sir Ronald Fisher". Journal of the Royal Statistical Society. Series B (Methodological) 18 (2): 288–294. JSTOR 2983716. (reply to Fisher 1955, which diagnoses a fallacy of "fiducial inference")
Schweder T., Sadykova D., Rugh D. and Koski W. (2010) "Population Estimates From Aerial Photographic Surveys of Naturally and Variably Marked Bowhead Whales" Journal of Agricultural Biological and Environmental Statistics 2010 15: 1–19
Bityukov S., Krasnikov N., Nadarajah S. and Smirnova V. (2010) "Confidence distributions in statistical inference". AIP Conference Proceedings, 1305, 446-456.
Singh, K. and Xie, M. (2011). "CD-posterior – combining prior and data through confidence distributions". A Festschrift in Honor of William E Strawderman. IMS-LNS Monograph Series. (D. Fourdrinier, et al. Eds.). In press.

Wikimedia Foundation. 2010.
