Multivariate statistics

Multivariate statistics is a form of statistics encompassing the simultaneous observation and analysis of more than one statistical variable. The application of multivariate statistics is multivariate analysis. Methods of bivariate statistics, for example simple linear regression and correlation, are special cases of multivariate statistics in which two variables are involved.

Multivariate statistics concerns understanding the different aims and background of each of the different forms of multivariate analysis, and how they relate to each other. The practical implementation of multivariate statistics to a particular problem may involve several types of univariate and multivariate analysis in order to understand the relationships between variables and their relevance to the actual problem being studied.

In addition, multivariate statistics is concerned with multivariate probability distributions, in terms of both:

how these can be used to represent the distributions of observed data;
how they can be used as part of statistical inference, particularly where several different quantities are of interest to the same analysis.
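
As a minimal sketch of how observed multivariate data are commonly summarised before a distribution is fitted, the following Python computes the sample mean vector and the sample covariance and correlation matrices. The function names and the layout of the data (one row per case, one column per variable) are illustrative assumptions, not the interface of any particular package. For two variables, the off-diagonal entry of the correlation matrix is the ordinary bivariate (Pearson) correlation.

```python
import math


def mean_vector(rows):
    """Return the sample mean of each column (variable)."""
    n = len(rows)
    p = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(p)]


def covariance_matrix(rows):
    """Return the p x p sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    p = len(rows[0])
    m = mean_vector(rows)
    return [
        [
            sum((row[i] - m[i]) * (row[j] - m[j]) for row in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def correlation_matrix(rows):
    """Rescale the covariance matrix so that every diagonal entry is 1."""
    s = covariance_matrix(rows)
    p = len(s)
    sd = [math.sqrt(s[i][i]) for i in range(p)]
    return [[s[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]


# Hypothetical toy data: four cases measured on three variables.
data = [(1.0, 2.0, 9.0), (2.0, 4.0, 7.0), (3.0, 6.0, 5.0), (4.0, 8.0, 3.0)]
means = mean_vector(data)
corr = correlation_matrix(data)
```

The mean vector and covariance matrix are the usual starting point for most of the methods listed later in the article, which is why they are computed once and reused.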


Types of analysis

There are many different models, each with its own type of analysis:

Multivariate analysis of variance (MANOVA) extends the analysis of variance to cover cases where more than one dependent variable is analyzed simultaneously; see also MANCOVA.
Multivariate regression analysis attempts to determine a formula that can describe how elements in a vector of variables respond simultaneously to changes in others. For linear relations, regression analyses here are based on forms of the general linear model.
Principal components analysis (PCA) creates a new set of orthogonal variables that contain the same information as the original set. It rotates the axes of variation to give a new set of orthogonal axes, ordered so that they summarize decreasing proportions of the variation.
Factor analysis is similar to PCA but allows the user to extract a specified number of synthetic variables, fewer than the original set, leaving the remaining unexplained variation as error. The extracted variables are known as latent variables or factors; each one may be supposed to account for covariation in a group of observed variables.
Canonical correlation analysis finds linear relationships between two sets of variables; it is the generalised (i.e. canonical) version of bivariate correlation.
Redundancy analysis is similar to canonical correlation analysis but allows the user to derive a specified number of synthetic variables from one set of (independent) variables that explain as much variance as possible in another (dependent) set. It is a multivariate analogue of regression.
Correspondence analysis (CA), or reciprocal averaging, finds (like PCA) a set of synthetic variables that summarise the original set. The underlying model assumes chi-squared dissimilarities among records (cases). There is also canonical (or "constrained") correspondence analysis (CCA) for summarising the joint variation in two sets of variables (like canonical correlation analysis).
Multidimensional scaling comprises various algorithms to determine a set of synthetic variables that best represent the pairwise distances between records. The original method is principal coordinates analysis (based on PCA).
Discriminant analysis, or canonical variate analysis, attempts to establish whether a set of variables can be used to distinguish between two or more groups of cases.
Linear discriminant analysis (LDA) computes a linear predictor from two sets of normally distributed data to allow for classification of new observations.
Clustering systems assign objects into groups (called clusters) so that objects (cases) from the same cluster are more similar to each other than objects from different clusters.
Recursive partitioning creates a decision tree that attempts to correctly classify members of the population based on a dichotomous dependent variable.
Artificial neural networks extend regression and clustering methods to non-linear multivariate models.
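
To make the description of principal components analysis in the list above concrete, here is a minimal sketch for the two-variable case, where the eigenvalues and eigenvectors of the covariance matrix have a closed form. The function names are hypothetical and the restriction to two variables is an assumption made to keep the example free of external libraries; in practice a general eigendecomposition or singular value decomposition routine would be used.

```python
import math


def covariance_2d(xs, ys):
    """Return (sxx, sxy, syy), the entries of the 2 x 2 sample covariance matrix."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return sxx, sxy, syy


def pca_2d(xs, ys):
    """Return [(variance, axis), (variance, axis)] ordered by decreasing variance.

    Each axis is a unit vector; the two axes are orthogonal, so the result is a
    rotation of the original coordinate system.
    """
    sxx, sxy, syy = covariance_2d(xs, ys)
    # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    centre = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    lam1, lam2 = centre + radius, centre - radius
    # Rotation angle of the first principal axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))
    return [(lam1, axis1), (lam2, axis2)]


# Hypothetical toy data lying exactly on the line y = 2x.
components = pca_2d([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

The two variances always sum to the total variance of the original variables, which is the sense in which the rotated axes "contain the same information" while concentrating it in the leading components.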

Important probability distributions

There is a set of probability distributions used in multivariate analyses that play a similar role to the corresponding set of distributions used in univariate analysis when the normal distribution is appropriate to a dataset. These multivariate distributions are:

Multivariate normal distribution
Wishart distribution
Multivariate Student-t distribution

The inverse-Wishart distribution is important in Bayesian inference, for example in Bayesian multivariate linear regression. Additionally, Hotelling's T-squared distribution is a univariate distribution, generalising Student's t-distribution, that is used in multivariate hypothesis testing.
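
As an illustration of how Hotelling's T-squared arises in hypothesis testing, the sketch below computes the one-sample statistic T² = n (x̄ − μ0)ᵀ S⁻¹ (x̄ − μ0) for two variables, together with the rescaled value (n − p) T² / (p (n − 1)), which follows an F distribution with p and n − p degrees of freedom when the data are multivariate normal. The function name and the restriction to p = 2 are assumptions made so that the covariance matrix can be inverted by hand; the example computes only the statistics and not a p-value.

```python
def hotelling_t2_2d(sample, mu0):
    """Return (t2, f_stat) for a one-sample test of the mean vector mu0.

    sample is a list of (x, y) pairs; mu0 is the hypothesised mean (x0, y0).
    """
    n = len(sample)
    p = 2
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in sample) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in sample) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")

    dx, dy = mx - mu0[0], my - mu0[1]
    # Quadratic form d' S^-1 d, using the closed-form inverse of a 2 x 2 matrix.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    t2 = n * quad
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat


# Hypothetical toy data: four bivariate observations tested against the origin.
t2, f_stat = hotelling_t2_2d([(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)], (0.0, 0.0))
```

With a single variable the same formula reduces to the square of Student's t statistic, which is the sense in which the distribution generalises the t-distribution.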

Software & Tools

Many software packages and other tools are available for multivariate analysis, including:

Calc
Minitab
R
SAS
SciPy for Python
SPSS
Stata
STATISTICA
TMVA (Toolkit for Multivariate Data Analysis in ROOT)
The Unscrambler



Wikimedia Foundation. 2010.
