In probability theory and statistics, the variance is a way to measure how far a set of numbers is spread out.
Variance describes how much a random variable differs from its expected value. The variance is defined as the average of the squares of the differences between the individual (observed) values and the expected value. This means that it is never negative. The variance is often represented by the symbol σ² if the data is the entire population, and s² if the data is from a sample.[1][2][3]
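The definition can be tried out on a small list of numbers. The sketch below uses made-up values and hypothetical variable names. It assumes the usual convention that the population variance divides by the number of values n, while the sample variance divides by n - 1; the text above does not spell this out.

```python
import math
import statistics

# Made-up illustrative values, not real measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean (used here in place of the expected value).
mean = sum(data) / len(data)

# Step 2: the squared difference of every value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: average the squared differences.
# Population variance divides by n; sample variance divides by n - 1.
population_variance = sum(squared_diffs) / len(data)
sample_variance = sum(squared_diffs) / (len(data) - 1)

print("population variance:", population_variance)
print("sample variance:", sample_variance)

# Python's standard library gives the same answers.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```

Because every difference is squared before averaging, no term in the sum can be below zero, so the result cannot be negative.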
In practice, variance is a measure of how much something changes. For example, temperature has more variance in Moscow than in Hawaii.
The variance is not simply the average difference from the expected value. The standard deviation, which is the square root of the variance and comes closer to the average difference, is also not simply the average difference. Variance and standard deviation are used because they make the mathematics easier when adding two random variables together. For example, the variance of the sum of two independent random variables is the sum of their variances.
In accountancy, a variance refers to the difference between the budget for a cost and the actual cost.
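One way to see both points is to work through two fair six-sided dice, where every outcome is equally likely and everything can be counted exactly. This is only an illustrative sketch: the dice example and the helper names `pvar` and `mean_abs_diff` are assumptions made here, not part of the text above.

```python
from itertools import product
from math import isclose, sqrt


def pvar(values):
    """Population variance: the average squared difference from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def mean_abs_diff(values):
    """Average absolute difference from the mean."""
    m = sum(values) / len(values)
    return sum(abs(v - m) for v in values) / len(values)


die = [1, 2, 3, 4, 5, 6]
# All 36 equally likely totals of two independent dice.
totals = [a + b for a, b in product(die, die)]

# The standard deviation is close to, but not equal to,
# the average absolute difference.
print("average absolute difference:", mean_abs_diff(die))
print("standard deviation:", sqrt(pvar(die)))

# Variances add up when independent variables are added together.
assert isclose(pvar(totals), pvar(die) + pvar(die))

# Standard deviations and average absolute differences do not.
assert not isclose(sqrt(pvar(totals)), 2 * sqrt(pvar(die)))
assert not isclose(mean_abs_diff(totals), 2 * mean_abs_diff(die))
```

Only the variance of the total equals the two single-die values added together, which is the kind of simplification that makes it convenient to work with.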
History
Ronald Fisher first used the term variance in a 1918 paper, as follows:
"It is here attempted to (show) the biometrical properties of a population of a more general type than has (..) been examined, inheritance in which follows this scheme. It is hoped that in this way it will be possible to make a more exact analysis of the causes of human variability. The great body of available statistics shows us that the deviations of a human measurement from its mean follow very closely the Normal Law of Errors, and that therefore, the variability may be uniformly measured by the standard deviation, corresponding to the square root of the mean square error."[4]