variance

In machine learning, variance has two related meanings: the spread of the values in a dataset (for example, the independent variables x), or the degree to which an ML model's estimates change when it is trained on different datasets. ML models with high bias and low variance tend to underfit. ML models with low bias and high variance tend to overfit. ML models with relatively low or moderate bias and variance are closer to the best fit (also called a good fit or the sweet spot). The balance between bias and variance therefore corresponds to the balance between underfitting and overfitting.
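Both meanings of variance can be illustrated with a small NumPy sketch. The data, the sine-shaped target, the noise level, and the helper name mean_prediction_variance are assumptions chosen for illustration; they do not come from any particular library or dataset. The sketch compares a rigid model (a straight line) with a flexible one (a degree-9 polynomial) by refitting each on many noisy samples of the same underlying curve.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Meaning 1: spread of the values of a single feature.
feature = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
feature_variance = feature.var()  # population variance (ddof=0)

# Meaning 2: how much a model's predictions change across datasets.
def mean_prediction_variance(degree, n_datasets=200, noise=0.3):
    """Fit a polynomial of the given degree on many noisy datasets and
    return the variance of its predictions, averaged over the x grid."""
    x = np.linspace(0.0, 1.0, 20)
    preds = np.empty((n_datasets, x.size))
    for i in range(n_datasets):
        # Each dataset is a fresh noisy sample of the same sine curve.
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise, x.size)
        preds[i] = Polynomial.fit(x, y, degree)(x)
    return preds.var(axis=0).mean()

rigid_model_variance = mean_prediction_variance(degree=1)     # high bias
flexible_model_variance = mean_prediction_variance(degree=9)  # low bias

print("feature variance:", feature_variance)
print("rigid model variance:", rigid_model_variance)
print("flexible model variance:", flexible_model_variance)
```

In this setup the flexible model's predictions should vary more from one dataset to the next than the straight line's, which is the behavior associated with overfitting.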

Various ML methods aim to reduce overfitting by simplifying an ML model while retaining most of the informative ("good") variance in the model's features. One such method is principal component analysis (PCA), an unsupervised dimensionality reduction algorithm.
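As a rough illustration of how PCA keeps most of the variance while dropping features, the following is a minimal NumPy-only sketch that performs PCA through a singular value decomposition. The synthetic three-feature dataset and the choice of keeping one component (k = 1) are assumptions made for the example; in practice a library implementation would typically be used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: feature 2 is almost a multiple of feature 1,
# and feature 3 is low-amplitude noise.
base = rng.normal(0.0, 1.0, 200)
X = np.column_stack([
    base,
    2.0 * base + rng.normal(0.0, 0.1, 200),
    rng.normal(0.0, 0.1, 200),
])

# PCA via SVD of the mean-centered data matrix.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Variance captured by each principal component, and its share of the total.
explained_variance = S**2 / (X.shape[0] - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Keep only the first k components (k is an illustrative choice).
k = 1
X_reduced = X_centered @ Vt[:k].T

print("explained variance ratio:", explained_ratio)
print("reduced shape:", X_reduced.shape)
```

Because two of the three features here are strongly correlated, a single component should account for nearly all of the variance, so the reduced dataset has one column instead of three.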
