standardization

In machine learning, standardization is a feature engineering technique by which the dataset features are re-scaled to achieve zero-mean value equal to zero (μ=0) and unit standard deviation value equal to one (σ=1). Each x value in the dataset gets a corresponding x' standardized value, which is calculated as follows.

standardization, where μ is the x variable mean and σ is the standard deviation.

The standardized value is also known as Z-Score.

It is important to differentiate between standardization, normalization and regularization. Standardization and normalization are data preparation (feature engineering) methods, while regularization is used to improve the performance of ML models, by adjusting the cost function to eliminate the ML model error by using the regularization hyperparameter. Standardization and normalization are very similar techniques, in that they both change the scale of data to better accommodate for an ML algorithm operations.

It must be noted that standardization is mostly efficient and must be used, when the x variable data in your dataset follow a normal (Gaussian) distribution.

Related Cloud terms