NumPy

NumPy is a Python package for scientific computing. The library provides a multidimensional array object, a set of derived objects, and functions that operate on these objects. More details about NumPy features and documentation can be found at: https://numpy.org/learn/.
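
As a small illustration of the array object and the functions that operate on it, here is a minimal sketch. The array values and variable names are made up for the example.

```python
import numpy as np

# A 2-D array (2 rows, 3 columns) built from nested lists.
a = np.array([[1, 2, 3], [4, 5, 6]])

shape = a.shape            # dimensions of the array
col_sums = a.sum(axis=0)   # sum down each column
doubled = a * 2            # element-wise arithmetic, no explicit loop
transposed = a.T           # transposed view of the same data

print(shape, col_sums.tolist(), transposed.shape)
```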

Pandas

Pandas is an open source Python library. It provides functions and tools for managing data structures and analyzing data, which makes it particularly useful in data science, data engineering and machine learning. The official documentation of the pandas functions is available at: https://pandas.pydata.org/pandas-docs/stable/index.html. The PyData community is a member of the NumFOCUS ...
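
A minimal sketch of the kind of data management and analysis pandas supports. The column names and values below are invented for the example and do not come from any real dataset.

```python
import pandas as pd

# A small table with one categorical and one numeric column (illustrative data).
df = pd.DataFrame({
    "city": ["Athens", "Athens", "Patras", "Patras"],
    "temp_c": [30.0, 32.0, 27.0, 29.0],
})

# Aggregate: mean temperature per city.
mean_by_city = df.groupby("city")["temp_c"].mean()

# Filter: keep only the rows above a threshold.
hot_days = df[df["temp_c"] > 29.0]

print(mean_by_city)
print(hot_days)
```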

PCA

In machine learning, PCA stands for Principal Component Analysis. PCA is used to tackle a known ML problem that arises when a dataset has a large number of features (high dimensionality), known as the curse of dimensionality. The available ML feature engineering techniques are classified into feature selection and feature extraction techniques. Dimensionality reduction ...
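
One common way to compute PCA is through the singular value decomposition of the mean-centered data. The sketch below assumes that approach. The function name `pca` and the synthetic data are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its first n_components principal components (SVD-based sketch)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    projected = X_centered @ components.T
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    return projected, components, explained_variance[:n_components]

# Synthetic data: three features, but almost all variation lies along one direction.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.01 * rng.normal(size=200),
    0.01 * rng.normal(size=200),
])

Z, components, var = pca(X, n_components=1)
ratio = var[0] / X.var(axis=0, ddof=1).sum()
print(Z.shape, ratio)
```

Here three features are reduced to one while retaining nearly all of the variance, which is the sense in which PCA acts as a feature extraction technique.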

Python

Python is a general-purpose programming language that supports imperative, object-oriented and functional styles. Python is versatile and flexible and can accommodate problem solving algorithms in various knowledge domains. One scientific area in which Python excels is Machine Learning and, consequently, Artificial Intelligence. Python code, including libraries of classes and functions, is organized into packages and modules. A ...

regularization

In machine learning, regularization is a method by which the ML model cost/error function is changed to include an extra penalty term, weighted by the regularization hyperparameter. There are two basic types of regularization: L1 (lasso regression) and L2 (ridge regression). Lasso regularization penalizes the L1 norm of the model parameters. The lasso regularized cost function is calculated as ...
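
The sketch below shows one common textbook form of the two penalties, assuming a mean squared error base cost plus a penalty scaled by a hyperparameter named `lam` here. The exact formula used in the full entry is not shown in this excerpt, so the function names and scaling are assumptions.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def lasso_cost(y_true, y_pred, weights, lam):
    # L1 penalty: lam times the sum of absolute weight values.
    return mse(y_true, y_pred) + lam * sum(abs(w) for w in weights)

def ridge_cost(y_true, y_pred, weights, lam):
    # L2 penalty: lam times the sum of squared weight values.
    return mse(y_true, y_pred) + lam * sum(w * w for w in weights)

# Illustrative numbers only.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
weights = [0.5, -2.0]
lam = 0.1

print(lasso_cost(y_true, y_pred, weights, lam))
print(ridge_cost(y_true, y_pred, weights, lam))
```

Setting `lam` to zero recovers the unregularized cost, and larger values penalize large weights more heavily.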

RMSE

RMSE stands for Root Mean Squared Error. It is simply the square root of the MSE statistical metric. The RMSE is in the same units as the observed values, as is the MAE metric. RMSE is an estimate of the standard deviation of the residuals. By comparison, MSE is a calculation of the variance of ...
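
A minimal sketch of how MSE, RMSE and MAE relate, using only the Python standard library. The observed and predicted values are made up for the example.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, in the same units as the observed values."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```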

Standard deviation

The standard deviation (σ) is a statistical measure of how dispersed the data is in relation to the mean. It is frequently used in machine learning data preparation and in machine learning model training.
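
As a sketch, the population standard deviation can be computed by hand as the square root of the mean squared deviation from the mean. The sample values are illustrative. The standard library `statistics` module provides `pstdev` (population) and `stdev` (sample, dividing by n-1) for the same purpose.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.fmean(data)

# Population standard deviation: divide the squared deviations by n.
population_std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

# Sample standard deviation: divides by n - 1 instead of n.
sample_std = statistics.stdev(data)

print(mean, population_std, sample_std)
```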

standardization

In machine learning, standardization is a feature engineering technique by which the dataset features are re-scaled to achieve zero-mean value (μ=0) and unit standard deviation value (σ=1). Each x value in the dataset gets a corresponding x' standardized value, which is calculated as x' = (x - μ) / σ, where μ is the x variable mean and σ is ...
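
A minimal sketch of standardizing a single feature, assuming the usual z-score form x' = (x - μ) / σ with the population standard deviation. The function name `standardize` and the sample values are illustrative.

```python
import statistics

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature (sigma is 0)")
    return [(x - mu) / sigma for x in values]

x = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(x)

print(z)
print(statistics.fmean(z), statistics.pstdev(z))
```

In a real pipeline, μ and σ would normally be computed on the training data only and then reused to transform the test data.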

stratified cross validation

Stratified cross validation is a model validation technique in which the ML dataset is split into k subsets (folds), of which k-1 are used as training folds and one is used as the test fold. This process is repeated k times, so that each fold serves once as the test fold. Stratified cross validation uses stratified sampling in the dataset, in order to ...
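
The sketch below illustrates the idea with a small hand-written splitter that deals the indices of each class round-robin into k folds, so every fold keeps roughly the same class proportions. The function name `stratified_k_fold` is hypothetical, and no shuffling is done. It is not how any particular library implements the technique.

```python
from collections import Counter, defaultdict

def stratified_k_fold(labels, k):
    """Yield (train_indices, test_indices) pairs with class-balanced test folds."""
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's indices round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test fold; the other k-1 folds form the training set.
    for test_fold in range(k):
        test_idx = sorted(folds[test_fold])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != test_fold for i in fold
        )
        yield train_idx, test_idx

# Imbalanced toy labels: six samples of class "a", three of class "b".
labels = ["a"] * 6 + ["b"] * 3
splits = list(stratified_k_fold(labels, k=3))

for train_idx, test_idx in splits:
    print(test_idx, Counter(labels[i] for i in test_idx))
```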

Transformer machine learning model

A transformer is a deep learning model. Transformer models are mainly used for natural language processing (NLP) and computer vision (CV). Transformers are the evolution of RNN models. Recent examples of transformer-type models in artificial intelligence (AI) are the dp-transformer models developed by Microsoft Research. More details can be found ...

variance

In machine learning (ML), variance is a concept related to errors in the model's predictions that result from the algorithm's over-sensitivity to, and high correlation with, the training data. Due to this over-sensitivity, the ML model becomes complex to explain (explainability) and it captures the complexity inside the training data ...