linear regression

Linear regression is a statistical modeling technique. It models the relationship between a dependent variable, which is a number (usually continuous), and one independent variable (simple linear regression) or several (multiple linear regression). Linear regression models are simple to train, and in many cases they are the first type of machine learning model to try when working with a new data set. ...
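
As an illustration of the single-variable case described above, here is a minimal sketch of an ordinary least squares fit in plain Python. The function name `fit_line` and the sample data are assumptions made for this example; they do not come from the glossary entry.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

In practice a library routine would normally be used instead; the closed-form version is shown only to make the definition concrete.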

logit-transformation

Logit-transformation (also known as the log-odds) is a logarithmic transformation which is used to transform proportional values into continuous values in ML regression problems. The logit-transformation is defined by the following mathematical formula: logit(p)=log(p/(1-p)).
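
The formula above can be sketched in Python as follows. The helper `inverse_logit` (the logistic function) is an addition for illustration and is not part of the glossary entry; the natural logarithm is assumed.

```python
import math


def logit(p):
    """Return log(p / (1 - p)) for a proportion p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a real number back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


print(logit(0.5))                 # a proportion of 0.5 corresponds to even odds
print(inverse_logit(logit(0.8)))  # round trip back to approximately 0.8
```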

loocv

In machine learning, LOOCV stands for leave-one-out cross-validation. Assuming that the dataset of an ML project comprises n examples (rows), we split the dataset into n subsets and perform n iterations of training/testing. In each iteration, n-1 examples are used as training data and only 1 example is used as the testing subset. This process ...
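
The splitting scheme described above can be sketched as a small generator. The name `loocv_splits` is hypothetical, and the sketch only produces row indices; training and scoring a model on each split is left out.

```python
def loocv_splits(n):
    """Yield (train_indices, test_indices) for leave-one-out cross-validation."""
    for i in range(n):
        train = [j for j in range(n) if j != i]  # n-1 examples for training
        test = [i]                               # exactly 1 example for testing
        yield train, test


splits = list(loocv_splits(4))
for train, test in splits:
    print(train, test)
```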

LSTM

LSTM stands for Long Short-Term Memory. LSTM is a type of recurrent neural network (RNN) built from memory cells whose gates decide which pieces of historical information (states) to preserve and which to drop.

Machine learning

Machine learning (ML) is a field of study that involves the development of algorithms and statistical models which enable computing systems to develop skills in specific areas by using data-based training and evaluation. This commonly leads to Artificial Intelligence (AI) cognitive services, such as computer vision and image recognition.

Machine learning model

A model, from a more general perspective, is the mathematical representation of a real-life system, e.g. a simulator or emulator of an airplane uses airplane flight navigation models and turbojet engine models. A machine learning (ML) model is the mathematical representation that results from training an algorithm on data.

MAE

MAE in statistics and Machine Learning (ML) stands for Mean Absolute Error. MAE is the average of the absolute differences between the actual and predicted values in a dataset. In other words, MAE is the average of the absolute values of the residuals. MAE is expressed by the following mathematical formula. ...
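
A minimal sketch of the calculation, using the standard definition of MAE as the mean of the absolute residuals. The function name and the sample values are assumptions made for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all examples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: the absolute residuals are 1, 0, 2 and 1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
mae = mean_absolute_error(actual, predicted)
print(mae)
```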

Matplotlib

Matplotlib is a Python library used in applications which need to create and process visualizations. Visualizations can be static, animated or interactive. The major visualization types which can be managed by Matplotlib are: pairwise data, statistical distributions, gridded data, and 3D and volumetric data. The official Matplotlib documentation can be found at: ...

MSE

MSE in Machine Learning (ML) stands for Mean Squared Error and is an error calculation formula. MSE calculates the average of the squared differences between the original and predicted values in a dataset. It is similar to MAE, in that MSE is a calculation for the variance of residuals, ...
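
A minimal sketch of the calculation, using the standard definition of MSE as the mean of the squared residuals. The function name and the sample values are assumptions made for this example.

```python
def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all examples."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values: the squared residuals are 1, 0, 4 and 1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
mse = mean_squared_error(actual, predicted)
print(mse)
```

Because the residuals are squared, a single large error raises MSE more than it would raise MAE.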

multivariate

The term multivariate generally refers to a dataset in which multiple variables are being studied. In time series forecasting problems, multivariate refers to multiple variables being measured in a given time series. A commonly used multivariate time series model is vector autoregression (VAR).

Neural network

An artificial neural network (also known as a neural network) is a computational model which is employed by machine learning algorithms and is based on connected hierarchical functions. Its design is loosely inspired by the neurons of the human brain.

NLP

NLP stands for natural language processing. A related term is natural language understanding (NLU). Various artificial neural networks (ANN) are used to process natural language, including recurrent neural networks (RNN) and large language models (LLM).

NLTK

NLTK is a natural language processing platform which can be utilized in Python applications to allow them to process human language data. Details about the NLTK platform can be found on its official website at: https://www.nltk.org/. An alternative toolkit for text tokenization (into sentences and words), for identifying parts of speech and stop words and ...

NLU

NLU stands for natural language understanding. See NLP for details.

normalization

In machine learning, normalization is a statistical technique by which the data in a dataset are rescaled to a common scale, for example to zero mean and unit variance (standardization), to the value range [0, 1], or to the value range [-1, 1]. For each value x in the dataset, its corresponding normalized value x' is calculated in the ...
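
Two common rescaling schemes can be sketched as follows. These are standard textbook choices (min-max scaling and z-score standardization) and are not necessarily the formula the entry goes on to give; the function names and sample values are assumptions made for this example.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(x - mean) / std for x in values]


values = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_scale(values)
z_scores = standardize(values)
print(scaled)
print(z_scores)
```

Both helpers assume the input contains at least two distinct values; otherwise the denominator is zero.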

NPU

NPU stands for Neural Processing Unit. It is another name for an AI accelerator.

NumPy

NumPy is a Python package for scientific computing operations. The NumPy Python library includes a multidimensional array object, a series of derived objects as well as functions which apply processing actions on these objects. More details about NumPy features and documentation can be found at: https://numpy.org/learn/.

overfitting

Overfitting in machine learning is a problem in which the ML model fails to generalize well to unseen data because its predictions match the training data too tightly. An overfitting model has high variance and low bias. Overfitting is the opposite of underfitting. The desired situation between overfitting and underfitting is a good fit ...

Pandas

Pandas is an open source library in Python. Pandas provides functions and tools for managing data structures and performing data analysis. The pandas library is therefore particularly useful in data science, data engineering and machine learning. The official documentation of the pandas functions is available at: https://pandas.pydata.org/pandas-docs/stable/index.html. Pydata community is a member of the NumFocus ...

PCA

In machine learning, PCA stands for Principal Component Analysis. PCA is an unsupervised learning algorithm used for dimensionality reduction. It addresses a known ML problem that arises when a dataset has a large number of features (high dimensionality), often referred to as the curse of dimensionality. The ML feature engineering techniques available are classified into feature ...
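
As a rough illustration of the idea, the sketch below finds only the first principal component, using power iteration on the sample covariance matrix in plain Python. The function name, the starting vector, and the sample data are assumptions made for this example; a production implementation would normally rely on a linear algebra library and return all components.

```python
import math


def first_principal_component(rows, iterations=100):
    """Approximate the direction of greatest variance via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Assumption: the all-ones start vector is not orthogonal to the answer.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along v; stop with the current vector
            break
        v = [x / norm for x in w]
    return v


# Hypothetical 2-feature data lying exactly on the line y = 2x.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_principal_component(rows)
print(pc1)
```

Projecting each centered row onto `pc1` would reduce this two-feature dataset to a single feature while keeping all of its variance.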