dataset - Stefanos Cloud

DR

1) In machine learning, DR stands for dimensionality reduction, a feature engineering technique in which a large number of features in a dataset is reduced to a smaller number of features. It is important to ensure that the remaining features are meaningful and representative of the dataset and that ...
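As a rough illustration of reducing a feature set, the sketch below drops features whose variance does not exceed a threshold. This is only one simple form of dimensionality reduction (feature selection), chosen here for brevity; the function name drop_low_variance, the min_variance parameter, and the sample rows are all hypothetical.

```python
from statistics import pvariance


def drop_low_variance(rows, min_variance=0.0):
    """Keep only the feature columns whose population variance exceeds min_variance.

    Hypothetical helper for illustration; returns the reduced rows and the
    indices of the columns that were kept.
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    keep = [j for j, col in enumerate(columns) if pvariance(col) > min_variance]
    reduced = [[row[j] for j in keep] for row in rows]
    return reduced, keep


# Made-up dataset: the second feature is constant, so it carries no information.
rows = [
    [1.0, 7.0, 0.2],
    [2.0, 7.0, 0.1],
    [3.0, 7.0, 0.4],
    [4.0, 7.0, 0.3],
]

reduced, kept = drop_low_variance(rows)
print(kept)     # [0, 2]
print(reduced)  # each row now has two features instead of three
```

Projection methods such as principal component analysis build new features instead of selecting existing ones; they are not shown here.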

overfitting

Overfitting in machine learning is a problem in which the ML model fails to generalize well to unseen data because its predictions match the training data too closely. An overfitted model has high variance and low bias. Overfitting is the opposite of underfitting. The desired situation between overfitting and underfitting is a good fit ...
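The gap between training and test performance can be seen in a deliberately contrived sketch. It compares a 1-nearest-neighbour model, which memorizes every training point including a mislabeled one, with a fixed threshold rule. The data, the threshold of 5.0, and all function names are assumptions made up for this illustration.

```python
def nearest_neighbour_predict(train_x, train_y, x):
    """Predict the label of the single closest training point (memorizes the data)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def threshold_predict(x, threshold=5.0):
    """A much simpler model: class 1 when x is at or above a hand-picked threshold."""
    return 1 if x >= threshold else 0


def accuracy(predictions, targets):
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


# Made-up training set. The underlying rule is "label 1 when x >= 5",
# but the point x = 3 carries a noisy (wrong) label.
train_x = [1, 2, 3, 4, 6, 7, 8, 9]
train_y = [0, 0, 1, 0, 1, 1, 1, 1]

# Unseen points, labeled with the true rule.
test_x = [2.6, 3.4, 5.5, 0.5]
test_y = [0, 0, 1, 0]

nn_train = accuracy([nearest_neighbour_predict(train_x, train_y, x) for x in train_x], train_y)
nn_test = accuracy([nearest_neighbour_predict(train_x, train_y, x) for x in test_x], test_y)
th_train = accuracy([threshold_predict(x) for x in train_x], train_y)
th_test = accuracy([threshold_predict(x) for x in test_x], test_y)

print(nn_train, nn_test)  # 1.0 0.5   -> perfect on training data, poor on unseen data
print(th_train, th_test)  # 0.875 1.0 -> misses the noisy point, generalizes better
```

The memorizing model reproduces the noisy label near x = 3, which is what high variance looks like in practice.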

percentile

In statistics, a percentile is a term that describes how a score compares to other scores from the same set. While there is no universal definition of percentile, it is commonly expressed as the percentage of values in a data set that fall below a given value, for example the 25th percentile, the ...
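A minimal sketch of the "percentage of values below a given value" definition, using only the Python standard library. The helper percentile_rank and the sample scores are hypothetical. The second part uses statistics.quantiles to show that two common interpolation conventions can give different 25th percentiles for the same data, which is one practical consequence of there being no universal definition.

```python
from statistics import quantiles


def percentile_rank(scores, value):
    """Percentage of scores that fall strictly below value (hypothetical helper)."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)


scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank = percentile_rank(scores, 80)
print(rank)  # 50.0 -> five of the ten scores are below 80

# The first of the three quartile cut points is the 25th percentile.
# The two methods interpolate differently, so the results need not agree.
q25_exclusive = quantiles(scores, n=4, method="exclusive")[0]
q25_inclusive = quantiles(scores, n=4, method="inclusive")[0]
print(q25_exclusive, q25_inclusive)
```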

stratified cross validation

Stratified cross validation is a model validation technique in which the ML dataset is split into k subsets (folds): k-1 folds are used for training and one fold is used for testing. This process is repeated k times, so that each fold serves as the test fold once. Stratified cross validation uses stratified sampling in the dataset, in ...
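The splitting procedure can be sketched in plain Python as follows. The sketch assumes that stratification means each fold keeps roughly the same class proportions as the full dataset; the function stratified_folds and the toy labels are hypothetical, and no shuffling is done, which a real implementation would normally add.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold mirrors the class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy dataset: 8 samples of class "a" and 4 of class "b" (a 2:1 ratio).
labels = ["a"] * 8 + ["b"] * 4
folds = stratified_folds(labels, k=4)

# Each of the k rounds uses one fold for testing and the other k-1 for training.
for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    test_counts = Counter(labels[i] for i in test_fold)
    print(dict(test_counts), "train size:", len(train))
    # every round: {'a': 2, 'b': 1} train size: 9
```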

SVM

SVM stands for Support Vector Machine. SVMs are a well-known family of supervised learning algorithms, often described as non-parametric, which are used in classification and regression machine learning problems. For classification, an SVM separates the data points with a hyperplane, chosen to maximize the margin between the classes. Soft-margin SVMs can tolerate some outliers in the model training data, because individual margin violations are penalized rather than forbidden.
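To make the idea of separation by a hyperplane concrete, the sketch below classifies 2D points by the sign of w·x + b and measures the geometric margin. The weights w, the bias b, and the points are hand-picked assumptions for illustration; nothing is trained here, whereas an actual SVM would learn w and b so that the margin is as large as possible.

```python
import math

# Hypothetical hyperplane x1 + x2 - 5 = 0, chosen by hand rather than learned.
w = (1.0, 1.0)
b = -5.0


def decision_value(x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x):
    return 1 if decision_value(x) >= 0 else -1


points = [(1.0, 1.0), (4.0, 4.0), (2.0, 2.0), (3.0, 3.0)]
predictions = [predict(p) for p in points]
print(predictions)  # [-1, 1, -1, 1]

# Geometric margin: distance from the hyperplane to the closest point.
margin = min(abs(decision_value(p)) for p in points) / math.hypot(*w)
print(round(margin, 4))  # 0.7071
```

The points closest to the hyperplane, here (2, 2) and (3, 3), play the role of support vectors: they alone determine the margin.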