Instead of relying on a single model, you combine several models of the same type, or entirely different models.

The final answer is based on a majority vote, or, in the case of regression, on the average of the individual outputs.

Some models have ensemble learning built in, like random forest (a collection of decision trees).
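
A minimal sketch of this idea using scikit-learn's VotingClassifier, combining three different model types and letting them vote (the dataset and the particular models are just illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy dataset, purely for illustration.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three very different models; each one votes on every prediction.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",  # majority vote; "soft" would average predicted probabilities instead
)
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))
```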

Advantages

  • Increased accuracy
  • Reduced overfitting

Flavors

The flavors below differ in how they handle the data, not in the model itself.

Bagging (Bootstrap Aggregating)

Each model gets its own subset of the original dataset, generated by random sampling with replacement (a bootstrap sample).

Each model trains on different data samples. Final prediction via majority vote (classification) or averaging (regression).

Besides already yielding different results per model, we can also validate training quality on the data points that were not drawn into a model's bootstrap sample (its out-of-bag samples). This is similar to cross-validation.

Example: Random Forest uses bagging with decision trees.
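
A minimal bagging sketch with scikit-learn's RandomForestClassifier; oob_score=True uses the out-of-bag samples mentioned above (the dataset is a toy one for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)

# 100 trees, each trained on its own bootstrap sample of the data.
# oob_score=True evaluates every tree on the samples it never saw.
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(X, y)
print(forest.oob_score_)  # accuracy estimated from the out-of-bag samples
```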

Boosting

Sequential training where each new model focuses on correcting errors of previous models.

Misclassified instances get higher weights, forcing subsequent models to pay more attention to difficult cases.

Popular algorithms:

  • AdaBoost (Adaptive Boosting)
  • Gradient Boosting
  • XGBoost

Key difference from bagging: models are trained sequentially rather than in parallel, and each model focuses on the previous models' errors rather than on a random subset.
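
A small boosting sketch with scikit-learn's AdaBoostClassifier, which by default boosts shallow decision trees (the dataset is again just illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 weak learners trained one after another; after each round the
# misclassified samples are reweighted so the next learner focuses on them.
booster = AdaBoostClassifier(n_estimators=50, random_state=0)
booster.fit(X_train, y_train)
print(booster.score(X_test, y_test))
```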

Cross-validation

Splitting data into training, validation, and test subsets to evaluate model performance on data the model has not seen.

K-fold CV: Data divided into K subsets. Model trained K times, each time using K-1 folds for training and 1 for validation. Average performance across folds gives final metric.
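
A 5-fold cross-validation sketch using scikit-learn's cross_val_score (the model and data here are placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# Train 5 times, each time holding out a different fifth of the data
# for validation; the mean of the 5 scores is the final metric.
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(scores, scores.mean())
```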

Purpose:

  • Helps detect overfitting to the training data
  • Better estimate of model generalization
  • Helps tune hyperparameters

Common split: 60% train, 20% validation, 20% test (or 70/15/15)
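
One way to get the 60/20/20 split with two calls to train_test_split (the percentages follow the note above; everything else is illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)

# First carve off 20% for the test set, then take 25% of the remaining
# 80% for validation, which works out to a 60/20/20 split overall.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)
```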