Bias-Variance Trade-Off

The key property of a machine learning model is its ability to predict or classify new, unseen data (data that was not used to train the model). For this reason, the measure that matters is the MSE computed on test data, known as the test MSE. The goal is to choose the model with the lowest test MSE among the candidate models.

The bias-variance trade-off arises because the two sources of error move in opposite directions as model complexity changes: if we make the model more flexible, for example by adding features, its bias decreases but its variance increases (overfitting); on the other hand, if the model is too simple (it has very few parameters), it will have high bias and low variance (underfitting).
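As a rough illustration (a minimal sketch, not part of the original text), the snippet below fits polynomial regressions of increasing degree to simulated data and compares training and test MSE. The sine signal, the noise level, and the degrees 1, 3, and 15 are illustrative assumptions; the low degree should underfit while the high degree should overfit, showing a low training MSE but a higher test MSE.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated data: a smooth nonlinear signal plus noise
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Degree 1 tends to underfit (high bias); degree 15 tends to overfit (high variance)
for degree in [1, 3, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```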

It is necessary to find the right balance between bias and variance, without overfitting or underfitting the data. The prediction error of a supervised machine learning algorithm can be decomposed into three parts:

  • Bias Error
  • Variance Error
  • Irreducible Error

First we write the equation that breaks the test MSE into these three components:
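The equation below is the standard decomposition of the expected test MSE, stated here for completeness. Writing $y_0 = f(x_0) + \epsilon$ for a test observation at $x_0$ and $\hat{f}$ for the fitted model:

$$
E\big[(y_0 - \hat{f}(x_0))^2\big] = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\epsilon)
$$

The last term, $\mathrm{Var}(\epsilon)$, is the irreducible error: the noise in the data that no model can remove.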

Variance Error

The first term on the right-hand side is the variance of the estimate across many training sets. It measures how much the fitted model would change, on average, if it were trained on a different dataset. A model with high variance has typically been overfit to the training data: it captures the noise in the training set and therefore generalizes poorly to new data.
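This can be estimated by simulation. The sketch below (an illustrative assumption, not taken from the original) repeatedly draws fresh training sets, refits the model, and records its prediction at a single query point; the spread of those predictions approximates the variance term, and the gap between their average and the true value approximates the bias. The names `true_f`, `x0`, and `n_sims` are hypothetical choices for this example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

def true_f(x):
    # The (normally unknown) true signal used to generate the data
    return np.sin(x)

x0 = np.array([[1.5]])            # fixed query point where we inspect the fit
n_train, n_sims, noise = 50, 300, 0.3

for degree in [1, 10]:
    preds = []
    for _ in range(n_sims):
        # Draw a fresh training set each time and refit the model
        X = rng.uniform(-3, 3, size=(n_train, 1))
        y = true_f(X).ravel() + rng.normal(scale=noise, size=n_train)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        model.fit(X, y)
        preds.append(model.predict(x0)[0])
    preds = np.array(preds)
    variance = preds.var()                              # spread across training sets
    bias_sq = (preds.mean() - true_f(x0)[0, 0]) ** 2    # squared gap from the truth
    print(f"degree={degree:2d}  variance={variance:.4f}  bias^2={bias_sq:.4f}")
```

Under these assumptions, the flexible high-degree model shows a much larger variance at the query point than the simple linear one, while the simple model carries more bias.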
