• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Skip to footer
Finance Train

Finance Train

High Quality tutorials for finance, risk, data science

  • Home
  • Data Science
  • CFA® Exam
  • PRM Exam
  • Tutorials
  • Careers
  • Products
  • Login

Bias Variance Trade Off

Data Science

This lesson is part 8 of 22 in the course Machine Learning in Finance Using Python

The interesting property of a machine learning model is its capacity to predict or categorize new unseen data (data that was not used in training the model). For this reason the important measure is the MSE error with test data, which is denominated as test MSE. The goal is to choose a model where the test MSE is the lowest across other models. 

The bias variance tradeoff  is generated when at some point if we increase the bias of the model by creating additional features, the variance of the model increases too (overfitting), and on the other hand if the model is too simple (has very few parameters),  it will have high bias and low variance (under fitting). 

It is necessary to find the right balance between bias and variance without overfitting and under fitting the data. The prediction error in a Supervised machine learning algorithm can be divided into three different parts:

  • Bias Error
  • Variance Error
  • Irreducible Error

First we will write the equation which breaks these three factors:

Variance Error

The first term on the right hand side is the variance of the estimation across many testing sets. This measures the average model deviation among different testing data. In particular, a model with high variance is suggestive that it is overfit to the training data. In this scenario, the model is capturing the noise of the training dataset but it is poor for new data. 

Bias Error

The squared bias characterizes the difference between the averages of the estimate and the true values. A model that has high bias, would not capture the underlying behavior of the true functional form well. One example of this case is when a linear regression is used to fit a model with a non-linearity relationship between their features.

Irreducible error

Var(∈)

This term represents the minimum lower bound for the test MSE. This error cannot be reduced regardless what algorithm is used. This error reflects the influence that unknown variables have between the interaction of the features and the target variable.

Bias and variance moves in different direction and the relative rate of change between these two factors determines whether the expected test MSE increases or decreases.

In the plot below we show the trade-off between bias and variance. At first time, as flexibility increase the bias tend to drop quickly, faster than the increase in variance generating a reduction on the test MSE of the model (Total Error). However, as flexibility increases further, there is less reduction in bias and the variance rapidly increase, causing model overfitting.

The goal of any supervised machine learning algorithm is to achieve low bias and low variance, and all models need a well balance between these factors. For this reason is necessary to understand both factors to reach the optimal point that consist in the lower bias-variance.

Figure: Bias-Variance Trade Off

Previous Lesson

‹ Model Selection in Machine Learning

Next Lesson

Supervised Learning Models ›

Join Our Facebook Group - Finance, Risk and Data Science

Posts You May Like

How to Improve your Financial Health

CFA® Exam Overview and Guidelines (Updated for 2021)

Changing Themes (Look and Feel) in ggplot2 in R

Coordinates in ggplot2 in R

Facets for ggplot2 Charts in R (Faceting Layer)

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Primary Sidebar

In this Course

  • Machine Learning with Python
  • What is Machine Learning?
  • Data Preprocessing in Data Science and Machine Learning
  • Feature Selection in Machine Learning
  • Train-Test Datasets in Machine Learning
  • Evaluate Model Performance – Loss Function
  • Model Selection in Machine Learning
  • Bias Variance Trade Off
  • Supervised Learning Models
  • Multiple Linear Regression
  • Logistic Regression
  • Logistic Regression in Python using scikit-learn Package
  • Decision Trees in Machine Learning
  • Random Forest Algorithm in Python
  • Support Vector Machine Algorithm Explained
  • Multivariate Linear Regression in Python with scikit-learn Library
  • Classifier Model in Machine Learning Using Python
  • Cross Validation to Avoid Overfitting in Machine Learning
  • K-Fold Cross Validation Example Using Python scikit-learn
  • Unsupervised Learning Models
  • K-Means Algorithm Python Example
  • Neural Networks Overview

Latest Tutorials

    • Data Visualization with R
    • Derivatives with R
    • Machine Learning in Finance Using Python
    • Credit Risk Modelling in R
    • Quantitative Trading Strategies in R
    • Financial Time Series Analysis in R
    • VaR Mapping
    • Option Valuation
    • Financial Reporting Standards
    • Fraud
Facebook Group

Membership

Unlock full access to Finance Train and see the entire library of member-only content and resources.

Subscribe

Footer

Recent Posts

  • How to Improve your Financial Health
  • CFA® Exam Overview and Guidelines (Updated for 2021)
  • Changing Themes (Look and Feel) in ggplot2 in R
  • Coordinates in ggplot2 in R
  • Facets for ggplot2 Charts in R (Faceting Layer)

Products

  • Level I Authority for CFA® Exam
  • CFA Level I Practice Questions
  • CFA Level I Mock Exam
  • Level II Question Bank for CFA® Exam
  • PRM Exam 1 Practice Question Bank
  • All Products

Quick Links

  • Privacy Policy
  • Contact Us

CFA Institute does not endorse, promote or warrant the accuracy or quality of Finance Train. CFA® and Chartered Financial Analyst® are registered trademarks owned by CFA Institute.

Copyright © 2021 Finance Train. All rights reserved.

  • About Us
  • Privacy Policy
  • Contact Us