Finance Train

Build the Predictive Model

Data Science, Risk Management

This lesson is part 7 of 28 in the course Credit Risk Modelling in R

We have now gathered our data and cleaned and transformed it to suit our modeling needs. The next step is to build the model. The goal of predictive modeling is to use statistical techniques to build a model that predicts future outcomes.

We use well-known statistical methods (algorithms) to find the function (model) that best describes the dependency between different variables (a.k.a. features). The crux is to fit a model to the data so that the resulting function can predict the outcome from the given features. In our example, Account Balance, Loan Purpose, Telephone, etc. are all predictors/features. Creditability is the outcome/response, i.e., the value we are trying to predict. It is also called the target class, response variable or dependent variable.

We create the model by applying a chosen algorithm to the data so that it estimates the relationship between the predictors and the response variable. This is called training the model. Once the model is trained, it can be used to predict creditability given the other features of the loan applicant/borrower.
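The train-then-predict workflow described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data is synthetic, and the column names (`account_balance`, `duration_months`, `creditability`) and coefficients are made up for the example rather than taken from the course data set. It uses `glm()` with a binomial family, which fits a logistic regression.

```r
# Synthetic stand-in for the credit data (illustrative values, not real data).
set.seed(42)
n <- 200
credit <- data.frame(
  account_balance = sample(1:4, n, replace = TRUE),  # treated as numeric for brevity
  duration_months = sample(6:48, n, replace = TRUE)
)

# Simulate the response: 1 = Good Credit, 0 = Bad Credit.
p_good <- plogis(-0.5 + 0.6 * credit$account_balance -
                   0.03 * credit$duration_months)
credit$creditability <- rbinom(n, size = 1, prob = p_good)

# Hold out 60 rows so predictions are made on data the model has not seen.
train_idx <- sample(seq_len(n), size = 140)
train <- credit[train_idx, ]
test  <- credit[-train_idx, ]

# Train: estimate the relationship between predictors and response.
model <- glm(creditability ~ account_balance + duration_months,
             data = train, family = binomial)

# Predict: probability of Good Credit, then a 0/1 class at a 0.5 cutoff.
prob_good  <- predict(model, newdata = test, type = "response")
pred_class <- ifelse(prob_good >= 0.5, 1, 0)

head(data.frame(prob_good, pred_class, actual = test$creditability))
```

The 0.5 cutoff is an assumption of this sketch; in practice the threshold is chosen based on the cost of misclassifying good and bad borrowers.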

As established earlier, the problem we are looking at is a binary classification problem: each applicant's Creditability is either Bad Credit (0) or Good Credit (1).

Below is a list of the popular algorithms used for classification problems.

  1. Linear Classifiers: Logistic Regression, Naive Bayes Classifier
  2. Support Vector Machines
  3. Decision Trees
  4. Boosted Trees
  5. Random Forest
  6. Neural Networks
  7. Nearest Neighbor

Most often, a data scientist builds several models using different algorithms and then either selects the best-performing model or averages the predictions of all the models. In this case study, we will build the model using just one algorithm: Logistic Regression.
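As a rough illustration of comparing candidate models and averaging their predictions, the sketch below fits two models on synthetic data and scores each one, plus their average, on a held-out set. To stay within base R, the two "algorithms" here are simply binomial GLMs with logit and probit links, standing in for the more varied algorithms listed above. The variables `x1`, `x2` and `y` and the helper `accuracy()` are hypothetical names introduced for this example.

```r
# Synthetic data: two numeric predictors and a 0/1 response (illustrative only).
set.seed(7)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.4 + 1.2 * x1 - 0.8 * x2))
dat <- data.frame(y, x1, x2)

train <- dat[1:200, ]
test  <- dat[201:300, ]

# Two candidate models trained on the same data.
fit_logit  <- glm(y ~ x1 + x2, data = train, family = binomial(link = "logit"))
fit_probit <- glm(y ~ x1 + x2, data = train, family = binomial(link = "probit"))

# Predicted probabilities on the held-out rows, and their simple average.
p_logit  <- predict(fit_logit,  newdata = test, type = "response")
p_probit <- predict(fit_probit, newdata = test, type = "response")
p_avg    <- (p_logit + p_probit) / 2

# Share of held-out rows classified correctly at a 0.5 cutoff.
accuracy <- function(p, actual) mean((p >= 0.5) == (actual == 1))

acc <- sapply(list(logit = p_logit, probit = p_probit, average = p_avg),
              accuracy, actual = test$y)
acc
```

Accuracy is used here only because it is easy to compute; other measures can be substituted in the same structure.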

Copyright © 2021 Finance Train. All rights reserved.
