Case Study - German Credit - Steps to Build a Predictive Model

We will perform several steps to build our predictive model. These steps are explained below:

Step 1 – Data Selection

The first step is to obtain the dataset that we will use to build the model. For this case study, we use the German Credit Scoring Data Set in numeric format, which contains information about 21 attributes of 1000 loans.

Step 2 – Data Pre-Processing

The purpose of preprocessing is to make the raw data suitable for data science algorithms. For example, we may want to remove outliers, or to remove or impute missing values.
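As a minimal sketch of what such preprocessing can look like, the Python snippet below fills missing values with the median and drops outliers using the common 1.5 x IQR rule. The function names, the choice of median imputation and the sample loan amounts are assumptions made for illustration; they are not taken from the case study dataset.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical loan amounts: one missing value and one extreme value.
amounts = [1200, 1500, None, 1800, 2100, 2400, 50000]
filled = impute_median(amounts)
cleaned = drop_outliers(filled)
print(filled)
print(cleaned)
```

Whether to drop, cap or keep extreme values depends on the data; an unusually large loan may be a legitimate observation rather than an error.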

Step 3 – Features Selection

The raw data may contain many features (independent variables), and some of them will contribute little or nothing to predicting the response variable. Such features should be removed from the dataset. We also need to check whether the same information is represented by two attributes; if so, we can safely remove one of them. Redundancy of this kind can be detected by computing the correlation between attributes. The resulting dataset, with its reduced number of features, is ready for use by the classification algorithms.
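The correlation check described above could be sketched as follows in Python. The Pearson coefficient is computed by hand, and a column is dropped when its absolute correlation with an already kept column exceeds a threshold. The 0.9 threshold, the column names and the values are hypothetical and only serve the illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_redundant(columns, threshold=0.9):
    """Keep a column only if |r| with every already kept column is <= threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical attributes: the first two carry the same information.
columns = {
    "duration_months": [6, 12, 24, 36, 48],
    "duration_years": [0.5, 1.0, 2.0, 3.0, 4.0],
    "age": [35, 22, 51, 28, 44],
}
selected = drop_redundant(columns)
print(list(selected))
```

This greedy pass keeps whichever of two correlated columns appears first, so the column order matters. Correlation also only captures linear relationships between numeric attributes.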

Step 4 – Building Classification Model

In this step, we build our classification model. We split the data into a training set and a test set, and then train the model on the training set. Once we have the fitted model, we apply it to the test set to predict the values of the response variable.
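To make the split, fit and predict sequence concrete, here is a small Python sketch. The text does not name a particular algorithm, so a nearest-centroid classifier stands in for it here purely as an assumption. The 70/30 split, the seed and the toy data are likewise hypothetical.

```python
import random


def train_test_split(rows, labels, test_fraction=0.3, seed=42):
    """Shuffle the row indices and split them into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)

    def pick(idx):
        return [rows[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[n_test:]), pick(indices[:n_test])


def fit_centroids(rows, labels):
    """Fit the model: compute the mean feature vector of each class."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, rows):
    """Assign each row to the class whose centroid is closest."""
    def nearest(row):
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
        )
    return [nearest(row) for row in rows]


# Hypothetical, well-separated toy data: [duration in months, amount in thousands].
rows = [[6 + i, 1.0 + 0.1 * i] for i in range(10)]
rows += [[40 + i, 8.0 + 0.1 * i] for i in range(10)]
labels = ["good"] * 10 + ["bad"] * 10

(X_train, y_train), (X_test, y_test) = train_test_split(rows, labels)
model = fit_centroids(X_train, y_train)
predictions = predict(model, X_test)
print(predictions)
```

The model only ever sees the training rows; the test rows are held back so that the predictions can later be checked against labels the model was not fitted on.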

Step 5 – Evaluating Predictions

The resulting predictions are then compared against the original class labels of the test dataset to measure the accuracy of the model.
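A short Python sketch of this comparison is shown below. It computes the accuracy as the share of matching labels and also counts each (actual, predicted) pair, which gives the entries of a confusion matrix. The label values are made up for the example.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual class labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(actual, predicted))


# Hypothetical test-set labels and model predictions.
actual = ["good", "good", "bad", "good", "bad", "bad", "good", "good"]
predicted = ["good", "bad", "bad", "good", "good", "bad", "good", "good"]

score = accuracy(actual, predicted)
counts = confusion_counts(actual, predicted)
print(score)  # 0.75
print(counts[("bad", "good")])  # 1 bad loan predicted as good
```

In credit scoring the two kinds of error usually carry different costs, so the pair counts are often more informative than the single accuracy figure.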
