Predictive Modelling: Averaging Results from Multiple Models

Our final model is to combine the result of previous machine learning models and provide a single prediction by averaging probabilities from all previous models.

predict_loan_status_ensemble = predict_loan_status_logit +
                               predict_loan_status_svm +
                               predict_loan_status_rf +
                               predict_loan_status_xgb
predict_loan_status_ensemble = predict_loan_status_ensemble / 4
rocCurve_ensemble = roc(response = data_test$loan_status,
               predictor = predict_loan_status_ensemble)
auc_curve = auc(rocCurve_ensemble)
plot(rocCurve_ensemble,legacy.axes = TRUE,print.auc = TRUE,col="red",main="ROC(Ensemble Avg.)")

> rocCurve_ensemble

Call:
roc.default(response = data_test$loan_status, predictor = predict_loan_status_ensemble)

Data: predict_loan_status_ensemble in 5358 controls (data_test$loan_status Default) < 12602 cases (data_test$loan_status Fully.Paid).
Area under the curve: 0.7147
> 
predict_loan_status_label = ifelse(predict_loan_status_ensemble<0.5,"Default","Fully.Paid")
c = confusionMatrix(predict_loan_status_label,data_test$loan_status,positive="Fully.Paid")

table_perf[5,] = c("Ensemble",
  round(auc_curve,3),
  as.numeric(round(c$overall["Accuracy"],3)),
  as.numeric(round(c$byClass["Sensitivity"],3)),
  as.numeric(round(c$byClass["Specificity"],3)),
  as.numeric(round(c$overall["Kappa"],3))
  )

We get the following performance:

> tail(table_perf,1)
     model   auc accuracy sensitivity specificity kappa
5 Ensemble 0.715     0.65       0.637        0.68 0.275
> 

Related Downloads

Data Science in Finance: 9-Book Bundle

Data Science in Finance Book Bundle

Master R and Python for financial data science with our comprehensive bundle of 9 ebooks.

What's Included:

  • Getting Started with R
  • R Programming for Data Science
  • Data Visualization with R
  • Financial Time Series Analysis with R
  • Quantitative Trading Strategies with R
  • Derivatives with R
  • Credit Risk Modelling With R
  • Python for Data Science
  • Machine Learning in Finance using Python

Each book includes PDFs, explanations, instructions, data files, and R code for all examples.

Get the Bundle for $39 (Regular $57)
JOIN 30,000 DATA PROFESSIONALS

Free Guides - Getting Started with R and Python

Enter your name and email address below and we will email you the guides for R programming and Python.

Data Science in Finance: 9-Book Bundle

Data Science in Finance Book Bundle

Master R and Python for financial data science with our comprehensive bundle of 9 ebooks.

What's Included:

  • Getting Started with R
  • R Programming for Data Science
  • Data Visualization with R
  • Financial Time Series Analysis with R
  • Quantitative Trading Strategies with R
  • Derivatives with R
  • Credit Risk Modelling With R
  • Python for Data Science
  • Machine Learning in Finance using Python

Each book comes with PDFs, detailed explanations, step-by-step instructions, data files, and complete downloadable R code for all examples.