A common source of confusion for beginners in data science is the difference between a model and an algorithm. In this article, I will try to explain that difference in simple terms.
An algorithm is a set of rules that must be followed in the right order to solve a problem. A model is what you build by using the algorithm.
For example, let’s say you have data on more than 5,000 loans issued by a bank, including whether each loan defaulted. You have been asked to analyze this data and come up with a way of predicting whether a new loan applicant will default on their loan obligation. What you will build here is the model. Once the model is ready, you pass the new applicant’s data into it, and the model returns the probability that this applicant will default. Based on that probability, the bank can decide whether to extend the loan and, if it does, what interest rate to charge. This job is done by the model you have built. The accuracy of the model depends on how well it has captured the patterns in the data.
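To make the idea of "passing an applicant into the model" concrete, here is a minimal Python sketch. It assumes the finished model is a simple logistic formula. The feature names (debt_to_income, years_employed) and the coefficient values are invented for illustration and do not come from any real bank data.

```python
import math

# Hypothetical coefficients, invented for illustration only.
# In practice they would be the result of training on the bank's loan data.
MODEL = {
    "intercept": -1.0,
    "debt_to_income": 4.0,    # higher ratio -> higher estimated risk
    "years_employed": -0.2,   # longer employment -> lower estimated risk
}

def predict_default_probability(applicant):
    """Return the model's estimated probability of default (between 0 and 1)."""
    score = MODEL["intercept"]
    for feature in ("debt_to_income", "years_employed"):
        score += MODEL[feature] * applicant[feature]
    # The logistic function squeezes the score into the range 0..1.
    return 1.0 / (1.0 + math.exp(-score))

# A made-up new applicant.
new_applicant = {"debt_to_income": 0.5, "years_employed": 5}
probability = predict_default_probability(new_applicant)
print(f"Estimated probability of default: {probability:.2f}")
```

Notice that nothing in this snippet learns anything. The model is just a fixed formula that turns inputs into an output.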
So, what’s an algorithm then? The algorithm is what you use to train the model on the data. You start with the loan data. In most cases, you will divide the dataset into a training set and a test set. Then you choose a statistical algorithm to use as the basis for the model. Regression is a popular family of algorithms used in predictive modeling. In our case, since we are trying to predict a binary outcome (default or no default), a suitable algorithm is logistic regression. We run the algorithm over our training data and, as a result, get a logistic regression equation with variables and coefficients, where the coefficients are chosen to fit our data in the best possible way. This equation is our model.
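The sketch below shows this split between algorithm and model in plain Python. It is only one possible way to fit a logistic regression (batch gradient descent, chosen here because it is short to write), and the ten data rows are invented for illustration. The function train_logistic_regression is the algorithm; the intercept and coefficients it returns are the model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """The algorithm: batch gradient descent on the log loss.

    Returns the model: an intercept and one coefficient per feature.
    """
    n = len(rows)
    n_features = len(rows[0])
    intercept = 0.0
    coefs = [0.0] * n_features
    for _ in range(epochs):
        grad_b = 0.0
        grad_w = [0.0] * n_features
        for x, y in zip(rows, labels):
            p = sigmoid(intercept + sum(w * xi for w, xi in zip(coefs, x)))
            err = p - y  # prediction error for this row
            grad_b += err
            for j in range(n_features):
                grad_w[j] += err * x[j]
        # Move each parameter a small step against its average gradient.
        intercept -= learning_rate * grad_b / n
        coefs = [w - learning_rate * g / n for w, g in zip(coefs, grad_w)]
    return intercept, coefs

def predict_probability(model, x):
    intercept, coefs = model
    return sigmoid(intercept + sum(w * xi for w, xi in zip(coefs, x)))

# Toy data invented for illustration.
# Features: [debt_to_income, years_employed / 10]; label 1 = default, 0 = no default.
data = [
    ([0.90, 0.1], 1), ([0.80, 0.2], 1), ([0.70, 0.1], 1), ([0.85, 0.3], 1),
    ([0.20, 0.8], 0), ([0.30, 0.9], 0), ([0.10, 0.6], 0), ([0.25, 0.7], 0),
    ([0.75, 0.2], 1), ([0.15, 0.9], 0),
]

# Divide the dataset into a training set (80%) and a test set (20%).
random.Random(0).shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

# Run the algorithm over the training data; what comes back is the model.
model = train_logistic_regression([x for x, _ in train], [y for _, y in train])
print("intercept:", model[0])
print("coefficients:", model[1])
```

In a real project you would more likely call a library implementation than write the training loop yourself, but the division of labor is the same: the training procedure runs once, and the fitted equation is what you keep.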
Once the model is ready, a data scientist will test its accuracy and fine-tune it to improve the results. Once the results are satisfactory, the model is deployed across the bank so that credit officers can start using it to assess new loan applications.
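Testing accuracy can be as simple as comparing the model's predictions with known outcomes on data it was not trained on. The sketch below assumes a model in the form (intercept, coefficients) and a cutoff of 0.5 for calling a loan a default. The model values and the four test rows are hypothetical, made up for this example.

```python
import math

def predict_probability(model, x):
    intercept, coefs = model
    z = intercept + sum(w * xi for w, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(model, rows, labels, threshold=0.5):
    """Fraction of rows where the predicted class matches the true label."""
    correct = 0
    for x, y in zip(rows, labels):
        predicted = 1 if predict_probability(model, x) >= threshold else 0
        if predicted == y:
            correct += 1
    return correct / len(rows)

# Hypothetical trained model: (intercept, [debt_to_income, years_employed / 10]).
model = (-1.0, [4.0, -2.0])

# Hypothetical held-out test rows with known outcomes (1 = default, 0 = no default).
test_rows = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5], [0.4, 0.1]]
test_labels = [1, 0, 1, 0]

print("test accuracy:", accuracy(model, test_rows, test_labels))
```

Here the model gets the last row wrong, which is the kind of result that would prompt another round of fine-tuning before deployment.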
I hope this article gives you some clarity on the difference. To summarize, an algorithm is a method or procedure we follow to get something done or to solve a problem. A model is a computation or formula, produced by running an algorithm on data, that takes some values as input and produces some value as output. So, the next time you discuss a project, say that you are building a model using a given algorithm.