The predictions that the model returns are compared with the real observations of the variable to obtain a measure of model performance. To evaluate this performance, both classification and regression problems need a function, called the **loss function**, that computes the error of the model.

This **loss function** measures how accurate the model is by calculating the difference between the true value of the response *y* and the estimate from the model *ŷ*.

The **loss function** differs depending on whether we are working on a *classification* problem or a *regression* problem. In the *classification* setting, the common *loss functions* are the *0-1 loss* and the *cross entropy*.

**0-1 loss** is a very basic loss function that assigns 0 to correct predictions and 1 to incorrect predictions. It does not care about how the errors are made. *Cross entropy* measures the performance of a classification model whose output is a probability between 0 and 1; it increases as the predicted probability diverges from the actual label. On the other hand, in the *regression* problem, a common loss function is the *Mean Squared Error* (MSE). We will explain these metrics in the Model Selection section.

**Loss Function Interpretation**
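
To make the interpretation concrete, here is a minimal sketch of the 0-1 loss, the cross entropy, and the MSE in plain Python. The function names and the toy labels are assumptions made for illustration, not part of any library, and the cross entropy is written for the binary case only.

```python
import math

def zero_one_loss(y_true, y_pred):
    """Average 0-1 loss: each correct label costs 0, each wrong label costs 1."""
    errors = [0 if t == p else 1 for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

labels = [1, 0, 1, 1]  # toy labels, purely illustrative

# One of the four hard predictions is wrong.
print(zero_one_loss(labels, [1, 0, 0, 1]))

# Probabilities close to the labels give a smaller cross entropy
# than probabilities that drift away from them.
print(cross_entropy(labels, [0.9, 0.1, 0.8, 0.7]))
print(cross_entropy(labels, [0.6, 0.4, 0.3, 0.5]))

# Regression example with two observations.
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
```

Note that the 0-1 loss only counts mistakes, while the cross entropy also reflects how confident the model was in each prediction.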

The accuracy of the model is higher when the **loss function** is at a minimum, that is, when the difference between the true values and the estimated values is small. Many factors affect how far the *loss function* can be minimized, such as the quality of the data, the number of features used to train the model, and the size of the data set.

Researchers and machine learning engineers work with the model in order to minimize the **loss function**. However, if the *loss function* is minimized too aggressively on the training data, the model can get good results on the training data but fail to predict new data well.

This issue arises when the model is “overfitted” to the data used in the training phase and has not learned how to generalize to new, unseen data. In the machine learning field, this situation is called **overfitting**.
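
The gap between training and test error can be illustrated with a small experiment. The sketch below is an assumption-laden toy setup, not the method of any particular library: it draws noisy samples from `y = sin(x)`, fits a hand-written k-nearest-neighbour regressor, and prints the MSE on the training set and on fresh data. With `k = 1` the model memorizes the training points, so its training loss is zero while its loss on new data is not.

```python
import math
import random

def make_data(n, rng, noise=0.3):
    """Sample n noisy points from y = sin(x) on [0, 2*pi] (toy data)."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def knn_predict(x, train_x, train_y, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(xs, ys, train_x, train_y, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(y - knn_predict(x, train_x, train_y, k)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

rng = random.Random(0)                  # fixed seed so the run is repeatable
train_x, train_y = make_data(60, rng)   # data the model is fitted on
test_x, test_y = make_data(200, rng)    # new, unseen data

for k in (1, 9):
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(test_x, test_y, train_x, train_y, k)
    print(f"k={k}: train MSE={train_error:.4f}, test MSE={test_error:.4f}")
```

A larger `k` smooths the predictions, which typically raises the training loss slightly while reducing the gap to the loss on new data.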

**Overfitting** happens when a model captures the noise along with the underlying pattern in the data. These models have low *bias* and high *variance*.

**Bias** is the difference between the average prediction of the model and the correct value that we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the problem.

**Variance** is the variability of the model prediction for a given data point. A model with high variance pays a lot of attention to the training data and does not generalize well to data it has not seen before. The result is that the model performs very well on training data but has high error rates on new data.

Bias and variance together lead to a common situation in machine learning called *the bias variance trade off*.
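
The two definitions above can be estimated by simulation. The following sketch assumes a toy problem in which the true relationship `y = sin(x)` is known, and compares two hypothetical models written for this example: one that always predicts the training mean (very simple) and one that copies the nearest training point (very flexible). Each model is retrained on many random training sets, and the bias and variance of its prediction at a single point `x0` are computed from the resulting predictions.

```python
import math
import random

def sample_training_set(n, rng, noise=0.3):
    """Draw n noisy observations of y = sin(x) on [0, 2*pi] (toy data)."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def mean_model(x, xs, ys):
    """Ignores x and always predicts the training mean (oversimplified model)."""
    return sum(ys) / len(ys)

def nearest_neighbour_model(x, xs, ys):
    """Predicts the target of the single closest training point (flexible model)."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
    return ys[i]

def bias_and_variance(model, x0, n_repeats=200, n_train=40, seed=0):
    """Estimate bias and variance of the model's prediction at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_repeats):
        xs, ys = sample_training_set(n_train, rng)  # a fresh training set each time
        preds.append(model(x0, xs, ys))
    avg = sum(preds) / len(preds)
    bias = avg - math.sin(x0)  # average prediction minus the correct value
    variance = sum((p - avg) ** 2 for p in preds) / len(preds)
    return bias, variance

x0 = math.pi / 2  # point where the true value is sin(x0) = 1
for name, model in (("mean", mean_model), ("1-NN", nearest_neighbour_model)):
    bias, variance = bias_and_variance(model, x0)
    print(f"{name}: bias={bias:.3f}, variance={variance:.3f}")
```

In this setup the mean model shows the larger bias and the nearest-neighbour model the larger variance, which is the tension the trade off refers to.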