Multiple Linear Regression

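The multiple linear regression model states that a response y can be estimated from a set of input features x plus an error term ɛ. The model can be expressed with the following mathematical equation:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon = \beta^T X + \varepsilon$$

βᵀX is the matrix notation of the equation, where β, X ∈ ℝ^(p+1) and ɛ ~ N(μ, σ²). Conventionally μ = 0, since any nonzero mean of the error is absorbed by the intercept β0.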

βᵀ (the transpose of β) and X are both real-valued vectors of dimension p+1, and ɛ is the error term. Its sample counterpart, the residual, is the difference between the true observation of the variable y and the prediction of the model.

The vector βᵀ = (β0, β1, …, βp) stores all the beta coefficients of the model. Each coefficient measures how a one-unit change in its independent variable affects the dependent (target) variable when the other independent variables are held fixed.

The vector X = (1, x1, x2, …, xp) holds all the values of the independent variables. Both vectors (β and X) are (p+1)-dimensional because of the need to include an intercept term: the leading 1 in X multiplies the intercept β0.
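As a small illustration of the vector form described above, the following sketch computes a prediction as the dot product of β and X. The coefficient values, the feature values, and the helper name `predict` are made up for the example and do not come from any dataset.

```python
import numpy as np

# Hypothetical coefficients for p = 2 features; beta[0] is the intercept.
beta = np.array([1.0, 2.0, -0.5])

def predict(features, beta):
    """Return beta^T X, where X is the feature vector with a leading 1."""
    x = np.concatenate(([1.0], features))  # X = (1, x1, ..., xp)
    return float(beta @ x)

# Prediction for x1 = 3, x2 = 4: 1 + 2*3 - 0.5*4
y_hat = predict(np.array([3.0, 4.0]), beta)

# Raising x1 by one unit (x2 held fixed) changes the prediction by beta[1].
y_hat_shifted = predict(np.array([4.0, 4.0]), beta)
print(y_hat, y_hat_shifted - y_hat)
```

Prepending the constant 1 is what lets the intercept be handled as just another coefficient, which is why both vectors have p+1 entries.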

The goal of the linear regression model is to minimize the difference between the predictions and the real observations of the target variable. For this purpose, a method called Ordinary Least Squares (OLS) is used, which derives the optimal set of coefficients for fitting the model.
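To make the idea of deriving the coefficients concrete, here is a minimal sketch that fits a least squares model with NumPy's `np.linalg.lstsq`. The data are simulated, and the "true" coefficients, sample size, and noise level are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Simulated data: n observations of p = 2 features (assumed values).
n, p = 200, 2
features = rng.normal(size=(n, p))
true_beta = np.array([1.0, 2.0, -0.5])  # hypothetical (beta0, beta1, beta2)

# Design matrix: a column of ones for the intercept, then the features.
X = np.column_stack([np.ones(n), features])
y = X @ true_beta + rng.normal(scale=0.1, size=n)  # add a Gaussian error term

# Least squares estimate of the coefficients.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

With low noise and a few hundred observations, the estimated coefficients should land close to the values used to simulate the data.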

Ordinary Least Squares

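Formally, OLS finds the coefficients that minimize the Residual Sum of Squares (RSS) between the observations of the target variable and the predictions of the model. The RSS is the loss function of the linear regression model and, for n observations, has the following standard formulation:

$$RSS(\beta) = \sum_{i=1}^{n} \left( y_i - \beta^T X_i \right)^2$$

Here y_i is the i-th observation of the target variable and X_i is the corresponding vector of independent variables, including the leading 1.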
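The RSS can be sketched in a few lines, using its standard definition as the sum of squared differences between observations and predictions. The four data points and the helper name `rss` below are invented for the example. The sketch also checks that moving the coefficients away from the least squares solution increases the RSS.

```python
import numpy as np

def rss(X, y, beta):
    """Residual Sum of Squares: sum of squared (observation - prediction)."""
    residuals = y - X @ beta
    return float(residuals @ residuals)

# Toy data (assumed): one feature plus an intercept column of ones.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 8.0])

# Coefficients that minimize the RSS, and an arbitrarily perturbed alternative.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_other = beta_ols + np.array([0.5, -0.2])

rss_ols = rss(X, y, beta_ols)
rss_other = rss(X, y, beta_other)
print(rss_ols, rss_other)
```

Any coefficient vector other than the least squares solution gives a larger RSS on the same data, which is what minimizing the loss means here.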
