Analysis of Variance or ANOVA

ANOVA is a tool for reviewing a regression analysis. It decomposes the variation in the dependent Y variable into the part explained by the variation in the independent X variable and the part attributed to the error (or residual) term.

Regression Sum of Squares (RSS) measures the amount of variation in Y that is explained by the variation in X.

Mean Sum of Squares of the Regression (MSSR) = RSS/k, where k, the number of independent X variables, is the regression degrees of freedom.

Sum of the Squares of the Errors (SSE) measures the amount of variation in Y that is not explained by X and is instead attributed to the error term.

  • Mean Sum of Squares of the Errors (MSSE) = SSE/(n - k - 1), where n is the number of observations and k is the number of independent X variables

    √MSSE = SEE

  • Be careful not to confuse SSE with the Standard Error of the Estimate (SEE). SEE is the standard deviation of the residuals, which describes how far observations typically fall above or below the model line, while SSE is a measure of the error term's share of the total variation in the dependent variable.
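The step from SSE to MSSE to SEE can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `standard_error_of_estimate` and the input values (SSE = 200, n = 52, k = 1) are made up for the example.

```python
import math


def standard_error_of_estimate(sse, n, k):
    """Return (MSSE, SEE) given SSE, n observations, and k independent variables.

    Hypothetical helper for illustration only.
    """
    df_error = n - k - 1  # error degrees of freedom
    if df_error <= 0:
        raise ValueError("need more observations than k + 1")
    msse = sse / df_error       # mean sum of squares of the errors
    see = math.sqrt(msse)       # SEE is the square root of MSSE
    return msse, see


# Made-up inputs: SSE = 200 from a one-variable regression on 52 observations.
msse, see = standard_error_of_estimate(sse=200.0, n=52, k=1)
print(msse, see)  # 4.0 2.0
```

Note that SSE (200) and SEE (2) differ by orders of magnitude here, which is one quick way to check that the two have not been mixed up.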

Total Sum of Squares (TSS) measures the total amount of variation in Y and equals RSS + SSE.
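To see the decomposition at work, the sketch below fits a one-variable least-squares line in plain Python and computes the three sums of squares. The data points and the function name `anova_decomposition` are invented for illustration, and the example assumes a simple regression with an intercept (the identity TSS = RSS + SSE relies on that).

```python
def anova_decomposition(x, y):
    """Return (RSS, SSE, TSS) for a simple least-squares regression of y on x.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Ordinary least-squares slope and intercept.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]

    rss = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return rss, sse, tss


# Made-up sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

rss, sse, tss = anova_decomposition(x, y)
print(rss, sse, tss)
print(abs(tss - (rss + sse)) < 1e-9)  # the decomposition holds up to rounding
```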

Note that mathematical relationships exist among R² (the coefficient of determination of Y on X), RSS, SSE, and TSS: TSS = RSS + SSE, and R² = RSS/TSS = 1 - SSE/TSS. These enable the analyst to derive one value if given values for the others. Knowing these relationships can save you time on the exam and greatly simplify a problem that appears complex.
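As a sketch of how one value can be backed out from the others, the snippet below fills in whichever of RSS, SSE, or TSS is missing and then computes R² as RSS/TSS. The function name `complete_anova` and the numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def complete_anova(rss=None, sse=None, tss=None):
    """Fill in the missing sum of squares and compute R-squared.

    Hypothetical helper: expects at least two of rss, sse, tss.
    """
    known = sum(v is not None for v in (rss, sse, tss))
    if known < 2:
        raise ValueError("provide at least two of rss, sse, tss")

    # Use TSS = RSS + SSE to recover the missing piece.
    if tss is None:
        tss = rss + sse
    elif rss is None:
        rss = tss - sse
    elif sse is None:
        sse = tss - rss

    r_squared = rss / tss  # equivalently 1 - sse / tss
    return {"RSS": rss, "SSE": sse, "TSS": tss, "R2": r_squared}


# Made-up problem: SSE = 20 and TSS = 100 are given.
result = complete_anova(sse=20.0, tss=100.0)
print(result["RSS"], result["R2"])  # 80.0 0.8
```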
