# Regression Analysis and Assumption Violations

**Heteroskedasticity**

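Heteroskedasticity means the variance of the error terms is not constant across observations. There are two types, conditional and unconditional. The type that matters when evaluating model validity is conditional heteroskedasticity.

Conditional = the variance of the error terms changes in a systematic manner that is correlated with the values of the independent variables. Unconditional heteroskedasticity is not related to the independent variables and usually causes no major problems for the regression.

On a plot of the residuals against an independent variable, this problem typically shows up as a fan or cone shape: the residuals spread out (or narrow) as the independent variable increases.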

The Breusch-Pagan test will test for Conditional Heteroskedasticity.
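To make the mechanics concrete, here is a minimal sketch of the Breusch-Pagan statistic for the simplest case of one independent variable. It regresses the squared residuals on the independent variable and computes n × R² from that second regression, which is compared against a chi-square critical value with degrees of freedom equal to the number of independent variables (3.841 at the 5% level for one variable). The function names and the simulated data are assumptions made for illustration, not part of the original notes.

```python
import random
from statistics import fmean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """Coefficient of determination of the regression of y on x."""
    a, b = simple_ols(x, y)
    y_bar = fmean(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 0.0 if sst == 0 else 1.0 - sse / sst


def breusch_pagan_stat(x, y):
    """n * R^2 from regressing the squared residuals on x."""
    a, b = simple_ols(x, y)
    squared_residuals = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, squared_residuals)


# Hypothetical data: the error spread grows with x (conditional heteroskedasticity).
rng = random.Random(0)
x = [rng.uniform(1, 10) for _ in range(200)]
y = [2 + 3 * xi + rng.gauss(0, xi) for xi in x]

CHI2_CRIT_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom
stat = breusch_pagan_stat(x, y)
print(f"BP statistic = {stat:.2f}, reject homoskedasticity: {stat > CHI2_CRIT_5PCT_1DF}")
```

The test is one-tailed: only a large statistic counts as evidence of conditional heteroskedasticity. With several independent variables, the second regression would include all of them and the degrees of freedom would increase accordingly.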

When this problem is present, the standard errors of the coefficients are typically underestimated, so the model’s t-scores will be artificially high, indicating a false significance of relationships.

**Serial Correlation**

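This is correlation among your model’s error terms: the error for one observation is related to the error for another, most often the one that precedes it in time.

- When serial correlation (or autocorrelation) is present, your standard errors, including the SEE, may be incorrect, which makes the t-tests unreliable.
- The Durbin-Watson test statistic can be used to detect serial correlation in multiple regression models, as well as in simple and log-linear time series models. It cannot be used on auto-regressive time series models, because those models use lagged values of the dependent variable as independent variables.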
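As a rough illustration, the Durbin-Watson statistic is the sum of squared changes between consecutive residuals divided by the sum of squared residuals. For large samples it is approximately 2(1 − r), where r is the correlation between consecutive residuals, so a value near 2 suggests no serial correlation, a value well below 2 suggests positive serial correlation, and a value well above 2 suggests negative serial correlation. The function name and the residual values below are made up for the sketch.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    # Numerator: squared changes between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that stay positive for a while, then stay negative.
# Runs like this are a sign of positive serial correlation, so DW falls below 2.
sample_residuals = [0.5, 0.6, 0.4, 0.5, -0.3, -0.5, -0.4, -0.6]
print(f"DW = {durbin_watson(sample_residuals):.2f}")
```

In a formal test, the statistic is compared against lower and upper critical values from a Durbin-Watson table, which depend on the sample size and the number of independent variables.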

**Multi-collinearity**

Two or more of your independent variables (or linear combinations of them) are highly correlated with each other.

- A small amount of multi-collinearity is tolerable and is common in regression models involving several independent variables.
- A common symptom of this problem is a high coefficient of determination (R²) despite low t-scores for the individual independent variables (i.e. they appear insignificant).
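One simple first check, sketched below, is to compute the pairwise correlations between the independent variables and flag any pair above a cutoff. This is a screening aid under stated assumptions rather than a formal test: the 0.7 cutoff, the function names, and the variable names and values are all hypothetical, and with more than two independent variables multi-collinearity can exist even when no single pair is highly correlated.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    a_bar, b_bar = fmean(a), fmean(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def highly_correlated_pairs(columns, threshold=0.7):
    """Return (name_1, name_2, r) for each pair of variables with |r| >= threshold."""
    flagged = []
    for (name_1, col_1), (name_2, col_2) in combinations(columns.items(), 2):
        r = pearson(col_1, col_2)
        if abs(r) >= threshold:
            flagged.append((name_1, name_2, r))
    return flagged


# Hypothetical independent variables (illustrative numbers only).
independent_vars = {
    "gdp_growth": [2.1, 2.4, 1.8, 3.0, 2.7, 1.5, 2.9, 3.2],
    "industrial_output": [4.0, 4.7, 3.5, 6.1, 5.3, 2.9, 5.8, 6.5],
    "interest_rate": [1.5, 3.0, 2.0, 2.5, 1.0, 2.8, 3.2, 1.2],
}

for name_1, name_2, r in highly_correlated_pairs(independent_vars):
    print(f"{name_1} vs {name_2}: r = {r:.3f}")
```

If a pair is flagged and the regression also shows the high R² with low t-scores symptom, dropping one of the two variables and re-running the regression is a common next step.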
