7 No Multicollinearity

What this assumption means: Each predictor makes a unique contribution to explaining the outcome. A substantial amount of the information contained in one predictor is not contained in the other predictors (i.e., the predictors are not redundant).

Why it matters: Multicollinearity inflates the standard errors of the affected coefficients, making their estimates less precise and hypothesis tests less powerful.

How to diagnose violations: A predictor’s variance inflation factor (VIF) should be below a cutoff, such as 5 or 10.

How to address it: Combine problematic variables into composite or factor scores, drop variables, or use structural equation modeling to account for shared variance.

7.1 Example Model

If you have not already done so, download the example dataset, read about its variables, and import the dataset into R.

Then, use the code below to fit this page’s example model.

acs <- readRDS("acs2019sample.rds")
mod <- lm(income ~ hours_worked + weeks_worked + age, acs, na.action = na.exclude)

7.2 Statistical Tests

Use the variance inflation factor (VIF) to detect multicollinearity. It tells us how much of one predictor’s variance is explained by the other predictors, and how much a coefficient’s standard error is increased due to multicollinearity. (Standard errors are increased by a factor of \(\sqrt{VIF}\).)

The formula for the VIF is \(\frac{1}{1-R^2}\), where \(R^2\) is obtained from a model where one predictor is regressed on all of the other predictors. Perfectly uncorrelated predictors have VIFs of 1, and perfectly correlated predictors have VIFs of infinity.

Different cutoffs are used for determining whether a VIF indicates multicollinearity, such as 5 (corresponding to \(R^2=0.8\)) or 10 (\(R^2=0.9\)).
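
As a quick arithmetic check on this formula (not part of the example analysis), the snippet below computes the VIFs implied by a few auxiliary \(R^2\) values, along with the corresponding standard error inflation factors.

r2 <- c(0, 0.5, 0.8, 0.9)   # R-squared from regressing a predictor on the others
1 / (1 - r2)                # implied VIFs: 1, 2, 5, 10
sqrt(1 / (1 - r2))          # factor by which each standard error is inflated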

7.2.1 Understanding the VIF

Load the car package and use its vif() function with the name of a fitted model.

library(car)
vif(mod)
## hours_worked weeks_worked          age 
##     1.202691     1.215237     1.011891

This returns three VIFs, one for each predictor.

To build intuition, we will calculate one of the VIFs by hand, using the VIF for hours_worked (1.203) as an example.

Fit another model with just the predictors from our original model, where the predictor of interest (hours_worked) is used as the outcome. Our original formula was income ~ hours_worked + weeks_worked + age, so our new formula to get the VIF of hours_worked will be hours_worked ~ weeks_worked + age.

Then, find the multiple (unadjusted) \(R^2\) in the model summary output.

mod_hours <- lm(hours_worked ~ weeks_worked + age, acs, na.action = na.exclude)
summary(mod_hours)
## 
## Call:
## lm(formula = hours_worked ~ weeks_worked + age, data = acs, na.action = na.exclude)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.803  -4.026  -0.669   6.025  58.376 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  18.981070   1.031837   18.39   <2e-16 ***
## weeks_worked  0.423573   0.017989   23.55   <2e-16 ***
## age          -0.006375   0.015167   -0.42    0.674    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.69 on 2758 degrees of freedom
##   (2239 observations deleted due to missingness)
## Multiple R-squared:  0.1685, Adjusted R-squared:  0.1679 
## F-statistic: 279.5 on 2 and 2758 DF,  p-value: < 2.2e-16

The multiple \(R^2\) is 0.1685. The VIF for hours_worked is \(\frac{1}{1-R^2} = \frac{1}{1-0.1685}=1.203\), and this matches what we saw earlier.
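
If you would rather not copy the \(R^2\) value by hand, the same calculation can be done programmatically from the auxiliary model:

r2_hours <- summary(mod_hours)$r.squared   # multiple R-squared of the auxiliary model
1 / (1 - r2_hours)                         # about 1.203, matching vif(mod) above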

7.2.2 Polynomials and Interactions

We should not be immediately concerned when we find high VIFs in a model with polynomial or interaction terms. For example, see the model below, which has a polynomial term (hours_worked squared) and an interaction term (weeks_worked by age). These higher-order terms are highly correlated with the simple effects they are built from, so the \(R^2\) values from regressing them on the other predictors, and therefore their VIFs, are necessarily high.

In this case, we should fit another model without the polynomial and interaction terms and check the VIFs again. That reduced model is simply our original model, where the highest VIF was about 1.2, so there is no evidence of multicollinearity here. Note that this will not always be the case.

mod_poly_int <- lm(income ~ hours_worked + I(hours_worked^2) + weeks_worked * age, acs, na.action = na.exclude)
vif(mod_poly_int)
##      hours_worked I(hours_worked^2)      weeks_worked               age  weeks_worked:age 
##         10.899757         10.118456          5.335164          6.859445         12.104029
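
To carry out the check described above, one option (a sketch, assuming the rest of the specification stays the same) is to refit the model without the higher-order terms and call vif() on the reduced model, which here is identical to our original mod:

# Drop the polynomial and interaction terms, keeping the simple effects
mod_reduced <- lm(income ~ hours_worked + weeks_worked + age, acs, na.action = na.exclude)
vif(mod_reduced)   # same as vif(mod): the highest VIF is about 1.2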

7.3 Corrective Actions

If we find evidence of multicollinearity, we have two basic approaches. We also need to check the other regression assumptions, since a violation of one can lead to a violation of another.

  • Combine predictors
    • Create simple composite scores, such as sums or (un)weighted means (see the sketch just after this list).
    • Fit a measurement model (where correlated predictors load on latent variables), extract latent variable estimates (factor scores), and use these in a regular regression model. Note that this two-step approach treats the factor scores as if they contained no measurement error.
    • Fit a structural equation model that includes measurement models.
  • Drop predictors
    • Simply remove predictors that are not essential to the research question until the VIFs of the focal predictors decrease. Be aware that this may bias the remaining coefficient estimates, and note how it may affect our conclusions about the other assumptions, especially linearity, which assumes our model is complete.
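
As a minimal sketch of the composite-score approach, the simulated example below uses made-up variables (x1, x2, z, and y, none of which are ACS variables) in which two nearly redundant predictors are standardized and averaged into a single score:

set.seed(1)

# Simulated data: x1 and x2 are nearly redundant predictors of y
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)
z  <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.5 * x2 + 0.3 * z + rnorm(n)
dat <- data.frame(y, x1, x2, z)

vif(lm(y ~ x1 + x2 + z, dat))   # x1 and x2 have large VIFs

# Average the standardized predictors into one composite score
dat$x_composite <- rowMeans(scale(dat[, c("x1", "x2")]))

vif(lm(y ~ x_composite + z, dat))   # VIFs are now near 1

The composite keeps the information that x1 and x2 share, at the cost of no longer being able to estimate their individual effects.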

After you have applied any corrections or changed your model in any way, you must re-check this assumption and all of the other assumptions.

Centering variables is often proposed as a remedy for multicollinearity, but it only helps in limited circumstances with polynomial or interaction terms.1 2
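
As a rough sketch of the circumstance where centering does help (the collinearity that polynomial and interaction terms induce with their own simple effects), we can mean-center the relevant predictors before forming those terms in the earlier mod_poly_int example. The fitted values are unchanged; only the parameterization, and with it the VIFs of the higher-order terms, changes.

# Mean-center the predictors used in the polynomial and interaction terms
acs$hours_c <- acs$hours_worked - mean(acs$hours_worked, na.rm = TRUE)
acs$weeks_c <- acs$weeks_worked - mean(acs$weeks_worked, na.rm = TRUE)
acs$age_c   <- acs$age          - mean(acs$age,          na.rm = TRUE)

mod_centered <- lm(income ~ hours_c + I(hours_c^2) + weeks_c * age_c,
                   acs, na.action = na.exclude)
vif(mod_centered)   # the VIFs of the higher-order terms should drop sharply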


  1. Iacobucci, D., Schneider, M. J., Popovich, D. L., & Bakamitsos, G. A. (2016). Mean centering helps alleviate “micro” but not “macro” multicollinearity. Behavior Research Methods, 48, 1308–1317. https://doi.org/10.3758/s13428-015-0624-x

  2. Olvera Astivia, O. L., & Kroc, E. (2019). Centering in multiple regression does not always reduce multicollinearity: How to tell when your estimates will not benefit from centering. Educational and Psychological Measurement, 79(5), 813–826. https://doi.org/10.1177/0013164418817801