
Suppose we have a single dependent variable y and k independent variables, and let y = f(x1, x2, …, xk) denote the functional relationship between (x1, x2, …, xk) and y. Note that this is a different situation from the case of one dependent variable and one independent, or predictor, variable. When the predictor variables, or x's, (x1, x2, …, xk) are highly correlated, we say that multicollinearity exists. When the correlation among the variables is very high (say .9 or more), problems may arise with the model estimates, their interpretation, and the significance levels of tests. Other consequences of multicollinearity are large standard errors, which give rise to wide confidence intervals and nonsignificant or incorrect t statistics.

Application

We assume hereafter that the relationship between the dependent variable y and the independent variables x1, …, xk (perhaps re-expressions of the original independent variables) is of the form, ignoring for the moment the possibility of variation,

y = β1x1 + β2x2 + … + βkxk

In the statistical context, that is, when a particular value for (x1, …, xk) specifies a frequency distribution for y, we assume that the average value of y is given by

E[y|x1, …, xk] = β1x1 + β2x2 + … + βkxk

and that changes in (x1, …, xk) affect at most the means of the frequency distributions. Read E[y|x1, …, xk] as the average value of y given x1, …, xk. If we put e = y – β1x1 – … – βkxk, then the frequency distribution of e is constant as (x1, …, xk) changes.

Thus we can write our model as

y = β1x1 + β2x2 + … + βkxk + e

and e is referred to as the error term.

If f is the frequency function of e, then for a particular value of (x1, …, xk) the frequency function of y is given by f(y – β1x1 – … – βkxk). We will assume hereafter that f can be taken to be a density function and that the variance of the frequency distribution for e exists and is equal to σ².

In a psychological investigation, our primary purpose will be to make inferences about the true value of the coefficients β1, β2, …, βk.

To do this we will be required to make a number of observations at different values of (x1, …, xk).

Let yi denote the observation taken at

(xi1, xi2, …, xik)

and let ei denote the error. Then for n observations we have in matrix notation

y = Xβ + e

where X is called the design matrix.

We assume that the form of the frequency distribution for e is normal; that is,

e ~ N(0, σ²I)

The statistical model we have constructed here is

y = Xβ + e, e ~ N(0, σ²I),

called the linear model with normal error.
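As a concrete illustration, the linear model with normal error can be simulated numerically. The sample size, coefficients, and σ below are arbitrary choices for the sketch, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative choices (not from the text): n observations, k predictors.
n, k = 100, 3
beta = np.array([1.5, -2.0, 0.5])    # true coefficients beta_1..beta_k
sigma = 1.0                          # standard deviation of the error

X = rng.normal(size=(n, k))          # design matrix of n observations
e = rng.normal(scale=sigma, size=n)  # normal error, e ~ N(0, sigma^2 I)
y = X @ beta + e                     # the linear model with normal error
```

Each row of X holds one observation's values of (x1, …, xk), matching the matrix notation above.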

For the normal linear model the least squares estimator of β is given by

b = (X'X)-1X'y

The vector of residuals is

ê = y – Xb
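A minimal numerical sketch of these two formulas, using simulated data (the sample size and coefficients below are illustrative assumptions, not values from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data; any (X, y) of compatible shape would serve.
n, k = 50, 2
X = rng.normal(size=(n, k))
y = X @ np.array([2.0, -1.0]) + rng.normal(size=n)

# Least squares estimator: b = (X'X)^(-1) X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Vector of residuals: e_hat = y - Xb
e_hat = y - X @ b

# By construction the residuals are orthogonal to the columns of X,
# so X'e_hat is zero up to floating-point error.
```

Using `np.linalg.solve` rather than explicitly inverting X'X is the numerically preferred way to evaluate the same formula.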

When the predictor variables, or x's, (x1, x2, …, xk) are correlated, we say that multicollinearity exists. When correlations among the x variables are high, the computer has difficulty calculating (because of rounding error, etc.) the matrix (X'X)-1, which is needed for many estimates (the b's, their standard errors, etc.). The psychologist might find, for example, that the F test of H0: β2 = β3 = … = βk = 0 in the overall ANOVA table is significant while the individual t tests are not. The problem here is that the variables share information concerning the dependent variable y.
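The effect just described can be demonstrated by making two predictors nearly collinear. The construction below (x2 equal to x1 plus a little noise) is an illustrative assumption, not an example from the text:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # correlation with x1 near .99
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

# (X'X) is nearly singular, so computing (X'X)^(-1) is numerically
# delicate; the condition number quantifies this.
XtX = X.T @ X
print(np.linalg.cond(XtX))           # very large for collinear columns

# Standard errors of the b's, from s^2 (X'X)^(-1), are inflated,
# which is why individual t tests can be nonsignificant even when
# the overall F test is significant.
b = np.linalg.solve(XtX, X.T @ y)
e_hat = y - X @ b
s2 = e_hat @ e_hat / (n - 2)
se = np.sqrt(s2 * np.diag(np.linalg.inv(XtX)))
```

Shrinking the noise factor 0.05 further drives the condition number, and the standard errors, still higher.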
