Curse of Dimensionality

Neil J. Salkind

doi:10.4135/9781412952644


Edited by:
Neil J. Salkind
In: Encyclopedia of Measurement and Statistics
Chapter DOI: https://doi.org/10.4135/9781412952644.n119
Subject: Anthropology, Business and Management, Criminology and Criminal Justice, Communication and Media Studies, Counseling and Psychotherapy, Economics, Education, Geography, Health, History, Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Social Work, Sociology, Science, Technology, Computer Science, Engineering, Mathematics, Medicine


Curse of dimensionality refers to the rapid increase in volume associated with adding extra dimensions to a mathematical space. In the behavioral and social sciences, the mathematical space in question is the multidimensional space spanned by the set of V variables collected by the researcher. Simply put, analyzing large sets of variables simultaneously requires large numbers of observations because, as the number of variables increases, the multidimensional space becomes more and more sparse. This problem manifests itself in several analytical techniques (such as multiple regression and finite mixture modeling) in which difficulties arise because the variance-covariance matrix becomes singular (i.e., noninvertible) when the number of variables, V, exceeds the number of observations, N. Additionally, as N approaches V, the parameter estimates of these models become increasingly unstable, making statistical inference less precise.

For a mathematical example, consider multiple regression in which we are predicting y from a matrix of explanatory variables, X. For ease of presentation, assume that the data are mean centered; then the unbiased estimate of the covariance matrix of X is given by

$$\Sigma = \frac{1}{N - 1} X^{\prime} X.$$

Furthermore, the general equation for multiple regression is

$$y = X\beta + \epsilon,$$

where

y is the N × 1 vector of responses,

X is the N × V matrix of predictor variables,

β is the V × 1 vector of parameter estimates corresponding to the predictor variables, and

ε is the N × 1 vector of residuals.

It is well known that the estimate of β is given by

$$\hat{\beta} = (X^{\prime} X)^{-1} X^{\prime} y.$$

It is easily seen that $(X^{\prime} X)^{-1}$ is proportional to the inverse of Σ. Thus, if there are any redundancies in Σ (i.e., Σ is not of full rank or, in regression terms, multicollinearity exists), it will not be possible to take the inverse of Σ and, consequently, it will not be possible to estimate β. One way such redundancy enters Σ is when V exceeds N.
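The rank argument can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`gram_matrix`, `matrix_rank`, `simulate_rank`) and the use of small random integers as data are illustrative assumptions, not part of the original entry. It relies on the fact that the rank of X′X can never exceed the smaller of N and V, so with V > N the V × V matrix X′X cannot be of full rank. For simplicity the data are not mean centered here; centering can only lower the rank further.

```python
from fractions import Fraction
import random


def gram_matrix(X):
    """Return X'X for X given as a list of N rows, each of length V."""
    n_vars = len(X[0])
    return [
        [sum(row[i] * row[j] for row in X) for j in range(n_vars)]
        for i in range(n_vars)
    ]


def matrix_rank(M):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    A = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(A), len(A[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = A[r][col] / A[rank][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


def simulate_rank(n_obs, n_vars, seed=0):
    """Rank of X'X for an n_obs x n_vars matrix of random integers (hypothetical data)."""
    rng = random.Random(seed)
    X = [[rng.randint(-9, 9) for _ in range(n_vars)] for _ in range(n_obs)]
    return matrix_rank(gram_matrix(X))


if __name__ == "__main__":
    # With N = 3 observations and V = 5 variables, X'X is 5 x 5 but has rank
    # at most 3, so it is singular and the regression estimate is not unique.
    print("N=3,  V=5: rank of X'X =", simulate_rank(3, 5))
    print("N=20, V=5: rank of X'X =", simulate_rank(20, 5))
```

Exact fractions are used instead of floating point so that the rank is not affected by rounding error; a numerical library would normally be used for matrices of realistic size.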

Related to this general problem is the fact that, as V increases, the multidimensional space becomes more and more sparse. To illustrate, consider the Euclidean distance between any two points x and y,

$$d(x, y) = \left[ \sum_{v=1}^{V} (x_v - y_v)^2 \right]^{1/2},$$

the square root of the sum of squared differences across all V dimensions. To begin, consider the two points x = (1, 3) and y = (4, 7), which results in the Euclidean distance of $d(x, y) = [(1 - 4)^2 + (3 - 7)^2]^{1/2} = [9 + 16]^{1/2} = 5$. Now assume that K additional, albeit meaningless, dimensions are added to each observation by sampling from a uniform distribution with lower bound of 0 and upper bound of 1 (denoted by U(0,1)). The new Euclidean distance, d(x,y)∗, is given by

$$d(x, y)^{*} = \left[ 5^2 + \sum_{k=1}^{K} (u_k - w_k)^2 \right]^{1/2}, \qquad u_k, w_k \sim U(0, 1),$$

where the 5 represents the original Euclidean distance and the remainder of d(x,y)∗ represents the additional distance that is due to random noise alone. Clearly, as K → ∞, then d(x,y)∗ → ∞, indicating that as more dimensions are added, the two points become farther and farther apart. In the extreme, an infinite amount of random noise results in the two points being infinitely far apart.
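The growth of the distance with K can be illustrated with a short simulation. This is a minimal sketch using only the Python standard library; the function names and the fixed seed are illustrative assumptions. Because the expected squared difference between two independent U(0,1) draws is 1/6, the simulated distance should be roughly the square root of 25 + K/6 for large K.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_with_noise(x, y, k, seed=0):
    """Distance after appending k independent U(0,1) noise coordinates to each point."""
    rng = random.Random(seed)
    x_aug = list(x) + [rng.random() for _ in range(k)]
    y_aug = list(y) + [rng.random() for _ in range(k)]
    return euclidean(x_aug, y_aug)


if __name__ == "__main__":
    x, y = (1, 3), (4, 7)
    for k in (0, 10, 100, 1000, 10000):
        simulated = distance_with_noise(x, y, k)
        approx = math.sqrt(25 + k / 6)  # large-K approximation
        print(f"K={k:>5}  simulated={simulated:.3f}  sqrt(25 + K/6)={approx:.3f}")
```

With K = 0 the function returns the original distance of 5; as K grows, the noise term dominates and the original separation between the two points contributes less and less to the total.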

...
