In 1984, Anderson stated, “A major reason for basing statistical analysis on the normal distribution is that this probabilistic model approximates well the distribution of continuous measurements in many sampled populations.” He goes on to mention that normality-based methods have the advantage that their theory is developed in great detail. In univariate analysis, the central limit theorem steers us toward the normal distribution. Similarly, in multivariate analysis, the general, or multivariate, central limit theorem steers us toward the multivariate normal distribution.

As its name would indicate, the multivariate normal distribution is the multidimensional extension of the familiar univariate normal distribution, or “bell curve.” The bivariate normal distribution will arise when a pair of variables X and Y has not just individual (or marginal) distributions that are normal, but also a joint distribution that is normal. The bivariate normal distribution can be visualized as a three-dimensional bell. The distribution can be extended with three or more variables sharing this relationship.

Figure 1 Example of a Multivariate Normal Distribution

Properties

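The probability density function for a univariate normal distribution is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, where $-\infty < x < \infty$; μ represents the mean of the distribution and can take on any real number value; and σ represents the standard deviation of the distribution, which can take on any positive value. This formula is extended to the multivariate distribution as $f(\mathbf{x}) = (2\pi)^{-p/2}\,|\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2}(\mathbf{x}-\mu)^{\top}\Sigma^{-1}(\mathbf{x}-\mu)\right)$, where $\mathbf{x}$ is a p × 1 vector of values and $|\Sigma|$ is the determinant of Σ. Many online sites, including Weisstein's MathWorld, cover many of the mathematical properties of this distribution.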
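As an illustration of how the two density formulas relate, the following is a minimal sketch in Python with NumPy. The function names `normal_pdf` and `mvn_pdf` are hypothetical helpers introduced here, not part of any library, and the sketch assumes that Σ is positive definite so that its inverse and determinant are well behaved.

```python
import numpy as np

def normal_pdf(x, mu, sd):
    """Univariate normal density with mean mu and standard deviation sd."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def mvn_pdf(x, mu, sigma):
    """p-variate normal density at x (assumes sigma is positive definite)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), computed with a linear
    # solve instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm_const = (2.0 * np.pi) ** (-p / 2.0) * np.linalg.det(sigma) ** (-0.5)
    return norm_const * np.exp(-0.5 * quad)

# With p = 1 the multivariate formula reduces to the univariate one
# (variance 4 corresponds to standard deviation 2).
one_dim = mvn_pdf([0.3], [0.0], [[4.0]])
uni = normal_pdf(0.3, 0.0, 2.0)

# With a diagonal Sigma the joint density is the product of the marginals.
joint = mvn_pdf([0.5, -1.0], [0.0, 0.0], np.diag([1.0, 4.0]))
product = normal_pdf(0.5, 0.0, 1.0) * normal_pdf(-1.0, 0.0, 2.0)
```

Setting p = 1 recovers the univariate density, which is a quick way to check that the multivariate expression has been transcribed correctly.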

For the bivariate distribution, p = 2, and in general, p is the number of variables. μ is a vector of size p × 1 that contains the means, and Σ is the variance-covariance matrix of size p × p. In this matrix, the main diagonal elements are the variances of the p component variables, and the off-diagonal elements are the covariances. For example, if we let σij represent the element in the ith row and jth column of this matrix, then σ11 would represent the variance of the first variable, σ12 is the covariance between the first and second variables, and the correlation between these two variables would be $\rho_{12} = \sigma_{12} / \sqrt{\sigma_{11}\sigma_{22}}$, or the covariance divided by the product of the two standard deviations.
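The step from a variance-covariance matrix to correlations can be sketched as follows. The helper name `cov_to_corr` and the numbers in the example matrix are assumptions chosen for illustration, not values from the source.

```python
import numpy as np

def cov_to_corr(sigma):
    """Convert a variance-covariance matrix to a correlation matrix."""
    sigma = np.asarray(sigma, dtype=float)
    sd = np.sqrt(np.diag(sigma))      # standard deviations from the diagonal
    return sigma / np.outer(sd, sd)   # divide sigma_ij by sd_i * sd_j

# Illustrative bivariate case: variances 4 and 9, covariance 3.
sigma = np.array([[4.0, 3.0],
                  [3.0, 9.0]])
corr = cov_to_corr(sigma)
rho_12 = corr[0, 1]  # 3 / (2 * 3) = 0.5
```

Here the standard deviations are 2 and 3, so the covariance of 3 corresponds to a correlation of 0.5.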

Some properties of the multivariate normal distribution are very important to consider when performing multivariate analyses. When considering a random vector X that follows a multivariate normal distribution, the following properties will hold:

  • All linear combinations of the components of X are normally distributed.
  • All possible subsets of the components of X will be normally distributed.
  • Zero covariance between two components implies that those components are independent.
  • The conditional distributions of the components are normal.
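Two of the properties above have simple closed forms that can be checked numerically. If X is multivariate normal with mean μ and variance-covariance matrix Σ, the linear combination a′X is normal with mean a′μ and variance a′Σa; in the bivariate case, the conditional distribution of the first component given that the second equals x2 is normal with mean μ1 + (σ12/σ22)(x2 − μ2) and variance σ11 − σ12²/σ22. The sketch below uses illustrative values for μ, Σ, and a, and the function names are hypothetical.

```python
import numpy as np

# Illustrative parameters (assumed values, not from the source).
mu = np.array([1.0, 2.0])
sigma = np.array([[4.0, 3.0],
                  [3.0, 9.0]])

def linear_combination_params(a, mu, sigma):
    """Mean and variance of a'X when X ~ N(mu, sigma)."""
    a = np.asarray(a, dtype=float)
    return a @ mu, a @ sigma @ a

def conditional_params(x2, mu, sigma):
    """Mean and variance of X1 given X2 = x2 for a bivariate normal."""
    cond_mean = mu[0] + sigma[0, 1] / sigma[1, 1] * (x2 - mu[1])
    cond_var = sigma[0, 0] - sigma[0, 1] ** 2 / sigma[1, 1]
    return cond_mean, cond_var

a = np.array([2.0, -1.0])
theo_mean, theo_var = linear_combination_params(a, mu, sigma)
cond_mean, cond_var = conditional_params(5.0, mu, sigma)

# Simulation check of the linear-combination result (seeded for repeatability).
rng = np.random.default_rng(0)
draws = rng.multivariate_normal(mu, sigma, size=200_000)
combo = draws @ a
sample_mean, sample_var = combo.mean(), combo.var()
```

The simulated mean and variance of the combination should agree with the theoretical values up to sampling error. Note that the conditional variance does not depend on the value of x2.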

It is important to realize that although each component of a multivariate normal random vector X follows a univariate normal distribution, the converse does not hold: the univariate normality of each component of some other random vector Y does not necessarily imply that Y is multivariate normal. In other words, the univariate normality of each variable is necessary but not sufficient to establish multivariate normality. Thus, a strategy of assessing multivariate normality that merely assesses each component variable separately will not be successful in detecting all deviations from multivariate normality. The relationship between the variables, numerically expressed in the variance-covariance matrix, must be considered for the problem of testing multivariate normality.
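A standard textbook counterexample, not taken from the source, makes this point concrete. Let X be standard normal and let Y = SX, where S is an independent random sign (+1 or −1 with equal probability). Both X and Y are then standard normal, and their covariance is zero, yet X + Y equals exactly zero about half the time, so this linear combination is not normal and the pair cannot be bivariate normal. The following sketch simulates the construction; the variable names and sample size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

x = rng.standard_normal(n)               # X ~ N(0, 1)
sign = rng.choice([-1.0, 1.0], size=n)   # independent random sign S
y = sign * x                             # Y = S * X is also N(0, 1) marginally

# The marginal of Y looks standard normal ...
y_mean, y_var = y.mean(), y.var()

# ... and X and Y are (nearly) uncorrelated in the sample ...
sample_corr = np.corrcoef(x, y)[0, 1]

# ... but X + Y is exactly zero whenever S = -1, which no normal
# distribution with positive variance allows.
frac_zero_sum = np.mean(x + y == 0.0)

# X and Y are also clearly dependent: they always share the same magnitude.
same_magnitude = np.allclose(np.abs(x), np.abs(y))
```

Checking X and Y separately would pass both as normal, and only an examination of their joint behavior reveals the departure from multivariate normality. The example also shows that zero covariance implies independence only when the joint distribution is multivariate normal.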
