
Conditional Independence

Statistical independence and conditional independence (CI) are important concepts in statistics, artificial intelligence, and related fields. Let X, Y, and Z denote three sets of random variables, and let P denote their probability distribution or density functions. X and Y are conditionally independent given Z, denoted by X ⊥⊥ Y | Z, if and only if P(X, Y | Z) = P(X | Z) P(Y | Z). This means that, once the values of Z are given, knowing the values of X provides no additional information about Y. Generally speaking, such a CI relationship allows us to drop X when constructing a probabilistic model for Y from (X, Z), resulting in a more parsimonious representation. Moreover, independence and CI play a central role in Bayesian network learning and in causal discovery, which aims to recover the underlying causal model from purely observational data.
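For discrete variables the definition can be checked directly from a joint probability table. The following is a minimal sketch, assuming NumPy and a table indexed as joint[x, y, z]; the function name and the toy probabilities are illustrative choices, not part of the original entry.

```python
import numpy as np

def is_conditionally_independent(joint, tol=1e-12):
    """Check P(X, Y | Z) = P(X | Z) P(Y | Z) for a table joint[x, y, z]."""
    joint = np.asarray(joint, dtype=float)
    p_z = joint.sum(axis=(0, 1))                  # marginal P(Z)
    for z in np.flatnonzero(p_z > 0):             # skip values of Z with zero mass
        p_xy_given_z = joint[:, :, z] / p_z[z]    # P(X, Y | Z = z)
        p_x_given_z = p_xy_given_z.sum(axis=1)    # P(X | Z = z)
        p_y_given_z = p_xy_given_z.sum(axis=0)    # P(Y | Z = z)
        product = np.outer(p_x_given_z, p_y_given_z)
        if not np.allclose(p_xy_given_z, product, atol=tol):
            return False
    return True

# Toy (hypothetical) distributions: rows index x or y, columns index z.
p_z = np.array([0.4, 0.6])
p_x_given_z = np.array([[0.9, 0.2], [0.1, 0.8]])
p_y_given_z = np.array([[0.7, 0.3], [0.3, 0.7]])

# Joint built so that X and Y are independent given Z by construction.
ci_joint = np.einsum("xz,yz,z->xyz", p_x_given_z, p_y_given_z, p_z)

# Joint in which Y is a copy of X, so the factorization fails.
dep_joint = np.einsum("xz,z,xy->xyz", p_x_given_z, p_z, np.eye(2))

# X and Y in ci_joint are still marginally dependent, because both depend on Z.
p_xy = ci_joint.sum(axis=2)
marginally_independent = np.allclose(
    p_xy, np.outer(p_xy.sum(axis=1), p_xy.sum(axis=0))
)

print(is_conditionally_independent(ci_joint))
print(is_conditionally_independent(dep_joint))
print(marginally_independent)
```

The first table satisfies the definition even though X and Y are dependent once Z is marginalized out, which illustrates that conditional and unconditional independence are distinct properties.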

A direct way to assess whether X ⊥⊥ Y | Z holds is to estimate the probability density or distribution functions involved and then check whether the definition is satisfied. However, density estimation in high dimensions is known to be difficult: in nonparametric joint or conditional density estimation, the curse of dimensionality means that the number of data points required to achieve a given accuracy grows exponentially with the data dimension.
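The exponential growth can be made concrete with a rough calculation. The sketch below assumes the classical mean-squared-error rate n^(-4/(4+d)) for kernel density estimation with a second-order kernel and a twice-differentiable density, and it ignores all constants; the function name is hypothetical and the numbers are only indicative.

```python
def relative_sample_size(dim, target_error, base_dim=1):
    """Ratio n(dim) / n(base_dim) of sample sizes needed for the same error.

    Assumes error ~ n ** (-4 / (4 + d)) with constants ignored, so that
    n ~ target_error ** (-(d + 4) / 4).
    """
    n_dim = target_error ** (-(dim + 4) / 4)
    n_base = target_error ** (-(base_dim + 4) / 4)
    return n_dim / n_base

# How many times more data than in one dimension, for a target error of 0.01.
for d in (1, 5, 9, 13):
    print(d, relative_sample_size(d, 0.01))
```

Under these assumptions, every four additional dimensions multiply the required sample size by a factor of 1/target_error.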

Testing for CI is much more difficult than testing for unconditional independence. Traditional CI tests either focus on the discrete case, in which the chi-square test can be used, or impose simplifying assumptions to deal with the continuous case. In particular, the variables are often assumed to have linear relations with additive Gaussian errors. In that case, X ⊥⊥ Y | Z reduces to zero partial correlation or zero conditional correlation between X and Y given Z, which can be easily tested. However, nonlinearity and non-Gaussian noise are frequently encountered in practice, and under such conditions the partial correlation test may lead to incorrect conclusions.
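The partial correlation test and its failure mode can be illustrated on simulated data. This is a minimal sketch, assuming NumPy, a residual-based estimate of partial correlation, and the usual Fisher z approximation for the p-value; the data-generating models are made up for illustration.

```python
import math
import numpy as np

def partial_correlation(x, y, z):
    """Correlation of the residuals of x and y after linear regression on z."""
    design = np.column_stack([np.ones(len(x)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

def fisher_z_pvalue(rho, n, n_cond):
    """Two-sided p-value for zero partial correlation (Fisher z approximation)."""
    stat = math.sqrt(n - n_cond - 3) * abs(math.atanh(rho))
    return math.erfc(stat / math.sqrt(2))

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)

# Linear-Gaussian model: X and Y are conditionally independent given Z.
x_lin = 2.0 * z + rng.normal(size=n)
y_lin = -1.0 * z + rng.normal(size=n)
rho_marginal = np.corrcoef(x_lin, y_lin)[0, 1]   # strongly negative
rho_lin = partial_correlation(x_lin, y_lin, z)   # close to zero

# Nonlinear model: X and Y are still conditionally independent given Z,
# but linear regression on Z cannot remove the shared z**2 component.
x_nl = z**2 + 0.5 * rng.normal(size=n)
y_nl = z**2 + 0.5 * rng.normal(size=n)
rho_nl = partial_correlation(x_nl, y_nl, z)      # far from zero

print(rho_marginal, rho_lin, fisher_z_pvalue(rho_lin, n, 1))
print(rho_nl, fisher_z_pvalue(rho_nl, n, 1))
```

In the nonlinear case the test rejects CI even though it holds by construction, which is one way the linear-Gaussian assumption can produce an incorrect conclusion.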

CI is only one particular property of the distributions, so testing for it does not necessarily require estimating the densities explicitly. Several characterizations of the CI relation do not involve the densities directly, and they have inspired more efficient methods for CI testing. Note that when (X, Y, Z) is jointly Gaussian, X ⊥⊥ Y | Z is equivalent to the vanishing of the partial correlation coefficient between X and Y given Z. As a generalization, J. J. Daudin showed that in the general case, X ⊥⊥ Y | Z if and only if f(X, Z) − E[f | Z] is uncorrelated with g(Y) − E[g | Z] for all square-integrable functions f and g. Here, E[f | Z] denotes the conditional mean of f(X, Z) given Z, and E[g | Z] the conditional mean of g(Y) given Z. In this way, CI is characterized by the uncorrelatedness of functions in suitable spaces. Kenji Fukumizu and others showed that one can use the reproducing kernel Hilbert spaces corresponding to so-called characteristic kernels (e.g., the Gaussian kernel) instead of the spaces of square-integrable functions, and proposed a measure of conditional dependence. Kun Zhang and others further developed a kernel-based CI test. Such nonparametric conditional dependence measures and CI tests have found many applications in machine learning, statistics, and artificial intelligence.
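The characterization in terms of uncorrelated residuals can be probed numerically. The sketch below is not Daudin's procedure or the kernel-based test; it is a simplified illustration that assumes a discrete Z (so conditional means are group means) and checks only a small, arbitrarily chosen dictionary of functions f and g. Small residual correlations over such a dictionary are therefore necessary for CI but not sufficient.

```python
import numpy as np

def center_given_z(values, z):
    """Subtract the empirical conditional mean E[values | Z] for discrete z."""
    centered = np.asarray(values, dtype=float).copy()
    for level in np.unique(z):
        mask = z == level
        centered[mask] -= centered[mask].mean()
    return centered

def max_residual_correlation(x, y, z, fs, gs):
    """Largest |corr(f(X,Z) - E[f|Z], g(Y) - E[g|Z])| over the given functions."""
    worst = 0.0
    for f in fs:
        f_res = center_given_z(f(x, z), z)
        for g in gs:
            g_res = center_given_z(g(y), z)
            worst = max(worst, abs(np.corrcoef(f_res, g_res)[0, 1]))
    return worst

# Hypothetical test functions; any finite dictionary is only a partial check.
fs = [lambda x, z: x, lambda x, z: x**2, lambda x, z: np.sin(x + z)]
gs = [lambda y: y, lambda y: y**2, lambda y: np.tanh(y)]

rng = np.random.default_rng(1)
n = 20000
z = rng.integers(0, 3, size=n)
x = z + rng.normal(size=n)

# Y depends on Z only: CI holds by construction.
y_ci = np.cos(z) + rng.normal(size=n)
# Y depends on X through a nonlinear term: CI is violated.
y_dep = (x - z) ** 2 + 0.5 * rng.normal(size=n)

ci_score = max_residual_correlation(x, y_ci, z, fs, gs)
dep_score = max_residual_correlation(x, y_dep, z, fs, gs)
# Using only the identity functions recovers an ordinary partial correlation.
linear_only = max_residual_correlation(x, y_dep, z, fs[:1], gs[:1])

print(ci_score, dep_score, linear_only)
```

In the dependent case the identity functions alone give a correlation near zero, while f(x, z) = x**2 exposes the dependence, which is why the characterization quantifies over a rich class of functions rather than linear ones only.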

...
