
The use of path analysis to examine causal structures among continuous variables was pioneered by Sewall Wright and popularized in the social sciences through the work of Peter M. Blau and Otis D. Duncan, among others. Several advantages of path analysis account for its continuing popularity: (a) It provides a graphical representation of a set of algebraic relationships among variables that concisely and visually summarizes those relationships; (b) it allows researchers to examine not only the direct impact of a predictor on a dependent variable, but also other types of relationships, including indirect and spurious relationships; (c) it indicates, at a glance, which predictors appear to have stronger, weaker, or no relationships with the dependent variable; (d) it allows researchers to decompose the variance in a dependent variable into explained and unexplained components, and to decompose the explained variance into portions explained by different variables; and (e) it allows researchers to decompose the correlation between a predictor and a dependent variable into direct, indirect, and spurious effects. Path analysis is used to describe systems of predictive or, more often, causal relationships involving three or more interrelated variables. Because path analysis is most often used in causal rather than purely predictive analysis, the language of causality is adopted here, but it should be borne in mind that establishing causal relationships requires more evidence than path analysis alone provides; in particular, questions not only of association and spuriousness, but also of causal (and thus implicitly temporal) order must be considered.

In path analysis, more than one variable is typically treated as a dependent variable with respect to other variables in the model. Variables that affect other variables in the model, but are not affected by other variables in the model, are called exogenous variables, implying not so much that they are outside the model but that their explanation lies outside the model. A variable that is affected or predicted by at least one of the other variables in the model is considered an endogenous variable. An endogenous variable may be the last variable in the causal chain, or it may be an intervening variable, one that occurs between an exogenous variable and another endogenous variable. In practice, in any path analytical model, there will be at least one exogenous and one endogenous variable.
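The distinction between exogenous, endogenous, and intervening variables can be read directly off the arrows of a path diagram. The following is a minimal sketch, not a standard library routine: the function name `classify_variables` and the edge list are hypothetical, and an intervening variable is approximated here as any endogenous variable that also affects another variable in the model.

```python
def classify_variables(edges):
    """Classify variables given directed (cause, effect) pairs of a path model."""
    causes = {cause for cause, _ in edges}
    effects = {effect for _, effect in edges}
    variables = causes | effects
    return {
        # Never on the receiving end of an arrow: explained outside the model.
        "exogenous": sorted(variables - effects),
        # Affected by at least one other variable in the model.
        "endogenous": sorted(effects),
        # Endogenous variables that in turn affect another variable.
        "intervening": sorted(effects & causes),
    }


# Hypothetical chain in which X affects Y and Y affects Z.
edges = [("X", "Y"), ("Y", "Z")]
roles = classify_variables(edges)
print(roles)
```

In this hypothetical chain, X is the only exogenous variable, Y and Z are both endogenous, and Y is the only intervening variable.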

The simple patterns of direct, indirect, and spurious relationships are diagrammed in Figure 1. Diagram A in Figure 1 shows a simple relationship in which X and Y are both exogenous variables, and each has a direct effect on the endogenous variable Z. Diagram B in Figure 1 shows a spurious relationship, in which X is an exogenous variable, Y and Z are endogenous variables, and X has an effect on both Y and Z. From Diagram B, we would expect that the zero-order correlation between Y and Z would be nonzero, but that the partial correlation between Y and Z, controlling for X, would be zero. Diagram C in Figure 1 shows a causal chain in which X is an exogenous variable and has a direct effect on Y but not on Z, Y and Z are endogenous variables, and Y has a direct effect on Z. In Diagram C, X has an indirect effect on Z through its effect on Y. We would therefore expect the zero-order correlation between X and Z to be nonzero, but the partial correlation between X and Z, controlling for Y, to be zero. Diagram D in Figure 1 shows a mixture of direct and indirect effects, and incorporates all of the effects in the previous three diagrams. X is an exogenous variable that has a direct effect on the endogenous variable Y, a direct effect on the endogenous variable Z, and an indirect effect (via Y) on the endogenous variable Z. Y is an endogenous variable that is related to Z in part through a direct effect, but also in part through a spurious effect because X is a cause of both Y and Z. 
In Diagram D, we would expect the zero-order correlations among all the variables to be nonzero, the partial correlation between X and Z controlling for Y to be nonzero but smaller than the zero-order correlation between X and Z (because part of the zero-order correlation would reflect the indirect effect of X on Z via Y), and the partial correlation between Y and Z controlling for X to be nonzero but smaller than the zero-order correlation between Y and Z (because the zero-order correlation would reflect the spurious relationship between Y and Z resulting from their both being influenced by X).
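The expectations for Diagrams B, C, and D can be checked numerically. The sketch below assumes standardized variables, uncorrelated residuals, and positive path coefficients; the three coefficient values and all variable names are hypothetical choices for illustration, not values taken from Figure 1. The implied zero-order correlations are obtained by summing the products of coefficients along each connecting path, and the partial correlations use the usual first-order formula.

```python
import math


def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation between a and b, controlling for c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


# Hypothetical standardized path coefficients (assumed for illustration only).
p_yx = 0.5  # X -> Y
p_zx = 0.3  # X -> Z
p_zy = 0.4  # Y -> Z

# Diagram B (spurious): X -> Y and X -> Z, no path between Y and Z.
b_xy, b_xz = p_yx, p_zx
b_yz = p_yx * p_zx                      # arises only from the common cause X
b_partial_yz_x = partial_corr(b_yz, b_xy, b_xz)

# Diagram C (causal chain): X -> Y -> Z, no direct path from X to Z.
c_xy, c_yz = p_yx, p_zy
c_xz = p_yx * p_zy                      # arises only from the indirect path
c_partial_xz_y = partial_corr(c_xz, c_xy, c_yz)

# Diagram D (mixed): X -> Y, X -> Z, and Y -> Z.
d_xy = p_yx
d_xz = p_zx + p_zy * p_yx               # direct effect + indirect effect via Y
d_yz = p_zy + p_zx * p_yx               # direct effect + spurious part via X
d_partial_xz_y = partial_corr(d_xz, d_xy, d_yz)
d_partial_yz_x = partial_corr(d_yz, d_xy, d_xz)

print("B: r_YZ =", b_yz, " r_YZ.X =", b_partial_yz_x)
print("C: r_XZ =", c_xz, " r_XZ.Y =", c_partial_xz_y)
print("D: r_XZ =", d_xz, " r_XZ.Y =", d_partial_xz_y)
print("D: r_YZ =", d_yz, " r_YZ.X =", d_partial_yz_x)
```

Under these assumptions, the partial correlations in Diagrams B and C vanish while the corresponding zero-order correlations do not, and in Diagram D both partial correlations remain nonzero but fall below their zero-order counterparts. With coefficients of mixed sign, the ordering in Diagram D need not hold.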

...
