
Deviation is a very general term referring to some type of discrepancy. A simple example is the deviation, or difference, between an observation and the population mean, μ, which is the average over all individuals in the population, were it possible to measure them. The average (expected) squared deviation between an observation and μ is the population variance. A related measure, the sample variance, is based on the squared deviations between the observations and the sample mean. That is, deviations are defined in terms of the sample mean rather than the population mean.
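The distinction between the two kinds of deviation can be made concrete with a minimal sketch. The data and the value of μ below are made up for illustration, and the sample variance is computed with the usual n − 1 divisor, which is an assumption about the convention intended here.

```python
from statistics import mean

def squared_deviations(values, center):
    """Squared deviation of each observation from a chosen center."""
    return [(x - center) ** 2 for x in values]

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    m = mean(values)
    return sum(squared_deviations(values, m)) / (len(values) - 1)

# Hypothetical observations and a hypothetical, supposedly known, population mean.
observations = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mu = 5.5

# Average squared deviation from mu (deviations defined by the population mean).
mean_sq_dev_from_mu = mean(squared_deviations(observations, mu))

# Sample variance (deviations defined by the sample mean).
s2 = sample_variance(observations)
```

For these illustrative numbers the sample mean is 5, the sample variance is 4.8, and the average squared deviation from μ = 5.5 is 4.25.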

In a more general context, deviation refers to the discrepancy between a sample of observations and some model that supposedly represents the process by which all data are generated. Perhaps the simplest and best-known example is determining whether observations were sampled from some specified distribution, with a normal distribution typically being the distribution of interest. The most commonly used measure of deviation between the distribution of the observed values and the hypothesized distribution is the Kolmogorov distance (e.g., see Conover, 1980). The method has been extended to measuring the overall deviation between two distributions (Doksum & Sievers, 1976), which can result in an interesting perspective on how two independent groups compare. (For details and appropriate software, see Wilcox, 2003.)
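As a rough illustration of the one-sample case, the Kolmogorov distance can be computed as the largest vertical gap between the empirical distribution function of the data and the hypothesized distribution function. The sketch below assumes a fully specified normal distribution; the function names and the data are hypothetical, and this is not the software referred to above.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Distribution function of a normal distribution with mean mu and sd sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def kolmogorov_distance(values, cdf):
    """Largest absolute gap between the empirical CDF of values and cdf."""
    xs = sorted(values)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps up from i/n to (i + 1)/n at x,
        # so the gap is checked on both sides of the step.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance

# Made-up observations compared with a standard normal distribution.
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
d = kolmogorov_distance(sample, normal_cdf)
```

The distance always lies between 0 and 1, with values near 0 indicating that the observed values track the hypothesized distribution closely.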

The notion of deviation arises in several other situations. In regression, for example, a common model is Y = β0 + β1X + ε, where ε is usually taken to be a variable having a mean of zero. So, the model is that if we are told that X = 10, for example, the mean of Y is equal to β0 + β1X = β0 + β1(10), where, typically, β0 and β1 are unknown parameters that are estimated based on the sample of observations available to us. In essence, the model assumes a linear association between the mean of Y and X. Here, deviation might refer to any discrepancy between the assumed model and the data collected in a particular study. So, for example, if Y represents cognitive functioning of children and X represents level of aggression in the home, a common strategy is to assume that there is a linear association between these two measures. A fundamental issue is measuring the deviation between this assumed linear association and the true association in an attempt to assess the adequacy of the model used to characterize the data. That is, is it reasonable to assume that for some choice of β0 and β1, the mean of Y is equal to β0 + β1X? Measures of how much the data deviate from this model have been devised, one of which is essentially a Kolmogorov distance, and they can be used to provide empirical checks on the assumption of a linear model (e.g., Wilcox, 2003).
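A minimal sketch of the raw ingredients follows: β0 and β1 are estimated by ordinary least squares (an assumption, since no estimator is named here), and the residuals record how far each observation falls from the fitted line. This is not the Kolmogorov-type measure mentioned above, only an informal look at deviations from a straight line. The data are made up and deliberately curved.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for the model Y = b0 + b1*X."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data in which the true association is Y = X squared, not a line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0]

b0, b1 = fit_line(xs, ys)

# Deviation of each observation from the fitted straight line.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Estimated mean of Y when X = 10 under the (here inadequate) linear model.
predicted_at_10 = b0 + b1 * 10
```

With these numbers the estimates are b0 = −7 and b1 = 6, and the residuals are 2, −1, −2, −1, 2. They sum to zero, as least squares residuals must, but their positive, negative, positive pattern is the kind of systematic deviation that signals an inadequate linear model.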

Yet another context where deviation arises is categorical data. As a simple example, consider a study where adult couples are asked to rate the effectiveness of some political leader on a 5-point scale. A possible issue is whether a woman's response is independent of her partner's response. Here, the strategy is to measure how far the observed pattern of responses deviates from the pattern that is expected under independence. More generally, deviation plays a role in assessing goodness of fit when examining the plausibility of some proposed model (e.g., see Goodman, 1978).
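One familiar way to quantify deviation from independence is Pearson's chi-square statistic, which compares each observed cell count with the count expected if row and column responses were independent. The choice of this statistic is an assumption made for illustration, since no particular measure is named here. For brevity the sketch uses a hypothetical 2-by-2 table (ratings collapsed to favorable versus unfavorable) rather than the full 5-by-5 table.

```python
def expected_under_independence(table):
    """Expected cell counts when rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    expected = expected_under_independence(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Made-up counts: rows are the woman's rating, columns her partner's rating.
couples = [[30, 10],
           [10, 30]]
chi_square = pearson_chi_square(couples)

# A table whose rows are exactly proportional shows no deviation at all.
no_association = pearson_chi_square([[10, 20], [20, 40]])
```

For the made-up couples table every expected count is 20, so the statistic equals 20, while the proportional table gives 0, the value obtained when the observed pattern matches independence exactly.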

...
