
In many scientific research fields, statistical models are used to describe a system or a population, to interpret a phenomenon, or to investigate the relationships among various measurements. These statistical models often contain one or more components, called parameters, that are unknown and thus need to be estimated from the data (sometimes also called the sample). An estimator, which is essentially a function of the observable data, is biased if its expectation does not equal the parameter to be estimated.

To formalize this concept, suppose $\theta$ is the parameter of interest in a statistical model. Let $\hat{\theta}$ be its estimator based on an observed sample. Then $\hat{\theta}$ is a biased estimator if $E(\hat{\theta}) \neq \theta$, where $E$ denotes the expectation operator. Similarly, one may say that $\hat{\theta}$ is an unbiased estimator if $E(\hat{\theta}) = \theta$. Some examples follow.

Example 1

Suppose an investigator wants to know the average amount of credit card debt of undergraduate students from a certain university. Then the population would be all undergraduate students currently enrolled in this university, and the population mean of the amount of credit card debt of these undergraduate students, denoted by $\theta$, is the parameter of interest. To estimate $\theta$, a random sample is collected from the university, and the sample mean of the amount of credit card debt is calculated. Denote this sample mean by $\bar{X}$. Then $E(\bar{X}) = \theta$; that is, $\bar{X}$ is an unbiased estimator. If the largest amount of credit card debt from the sample, call it $X_{\max}$, is used to estimate $\theta$, then obviously $X_{\max}$ is biased. In other words, $E(X_{\max}) \neq \theta$.
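The contrast between the sample mean and the sample maximum can be checked by simulation. The following is a minimal sketch, not part of the original entry: the population of debts is made up (drawn once from an exponential distribution with mean 1500), and the function name `simulate_mean_and_max`, the sample size, and the number of replications are all illustrative assumptions.

```python
import random
import statistics


def simulate_mean_and_max(population, sample_size, n_reps, seed=0):
    """Average the sample mean and the sample maximum over repeated random samples."""
    rng = random.Random(seed)
    means, maxima = [], []
    for _ in range(n_reps):
        # Simple random sample without replacement from the finite population
        sample = rng.sample(population, sample_size)
        means.append(statistics.fmean(sample))
        maxima.append(max(sample))
    return statistics.fmean(means), statistics.fmean(maxima)


# Hypothetical population of credit card debts (made-up data, for illustration only)
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 1500) for _ in range(5000)]
theta = statistics.fmean(population)  # the parameter of interest: the population mean

avg_mean, avg_max = simulate_mean_and_max(population, sample_size=25, n_reps=4000)
print(f"population mean (theta):    {theta:.2f}")
print(f"average of sample means:    {avg_mean:.2f}")
print(f"average of sample maxima:   {avg_max:.2f}")
```

Under these assumptions, the average of the sample means should land close to the population mean, while the average of the sample maxima should sit well above it.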

Example 2

In this example a more abstract scenario is examined. Consider a statistical model in which a random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$, and suppose a random sample $X_1, \ldots, X_n$ is observed. Let the parameter $\theta$ be $\mu$. It is seen in Example 1 that $\bar{X}$, the sample mean of $X_1, \ldots, X_n$, is an unbiased estimator for $\theta$. But $\bar{X}^2$ is a biased estimator for $\mu^2$ (or $\theta^2$). This is because $\bar{X}$ follows a normal distribution with mean $\mu$ and variance $\sigma^2/n$. Therefore, $E(\bar{X}^2) = \mu^2 + \sigma^2/n \neq \mu^2$.
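Because the second moment of any random variable equals its variance plus the square of its mean, the squared sample mean has expectation $\mu^2 + \sigma^2/n$ in this setting. The sketch below checks that value by Monte Carlo simulation; the parameter values, the number of replications, and the function name `mean_of_squared_sample_means` are illustrative assumptions rather than anything specified in the entry.

```python
import random
import statistics


def mean_of_squared_sample_means(mu, sigma, n, n_reps, seed=0):
    """Monte Carlo estimate of E(Xbar^2) for normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        total += xbar ** 2
    return total / n_reps


# Illustrative values (assumptions, not taken from the entry)
mu, sigma, n = 2.0, 3.0, 10
estimate = mean_of_squared_sample_means(mu, sigma, n, n_reps=20000)
theory = mu ** 2 + sigma ** 2 / n  # variance of Xbar plus squared mean of Xbar

print(f"simulated E(Xbar^2):   {estimate:.3f}")
print(f"mu^2 + sigma^2 / n:    {theory:.3f}")
print(f"target mu^2:           {mu ** 2:.3f}")
```

The gap between the simulated value and $\mu^2$ is roughly $\sigma^2/n$, so the bias shrinks as the sample size grows.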

Example 2 indicates that one should be careful about determining whether an estimator is biased. Specifically, although $\hat{\theta}$ is an unbiased estimator for $\theta$, $g(\hat{\theta})$ may be a biased estimator for $g(\theta)$ if $g$ is a nonlinear function. In Example 2, $g(\theta) = \theta^2$ is such a function. However, when $g$ is a linear function, that is, $g(\theta) = a\theta + b$ where $a$ and $b$ are two constants, then $g(\hat{\theta})$ is always an unbiased estimator for $g(\theta)$.
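The linear case follows directly from the linearity of expectation. Writing $\hat{\theta}$ for an unbiased estimator of $\theta$, the one-line argument can be sketched as:

```latex
\begin{aligned}
E\bigl[g(\hat{\theta})\bigr]
  &= E\bigl(a\hat{\theta} + b\bigr) \\
  &= a\,E(\hat{\theta}) + b \\
  &= a\theta + b \\
  &= g(\theta).
\end{aligned}
```

The second step is exactly where a nonlinear $g$ fails: in general the expectation cannot be moved inside $g$.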

Example 3

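Let $X_1, \ldots, X_n$ be an observed sample from some distribution (not necessarily normal) with mean $\mu$ and variance $\sigma^2$. The sample variance $S^2$, which is defined as $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, is an unbiased estimator for $\sigma^2$, while the intuitive guess $\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ would yield a biased estimator. A heuristic argument is given here. If $\mu$ were known, $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$ could be calculated, which would be an unbiased estimator for $\sigma^2$. But since $\mu$ is not known, it has to be replaced by $\bar{X}$. This replacement actually makes the numerator smaller. That is, $\sum_{i=1}^{n}(X_i - \bar{X})^2 \le \sum_{i=1}^{n}(X_i - \mu)^2$ regardless of the value of $\mu$, because $\sum_{i=1}^{n}(X_i - \mu)^2 = \sum_{i=1}^{n}(X_i - \bar{X})^2 + n(\bar{X} - \mu)^2$. Therefore, the denominator has to be reduced a little bit (from $n$ to $n-1$) accordingly.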
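The effect of the denominator can be seen in a small simulation. The sketch below is an illustration under assumed settings: samples of size 5 are drawn from an exponential distribution with rate 1 (so the true variance is 1, and the distribution is deliberately not normal), and the function name `average_variance_estimates` is hypothetical. It uses `statistics.variance`, which divides by $n-1$, and `statistics.pvariance`, which divides by $n$.

```python
import random
import statistics


def average_variance_estimates(n, n_reps, seed=0):
    """Average the n-1 and n denominators' variance estimates over many samples."""
    rng = random.Random(seed)
    sum_unbiased = 0.0
    sum_biased = 0.0
    for _ in range(n_reps):
        # Exponential with rate 1: mean 1, variance 1 (a non-normal example)
        sample = [rng.expovariate(1.0) for _ in range(n)]
        sum_unbiased += statistics.variance(sample)   # divides by n - 1
        sum_biased += statistics.pvariance(sample)    # divides by n
    return sum_unbiased / n_reps, sum_biased / n_reps


n = 5
avg_unbiased, avg_biased = average_variance_estimates(n, n_reps=40000)
print(f"true variance:                 1.000")
print(f"average with denominator n-1:  {avg_unbiased:.3f}")
print(f"average with denominator n:    {avg_biased:.3f}")
print(f"(n - 1) / n:                   {(n - 1) / n:.3f}")
```

With the denominator $n$, the long-run average should be close to $(n-1)/n$ times the true variance, which is the shortfall the $n-1$ denominator corrects.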

...
