
Estimation is the process of providing a numerical value for an unknown quantity based on information collected from a sample. If a single value is calculated for the unknown quantity, the process is called point estimation. If an interval is calculated that is likely, in some sense, to contain the quantity, then the procedure is called interval estimation, and the interval is referred to as a confidence interval. Estimation is thus the statistical term for an everyday activity: making an educated guess about a quantity that is unknown based on known information. The unknown quantities, which are called parameters, may be familiar population quantities such as the population mean μ, population variance σ², and population proportion π. For instance, a researcher may be interested in the proportion of voters favoring a political party. That proportion is the unknown parameter, and its estimation may be based on a small random sample of individuals. In other situations, the parameters are part of more elaborate statistical models, such as the regression coefficients β₀, β₁, …, βₚ in a linear regression model

Y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₚxₚ + ε,

which relates a response variable Y to explanatory variables x₁, x₂, …, xₚ.
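Both kinds of parameters mentioned above can be estimated directly from data. The following minimal sketch (using NumPy on simulated data; the sample sizes, true parameter values, and variable names are illustrative assumptions, not from the text) computes a sample proportion as a point estimate of π and ordinary least-squares values as point estimates of the regression coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Point estimate of a population proportion pi:
# the sample proportion of favorable responses in a random sample.
votes = rng.random(200) < 0.45          # simulated yes/no answers; true pi = 0.45
pi_hat = votes.mean()

# Point estimates of regression coefficients beta_0, ..., beta_p:
# ordinary least squares on simulated (x, Y) data with p = 2.
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # intercept column
beta_true = np.array([1.0, 2.0, -0.5])                      # assumed true values
Y = X @ beta_true + rng.normal(scale=0.1, size=n)           # model with noise eps
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(pi_hat)     # close to the true proportion 0.45
print(beta_hat)   # close to [1.0, 2.0, -0.5]
```

In both cases the estimate is a single number (or vector of numbers), which is what distinguishes point estimation from interval estimation.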

Point estimation is one of the most common forms of statistical inference. A physical quantity is measured in order to estimate its value, surveys are conducted to estimate unemployment rates, and clinical trials are carried out to estimate the cure rate (risk) of a new treatment. The unknown parameter in an investigation is denoted by θ, assumed for simplicity to be a scalar, but the results below extend to the case that θ = (θ₁, θ₂, …, θₖ) with k > 1.

To estimate θ, or, more generally, a real-valued function of θ, τ(θ), one calculates a corresponding function of the observations, a statistic, δ = δ(X₁, X₂, …, Xₙ). An estimator is any statistic δ defined over the sample space. Of course, it is hoped that δ will tend to be close, in some sense, to the unknown τ(θ), but such a requirement is not part of the formal definition of an estimator. The value δ(x₁, x₂, …, xₙ) taken on by δ in a particular case is the estimate of τ(θ), which will be our educated guess for the unknown value. In practice, the same compact notation is often used for both the estimator and the estimate.
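The distinction between estimator and estimate can be made concrete in a few lines of code (a minimal sketch; the data values here are hypothetical):

```python
# The estimator delta is a rule: a function defined on samples.
def delta(sample):
    """Sample mean, an estimator of the population mean mu."""
    return sum(sample) / len(sample)

# The estimate is the value delta takes on the observed data x1, ..., xn.
x = [4.1, 3.8, 4.4, 4.0, 3.7]   # hypothetical observations
estimate = delta(x)
print(estimate)                 # approximately 4.0
```

Here `delta` itself is the estimator, a random quantity before the data are seen; the number it returns for the observed sample is the estimate.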

The theory of point estimation can be divided into two parts. The first part is concerned with methods for finding estimators, and the second part is concerned with evaluating these estimators. Often, the methods of evaluating estimators will suggest new estimators. In many cases, there will be an obvious choice for an estimator of a particular parameter. For example, the sample mean is a natural candidate for estimating the population mean; the median is sometimes proposed as an alternative. In more complicated settings, however, a more systematic way of finding estimators is needed.
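One standard way to evaluate competing estimators, such as the sample mean and the median mentioned above, is to compare their mean squared error by simulation. The sketch below (an illustration under assumed settings: normal data, true mean 10, and simulation sizes chosen arbitrarily) does this for the two candidates:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, n, reps = 10.0, 25, 2000

# Draw many i.i.d. normal samples and apply both estimators to each sample.
samples = rng.normal(loc=mu, scale=2.0, size=(reps, n))
mean_est = samples.mean(axis=1)
median_est = np.median(samples, axis=1)

# Compare the estimators by their mean squared error about the true mu.
mse_mean = np.mean((mean_est - mu) ** 2)
mse_median = np.mean((median_est - mu) ** 2)
print(mse_mean, mse_median)  # for normal data the mean has the smaller MSE
```

For normally distributed data the sample mean wins this comparison, while for heavier-tailed distributions the median can do better, which is why the choice of estimator depends on the assumed model.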

Methods of Finding Estimators

The formulation of the estimation problem in a concrete situation requires specification of the probability model, P, that generates the data. The model P is assumed to be known up to an unknown parameter θ, and P = Pθ is written to express this dependence. The observations x = (x₁, x₂, …, xₙ) are postulated to be the values taken on by the random observable X = (X₁, X₂, …, Xₙ) with distribution Pθ. Frequently, it will be reasonable to assume that each of the Xᵢ has the same distribution and that the variables X₁, X₂, …, Xₙ are independent. This situation is called the independent, identically distributed (i.i.d.) case in the literature and allows for a considerable simplification of the model.

...
