Linear regression refers to a linear function expressing the relationship between the conditional mean of a random variable (the dependent variable) and the corresponding values of one or more explanatory variables (independent variables). The dependent variable is a random variable whose realization is composed of the deterministic effects of fixed values of the explanatory variables as well as random disturbances. We could express such a relationship as

$Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} + \varepsilon_i$

We can speak of either a population regression function or a sample regression function. The sample regression function is typically used to estimate an unknown population regression function. If a linear regression model is well specified, then the expected value of $\varepsilon_i$ in the population is zero. Given this expectation, unbiased estimates of the population regression function can be calculated from the sample by solving the problem $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \cdots + \hat{\beta}_k X_{ki}$. This equation can also be thought of in terms of estimating a conditional mean, as in $E(Y_i \mid X_i) = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \cdots + \hat{\beta}_k X_{ki}$. Both equations give the average values of the stochastic dependent variable conditional on fixed values of the independent variables. The goal of linear regression analysis is to find the values $\hat{\beta}$ that best characterize $\hat{Y}_i$ or $E(Y_i \mid X_i)$.
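
As a rough illustration of the distinction between the population and sample regression functions, the following Python sketch simulates one sample from a hypothetical population regression function and estimates it with the closed-form least squares solution for a single regressor. The coefficient values 2.0 and 0.5, the noise level, and the helper name `ols_simple` are assumptions made for this example only.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical population regression function: E(Y | X) = 2.0 + 0.5 * X
BETA0, BETA1 = 2.0, 0.5

rng = random.Random(0)                    # fixed seed for reproducibility
x = [i / 10 for i in range(1, 201)]       # fixed regressor values 0.1, ..., 20.0
# Disturbances are drawn with mean zero (standard deviation 1.0 is an assumption)
y = [BETA0 + BETA1 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Sample regression function estimated from this one sample
b0_hat, b1_hat = ols_simple(x, y)
residuals = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(x, y)]

print("estimated intercept:", b0_hat)
print("estimated slope:", b1_hat)
```

Because the disturbances have mean zero, the estimates should land near the assumed population values, although any single sample will miss them by some sampling error.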

The term linear regression implies no particular method of estimation. Rather, a linear regression can be estimated in any of several ways. The most popular approach is ordinary least squares (OLS). This approach does not explicitly incorporate assumptions about the probability distribution of the disturbances. However, when inferring from the sample regression function to the population regression function, unbiased and efficient estimation requires that the Gauss-Markov assumptions hold. Using OLS, the analyst minimizes the sum of squared sample residuals with respect to the regression parameters. Another approach to estimating a linear regression is maximum likelihood estimation. This approach requires the assumption of normally distributed disturbances, in addition to the assumptions of OLS. Using maximum likelihood, the analyst maximizes, with respect to the regression parameters, the function defined by the product of the probability densities of all disturbances. This is called the joint likelihood function. Yet another approach is the method of moments. This approach uses the analogy principle, whereby sample moment conditions are used to derive estimates.
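
The three approaches described above can be compared on a small simulated data set. The sketch below is illustrative only: the data-generating values, the seed, and the function names `ssr` and `log_likelihood` are assumptions, and it covers just the single-regressor case. It computes the OLS estimates in closed form, solves the two sample moment conditions (residuals average to zero and are uncorrelated with the regressor), and evaluates a Gaussian log-likelihood at the resulting estimates.

```python
import math
import random

rng = random.Random(1)
x = [rng.uniform(0.0, 10.0) for _ in range(100)]
# Illustrative data: intercept 1.0, slope 2.0, normal disturbances (assumed values)
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.5) for xi in x]
n = len(x)


def ssr(b0, b1):
    """Sum of squared residuals, the function that OLS minimizes."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def log_likelihood(b0, b1, sigma2):
    """Joint Gaussian log-likelihood of the disturbances implied by (b0, b1)."""
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - ssr(b0, b1) / (2.0 * sigma2)


# OLS: closed-form minimizer of ssr for one regressor
x_bar, y_bar = sum(x) / n, sum(y) / n
b1_ols = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_ols = y_bar - b1_ols * x_bar

# Method of moments: sample analogs of E(e) = 0 and E(X * e) = 0 give
#   sum(y)     = n * b0      + b1 * sum(x)
#   sum(x * y) = b0 * sum(x) + b1 * sum(x^2)
# solved here with Cramer's rule.
sx, sy = sum(x), sum(y)
sxx = sum(xi * xi for xi in x)
sxy = sum(xi * yi for xi, yi in zip(x, y))
det = n * sxx - sx * sx
b0_mm = (sy * sxx - sx * sxy) / det
b1_mm = (n * sxy - sx * sy) / det

# Maximum likelihood under normality: the coefficient estimates coincide with
# OLS, and the variance estimate is the mean squared residual.
sigma2_ml = ssr(b0_ols, b1_ols) / n
ll_at_ols = log_likelihood(b0_ols, b1_ols, sigma2_ml)
ll_perturbed = log_likelihood(b0_ols + 0.1, b1_ols - 0.05, sigma2_ml)

print("OLS:", b0_ols, b1_ols)
print("Method of moments:", b0_mm, b1_mm)
print("Log-likelihood at OLS estimates vs. perturbed:", ll_at_ols, ll_perturbed)
```

In this setting the OLS and method-of-moments estimates agree up to rounding, and moving the coefficients away from them lowers the Gaussian log-likelihood. That agreement is a feature of this simple model with normal disturbances and should not be expected for every model or estimator.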

With regard to the meaning of linearity, it is useful to clarify that we are concerned not with linearity in the variables but with linearity in the parameters. For a linear regression, we require that predicted values of the dependent variable be a linear function of the estimated parameters. For example, suppose we estimate a sample regression such as $\hat{Y}_i = \hat{\alpha} + \hat{\beta}\sqrt{X_i}$. In this regression, the relation between $\hat{Y}_i$ and $\sqrt{X_i}$ is linear. If we think of $\sqrt{X_i}$ instead of $X_i$ as our explanatory variable, then the linearity assumption is met. However, this cannot be applied equally for all estimators. For example, suppose we want to estimate a function such as

...
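
The square-root example discussed earlier, in which the model is linear in the parameters but not in the original variable, can be sketched in Python as follows. The coefficient values, noise level, and the helper name `ols_simple` are assumptions chosen for illustration. The only change from an ordinary simple regression is that the regressor is transformed before estimation.

```python
import math
import random


def ols_simple(z, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = szy / szz
    return y_bar - slope * z_bar, slope


# Hypothetical data-generating process: Y = 3.0 + 1.5 * sqrt(X) + disturbance
ALPHA, BETA = 3.0, 1.5
rng = random.Random(2)
x = [float(i) for i in range(1, 101)]
y = [ALPHA + BETA * math.sqrt(xi) + rng.gauss(0.0, 0.5) for xi in x]

# Treat sqrt(X), not X, as the explanatory variable; the model is then
# linear in the parameters alpha and beta.
z = [math.sqrt(xi) for xi in x]
alpha_hat, beta_hat = ols_simple(z, y)

# Fitted values are a linear function of the estimated parameters
y_hat = [alpha_hat + beta_hat * zi for zi in z]

print("alpha_hat:", alpha_hat)
print("beta_hat:", beta_hat)
```

The fitted curve is nonlinear when plotted against X, yet the estimation step is the same least squares calculation used for a straight line, because the parameters enter linearly.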
