Linear Regression
Linear regression refers to a linear FUNCTION expressing the RELATIONSHIP between the conditional mean of a RANDOM VARIABLE (the DEPENDENT VARIABLE) and the corresponding values of one or more explanatory variables (INDEPENDENT VARIABLES). The dependent variable is a random variable whose realization is composed of the deterministic effects of fixed values of the explanatory variables as well as random disturbances. We could express such a relationship as
$$Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} + \varepsilon_i.$$
We can speak of either a population regression function or a sample regression function. The sample regression function is typically used to estimate an unknown population regression function. If a linear regression model is well specified, then the expected value of $\varepsilon_i$ in the population is zero. Given this expectation, unbiased estimates of the population regression function can be calculated from the sample by solving the problem $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \cdots + \hat{\beta}_k X_{ki}$. This equation can also be thought of in terms of estimating a conditional mean, as in $E(Y_i \mid X_i) = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \cdots + \hat{\beta}_k X_{ki}$. Both equations give the average values of the stochastic dependent variable conditional on fixed values of the independent variables. The goal of linear regression analysis is to find the values $\hat{\beta}$ that best characterize $\hat{Y}_i$ or $E(Y_i \mid X_i)$.
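The idea of a conditional mean at fixed values of an explanatory variable can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the parameter values `beta0` and `beta1`, the normal disturbances with unit variance, and the grid `x_values` are all hypothetical choices, not part of the entry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters, chosen only for illustration.
beta0, beta1 = 2.0, 0.5

# Fixed values of the explanatory variable; Y is drawn many times at each one.
x_values = np.array([1.0, 2.0, 3.0, 4.0])
n_draws = 200_000

cond_means = {}
for x in x_values:
    # Disturbances with expected value zero (assumed normal here).
    eps = rng.normal(0.0, 1.0, size=n_draws)
    y = beta0 + beta1 * x + eps
    # The sample average of Y at this fixed X approximates E(Y | X).
    cond_means[float(x)] = float(y.mean())

for x, m in cond_means.items():
    print(f"X = {x}: mean of Y = {m:.3f}, beta0 + beta1*X = {beta0 + beta1 * x:.3f}")
```

Because the disturbances average out to zero, the mean of Y at each fixed X should sit close to the deterministic part of the model.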
The term linear regression implies no particular method of estimation. Rather, a linear regression can be estimated in any of several ways. The most popular approach is to use ORDINARY LEAST SQUARES (OLS). This approach does not explicitly incorporate assumptions about the probability distribution of the disturbances. However, when inferring the population regression function from the sample regression function, unbiased and EFFICIENT estimation requires compliance with the Gauss-Markov assumptions. Using OLS, the analyst minimizes the sum of squared sample residuals with respect to the regression parameters. Another approach to estimating a linear regression is to use MAXIMUM LIKELIHOOD ESTIMATION. This approach requires the assumption of normally distributed disturbances, in addition to the assumptions of OLS. Using maximum likelihood, the analyst maximizes the function defined by the product of the probability densities of all disturbances with respect to the regression parameters. This product is called the joint likelihood function. Yet another approach is to use the method of moments. This approach uses the analogy principle, whereby population moment conditions are replaced by their sample counterparts to derive estimates.
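As a rough illustration of the OLS approach described above, the following sketch minimizes the sum of squared residuals by solving the normal equations on simulated data. The data, the coefficient vector `true_beta`, and the helper `ssr` are hypothetical and introduced only for this example. The sketch also checks that the residuals are orthogonal to the regressors, which is the sample moment condition a method-of-moments estimator would impose. Under normally distributed disturbances, the maximum likelihood estimates of the coefficients coincide with the OLS estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated sample (all values hypothetical).
n = 500
X1 = rng.uniform(0, 10, size=n)
X2 = rng.uniform(0, 5, size=n)
true_beta = np.array([1.0, 2.0, -0.5])

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(n), X1, X2])
y = X @ true_beta + rng.normal(0, 1, size=n)

def ssr(b):
    """Sum of squared residuals for a candidate coefficient vector b."""
    r = y - X @ b
    return float(r @ r)

# OLS: solve the normal equations (X'X) b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sample moment conditions: residuals are orthogonal to each regressor.
residuals = y - X @ beta_hat
moment_conditions = X.T @ residuals / n

print("OLS estimates:", beta_hat)
print("Moment conditions (approximately zero):", moment_conditions)
```

Solving the normal equations directly keeps the link to the minimization problem visible; for ill-conditioned design matrices, a solver such as `np.linalg.lstsq` is usually the safer choice.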
With regard to the meaning of linearity, it is useful to clarify that we are not concerned with linearity in the variables but with linearity in the parameters. For a linear regression, we require that predicted values of the dependent variable be a linear function of the estimated parameters. For example, suppose we estimate a sample regression such as $\hat{Y}_i = \hat{\alpha} + \hat{\beta}\sqrt{X_i} + \varepsilon_i$. In this regression, the relation between $\hat{Y}_i$ and $\sqrt{X_i}$ is linear. If we think of $\sqrt{X_i}$ instead of $X_i$ as our explanatory variable, then the linearity assumption is met. However, this cannot be applied equally for all estimators. For example, suppose we want to estimate a function such as

...
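The square-root example above can be sketched numerically. The relation between Y and X is curved, but once $\sqrt{X_i}$ is treated as the explanatory variable, the model is linear in its parameters and can be fitted by ordinary least squares. The parameter values `alpha` and `beta`, the sample size, and the normal disturbances are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical parameters and simulated data.
alpha, beta = 1.0, 3.0
n = 1000
X = rng.uniform(0, 100, size=n)
Y = alpha + beta * np.sqrt(X) + rng.normal(0, 1, size=n)

# Treat sqrt(X) as the explanatory variable: the model is now
# linear in the parameters alpha and beta.
Z = np.sqrt(X)
design = np.column_stack([np.ones(n), Z])

coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
alpha_hat, beta_hat = coef

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

The transformation is applied to the data before estimation, so the estimator itself is unchanged.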