Multiple regression is a general and flexible statistical method for analyzing associations between two or more independent variables and a single dependent variable. As a general statistical technique, multiple regression can be employed to predict values of a particular variable based on knowledge of its association with known values of other variables, and it can be used to test scientific hypotheses about whether and to what extent certain independent variables explain variation in a dependent variable of interest. As a flexible statistical method, multiple regression can be used to test associations among continuous as well as categorical variables, and it can be used to test associations between individual independent variables and a dependent variable, as well as interactions among multiple independent variables and a dependent variable. In this entry, different approaches to the use of multiple regression are presented, along with explanations of the more commonly used statistics in multiple regression, methods of conducting multiple regression analysis, and the assumptions of multiple regression.

Approaches to Using Multiple Regression

Prediction

One common application of multiple regression is predicting values of a particular dependent variable based on knowledge of its association with certain independent variables. In this context, the independent variables are commonly referred to as predictor variables and the dependent variable is characterized as the criterion variable. In applied settings, it is often desirable to predict a score on a criterion variable from information that is available in certain predictor variables. For example, in the life insurance industry, actuarial scientists use complex regression models to predict, on the basis of certain predictor variables, how long a person will live. In scholastic settings, college and university admissions offices use predictors such as high school grade point average (GPA) and ACT scores to predict an applicant's college GPA, even before the applicant has entered the university.

Multiple regression is most commonly used to predict values of a criterion variable based on linear associations with predictor variables. A brief example using simple regression easily illustrates how this works. Assume that a horticulturist developed a new hybrid maple tree that grows exactly 2 feet for every year that it is alive. If the height of the tree was the criterion variable and the age of the tree was the predictor variable, one could accurately describe the relationship between the age and height of the tree with the formula for a straight line, which is also the formula for a simple regression equation:

Y = a + bX

where Y is the value of the dependent, or criterion, variable; X is the value of the independent, or predictor, variable; b is a regression coefficient that describes the slope of the line; and a is the Y intercept. The Y intercept is the value of Y when X is 0. Returning to the hybrid tree example, the exact relationship between the tree's age and height could be described as follows:

Y = 0 + 2X

Notice that the Y intercept is 0 in this case because at 0 years of age, the tree has 0 height. At that point, it is just a seed in the ground. Knowing the relationship between the tree's age and height makes it easy to predict the height of any given tree from its age alone. A 5-year-old tree will be 10 feet tall, an 8-year-old tree will be 16 feet tall, and so on.
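The tree example can be sketched in a few lines of Python. This is only an illustration, not part of the original entry. The function names fit_simple_regression and predict_height are hypothetical, and the data are generated from the 2-feet-per-year rule stated above, not observed. The sketch estimates the intercept a and slope b with the standard least-squares formulas for a single predictor, then uses them to predict height from age.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-products of deviations over sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Assumed data: ages 0 through 10 years, heights from the rule "2 feet per year".
ages = list(range(0, 11))
heights = [2 * age for age in ages]

a, b = fit_simple_regression(ages, heights)


def predict_height(age_years):
    """Predicted height in feet for a tree of the given age in years."""
    return a + b * age_years


print(predict_height(5))  # 10.0
print(predict_height(8))  # 16.0
```

Because the generated heights fall exactly on a straight line, the fitted slope is 2 and the fitted intercept is 0, matching the equation for the tree. With real measurements the points would scatter around the line, and the same formulas would return the line that minimizes the squared prediction errors.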

...
