# Regression Analysis

Regression analysis is the name for a family of techniques that attempt to predict one variable (an outcome, or dependent, variable) from another variable or set of variables (the predictor, or independent, variables).

We will illustrate this first with an example of linear regression, also called (ordinary) least squares (OLS) regression. When people say “regression” without any further description, they are almost always talking about OLS regression. Figure 1 shows a scatterplot of data from a group of British ex-miners, who were claiming compensation for industrial injury. The x-axis shows the age of the claimant, and the y-axis shows the grip strength, as measured by a dynamometer (this measures how hard the person can squeeze two bars together).

Running through the points is the line of best fit, or regression line. This line allows us to predict the conditional mean of the grip strength—that is, the mean value that would be expected for a person of any age.

The line of best fit, or regression line, is calculated using the least squares method. To illustrate the least squares method, consider Figure 2, which is simplified, in that it has only four points on the scatter-plot. For each point, we calculate (or measure) the vertical distance between the point and the regression line—this is the residual, or error, for that point. Each of these errors is squared and these values are summed. The line of best fit is placed where it minimizes this sum of squared errors (or residuals)— hence, it is the least squares line of best fit, which is sometimes known as the ordinary least squares line of best fit (because there are other kinds of least squares lines, such as generalized least squares and weighted least squares). Thus, we can think of the regression line as minimizing the error (note that in statistics, the term error is used to mean deviation or wandering, not mistake).
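The least squares idea described above can be sketched in a few lines of code. This is an illustrative sketch, not the entry's own example: the four (x, y) points are invented, `ols_fit` uses the standard closed-form OLS solution, and `sum_squared_errors` measures the quantity being minimized.

```python
# A minimal sketch of the least squares idea, using four made-up
# (x, y) points: for a candidate line y = b0 + b1*x, sum the squared
# vertical distances (residuals) from each point to the line; the
# line of best fit is the one that minimizes this sum.

def sum_squared_errors(points, b0, b1):
    """Sum of squared residuals for the line y = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in points)

def ols_fit(points):
    """Closed-form ordinary least squares intercept (b0) and slope (b1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in points)
          / sum((x - mean_x) ** 2 for x, _ in points))
    b0 = mean_y - b1 * mean_x
    return b0, b1

points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
b0, b1 = ols_fit(points)
best = sum_squared_errors(points, b0, b1)

# Nudging either parameter away from the fitted values can only
# increase the sum of squared errors.
assert best <= sum_squared_errors(points, b0 + 0.3, b1)
assert best <= sum_squared_errors(points, b0, b1 - 0.2)
```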

### Figure 1 Scatterplot Showing Age Against Grip Strength With Line of Best Fit

The position of a line on a graph is given by two values—the height of the line and the gradient of the line. In regression analysis, the gradient may be referred to as b1 or β1 (β is the Greek letter beta). Of course, because the line slopes, the height varies along its length. The height of the line is given at the point where the value of the x-axis (that is, the predictor variable) is equal to zero. The height of the line is called the intercept, or y-intercept, the constant, b0 (or β0), or sometimes α (the Greek letter alpha).

### Figure 2 Example of Calculation of Residuals

Calculation of the regression line is straightforward, given the correlation between the measures. The slope of the line (b1) is given by

b1 = r × (sy / sx)

where

r is the correlation between the two measures,

sy is the standard deviation of the outcome variable, and

sx is the standard deviation of the predictor variable.

The intercept is given by

i = Y¯ − b1 × X¯

where

i is the intercept,

Y¯ is the mean of the outcome variable, and

X¯ is the mean of the predictor variable.

In the case of the data shown in Figure 1, the intercept is equal to 50.9, and the slope is −0.41. We can calculate the predicted (conditional mean) grip strength of a person at any age, using the equation

ŝ = 50.9 − 0.41 × a

where ŝ is the predicted strength and a is the age of the individual. Notice the hat on top of the s, which means that it is the predicted, not the actual, value. A very similar way to write the equation would be

s = 50.9 − 0.41 × a + e
In this equation, we are saying that s is the person's actual strength, which is equal to the expected value plus a deviation for that individual. Now we no longer have the hat on the s, because the equation is stating that the person's actual score is equal to that calculated, plus e, that is, error.
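The fitted equation can be written as a small function of age, using the intercept (50.9) and slope (−0.41) reported above for the Figure 1 data.

```python
# The fitted equation from Figure 1 (intercept 50.9, slope -0.41) as a
# function: it returns the conditional mean grip strength at any age.
def predicted_strength(age):
    return 50.9 - 0.41 * age

strength_at_40 = predicted_strength(40)   # about 34.5 kg
# An individual's actual strength s is this prediction plus his or her
# error term: s = predicted_strength(age) + e.
```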

Each of the parameters in the regression analysis can have a standard error associated with it, and hence a confidence interval and p value can be calculated for each parameter.

Regression generalizes to the case of multiple predictor variables, referred to as multiple regression. In this case, the calculations are more complex, but the principle is the same: we try to find values for the intercept and slope parameters such that the amount of error is minimized. The great advantage and power of multiple regression is that it enables us to estimate the effect of each variable while controlling for the other variables; that is, it estimates what the slope for each predictor would be if all other variables were held constant.

We can think of regression in a more general sense as being an attempt to develop a model that best represents our data. This means that regression can generalize in a number of different ways.

### Types of Regression

For linear regression as we have described it to be appropriate, it is necessary for the outcome (dependent) variable to be continuous and the predictor (independent) variable to be continuous or binary. It is frequently the case that the outcome variable, in particular, does not match this assumption, in which case a different type of regression is used.

### Categorical Outcomes

Where the outcome is binary (that is, yes or no), logistic or probit regression is used. We cannot estimate the conditional mean of a yes/no response, because the answer must be either yes or no; a predicted outcome score of 0.34 does not make sense. Instead, we say that the probability of the individual saying yes is 0.34 (or whatever it is). Logistic and probit regression can be extended in two ways: where the outcome has more than two categories, multinomial logistic regression is used; where the outcome is ordinal, ordinal logistic regression is used (SPSS refers to this as PLUM, for PoLytomous Universal Models), which models the conditional probability of a range of events occurring.
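The key difference from OLS can be sketched in a few lines: the linear part of the model is passed through the logistic function, so the prediction is a probability rather than a conditional mean. The coefficients below are invented for illustration, not fitted to any data.

```python
# A sketch of what logistic regression predicts: the linear part
# b0 + b1*x is passed through the logistic function, so the output
# is a probability between 0 and 1 rather than a conditional mean.
# The coefficients b0 and b1 are invented for illustration.
import math

def predicted_probability(x, b0=-2.0, b1=0.5):
    linear = b0 + b1 * x
    return 1 / (1 + math.exp(-linear))

# The probability rises with x but stays strictly between 0 and 1.
assert 0 < predicted_probability(-10) < predicted_probability(10) < 1
```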

### Count Outcomes

Where data are counts of the number of times an event occurred (for example, number of cigarettes smoked, or number of times arrested), the data tend to be positively skewed; additionally, it is only sensible to predict an integer outcome: it is not possible to be arrested 0.3 times, for example. For count outcomes of this type, Poisson regression is used. This is similar to the approaches for categorical data, in that the probability of each potential value is modeled; for example, the probability of having been arrested zero times might be 0.70, one time 0.20, two times 0.08, and three times 0.02.
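The idea can be sketched as follows. In Poisson regression the conditional mean count is exp(b0 + b1·x), and the probability of each integer count then follows the Poisson distribution with that mean. The coefficients here are invented for illustration.

```python
# A sketch of what Poisson regression models: the conditional mean
# count is exp(b0 + b1*x), which is always positive, and the
# probability of each integer count k follows the Poisson
# distribution with that mean. Coefficients are invented.
import math

def poisson_probability(k, x, b0=0.1, b1=0.2):
    mean = math.exp(b0 + b1 * x)   # predicted mean count, always > 0
    return mean ** k * math.exp(-mean) / math.factorial(k)

# The probabilities over all possible counts sum to 1.
total = sum(poisson_probability(k, x=1) for k in range(60))
```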

### Censored Outcomes

Some variables are, in effect, a mixture of a categorical and a continuous variable; these are called censored variables. Frequently, they are cut off at zero. For example, the income an individual receives from criminal activities is likely to be zero, so it might be considered binary: it is either zero or not. However, if it is not zero, we would like to model how high it is. In this situation, we use Tobit regression, named for its developer, James Tobin: it is Tobin's probit regression (Tobin himself did not call it this, but the name stuck). Another type of censoring is common where the outcome is the time to an event, for example, how long the participant took to solve the problem, how long the patient survived, or how long the piece of equipment lasted. Censoring occurs in this case when, for some reason, we did not observe the event in which we were interested: the participant may have given up on the problem before solving it, the patient may have outlived the investigator, or the piece of equipment may have been destroyed in a fire. In these cases, we use a technique called Cox proportional hazards regression (or often simply Cox regression).

### Uses of Regression

Regression analysis has three main purposes: prediction, explanation, and control.

### Prediction

A great deal of controversy arises when people confuse prediction with explanation in regression. A regression equation can be used to predict an individual's score on the outcome variable of interest. For example, it may be the case that students who spend more time drinking in bars perform less well in their exams. If we meet a student and find that he or she has never set foot inside a bar, we might predict that he or she is likely to do better than average in his or her assessment. This would be an appropriate conclusion to draw.

### Explanation

The second use of regression is to explain why certain events occurred, based on the relationships between variables. Explanation requires going beyond the data: we can say that students who spend more time in bars achieve lower grades, but we cannot say that this is because they spend more time in bars. It may be that those students do not like to work, and if they did not spend time in bars, they would not spend the time working; they would spend it doing something else unproductive. Richard Berk has suggested that we give regression analysis three cheers when we want to use it for description, but only one cheer for causal inference.

### Control

The final use of regression is to control for other variables. In this case, we are particularly interested in the residuals from the regression analysis. When we place the regression line on a graph, everyone above the line is doing better than we would have expected, given his or her levels of the predictor variables, and everyone below the line is doing worse than we would have expected. By comparing people's residuals, we are therefore making a fairer comparison. Consider Figure 1 again, with two cases highlighted: one approximately 40 years old and one approximately 70 years old. The 40-year-old has a higher grip strength than the 70-year-old (approximately 31 vs. approximately 28 kg). However, if we take age into account, the 40-year-old has a lower grip strength than we would expect for someone of that age, and the 70-year-old has a higher grip strength than expected. Controlling for age, therefore, the 70-year-old has the higher grip strength.
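The comparison above can be sketched in code, assuming the Figure 1 line (intercept 50.9, slope −0.41) and the approximate ages and strengths of the two highlighted cases.

```python
# A residual is the actual score minus the score the regression line
# predicts for that age (line from Figure 1: 50.9 - 0.41 * age).
# Ages and strengths approximate the two highlighted cases.
def residual(age, strength):
    predicted = 50.9 - 0.41 * age
    return strength - predicted

younger = residual(40, 31)   # 31 - 34.5 = -3.5, below the line
older = residual(70, 28)     # 28 - 22.2 = +5.8, above the line
assert older > younger       # controlling for age, the 70-year-old is stronger
```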

We will give a concrete example of the way this is used, from hospitals in the United Kingdom. Dr Foster (an organization, rather than an individual) collates data on the quality of care in different hospitals; one of the most important variables it uses is the standardized mortality ratio (SMR). The SMR models an individual's chance of dying in each hospital. Of course, it would be unfair simply to look at the proportion of people treated in each hospital who died, because hospitals differ. First, they specialize in different things, so a hospital that specializes in heart surgery will have more patients die than a hospital that specializes in leg surgery. Second, hospitals serve different populations: a hospital in a town that is popular with retirees will probably have higher mortality than a hospital in a town that is popular with younger working people. Dr Foster attempts to control for each of the factors that are important in predicting hospital mortality, and then calculates the standardized mortality ratio, adjusting for those factors. It does this by carrying out a regression and examining the residuals.
