Coefficient of Determination
Represented by r² in the bivariate case and R² in the multivariate case, the coefficient of determination is a measure of GOODNESS OF FIT in ORDINARY LEAST SQUARES LINEAR REGRESSION. (In the multivariate case, it is often called the coefficient of multiple determination.) The statistic measures how well the estimated regression line fits the actual data. It tells us how much of the variation in the dependent variable, y, is explained by all of the independent variables in the model taken together. Specifically, it is the ratio of the explained variation in y to the total variation in y.
Several equivalent equations may be used to calculate the coefficient of determination. The first divides the explained sum of squares (ESS) by the total sum of squares (TSS): R² = ESS/TSS. The second subtracts from 1 the ratio of the residual sum of squares (RSS, the unexplained sum of squares) to the total sum of squares: R² = 1 − (RSS/TSS). The two agree because TSS = ESS + RSS when the regression includes an intercept.
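As a minimal sketch of the two formulas, the following Python fits a bivariate regression with an intercept and computes the coefficient of determination both ways. The function name `r_squared_two_ways` and the five data points are hypothetical, chosen only for illustration; NumPy's least-squares solver stands in for whatever regression software is actually used.

```python
import numpy as np

def r_squared_two_ways(x, y):
    """Fit y = a + b*x by OLS and return (ESS/TSS, 1 - RSS/TSS)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones (the intercept) plus the predictor.
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta
    tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
    ess = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
    rss = np.sum((y - y_hat) ** 2)         # residual sum of squares
    return ess / tss, 1.0 - rss / tss

# Made-up illustrative data, not taken from any study.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2_a, r2_b = r_squared_two_ways(x, y)
print(r2_a, r2_b)
```

With an intercept in the model, the two printed values should agree up to floating-point rounding.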
Because it is the ratio of the explained sum of squares to the total sum of squares, the measure falls between 0 and 1 whenever the regression includes an intercept. A value of 1 indicates a perfect fit of the linear regression line to the data; values closer to 0 suggest a poor fit. If x and y have no linear association in the sample, R² will equal 0. It should be noted that R², computed as 1 − (RSS/TSS), can be negative if the sample average accounts for more variation in the dependent variable than the independent variables explain, which can happen when the model is estimated without an intercept; in that case, the value is not bounded below by −1. Multiplying the measure by 100 gives a percentage that allows for clearer interpretation; an R² of .25, for example, would be interpreted by saying that 25% of the variation in the dependent variable is explained by the independent variables in the model.
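The negative case can be illustrated with a small sketch. It assumes the 1 − (RSS/TSS) form and compares a regression with an intercept against one forced through the origin; the helper name `r2_from_residuals` and the four data points are hypothetical.

```python
import numpy as np

def r2_from_residuals(X, y):
    """Return 1 - RSS/TSS for an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss

# Made-up data lying exactly on the line y = 14 - x.
x = np.array([4.0, 3.0, 2.0, 1.0])
y = np.array([10.0, 11.0, 12.0, 13.0])

# With an intercept column the line fits the data exactly.
r2_with_intercept = r2_from_residuals(np.column_stack([np.ones_like(x), x]), y)

# Without an intercept the fitted line must pass through the origin,
# and it predicts y worse than the sample mean does.
r2_no_intercept = r2_from_residuals(x.reshape(-1, 1), y)

print(r2_with_intercept, r2_no_intercept)
```

In this constructed example the no-intercept value falls well below zero, because the residual sum of squares exceeds the total sum of squares several times over.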
The positive square root of the coefficient of determination is known as the coefficient of MULTIPLE CORRELATION; in the bivariate case, it equals the absolute value of the sample CORRELATION coefficient. Conversely, R² is the square of the coefficient of multiple correlation, which is the measure of correlation between the estimated dependent variable calculated from the independent variables (ŷ) and the actual dependent variable (y).
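The link between the two statistics can be checked numerically. The sketch below assumes simulated data (the coefficients, sample size, and seed are arbitrary choices, not from the entry) and compares R² with the squared correlation between y and the fitted values ŷ.

```python
import numpy as np

# Simulated data: two predictors plus random noise (arbitrary values).
rng = np.random.default_rng(0)
n = 50
predictors = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * predictors[:, 0] - 0.5 * predictors[:, 1] + rng.normal(size=n)

# OLS fit with an intercept.
X = np.column_stack([np.ones(n), predictors])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

# Coefficient of determination from the residuals.
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Coefficient of multiple correlation: correlation between y and y-hat.
multiple_r = np.corrcoef(y, y_hat)[0, 1]

print(r2, multiple_r ** 2)
```

The two printed numbers should match up to rounding error, and `multiple_r` itself is nonnegative.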
Although much consideration, perhaps sometimes too much, is paid to this measure, there are a number of important points to remember when interpreting it. First, no matter how high your R² is, it is evidence only of correlation; it does not provide positive support for causation. That is, a high R² does not allow you to state that your independent variables caused your dependent variable. Second, although 30%, for example, is not an extremely high value, your model may be performing relatively well; explaining 30% of the variation of a factor in the political environment is often still a substantial portion. By contrast, when conducting time-series analysis, you may often find R² values over .90. Third, as independent variables are added to the model, the R² will never decrease and will usually increase, but we should avoid trying to maximize the coefficient of determination at the expense of theoretically sound models. Finally, R² values cannot be compared between any two given models if they do not have the same dependent variable.
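The third point can be seen in a short simulation. This is a sketch under stated assumptions: the data are simulated, the added variable `irrelevant` is pure noise unrelated to y by construction, and the helper name `r2` is hypothetical.

```python
import numpy as np

def r2(X, y):
    """Return 1 - RSS/TSS for an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss

# Simulated data (arbitrary seed and coefficients).
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
irrelevant = rng.normal(size=n)  # generated independently of y
y = 3.0 + 1.5 * x1 + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, irrelevant])

r2_small = r2(X_small, y)
r2_big = r2(X_big, y)
print(r2_small, r2_big)
```

The larger model's R² is at least as high as the smaller model's even though the extra variable has no real relationship to y, which is why a higher R² alone is weak grounds for adding variables.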
...