
# Logistic Regression

Logistic regression is a statistical method to test for associations, or relationships, between variables. Like all regression analyses, logistic regression is a predictive analysis where a model is tested to find out whether the value of one variable, or the combination of values of multiple variables, can predict the value of another variable. The distinguishing feature of logistic regression is that the dependent (also called outcome or response) variable is categorical. This entry first describes the method and the concepts of causal inference and biological plausibility. It then discusses positive and negative associations and the odds ratio and provides an example of the use of logistic regression analysis to determine whether depression increases the risk of older people needing home help. The entry concludes by reviewing some assumptions and sources of error in logistic regression.

In binary logistic regression, which is the most common type of logistic regression, the dependent variable is binary or dichotomous. That means there can be only two options for its value: for example, yes/no, pass/fail, alive/dead, or satisfied/unsatisfied. When there are more than two categories for the dependent variable, the less common multinomial logistic regression is needed instead.

The dependent variable is the thing you are trying to explain or predict. There can be one or multiple independent (also called predictor or explanatory) variables tested in your model, and these can be either discrete variables (including dichotomous or ordinal), or they can be continuous (interval) variables. The term dependent suggests that this variable is dependent upon the status of the independent or predictor variable(s). As with all regression analyses, when there are multiple independent variables in a model, you are testing the predictive ability of each independent variable while controlling for the effects of other predictors. In logistic regression, the results lead to an estimation of the change in probability or odds of the outcome event occurring with a change in the value of the independent variable(s) relative to the probability or odds of the outcome event occurring given no change in the predictor variables. The results are not as easily interpreted as the results of a linear regression analysis, where the level of the outcome can be predicted from the predictor variables.

In logistic regression, the odds of the outcome of interest occurring for a one-unit change in a predictor variable are given in relation to the null hypothesis of equal odds. Equal odds are represented by an odds ratio value of 1.0. An increase in the odds of the outcome occurring is indicated by an odds ratio greater than 1.0, and a decrease in the odds of the outcome occurring is indicated by an odds ratio of less than 1.0. Statistically significant odds ratios indicate an association between the variables. The further the odds ratio is from 1.0, the stronger the association.
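For readers who like to see the arithmetic, a short Python sketch shows why the odds ratio for a one-unit change is the same wherever on the predictor scale the change happens. The coefficients b0 and b1 below are made-up values for illustration, not results from any study discussed in this entry:

```python
import math

def predicted_probability(b0, b1, x):
    """Probability that the outcome is 1 under a logistic model
    with intercept b0 and coefficient b1."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical coefficients, purely for illustration.
b0, b1 = -2.0, 0.5

# The odds ratio for a one-unit increase in x equals exp(b1),
# no matter where on the x scale the increase happens.
or_low = odds(predicted_probability(b0, b1, 1)) / odds(predicted_probability(b0, b1, 0))
or_high = odds(predicted_probability(b0, b1, 5)) / odds(predicted_probability(b0, b1, 4))
expected = math.exp(b1)   # ≈ 1.65 for both comparisons
```

Because the model is linear on the log-odds scale, exponentiating the coefficient gives a single odds ratio that applies across the whole range of the predictor.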

An example of a logistic regression question is: Does the value of x (independent variable) change the likelihood of y (dependent variable) being “yes” (rather than “no”)? For example, does eating bread crusts increase the likelihood of having curly hair (rather than straight hair)? In this case, a statistically significant odds ratio greater than 1.0 would indicate that eating bread crusts does increase the chance of hair being curly.

Logistic regression can also indicate the strength of this predictive relationship by providing a value for the increased or decreased odds of the outcome occurring for a given change in the predictor variable. In our example, if the odds ratio is only slightly greater than 1.0, then eating crusts only slightly increases the likelihood of having curly hair, and other factors are probably more important. However, if the odds ratio is much greater than 1.0, then eating bread crusts makes a substantial difference to your chances.

In another example, the odds of being obese among children watching 11–20 hours of TV per week compared with children watching ≤10 hours of TV per week are around 1.4. That is, children in the 11–20 hours group are 1.4 times more likely to be obese than children in the ≤10 hours group, a rather modest 40% increase. However, for children watching more than 30 hours of TV per week, the odds of being obese are around 3.6 times the odds for children watching ≤10 hours. Note that the dependent variable is obese versus not obese, and the independent variable is hours of TV watched per week, categorized into ≤10 hours, 11–20 hours, 21–30 hours, and >30 hours. The reference group in the model is the ≤10 hours per week group, and the odds of being obese among children watching ≤10 hours of TV per week are therefore set to 1.0. The odds of obesity in the other groups are given relative to the odds for the reference group and hence called odds ratios. So far, the examples have had only one independent variable. With multiple independent variables, you can see the relative importance of the predictors, that is, which of the variables in the model is the strongest predictor of the outcome.
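With a single categorical predictor like this, the odds ratio can be read straight off a 2×2 table of counts. The counts below are invented for illustration (they are not the study's data) but are chosen so the ratio matches the 1.4 reported above:

```python
# Hypothetical counts, chosen so the odds ratio comes out at 1.4.
obese_ref, not_obese_ref = 20, 180   # <=10 hours of TV per week (reference)
obese_grp, not_obese_grp = 28, 180   # 11-20 hours of TV per week

odds_ref = obese_ref / not_obese_ref   # odds of obesity, reference group
odds_grp = obese_grp / not_obese_grp   # odds of obesity, 11-20 hour group
odds_ratio = odds_grp / odds_ref       # 1.4: the 11-20 hour group's odds
                                       # relative to the reference group
```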

Note that logistic regression does not tell you the actual likelihood or odds of an outcome for an individual. The results give the odds of the outcome occurring when a predictor variable is one unit higher, relative to the odds at the original value of the variable. Nor can logistic regression be used to determine whether a variable causes an increased or decreased probability of the outcome.

### Causal Inference

The term dependent does not suggest that the independent variable(s) cause the outcome. This is a very important concept to understand when interpreting the results of regression analyses. In our example, if you found an association between eating bread crusts and curly hair, you cannot conclude that eating bread crusts causes curly hair. Similarly, it cannot be known from the TV watching data whether it is the increased TV watching that causes the increased likelihood of obesity. In fact, in this example, the causal relationship is likely to be complex, multifactorial, and possibly bidirectional. That is, there may be an element of higher body mass index (BMI) causing children to choose more sedentary behaviors. The possible reasons for a finding of increased odds include

- A direct causal relationship exists. Eating crusts does in fact make your hair grow curly.
- A reverse causal relation exists. Having curly hair makes you eat more bread crusts.
- An indirect causal pathway exists. People who eat more bread crusts are more likely to have curly hair, but the causal pathway is more complex. For example, eating more bread crusts makes you drink more water, and drinking more water makes your hair go curly.
- A third factor is associated with both predictor and outcome variables. For example, bread crust eating tends to be higher in people who eat more bread, and it is bread that causes hair to be curly.
- There is no relationship between eating bread crusts and having curly hair and the finding was purely coincidence. This is a false positive or a Type I error.

The point is that finding an association does not tell you which of these possible reasons for the association is true. If you find an association between two variables, you cannot assume that the predictor variable caused the increased odds of the outcome occurring. From the results, you may be able to suggest an explanation, but you need to test your new hypothesis with another type of experimental design. A significant association in a regression analysis does not necessarily indicate a causal relationship.

### Biological Plausibility

This leads on to the concept of biological plausibility, or “Does this explanation make reasonable or logical sense?” Of course, in reality you should not find an association between eating bread crusts and curly hair because there is no biologically plausible rationale why eating bread crusts would make your hair curly. If you use a p value cutoff of <.05 for statistical significance in your logistic regression analyses, then 5 in every 100 relationships tested, where no relationship exists, will be statistically significant and simply a random chance finding. For this reason, it is important to use logistic regression to test only biologically plausible theories rather than to analyze all the combinations available in the data and then try to subsequently explain the significant relationships found. In other words, it is important to have a research question with a stated hypothesis or expectation before any analyses are carried out.

### Positive and Negative Associations and Increased or Decreased Odds

Whenever a logistic regression analysis identifies an association, the association may be either positive or negative. This tells you about the direction of the association or whether the factor increases or decreases the likelihood of the outcome of interest. In terms of odds, a positive association produces an odds ratio of greater than 1.0, and a negative association produces an odds ratio of less than 1.0. For this reason, it is important to be aware of how you code your outcome variable for the analysis.

Typically, statistical software packages will provide the odds for the outcome coded with the higher value compared with the outcome coded with the lower value. Thus, if you coded obesity with “one” and normal weight with “zero,” the odds will be for the probability of having obesity. In the example of higher grades at school increasing the odds of the student going on to tertiary education, if enrolling in tertiary education is coded “one” and not enrolling in tertiary education is coded “zero,” the association would be positive and the odds would be >1.0. However, if enrolling in tertiary education is coded “one” and not enrolling in tertiary education is coded “two,” the association would be negative and the odds would be <1.0. The results have the same interpretation. The odds ratio values are simply the inverse of each other. The odds of enrolling in tertiary education are better for students with higher grades and worse for students with lower grades. The difference in direction of the association and value of the odds ratio is simply due to the coding.
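The inverse relationship between the two codings can be verified directly: reversing which outcome is coded as the higher value flips the sign of the coefficient, so the two odds ratios multiply to 1. The coefficient below is a made-up value for illustration:

```python
import math

b = 0.40                      # hypothetical coefficient when "enrolled" is the higher code
or_enrolled = math.exp(b)     # odds ratio for enrolling (> 1.0, positive association)

# Reversing the outcome coding flips the coefficient's sign,
# so the odds ratio becomes the inverse of the original.
or_reversed = math.exp(-b)    # < 1.0, negative association, same interpretation
```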

### Example of Logistic Regression Analysis

Let’s use the following example to help explain the results of a logistic regression analysis. An analysis tested the hypothesis that depression increases the risk of older people needing home help. The model has one independent (predictor) variable, depression, and a dichotomous dependent (outcome) variable, home help. Depression scores can range from 0 (no depression) to 21. Home help can either be 1 (yes) or 0 (no).

### Unstandardized Coefficient or B Value

In this model, the unstandardized coefficient (B value) was .15. The B value is similar to the B value in a linear regression analysis and can be used in a predictive equation. However, in logistic regression, the equation predicts the probability of a case falling into the desired category rather than the value for the outcome variable. In this case, the B value is positive; therefore, higher depression scores (if significant) are associated with greater likelihood of needing home help.

### Standardized Odds Ratio, Exp(B), or β Value

The β value is the exponential of the B value, or the odds ratio, and because it is standardized, its magnitude can be considered relative to the magnitude of the β value(s) for other variable(s) in the model or for variable(s) in other models. The β value is the point estimate of the strength of the association. The further away the β value is from 1.0, the stronger the association. In the depression versus home help example, the β value is 1.16 for depression. That means the odds of needing home help are 1.16 times higher for someone reporting one point more on the depression scale than for a person with a depression score one point lower.
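The relationship between B and Exp(B) in this example is simple exponentiation, which also shows how a multi-point change in the depression score compounds the odds:

```python
import math

b = 0.15                          # unstandardized coefficient from the example
odds_ratio = math.exp(b)          # ≈ 1.16, the reported Exp(B)

# A two-point increase in the depression score multiplies the odds twice over:
two_point_or = math.exp(2 * b)    # equals odds_ratio ** 2, ≈ 1.35
```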

### Significance

A p value of .05 is most commonly selected as the cutoff level to signify the statistical significance of an odds ratio. The cutoff does not have to be .05, and there may be reasons to choose a cutoff that is more (e.g., <.01) or less (e.g., <.1) stringent. A p value of <.05 means that there is a 5% chance that the association is not a true association but purely down to chance or coincidence, or, put another way, 95% confidence that a true association exists between the two variables. Thus, if there is an association between two variables with a p value of .08, there is an 8% chance that a true association does not exist, which is generally considered unacceptably high.

The p value for the depression versus home help example was <.001. Therefore, we can be more than 99.9% sure that there is a true association between depression and home help (although we cannot assume that depression causes people to need home help).

### Confidence Interval

The confidence interval is another way of expressing the likelihood that an association truly exists. The β value is the point estimate of the odds ratio, whereby an odds ratio of 1.0 means that a one-increment change in the independent variable does not increase or decrease the probability that the dependent variable will be in the category of interest. If the 95% confidence interval includes 1.0, there is a greater than 5% chance that a true relationship between the variables does not exist. For example, a 95% confidence interval of [0.93, 3.76] includes 1.0, and therefore we cannot say with confidence that a true association exists. However, confidence intervals give more information than just statistical significance and therefore more information than p values.

The 95% confidence interval for the β value is the range within which we can be 95% confident that the true β value lies for the population of interest, based on the information from the sample. Thus, while the β value gives the point estimate of the odds ratio, and therefore an indication of how much greater or lesser the odds of the outcome are, the 95% confidence interval provides an estimate of the precision of that point estimate. In the example, the true odds ratio is likely to be somewhere between 0.93 and 3.76. This is a wide range of possible values, so the estimate of the odds is considered imprecise. And while the data do not support there being an association, it would be foolish to conclude that no association exists.

The 95% confidence interval for the odds ratio for home help with an increase in depression score was [1.12, 1.19]. That means we can be 95% confident that the true value for the population is between 1.12 and 1.19. This is only a small increase in odds, but a very precise finding thanks to the large sample size available.
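A 95% confidence interval for an odds ratio is computed on the log-odds scale and then exponentiated. The standard error below is a hypothetical value back-calculated to be roughly consistent with the reported interval; it is not a figure taken from the study:

```python
import math

b = 0.15      # coefficient from the example
se = 0.0155   # standard error: hypothetical, back-calculated from the
              # reported interval, not reported by the study itself
z = 1.96      # critical value for a 95% confidence interval

lower = math.exp(b - z * se)   # ≈ 1.13
upper = math.exp(b + z * se)   # ≈ 1.20, close to the reported [1.12, 1.19]
```

The small standard error (driven by the large sample) is what makes the interval so narrow.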

The p value and the width of the confidence interval are highly influenced by the sample size and homogeneity of the sample. In other words, if you have a very large sample with a wide spread of values, then your p value is more likely to be smaller, your confidence interval narrower, and your point estimate closer to the true value for the population. In the depression and home help study, data were available from over 6,000 people, which enabled such a precise estimate of the odds ratio.

The sample size can influence the p value and the precision of the estimate (the confidence interval) but does not influence the strength of the association (the point estimate), apart from the point estimate being more likely to fall close to the true population value with greater sample sizes.

### Assumptions and Sources of Error

There are a number of assumptions and sources of error in a logistic regression analysis that should be considered. Logistic regression can handle ordinal and nominal data as independent variables as well as continuous (interval or ratio scaled) data. Binary logistic regression requires the dependent variable to be binary. Ordinal or interval data can be reduced to a dichotomous level, but doing so loses a lot of information, which may make this test inferior to ordinal logistic regression or linear regression in these cases.

In regression analyses, it is good to have a wide range of values of the independent variable(s) in the analysis sample. If the sample includes only a small portion of the range of possible values for one or more of the independent variables, you might not get a very accurate indication of their relationship with the dependent variable. Certainly you will have limited generalizability of the results.

Models do not need to have linear relationships between the dependent and independent variables. Logistic regression can handle all sorts of relationships because it applies a nonlinear log transformation to the predicted odds ratio. The independent variables do not need to be normally distributed—although multivariate normality yields a more stable solution. Also the error terms (the residuals) do not need to be normally distributed.

As explained earlier, because logistic regression estimates the odds of the event of interest occurring given a change in the independent variable, the dependent variable must be coded accordingly for the event of interest. That is, for a binary regression, the higher factor level of the dependent variable should represent the outcome of interest.

Adding independent variables to a logistic regression model will always explain a little more of the variance in the outcome, but this does not make the model more valid: adding more and more variables makes it inefficient, and overfitting can occur. Only include as many variables as needed for your research question or hypothesis; that is, only the meaningful variables should be included. But you should try to include all meaningful variables. This requires good knowledge of the field of inquiry and careful consideration of the research question and hypothesis, and it is likely to be the most challenging part of a logistic regression analysis.

Logistic regression requires each observation to be independent, that is, that the data points should not be from any dependent samples design, such as before-after measurements, or matched pairings. The model should have little or no multicollinearity. That is, the independent variables should be pretty much independent from each other. As long as correlation coefficients among independent variables are less than .90, the assumption can be considered met. There is, however, the option to include interaction effects of categorical variables in the analysis.
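The .90 rule of thumb can be checked with a plain Pearson correlation between each pair of predictors. The two predictor series below are invented and deliberately near-collinear:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Two hypothetical predictors that track each other very closely:
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]

r = pearson_r(x1, x2)
# Under the rule of thumb in the text, |r| >= .90 flags a
# multicollinearity concern between these two predictors.
too_collinear = abs(r) >= 0.90
```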

Logistic regression assumes linearity of independent variables and log odds. Although it does not require the dependent and independent variables to be related linearly, it requires that the independent variables be linearly related to the log odds. Otherwise, the test underestimates the strength of the relationship and fails to detect the relationship (i.e., fails to reject the null hypothesis) even when a real relationship exists. A possible solution to this problem is categorization of the independent variables, that is, transforming interval variables to ordinal level before including them in the model. An example is to transform BMI values into the ordinal categories of underweight (BMI < 20), normal weight (BMI = 20–25), overweight (BMI > 25 but ≤30), and obese (BMI > 30).
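The BMI categorization just described can be sketched as a simple lookup; the handling of exact boundary values follows the ranges stated in the text:

```python
def bmi_category(bmi):
    """Collapse a continuous BMI value into the ordinal categories
    given in the text: <20, 20-25, >25 to <=30, >30."""
    if bmi < 20:
        return "underweight"
    elif bmi <= 25:
        return "normal weight"
    elif bmi <= 30:
        return "overweight"
    else:
        return "obese"
```

The ordinal categories can then enter the model as dummy variables against a chosen reference category.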

Large sample sizes are important. Maximum likelihood estimation is less powerful than ordinary least squares (used for simple and multivariable linear regression). Ordinary least squares analysis needs at least five cases per independent variable; maximum likelihood estimation needs at least 10 cases per independent variable, and some statisticians recommend at least 30 cases for each parameter to be estimated. Odds ratios are most accurate if the outcome rate in the sample closely approximates the outcome rate in the population. There should be no outliers in the data. The presence of outliers can be assessed by converting the continuous predictors to standardized scores (z scores) and removing values below −3.29 or above 3.29.
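The z-score screening rule can be sketched as follows; the data are invented, with one gross outlier appended to a run of unremarkable values:

```python
def screen_outliers(values, cutoff=3.29):
    """Return the values whose z score falls within +/-cutoff,
    per the outlier rule described in the text."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [v for v in values if abs((v - mean) / sd) <= cutoff]

# Twenty ordinary values near 5, plus one gross outlier.
data = [4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.3, 4.9, 5.1, 5.0,
        4.8, 5.2, 4.9, 5.1, 5.0, 4.9, 5.1, 5.0, 4.8, 5.2,
        100.0]
kept = screen_outliers(data)   # the 100.0 is dropped; the rest survive
```

Note that with very small samples a single extreme value can fail to reach |z| = 3.29 simply because it inflates the standard deviation, so this screen works best with reasonably large samples.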

See also Multiple Linear Regression; Odds Ratio

- Andragogy
- Bilingual Education, Research on
- College Success
- Constructivist Approach
- Cooperative Learning
- Curriculum
- Distance Learning
- Dropouts
- Evidence-Based Interventions
- Framework for Teaching
- Head Start
- Homeschooling
- Instructional Objectives
- Instructional Rounds
- Kindergarten
- Kinesthetic Learning
- Laddering
- Learning Progressions
- Learning Styles
- Learning Theories
- Literacy
- Mastery Learning
- Montessori Schools
- Out-of-School Activities
- Pygmalion Effect
- Quantitative Literacy
- Reading Comprehension
- Scaffolding
- School Leadership
- Self-Directed Learning
- Service-Learning
- Social Learning
- Socio-Emotional Learning
- STEM Education
- Waldorf Schools

- Theories and Conceptual Frameworks
- g Theory of Intelligence
- Ability–Achievement Discrepancy
- Andragogy
- Applied Behavior Analysis
- Attribution Theory
- Behaviorism
- Cattell–Horn–Carroll Theory of Intelligence
- Classical Conditioning
- Classical Test Theory
- Cognitive Neuroscience
- Constructivist Approach
- Data-Driven Decision Making
- Debriefing
- Educational Psychology
- Educational Research, History of
- Emotional Intelligence
- Epistemologies, Teacher and Student
- Experimental Phonetics
- Feedback Intervention Theory
- Framework for Teaching
- Generalizability Theory
- Grounded Theory
- Improvement Science Research
- Information Processing Theory
- Instructional Theory
- Item Response Theory
- Learning Progressions
- Learning Styles
- Learning Theories
- Mastery Learning
- Multiple Intelligences, Theory of
- Naturalistic Inquiry
- Operant Conditioning
- Paradigm Shift
- Phenomenology
- Positivism
- Postpositivism
- Pragmatic Paradigm
- Premack Principle
- Punishment
- Reinforcement
- Response to Intervention
- School-Wide Positive Behavior Support
- Scientific Method
- Self-Directed Learning
- Social Cognitive Theory
- Social Learning
- Socio-Emotional Learning
- Speech-Language Pathology
- Terman Study of the Gifted
- Transformative Paradigm
- Triarchic Theory of Intelligence
- True Score
- Unitary View of Validity
- Universal Design in Education
- Wicked Problems
- Zone of Proximal Development

- Threats to Research Validity