Validity
Writing a general statement about validity in evaluation is a hazardous business. The field of evaluation is so diverse and complex, and has such an array of models, approaches, forms, and disciplinary homes, that generalizing about it is a potentially foolhardy enterprise. Moreover, this is historically disputed territory. Validity is related to truth. They are members of the same family, so to speak. Truth, however you cut it, is an essentially contested concept, with little agreement about what constitutes the right basis for truth-seeking activities. Of course, much has already been said about validity. There are various classic explications of validity from some of evaluation's most notable theorists: Donald Campbell, Thomas Cook, Lee J. Cronbach, Egon Guba, Ernest House, Yvonna Lincoln, Michael Patton, Michael Scriven, and Robert Stake, to name a few.
Much of the justification for doing evaluation is that it can at least offer approximations to the truth and help discriminate between good and bad, better and worse, desirable and less desirable courses of action. It is not surprising, therefore, that one of the defining problems of the field of evaluation remains the validation of evaluative judgments. There are three main issues that have beset discussions about validity in evaluation. The first issue has to do with the nature and importance of generalization and the ways in which evaluation can and should support social decision making. In turn, this issue depends on the assumptions made about the objects of evaluation (practices, projects, programs, and policies), how they are theorized, and the political context of evaluation. The second issue is the extent to which nonmethodological considerations, such as fairness, social responsibility, and social consequence, should inform discussions about validity. The third issue, and seemingly the most intractable, is the extent to which it is possible to have a unified conception of validity in evaluation. Given the methodological and substantive diversity that now characterizes the transdisciplinary field of evaluation, is it possible to have common standards for and a shared discourse about validity? This issue is acutely felt in debates about whether the traditional discourse of scientific validity in quantitative research is relevant to qualitative approaches to evaluation.
The publication in 1963 of Donald Campbell and Julian Stanley's chapter “Experimental and Quasi-Experimental Designs for Research on Teaching” is probably the single most significant landmark in the conceptualization of validity. This chapter and Campbell's later work with Thomas Cook, Quasi-Experimentation: Design and Analysis Issues for Field Settings, published in 1979, were the touchstones for most, if not all, discussions about validity in evaluation and in applied research more generally. Campbell and his colleagues introduced a nomenclature and framework for thinking about validity and for constructing the research conditions needed to probe causal relationships. Central to this experimental tradition has been the development of strategies (designs) for controlling error and bias and for eliminating plausible rival hypotheses or explanations.
Discussions about validity in the experimental tradition have resulted in a shared language about the threats to validity associated with different research designs and types of validity (e.g., internal validity, external validity, statistical conclusion validity, construct validity). The distinction between internal and external validity has been of particular importance in discussions about validity. Internal validity usually refers to the validity of inference from and confined to a particular study or domain of investigation. It is about the validity of statements or judgments about the case or cases under investigation. It addresses the question: Did the treatment make a difference in this experimental instance? By contrast, external validity refers to whether inferences or judgments from a study or domain of investigation apply to other populations, settings, or times. It is about whether findings generalize. In classic theories of experimental design, internal validity was taken to be the sine qua non of external validity. This was because establishing internal validity was regarded as the basic minimum for the interpretability of experiments. Thus generalization depended first and foremost on establishing that the findings of the study were true for practical and methodological purposes. The codification of various threats to validity has been central to the experimental tradition and has proved useful in the evaluation of social and educational intervention programs. Threats to validity indicate some of the prototypical rival hypotheses or alternative explanations for whether the program is in fact responsible for changes in outcomes and whether it will generalize. A list of threats to validity would often include the
...