Validity Theory
The concept of validity is one of the most influential concepts in science because considerations about its nature and scope influence everything from the design to the implementation and application of scientific research. Validity is not an abstract property of any observable, unobservable, or conceptual phenomenon, such as a measurement instrument, a personality trait, or a study design. Rather, validity is a characteristic of the inferences that are drawn about phenomena by human agents and the actions that result from these inferences. Moreover, validity is always a matter of degree, never an absolute. This stems partly from the fact that validity is not an observable characteristic of inferences and actions but something that must itself be inferred.
The evaluation of the degree to which inferences are valid and resulting actions are justifiable is, therefore, necessarily embedded in a social discourse whose participants typically bring to the table diverse frameworks, assumptions, beliefs, and values about what constitutes credible evidence. Specifically, modern frameworks for validity typically list both rational and empirical pieces of evidence as necessary, but in each individual context, what these pieces should look like is open to debate. Put differently, a coherent statement about the validity of inferences and actions requires negotiation as well as consensus and places multiple responsibilities on the stakeholders who develop such a statement.
Negotiating Validity
A metaphor may illustrate complications that can arise in a discourse about validity. If an educational assessment is viewed as the construction of a house, inferences are markers of the utility of the house. In this sense, an evaluation of the validity of inferences can be viewed as an evaluation of the degree to which the house provides structural support for the purposes that are envisioned for it. Obviously, the parties who are envisioning a certain use of the house are not necessarily the same as the designers or builders of the house, and so discrepancies can arise easily. Of course, other reasons for a mismatch are possible and could stem from a miscommunication between the designers of the house and the users of the house or from a faulty implementation of the design plans for the house. In a sense, the search for inferences that can be supported can be viewed as the search for how a house can be transformed into a home.
In general, the stakeholders in an assessment can be coarsely viewed as belonging to four complementary groups. First, there are the test developers, who create a research program, a framework, or an instrument under multiple considerations, such as theoretical adequacy and feasibility for practical implementation. Second are the examinees, whose needs in the process are typically more practical and may differ quite substantially from those of the other stakeholders involved. Third are the test users, or the decision makers who utilize the scores and diagnostic information from the assessment to make decisions about the examinees; only rarely are the examinees the only decision makers involved. Fourth are the larger scientific and nonscientific communities to which the results of an assessment program are to be communicated and whose needs are a mélange of those of the test developers, the test users, and the examinees. Therefore, determining the degree to which inferences and actions are justifiable is situated in the communicative space among these different stakeholders.
Not surprisingly, examples of problems in determining the validity of inferences abound. For example, the inferences that test users may want to draw from a certain assessment administered to a certain population may be more commensurate with an alternative assessment for a slightly different population. However, that is not a faulty characteristic of the assessment itself. Rather, it highlights the difference between the agents who make inferences and the agents who provide a foundation for a certain set of inferences, of which the desired inferences may not be a member.
Historical Developments of Validity Theories
Until well into the 1970s, validity theory presented itself as the coexistent, though largely unrelated, trinity of criterion-based, content-based, and construct-based conceptions of validity. According to the criterion-based approach, the validity of an assessment could be evaluated in terms of the accuracy with which a test score could predict or estimate the value of a defined criterion measure, usually an observable performance measure. The criterion-based model, notably introduced by Edward L. Thorndike at the beginning of the 20th century, owed much of its lingering popularity to its indisputable usefulness in many applied contexts that involve selection decisions or prognostic judgments, such as hiring and placement decisions in the workplace or medical and forensic prognoses. Depending on whether the criterion is assessed at the same time as the test or at a subsequent time, one can distinguish between concurrent and predictive validity, respectively. Though a number of sophisticated analytical and statistical techniques have been developed to evaluate the criterion validity of test scores, the standard methods applied were simple regression and correlation analyses. The resulting coefficient was labeled the validity coefficient. Occasionally, these procedures were supplemented by the known-groups method. This approach bases validity statements on a comparison of mean test scores between groups with hypothesized extreme values (e.g., devout churchgoers and members of sex-chat forums on the Internet on a newly developed sexual permissiveness scale).
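The correlational core of the criterion-based model is easy to sketch. The following Python snippet (with hypothetical data; the function names and numbers are illustrative, not from any published study) computes a validity coefficient as the Pearson correlation between test scores and a criterion, plus a simple known-groups contrast:

```python
import statistics

def validity_coefficient(test_scores, criterion):
    # Pearson correlation between test scores and a criterion measure:
    # the single summary number the criterion-based model treated as
    # an index of (concurrent or predictive) validity.
    mx = statistics.fmean(test_scores)
    my = statistics.fmean(criterion)
    sxy = sum((x - mx) * (y - my) for x, y in zip(test_scores, criterion))
    sxx = sum((x - mx) ** 2 for x in test_scores)
    syy = sum((y - my) ** 2 for y in criterion)
    return sxy / (sxx * syy) ** 0.5

def known_groups_gap(high_group, low_group):
    # Known-groups method: compare mean scores of groups hypothesized to
    # sit at opposite extremes of the construct; a large gap in the
    # predicted direction counts as supportive validity evidence.
    return statistics.fmean(high_group) - statistics.fmean(low_group)

# Hypothetical data: selection-test scores and later performance ratings.
test = [52, 61, 70, 75, 80, 88]
performance = [2.9, 3.1, 3.4, 3.3, 3.8, 4.0]
r = validity_coefficient(test, performance)

# Hypothetical permissiveness-scale means for two extreme groups.
gap = known_groups_gap([4.1, 4.5, 3.9], [1.2, 1.8, 1.5])
```

In practice a significance test and a confidence interval would accompany the coefficient; the point here is only that the entire criterion-based verdict rests on one correlation.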
The content-based model of validity comes into play when a well-defined and undisputed criterion measure is not readily available, especially when the prediction is targeted at a broader and multifaceted criterion (e.g., achievement in a content area like mathematics). An argument for content validity is usually established through a panel of experts, who evaluate the test content in terms of (a) relevance and (b) representativeness for the content area under scrutiny. Not surprisingly, the vagueness and subjectivity of the evaluation process have led many psychometricians to discount the content-based model as satisfying face-validity requirements at best. However, modern proponents of the content-based model have applied a wealth of sophisticated quantitative procedures to ensure and evaluate interrater agreement, thereby trying to lend credibility to otherwise qualitative and judgment-based validity evidence.
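One classic way to quantify such panel judgments is Lawshe's content validity ratio (CVR), which condenses expert relevance ratings for a single item into one number; a minimal sketch (the panel counts below are hypothetical):

```python
def content_validity_ratio(n_essential, n_panelists):
    # Lawshe's CVR for one item: (n_e - N/2) / (N/2), where n_e is the
    # number of panelists rating the item "essential" and N is the panel
    # size. It runs from -1 (nobody) to +1 (everybody rates the item
    # essential); values near 0 indicate a split panel.
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 experts rating three mathematics items.
cvrs = [content_validity_ratio(n, 10) for n in (10, 8, 5)]
# -> [1.0, 0.6, 0.0]
```

An index like this does not remove the subjectivity of the ratings themselves; it only makes the degree of agreement explicit and comparable across items.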
Shortcomings of the criterion-based and the content-based models of validity incited the American Psychological Association to set forth technical recommendations for justifying interpretations of psychological tests. As a result of this endeavor, the term construct validity was coined and later elaborated by Lee J. Cronbach and Paul Meehl. In the beginning, they tied their validity theory closely to a more general and abstract nomological network, which was described in 1952 by Carl G. Hempel in his classic essay Fundamentals of Concept Formation in Empirical Science. Metaphorically and graphically, the constructs are represented by knots, and the threads connecting these knots represent the definitions and hypotheses included in the theory. The whole system, figuratively speaking, “floats” above the plane of observation and is connected to it by “strings,” or rules of interpretation. The complex system of theoretical definitions can be used to formulate theoretical hypotheses, which can, in turn, be used to formulate empirical hypotheses about relationships among observable variables. In this framework, validity is not a characteristic of a construct or its observed counterpart but of the interpretation of defined logical relations of a causal nature that function to semantically circumscribe a theoretical network of constructs and construct relations.
An obvious epistemological problem arises, however, when the observed relationships are inconsistent with theory, which is exacerbated by the dearth of developed formal theories in many psychological and social science domains. This lack of strong theory led Cronbach to coin the phrases “weak program” and “strong program” of construct validity. He cautioned that, without solid theory (i.e., with only a weak program of construct validity), every correlation of the construct under development with any other observed attribute or variable could be accepted as validity evidence. Consequently, in the absence of any coordinated argument, validation research would then resemble an empirical shotgun procedure more than a scientific program.
Such problems notwithstanding, by the 1980s, the notion of construct validity became accepted as the basis for a new framework of validity assessment that is characterized by its unifying nature. The unifying aspect stems primarily from the acknowledgment that interpretive elements like assumptions and value judgments are pervasive when measuring psychological entities and, thus, are unavoidable in any discourse about any aspect of validity. As Samuel Messick, the most prominent proponent of a unified theory of validity, has framed it, “The validation process is scientific as well as rhetorical and requires both evidence and argument.”
The most controversial aspect of the unified concept of validity as developed by Messick pertains to the role of consequences in the validation process. In this view, a validity argument must specifically consider and evaluate the social consequences of test interpretation and test use, which are describable only on the basis of social values. Importantly, his notion of social consequences does not refer merely to test misuse but, specifically, to the unanticipated consequences of legitimate test score interpretation and use. A number of critics reject his idea that evidential and consequential aspects of construct validity cannot be separated, but despite this debate and recent clarifications on the meaning of the consequential aspects, the question of value justification within a unified validity approach persists.
Philosophical Challenges
The unification of validity theory under a constructivist paradigm has challenged the prevailing implicit and explicit philosophical realism that many applied social scientists had hitherto followed in their practical measurement endeavors. In philosophical realism, a test's task was to accurately measure an existing entity and not to question whether such an entity existed in the first place (an ontological question) or whether it could be assessed at all (an epistemological question). In the constructivist view, it is not a test that is validated but its interpretation (i.e., the inferences that are drawn from a test score). Therefore, it is insufficient to operationalize validity through a single validity coefficient. Rather, validation takes the form of an open-ended argument that evaluates the overall plausibility of the proposed test score interpretations across multiple facets. Currently, the strengthening of cognitive psychology principles in construct validation as described by Susan Embretson and Joanna Gorin, for example, appears to be one of the most promising avenues for developing validity theory toward a more substantive theory that can truly blend theoretical models with empirical observations. Models rooted in cognitive psychology enable one to disentangle and understand the processes that respondents engage in when they react to test items and to highlight the test instrument as an intervention that can be used to search for causal explanations, an argument that was developed recently in detail by Borsboom, Mellenbergh, and van Heerden.
Perspectives for the Future
To comprehensively develop a unified theory of validity in the social sciences, much more must be accomplished besides a synthesis of the evidential and consequential bases of test interpretation and use. In particular, a truly unified theory of validity would be one that crosses methodological boundaries and builds on the foundations that exist in other disciplines and subdisciplines. Most prominently, consider the threats-to-validity approach for generalized causal inferences from experimental and quasi-experimental designs, the closely related validity generalization approach by virtue of meta-analytical techniques, and the long tradition of validity concepts in qualitative research. In the end, it may be best to acknowledge that validity itself is a complex construct that also needs to be validated every once in a while.