Face validity is a test of internal validity. As the name implies, it asks a very simple question: “On the face of things, do the investigators reach the correct conclusions?” It requires investigators to step outside of their current research context and assess their observations from a commonsense perspective. A typical application of face validity occurs when researchers obtain assessments from individuals who are, or will be, directly affected by programs premised on their research findings. For example, a proposed new patient tracking system can be tested for face validity by asking the local community health care providers who will be responsible for implementing the program how they think it may work in their centers.
What follows is a brief discussion on how face validity fits within the overall context of validity tests. Afterward, documentation of face validity's history is reviewed. Here, early criticisms of face validity are addressed that set the stage for how and why the test returned as a valued assessment. This discussion of face validity concludes with some recent applications of the test.
To better understand the value and application of face validity, it is necessary to first set the stage for what validity is. Validity is commonly defined as a question: “To what extent do the research conclusions provide the correct answer?” In testing the validity of research conclusions, one looks at the relationship of the purpose and context of the research project to the research conclusions. Validity is determined by testing research observations against what is already known in the world, giving the phenomenon that researchers are analyzing the chance to prove them wrong. All tests of validity are context-specific and are not an absolute assessment. Tests of validity are divided into two broad realms: external validity and internal validity. Questions of external validity look at the generalizability of research conclusions. In this case, observations generated in a research project are assessed on their relevance to other, similar situations. Face validity falls within the realm of internal validity assessments. A test of internal validity asks if the researcher draws the correct conclusion based on the available data. These types of assessments look into the nuts and bolts of an investigation (for example, looking for sampling error or researcher bias) to see if the research project was legitimate.
For all of its simplicity, the test for face validity has had a dramatic past and has only recently re-emerged as a valued and respected test of validity. In its early applications, face validity was used by researchers as a first-step assessment, in concert with other tests, to assess the validity of an analysis. During the 1940s and 1950s, face validity was used by psychologists when they were in the early stages of developing tests for use in selecting industrial and military personnel. It was soon widely used by many different types of researchers in different types of investigations, resulting in confusion over what actually constituted face validity. That confusion quickly led researchers in the 1960s to reject face validity in favor of new and more complex tests of validity.
Discussions surrounding face validity were revived in 1985 by Baruch Nevo's seminal article “Face Validity Revisited,” which focused on clearing up some of the confusion surrounding the test and challenging researchers to take another, more serious look at face validity's applications. Building on Nevo's research, three questions can be distinguished in the research validity literature that have temporarily prevented face validity from getting established as a legitimate test of validity (see Table 1).
The first question regarding face validity concerns the legitimacy of the test itself. Detractors argue that face validity is insignificant because its observations are not based on any verifiable testing procedure and therefore yield only rudimentary observations about a study. They point out that face validity does not require a systematic method for obtaining its observations, and they conclude that the only use for such observations is in public relations statements.
Advocates for face validity see that face validity provides researchers with the opportunity for commonsense testing of research results: “After the investigation is completed and all the tests of validity and reliability are done, does this study make sense?” Here, tests of face validity allow investigators a new way to look at their conclusions to make sure they see the forest for the trees, with the forest being common sense and the trees being all of the different tests of validity used in documenting the veracity of their study.
The second question confuses the value of face validity by blurring the applications of face validity with content validity. The logic here is that both tests of validity are concerned with content and the representativeness of the study. Content validity is the extent to which the items identified in the study reflect the domain of the concept being measured. On this view, because content validity and face validity both look at the degree to which the intended range of meanings in the concepts of the study appear to be covered, a study that has content validity will automatically have face validity, and after testing for content validity there is no real need to test for face validity.
The other side to this observation is that content validity should not be confused with face validity because they are completely different tests. The two tests of validity are looking at different parts of the research project. Content validity is concerned with the relevance of the identified research variables within a proposed research project, whereas face validity is concerned with the relevance of the overall completed study. Face validity looks at the overall commonsense assessment of a study. In addition to the differences between the two tests of validity in terms of what they assess, other researchers have identified a sequential distinction between content validity and face validity. Content validity is a test that should be conducted before the data-gathering stage of the research project is started, whereas face validity should be applied after the investigation is carried out. The sequential application of the two tests is intuitively logical because content validity focuses on the appropriateness of the identified research items before the investigation has started, whereas face validity is concerned with the overall relevance of the research findings after the study has been completed.
The third question surrounding face validity is procedural: Who is qualified to provide face validity observations, experts or laypersons? Proponents of the “experts-only” approach to face validity believe that experts who have substantive knowledge about a research topic and a good technical understanding of tests of validity provide constructive insights from outside of the research project. In this application of face validity, experts provide observations that can help in the development and/or fine-tuning of research projects. On this view, laypersons lack technical research skills and can provide only impressionistic face validity observations, which are of little use to investigators.
Most researchers now see that the use of experts in face validity assessments is more accurately understood as being a test of content validity because experts provide their observations at the start or middle of a research project, whereas face validity focuses on assessing the relevance of research conclusions. Again, content validity should be understood sequentially in relation to face validity. Content validity is used in the earlier parts of the investigation to garner observations from other experts in the field on the relevance of research variables, whereas face validity observations should come from laypersons, who give their commonsense assessment at the completion of the research project.
The large-scale vista that defines face validity, and that defines the contribution this assessment makes to the research community, is also its Achilles heel. Face validity lacks the depth, precision, and rigor of inquiry that comes with both internal and external validity tests. For example, in assessing the external validity of a survey research project, one can precisely look at the study's sample size to determine if it has a representative sample of the population. The only question face validity has for a survey research project is a simple one: “Does the study make sense?” For this reason, face validity can never be a stand-alone test of validity.
The renewed interest in face validity is part of the growing research practice of integrating laypersons’ nontechnical, one-of-a-kind insights into the evaluation of applied research projects. Commonly known as obtaining an emic viewpoint, testing for face validity provides the investigator the opportunity to learn what many different people affected by a proposed program already know about a particular topic. The goal in this application of face validity is to include the experiential perspectives of people affected by research projects in their assessment of what causes events to happen, what the effects of the study in the community may be, and what specific words or events mean in the community.
The following examples show how researchers use face validity assessments in very different contexts while sharing the same goal: obtaining a commonsense assessment from persons affected by research conclusions. Michael Quinn Patton is widely recognized for his use of “internal evaluators” to generate face validity observations in the evaluation of programs. In the Hazelden Foundation of Minnesota case study, he describes his work in providing annual evaluations based on the foundation's data tracking clients who go through its program. At the completion of the annual evaluation, a team of foundation insider evaluators then participates in the evaluation by assessing the data and conclusions made in the reports.
Face validity assessments are commonly used in applied research projects in fields that include community development, planning, public policy, and macro social work. In planning, face validity observations are obtained during scheduled public hearings throughout the planning process. The majority of planning research is based on artificial constructs of reality that allow planners to understand complex, multivariable problems (e.g., rush-hour traffic). One of the reasons that planners incorporate citizen input into the planning process is that it allows them to discover the community's “inside perspective” on how the planners' research and proposed plans may affect residents' day-to-day lives. A street-widening project in Lincoln, Nebraska, is one example of how a city used face validity in its planning process. A central traffic corridor was starting to experience higher levels of rush-hour congestion as the result of recent growth on the city's edge. Knowing that simply widening the street to accommodate more vehicles could affect area businesses adversely, city planners met with local store owners to get their face validity observations of how the street affected their daily operations. Armed with traffic data and the face validity observations of local store owners, the city was able to plan a wider street that took into account both commuters’ and area businesses’ experiences with the street.