
Four-Level Evaluation Model

The four-level model of training evaluation criteria is a classic framework originally developed by Donald Kirkpatrick for the evaluation of organizational training and subsequently extended to many other educational contexts. The four levels in the model are reaction, learning, behavior, and results criteria. Each subsequent level provides increasingly valuable information for evaluating training effectiveness, yet the difficulty of obtaining that information also increases with each level. In addition to its traditional use in business and not-for-profit organizations, Kirkpatrick’s model has been applied to understanding educational effectiveness in schools, colleges, universities, and other contexts such as camps. This entry describes the model and the methodological recommendations for evaluation at each level, its applications and modifications in organizational contexts, and its applications to various educational contexts.

Kirkpatrick’s Four-Level Model of Training Evaluation

Kirkpatrick’s model for evaluating training programs was developed to give trainers and organizations actionable information for deciding whether to continue offering a particular training program and how to improve future programs. It is also used to validate the work of training professionals and of the organizational training enterprise overall. The four levels allow for comprehensive evaluation of training, and it is recommended to proceed through all four without skipping any, because each level provides uniquely valuable information. The following are the four levels in Kirkpatrick’s model:

  • reaction criteria, or the participant’s feelings regarding the training;
  • learning criteria, or the participant’s knowledge and understanding of training content;
  • behavior criteria, or change in the participant’s behavior, sometimes referred to as transfer of learning; and
  • results criteria, or intended outcomes such as increased productivity.

Reaction and learning criteria focus on what occurs within the training program and thus are considered internal criteria. Behavior and results criteria focus on changes that occur outside and typically after the program and are seen as external criteria. Kirkpatrick noted that evaluation becomes more important and meaningful as it progresses from the reaction level to the results level, but at the same time, it becomes more difficult, complicated, and expensive. When the difficulty and cost of evaluating behavior and results tempt professionals to rely solely on the less complex reaction and learning levels, it is worth remembering that the behavior and results levels yield unique information.

Reaction Criteria

Reaction criteria are trainees’ perceptions of training. Kirkpatrick defined evaluation of reactions as evaluation of trainees’ feelings about whether they liked the training program. A later modification to the model, proposed by George Alliger and his colleagues, distinguishes within the reaction criteria between trainees’ enjoyment of the training (affective reactions) and their perceived amount of learning (utility judgments). Suggestions for measuring reactions include clearly defining the goal of measurement and carefully aligning specific questions with that goal, developing standards for evaluation, obtaining quantifiable data, ensuring honest responses through participant anonymity, striving for a 100% response rate, and providing an opportunity for additional qualitative comments.

Kirkpatrick cautioned that although reaction criteria provide valuable information and are important to measure well, to use, and to communicate, a positive reaction is not in itself an indication of learning. It is even less of an indication of behavioral change or results attributable to training. A meta-analysis of the relationship between reaction criteria and the other levels of evaluation found no association between affective reactions and the other levels and only a very weak relationship between utility judgments and the other levels. Similarly, in educational contexts, student evaluations of teaching and self-perceptions of learning are found to be weakly (and in some studies negatively) related to objectively measured learning. Heavy reliance on reaction criteria in the evaluation of teaching may even lead to diminished use of teaching methods that benefit long-term learning and transfer of learning, such as facilitating desirable difficulties or varying learning conditions, in favor of approaches that elicit positive reactions in the short term yet do not result in lasting learning.

...
