
Intercoder Reliability Techniques: Cohen’s Kappa

Cohen’s kappa (κ) is a classic technique for measuring the level of consistency between two raters. This entry discusses measuring intercoder reliability using κ and presents two approaches for characterizing κ.

Measuring Intercoder Reliability

Suppose a researcher is investigating the extent to which a particular news company produces reports favoring a certain presidential candidate. The investigator would first generate a sampling frame (e.g., news aired or published within a certain time period) and then randomly select a predetermined number of news articles for analysis. The researcher would finally create a coding protocol whereby each unit of analysis (e.g., word, sentence, paragraph, whole article) can be judged as to whether it contains elements conveying favorable attitudes toward the candidate (e.g., the presence or absence of positive adjectives describing his or her political campaign, past achievements, or family, or support for or denouncement of the policies he or she has pledged to adopt or retract when elected).
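As a rough sketch of the sampling step only, the random draw from the sampling frame might look like the following Python snippet; the frame contents, identifiers, and sample size are all hypothetical, not taken from the entry.

import random

random.seed(42)  # fixed seed so the illustrative draw is reproducible

# Hypothetical sampling frame: identifiers for every article the news
# company published within the chosen time period.
sampling_frame = [f"article_{i:04d}" for i in range(1, 1001)]

# Randomly select a predetermined number of articles for analysis.
analysis_sample = random.sample(sampling_frame, k=100)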

The remaining task is to assess all the selected articles, say, sentence by sentence, coding each sentence as 1 when it contains any sign of favoritism and 0 when such semantic cues are absent. The task would be relatively easy if the investigator performed the coding personally, following the protocol he or she created. The problem, however, is that the investigator likely has at least a rough expectation of how the media company has been portraying the candidate, and such an experience- or theory-based hunch can readily bias the manner in which the investigator codes the content. A sound alternative is to invite two coders blind to the prediction, train them to become familiar with the coding protocol, have them independently code a common portion of the sample, and assess the extent to which their independent results correspond to one another. The level of correspondence between the two coders is usually termed intercoder or interrater reliability.
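In its simplest form, that correspondence can be summarized as the proportion of units on which the two coders' codes match. The Python sketch below uses illustrative binary codes; note that raw percent agreement does not correct for agreement occurring by chance, which is the problem κ addresses in the next section.

# Two coders' independent binary codes for the same eight sentences:
# 1 = favoritism present, 0 = absent (values are illustrative).
coder1 = [1, 0, 0, 1, 1, 0, 1, 0]
coder2 = [1, 0, 1, 1, 0, 0, 1, 0]

# Proportion of sentences on which the coders agree.
matches = sum(c1 == c2 for c1, c2 in zip(coder1, coder2))
percent_agreement = matches / len(coder1)
print(f"Percent agreement: {percent_agreement:.2f}")  # 0.75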

To the extent that the obtained intercoder reliability exceeds a conventional criterion (this may require multiple rounds of training sessions), the researcher can confidently assume that the coders are quantifying the data in a consistent manner and can then have each coder independently code half of the remaining sample. This procedure provides the researcher with a strong rebuttal to possible doubts about the quality of the data, which may arise when a researcher conducts the coding alone. It also sustains the internal validity of the research findings, the ultimate goal of almost every scientific activity.

Characterizing κ

The Classic Approach

Cohen’s kappa (κ) is a statistical technique devised to estimate the level of intercoder reliability. κ estimates the level of intercoder consistency as follows:

κ = (P(A) − P(E)) / (1 − P(E)),

where P(A) is the observed proportion of agreement between the two coders and P(E) is the proportion of agreement expected by chance alone.
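Given these definitions, κ can be computed directly from two coders' parallel lists of codes. The sketch below is one straightforward implementation; the function name and data are ours, for illustration only.

from collections import Counter

def cohens_kappa(codes1, codes2):
    """Cohen's kappa for two coders' parallel lists of category codes."""
    n = len(codes1)
    # P(A): observed proportion of agreement.
    p_a = sum(a == b for a, b in zip(codes1, codes2)) / n
    # P(E): chance agreement, from each coder's marginal category rates.
    freq1, freq2 = Counter(codes1), Counter(codes2)
    p_e = sum((freq1[c] / n) * (freq2[c] / n)
              for c in freq1.keys() | freq2.keys())
    return (p_a - p_e) / (1 - p_e)

coder1 = [1, 0, 0, 1, 1, 0, 1, 0]
coder2 = [1, 0, 1, 1, 0, 0, 1, 0]
print(cohens_kappa(coder1, coder2))  # P(A) = .75, P(E) = .50, so κ = 0.5

For production work, scikit-learn's sklearn.metrics.cohen_kappa_score provides an equivalent, well-tested computation.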

The characteristics of κ can be elucidated by returning to the previous example. The coding results can be summarized in a 2 × 2 contingency table, as shown in Table 1. The values in each cell represent relative frequencies, computed to vary between .00 (0%) and 1.00 (100%). So, for example, cell (a) indicates a 40% intercoder agreement achieved by both coders judging that the unit contains no favoritism. Likewise, cell (b) shows that 5% of the time Coder 1 judged the material to be written in favor of the candidate whereas Coder 2 found no such cues there. The far right-hand column shows the row totals, and the bottom row shows the column totals. All the coded results sum to 1.00, that is, 100% of the sample data.
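Because this excerpt gives only cells (a) = .40 and (b) = .05, the remaining two cells in the worked example below are assumed values chosen purely to complete the arithmetic. With a full table, P(A) is the sum of the diagonal (agreement) cells and P(E) is computed from the marginal totals:

# 2 x 2 table of relative frequencies.
a = 0.40  # both coders: no favoritism (from the text)
b = 0.05  # Coder 1: favoritism, Coder 2: none (from the text)
c = 0.05  # Coder 1: none, Coder 2: favoritism (assumed)
d = 0.50  # both coders: favoritism (assumed)

p_a = a + d  # observed agreement: the diagonal cells, here 0.90

# Chance agreement: product of each coder's matching marginal totals.
p_e = (a + c) * (a + b) + (b + d) * (c + d)  # .45*.45 + .55*.55 = .505

kappa = (p_a - p_e) / (1 - p_e)
print(round(kappa, 3))  # 0.798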

...
