Stem-and-Leaf Plot

The stem-and-leaf plot was developed by John Tukey and is used for continuous data during exploratory data analysis. It gives a detailed description of the distribution of the data and insight into their nature. It is more informative than a simple tally of the numbers or a histogram because it retains the individual data points. From the information in a stem-and-leaf plot, the mean, median, mode, range, and percentiles can all be determined. In addition, when the plot is turned on its side, one is looking at a histogram of the data, which gives an idea of how the data are distributed: for example, whether they appear to follow a normal curve or whether they are positively or negatively skewed. The plot can also point to unusual observations that may be real or the result of reporting or data entry errors.
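As a minimal sketch of how summary measures can be read back from a plot, the Python snippet below rebuilds the values from stems and leaves and then computes the median, mode, and range. The data and the names `plot` and `values_from_plot` are invented for illustration and are not the values in Table 1.

```python
from statistics import median, multimode

# Hypothetical plot: each stem (tens digit) maps to its ordered leaves (ones digits).
plot = {
    2: [3, 7],
    3: [1, 4, 4, 8],
    4: [0, 2, 5, 5, 5, 9],
    5: [1, 6],
    6: [2],
}

def values_from_plot(plot):
    """Rebuild the original whole numbers: value = 10 * stem + leaf."""
    return [10 * stem + leaf for stem in sorted(plot) for leaf in plot[stem]]

values = values_from_plot(plot)
summary = {
    "median": median(values),
    "modes": multimode(values),  # list, because several values may tie
    "range": max(values) - min(values),
}

# Leaf counts per stem are the bar heights of the histogram seen
# when the plot is turned on its side.
bar_heights = {stem: len(leaves) for stem, leaves in plot.items()}

print(summary)
print(bar_heights)
```

Because every leaf is kept, nothing is lost in the round trip from plot to data, which is what separates this display from a histogram of the same values.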

To make a stem-and-leaf plot by hand, the data should first be ordered and then divided into categories or groups. If no logical categories exist for the data, a rough guide for the number of stems to use in the plot is two times the square root of the number of data points. When a statistical program is used to create a stem-and-leaf plot, the number of categories is determined by the software. The leaf is generally the last digit of the number, and the stem includes the digits that come before the leaf. For example, if the data are whole numbers ranging from 20 to 75, the categories may be from 20 to 24, from 25 to 29, from 30 to 34, and so on, so that each stem is split across two rows. The stem is the digit in the tens position, valued from 2 to 7, and the leaf is the digit in the ones position, valued from 0 to 9. It is useful, especially when there is a place filler as in the example shown here, to add a key so that it is clear what number each stem and leaf represents. See Table 1 for an illustration of a stem-and-leaf plot. By simply looking at this plot, it can be seen that the mode is 60 (the most frequent value) and the median, boldface in the table, is 59 (21 of the values fall above this number, and 21 of the values fall below this number).
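The construction described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes non-negative whole numbers, uses one row per stem rather than the five-unit categories in the example, and the function names and data are hypothetical (the data are not those of Table 1).

```python
import math
from collections import defaultdict

def suggested_stem_count(n):
    """Rough guide from the text: about two times the square root of n."""
    return round(2 * math.sqrt(n))

def stem_and_leaf(data):
    """Return display lines for non-negative whole numbers.

    The leaf is the last digit; the stem is whatever digits come before it.
    """
    groups = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data, not the values in Table 1.
data = [23, 27, 31, 34, 34, 38, 40, 42, 45, 45, 45, 49, 51, 56, 62, 75]
lines = stem_and_leaf(data)
print("\n".join(lines))
print("Key: 2 | 3 means 23")
print("Suggested number of stems:", suggested_stem_count(len(data)))
```

Printing the key alongside the plot removes any doubt about the place value of the stems and leaves.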

The stem-and-leaf plot can also be used to compare data sets by using a side-by-side stem-and-leaf plot. In this case, the same stems are used for both data sets. The leaves of one data set are placed to the right of the stems, and the leaves of the other are placed to the left. With the leaves side by side like this, the distributions, the data ranges, and where the data points fall can be compared directly.
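A side-by-side layout can be sketched as follows. The helper names and the two data sets are hypothetical, and the choice to reverse the left-hand leaves (so that the smallest leaf sits next to the stem on both sides) is one common convention, not something the text prescribes.

```python
from collections import defaultdict

def leaves_by_stem(data):
    """Group last digits (leaves) under the preceding digits (stems)."""
    groups = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return groups

def side_by_side(left_data, right_data):
    """Return lines with the shared stems in the middle column."""
    left, right = leaves_by_stem(left_data), leaves_by_stem(right_data)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    # Pad the left column to the longest left-hand row so stems line up.
    width = max(len(leaves) for leaves in left.values())
    lines = []
    for stem in stems:
        l = "".join(str(d) for d in reversed(left.get(stem, [])))
        r = "".join(str(d) for d in right.get(stem, []))
        lines.append(f"{l:>{width}} | {stem} | {r}".rstrip())
    return lines

# Hypothetical data for two groups.
group_a = [21, 25, 33, 34, 47]
group_b = [32, 38, 41, 44, 44, 52]
lines = side_by_side(group_a, group_b)
print("\n".join(lines))
```

Sharing one column of stems is what makes the comparison direct: a stem with leaves on only one side shows immediately where the two ranges differ.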

The stem-and-leaf plot becomes more difficult to create by hand as the amount of data increases. With large amounts of data, the benefit of being able to see individual data values also decreases, because it becomes increasingly difficult to determine summary measures such as the median from the plot, and reading off such measures is part of the value of using a stem-and-leaf plot. A histogram or a box-and-whisker plot may be a better option for visually summarizing the data when the data set is large.

...
