Cluster analysis aims to group a set of items, objects, or behaviors so that the members of each group are similar to one another. The outcome of a cluster analysis is the set of associations that exist within and between the groupings the analysis produces. The technique relies on multivariate statistics to generate associations from the available inputs. The choice of input variables is therefore important, both in providing the basis for similarity within a grouping and in differentiating between groupings.

In communication research, suppose the goal is to cluster public speakers on the basis of various behaviors exhibited during a presentation. To do so, a variety of behaviors are rated (e.g., speaking rate in words per minute, number of hand gestures, number of steps taken, number of shrugs of shoulders, number of disfluencies, length of pauses used during the presentation, number of transitions provided, and number of internal summaries used). Notice that the clustering does not indicate which speakers would be considered strong or weak; speakers are grouped simply on the basis of similarity of behaviors. Thus, the clustering shows the relations among the elements and the degree to which those elements are related. The remainder of this entry uses the public speaking example to illustrate the various approaches and uses of cluster analysis.

Methods of Cluster Analysis

Various methods of cluster analysis exist: additive and divisive clustering, which are hierarchical, and k-means clustering, often described as a nonhierarchical method. Although the methods share similarities, understanding the distinctions is important when choosing and applying one. Cluster analysis can be applied to any form of data (categorical, ordinal, interval, ratio) and permits combinations of forms of data in the analysis. The technique ultimately uses the data to compute a Euclidean distance between any two elements of the system. Because the variables are typically measured on different scales, each is generally converted to standardized scores so that every distance comparison is made on a common scale.
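To make the role of standardized scores concrete, the following is a minimal sketch in Python, not a prescribed procedure. The three speakers and their ratings are hypothetical, and the helper names `standardize` and `euclidean` are introduced here only for illustration. Each variable is converted to z-scores before the Euclidean distance between two speakers is computed.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical ratings for three speakers: (words per minute, hand gestures).
ratings = {
    "A": (120.0, 4.0),
    "B": (150.0, 10.0),
    "C": (180.0, 4.0),
}

def standardize(data):
    """Convert each variable (column) to z-scores across all speakers.

    Assumes every variable varies across speakers (nonzero standard deviation).
    """
    names = list(data)
    columns = list(zip(*(data[n] for n in names)))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return {
        n: tuple((x - m) / s for x, m, s in zip(data[n], means, sds))
        for n in names
    }

def euclidean(p, q):
    """Straight-line distance between two points with the same variables."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

z = standardize(ratings)
raw_ab = euclidean(ratings["A"], ratings["B"])
raw_ac = euclidean(ratings["A"], ratings["C"])
std_ab = euclidean(z["A"], z["B"])
std_ac = euclidean(z["A"], z["C"])
print(round(raw_ab, 2), round(raw_ac, 2))
print(round(std_ab, 2), round(std_ac, 2))
```

On the raw scores, speaking rate dominates the distance simply because its numbers are larger; after standardization, both behaviors contribute on the same scale.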

What counts as a cluster depends on the tolerance for distance, expressed in mathematical terms: the greater the distance tolerated between elements, the fewer and larger the resulting clusters.

Additive Clustering

The additive cluster approach starts with the assumption that each element forms a separate cluster. The two most similar elements (or clusters) are then combined into a cluster, and the process is repeated, placing elements into fewer and fewer clusters, until the result is a single cluster containing all persons. The result is a diagram that starts with as many clusters as persons and then reduces to a single cluster. The person conducting the analysis must decide how many clusters are optimal and whether a minimum number or size of clusters is required. The technique itself imposes no requirement, and may offer no recommendation, for such values.
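The merging process described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the entry's own procedure: the five speakers and their scores are hypothetical, and the distance between two clusters is taken to be the distance between their closest members (single linkage), a choice the entry does not specify.

```python
from itertools import combinations
from math import dist

# Hypothetical standardized scores for five speakers on two behaviors.
speakers = {
    "A": (0.0, 0.0),
    "B": (0.0, 1.0),
    "C": (4.0, 0.0),
    "D": (4.0, 1.5),
    "E": (10.0, 0.0),
}

def single_link(c1, c2, points):
    """Assumed linkage rule: distance between the closest members of two clusters."""
    return min(dist(points[a], points[b]) for a in c1 for b in c2)

def agglomerate(points):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(name,) for name in points]  # every element starts alone
    history = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: single_link(pair[0], pair[1], points),
        )
        height = single_link(c1, c2, points)
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        history.append((c1, c2, height))
    return history

history = agglomerate(speakers)
for left, right, height in history:
    print(left, "+", right, "at distance", round(height, 2))
```

The history runs from five clusters down to one. Stopping after a chosen number of merges, or before the merge distance exceeds a chosen tolerance, is the analyst's decision about how many clusters to keep.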

The resulting decisions can be represented in a diagram called a dendrogram, sometimes also called a binary tree. The dendrogram provides a visual display of the elements and the aggregation or disaggregation process. See Figure 1 for a display of this kind of analysis, showing a dendrogram for the public speaking assessment example. The visual representation is the same regardless of method (additive or divisive); the methods differ in their starting point rather than in the structure shown in the diagram.
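To illustrate why such a diagram can be described as a binary tree, here is a small Python sketch that turns a sequence of merges into a nested tree and prints it as indented text. The merge history is hypothetical and is not the data behind Figure 1; the function names `build_tree` and `render` are introduced only for this example.

```python
# Hypothetical merge history: (left cluster, right cluster, merge distance).
merges = [
    (("A",), ("B",), 1.0),
    (("C",), ("D",), 1.5),
    (("A", "B"), ("C", "D"), 4.0),
    (("E",), ("A", "B", "C", "D"), 6.0),
]

def build_tree(merges):
    """Return the merge history as a nested (left, right, height) binary tree."""
    nodes = {}  # maps a cluster's sorted member tuple to its subtree

    def subtree(cluster):
        # A cluster with no recorded merge is a single element (a leaf).
        return nodes.get(tuple(sorted(cluster)), cluster[0])

    root = None
    for left, right, height in merges:
        root = (subtree(left), subtree(right), height)
        nodes[tuple(sorted(left + right))] = root
    return root

def render(node, depth=0):
    """Indented text view: inner nodes show the merge distance, leaves show names."""
    if isinstance(node, str):
        return ["  " * depth + node]
    left, right, height = node
    lines = ["  " * depth + f"merge at {height}"]
    return lines + render(left, depth + 1) + render(right, depth + 1)

tree = build_tree(merges)
print("\n".join(render(tree)))
```

Every inner node has exactly two branches, which is what makes the structure binary. Reading the tree from the leaves upward follows the additive process; reading it from the root downward follows the divisive one.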

...
