
DISTATIS is a generalization of classical multidimensional scaling (MDS) proposed by Abdi, Valentin, O'Toole, and Edelman. Its goal is to analyze several distance matrices computed on the same set of objects. The name DISTATIS is derived from a technique called STATIS, whose goal is to analyze multiple data sets. DISTATIS first evaluates the similarity between distance matrices. From this analysis, a compromise matrix is computed that represents the best aggregate of the original matrices. The original distance matrices are then projected onto the compromise.

The data sets to analyze are distance matrices obtained on the same set of objects. These distance matrices may correspond to measurements taken at different times. In this case, the first matrix corresponds to the distances collected at time t = 1, the second one to the distances collected at time t = 2, and so on. The goal of the analysis is to evaluate whether the relative positions of the objects are stable over time. The different matrices, however, do not need to represent time. For example, the distance matrices can be derived from different methods. The goal of the analysis, then, is to evaluate whether the methods agree.

The general idea behind DISTATIS is first to transform each distance matrix into a cross-product matrix, as is done in standard MDS. Then, these cross-product matrices are aggregated to create a compromise cross-product matrix that represents their consensus. The compromise matrix is obtained as a weighted average of the individual cross-product matrices. The principal component analysis (PCA) of the compromise gives the positions of the objects in the compromise space. The position of each object in each study can be represented in the compromise space as a supplementary point. Finally, as a by-product of the weight computation, the studies themselves can be represented as points in a multidimensional space.
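The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation, and it rests on several assumptions not stated in the text so far: objects have equal masses, each cross-product matrix is scaled by its largest eigenvalue, the similarity between studies is measured by the RV coefficient, and the weights come from the first eigenvector of the matrix of RV coefficients. The function names and the simulated data are hypothetical.

```python
import numpy as np

def cross_product(D):
    """Double-center a squared-distance matrix, as in classical MDS."""
    n = D.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n  # assumes equal masses
    S = -0.5 * centering @ D @ centering
    # Assumption: divide by the largest eigenvalue so studies are comparable.
    return S / np.linalg.eigvalsh(S)[-1]

def distatis(distances):
    S = np.stack([cross_product(D) for D in distances])   # shape (T, I, I)
    T = S.shape[0]

    # Similarity between studies (assumed here to be the RV coefficient).
    flat = S.reshape(T, -1)
    norms = np.linalg.norm(flat, axis=1)
    C = (flat @ flat.T) / np.outer(norms, norms)

    # Weights: first eigenvector of C, rescaled to sum to one.
    evals, evecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    u = evecs[:, -1]
    u = u * np.sign(u.sum())                  # resolve the arbitrary sign
    alpha = u / u.sum()

    # By-product: coordinates of the studies themselves.
    study_scores = evecs[:, ::-1] * np.sqrt(np.clip(evals[::-1], 0, None))

    # Compromise: weighted average of the cross-product matrices.
    compromise = np.tensordot(alpha, S, axes=1)

    # PCA of the compromise gives the object positions.
    lam, V = np.linalg.eigh(compromise)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    keep = lam > 1e-10                        # drop null dimensions
    lam, V = lam[keep], V[:, keep]
    F = V * np.sqrt(lam)

    # Each study projected as supplementary points, shape (T, I, k).
    partial = S @ (V / np.sqrt(lam))
    return alpha, F, partial, study_scores

# Simulated example: 4 noisy views of the same 6 objects (made-up data).
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 2))
studies = []
for _ in range(4):
    pts = base + 0.1 * rng.normal(size=base.shape)
    diff = pts[:, None, :] - pts[None, :, :]
    studies.append((diff ** 2).sum(axis=-1))

alpha, F, partial, study_scores = distatis(studies)
```

Under these assumptions, the weighted average of the per-study projections coincides with the compromise positions, which is a convenient sanity check on any implementation.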

An Example

To illustrate DISTATIS, we will use the set of faces displayed in Figure 1. Four different “systems” or algorithms are compared, each of them computing a distance matrix between the faces. The first system corresponds to PCA and computes the squared Euclidean distance between faces directly from the pixel values of the images. The second system starts by taking measurements on the faces (see Figure 2) and computes the squared Euclidean distance between faces from these measures. The third distance matrix is obtained by first asking human observers to rate the faces on several dimensions (e.g., beauty, honesty, empathy, and intelligence) and then computing the squared Euclidean distance from these measures. The fourth distance matrix is obtained from pairwise similarity ratings (on a scale from 1 to 7) collected from human observers. The average similarity rating s was transformed into a distance using Shepard's transformation: $d = \exp\{-s^{2}\}$.
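The two ways of obtaining distances described above can be illustrated as follows. The sketch assumes that each face is described by a row of numeric measures (pixels, measurements, or ratings) and that the similarity ratings have already been averaged over observers. The numbers are invented for illustration, and forcing the diagonal of the transformed matrix to zero is an assumption, not something stated in the text.

```python
import numpy as np

def squared_euclidean(X):
    """Squared Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)

def shepard_distance(S):
    """Turn average similarity ratings s into distances d = exp(-s^2)."""
    D = np.exp(-np.asarray(S, dtype=float) ** 2)
    np.fill_diagonal(D, 0.0)  # assumption: an object is at distance 0 from itself
    return D

# Made-up measures for three faces (rows) on two variables (columns).
measures = [[1, 2], [4, 6], [1, 2]]
D_measures = squared_euclidean(measures)

# Made-up average similarity ratings between three faces.
ratings = [[7, 1, 2], [1, 7, 1], [2, 1, 7]]
D_ratings = shepard_distance(ratings)
```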


Figure 1 Six Faces to Be Analyzed by Different “Algorithms”


Figure 2 The Measures Taken on a Face

Notations

The raw data consist of T data sets and we will refer to each data set as a study. Each study is an I × I distance matrix denoted D[t], where I is the number of objects and t denotes the study.
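In code, this notation maps naturally onto a three-way array of shape T × I × I, with one slice per study. The helper below is a hypothetical convenience, not part of the method itself. It stacks the studies and checks that each one is a square, symmetric matrix with a zero diagonal, properties assumed here for a distance matrix.

```python
import numpy as np

def stack_studies(matrices):
    """Stack T distance matrices into one array of shape (T, I, I)."""
    # np.stack raises ValueError if the studies differ in shape.
    D = np.stack([np.asarray(m, dtype=float) for m in matrices])
    if D.ndim != 3 or D.shape[1] != D.shape[2]:
        raise ValueError("each study must be a square I x I matrix")
    if not np.allclose(D, D.transpose(0, 2, 1)):
        raise ValueError("each distance matrix must be symmetric")
    if not np.allclose(np.diagonal(D, axis1=1, axis2=2), 0.0):
        raise ValueError("each distance matrix must have a zero diagonal")
    return D

# Two made-up studies on I = 3 objects.
D = stack_studies([
    [[0, 1, 4], [1, 0, 2], [4, 2, 0]],
    [[0, 2, 3], [2, 0, 1], [3, 1, 0]],
])
T, I, _ = D.shape  # D[t] is the distance matrix of study t
```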

...
