
A dissimilarity coefficient is a function that measures the difference between two objects. It is defined from a set E × E (e.g., R × R, R² × R², Rⁿ × Rⁿ) to the nonnegative real numbers R⁺. Let g be a dissimilarity coefficient and let x and y be two elements of E. Then g satisfies the following properties:

(C1) $g(x, y) \geq 0$ (nonnegativity)
(C2) $g(x, x) = 0$
(C3) $g(x, y) = g(y, x)$ (symmetry)

The function g is said to be a pseudo-metric if and only if g satisfies C1, C2, C3, and the following property, where z is another element of E:

(C4) $g(x, z) \leq g(x, y) + g(y, z)$ (triangle inequality)

Furthermore, the function g is said to be a metric if and only if g satisfies C1, C2, C3, C4, and the following additional property:

(C5) $g(x, y) = 0 \Rightarrow x = y$

The value taken by g for two elements x and y is called “dissimilarity” if g is simply a dissimilarity coefficient; “semi-distance” if g is, in addition, a pseudo-metric; and “distance” if g is a metric.
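The three-level terminology above can be checked numerically on a finite sample of elements. The following is a minimal sketch, not a definitive implementation. It assumes that the three basic properties are nonnegativity, g(x, x) = 0, and symmetry, that the extra pseudo-metric property is the triangle inequality, and that the extra metric property is that g(x, y) = 0 only when x = y. The function name `classify` and the tolerance `tol` are hypothetical. A check on a finite sample can refute a property but cannot prove it for all of E.

```python
from itertools import product

def classify(g, elements, tol=1e-12):
    """Name the value returned by g, judged on a finite sample only.

    Returns "distance", "semi-distance", "dissimilarity", or None when g
    fails the basic properties of a dissimilarity coefficient.
    """
    pts = list(elements)
    # Basic properties (assumed): nonnegativity and symmetry ...
    for x, y in product(pts, repeat=2):
        if g(x, y) < -tol or abs(g(x, y) - g(y, x)) > tol:
            return None
    # ... and zero dissimilarity between an element and itself.
    if any(abs(g(x, x)) > tol for x in pts):
        return None
    # Pseudo-metric: triangle inequality on every triple.
    triangle = all(g(x, z) <= g(x, y) + g(y, z) + tol
                   for x, y, z in product(pts, repeat=3))
    if not triangle:
        return "dissimilarity"
    # Metric: distinct elements must have a nonzero value.
    separated = all(g(x, y) > tol
                    for x, y in product(pts, repeat=2) if x != y)
    return "distance" if separated else "semi-distance"

# Illustrative calls on made-up samples
print(classify(lambda x, y: abs(x - y), [0, 1, 3]))      # distance
print(classify(lambda x, y: (x - y) ** 2, [0, 1, 2]))    # dissimilarity
print(classify(lambda x, y: abs(x[0] - y[0]),
               [(0, 0), (0, 1), (1, 0)]))                # semi-distance
```

The squared difference fails the triangle inequality on (0, 1, 2), and the first-coordinate difference gives 0 for the distinct pairs (0, 0) and (0, 1), which is why they fall into the lower two categories.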

The application of the function g to a finite set of S elements {x1,…, xk,…, xS} leads to a matrix of dissimilarities (or semi-distances, or distances) between pairs of the elements. This matrix is said to be Euclidean if and only if one can find S points Mk (k = 1, …, S) that can be embedded in a Euclidean space so that the Euclidean distance between Mk and Ml is

$$ g(x_k, x_l) = \lVert c_k - c_l \rVert = \sqrt{(c_k - c_l)^t (c_k - c_l)} $$

where ck and cl are the vectors of coordinates for Mk and Ml, respectively, in the Euclidean space. These vectors of coordinates can be obtained by a principal coordinate analysis. The practical value of the Euclidean property is therefore that the dissimilarities translate directly into a typology, that is, a graphical representation of the dissimilarities among elements. Other types of graphical displays can be obtained with any dissimilarity coefficient by hierarchical cluster analysis and nonmetric multidimensional scaling.
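As an illustration of the Euclidean property, the sketch below runs a bare-bones principal coordinate analysis in pure Python. It relies on the classical criterion that a dissimilarity matrix is Euclidean exactly when the doubly centered matrix of −½ times the squared dissimilarities has no negative eigenvalue. The function names (`gower_center`, `jacobi_eigen`, `principal_coordinates`), the tolerances, and the use of a cyclic Jacobi eigenvalue routine are choices made for this example, not part of the entry. In practice a numerical library would normally supply the eigendecomposition.

```python
import math

def gower_center(d):
    """Double-center the matrix of -0.5 * d[k][l]**2."""
    s = len(d)
    a = [[-0.5 * d[k][l] ** 2 for l in range(s)] for k in range(s)]
    row = [sum(r) / s for r in a]   # a is symmetric: row means = column means
    total = sum(row) / s
    return [[a[k][l] - row[k] - row[l] + total for l in range(s)]
            for k in range(s)]

def jacobi_eigen(b, sweeps=100, tol=1e-24):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(b)
    a = [r[:] for r in b]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0
                         else 0.5 * math.atan(2 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):   # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):   # accumulate eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def principal_coordinates(d, tol=1e-9):
    """Return (is_euclidean, coords) for a symmetric dissimilarity matrix d."""
    values, vectors = jacobi_eigen(gower_center(d))
    is_euclidean = min(values) >= -tol
    keep = [i for i, lam in enumerate(values) if lam > tol]
    coords = [[vectors[k][i] * math.sqrt(values[i]) for i in keep]
              for k in range(len(d))]
    return is_euclidean, coords

# Three points on a line at positions 0, 1, 3 (made-up data)
line = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
ok, coords = principal_coordinates(line)
print(ok, coords)
```

When the matrix is Euclidean, the rows of `coords` play the role of the coordinate vectors ck, and the Euclidean distances between them reproduce the input dissimilarities up to rounding. When it is not, only the axes with positive eigenvalues are returned, so the reproduction is approximate.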

Examples

Example 1

Let E be the Euclidean space Rⁿ, the vector space of all n-tuples of real numbers (x1, …, xi, …, xn). An element of this space is denoted xk. In that case, each element may be characterized by n quantitative variables X1, …, Xi, …, Xn. Let xk = (x1k, …, xik, …, xnk)ᵗ and xl = (x1l, …, xil, …, xnl)ᵗ be two vectors containing the values taken by the objects k and l, respectively, for each of the variables considered; xk, xl ∊ Rⁿ. The following dissimilarity coefficients can be used to measure the difference between the objects k and l:

  • the Euclidean metric

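    $$ g_{\text{Euclidean}}(x_k, x_l) = \sqrt{(x_k - x_l)^t (x_k - x_l)} = \sqrt{\sum_{i=1}^{n} (x_{ik} - x_{il})^2} $$
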
  • the Jöreskog distance

    $$ g_{\text{Jöreskog}}(x_k, x_l) = \sqrt{(x_k - x_l)^t V^{-1} (x_k - x_l)} $$

where V = diag(V(X1), …, V(Xi), …, V(Xn)) is the diagonal matrix containing the variances of the n variables.

  • the Mahalanobis distance

    $$ g_{\text{Mahalanobis}}(x_k, x_l) = \sqrt{(x_k - x_l)^t W^{-1} (x_k - x_l)} $$

where W is the variance-covariance matrix for the n variables.

All of these dissimilarity coefficients are metrics and provide Euclidean dissimilarity matrices.
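The three coefficients of this example can be sketched in a few lines of pure Python. The sketch assumes the usual quadratic forms: squared differences summed directly for the Euclidean metric, divided by the variances on the diagonal of V for the Jöreskog distance, and weighted by the inverse of W for the Mahalanobis distance. The function names, the use of the sample (n − 1) denominator for the variances and covariances, and the small Gaussian elimination helper `solve` are assumptions of the example. W is taken to be invertible.

```python
import math

def euclidean(xk, xl):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xk, xl)))

def covariance_matrix(rows):
    """Sample variance-covariance matrix W of the columns (n - 1 denominator)."""
    m, n = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / m for i in range(n)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (m - 1)
             for j in range(n)] for i in range(n)]

def joreskog(xk, xl, w):
    """Squared differences scaled by the variances on the diagonal of w."""
    return math.sqrt(sum((a - b) ** 2 / w[i][i]
                         for i, (a, b) in enumerate(zip(xk, xl))))

def solve(w, d):
    """Solve w z = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    m = [w[i][:] + [d[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def mahalanobis(xk, xl, w):
    """Quadratic form d^t W^{-1} d, computed without forming the inverse."""
    d = [a - b for a, b in zip(xk, xl)]
    return math.sqrt(sum(di * zi for di, zi in zip(d, solve(w, d))))

# Made-up data: three objects described by two variables
data = [[0, 0], [2, 2], [4, 1]]
w = covariance_matrix(data)
print(euclidean(data[0], data[1]),
      joreskog(data[0], data[1], w),
      mahalanobis(data[0], data[1], w))
```

When the variables are uncorrelated, W is diagonal and the Mahalanobis distance coincides with the Jöreskog distance. When W is the identity matrix, both reduce to the Euclidean metric.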

Example 2

Let E be the set of frequency vectors

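$$ E = \left\{ p = (p_1, \ldots, p_i, \ldots, p_n) \;\middle|\; p_i \geq 0, \; \sum_{i=1}^{n} p_i = 1 \right\} $$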

In that case, let p and q be two vectors from E.

Several functions can be used to measure the dissimilarity between the two frequency

...
