
This entry discusses statistical models involving mixture distributions. As well as being useful in identifying and describing subpopulations within a mixed population, mixture models are useful data analytic tools, providing flexible families of distributions to fit to unusually shaped data. Theoretical advances in the past 30 years, as well as advances in computing technology, have led to the wide use of mixture models in applications as varied as ecology, machine learning, genetics, medical research, psychology, reliability, and survival analysis.

Suppose that F = {F_θ : θ ∈ S} is a parametric family of distributions on a sample space X, and let Q denote a probability distribution defined on the parameter space S. The distribution

F_Q(x) = ∫_S F_θ(x) dQ(θ)

is a mixture distribution. An observation X drawn from F_Q can be thought of as being obtained in a two-step procedure: first a random Θ is drawn from the distribution Q, and then, conditional on Θ = θ, X is drawn from the distribution F_θ. Suppose we have a random sample X_1, …, X_n from F_Q. We can view this as a missing data problem in that the “full data” consist of pairs (X_1, Θ_1), …, (X_n, Θ_n), with Θ_i ∼ Q and X_i | Θ_i = θ ∼ F_θ, but then only the first member X_i of each pair is observed; the labels Θ_i are hidden.
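The two-step procedure above can be sketched directly in code. In this minimal illustration, Q is taken to be discrete with two mass points and each F_θ is taken to be a normal distribution with mean θ; these particular choices are assumptions for the example, not part of the entry.

```python
import random

random.seed(0)

theta_values = [0.0, 5.0]   # mass points of Q (here, component means)
weights = [0.7, 0.3]        # probabilities Q assigns to each mass point

def draw_from_mixture():
    # Step 1: draw the hidden label Theta from Q.
    theta = random.choices(theta_values, weights=weights, k=1)[0]
    # Step 2: conditional on Theta = theta, draw X from F_theta,
    # illustratively N(theta, 1). Only X is returned; theta is hidden.
    return random.gauss(theta, 1.0)

sample = [draw_from_mixture() for _ in range(1000)]
```

Discarding `theta` inside `draw_from_mixture` is exactly the missing-data view: the full data are the pairs (X_i, Θ_i), but only the X_i are observed.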

If the distribution Q is discrete with a finite number k of mass points θ_1, …, θ_k, then we can write

F_Q(x) = ∑_{j=1}^{k} q_j F_{θ_j}(x)

where q_j = Q({θ_j}). The distribution F_Q is called a finite mixture distribution, the distributions F_{θ_j} are the component distributions, and the q_j are the component weights.
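Evaluating a finite mixture is just forming the weighted sum of the component distributions. A short sketch, assuming normal component cdfs with means θ_j (an illustrative choice not specified by the entry):

```python
import math

def normal_cdf(x, mean, sd=1.0):
    # Standard normal cdf shifted/scaled, via the error function.
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def mixture_cdf(x, weights, thetas):
    # F_Q(x) = sum over j of q_j * F_{theta_j}(x)
    return sum(q * normal_cdf(x, th) for q, th in zip(weights, thetas))

q = [0.7, 0.3]
thetas = [0.0, 5.0]
value = mixture_cdf(0.0, q, thetas)  # ≈ 0.7*0.5 + 0.3*Φ(-5) ≈ 0.35
```

The same weighted-sum structure applies to densities: replace `normal_cdf` with a density and the result is the finite mixture density.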

There are several reasons why mixture distributions, and in particular finite mixture distributions, are of interest. First, there are many applications where the mechanism generating the data is truly of a mixture form: we sample from a population that we know or suspect is made up of several relatively homogeneous subpopulations, within each of which the data of interest follow one of the component distributions. We may wish to draw inferences, based on such a sample, about characteristics of the component subpopulations (the parameters θ_j), about the relative proportions (the parameters q_j) of the population in each subpopulation, or both. Even the precise number of subpopulations may be unknown to us. An example is a population of fish, where the subpopulations are the yearly spawnings. Interest may focus on the relative abundances of each spawning, an unusually low proportion possibly corresponding to unfavorable conditions in that year.

Second, even when there is no a priori reason to anticipate a mixture distribution, families of mixture distributions, in particular finite mixtures, provide particularly flexible families of probability distributions and densities that can be fitted to unusually shaped (skewed, long-tailed, multimodal) data that would be difficult to describe with a more conventional parametric family of densities. Such a fit is often comparable in flexibility to a fully nonparametric estimate but structurally simpler, and it often requires less subjective input, for example in choosing smoothing parameters. As an illustration, it has been shown that the strongly skewed log-normal density can often be well approximated by a two- or three-component mixture of normals, each component possibly having a different mean and variance.
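The log-normal example can be made concrete by fitting a two-component normal mixture to simulated log-normal data. The fitting procedure used here, the EM algorithm for univariate normal components, and all starting values are assumptions for this sketch; the entry itself does not prescribe a fitting method at this point.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=0.5, size=2000)  # skewed target data

# Initial guesses for the weights q_j, means, and standard deviations.
q = np.array([0.5, 0.5])
mu = np.array([x.min(), x.max()])
sd = np.array([x.std(), x.std()])

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(200):
    # E-step: posterior probability that each point came from component j,
    # i.e. an estimate of the hidden label Theta_i.
    dens = np.stack([qj * normal_pdf(x, m, s) for qj, m, s in zip(q, mu, sd)])
    resp = dens / dens.sum(axis=0)
    # M-step: responsibility-weighted updates of q_j and (mu_j, sd_j).
    n_j = resp.sum(axis=1)
    q = n_j / len(x)
    mu = (resp @ x) / n_j
    sd = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / n_j)
```

After fitting, the mixture density `q[0]*N(mu[0], sd[0]) + q[1]*N(mu[1], sd[1])` gives a smooth, skewed approximation to the log-normal shape, with far fewer tuning decisions than a kernel-type nonparametric estimate.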

...
