  • 00:00

    [MUSIC PLAYING] Brian Francis, thank you very much for talking to me today. What I wanted to ask you first is simply if you could tell me, in very straightforward terms, what is latent class analysis?

  • 00:28

    OK. Yes. Latent class analysis is really a way of finding patterns in data. So you think your data may have a number of classes, hidden classes -- so the word "latent" means hidden -- that you want to discover. But you may not know how many classes there are or what to extract from those classes.

  • 00:49

    Statisticians, to me, are always looking for hidden patterns, aren't they? Surely that's what statisticians do. So what is it that's unique about latent class analysis in doing that work?

    Yes. There are two types of patterns that statisticians look for. The first type is really trying to predict a variable from a set of other variables, predict a response from a set of explanatory variables.

  • 01:17

    And that's, well, one form of pattern that statisticians do. And that's called linear modeling, or generalized linear modeling, or sometimes linear regression, multiple linear regression. It goes under various forms of name. What I'm talking about here is really a collection of items, which are normally categorical in nature, and you have a collection of them.

  • 01:42

    And they might be responses to a bank of questions in a questionnaire or they -- I guess they might be other categorical items. And you're trying to see where the responses fall into particular groups. One of the functions of latent class analysis is that there's a formal statistical model [INAUDIBLE] and so you can use the statistical ideas, such as likelihood, to determine the number of classes.

  • 02:12

    That problem isn't entirely solved, but there are recognized techniques that people use to determine classes. I should contrast it with cluster analysis. Most people think of latent class analysis as a form of cluster analysis. But cluster analysis is not really well developed for binary items, for yes/no items, or for categorical items.
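To illustrate the idea of using likelihood to help choose the number of classes, here is a minimal sketch that is not part of the interview. It fits a latent class model to binary items with a basic EM algorithm and compares models by BIC, which is one commonly used likelihood-based criterion. The function name `fit_lca`, the simulated data, and all parameter values are assumptions made for the example.

```python
import numpy as np

def fit_lca(Y, n_classes, n_iter=200, seed=0):
    """Fit a latent class model to binary items by EM.

    Returns (log-likelihood, class proportions, item probabilities)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    pi = np.full(n_classes, 1.0 / n_classes)          # class proportions
    theta = rng.uniform(0.25, 0.75, (n_classes, m))   # P(item = 1 | class)

    def log_joint(pi, theta):
        # log P(y_i, class k), assuming items are independent within a class
        return np.log(pi) + Y @ np.log(theta).T + (1 - Y) @ np.log(1 - theta).T

    for _ in range(n_iter):
        # E-step: posterior probability of each class for each respondent
        lj = log_joint(pi, theta)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1)[:, None])
        # M-step: update proportions and item probabilities
        nk = resp.sum(axis=0) + 1e-9                  # guard against empty classes
        pi = nk / nk.sum()
        theta = np.clip((resp.T @ Y) / nk[:, None], 1e-6, 1 - 1e-6)

    loglik = np.logaddexp.reduce(log_joint(pi, theta), axis=1).sum()
    return loglik, pi, theta

def bic_for(Y, n_classes):
    """BIC = -2 log L + (number of free parameters) * log(n)."""
    n, m = Y.shape
    loglik, _, _ = fit_lca(Y, n_classes)
    n_params = (n_classes - 1) + n_classes * m
    return -2.0 * loglik + n_params * np.log(n)

# Simulated data (purely illustrative): two hidden classes, six yes/no items
rng = np.random.default_rng(1)
z = rng.integers(0, 2, 1000)
true_theta = np.array([[0.9] * 6, [0.1] * 6])
Y = (rng.random((1000, 6)) < true_theta[z]).astype(float)

bic = {k: bic_for(Y, k) for k in (1, 2, 3)}
for k, value in bic.items():
    print(k, round(value, 1))   # lower BIC is preferred
```

In practice one would run EM from several random starts, since it can stop at a local maximum, and BIC is only one of the recognized techniques for deciding how many classes to keep.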

  • 02:36

    And there's no underlying statistical theory to cluster analysis. It's simply based on distance measures between points. So you've got to make your own decision as to how many clusters exist in the data. Whereas latent class analysis gives you some help towards that.

    Can you give me any examples, then, of when latent class analysis might be used, or examples of its use elsewhere, to put some flesh on those bones?

  • 03:07

    Yes, certainly. I did a study in Lancaster, England, for example, to do with different ways in which nurses view their job, through a set of questions, of 30 questions, on whether nurses felt their job was something for them for life, or just for earning money.

  • 03:34

    We found that there were three groups in the data -- a group of nurses who viewed nursing as just a job, another group of nurses who viewed it as a career, and a third group of nurses who viewed it as a vocation. And the career nurses wanted to progress, the vocational nurses weren't so.

  • 03:57

    But [INAUDIBLE] progressing, but some valued nursing as a form of activity in its own sake. So there were these three classes which were clearly identified from the latent class analysis.

    And a different statistical technique wouldn't have identified those classes?

  • 04:18

    There are examples, which you can find in journal articles, where things have gone badly wrong using cluster analysis. Whereas with latent class analysis, it appears to have produced a far better analysis of the data. I can talk about one of those, for example. It's a classic data set. It's a classic study of really one of the first uses of latent class analysis.

  • 04:41

    For [INAUDIBLE] type of data, for individual-level data. So there were 468 teachers, and there was a famous study in the 1970s to do with teaching styles. And its data set is called the teaching styles data set. And it made a lot of press coverage in the late 1970s, because the study -- once they determined the teaching style, which was like the formal or informal, they used cluster analysis.

  • 05:14

    They failed to classify some teachers because they were too far away, so there were 70-odd in the data cluster. They counted all clusters, and then they put them into order from formal to informal, and that rendered a subsequent analysis that showed that formal teaching was superior to informal teaching. And this got a lot of publicity in the late 1970s.

  • 05:37

    But also, it generated lots of controversy. And so the data was reanalyzed by Murray Aitkin, John Hinde, and Dorothy Anderson, who published in the Journal of the Royal Statistical Society, Series A, in 1981, using latent class analysis. They found only three classes. They found a formal group, an informal group, and a mixed-teaching group.

  • 06:05

    And they found, instead, that the mixed group was the worst, and the formal and informal were very much the same in terms of outcome. So latent class analysis was really superior. And this study's now been generally accepted as the gold standard, or the final word on that particular research study.

  • 06:26

    So it does have [INAUDIBLE] strengths. And the problem with the original study is that they didn't know how many clusters were in the data, and they had difficulty in knowing where to stop, where to draw the line, whether to have two clusters or 12 clusters, or 20 clusters.

    Could you tell me a little bit more about what kind of data I can use latent class analysis on?

  • 06:52

    That's a more difficult question than it seems. I view latent class analysis as really a technique that could be used for binary data, for yes or no data. Or for [INAUDIBLE] scale data, for ordinal categorical data, or even for unordered categorical data.

  • 07:14

    And it could be used on tabular data or individual-level data. Most of the early papers in latent class analysis were based on tabular data. Data giving responses on a small number of items, and then the number of people who have given that particular pattern of response. But now it's more used on individual-level data, simply putting in a data set which might be a particular response to a questionnaire, and analyzing it straightforwardly.
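The difference between the two data layouts can be shown with a small sketch that is not part of the interview. It collapses individual-level responses (one row per respondent) into tabular form (one row per response pattern, with a count). The function name `to_tabular` and the six made-up respondents are assumptions for illustration only.

```python
from collections import Counter

def to_tabular(responses):
    """Collapse individual-level responses into sorted (pattern, count) rows."""
    counts = Counter(tuple(r) for r in responses)
    return sorted(counts.items())

# Hypothetical answers of six respondents to three yes/no items (1 = yes)
individual = [
    (1, 0, 1),
    (1, 0, 1),
    (0, 0, 0),
    (1, 1, 1),
    (0, 0, 0),
    (1, 0, 1),
]

table = to_tabular(individual)
for pattern, count in table:
    print(pattern, count)
```

Both layouts carry the same information for a latent class model, because respondents with the same response pattern contribute identical terms to the likelihood.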

  • 07:45

    But latent class analysis can be viewed as a special case of a more extensive set of statistical models known as mixture models. Mixture models can be used -- you can have mixture models for continuous data or count data. And so, it depends whether you want to call that data, that form of analysis, latent class analysis, or not.

  • 08:09

    I would tend not to under [INAUDIBLE].

    Right. Thank you, Brian Francis.

    [MUSIC PLAYING]
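The mixture-model view mentioned in the interview can be written out as follows. This is a standard textbook formulation rather than something stated by the speaker, and the symbols are assumptions chosen for the example: $K$ classes with proportions $\pi_k$, $J$ binary items, $y_{ij}$ the response of person $i$ to item $j$, and $\theta_{jk}$ the probability of a "yes" on item $j$ in class $k$.

```latex
% Latent class model for binary items: a finite mixture of independent Bernoulli items
P(y_i) = \sum_{k=1}^{K} \pi_k \prod_{j=1}^{J}
         \theta_{jk}^{\,y_{ij}} \, (1 - \theta_{jk})^{\,1 - y_{ij}},
\qquad \sum_{k=1}^{K} \pi_k = 1

% Log-likelihood over n respondents, the quantity used when comparing numbers of classes
\ell = \sum_{i=1}^{n} \log P(y_i)

% General finite mixture: replace the Bernoulli product with another component density,
% for example a normal density for continuous data or a Poisson mass function for counts
f(y_i) = \sum_{k=1}^{K} \pi_k \, f_k(y_i \mid \phi_k)
```

The first expression is the special case of the last one in which each component density $f_k$ is a product of independent Bernoulli terms.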

Video Info

Publisher: SAGE Publications Ltd.

Publication Year: 2011

Video Type: Interview

Methods: Latent variables, Cluster analysis

Keywords: applications and contexts; likelihood ratio; pattern recognition; teaching styles

Segment Info

Segment Num.: 1


Professor Brian Francis explains latent class analysis as a way of identifying patterns in data. He also describes a famous study of teaching styles that was badly analyzed using cluster analysis, but when the data were reexamined using latent class analysis, the results were very different.

