  • 00:00

    [MUSIC PLAYING] [Using Geodemographics to Better Understand Health and Deprivation]

  • 00:09

    LUKE BURNS: My name is Luke Burns. And I'm a lecturer in quantitative human geography at the University of Leeds. My background is in a range of quantitative techniques, including geodemographics, composite indices, and geographical information systems. I've worked in industry and academia, and try to bring these experiences into my teaching. The purpose of this video is to introduce

  • 00:31

    LUKE BURNS [continued]: geodemographics and k-means classification as a research method. Geodemographics is transferable to any domain. And the purpose of this video is to show you how geodemographics works, how I applied it in the context of health, some of the advantages and disadvantages of applying geodemographics, and the seven phases required to build a geodemographic system.

  • 00:52

    LUKE BURNS [continued]: I'll also give you my advice on how to apply the method in your own context. [How Geodemographics Works] Geodemographics is defined as the analysis of people by where they live. In essence, it's about putting people into categories based on shared characteristics. And these characteristics can range from demographic

  • 01:14

    LUKE BURNS [continued]: variables-- such as age, sex, ethnicity, or education-- to more behavioral or lifestyle characteristics, such as your propensity to go on holiday or disposable income. The concept of geodemographics and area classification is quite dated. It actually dates back to Charles Booth in the 19th century. He was a social researcher and a philanthropist.

  • 01:34

    LUKE BURNS [continued]: And he went around the streets of London categorizing all the streets into areas based on poverty. More recently, the concept has been in existence for around about 40 years. And with the data we have available nowadays, there are leading commercial vendors who produce geodemographic systems, using a range of different data sets. The leading commercial vendor is CACI,

  • 01:56

    LUKE BURNS [continued]: who produced the ACORN System, which the acronym stands for A Classification Of Residential Neighbourhoods. And it is about putting each postcode or zip code into a category based on these shared characteristics. The ACORN System separates the UK postcodes into 59 groups, ranging from type 1, which is classed as affluent achievers,

  • 02:18

    LUKE BURNS [continued]: to type 59, which is urban adversity. Geodemographic systems are able to segment population groups based on all these different characteristics. Consider the area in which you live, whether that is your university address or your home address. Probably, the people who live near you share similar characteristics to you. So at university, you will live with people

  • 02:39

    LUKE BURNS [continued]: who are around the same age as you and have similar interests to you. Whereas your parents, or where you lived back home, it'll be a different type of person-- different households, different age groups, even different disposable income levels. Consider the political party which you vote for, or the newspaper that you read. The chances are, the people who live in the same area as you share those same characteristics.

  • 03:03

    LUKE BURNS [continued]: [Applying Geodemographics to the Context of Health] Geodemographics can be applied in a range of different areas. Commercially, geodemographics is used for product targeting. If you are trying to sell a particular product or service, knowing your target market enables you to direct it at relevant areas. Equally, in the public sector--

  • 03:24

    LUKE BURNS [continued]: this research, for example, looks at health profiling and trying to identify areas at risk from health deterioration. Again, by using relevant data sets and relevant variables, we can bring these together into one classification. This particular research explored geodemographics from a health perspective. It was a hypothetical piece of research that assisted the government to allocate

  • 03:46

    LUKE BURNS [continued]: scarce resources to local authorities most in need. There are 388 local authorities in the UK. And resources such as funding are allocated from central government to subgovernment regions such as local authorities. By putting these local authorities into different groups, we could determine which groups were most in need of funding

  • 04:06

    LUKE BURNS [continued]: to support increases in physical health and illness. Geodemographics was picked as the method in this research, given that we could put areas into groups. We could create a series of clusters, whereby cluster one is the cluster of areas that are most in need of funding, and cluster nine, in this case, is the cluster that is least in need of funding.

  • 04:27

    LUKE BURNS [continued]: We mixed 15 census variables-- examples include health, central heating, access to a car-- to determine areas which may be at risk from health deterioration. [Seven Phases Required to Build a Geodemographics System] Building a geodemographic system is an ideal research method

  • 04:47

    LUKE BURNS [continued]: for students, particularly when completing extended projects or dissertations, because it gives you a very clear structure from which to work. There are seven phases required to build a good geodemographic system. And we'll discuss these one by one. The seven phases are defining a purpose, selecting the scale and data to input, preprocessing the data,

  • 05:12

    LUKE BURNS [continued]: clustering the data, labeling and interpretation, application and evaluation, and mapping spatial patterns. The first stage is defining a purpose. Many geodemographic systems are general purpose, in that they take in lots of different variables just to understand ground conditions in an area.

  • 05:34

    LUKE BURNS [continued]: Whereas other geodemographic systems are bespoke. Bespoke systems may look at health profiling to determine areas at risk from health deterioration, such as this research, or crime profiling to determine areas that may be at risk from crime. The second phase involves selecting a scale and data inputs. There are two important considerations to think about here.

  • 05:55

    LUKE BURNS [continued]: Firstly, in terms of the data. We're awash with data at present. The famous phrase that 90% of the world's data was generated in the past two years means that we have got lots of data at our fingertips. And that is a big difference to Charles Booth, who conducted this type of work manually by walking the streets of London. The work that I am evidencing in this research

  • 06:17

    LUKE BURNS [continued]: used health data sets because the profile was to determine areas at risk from health deterioration. It was therefore important to look at academic literature, which is the best source of information here to determine variables that may be synonymous with health deterioration. So, for example, aging populations, people who don't have access to central heating,

  • 06:38

    LUKE BURNS [continued]: or other variables that may be indicative of poor health. In terms of spatial scale, it's also necessary to choose a unit at which to conduct the analysis. In the UK, we can conduct analysis at postcodes, which are very small areas, or output areas, which are the smallest form of census output in the UK.

  • 06:58

    LUKE BURNS [continued]: Larger-scale outputs include local authorities and districts. Picking a relevant spatial scale is all down to your chosen output. In this case, it was assigning funds to local authorities. So local authorities was the natural choice of scale. The next stage involves preprocessing the data. And there are three things to think about here.

  • 07:19

    LUKE BURNS [continued]: The first is standardization. The second is normalization. And the third is multicollinearity. If we start with standardization first, if exploring census variables, such as in this health research, it is possible to turn variables such as the number of people without access to central heating in an area into a percentage.

  • 07:41

    LUKE BURNS [continued]: If we convert all our variables into percentage format, that enables us to conduct fair comparisons between areas. Because we're taking into consideration the total population size. The second aspect to think about here is normalization. And normalization involves linearly rescaling all variables onto a 0 to 1 scale. This ensures parity across all your variables,

  • 08:03

    LUKE BURNS [continued]: and ensures that some variables are not overrepresented in the final classification. Normalization is one of the more complicated steps to get your head around. There is further reading available at the end of this video. The final step here is to think about multicollinearity. And again, this is quite complicated. So you may wish to follow up on the reading.
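
The standardization and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the research itself: the function names and the figures for the three areas are hypothetical, and min-max rescaling is assumed as the way of linearly rescaling onto a 0 to 1 scale.

```python
def to_percentage(counts, totals):
    """Standardization: express each count as a percentage of the area's population."""
    return [100.0 * count / total for count, total in zip(counts, totals)]


def min_max(values):
    """Normalization: linearly rescale values onto a 0 to 1 scale."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a variable with no variation is mapped to 0 everywhere.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical counts of people without central heating in three areas.
no_heating = [50, 200, 30]
population = [1000, 2000, 3000]

pct_no_heating = to_percentage(no_heating, population)  # [5.0, 10.0, 1.0]
scaled = min_max(pct_no_heating)  # highest area becomes 1.0, lowest becomes 0.0
```

Converting to percentages first means the largest area does not dominate simply because more people live there; rescaling afterwards puts every variable on the same footing before clustering.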

  • 08:24

    LUKE BURNS [continued]: In essence, multicollinearity is concerned with correlation between variables. If you have two variables going into your classification, such as the number of people aged over 65 and maybe a variable more indicative of poor health, there's a good chance the two variables are going to be highly correlated. If you put these two variables into your classification,

  • 08:46

    LUKE BURNS [continued]: then you are, in essence, double-weighting that dimension. If, following a correlation analysis, you find that any two or more variables are highly correlated-- and what you determine is highly correlated is quite subjective and up to you-- you could potentially remove one of those variables to avoid this double-weighting effect. Just as a further consideration, something to weigh up

  • 09:07

    LUKE BURNS [continued]: at this stage is polarity. And polarity refers to data direction. Sometimes, particularly if working with lots of variables, you may find that a high score in one variable is good towards your classification, where a high score in another variable may be bad. And in this case, you've got variables that are running in different directions.
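
The correlation analysis used to check for multicollinearity could look something like the sketch below. It assumes Pearson correlation and a cut-off of 0.8; the transcript notes that the threshold is subjective, so that value, the variable names, and the data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # Assumes neither variable is constant (otherwise this divides by zero).
    return covariance / (spread_x * spread_y)


def correlated_pairs(variables, threshold=0.8):
    """List pairs of variables whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical percentages for five areas.
variables = {
    "pct_over_65": [10, 20, 30, 40, 50],
    "pct_poor_health": [12, 19, 33, 41, 48],
    "pct_no_car": [30, 10, 50, 20, 40],
}

flagged = correlated_pairs(variables)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.2f})")
```

Any pair reported here is a candidate for dropping one variable, so that the same underlying dimension is not counted twice in the classification.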

  • 09:29

    LUKE BURNS [continued]: In this case, it is possible to flip a variable. To elaborate on that further, and as an example, if you had two variables-- let's say one was the percentage of people without access to central heating in an area and another variable was the percentage of people with two or more cars in an area-- you can see how a high score in one, percentage of no central heating, is negative.

  • 09:51

    LUKE BURNS [continued]: Whereas percentage of people with two or more cars is a positive. So you've got good and bad. You've got variables running in alternative directions. If you rotate, for example, the percentage of people without central heating, that then becomes the percentage of people with central heating, and all your variables are running in the same direction. The next stage involves physically clustering the data.
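
Flipping a variable's polarity, as in the central heating example, is a one-line transformation. The sketch below assumes the variable is held as a percentage; for a variable already normalized onto a 0 to 1 scale the equivalent would be subtracting from 1. The function name and figures are hypothetical.

```python
def flip_percentage(values):
    """Reverse a percentage variable's direction (e.g. 'without' becomes 'with')."""
    return [100.0 - value for value in values]


# Hypothetical percentage of people without central heating in three areas.
pct_no_heating = [5.0, 10.0, 1.0]

# After flipping, a high score is "good", matching the two-or-more-cars variable.
pct_with_heating = flip_percentage(pct_no_heating)  # [95.0, 90.0, 99.0]
```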

  • 10:13

    LUKE BURNS [continued]: And if you've done all your preprocessing correctly, this stage is quite straightforward. Clustering in geodemographics generally uses the k-means classification algorithm. k refers to the number of clusters, and means refers to the average. So, in essence, you are putting all your areas into k number of clusters based on these average characteristics.

  • 10:34

    LUKE BURNS [continued]: The big decision to make when clustering the data is choosing how many clusters you want the data partitioned into. And that is up to you as the system creator. You could choose 5, 7, 9, or more clusters. And inevitably, the number of clusters you choose will have a big impact on the final solution. So it's something to think carefully about.
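
To make the k-means idea concrete, here is a minimal pure-Python sketch of the standard iterative procedure: assign each area to its nearest cluster average, recompute the averages, and repeat until nothing changes. It is an illustration under stated assumptions, not the software used in the research: the random choice of starting centroids, the `seed` and `max_iter` parameters, and the six two-variable areas are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, seed=0, max_iter=100):
    """Partition points into k clusters; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random areas
    labels = None
    for _ in range(max_iter):
        # Assign every area to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the solution has converged
        labels = new_labels
        # Move each centroid to the mean of the areas assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical areas described by two normalized (0 to 1) variables.
areas = [
    (0.05, 0.10), (0.10, 0.15), (0.15, 0.20),
    (0.80, 0.85), (0.90, 0.90), (0.95, 1.00),
]

labels, centroids = k_means(areas, k=2)
```

With this toy data the first three areas end up sharing one label and the last three the other. On real data the result can depend on the starting centroids as well as on k, which is one reason to rerun the clustering and compare solutions.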

  • 10:55

    LUKE BURNS [continued]: The next phase is labeling and interpretation. This is quite a subjective phase, and something to give considerable thought to. This involves trying to place a pen portrait or a label on each of your clusters that is indicative of the areas that fall within them. In my research, I produced a nine-cluster solution, with cluster one being the most affluent and least in need

  • 11:16

    LUKE BURNS [continued]: of funding-- and I call this Prospering Predominant South-- down to cluster nine, which was the most in need of funding. And I termed this Northern Hardship. Labeling is quite difficult. But what we normally do is compare the clusters to the national average. So, for example, if cluster one has a higher number of people

  • 11:36

    LUKE BURNS [continued]: who are over the age of 65 or a higher number of people with access to central heating, we may be able to use these variables some way to inform the name we place on that cluster. Sometimes, it can be very useful to evaluate your system through some kind of validation. Validation really is concerned with seeing how accurate your output is. So if you have another data set--

  • 11:57

    LUKE BURNS [continued]: let's say you had real data from GPs or hospitals on the number of people who were actually ill-- you could see whether the number of people who were actually ill correlated with the areas that you were suggesting need funding. And that's a nice way just to determine the accuracy and validity of your classification. The final phase is probably the most exciting.
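
Both the labeling and the validation steps come down to summarizing data by cluster. The sketch below shows one possible way to do this: it compares each cluster's average on every variable with the overall average (standing in for the national average), and then averages an external measure by cluster as a simple validation check. All names and numbers are hypothetical, and the external illness rates are invented purely for illustration.

```python
from collections import defaultdict


def column_means(rows):
    """Mean of each variable (column) across a list of areas (rows)."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def cluster_profiles(rows, labels):
    """Difference between each cluster's mean and the overall mean, per variable."""
    overall = column_means(rows)
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: [m - o for m, o in zip(column_means(members), overall)]
        for label, members in groups.items()
    }


def mean_by_cluster(values, labels):
    """Average of an external measure within each cluster."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


# Hypothetical areas: (pct_over_65, pct_no_central_heating) and their cluster labels.
rows = [(10.0, 2.0), (12.0, 4.0), (30.0, 12.0), (28.0, 14.0)]
labels = [0, 0, 1, 1]

profiles = cluster_profiles(rows, labels)
# {0: [-9.0, -5.0], 1: [9.0, 5.0]}: cluster 1 sits above the average on both variables

# Hypothetical external data: observed percentage of residents recorded as ill.
observed_ill = [3.0, 5.0, 9.0, 11.0]
illness_by_cluster = mean_by_cluster(observed_ill, labels)  # {0: 4.0, 1: 10.0}
```

A cluster that sits well above the average on the risk variables and also shows higher observed illness would support both its label and the classification as a whole.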

  • 12:17

    LUKE BURNS [continued]: And it involves mapping the spatial patterns. So here, we can integrate our output and link it to a geographical information system, such as MapInfo or ArcGIS. And that allows us to see spatially and visually, on a map, the areas that are most in need. [Advice for Using Geodemographics in Your Research]

  • 12:38

    LUKE BURNS [continued]: If you are looking to create a geodemographic system, I've probably got four pieces of advice for you. The first one is to thoroughly research the variables which you will input into your classification, as these will have a huge bearing on the final cluster solution. Look at academic literature. See what other people are saying about your chosen topic.

  • 12:58

    LUKE BURNS [continued]: In my case, I was looking at health. So I surveyed a wide range of health literature to determine the types of things people were talking about when talking about health deterioration. You could do the same for crime. You could do the same for sustainability. Survey the literature. Don't necessarily focus on geodemographic literature. Look at any literature that is concerned with your subject

  • 13:19

    LUKE BURNS [continued]: area. A second piece of advice is to think about the spatial scale at which you create your classification. There's often a misconception that a smaller scale or a finer geography is better here. So we're talking about very small spatial units, such as postcodes, because they generally give you more detail. And you can understand why people come to that decision. Because smaller areas mean smaller numbers,

  • 13:41

    LUKE BURNS [continued]: means more detail. But you need to think about using a spatial scale that is relevant for your particular classification. In the case of my research, it was about attributing funding from central government to local government. So we used a local government geography, which are quite large areas. So think about the purpose, think about why you're building the application, when you're

  • 14:03

    LUKE BURNS [continued]: deciding on a spatial scale. My third piece of advice is to have very good reasons for lots of the decisions you make along the way. Think about some of the things we've spoken about-- what variables to choose, how many variables to choose, what spatial scale you should choose, what constitutes a high correlation value. All these decisions are decisions that need to be made by you as the system creator.

  • 14:26

    LUKE BURNS [continued]: And a different decision in any of those will clearly have a big impact on the final cluster solution. So my real piece of advice is underpin these with science. Think hard about them. And if you're producing an extended project or dissertation, make sure you document your reasoning clearly, as no doubt the marker will have this in mind. My final piece of advice is don't

  • 14:46

    LUKE BURNS [continued]: be scared to change things. We've already spoken about lots of the decisions that you need to make along the way. And if you've produced your final cluster solution, don't be afraid of going back and changing things to see how it affects the final cluster solution. These are subjective decisions. So these decisions you make along the way are very important. Make sure designing a geodemographic system

  • 15:08

    LUKE BURNS [continued]: is iterative. Don't be scared to change things. [Advantages and Disadvantages of Geodemographics] So, to conclude, geodemographics is a very transferable research method. It can be applied in a wide range of areas, such as health-- as evidenced here-- but also crime, sustainability, retail, business, transport,

  • 15:30

    LUKE BURNS [continued]: almost any application area. So you may be able to see connections to your own research. Another great thing about geodemographics is it enables you to create your own data set. In this research, we were looking at areas that were at risk from health deterioration. There isn't actually a data set out there that allows us to look at that.

  • 15:50

    LUKE BURNS [continued]: So geodemographics allowed us to pull lots of other data sets together, such as the people aged over 65, such as the people without central heating, to derive-- which means to create new-- information. [Conclusion] Finally, I wish you every success if you decide to apply geodemographics

  • 16:12

    LUKE BURNS [continued]: in your research, particularly given how transferable it really is. If you're interested in finding out more, please have a look at the reading associated with this video. [MUSIC PLAYING]

Video Info

Publisher: SAGE Publications Ltd

Publication Year: 2018

Video Type: Video Case

Methods: Classification, Cluster analysis

Keywords: challenges, issues, and controversies; characteristics of groups; common-identity groups; data sources in conducting health services research; decision making; decision weights; governmental role in public health; health risk assessment; Spending on health care ...

Segment Info

Segment Num.: 1



Luke Burns details the process for creating a unique geodemographic classification system and how to utilize it for research.
