  • 00:01

    [MUSIC PLAYING] [Investigating European Migration to the United Kingdom Using Facebook]

  • 00:10

    FRANCESCO RAMPAZZO: So I'm Francesco Rampazzo. [Francesco Rampazzo, PhD Student] I'm a first-year PhD student at the University of Southampton. I'm doing a PhD in social statistics and demography. And the main idea of my PhD is to use internet data and social media data for estimating demographical behavior.

  • 00:30

    FRANCESCO RAMPAZZO [continued]: So since there were also before some question about demography, now we are going to see how we can repurpose a data set that is made for advertising for doing demographical analysis. In this first chapter of my PhD thesis, I'm going to look at the European migration to the UK--

  • 00:52

    FRANCESCO RAMPAZZO [continued]: a really hot topic at the moment, because we always hear newspapers, politicians, and researchers speaking about migration. We were hearing about migration during the Brexit campaign. We heard it during the political campaign

  • 01:13

    FRANCESCO RAMPAZZO [continued]: for the Italian election a few weeks ago. So this is really a topic that everybody's discussing. But the main question here is, which kind of data we're using for estimating migration? So Susie was saying before that the data are not

  • 01:36

    FRANCESCO RAMPAZZO [continued]: really timely and accurate. And, for example, the UK is using the International Passenger Survey for estimating migration. And this is a survey running since 1961. It's conducted in all the major airports, sea routes, and train stations.

  • 01:56

    FRANCESCO RAMPAZZO [continued]: But the sample is quite big. It's around 700,000 to 800,000 interviews. But if we look at how many migrants this survey is interviewing, there are just 4,000. Another source of data that the ONS is using for estimating migration is the Annual Population

  • 02:17

    FRANCESCO RAMPAZZO [continued]: Survey. And this survey has a sample size of 320,000 interviews and is combined with the Labour Force Survey. So this can also give us data about the economical background of the migrants. So surveys are one of the sources of data for migration.
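
The International Passenger Survey figures quoted above imply that migrants make up well under 1% of all interviews. A minimal sketch of that arithmetic in Python, using only the two numbers stated in the talk; the constant and function names are illustrative and do not come from any ONS publication:

```python
# Figures as quoted in the talk. The names below are illustrative only.
MIGRANT_INTERVIEWS = 4_000
TOTAL_INTERVIEWS_RANGE = (700_000, 800_000)  # lower and upper sample size


def migrant_share(migrants: int, total: int) -> float:
    """Return the fraction of all interviews that are with migrants."""
    return migrants / total


# One share per end of the quoted sample-size range.
shares = [migrant_share(MIGRANT_INTERVIEWS, total) for total in TOTAL_INTERVIEWS_RANGE]

for total, share in zip(TOTAL_INTERVIEWS_RANGE, shares):
    print(f"{total:,} interviews -> {share:.2%} are migrants")
```

Even with a very large overall sample, the estimate of migration rests on a few thousand interviews, which is the point being made about the limits of the survey.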

  • 02:39

    FRANCESCO RAMPAZZO [continued]: But if we look at Europe, the sources of data are quite different. We have censuses. We have surveys, as in the UK. We have administrative sources, like register data in the Scandinavian countries. But one of the main limitations of this data is the fact that the definition might vary between countries.

  • 03:01

    FRANCESCO RAMPAZZO [continued]: So when we are starting looking at the international migration, and we want to understand how many Italians are immigrating to the UK, this becomes quite hard because we-- Italy-- has a different definition of migrant than the UK. And so the definitions of short-term migrant and long-term migrant are not matching.

  • 03:24

    FRANCESCO RAMPAZZO [continued]: So this brings a problem to compare these different data sets. Also, the quality of data sets between European countries is quite different. We have issues, as Susie was saying, about time information. Because if we are considering a census, the last census made in the UK was 2011.

  • 03:47

    FRANCESCO RAMPAZZO [continued]: So right now, we are going to use a census that was true for 2011. But, in 2018, the situation has completely changed. Brexit is happening, so we don't know, really, if the European migrants or the migrants that were here in the UK are going to stay or they

  • 04:07

    FRANCESCO RAMPAZZO [continued]: are going to move back to their own country. And those-- all the sources are really expensive. So the main questions that we have researched, and I think, also, many field is, can we estimate migration with new data sources? And this doesn't mean that we want to use just the new data

  • 04:29

    FRANCESCO RAMPAZZO [continued]: sources, but we want to combine the new data sources with the traditional data sources for having better estimates of what is going on in migration. So internet is one of these new data sources. It's spreading really fast.

  • 04:51

    FRANCESCO RAMPAZZO [continued]: All over the world, we are using internet. And I'm showing here, in this first graph, a bulletin from the International Organization for Migration that was just released a few weeks ago in which they are telling us that 53% of the population in the entire world is using the internet and 68% are using mobile phones.

  • 05:16

    FRANCESCO RAMPAZZO [continued]: And, actually, social media is also a big part of this internet use. So we should start thinking about using social media data for estimating demographical events. Also this is true for, if we consider the entire world. But if we look at Britain, so at the Western countries,

  • 05:37

    FRANCESCO RAMPAZZO [continued]: the data are even better because-- and here, I'm showing data from the Oxford Internet Institute. And, in this map, we see that Greater London has a really high percentage. And I think you cannot read it, but it's 88%, 89% of the people are using the internet.

  • 05:58

    FRANCESCO RAMPAZZO [continued]: But, also, when it's not really bright, the color, the lower percent is 59%. So this means that, really, more than half of the population is using the internet. And we can use this source of data for doing analysis on people. So we started also in demography to investigate

  • 06:22

    FRANCESCO RAMPAZZO [continued]: whether we can use internet data and not just traditional data sources. And one of my supervisors is Emilio Zagheni, who gave a talk one month ago at the International Forum on Migration Statistics in which he's telling-- he was saying that there are three main reasons why

  • 06:43

    FRANCESCO RAMPAZZO [continued]: we should be using internet data for demography. The first one, I think we heard a lot about this, is for improving existing statistics. The second one is for adding a new dimension, which means for looking at the assimilation of migrants in the country of destination

  • 07:05

    FRANCESCO RAMPAZZO [continued]: or for studying-- through Strava or Citymapper that is really used here in London-- how people are commuting from the area in which they live to the place in which they work. But, also, the third reason is for having information on an out-of-reach population, which

  • 07:26

    FRANCESCO RAMPAZZO [continued]: means for looking at small migrant groups or refugees. So groups of individuals-- that is not really easy to catch through a survey or to interview. So here, I have a little bit of literature review on the use

  • 07:48

    FRANCESCO RAMPAZZO [continued]: of internet data for-- but not just internet data-- for demography. The first paper is from Blumenstock and it shows the use of mobile phone records, as Susie was mentioning before, for studying internal migration in Rwanda. And, specifically for the developing world,

  • 08:10

    FRANCESCO RAMPAZZO [continued]: data are really lacking. So having these data sources can help us to understand what is going on there. Then, the use of Twitter was massive since Twitter is freely accessible and geolocated. A lot of work has been done through this data source--

  • 08:30

    FRANCESCO RAMPAZZO [continued]: so Hawelka studied global mobility patterns; Swier, from the ONS, studied residence and mobility in the UK. But, also, Twitter can help us in understanding the definition-- how can they vary of migration? So short-term mobility and long-term migration,

  • 08:52

    FRANCESCO RAMPAZZO [continued]: and also make the difference between travelers. And the last paper here is from Zagheni, Gummadi, and Weber, and is about using Facebook advertising platform that is the same data source that I'm using for this research. And about hard to reach population,

  • 09:14

    FRANCESCO RAMPAZZO [continued]: I wanted to give you some example. The first one is by State and colleagues, and they used LinkedIn data for estimating migration of professionals to the US. So we have also to think at different social media for looking at the different groups.

  • 09:35

    FRANCESCO RAMPAZZO [continued]: I will not use LinkedIn for estimating a low job migration, but I can use LinkedIn for estimating migration of professionals to a country and to understand which are the skills and which are the professions that a country's able to attract. And the other paper here is by Potzschke and Braun.

  • 10:00

    FRANCESCO RAMPAZZO [continued]: And they are using Facebook for conducting a survey and to find Polish migrants in four European countries. And one of these countries is the UK. So my contribution to their research and to the demographical area is to use

  • 10:23

    FRANCESCO RAMPAZZO [continued]: these aggregated and anonymized Facebook advertising data for estimating European migration to the UK. So, today, I'm going to show just a static picture, as we were saying before. So I will just show the data for one time point. But what I'm doing is to collect the data every week

  • 10:44

    FRANCESCO RAMPAZZO [continued]: to try to create a time series. And then I'm going to see if I can see some decrease, increase, or stabilization of the number of migrants in the UK. And, also, I'm going to use the other variables to stratify this data set and for looking

  • 11:05

    FRANCESCO RAMPAZZO [continued]: at the characteristics of the migrants that are coming to the UK. One of the big challenges of this project is really to understand the bias and the representativeness of the data because not everybody is on Facebook and, also, it is a self-selecting sample. So we really have to understand how

  • 11:27

    FRANCESCO RAMPAZZO [continued]: to calibrate these data with traditional data sources to reduce the bias. So this is how Facebook advertising platform looks like. It's a website. And if I were an Italian moving here to London and I wanted to advertise my new restaurant,

  • 11:49

    FRANCESCO RAMPAZZO [continued]: what I was going to do is to start a campaign on Facebook and start targeting people living in Greater London and the specific age group, and so on. So what I am doing is looking at all the individual Facebook users living in the UK. I can specify the age group.

  • 12:09

    FRANCESCO RAMPAZZO [continued]: I can say the gender. I can even tell which language these people are speaking or using in the internet. The variable that I'm using is the variable expat. So Facebook also provides information about the country of origin of the Facebook users.

  • 12:32

    FRANCESCO RAMPAZZO [continued]: Of course, here, we don't really know how Facebook is doing this. But we are trying to understand if this is correct or not. Anyway, this is a goldmine of data, but there are also limitation--

  • 12:52

    FRANCESCO RAMPAZZO [continued]: limitation that we have to take into account through our calibration approach. And there was this article that came out in September, before I started a PhD, and I was really thinking, mamma mia, what I'm going to do now that they are saying that Facebook is overestimating people?

  • 13:15

    FRANCESCO RAMPAZZO [continued]: But we have to take this into account, and we have to understand why, which groups. And we have to be aware that 10% of the Facebook profiles are actually fake. So these are all considerations that we have to train our model to understand. So my data set--

  • 13:36

    FRANCESCO RAMPAZZO [continued]: and, as I said before, I'm downloading this every week-- is made from the variable countries. So I'm estimating the data for all the Great Britain, but also for England, Wales, Northern Ireland, and Scotland. I have the variable sex. I have the country of origin, the age groups, and an education variable.

  • 13:59

    FRANCESCO RAMPAZZO [continued]: Problem is, of this variable-- education-- that is a default for the American system. So here in Europe, we have a really different system. But we-- this is the variable that we have, and we have to think about how to use it best. My data set is made up by 38 million Facebook

  • 14:22

    FRANCESCO RAMPAZZO [continued]: anonymized and aggregated group of individuals, of which 31 million are British and 7 million are migrants. So this is the first descriptive graph that I'm showing you. It tells a lot of stories. So, in this graph, I'm showing you data from the Office for National Statistics, their Annual

  • 14:45

    FRANCESCO RAMPAZZO [continued]: Population Survey for 2017. And is the red bar. And the other bar is the Facebook estimate for October 2017. I have to point out that Facebook, I'm downloading data for people over 15 years old.

  • 15:05

    FRANCESCO RAMPAZZO [continued]: So, also, the discrepancy is in the fact that I'm not collecting data for minors on Facebook because they are not on Facebook. So what we can see from here is that sometimes Facebook is overestimating the migrants, sometimes it's estimating quite accurately, and sometimes it's underestimating.
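
The three outcomes described here (overestimating, estimating accurately, underestimating) can be sketched as a simple ratio check between a Facebook estimate and a survey estimate. This is only an illustration: the function name, the 10% tolerance, and the counts are assumptions made up for the example, not figures from the talk and not the calibration method used in the research.

```python
def compare_estimates(facebook: float, survey: float, tolerance: float = 0.10) -> str:
    """Label a Facebook estimate relative to a survey estimate.

    The 10% tolerance band is an arbitrary assumption for this sketch.
    """
    ratio = facebook / survey
    if ratio > 1 + tolerance:
        return "overestimate"
    if ratio < 1 - tolerance:
        return "underestimate"
    return "close"


# Placeholder counts in thousands, invented purely for illustration:
# (facebook_estimate, survey_estimate) per hypothetical country of origin.
example_counts = {
    "country_a": (800, 1000),
    "country_b": (450, 400),
    "country_c": (205, 200),
}

labels = {
    country: compare_estimates(facebook, survey)
    for country, (facebook, survey) in example_counts.items()
}

for country, label in labels.items():
    print(country, label)
```

A ratio like this only describes the discrepancy. It does not say which source is wrong, which is why the talk turns to calibration against traditional data.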

  • 15:30

    FRANCESCO RAMPAZZO [continued]: We can see that Polish is the biggest group of migrants here in the UK, and Facebook is underestimating. But, for example, Romanian-- that is the second largest group after Irish-- Facebook is overestimating them. But for some other countries like Italy, Spain, and France,

  • 15:55

    FRANCESCO RAMPAZZO [continued]: the data is quite close to what the ONS is showing us. We'd really have to understand-- and this will take further analysis-- if, in some cases, it is the ONS that is underestimating or it is Facebook that is overestimating. And we will do this through a calibration approach

  • 16:16

    FRANCESCO RAMPAZZO [continued]: and using my time series data. Here, actually, some of my PhD colleagues were saying that I was not supposed to show this graph back, but since I didn't know much about the UK, what I'm showing here is the British population and there are all the migrants in the UK.

  • 16:36

    FRANCESCO RAMPAZZO [continued]: And, here, there are the bars for England, Scotland, Wales, and Northern Ireland. And the order we're seeing from this graph is the fact that-- and people might think that, OK, this is how it is-- but the migrants are in England, and not many are in Scotland, Wales, and Northern Ireland.

  • 16:57

    FRANCESCO RAMPAZZO [continued]: Here, I have the Facebook age profile of the people in the UK. I have this first graphic for the British population, all migrants, and European migrants. What I can see on here is that Europeans are coming in their working age or for education.

  • 17:20

    FRANCESCO RAMPAZZO [continued]: And this is what, also, we are expecting. But when we consider all the migrants, the largest group is between 40 and 49 years old. And this might be also people from the Commonwealth that migrated here 20, 30 years ago. So one of the good things about Facebook

  • 17:40

    FRANCESCO RAMPAZZO [continued]: is that it's really providing us with demographic characteristics of the people that we want to study. And here, I have my educational variable. There are some issues with this variable since many-- the biggest group is the one of unspecified.

  • 18:00

    FRANCESCO RAMPAZZO [continued]: So we don't really know which educational background these people have. But then, we can see that we have people with the target weighted. We have people in high school, in college, non-degree in high school, and in grad school. And these are absolute numbers, and the graph is not on the same scale.

  • 18:21

    FRANCESCO RAMPAZZO [continued]: But what I was saying is that if we consider all migrants in this proportion, this proportion a little bit lower, but European seems to be a little bit more educated than the general migrants here to the UK. So, for concluding, these data are showing a lot

  • 18:44

    FRANCESCO RAMPAZZO [continued]: of potential for understanding the characteristics of migrants. But it's really necessary to understand the bias of this data. And what I'm going to do next is to use out-of-sample validation, combination of traditional data and new data and my data.

  • 19:06

    FRANCESCO RAMPAZZO [continued]: And so use a sort of machine learning approach for trying to understand how we can use this data, how to make this data better, and extract some conclusion about migration here in the UK. And I want to thank my supervisor, Agnese Vitali,

  • 19:27

    FRANCESCO RAMPAZZO [continued]: Jakub Bijak, Ingmar Weber, and Emilio Zagheni, and all of you. Thank you. [APPLAUSE] [MUSIC PLAYING]

Video Info

Publisher: SAGE Publications Ltd

Publication Year: 2019

Video Type: Tutorial

Methods: Social media research

Keywords: bias; cell phones; data analysis; demographic analysis; demographic data; demographic studies; education; Facebook; internet data collection; internet users; LinkedIn; migration; representation; social media; statistics; Twitter

Segment Info

Segment Num.: 1

PhD student Francesco Rampazzo reviews his research on using Facebook's advertising platform to estimate European migration to the UK, including some of the limitations and biases that must be accounted for.