  • 00:00

    [MUSIC PLAYING] [Predictive Modelling on Open Data to Improve City Traffic: Transnet]

  • 00:19

    CHICO CAMARGO: My name is Chico Camargo. I'm a postdoctoral researcher and data scientist here at the Oxford Internet Institute at the University of Oxford. The OII is quite a young institute. We're studying how humans behave in the age of data. So rather than being the computer scientists developing the infrastructure behind the internet,

  • 00:41

    CHICO CAMARGO [continued]: we're studying what humans do, what humans do with data, and what data has done to us. That goes from people studying fake news and Twitter bots to the ethics of robots and how far people should be using others' data, and also, closer to my project, how having this so-called big data helps

  • 01:02

    CHICO CAMARGO [continued]: us understand how humans move across urban places like cities. [MUSIC PLAYING] I'm working in a project called TRANSNET. We're trying to learn something about the traffic here in Oxford but also about how traffic behaves, so we can perhaps extrapolate that to learn about traffic

  • 01:25

    CHICO CAMARGO [continued]: elsewhere and to perhaps try to use that for policymakers to make better decisions. I know that there are a couple of smart cities in the world. Cities that are data-driven cities that use data to regulate themselves, to improve themselves. But the problem with smart cities is that it often requires a lot of data-collecting sensors

  • 01:46

    CHICO CAMARGO [continued]: to be deployed, and that's expensive. And what we are asking here is whether you can do that without all the expensive machinery, whether you can use open data, data that's already been collected by the government, to then try to improve how we predict and understand traffic. We are using three sources of data. One is the Office of National Statistics

  • 02:07

    CHICO CAMARGO [continued]: that collects demographic data, so really the number of people living in each neighborhood of Oxfordshire. The second source of data is data from smartphones. People that leave their location on are providing data to their mobile provider. And what we have through a partnership

  • 02:27

    CHICO CAMARGO [continued]: with the Alan Turing Institute is that that data is aggregated, so we don't know what each person is doing, and then anonymized. So all I know is how many people are going from A to B, but it comes from someone's GPS originally. The third source of data is the OpenStreetMap that comes from individuals just volunteering and tagging

  • 02:48

    CHICO CAMARGO [continued]: things on their phones as well. Saying, well, in this location, there is a shop, et cetera. So that's collated by individuals and then put together into this one big file that we use to make our predictions. This is Oxfordshire, and Oxford is in this tiny black circle in here. It's the part on the right.

  • 03:10

    CHICO CAMARGO [continued]: We're looking at how many people go from one of these wards to another ward. And rather than looking at the individual trajectory of one person, we're looking at just how many people go from A to B. And here this diagram shows a lot of dots. And every dot represents one point in the OpenStreetMap dataset.
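
As a rough illustration of the ward-to-ward aggregation described in this passage, the following Python sketch counts how many trips go from A to B and drops any notion of who made them. The ward names, the trip records, and the helper names `aggregate_flows` and `inflow_per_ward` are invented for the example; this is not the project's actual pipeline.

```python
from collections import Counter

# Hypothetical trip records as (origin_ward, destination_ward) pairs.
# No traveller identity is kept, only the two endpoints.
trips = [
    ("Headington", "Jericho"),
    ("Headington", "Jericho"),
    ("Cowley", "Jericho"),
    ("Jericho", "Headington"),
    ("Cowley", "Headington"),
]


def aggregate_flows(trip_records):
    """Count trips for each (origin, destination) pair."""
    return Counter(trip_records)


def inflow_per_ward(flows):
    """Sum the flows arriving in each destination ward."""
    inflow = Counter()
    for (_origin, destination), count in flows.items():
        inflow[destination] += count
    return inflow


flows = aggregate_flows(trips)
inflow = inflow_per_ward(flows)

for (origin, destination), count in sorted(flows.items()):
    print(f"{origin} -> {destination}: {count}")
```

The per-ward inflow totals are the kind of quantity that could then be compared against ward-level features, such as the number of OpenStreetMap points in each ward.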

  • 03:30

    CHICO CAMARGO [continued]: And you can use this information to then predict how many people are going to go into that ward. [MUSIC PLAYING] TRANSNET has got about a year and a half to go. And my goal for it is to go from this part in which we're

  • 03:53

    CHICO CAMARGO [continued]: trying to understand how the city works to then going into predicting the impact of very concrete actions. We've been in touch with the Oxford County Council. We meet them a couple times a year. We're trying to help the County Council predict where people are going to go, so both seeing where traffic jams are going

  • 04:16

    CHICO CAMARGO [continued]: to happen and also the impact of perhaps having a new school, a hospital in traffic. Does a school create more traffic jams? Well, probably. But going from "probably" to "yes, it creates 50% more traffic jams," we're trying to get that number right. [MUSIC PLAYING]

  • 04:39

    CHICO CAMARGO [continued]: I think our project is split in two parts. The first one is to just collect all that data, which is mostly done by either literally going into the website and finding what you need to download. Or when it's not so easy, I just need to write a couple scripts to scrape all that data online. Often, I just use the Python programming language.

  • 04:59

    CHICO CAMARGO [continued]: And then, once I've got all that data, then it's time to deploy a couple simple machine learning tools, like unsupervised learning, to see if I can sort of cluster those high-dimensional data points and all that to make it more useful to me. I think personal development and personal learning experience

  • 05:20

    CHICO CAMARGO [continued]: for me was to understand how data really doesn't behave, how data wasn't made for data scientists, how you really have to kind of clean it and turn it into a tool that you can use. I find that data is maybe a little rock, and that information is a gem. It's not just downloading something and plugging

  • 05:42

    CHICO CAMARGO [continued]: in a mathematical tool. No, it's really making these things meet each other in the middle. After I understand it, then it's my job to make other people understand it. Data visualization is a fundamental part of my job. It's not just illustrating my data, but it's communicating what I want them to understand.

  • 06:04

    CHICO CAMARGO [continued]: On the left side here, we've got something that you can just read from the data. I know how many people are leaving each neighborhood of Oxfordshire at a given day at a given time. And here, darker colors means more people leaving a ward or a neighborhood. The model side, all I mean here is that I've got this mathematical tool.

  • 06:25

    CHICO CAMARGO [continued]: It's essentially a function that takes the population of a ward or a pair of wards, it takes the distance between them and tries to estimate how many people will be going from A to B. The model will make a simplified version of Oxfordshire and try to say, in this understanding of our city, what does the traffic

  • 06:46

    CHICO CAMARGO [continued]: look like? And if you'll look closely, you'll see that they don't match. There's a big light area here, meaning not a lot of people leaving these spots. And here this part is kind of light, but this is still a bit darker. So it means that the model is predicting a lot of people would be commuting out of these neighborhoods. And the data says not that much.
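
The video does not name the exact model, but a function of two ward populations and the distance between them is the shape of a classic gravity model, so the sketch below assumes that form: flow = k * P_origin * P_destination / distance ** beta. The ward labels, populations, distances, observed counts, and the parameters `k` and `beta` are all made up for illustration. It also shows a simple way to put a number on the mismatch between model and data.

```python
def gravity_flow(pop_origin, pop_dest, distance_km, k=1.0, beta=2.0):
    """Estimate trips between two wards as k * P_o * P_d / d**beta."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return k * pop_origin * pop_dest / distance_km ** beta


def predict_all(populations, distances_km, k, beta):
    """Apply the gravity model to every ward pair with a known distance."""
    return {
        (origin, dest): gravity_flow(
            populations[origin], populations[dest], dist, k, beta
        )
        for (origin, dest), dist in distances_km.items()
    }


def mean_absolute_error(predicted, observed):
    """Average absolute gap between predicted and observed flows."""
    return sum(abs(predicted[pair] - observed[pair]) for pair in observed) / len(observed)


# Hypothetical wards, distances (km), and observed trip counts.
populations = {"A": 6000, "B": 3000, "C": 1000}
distances_km = {("A", "B"): 2.0, ("A", "C"): 5.0, ("B", "C"): 4.0}
observed = {("A", "B"): 400, ("A", "C"): 30, ("B", "C"): 20}

predicted = predict_all(populations, distances_km, k=1e-4, beta=2.0)
error = mean_absolute_error(predicted, observed)

for pair in sorted(observed):
    print(pair, "model:", round(predicted[pair], 2), "data:", observed[pair])
print("mean absolute error:", round(error, 2))
```

Adjusting `k` and `beta` to shrink the error is one simple way to bring the model closer to the data.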

  • 07:07

    CHICO CAMARGO [continued]: And that's OK. A model is not a perfect simulation of reality. A model's just a way to try to estimate how things are behaving. And as you modify the simulation, you try to make it look more like the city, and that helps you understand and predict how traffic is happening in your city.

  • 07:29

    CHICO CAMARGO [continued]: In the past, I have worked a lot with the mathematical models that people use in biology. But I wanted to deal more with data and less with just models. My role here in this project is to use a lot of the models that people use in traffic and also in biology, actually in the study of mobility in general,

  • 07:49

    CHICO CAMARGO [continued]: to try to predict how cars are going, how people are going to different places. And, to me, it's quite exciting because I'm one of those data points. I am a person that's walking somewhere in Oxford going to get a coffee or something. And I'm not just studying some abstract data in an abstract space. No, I'm studying how people here in my town behave.

  • 08:10

    CHICO CAMARGO [continued]: And that's quite exciting to me. It feels very personal. I think, today, because we have so many things collecting our data, sometimes without our consent, developing those methods that allow us to do research without getting close to people's private lives

  • 08:31

    CHICO CAMARGO [continued]: is quite an important thing, quite an important ethical decision that shows other researchers as well that you can still develop a lot of data science without breaching into people's lives. As a researcher, my main job every day is to do something that has never been done before.

  • 08:53

    CHICO CAMARGO [continued]: And I'm 100% sure that that's the most exciting thing about being a researcher to me is that whatever I try and whenever I fail, which happens all the time, I'm OK with failing because no one's done this before. So if I failed, maybe it's impossible, or maybe it's just hard, but that's OK. You have your predictions, how you think your, in this case,

  • 09:14

    CHICO CAMARGO [continued]: city would behave, and you've got how it actually behaves. And there's a mismatch. But whenever I can bring them closer and closer, I'm like, aha, now I understand why my city behaves in that way. The way I imagine our research being used in the future is that a person in any County Council,

  • 09:35

    CHICO CAMARGO [continued]: not necessarily Oxford, could plug in some simple data sources that are open and available to a lot of cities in the world and quickly predict how traffic in their city is going to go but also to predict the impact of having new buildings of different sorts

  • 09:55

    CHICO CAMARGO [continued]: in the city. So if you were to ask a very practical question, if we build a hospital here or there, how is that going to affect people going from this neighborhood to that one? And I think the outcome of our project would be to let people predict that with a good degree of accuracy in any city in the world, not just in a country level, but also in a smaller town level.

  • 10:18

    CHICO CAMARGO [continued]: I think that would be an amazing outcome. They can turn their city into a smart city.

Video Info

Publisher: SAGE Publications Ltd.

Publication Year: 2019

Video Type: In Practice

Methods: Computational modelling, Prediction, Data visualization, Spatial analysis

Keywords: data sources: empirical data; data visualisation; models and modeling; monitoring; movement; personal data; programming and scripting languages; research; traffic congestion; traffic control; transportation

Segment Info

Segment Num.: 1

Chico Camargo, postdoctoral researcher and data scientist at the Oxford Internet Institute, discusses TRANSNET, a project designed to forecast and understand transportation network resilience and anomalies.
