  • 00:02

    [MUSIC PLAYING]

  • 00:12

    MARTIN ZALTZ AUSTWICK: My name's Martin Zaltz Austwick. I'm a lecturer here at CASA. The research I do tends to focus around data visualization-- looking at data sets that are generated from social systems, cities, transport systems, and visualizing them, understanding them, analyzing them, and then communicating them. The data I've been looking at is around bikeshare schemes.

  • 00:33

    MARTIN ZALTZ AUSTWICK [continued]: And this is an example of one. We've got one here in London. This is the one that's currently sponsored by Santander. You turn up at one of these bike stands. There's a bunch of bikes. You get on one, cycle off, and you go to wherever you want to go in the city-- obviously where there's a bike stand-- put it back again. So it's a bit like it's a public transport system, and a private transport system. Hey!

  • 00:57

    MARTIN ZALTZ AUSTWICK [continued]: Anytime a bike gets taken away or slotted in, that registers on the system. So that enables you to track, for an individual bike, and therefore for an individual journey, where it's been taken from, where it's been taken to, and exactly when that happened. [MUSIC PLAYING] So the first stage to getting this data

  • 01:18

    MARTIN ZALTZ AUSTWICK [continued]: is to actually get onto the Transport for London website. And it is open data, so they ask you to register for an account. And then once you've done that, you can get in. They've actually got a whole scope of different data sets. But one of the feeds they've got is around the Santander cycles. They've got live data. They've got a stream for looking at what the current status of the system is,

  • 01:39

    MARTIN ZALTZ AUSTWICK [continued]: but also you can get historical data from that. Now typically, there will be problems with that data. We'll have to work with it to clean it up and to make sense of it, which is a lot of the meat and potatoes of working with data, is actually just making it clean. The next step, actually having got that data, is to pull it into a database so we can work with it-- we can slice it and query it and interrogate it. For example, I can look at particular information

  • 02:01

    MARTIN ZALTZ AUSTWICK [continued]: about the actual journeys. So these are individual journeys which were taking place over the course of the sample period. And then there's also information which tells me where those locations are. So from this, I can start to make queries. I can aggregate all these individuals up into flows. I can select particular time periods. I can look at weekdays versus weekends--

  • 02:22

    MARTIN ZALTZ AUSTWICK [continued]: all of those different ways of slicing the data we talked about. With these data sets, they're pretty big. So the data set I was looking at was journey data. And if you just look at a table of that data, you're not really necessarily going to get any real insight into what's going on with that. So we put that into some sort of visualization. One of the first things I did was to visualize those individual journeys.
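
The slicing described above (aggregating individual journeys into flows, and separating weekdays from weekends) could be sketched roughly as follows. This is only an illustration in Python, not the speaker's own code or database schema: the record layout, the station names, the times, and the `flows` helper are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical journey records: (start station, end station, start time).
# Station names and times are made up for illustration.
journeys = [
    ("Waterloo", "Bank", datetime(2010, 12, 20, 8, 5)),         # a Monday
    ("Waterloo", "Bank", datetime(2010, 12, 21, 8, 10)),        # a Tuesday
    ("Bank", "Waterloo", datetime(2010, 12, 20, 17, 30)),       # a Monday
    ("Hyde Park", "Hyde Park", datetime(2010, 12, 25, 11, 0)),  # a Saturday
]


def flows(records, weekend=None):
    """Count journeys per (start, end) pair.

    weekend=None keeps every journey, True keeps only Saturday/Sunday,
    False keeps only Monday to Friday.
    """
    counts = Counter()
    for start, end, when in records:
        is_weekend = when.weekday() >= 5  # Monday is 0, Sunday is 6
        if weekend is None or is_weekend == weekend:
            counts[(start, end)] += 1
    return counts


weekday_flows = flows(journeys, weekend=False)
weekend_flows = flows(journeys, weekend=True)
print(weekday_flows.most_common(1))
```

In a real workflow the same grouping would more likely be done as a query inside the database; the in-memory version here just makes the idea of a flow (a count per origin and destination pair) concrete.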

  • 02:42

    MARTIN ZALTZ AUSTWICK [continued]: To say, where are they starting, where are they ending up? And you can see, through time, where the cyclists are, and over the scope of the whole city where all the people are moving. That doesn't give you any sort of analysis or summary. But it starts to let you see the patterns in the data. So then you might go onto a more specific and detailed analysis.

  • 03:05

    MARTIN ZALTZ AUSTWICK [continued]: And then the final stage might be to combine that analysis with an output visualization to communicate the summary and the results of what you found about the data. And that's what I did with this particular data set, or this set of data sets, around bikeshare schemes around the world. [MUSIC PLAYING] So visualization is often considered the end of a journey.

  • 03:26

    MARTIN ZALTZ AUSTWICK [continued]: And that you've analyzed all the data, you know what your answer is, and you present it. But it's actually very useful in understanding the data as you go through the process. Now the fact is, there's much more sophisticated methods to visualize data and analyze data now. And picking a visualization method which brings out the data might point you towards what sort of analysis method you want to use. This is the piece of code I used to create

  • 03:48

    MARTIN ZALTZ AUSTWICK [continued]: the visualizations of moving bikes through the system. And it's done in a language called Processing. It's very geared towards visual outputs. So it's a language for creating animations, interactive, and things like that. So this piece of software will go through. And it will start up, say, 6 o'clock in the morning. And it will ask the database, what journey started at 6:00 AM? And then pour them into the software,

  • 04:09

    MARTIN ZALTZ AUSTWICK [continued]: and then it'll start the journey of that particular bike and that particular cyclist. And you can actually build up a series of journeys. And while this interrogation of the database is happening, there's an animation going on, which is moving the cyclists between a start point and this end point. So if you're interested in something like-- I don't know, something which is a spatial behavior, a very natural thing

  • 04:29

    MARTIN ZALTZ AUSTWICK [continued]: would be to map it. To create a map of how that varies across the country or the city. If you're interested in the time trend, a very natural thing to do would be to plot it over time, and see if something's increasing, decreasing, or how it's fluctuating. You want to pick something where it's illuminating the variational property of the data that you're interested in understanding.

  • 04:50

    MARTIN ZALTZ AUSTWICK [continued]: Whether it's spatial, temporal, has to do with connections between different things. So this is a day we just chose. It's Christmas Day, 2010. And you can see, these are the journeys that are taking place on that day. Where they're starting, where they're ending up. You've got a clock, you can see what time of day it is. Christmas Day is really, really busy, which is quite surprising. But there's no other public transport on Christmas Day.
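
The animation described earlier, where each cyclist is moved between a start point and an end point as the clock advances, can be illustrated with a small sketch. The original was written in Processing and is not shown in the video, so this Python version is an assumption throughout: the journey fields, the coordinates, the straight-line movement, and the helper names `active`, `position` and `frame` are all hypothetical.

```python
# Hypothetical journeys: times are minutes after midnight, coordinates are
# arbitrary screen positions for the start and end docking stations.
journeys = [
    {"depart": 360, "arrive": 380, "start_xy": (0.0, 0.0), "end_xy": (10.0, 20.0)},
    {"depart": 365, "arrive": 375, "start_xy": (5.0, 5.0), "end_xy": (5.0, 25.0)},
    {"depart": 400, "arrive": 420, "start_xy": (1.0, 1.0), "end_xy": (2.0, 2.0)},
]


def active(records, t):
    """Journeys that have started but not yet finished at time t."""
    return [j for j in records if j["depart"] <= t < j["arrive"]]


def position(journey, t):
    """Straight-line interpolation between the start and end points.

    Assumes the cyclist moves at constant speed along a straight line,
    which is a simplification made for this sketch.
    """
    frac = (t - journey["depart"]) / (journey["arrive"] - journey["depart"])
    (x0, y0), (x1, y1) = journey["start_xy"], journey["end_xy"]
    return (x0 + frac * (x1 - x0), y0 + frac * (y1 - y0))


def frame(records, t):
    """Positions of every cyclist on the road at time t: one animation frame."""
    return [position(j, t) for j in active(records, t)]


# Step the clock forward from 6:00 AM and print each frame.
for t in (360, 370, 380):
    print(t, frame(journeys, t))
```

Drawing each frame's positions as dots, one frame per clock tick, would give the kind of moving picture described in the transcript.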

  • 05:10

    MARTIN ZALTZ AUSTWICK [continued]: And there's a bar here, which is just an aggregate measure of how many bikes are on the road at this particular point in time. So you have a very detailed picture of what's happening in the city. Where are the cyclists? How quickly are they moving? Where are they starting off? Where are they ending up? I'm using programming languages like Processing or Python or to build up images and patterns from the data

  • 05:32

    MARTIN ZALTZ AUSTWICK [continued]: at a very low level. That's quite a technical process. You have to know your way around programming to do that. But at the same time, in producing visualizations, you're thinking about what you're presenting, and to whom you're presenting it. Visually, I'm producing just for myself, so that I can see the patterns. It doesn't need the same amount of labeling and context and highlighting of different features than if I'm producing it to present at a conference,

  • 05:54

    MARTIN ZALTZ AUSTWICK [continued]: or put in a publication. So this is all the places where bikes are leaving. If it's a cyan color, that's bikes leaving a location. If it's red, it's bikes arriving. In the middle of the day, it's a bit more of a mix. The red and the cyan co-combine. Because they're complementary colors, you get white. So you get towards the afternoon, and these places where people were leaving, which is-- that's Waterloo

  • 06:16

    MARTIN ZALTZ AUSTWICK [continued]: and King's Cross. That's places where people are arriving. So at the end of the day, people are going from the center out to the edge of the city. Broadly speaking, open data is data that anyone can get access to, and anyone can work with. There's varied criteria that will define how open data is. So getting hold of it is something that's quite important.

  • 06:37

    MARTIN ZALTZ AUSTWICK [continued]: It's sitting on a web server somewhere is also-- that's another step up in openness. And then there's various criteria which have to do with the format of the data. So ideally, you want a text file, which in a hundred years' time, someone will be able to open and do exactly the same analysis that you've done. So I think open data has been known for a few years. People like Tim Berners-Lee are very active in it.

  • 06:58

    MARTIN ZALTZ AUSTWICK [continued]: There's an open data institute down the road here in London. And in the last few years, I think it's become quite a big thrust, especially for open government data. All of this information that's collected on our behalves, but in the past, wasn't necessarily shared or made available for easy use,

  • 07:20

    MARTIN ZALTZ AUSTWICK [continued]: is now coming to light. The advance of the open data movement is that there's just a lot more access to good data. And it just enables a lot of research that couldn't previously have happened. And obviously, with the number of different data sets that may overlap, or they might relate to similar geographies or similar topics-- so people can start to put together new analyses and new research

  • 07:41

    MARTIN ZALTZ AUSTWICK [continued]: questions. I don't need to rely on the credential as being a [INAUDIBLE] academic to get this stuff. So anyone can do it. A lot of the techniques and methods I'm using are open source, as well. So the software's open source. The methodologies are open source. So for the first time, you're getting a workflow, if you like, where someone with the time and the skill

  • 08:04

    MARTIN ZALTZ AUSTWICK [continued]: can reproduce it. And I think that's quite interesting from the perspective of critiquing and arguing against the conclusions that I come up with. So you can kind of see Hyde Park-- that's Hyde Park there popping out. That's quite strong. The bridge is obviously very important, where people are coming from Waterloo, trying to cross the river up into the City of London, which is around here.

  • 08:25

    MARTIN ZALTZ AUSTWICK [continued]: There's quite a lot of activity on these bridges here. But you see the structure of the city popping out. Social network analysis is the root of network analysis, arguably. And [INAUDIBLE] that we're all interconnected in some way. And the reason that I applied it to bikeshare data was to understand the structure of the connections of how people move around the city. So there's a bit of a network analysis called

  • 08:48

    MARTIN ZALTZ AUSTWICK [continued]: community detection, which is if you're making quite a big network of things connected up together, are there sub-networks within that? One of the things this community detection helps you to do is to spot groups of connections which are stronger than they should be, just based on proximity. The weight of the edges on this graph is telling you how frequent journeys are between those two

  • 09:09

    MARTIN ZALTZ AUSTWICK [continued]: locations. So you can see Hyde Park is popping out very strongly. People are traveling around and about Hyde Park. And people are traveling from Waterloo Station out towards the east part of London. That's called the Financial District of the City of London. If you look at the way that people cycle around Hyde Park, people cycle more within Hyde Park than you'd expect, just from the distance.
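
To make the idea of sub-networks in a weighted journey graph a little more concrete, here is a rough Python sketch that scores a proposed grouping of stations using modularity, one common quality measure in community detection. The transcript does not say which algorithm or measure was actually used, so treating modularity as the score is an assumption. Note also that standard modularity compares against what you would expect from how busy each station is, not from distance. The station labels and journey counts are made up.

```python
from collections import defaultdict

# Hypothetical journey counts between six made-up docking stations.
# Each key is an undirected pair; the value is the edge weight.
edges = {
    ("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10,
    ("D", "E"): 10, ("E", "F"): 10, ("D", "F"): 10,
    ("C", "D"): 1,
}


def modularity(edges, community_of):
    """Weighted modularity of a partition of an undirected graph.

    For each community: (weight inside it / total weight) minus
    (its share of all edge ends) squared, summed over communities.
    """
    total = sum(edges.values())
    inside = defaultdict(float)   # edge weight with both ends in the community
    degree = defaultdict(float)   # summed weighted degree of the community
    for (a, b), w in edges.items():
        degree[community_of[a]] += w
        degree[community_of[b]] += w
        if community_of[a] == community_of[b]:
            inside[community_of[a]] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {station: 0 for station in two_groups}

print(modularity(edges, two_groups))  # clearly positive: two tight clusters
print(modularity(edges, one_group))   # zero: no structure claimed
```

A community detection algorithm would search over groupings to find one with a high score; this sketch only evaluates groupings that are supplied by hand.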

  • 09:30

    MARTIN ZALTZ AUSTWICK [continued]: So there are spots which are close to Hyde Park that people don't really go to. They'd rather go right the way across Hyde Park. So this isn't a surprising result. It's sort of intuitive. It makes sense. You go, OK, there are more people traveling longer distances because it's a pleasant place to cycle. But it allows you to quantify that. So how much does that affect that? And the other big part which happens in London is there's a big cluster which connects Waterloo

  • 09:53

    MARTIN ZALTZ AUSTWICK [continued]: Station to the City of London. So obviously, there are people coming on the train to Waterloo Station, and there's the people that work in the City of London, and they're cycling rather than getting a taxi or getting other kinds of public transport. And what's interesting is this is weekdays and this is weekends. And you can see there's a huge, huge difference here. So the most obvious difference is if you look at the City of London, in the weekdays,

  • 10:14

    MARTIN ZALTZ AUSTWICK [continued]: it's so busy. Incredibly busy. One of the most busy parts of the city for cycling. If you look at the weekend, it's a sort of wasteland. There's just nothing happening in this part of the city at all. So one of the things you can do with a data set like this-- and the way that I do this analysis-- was you can create something called a [INAUDIBLE] model. Which is, you build some assumptions in, and you say, well, if you've got two busy bike stations, there will probably be a lot of journeys between them, just statistically.

  • 10:35

    MARTIN ZALTZ AUSTWICK [continued]: And if they're close together, there'll be even more journeys between them. And you can represent that mathematically. And you can say, that's my norm model. And you can use that norm model to represent a baseline. And then anything you get above and beyond that tells you something different. So in the example of Hyde Park, there are journeys which are happening there which are much, much bigger than expected. Because the model I've constructed doesn't explain leisure use.
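
The baseline idea above (busier stations exchange more journeys, closer stations exchange more journeys, and anything beyond that is worth a second look) can be written down in several ways. The sketch below uses a simple gravity-style formula as one possibility. The exact form of the speaker's model is not given in the transcript, so the formula, the `beta` exponent, the `baseline` helper, the station labels and all the numbers are assumptions for illustration.

```python
# Hypothetical inputs: how busy each station is, distances between pairs (km),
# and observed journey counts. None of these numbers come from the real data.
busyness = {"P1": 100, "P2": 100, "W": 200, "C": 200}
distance_km = {("P1", "P2"): 2.0, ("W", "C"): 2.0, ("P1", "W"): 4.0}
observed = {("P1", "P2"): 900, ("W", "C"): 3000, ("P1", "W"): 100}


def baseline(busyness, distance_km, observed, beta=1.0):
    """Expected journeys per pair under a gravity-style assumption.

    expected is proportional to busyness[a] * busyness[b] / distance ** beta,
    rescaled so the expected total matches the observed total.
    """
    raw = {
        (a, b): busyness[a] * busyness[b] / d ** beta
        for (a, b), d in distance_km.items()
    }
    scale = sum(observed.values()) / sum(raw.values())
    return {pair: scale * value for pair, value in raw.items()}


expected = baseline(busyness, distance_km, observed)

# A ratio above 1 means more journeys than the baseline predicts.
ratio = {pair: observed[pair] / expected[pair] for pair in observed}
for pair, value in sorted(ratio.items(), key=lambda item: -item[1]):
    print(pair, round(value, 2))
```

Pairs with a ratio well above 1 are the ones the baseline fails to explain, which is where factors outside the model, such as leisure use, would show up.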

  • 10:56

    MARTIN ZALTZ AUSTWICK [continued]: And there's more journeys going from Waterloo to the City of London, because the model I've built doesn't explain land use, and the fact that they're all offices. So depending on what network you look at, you get very, very different answers in terms of the properties of the system. In terms of what questions you want to ask. Are you interested in commuters? Are you interested in the aggregate? Are you interested in seasonal trends?

  • 11:16

    MARTIN ZALTZ AUSTWICK [continued]: Or do you want to take into account all of those factors? [MUSIC PLAYING]

Video Info

Publisher: SAGE Publications Ltd.

Publication Year: 2017

Video Type: In Practice

Methods: Data visualization, Social network analysis, Data mining

Keywords: bicycling; open source; public transport; Shared work; Software; Spatial behavior; time factors ...

Segment Info

Segment Num.: 1

Abstract

Martin Zaltz Austwick explains his project examining the use of bikeshare schemes in London. He uses open data sources and visualization software to detect patterns in how people use bikeshare transport.

Data Visualization: London's Bikeshare Scheme