  • 00:00


  • 00:09

    LUKE KEELE: Hi, I'm Luke Keele. [Dr. Luke Keele, Professor, University of Pennsylvania] I'm a professor at the University of Pennsylvania. I conduct research on statistics and causal inference. So in everyday life we readily make causal inferences. We assume that x causes y. However, when we try to make such inferences with data it's actually quite difficult. For example, we want to know whether people having health insurance improves people's health.

  • 00:31

    LUKE KEELE [continued]: So let's say we let two groups of people buy insurance. Or one group decides to buy insurance, and the other group decides not to buy insurance. Later on we compare the health outcomes of those who had the insurance to those who didn't have the health insurance. And when we're done we see that both groups appear to be equally healthy. Does that mean that the additional health care brought on by having health insurance had no effect?

  • 00:52

    LUKE KEELE [continued]: The problem is, what if the group of people who bought health insurance were sicker? On average, those people bought health insurance because they expected to be sicker and then use that health insurance. If that were the case, what we'd find later on, of course, is that if the health insurance did improve their health they would be about as healthy as the healthier people who didn't buy health insurance.

  • 01:12

    LUKE KEELE [continued]: So actually what we've done is we've missed the fact that the people who bought health insurance looked different than the people who didn't buy health insurance and we're confusing the effects of the health insurance with the differences between the two groups. So in statistics we often refer to something as statistical association or statistical correlation. A statistical association refers to when there are differences in y as a function of, say, d, a treatment.
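The health insurance example above can be made concrete with a small simulation. This is only an illustrative sketch, not anything from the talk's data: the health scale, the selection rule (people with lower baseline health are more likely to buy insurance), and the size of the true effect are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def naive_insurance_gap(n=20000, true_effect=1.0, seed=0):
    """Return mean(health | insured) - mean(health | uninsured) under self-selection."""
    rng = random.Random(seed)
    insured, uninsured = [], []
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)  # health before insurance (assumed scale)
        # Assumed selection rule: people who expect to be sicker tend to buy insurance.
        buys = baseline + rng.gauss(0.0, 1.0) < 0.0
        # Insurance improves health by true_effect for those who have it.
        health = baseline + (true_effect if buys else 0.0)
        (insured if buys else uninsured).append(health)
    return mean(insured) - mean(uninsured)


TRUE_EFFECT = 1.0
naive_gap = naive_insurance_gap(true_effect=TRUE_EFFECT)
print(f"true effect: {TRUE_EFFECT:.2f}, naive difference in means: {naive_gap:.2f}")
```

Under these assumptions the two groups look roughly equally healthy afterward even though insurance helps everyone who has it, because the naive comparison mixes the effect of insurance with the pre-existing difference between the groups.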

  • 01:34

    LUKE KEELE [continued]: In this case, when we're looking for a statistical association between health and insurance we're trying to see if health outcomes differ across insurance status. So we'd be looking, of course, at the average differences in health across levels of d. So how are association and causation related? Well, if it's the case that d is a true cause of y, we will find that they are statistically associated,

  • 01:55

    LUKE KEELE [continued]: or correlated, in our data. The bigger problem, however, is that d and y can be unrelated to each other but we still find that they're correlated. That is, in our data they appear to be correlated or uncorrelated when in fact d is an actual cause of y, or may not be a cause of y. The key difficulty is how do we tell whether our associations are actually causal effects or just mere associations that

  • 02:17

    LUKE KEELE [continued]: aren't real? One way we can do that is if we form the groups in a particular way. What happens in this case if we assign our treatment randomly? That is, before we give people health insurance we form two random groups. That is, what we're going to do is we're going to take a group of people and divide them into two groups by flipping a coin. When we flip a coin and heads comes up we'll put those people in one group.

  • 02:39

    LUKE KEELE [continued]: And then the people who come up tails, we'll put them in another group. What will happen then is these two groups will on average look very similar to each other, right? Think about this in terms of gender. If we're using a coin to split people into the two groups we'll find that on average the proportion of females should be very similar in one or the other. What happens is by forming groups using coin flips we make the group

  • 03:00

    LUKE KEELE [continued]: characteristics independent of the group membership. So the two groups on average look very similar to each other. When we then give one a treatment and the other not a treatment we know that any difference in the outcomes is a result of the fact that one group got the treatment, and not because of the way the two groups were formed. One group isn't on average healthier than the other, or less healthy.
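The coin-flip argument above can be checked with a short simulation. This is a sketch under assumed values: baseline health is drawn from a standard normal distribution and the treatment adds a fixed, hypothetical effect of 1.0. Neither number comes from the talk.

```python
import random
from statistics import mean


def coin_flip_experiment(n=20000, true_effect=1.0, seed=1):
    """Assign treatment by coin flip; return (baseline balance, estimated effect)."""
    rng = random.Random(seed)
    baselines = {True: [], False: []}
    outcomes = {True: [], False: []}
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)  # pre-treatment health (assumed scale)
        heads = rng.random() < 0.5      # the coin flip ignores baseline health
        baselines[heads].append(baseline)
        outcomes[heads].append(baseline + (true_effect if heads else 0.0))
    balance = mean(baselines[True]) - mean(baselines[False])
    estimate = mean(outcomes[True]) - mean(outcomes[False])
    return balance, estimate


balance, estimate = coin_flip_experiment()
print(f"baseline difference between groups: {balance:.3f}")
print(f"estimated treatment effect: {estimate:.3f}")
```

Because the coin flip is independent of baseline health, the two groups start out nearly identical on average, and the simple difference in outcomes lands close to the assumed true effect.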

  • 03:21

    LUKE KEELE [continued]: One of the classic examples of this was the early studies in the '50s and '60s on smoking. There was a good chance that smoking caused lung cancer but the risk was so great that it was felt unethical to assign people to a treatment that could be harmful. There's also a principle of equipoise: if you've got two treatments and there's a good chance that one is more beneficial than the other, it's considered unethical to withhold that

  • 03:42

    LUKE KEELE [continued]: from a control group. So for example, the ICU, we generally don't randomly assign people to the ICU or not because those two treatments are considered to be out of equipoise. It's unethical to give one person ICU care and the other not. So there are many instances in life where we find we're unable to do random assignment. What do we do then? So what we then do is we look for something called a natural experiment.

  • 04:02

    LUKE KEELE [continued]: A natural experiment is a term where we look for situations in everyday life, in the natural world, where people are assigned to treatments in some sort of haphazard, or what we hope to be as if random, fashion. This isn't truly random. People think of random as something that's just completely by chance. When we form groups with coin flips that's random in a very precise mathematical way.

  • 04:23

    LUKE KEELE [continued]: Here we're looking for something that mimics something like a coin flip. Where by chance we get one group of people who get one treatment and the other has that treatment withheld so that it breaks up this issue of self-selection. What typically happens is when people are allowed to choose their own treatments they choose those treatments based on how they think the outcomes are going to come out. And the groups end up, just like in the health insurance case, being formed because the healthy people are selecting

  • 04:45

    LUKE KEELE [continued]: one treatment and the unhealthy people are selecting a different treatment. So what we need is some reason to believe the two groups are identical. The most famous example perhaps of a natural experiment is John Snow, who was trying to determine whether it was water or airborne related diseases that caused cholera. That is, did you get cholera from drinking dirty water

  • 05:05

    LUKE KEELE [continued]: or did you get cholera from being around someone else who had cholera who then breathed into the air? So he surmised there were two water companies in London, one that pulled its water from a source that was considered quite polluted, and another that was much farther upstream and was considered to have quite clean water. He found that the pipes from those two water companies tended to mingle very much in London such

  • 05:26

    LUKE KEELE [continued]: that it was sort of as if random which pump you happened to live near. That is, people pumped their water out of pumps in the street. And he went and he mapped which pumps drew from the contaminated water and which from the uncontaminated water. This map here records the number of deaths from cholera that he recorded. That big dot obscures a pump that was from the dirty water

  • 05:47

    LUKE KEELE [continued]: source. Around the edges are several clean water sources. And you can see, it's essentially as if random whether you happened to live near a dirty water pump or a clean water pump. And of course, the deaths were very clustered around the dirty water pump. Which is again, essentially, an as if random natural experiment, in that some people happened to live near one source of water while other people lived near the clean source of water.

  • 06:08

    LUKE KEELE [continued]: Another interesting example of a natural experiment is after 9/11. People were trying to understand what could be the effect of the threat of terrorism on real estate prices. So in this particular study here, this map is a map of Chicago where they assume that there are three trophy buildings like the Sears Tower in Chicago that would be likely targets for an attack.

  • 06:28

    LUKE KEELE [continued]: So they mapped an area around those buildings that could be affected by a terrorist attack. And they looked at the price of real estate within buildings that happened to be near the trophy buildings versus buildings that just happened to be next door, just out of the range of where a terrorist attack would have an effect. And what they found is there was actually a clear difference in rents. The places close to the trophy buildings

  • 06:49

    LUKE KEELE [continued]: had significantly lower rents than buildings that were just outside what was considered the radius of where an attack would occur. Another key area where, again, we couldn't do random assignment is the effect of seat belts. Does wearing a seat belt prevent a fatality? It would be generally unethical to randomly assign one group of people to wear seat belts and another not. However, again, we get to selection. You can very much imagine if you compare one car

  • 07:10

    LUKE KEELE [continued]: crash to another car crash where one person had a seat belt and the other didn't, the person who doesn't wear seat belts may also drive more recklessly and engage in lots of other risky behaviors versus the person who wears a seat belt. What can we do? So in this particular study, they very cleverly decided to only look at car crashes where there were two people in the car and one wore a seat belt and one didn't.

  • 07:31

    LUKE KEELE [continued]: The idea being it was probably sort of a coin flip, that is, roughly random, that of those two people in that same car crash, one decided to wear their seat belt that day and the other didn't. There's another interesting study of the effect of media. Of course in the Cold War era people in East Germany didn't have access to Western media or Western TV

  • 07:51

    LUKE KEELE [continued]: reception. In this particular study, they mapped the areas. There were some parts of East Germany where you could actually get West German TV. So the white areas around West Berlin and near West Germany, those were areas where people could pick it up, just because of the way TV aerial reception worked at the time, whereas the darker areas were areas that were too far from any sort of Western TV

  • 08:14

    LUKE KEELE [continued]: service. And people in those areas generally were not exposed to any West German media. What they did then is collected data on people's political attitudes and looked at areas near each other. That is, did people who were exposed to West German TV, was it the case that they had much more liberal political attitudes? That were more receptive to the West versus those

  • 08:34

    LUKE KEELE [continued]: who just by chance happened to live in areas where the TV signals didn't quite reach and as such they weren't exposed to Western media. Did they then have different political attitudes than those others? In particular, the concept of natural experiments has become so widespread that we've started to classify them. One particular type of natural experiment that has started to appear is something known as a regression

  • 08:56

    LUKE KEELE [continued]: discontinuity design. It's an old idea that was sort of forgotten for a long time but has become very popular of late. Here what we have is we have a treatment that's going to be assigned as a function of some continuous score. And there's going to be a threshold on that continuous score such that people above that cutoff will be exposed to the treatment and people below the cutoff won't be exposed to treatment, or vice versa.

  • 09:18

    LUKE KEELE [continued]: So the idea actually dates back to research in education by Thistlethwaite and Campbell where they were actually interested in the effect of National Merit scholarships on how well people did in college. Of course we have our classic selection bias in the sense that people who get National Merit scholarships, which are given based on how well you do on a test, those people are probably going to generally do better in college.

  • 09:40

    LUKE KEELE [continued]: So they had the idea of looking-- that is, in the PSAT, which is how the National Merit scholarships were given out, there is a threshold. If you score above that threshold you get a National Merit scholarship. If you're below that threshold, you don't get the scholarship. So here we have our score. Our score is your PSAT score. We have a cutoff, the place at which you have to score above

  • 10:02

    LUKE KEELE [continued]: which you get the scholarship, below which you don't. And then we have, we're interested in sort of college grades. So how is it helpful? How is it helpful for us to actually observe the fact that we have this cutoff? Because we know our x here, our continuous score, is going to be correlated with the outcome, which is college grades. That is, people with really high PSAT scores probably

  • 10:22

    LUKE KEELE [continued]: had higher grades on average. That is, as usual we can't simply compare the treatment to the control, the National Merit scholars to the non-National Merit scholars. But the key insight that they noticed is the students near the threshold are probably pretty similar. Let's say the threshold on the PSAT was a 1,200. You might very well imagine that the student who happened to score 1,201 that day looks

  • 10:43

    LUKE KEELE [continued]: very much like the student who scored 1,199 that day. That is, we can think of it as sort of an as if random process that sorts people around that threshold. While it's very non-random who's at the top end and the bottom end we can imagine that in the middle there's something like a natural experiment where the people who are just above the threshold, they

  • 11:05

    LUKE KEELE [continued]: did well that day because they managed to get a little extra sleep, they managed to eat breakfast. While the people just below that threshold for some random reason just did a little worse that day. Such that we have a comparable group around that threshold where we can think of the National Merit scholarship as essentially as if randomly assigned. One key advantage to regression discontinuity design is when you start to look around you

  • 11:25

    LUKE KEELE [continued]: find that often many types of treatments are formed or assigned in this way. That is, it's often the case that for scarcity reasons we have continuous scores and people above the threshold are given some program or some benefit. One very classic famous example of a regression discontinuity design actually comes from Israel where they're trying to study the effect of class sizes

  • 11:45

    LUKE KEELE [continued]: on test scores. It happens to be the case that in Israel they follow something called Maimonides' rule. Which says that the size of the cohort of students, the number of students coming into a school, determines how large the classes are. That is, specifically if there are 40 students in a cohort they are given one class. However if there are 41 or 42 students

  • 12:06

    LUKE KEELE [continued]: they divide that up into two classes.So you can see here in this graphright along the bottom we have cohort size,and right around 40 the class sizes are much larger.However, if you just happen to be in a cohort, whichis 41 or 42 you're in a class that's essentially roughly 50%smaller.The y-axis here is test scores, theseare reading scores in particular.

  • 12:27

    LUKE KEELE [continued]: And you can see a very clear-- if you look at the students who are near that threshold-- we see a very clear effect. That is, the reading test scores for the students just at 40 who had the larger classes tend to be quite a bit lower. However, we also see that for the students just above that threshold, who happened to be in the classes of 20 or 21, their reading test scores are quite a bit higher.

  • 12:49

    LUKE KEELE [continued]: So this is a case where we can think something was as if randomly, or naturally, assigned. For those students around the threshold, if you look, you find that their characteristics are quite comparable, because it's roughly by chance that you ended up in a cohort of about 40 students versus a cohort of 41 or 42. So these are just some examples of regression discontinuity designs and the different types of natural experiments.

  • 13:10

    LUKE KEELE [continued]: The field has changed very much across economics, the social sciences, biomedical sciences. There's now a lot of effort in identifying these natural experiments, and coming up with new statistical methods to analyze them. Because we understand that while randomized experiments are often referred to as a gold standard, natural experiments are often our second best bet.

  • 13:30

    LUKE KEELE [continued]: So big data is actually often quite helpful in the sense that for natural experiments we're looking actually for small quirky comparisons. And for those comparisons to be good we often need lots of data. So take the regression discontinuity design, for example, by definition there we're not going to be using all of our data. We're only going to be comparing those people who look comparable around the threshold.

  • 13:51

    LUKE KEELE [continued]: The problem is if you start out with not very much data you often don't end up with enough data to actually estimate a treatment effect. Big data is often very helpful in the sense that if we have very large amounts of data that gives us enough data that we can actually do a feasible type of analysis around that threshold. Big data is often not that helpful in identifying the natural experiments.
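The point about needing lots of data near the threshold can be illustrated with a simulated regression discontinuity. Everything here is an assumption made for illustration: the score range, the cutoff of 1,200 (the hypothetical value used earlier in the talk), the grade model, and the bandwidth of 20 points. The estimator is a plain difference in means inside the window, which is a simplification; practical analyses typically fit a regression on each side of the cutoff.

```python
import random
from statistics import mean


def rd_local_difference(n, bandwidth, cutoff=1200.0, true_effect=0.3, seed=2):
    """Return (observations used, difference in mean outcome) near the cutoff."""
    rng = random.Random(seed)
    above, below = [], []
    for _ in range(n):
        score = rng.uniform(600.0, 1800.0)  # assumed score distribution
        treated = score >= cutoff           # sharp rule: at or above the cutoff gets treated
        # Assumed outcome model: grades rise with score, plus a jump for the treated.
        grade = 2.5 + 0.001 * (score - cutoff) + (true_effect if treated else 0.0)
        grade += rng.gauss(0.0, 0.5)
        # Keep only observations close to the threshold.
        if abs(score - cutoff) <= bandwidth:
            (above if treated else below).append(grade)
    n_used = len(above) + len(below)
    if not above or not below:
        return n_used, None  # too little data near the cutoff to compare
    return n_used, mean(above) - mean(below)


small = rd_local_difference(n=500, bandwidth=20.0)
large = rd_local_difference(n=200000, bandwidth=20.0)
print("n = 500:    ", small)
print("n = 200000: ", large)
```

With a few hundred observations only a handful fall inside the window, so the estimate is very noisy. With a few hundred thousand, thousands of observations remain near the cutoff and the local comparison becomes feasible.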

  • 14:12

    LUKE KEELE [continued]: So there's a lot of movement where big data is associated with things like machine learning. Machine learning is often quite useful, but often only when combined with clear insight. One of the things about natural experiments is they often require a lot of qualitative information to find them. So it requires clear human skills

  • 14:32

    LUKE KEELE [continued]: that are then combined with the machine learning methods to then analyze the data that comes from a natural experiment. I would say that a natural experiment is a set of natural circumstances that produces something that we think is haphazard treatment assignment. Where for no particular good reason some people get the treatment but other people are denied the treatment. The key is to know what they are so you

  • 14:53

    LUKE KEELE [continued]: can know to identify them. If you've never heard of a regression discontinuity design it's easy to walk by them every day. If you know there are three or four sort of major types of natural experiments, if you know where to look, they tend to pop up more often than you think. [MUSIC PLAYING]


Dr. Luke Keele, PhD, Professor at the University of Pennsylvania, discusses how natural experiments, such as regression discontinuity, are an established alternative to randomized assignment in research, including examples of natural experiments and how big data and machine learning are being used.


An Introduction to Natural Experiments in Data Science

