## Summary

• 00:03

ALISON GIBBS: When carrying out a study, if we observe an interesting relationship between two variables and we've concluded that it doesn't seem to be just due to chance, we might want to ask the question, does changing one of the variables cause the other variable to change? In this video, we'll begin to investigate whether or not we can answer that question.

• 00:25

ALISON GIBBS [continued]: Here's a headline that concerns those of us who love eggs. Can they really be as great of a cause of heart disease as smoking is? Or maybe it's possible that people who eat a lot of eggs also tend to eat a lot of fatty foods, and so there are other dietary factors that contribute to heart disease rather than the eggs.

• 00:46

ALISON GIBBS [continued]: Another recent headline reported on the negative effect on IQ of teenagers smoking marijuana. But maybe it says more about who smokes marijuana than the actual effects of marijuana. For example, socioeconomic status has been blamed as a potential alternate factor contributing to the drop in IQ.

• 01:07

ALISON GIBBS [continued]: If we want to be able to make causal conclusions about one variable causing another, we need to control how the data were collected. Let's start with a general paradigm for a study. For simplicity, we'll assume that we're interested in comparing just two groups, and these two groups are in two different situations.

• 01:29

ALISON GIBBS [continued]: There is a particular variable, called the response variable or outcome, that we're interested in comparing between the groups. This could be cardiovascular disease risk, depending on whether or not one eats egg yolks, or IQ in pot smokers and non-pot smokers, or error in age estimation between males and females

• 01:49

ALISON GIBBS [continued]: or life expectancy across two geographic regions or infection rates for a vaccinated versus a non-vaccinated group. An explanatory variable is any variable that might explain differences that we observe in the response between the groups. An important concern when planning the study and analyzing the resulting data is the possibility

• 02:11

ALISON GIBBS [continued]: of confounding variables. Confounding variables affect the response and have different values in the different groups we're comparing. So it is impossible to tell if differences between the groups are due to the group composition or due to the confounding variables. To avoid having confounding, again, we

• 02:31

ALISON GIBBS [continued]: need to collect the data carefully. Here are some ways we could collect some data. We often hear anecdotes about how a change in diet or environment has resulted in miraculous improvements to an acquaintance's health. For example, a close family member of mine eliminated gluten from his diet, and subsequently he

• 02:52

ALISON GIBBS [continued]: lost weight. So should all of us looking to lose a few pounds eliminate gluten? Not based on the strength of the evidence provided by this anecdote. It's a story about only one person, and I can tell you that this person happens to drink a lot of coffee. And he's had to avoid all the pastries in the coffee shop since giving up gluten.

• 03:14

ALISON GIBBS [continued]: So we can't make any general conclusions based on an anecdote. Two other and better methods of data collection are observational studies and experiments. The advantage of an experiment over an observational study is the strength of the causal conclusions we can make. I've ordered these methods of data collection

• 03:34

ALISON GIBBS [continued]: from worst to best in terms of the strength of the conclusions. We'll talk about experiments in more detail in another video and focus on observational studies for the rest of this one. In observational studies, as the name implies, the data are measurements of existing characteristics of a group or groups of individuals.

• 03:56

ALISON GIBBS [continued]: The goal is typically to make conclusions about a population based on the sample of individuals we've observed or to compare a measurement on individuals between different groups. Because existing characteristics are being observed, the investigator has no control over which group an individual belongs to. In contrast, in an experiment the investigator

• 04:19

ALISON GIBBS [continued]: imposes some kind of intervention on the individuals being studied, perhaps assigning some to one group and some to another. Experiments are the gold standard for making causal conclusions. But for now, we'll focus more on potential issues with observational studies. Let's look at an example of an observational study and discuss why we need to be careful when stating

• 04:41

ALISON GIBBS [continued]: the conclusions of the study. The study we'll consider was published in The New England Journal of Medicine in May 2012. It was about the relationship between coffee drinking and mortality. Headlines in The New York Times and The Toronto Star reported that coffee may help us live longer. The Washington Post, in contrast, was a little more reserved.

• 05:03

ALISON GIBBS [continued]: So should we be pausing now for a coffee break? Here are a few details about the study. In 1995, the researchers collected data on over 400,000 men and women aged 50 to 71 and then followed them until 2008. By then, over 52,000 of them had died. They asked the question, did the amount of coffee

• 05:24

ALISON GIBBS [continued]: these people drink affect how much longer they lived? The study is an example of an observational study. Subjects weren't told how much coffee to drink. They just did what they would do otherwise. In all observational studies, we have to be careful about interpreting our observed associations. Here our explanatory variable is the amount of coffee drunk

• 05:47

ALISON GIBBS [continued]: and the outcome is how much longer the subjects lived. Suppose the coffee drinkers tended to live longer, so there's a positive association. It may be tempting to say coffee causes long life, but there are other possible reasons why an association between coffee and longevity may have been observed. In observational studies, there are at least five mechanisms

• 06:10

ALISON GIBBS [continued]: that could result in an observed association between the outcome and the explanatory variable. What we'd usually like to show is that changes in the explanatory variable caused the outcome to change. However, sometimes the outcome causes the explanatory variable to change. This is known as reverse causation. In our coffee study, maybe people

• 06:31

ALISON GIBBS [continued]: with chronic health conditions, who are more likely to die, tend to purposely avoid coffee. It's also possible that the relationship is just coincidence. Another possibility is that both the outcome and the explanatory variable result from a common cause. Then the observed association is not

• 06:52

ALISON GIBBS [continued]: because the explanatory variable causes the outcome to change but because the common cause causes both to change. Whether a subject had diabetes is a possible common cause for the coffee study. Perhaps because of dietary restrictions, diabetics drink less coffee, and diabetics also tend to have shorter lives.

• 07:14

ALISON GIBBS [continued]: And finally, there could be confounding variables. Confounding variables vary with the explanatory variable. So we don't know if we can attribute changes in the outcome to the explanatory variable or to the confounding variable. One of the many possible confounders in the coffee study was smoking. Smokers tend to drink more coffee, so smoking status is associated with our explanatory variable.

• 07:36

ALISON GIBBS [continued]: And smoking also has a causal effect on our outcome, length of life. We can't fully understand the effect of coffee on length of life unless we adjust for smoking. Imagine if a study showed that coffee drinking was associated with shorter lives. Could we blame it on the coffee or the associated smoking? In observational studies, even when we observe an association,

• 08:00

ALISON GIBBS [continued]: we can't conclude that the explanatory variable has a causal relationship with the outcome. There are at least five mechanisms that can result in the association. As we saw in a previous video in the example with Simpson's paradox, there were also variables that can be lurking, sometimes also called hidden variables.

• 08:21

ALISON GIBBS [continued]: A lurking variable is a variable that is not accounted for in the analysis, but it affects the association being considered. In the previous video, we saw the effect of age, which was associated with mortality, our response, and also with smoking, our explanatory variable. And when age was considered in the analysis, it reversed the nature of the association

• 08:43

ALISON GIBBS [continued]: between smoking and mortality. Lurking variables can be confounding variables, as in the smoking, age, and death example. They can also be the source of a common response or just another variable that, if we had considered it, would change the nature of the relationship we're looking at.
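
As an illustration of how considering a lurking variable can reverse an association, the sketch below uses invented counts, not the data from the earlier video. In these hypothetical numbers, smokers have the lower death rate overall but the higher death rate within each age group. The age groups, the counts, and the direction of the reversal are all assumptions made for illustration.

```python
# Hypothetical counts, invented for illustration only:
# (age_group, is_smoker) -> (deaths, total people)
counts = {
    ("younger", True): (20, 200),
    ("younger", False): (5, 100),
    ("older", True): (25, 50),
    ("older", False): (80, 200),
}


def death_rate(smoker, age_group=None):
    """Death rate for smokers or non-smokers, optionally within one age group."""
    deaths = total = 0
    for (group, is_smoker), (d, t) in counts.items():
        if is_smoker == smoker and age_group in (None, group):
            deaths += d
            total += t
    return deaths / total


# Ignoring age: smokers appear to do better than non-smokers.
print("overall:", death_rate(True), "vs", death_rate(False))

# Within each age group: smokers do worse than non-smokers.
for group in ("younger", "older"):
    print(group + ":", death_rate(True, group), "vs", death_rate(False, group))
```

In this made-up table, most smokers are in the younger group, where few people die, so age is tied to both smoking and mortality. That imbalance is what lets the overall comparison point the opposite way from the comparison within each age group.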

• 09:03

ALISON GIBBS [continued]: In the coffee study, method of preparation is a possible lurking variable. We can't say if differences in the composition of espresso versus filtered coffee affect the protective effects of coffee and thus would influence the relationship between coffee drinking and length of life. Controlling for confounding variables is a difficult part of the analysis

• 09:24

ALISON GIBBS [continued]: of any statistical study. In another video, we'll talk about the use of experiments to mitigate the possibility of confounders.
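
The confounding scenario described for the coffee study can be sketched with a small simulation. Everything in it is an assumption made for illustration: the proportion of smokers, the chance of drinking coffee, and the years of life are invented, and it does not use or reproduce data from the published study. In this simulated world coffee has no effect at all on length of life. Smoking makes people more likely to drink coffee and also shortens their lives.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Return a list of (smoker, coffee_drinker, remaining_years) tuples."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smoker = rng.random() < 0.3
        # Assumption: smokers are more likely to drink coffee.
        coffee = rng.random() < (0.7 if smoker else 0.3)
        # Assumption: smoking shortens life; coffee does not enter here at all.
        years = rng.gauss(12 if smoker else 20, 5)
        rows.append((smoker, coffee, years))
    return rows


def mean_difference(rows):
    """Mean years for coffee drinkers minus mean years for non-drinkers."""
    drinkers = [years for _, coffee, years in rows if coffee]
    others = [years for _, coffee, years in rows if not coffee]
    return mean(drinkers) - mean(others)


rows = simulate()

# Comparing all coffee drinkers with all non-drinkers, ignoring smoking.
naive = mean_difference(rows)

# Adjusting for smoking by comparing within smokers and within non-smokers.
within_smokers = mean_difference([r for r in rows if r[0]])
within_nonsmokers = mean_difference([r for r in rows if not r[0]])

print("ignoring smoking:", round(naive, 2))
print("smokers only:", round(within_smokers, 2))
print("non-smokers only:", round(within_nonsmokers, 2))
```

Ignoring smoking, coffee drinkers in this simulation live noticeably shorter lives on average, even though coffee was given no effect. Within smokers and within non-smokers the difference is close to zero. Stratifying like this is only one simple way to adjust for a confounder, and it works here only because the confounder was measured.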

### Video Info

Series Name: Understanding Data

Publisher: Alison Gibbs and Jeffrey Rosenthal

Publication Year: 2013

Video Type: Tutorial

Methods: Structured observation

Keywords: heart disease and diet

### Segment Info

Segment Num.: 1

Persons Discussed:

Events Discussed:

Keywords:

## Abstract

Alison Gibbs discusses data collection and drawing conclusions in observational studies.