- 00:00
[Goals of Statistical Testing II]

- 00:00
DANIEL LITTLE: In this video, I want to discuss the goals of statistical testing when you have more than a single data point. [More than one data point: A sample of data] With more than one data point, you can imagine that what we want to know is whether the entire sample of data was generated from some hypothesized population distribution. So you can see in this example that we've

- 00:22
DANIEL LITTLE [continued]: collected a number of observations--here, each of these represented by a small circle--from our hypothesized population distribution, represented by this blue curve. But it's important to remember that we don't actually have access to that population distribution. All we have is our sample of data.

- 00:45
DANIEL LITTLE [continued]: With a sample of data, what we want to know is whether the entire sample came from the population distribution of interest. For instance, we might run an experiment and ask whether our observed sample is significantly different from a normal distribution with a mean of 0. For this test, what we would do is conduct a one-sample t-test.
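As a rough illustration of the one-sample case described above, the sketch below computes a one-sample t statistic against a hypothesized mean of 0 and then approximates the two-sided p-value by simulating many samples from the null distribution. The data values, the function names, and the use of simulation in place of a t-distribution table are assumptions made for this example; they are not taken from the video.

```python
import math
import random


def t_statistic(sample, mu0=0.0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the n - 1 denominator.
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)


def simulated_p_value(sample, mu0=0.0, n_sims=20000, seed=1):
    """Approximate the two-sided p-value by simulating the null hypothesis.

    Samples of the same size are drawn from a standard normal distribution
    (the t statistic does not depend on the scale of the null distribution),
    and we count how often the simulated |t| is at least as large as the
    observed |t|.
    """
    rng = random.Random(seed)
    n = len(sample)
    t_obs = abs(t_statistic(sample, mu0))
    hits = 0
    for _ in range(n_sims):
        null_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(null_sample, 0.0)) >= t_obs:
            hits += 1
    return hits / n_sims


# Hypothetical observations (illustrative only, not from the video).
sample = [0.8, 1.9, -0.3, 1.2, 2.4, 0.5, 1.1, 1.7]
t_obs = t_statistic(sample)
p_approx = simulated_p_value(sample)
print(f"t = {t_obs:.2f}, approximate p = {p_approx:.4f}")
```

In practice a statistics package would look up the p-value from the t distribution directly; the simulation is used here only because it makes the idea of "data generated under the null hypothesis" explicit.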

- 01:06
DANIEL LITTLE [continued]: [More than one sample] We can also deal with cases in which we have more than one sample. For instance, thinking back to the class arrival time example in part 2 of this video series, psychology students are usually arts or science students. The science students might have some lecture on the other side of campus. So all of the science students have a little bit further

- 01:28
DANIEL LITTLE [continued]: to travel than the art students. That extra travel time adds a little bit of time to their travel on top of all of those other factors, like stopping for coffee and dropping your books, to reach the lecture theater. The key thing is that for the science students, it adds roughly the same time to all of the students who have to travel

- 01:49
DANIEL LITTLE [continued]: from the other side of campus. Fundamentally, what we're doing with statistics and what we're interested in is checking whether or not we can tell science and art students apart based on their arrival times alone without asking what their actual enrollment is. If it were the case that many science students take a course on the other side of campus, then this

- 02:11
DANIEL LITTLE [continued]: provides systematic variability in our data. It's important to note, though, that we can't say anything about a single individual here. For instance, there might be cases where an art student who lives in East Brunswick and can easily ride their fixie to school wakes up late, refuses the cafe latte with the burnt milk and must wait for another one, gets

- 02:32
DANIEL LITTLE [continued]: their scarf caught on a fence while riding past the train station. And in that case, an art student would arrive to class just as late as a science student would. For instance, you might have an art student who typically arrives rather early, but due to all of those causal events actually arrives somewhere down here.

- 02:53
DANIEL LITTLE [continued]: The science students, who have to travel further, might be typically placed all around this area here--these are arrival times. But if we pull out any one of those students, it's very difficult to tell whether that student is an arts or a science student. But by looking at the entire sample of data,

- 03:14
DANIEL LITTLE [continued]: we might be able to make that distinction. What matters is that, even though looking at arrival times doesn't provide a clear cut-off to decide between arts and science students, it does show that systematic variability in the data can be picked up. So the question that we ask is whether our two samples were generated by one distribution

- 03:37
DANIEL LITTLE [continued]: or by two distributions. So you might imagine what this looks like is you have one hypothesized population distribution that looks like that. Alternatively, you might have two population distributions, one for science students and one for art students. [More than one sample: Two or more samples] With multiple samples of data, what we want to know

- 03:59
DANIEL LITTLE [continued]: is whether both samples came from the same population distribution of interest. For instance, we might run an experiment and ask whether our observed samples are significantly different from each other. For this particular test, we might conduct an independent or paired samples t-test. Or, with more than two groups, we might conduct an ANOVA--an Analysis of Variance test.
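To make the two tests named above a little more concrete, here is a minimal sketch of the test statistics only: an independent-samples t statistic with pooled variance, and a one-way ANOVA F statistic. The arrival times are hypothetical (minutes relative to the lecture start), and the function names are invented for this example. Converting either statistic into a p-value requires the t or F distribution, which is left out here.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq(xs):
    """Sum of squared deviations from the group's own mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance, plus its df."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq(a) + sum_sq(b)) / df
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se, df


def one_way_f(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum_sq(g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical arrival times in minutes relative to the lecture start.
arts = [-4, -2, 0, 1, -3, -1]
science = [2, 4, 1, 5, 3, 0]

t, t_df = pooled_t(arts, science)
f, f_df_between, f_df_within = one_way_f([arts, science])
print(f"t({t_df}) = {t:.3f}")
print(f"F({f_df_between}, {f_df_within}) = {f:.3f}")
```

With exactly two groups, the F statistic equals the square of the pooled t statistic, which is one way to see that ANOVA generalizes the independent-samples t-test to more than two groups.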

- 04:20
DANIEL LITTLE [continued]: [What the p value means: Part II] The statistical tests provide us with a way of getting at the p-value so we can make an inference. Consider the case with multiple groups. If our experimental manipulations have no effect, then all of the variation that we observe is variation just due to chance, just due

- 04:40
DANIEL LITTLE [continued]: to those random factors. And our groups will not be different enough to say that they are truly different. If our experimental manipulation, however, does have an effect, then it will push the groups apart, more than just by chance alone. What we want to know is whether our groups have been pushed apart by our experimental manipulation.

- 05:02
DANIEL LITTLE [continued]: So with more than one group, we have two possibilities. Our experimental groups, our samples, come from the same underlying population, or our experimental groups might come from different populations. Whenever our samples come from the same population, any difference in our observed samples--

- 05:23
DANIEL LITTLE [continued]: so the difference between this group here and this group here--is a difference just due to chance alone. On the other hand, if each of these groups came from its own individual population, then the differences between the samples are due not only to chance variation

- 05:45
DANIEL LITTLE [continued]: alone, but also due to some systematic effect--for instance, that science students have to walk further than art students to get to their psychology class. Thinking about this idea that the samples were generated from the same underlying population distribution

- 06:09
DANIEL LITTLE [continued]: allows us to introduce the concept of the null hypothesis. Our null hypothesis is that, in fact, there is no difference between our groups that is due to any kind of systematic effect. The difference we observe between our two samples is just due to chance alone, just due to these random factors. The p-value tells us the probability

- 06:30
DANIEL LITTLE [continued]: of observing differences at least as large as those that we actually observe between our samples--and this is our data that we've collected--given that the null hypothesis is true. When we find that we have a p-value less than 0.05, what this means is that if the null hypothesis is true, then we have less than a 5% chance

- 06:52
DANIEL LITTLE [continued]: of observing a difference at least as large as the one that we actually found between our samples. [Statistical Inference] To sum up, in these three videos, we have looked briefly at three situations. We have looked at inferring something about a single observation of data. We have looked at inferring something about an entire sample of data.

- 07:14
DANIEL LITTLE [continued]: And we have looked at inferring whether two samples come from the same or different populations. All of these examples utilize the same underlying mechanism for statistical inference--find the probability of the set of observations representing the data under the distribution which represents the null hypothesis. The p-value gives us the probability of our data,

- 07:36
DANIEL LITTLE [continued]: assuming that our null hypothesis is true. We can then apply a cut-off of 0.05 in order to make a decision about our sample. Every statistical test is designed to go through each of these steps, albeit for different types of data. It is important to note that each of these tests has different assumptions. Consequently, it is critical that we first characterize

- 07:58
DANIEL LITTLE [continued]: the distribution of data. Usually we want to know that that distribution comes from a normal distribution. Having characterized our distribution of data and shown that we meet the underlying assumptions of the statistical test, we can then use that statistical test to tell us all sorts of fundamentally useful things about our data.

- 08:19
DANIEL LITTLE [continued]: If, on the other hand, we find that we have violations of the specific test assumptions, then we're prevented from characterizing the underlying distribution appropriately. And this prevents us from accurately computing the p-value. Each of the tutorial videos in this series will address some component of this particular statistical inference process.
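The inference mechanism summarized in the transcript (assume the null hypothesis, find how probable the observed difference is under it, then apply the 0.05 cut-off) can be sketched with a permutation test. This is a swapped-in technique chosen because it needs no distributional formula; the video itself names t-tests and ANOVA. Under the null hypothesis that both groups come from one population, the group labels are interchangeable, so shuffling them shows how large a difference arises by chance alone. The data and function names are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(a, b, n_perms=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Counts how often a random relabeling of the pooled data gives an
    absolute mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        # Small tolerance so ties with the observed value are counted.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perms + 1)


# Hypothetical arrival times in minutes relative to the lecture start.
arts = [-4, -2, 0, 1, -3, -1]
science = [2, 4, 1, 5, 3, 0]

p = permutation_p_value(arts, science)
decision = "reject" if p < 0.05 else "fail to reject"
print(f"p = {p:.4f}: {decision} the null hypothesis at the 0.05 cut-off")
```

Note that the p-value counts differences at least as large as the observed one, not only differences exactly equal to it.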

### Video Info

**Series Name:** Statistics for Psychology

**Episode:** 7

**Publisher:** University of Melbourne

**Publication Year:** 2014

**Video Type:** Tutorial

**Methods:** Hypothesis testing, T-test, Analysis of variance, P-value

**Keywords:** attendance patterns; mathematical concepts

### Segment Info

**Segment Num.:** 1


## Abstract

Chapter 7 of this series on statistics for psychology concludes the section on the goals of statistical testing. Professor Daniel Little discusses the analysis of data from distinct populations.