- 00:00
DANIEL LITTLE: In this video we will examine the statistical analysis tool called a one-way ANOVA. We will examine what a one-way ANOVA is, what it tests, and how to compute the relevant components of a one-way ANOVA using example data. Our goal in one-way ANOVA is to compare one independent variable, which varies between two

- 00:23
DANIEL LITTLE [continued]: or more independent groups. If we only had two groups we could use a t-test, but an ANOVA will give us an identical result. Where an ANOVA is really helpful is if we have more than two groups. When we have more than two independent groups this type of research design is called a between-subjects design. The question we're asking is whether any of the groups are significantly different from any of the other groups.

- 00:46
DANIEL LITTLE [continued]: We are testing whether all of the groups are generated from the same distribution. Conceptually an ANOVA compares variation between groups to variation within groups. This is why the test is called an ANOVA, and ANOVA stands for an analysis of variance.

- 01:06
DANIEL LITTLE [continued]: Within-groups variation is due to random sampling, just due to chance alone. Between-groups variation is variation due to random sampling, plus additional variation due to some experimental manipulation -- the effect that we're actually interested in. To give you a good understanding of what this test is trying

- 01:28
DANIEL LITTLE [continued]: to achieve, I'm going to present an example problem in which we have measured three different groups using an anxiety score. For instance, you can imagine that group 1 are students sitting an exam, group 2 are ambulance drivers, and group 3 are firefighters immediately after fighting a bushfire. The question we want to answer is do these three groups differ in their anxiety scores?

- 01:52
DANIEL LITTLE [continued]: The data that I'm using are simulated data, but they'll illustrate what the underlying logic of the ANOVA test actually is. In this video we will examine a data set in which there is no difference in the population means between the groups. That is, all three of our groups are going to have been generated from the same underlying population, with a population mean which is equal to 10

- 02:14
DANIEL LITTLE [continued]: for each of our three groups. We'll contrast this in the second video, with a second simulated data set in which there is a difference in the population mean. What we will see is that the ANOVA detects no difference between the groups in the first data set, but detects a significant difference between the groups in the second data set.

- 02:36
DANIEL LITTLE [continued]: The purpose of the first two parts of the series is to demonstrate how the ANOVA does this. So let's start with the simulated dataset 1. For this dataset I've generated three groups of 1,000 data points from a normal distribution with a mean of 10. So each group has the same population mean of 10,

- 02:58
DANIEL LITTLE [continued]: and the same population standard deviation of 2. As you can see in the table, each of the groups has a sample mean very close to 10. So group 1 is our students, their mean is 10.3. Group 2 is our ambulance drivers, they have a mean of 10.09.

- 03:19
DANIEL LITTLE [continued]: And group 3 is our firefighters, they have a mean of 10.02, or 10.03 if you round. The means aren't exactly 10, due to random sampling, but as you can see the differences between the groups appear to be very trivial. The standard deviations are also not quite exactly 2, but they're very close to 2. And if we look at the total, so all 3,000 points of data

- 03:42
DANIEL LITTLE [continued]: at once, and compute the mean, it's also very close to 10. And the standard deviation is also very close to 2 for our total data set. Here are some histograms showing sample distributions for each of those groups, as well as the sample distribution for the collection of all three groups -- the total distribution.
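
The simulation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the video's table: the seed, the variable names, and the use of the standard library's `random.gauss` are choices made for this sketch, so the printed means will differ slightly from the values quoted in the video.

```python
import random
import statistics

# Assumption: a fixed seed so the sketch is reproducible; the video does not give one.
rng = random.Random(0)

POP_MEAN = 10.0      # population mean stated in the video
POP_SD = 2.0         # population standard deviation stated in the video
N_PER_GROUP = 1000   # data points per group stated in the video

group_names = ["students", "ambulance drivers", "firefighters"]

# Every group is drawn from the SAME normal distribution (no built-in effect).
groups = {
    name: [rng.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_GROUP)]
    for name in group_names
}

# The "total" distribution is simply all 3,000 scores pooled together.
all_scores = [x for scores in groups.values() for x in scores]

for name, scores in groups.items():
    # pstdev divides by n, matching the video's "divide by 1,000" convention.
    print(f"{name:18s} mean={statistics.fmean(scores):.2f} "
          f"sd={statistics.pstdev(scores):.2f}")

print(f"{'total':18s} mean={statistics.fmean(all_scores):.2f} "
      f"sd={statistics.pstdev(all_scores):.2f}")
```

Each sample mean should land near 10 and each standard deviation near 2, with small differences that come from random sampling alone.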

- 04:04
DANIEL LITTLE [continued]: Just looking at each of these distributions you can see that they all appear to be pretty similar to one another. For one, they all are centered around a mean of 10, and they all have the same bell-shaped distribution, which is roughly normal. Each of our three separate groups looks a lot like our total group as well.

- 04:29
DANIEL LITTLE [continued]: If we look at the total distribution for this data set, it has a mean and standard deviation close to the population mean and standard deviation. Even though the analysis that we're using is termed the analysis of variance, we don't actually use the variance computation directly. So if we wanted to compute the variance, all we would have to do is take our standard deviation here,

- 04:51
DANIEL LITTLE [continued]: and square it to get a total variance of 3.95. However, we're not actually going to be working with the variance directly. Instead, we're going to compute a value called the sum of squares. Actually, it's the sum of squared deviations of each score from the mean score. What we need to do to compute this

- 05:12
DANIEL LITTLE [continued]: is to take the mean of our total distribution here, represented by m, subtract that off of each of our separate scores -- I've represented each of our scores as an x -- and index them all by j. So xj will go from x1, which is our first score, x2, x3,

- 05:35
DANIEL LITTLE [continued]: all the way up to xn -- which is our total sample. So we take each score. We subtract from it the mean score, we square those scores, and then we add them up. So this giant thing which looks like an E is actually a capital Greek letter sigma, which indicates that we need

- 05:57
DANIEL LITTLE [continued]: to add up all of our scores. So what we're doing is taking x1, subtracting off the mean, squaring it. Taking x2, subtracting off the mean, squaring it, and doing that for all of our sample data points.
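
The procedure just described (subtract the mean from each score, square, then add up) can be written as a short function. The function name `sum_of_squares` and the three example scores are made up for this sketch; they do not come from the video.

```python
from statistics import fmean


def sum_of_squares(scores):
    """Sum of squared deviations of each score x_j from the mean m."""
    m = fmean(scores)                           # mean of the scores passed in
    return sum((x - m) ** 2 for x in scores)    # the capital-sigma step


# Hypothetical scores: the mean is 10, so the deviations are -2, 0 and +2.
example = [8.0, 10.0, 12.0]
print(sum_of_squares(example))  # 4 + 0 + 4 = 8.0
```

Passing in all of the pooled scores gives the sum of squares total; passing in a single group's scores gives that group's sum of squares.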

- 06:17
DANIEL LITTLE [continued]: What this does is allow us to compute the sum of squares, which we refer to as the sum of squares total, because we compute that using the mean from the total distribution of scores. Which, in this case, is somewhere around 11,000 -- 11,176. We need to compute the sum of squares value

- 06:40
DANIEL LITTLE [continued]: for a number of different comparisons. One of them is the sum of squares for the total distribution. We also need to compute the sum of squares for each group separately. We do that for group 1, group 2, and group 3.

- 07:01
DANIEL LITTLE [continued]: By adding up these separate sums of squares for each group we compute a quantity known as the sum of squares within groups. So we can see that for group 1 the variance is equal to 3.95, and the sum of squares is equal to 3,950.

- 07:24
DANIEL LITTLE [continued]: For group 2 the variance is 3.65, and the sum of squares is equal to 3,648. And for group 3 the variance is 3.56, and the sum of squares is 3,556. You'll probably notice that there's a relationship here between the variance and the sum of squares. What that relationship is is that the variance is actually

- 07:46
DANIEL LITTLE [continued]: the average sum of squares that we'd predict. So if we would take the sum of squares, and divide by 1,000 -- which is the number of data points we have in each of our groups -- that would give us the variance for these cases. So having computed the sum of squares total on the previous slide, and the sum of squares within on this slide, we're then in a position in which we can compute the sum of squares between groups.
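
As a quick check of the relationship between the variance and the sum of squares, the arithmetic can be run on the group sums of squares quoted in the video. The variable names below are made up for this sketch, and the inputs are the rounded values read out in the video rather than the raw data.

```python
# Sums of squares per group, as quoted in the video (rounded slide values).
ss_by_group = {"group 1": 3950, "group 2": 3648, "group 3": 3556}
n_per_group = 1000  # data points in each group

# Variance = sum of squares divided by the number of data points.
variance_by_group = {name: ss / n_per_group for name, ss in ss_by_group.items()}
for name, variance in variance_by_group.items():
    print(f"{name}: variance = {variance:.3f}")

# Sum of squares within groups = the group sums of squares added together.
ss_within = sum(ss_by_group.values())
print(f"sum of squares within = {ss_within}")  # 3950 + 3648 + 3556 = 11154
```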

- 08:09
DANIEL LITTLE [continued]: Here we would do that by taking the sum of squares total and subtracting from it the sum of squares within, which gives us a sum of squares between groups of 23. We find for our dataset 1 that the sum of squares between groups is actually a much smaller proportion of the total sum of squares than the sum of squares within groups.
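
The subtraction can be checked with the figures quoted in the video. This is a sketch on the rounded values only, with variable names made up here. With these rounded inputs the difference comes out as 22 rather than the 23 quoted, presumably because the slide values were rounded before being read out.

```python
ss_total = 11176                  # sum of squares total, as quoted
ss_within = 3950 + 3648 + 3556    # group 1 + group 2 + group 3, as quoted

# Sum of squares between = total minus within.
ss_between = ss_total - ss_within
print(f"sum of squares between = {ss_between}")

# Share of the total variation attributable to each source.
prop_between = ss_between / ss_total
prop_within = ss_within / ss_total
print(f"between: {prop_between:.1%} of total, within: {prop_within:.1%} of total")
```

Either way, the between-groups share is a fraction of one percent of the total, which is the point being made about dataset 1.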

- 08:29
DANIEL LITTLE [continued]: So we computed the sum of squares for three different quantities -- one for the total group. The sum of squares total equals the squared deviations of each score from the mean. So we take each score, we subtract from it the mean, square them, and then add them up. And this quantity equals the sum of squares within plus the sum

- 08:52
DANIEL LITTLE [continued]: of squares between. We could also write out the sum of squares between and the sum of squares within scores using the same type of notation. Here we would note that what we're doing is taking each score within each group, subtracting off the mean of that group -- so I've now indexed the scores by j,

- 09:14
DANIEL LITTLE [continued]: and the mean of each group by i -- by mi. So these -- this should have potentially a little i here indicating that we only want to include the scores from each group within that particular group. We subtract off of those scores the mean of the group that they come from, square them, and then add them up within groups.

- 09:36
DANIEL LITTLE [continued]: And then we add up across all three of our groups. For the sum of squares within -- or between, rather. For the sum of squares between. What we do is we, again, take the total mean. We subtract that from the means of each of our groups -- so this mi term -- square it, and multiply by the number of samples that we have in each of our groups,

- 09:58
DANIEL LITTLE [continued]: and add those up across all of our samples to get our sum of squares between. In summary, for dataset 1, there was no substantial difference between the means. The variation between groups is much less than the variation within groups, and more importantly it represents

- 10:20
DANIEL LITTLE [continued]: a much smaller proportion of the total variance than the sum of squares within groups. In the next video what I'll do is look at the second data set in which we built in a difference between the means. What we're doing in the ANOVA is comparing now our within-groups variation, represented

- 10:40
DANIEL LITTLE [continued]: by our sum of squares -- which is variation just due to chance alone -- to our between-groups variation, which is due to chance alone, plus some effect. In the first data set -- because our sum of squares between was a much smaller proportion of the total variation than our sum of squares within -- this effect is actually quite small.

- 11:02
DANIEL LITTLE [continued]: All of those populations -- or all of those samples were generated from the same population.
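
The three quantities described in this video can also be computed directly from their definitions, which makes it easy to confirm that the sum of squares total equals the sum of squares within plus the sum of squares between. The sketch below uses a tiny made-up data set and a hypothetical helper named `ss_partition`; it is an illustration of the definitions, not the computation behind the video's slides.

```python
import math
from statistics import fmean


def ss_partition(groups):
    """Return (ss_total, ss_within, ss_between) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    m = fmean(all_scores)                  # total (grand) mean
    group_means = [fmean(g) for g in groups]

    # Total: each score minus the total mean, squared, summed.
    ss_total = sum((x - m) ** 2 for x in all_scores)

    # Within: each score minus its OWN group's mean, squared, summed
    # within each group and then across groups.
    ss_within = sum(
        (x - m_i) ** 2 for g, m_i in zip(groups, group_means) for x in g
    )

    # Between: each group mean minus the total mean, squared, multiplied
    # by the number of scores in that group, summed across groups.
    ss_between = sum(
        len(g) * (m_i - m) ** 2 for g, m_i in zip(groups, group_means)
    )
    return ss_total, ss_within, ss_between


# Hypothetical toy data with clearly different group means (2, 5 and 8).
toy_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ss_total, ss_within, ss_between = ss_partition(toy_groups)
print(ss_total, ss_within, ss_between)  # 60.0 6.0 54.0

# The partition holds: total = within + between.
assert math.isclose(ss_total, ss_within + ss_between)
```

In this toy example most of the total variation is between groups, the opposite pattern to dataset 1, where the groups were all drawn from the same population.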

### Video Info

**Series Name:** Statistics for Psychology

**Episode:** 8

**Publisher:** University of Melbourne

**Publication Year:** 2014

**Video Type:** Tutorial

**Methods:** One-way analysis of variance

**Keywords:** anxiety; anxiety assessment; mathematical concepts

### Segment Info

**Segment Num.:** 1

**Persons Discussed:**

**Events Discussed:**

**Keywords:**

## Abstract

In this 8th chapter of his series on statistics for psychology, Professor Daniel Little begins a three-part section about one-way ANOVA. He introduces the analysis of variance concept and provides a thorough explanation of the sum of squares.