- 00:00
[MUSIC PLAYING] [An Introduction to Experimental Designs II]

- 00:11
MAHTASH ESFANDIARI: My name is Mahtash Esfandiari. I'm on the faculty in the Department of Statistics at UCLA and also the director of the statistical processing center in my department. In this video, I'm going to be talking about the different methods of analysis of variance as it gets applied to non-repeated, repeated, [INAUDIBLE], and mixed design.

- 00:32
MAHTASH ESFANDIARI [continued]: [One Way ANOVA With Fixed Effects] I'm going to be starting with one-way analysis of variance with fixed effects. The reason that this is called one-way analysis of variance is because we have a single independent variable.

- 00:55
MAHTASH ESFANDIARI [continued]: The independent variable is categorical, and it can have as many levels as you want. And the dependent variable is always going to be numerical. An analysis of variance is an extension of the two-sample test of the mean. In other words, it can be used with a categorical variable with two or more levels.

- 01:22
MAHTASH ESFANDIARI [continued]: The question that people sometimes ask is, why couldn't we just run a whole bunch of two-sample tests? Let's say you're comparing African American with Asian American and Caucasian. Why not just compare Asian American with African American, and African American with Caucasian, and Caucasian with Asian American?

- 01:43
MAHTASH ESFANDIARI [continued]: That is because statistically that's wrong, and you are going to inflate your type one error. For instance, if you do a t-test three times and you consider them to be independent events, then your type one error goes to 0.14, whereas, by default, you usually set it equal to 0.05.
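
The 0.14 figure can be reproduced with a short calculation. This is a minimal sketch, assuming the three tests are independent and each is run at the 0.05 level; the function name `familywise_error` is made up for illustration.

```python
def familywise_error(alpha: float, m: int) -> float:
    """Probability of at least one false rejection across m independent tests,
    each run at significance level alpha."""
    return 1 - (1 - alpha) ** m


# Three pairwise t-tests, each at the conventional 0.05 level.
rate = familywise_error(0.05, 3)
print(round(rate, 2))
```

With a single test the same function returns 0.05 again, which is why one overall test is preferred to several pairwise ones.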

- 02:05
MAHTASH ESFANDIARI [continued]: What do we mean by fixed effects? Fixed effects means comparing means in specific populations. For example, comparing three methods of teaching, comparing three drugs, comparing three ethnic backgrounds-- these are specific levels. Another example of a fixed effect would be four levels of education-- graduate, four-year college, some college, no college.

- 02:33
MAHTASH ESFANDIARI [continued]: And what we do in one-way ANOVA is, we want to compare the differences between multiple means. But there are some assumptions that we have to make sure that we follow. Before that, I would like to also talk a little bit about what we call random effects as compared to fixed effects and why they're important.

- 03:01
MAHTASH ESFANDIARI [continued]: Because the statistical methods that are used to analyze random effects and the way we state the null and the alternative hypotheses are totally different than for fixed effects. Let me give you an example. Suppose that we have two categories of painkillers-- category A and category B. And suppose that within each category there are 10 painkillers.

- 03:25
MAHTASH ESFANDIARI [continued]: Then we randomly pick one painkiller from category A, we randomly pick one painkiller from category B. We do our study, and we figure that the painkiller from category A was better than the painkiller in category B. Then that result can be generalized to all the painkillers in category A. Then we can conclude that all those 10 painkillers in category A are better than all those painkillers in category B. Whereas when we talk about the fixed effect, it's only specific levels that we talk about.

- 04:00
MAHTASH ESFANDIARI [continued]: And then I'm going to give you an example of a one-way ANOVA. Suppose that there is a research question you want to ask, and we want to say is there any relationship between BMI-- that's body mass index-- and the person's blood pressure? OK, so we know here that BMI is a numerical variable.

- 04:22
MAHTASH ESFANDIARI [continued]: And one of the things that you have to know, you can always take a numerical variable and change it into categories, but you can't do it the other way around. You can't take a categorical variable and change it into a numerical one. So let's say we take the BMI, which is numerical, and we change it into three levels based on the literature-- people who we consider obese, people who we call average weight, and people who we call low-BMI.

- 04:50
MAHTASH ESFANDIARI [continued]: And then the question that we ask here is whether the average systolic blood pressure is similar for high, medium, and low. Therefore, the way we're going to be-- another thing that we can say is, equivalently, we can say there's no relationship between BMI and systolic blood pressure, or we could say that the average level of systolic blood pressure is similar for high, average, and low BMI.

- 05:25
MAHTASH ESFANDIARI [continued]: Mathematically speaking, the way we want to write the null and the alternative hypotheses, we are going to say mu of the blood pressure for high BMI is equal to the mu of blood pressure for average BMI, is going to be equal to the mu of blood pressure for low BMI. As you know, mu is the mean in the population, and the null hypotheses are always written about parameters.

- 05:51
MAHTASH ESFANDIARI [continued]: The alternative hypothesis would be that at least one pair of these means is not equal to each other. In order for you to reject the null, at least one of these pairs should not be similar. And at least one pair, what do I mean? There are three particular populations, so you can consider-- you can make three pairs-- high with average, high with low, medium with low.

- 06:15
MAHTASH ESFANDIARI [continued]: And in order for you to reject the null, it's enough for one of these pairs not to be similar. Mathematically speaking, we write the alternative hypothesis as mu j does not equal mu j prime. So j is the sub-population. I'm going to say j and j prime-- that means they are two different sub-populations.

- 06:36
MAHTASH ESFANDIARI [continued]: Another example of analysis of variance would be to compare three methods of weight loss. And then what you do, you measure the weight before, you measure the weight after. So your null hypothesis is that the average weight loss is similar for the three programs. And now I want to tell you, what do we mean by analysis of variance conceptually?

- 07:02
MAHTASH ESFANDIARI [continued]: OK, some statisticians just define statistics as the study of variation. Therefore, by analysis of variance, we mean we take the variance in the outcome variable-- let's say weight-- and we divide it into two parts-- the part we can explain, and the part that we cannot explain.

- 07:25
MAHTASH ESFANDIARI [continued]: In this particular study, the part that we can explain is BMI, and the part that we cannot explain is all the confounding factors, all the factors that can affect your outcome. And therefore, mathematically speaking, we are dividing the total variation into two parts, one we call the between effect, and the other one we call the within effect.
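
The split into a between part and a within part can be checked numerically. The sketch below is illustrative only: the blood pressure readings are invented, and the function name `sum_of_squares_partition` is hypothetical. It uses the standard sums-of-squares identity, total = between + within.

```python
from statistics import mean

# Hypothetical systolic blood pressure readings per BMI group (made-up values).
groups = {
    "low": [110.0, 115.0, 120.0],
    "average": [120.0, 125.0, 130.0],
    "high": [135.0, 140.0, 145.0],
}


def sum_of_squares_partition(groups):
    """Return (total, between, within) sums of squares for a one-way layout."""
    all_values = [y for ys in groups.values() for y in ys]
    grand = mean(all_values)
    # Total variation: every observation against the grand mean.
    ss_total = sum((y - grand) ** 2 for y in all_values)
    # Between: each group mean against the grand mean, weighted by group size.
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    # Within: each observation against its own group mean.
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return ss_total, ss_between, ss_within


ss_total, ss_between, ss_within = sum_of_squares_partition(groups)
print(ss_total, ss_between, ss_within)
```

In this toy data set most of the variation sits in the between part, which is the pattern one would expect if BMI group were strongly related to blood pressure.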

- 07:50
MAHTASH ESFANDIARI [continued]: Let me explain. Let's say that you have 60 individuals in the different weight groups. Part of the differences is due to the differences in their BMI. That's the between difference. The other is the within difference. Because if 20 people belong to the same BMI group, they have their own individual differences.

- 08:13
MAHTASH ESFANDIARI [continued]: And those are called the within differences. Therefore, generically, we write the null hypothesis as mu 1 is equal to mu 2 is equal to mu j, and there is no limit on the number of populations you can compare. And the alternative, we write mu j does not equal mu j prime.
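
Written out in symbols, the generic hypotheses described here would look roughly as follows. The upper index $J$ for the number of populations is notation added for this sketch; the speaker only refers to "mu j" and "mu j prime".

```latex
H_0:\ \mu_1 = \mu_2 = \cdots = \mu_J
\qquad
H_a:\ \mu_j \neq \mu_{j'} \ \text{for at least one pair } j \neq j'
```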

- 08:36
MAHTASH ESFANDIARI [continued]: [Assumptions of One-Way ANOVA] Next thing I want to talk about is, any statistical test that you run, you need to make some assumptions. The assumptions of one-way ANOVA, the first thing is the independence, which means the participants can belong to one and only one group.

- 08:58
MAHTASH ESFANDIARI [continued]: You can be in one BMI group, you can be in one ethnic background group, or so on and so forth. You cannot be in two groups at the same time. The second assumption of ANOVA that we want to talk about is called equality of variance. We want to make sure that the populations from which you draw the samples have similar variances.

- 09:19
MAHTASH ESFANDIARI [continued]: In other words, the assumption being sigma 1 squared is equal to sigma 2 squared is equal to sigma j squared. The third assumption we want to talk about is normality, which means the outcome variable follows the normal model. However, what you want to know is that if the sample sizes are pretty large, let's say 30 or more, and more important, close to each other, then ANOVA is going to be pretty robust with respect to violation of these assumptions.

- 09:46
MAHTASH ESFANDIARI [continued]: And by robust, we mean that if you violate these assumptions, you can still continue to conduct your analysis of variance methods. And one of the examples I'm going to talk about here is that a study was conducted to examine the relationship between level of education and discrimination against women.

- 10:15
MAHTASH ESFANDIARI [continued]: So our predictor is level of education, which has got three levels, including four-year college, two-year college, and high school or less. And the outcome variable is the numerical variable, which is attitude toward discrimination against women. And it was evaluated with a scale, and the range is from 0 to 100.

- 10:37
MAHTASH ESFANDIARI [continued]: So what we're doing is that we're taking the total variance in discrimination against women and dividing it into two parts. One is called the between part, which is the differences in the level of education, and the other one is called the within part, which is the differences that exist within the subjects in four-year college, in two-year college, et cetera.

- 11:00
MAHTASH ESFANDIARI [continued]: [Overall F-test of ANOVA] I'm not going to get into a lot of mathematical details here, and I'll try more to present you with a conceptual approach to analysis of variance. So the first thing we do, we call it an overall F-test. An F-test is a statistical test that we conduct, and it's the ratio of the between variance to the within variance.
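
The ratio can be sketched in a few lines. This is an illustrative example with invented attitude scores, not the data from the study; the function name `f_statistic` is hypothetical. It assumes the usual definitions: the between variance is the between sum of squares divided by (number of groups minus 1), and the within variance is the within sum of squares divided by (total sample size minus number of groups).

```python
from statistics import mean

# Hypothetical attitude scores (0 to 100 scale) by level of education; made-up values.
scores = {
    "high school": [50.0, 48.0, 49.0, 47.0],
    "two-year college": [44.0, 43.0, 45.0, 44.0],
    "four-year college": [42.0, 41.0, 43.0, 42.0],
}


def f_statistic(groups):
    """Overall F: between-group mean square divided by within-group mean square."""
    all_values = [y for ys in groups.values() for y in ys]
    grand = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total sample size
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


f_value = f_statistic(scores)
print(f_value)
```

Both mean squares are built from squared deviations, so the ratio cannot be negative. Turning the F-value into a P-value requires the F distribution, which statistical software such as R, SPSS, or SAS provides.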

- 11:31
MAHTASH ESFANDIARI [continued]: And, as you see, F is never going to be negative, because the variance is never negative, either. So two things happen-- you run an F-test, two things happen. The F-test is statistically significant, or the F-test is not statistically significant. If the F-test is not statistically significant, you stop.

- 11:53
MAHTASH ESFANDIARI [continued]: You're done. In the previous example, if the F-test is not significant, you figure that there's no relationship between level of education and discrimination against women, or, on average, people with different levels of education discriminate about the same against women. But let's say the F-test is statistically significant, you're not done.

- 12:14
MAHTASH ESFANDIARI [continued]: You need to compare the different levels of education and figure out which ones of them are different from each other. And that is what we mean by post hoc analysis. And I'm going to give you an example-- I just ran this in R. I'm going to show you what I found.

- 12:35
MAHTASH ESFANDIARI [continued]: And I found the overall F-value to be 18, about 18. I found the P-value to be almost zero, which means in that case, I reject the null. And I conclude that education is a significant predictor of discrimination against women, but I need to conduct post hocs to figure out which level is higher and which level is lower.

- 13:00
MAHTASH ESFANDIARI [continued]: And the first thing I need to do is that I need to look at the means. I look at the means, I figure out that as the level of education increases, then the discrimination against women decreases. And when I look at the post hoc analysis, I figure out that there is no statistically significant difference between two-year college and four-year college.

- 13:28
MAHTASH ESFANDIARI [continued]: There is a statistically significant difference between high school and four-year college, and there is a statistically significant difference between two-year college and high school. Which means that if you go to college, you are less likely to discriminate against women. And if you look at the plot of the means, you see that the mean of the high school is high up there, at about a value of 48 or 49.

- 13:57
MAHTASH ESFANDIARI [continued]: And you look at the mean of two-year college and four-year college, and they're pretty close to each other, like 42 and 44. So basically, that plot of the means is very, very useful, because it shows you that the two-year college and four-year college, on average, discriminate about the same against women.

- 14:18
MAHTASH ESFANDIARI [continued]: And this is what we mean by post hocs. So you have three groups, and you could make three pairwise comparisons. 1 with 2, 2 with 3, and 1 with 3. And when we say at least one needs to be different to reject the null, here we see that two of them are different from each other-- high school from four-year, and high school from two-year.
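
The three pairwise comparisons can be enumerated mechanically. The sketch below also shows a Bonferroni-style per-comparison level, which is an assumption made for illustration: the talk does not say which post hoc method was used in R.

```python
from itertools import combinations

# The three education levels from the example.
levels = ["high school", "two-year college", "four-year college"]

# Every unordered pair of levels: k * (k - 1) / 2 comparisons for k groups.
pairs = list(combinations(levels, 2))

# Assumed adjustment (not named in the talk): divide the overall 0.05 level
# by the number of comparisons.
alpha = 0.05
per_comparison_alpha = alpha / len(pairs)

for first, second in pairs:
    print(f"{first} vs {second}: test at level {per_comparison_alpha:.4f}")
```

With four groups the same code would list six pairs, which shows how quickly the number of post hoc comparisons grows.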

- 14:45
MAHTASH ESFANDIARI [continued]: I'm now going to talk about an interesting article which is about progressive loss of gray matter among three groups of people. In one, the participants have a childhood onset of schizophrenia. In another one, they have multidimensional impairment, and the last is healthy controls.

- 15:10
MAHTASH ESFANDIARI [continued]: And the outcome variable is the total loss of the gray matter. You can look at this article on the website. And the results are-- what they have done is exactly what I talked about. First, they do a one-way ANOVA, comparing the average loss of gray matter in the three groups, followed by post hoc analysis.

- 15:35
MAHTASH ESFANDIARI [continued]: And as it is clear from the plot, generally speaking, the childhood-- the people with childhood onset of schizophrenia lose a lot more gray mass in their brain, and that loss is statistically significantly different from the healthy control and the people with multidimensional impairment.

- 15:56
MAHTASH ESFANDIARI [continued]: But the loss of the gray matter for the people with multidimensional impairment is not different, statistically, from the healthy control. They also look at the loss of the gray matter in the frontal, parietal and the different lobes of the brain, but the only part that I concentrated on was the total loss of the gray matter, which is the far-left set of bars, as you see.

- 16:26
MAHTASH ESFANDIARI [continued]: [Two-Way ANOVA] What do we mean by two-way ANOVA? Remember I said, we say one-way ANOVA because you only have one predictor, or one independent variable? The reason we say two-way ANOVA is because we are looking at the combined effect of two categorical variables on a single numerical outcome.

- 16:54
MAHTASH ESFANDIARI [continued]: And this combined effect is very important in statistics, and that's many times what people are interested in. And it is referred to as an interaction effect. I'm going to give you an example of it. Let's say we want to find out if the effects of two methods of teaching statistics-- lecture only, and lecture plus simulation-- are similar on the learning of the central limit theorem, and whether this varies by major-- social science majors and statistics majors.

- 17:29
MAHTASH ESFANDIARI [continued]: Therefore, what are our variables? One is the method of teaching, which is lecture versus lecture plus simulations. And the other one is the two majors that we are dealing with, social science and statistics, and the outcome variable is learning of the central limit theorem. So my question is, are these two methods going to be equally effective for teaching the social science students and the statistics students?

- 17:58
MAHTASH ESFANDIARI [continued]: So I'm looking at the combined effect, and this is called a two-way ANOVA. If I had added a third factor like gender, it would've been a three-way ANOVA, et cetera, a four-way ANOVA. And so this can get more complicated and more complicated. And so major and teaching method are the independent variables, and the outcome variable is the knowledge of the central limit theorem, which is going to be a numerical variable.

- 18:23
MAHTASH ESFANDIARI [continued]: In all methods of analysis of variance, your outcome variable is going to be numerical, and your predictors are going to be categorical. Another example of two-way ANOVA with fixed effects would be, let's say we have 60 volunteers, 30 men and 30 women, who want to participate in weight loss programs.

- 18:47
MAHTASH ESFANDIARI [continued]: So here I have 30 men and 30 women. I'm going to randomly assign them into one weight loss program or the other, and the outcome is the weight loss. So this is an example where weight is like-- sorry, gender is like a blocking variable.

- 19:07
MAHTASH ESFANDIARI [continued]: So here, then I have 30 men, 30 women. And then I put 10 in weight loss program number one, 10 in weight loss program number two, 10 in weight loss program number three, and this is done randomly. So then I have gender with two levels, then I have weight loss program with three levels.

- 19:30
MAHTASH ESFANDIARI [continued]: Then this is going to generate three null hypotheses. One is going to be about equality of mean of weight loss for men and women, another one, equality of weight loss for the three methods of weight loss, and the last one will be whether the combined effect of the two methods-- the combined effect of the two factors on weight loss is the same.

- 19:55
MAHTASH ESFANDIARI [continued]: So what did I do here? I partitioned the total variance in the weight loss into two parts. The part I can explain, the part I cannot explain. But the part that I can explain now consists of three factors-- the factor of gender, the factor of weight loss program, and the combined effect of the two, and the piece that I cannot explain.
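
The four-piece partition can be verified on a toy data set. This is a minimal sketch, assuming a balanced design (the same number of people in every gender-by-program cell); the weight loss values and the function name `two_way_partition` are invented for illustration.

```python
from statistics import mean

# Hypothetical weight losses per (gender, program) cell; made-up, balanced data.
cells = {
    ("men", "program 1"): [4.0, 6.0],
    ("men", "program 2"): [7.0, 9.0],
    ("men", "program 3"): [10.0, 12.0],
    ("women", "program 1"): [3.0, 5.0],
    ("women", "program 2"): [8.0, 10.0],
    ("women", "program 3"): [6.0, 8.0],
}


def two_way_partition(cells):
    """Sums of squares for factor A, factor B, their interaction, and within.

    Assumes every cell holds the same number of observations.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    all_values = [y for ys in cells.values() for y in ys]
    grand = mean(all_values)
    a_mean = {a: mean([y for (ai, _), ys in cells.items() if ai == a for y in ys])
              for a in a_levels}
    b_mean = {b: mean([y for (_, bi), ys in cells.items() if bi == b for y in ys])
              for b in b_levels}
    cell_mean = {key: mean(ys) for key, ys in cells.items()}
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    # Interaction: what is left of each cell mean after removing both main effects.
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_within = sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys)
    ss_total = sum((y - grand) ** 2 for y in all_values)
    return {"gender": ss_a, "program": ss_b, "interaction": ss_ab,
            "within": ss_within, "total": ss_total}


parts = two_way_partition(cells)
print(parts)
```

The first three entries together are the explained part, and "within" is the unexplained part. With unequal cell sizes this simple additive identity no longer holds in general.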

- 20:20
MAHTASH ESFANDIARI [continued]: [Repeated Measures & Mixed Design ANOVA] The next method I want to talk about is called repeated measures ANOVA. This is an extension of the paired sample, or the pre-post design. In other words, the same subjects are tested over and over again.

- 20:45
MAHTASH ESFANDIARI [continued]: For example, an oncologist is following 14 patients who suffer from chronic leukemia. He measures the white blood cells at time point zero, after three months, after six months. Then the null hypothesis becomes that the mu of the blood count at baseline is equal to the mu of blood count after three months is going to be equal to the mu of blood counts after nine months.

- 21:09
MAHTASH ESFANDIARI [continued]: But the thing here is that the null hypothesis looks like the one-way ANOVA, where the groups are independent. But mathematically speaking, these groups are not independent. These groups are dependent. The data between them are correlated. And you have to be very cautious in statistics to distinguish between repeated, or dependent, data and independent data.

- 21:37
MAHTASH ESFANDIARI [continued]: Because mathematically speaking, they are totally different. The correlation between independent data is equal to zero. The correlation between dependent data is not. And the last design that I want to conceptually introduce to you is a design that is called a mixed design. And a mixed design is a mix of what we call a between subjects factor and a within subjects factor.

- 22:05
MAHTASH ESFANDIARI [continued]: A between subjects factor would be something like gender-- men and women. A within subjects factor would be something like repeatedly looking at, for example, weight measures, or looking at blood pressure, or looking at students' grades in quiz one and quiz two and quiz three and quiz four.

- 22:29
MAHTASH ESFANDIARI [continued]: This would be the repeated measure. And then on this side, your between subject could be like students from different majors. It could be different GPAs-- above average GPA, below average GPA, and so on. One of the things that I can talk about here, an example that I can talk about here, is two methods of diet.

- 22:52
MAHTASH ESFANDIARI [continued]: Let's say diet method one and diet method two. That's the between subject. And then I have participants, let's say 1 to 20 in the first method, and 21 to 40 in the second method. And then what I'm going to do, I'm going to look at their weight at baseline, I'm going to look at their weight after two months, after four months, after six months.

- 23:14
MAHTASH ESFANDIARI [continued]: So here I am dividing the variance, again, into two parts-- the part I can explain, the part I cannot explain. The part I can explain is due to three things. One is the type of diet. The other is the repeated measures, and the next one is the combined effect of the two together.

- 23:35
MAHTASH ESFANDIARI [continued]: I'm going to give you an example of a mixed design. This is a data set that I was allowed to use by Professor Bijan Siassi. He's a professor of pediatrics of children's heart at USC School of Medicine. And they did a very interesting study. They took like 200, 300 some newborns.

- 23:58
MAHTASH ESFANDIARI [continued]: And these are the newborns that have very small weight, between 500 grams and 1,500 grams at birth. So what we did, we divided them into three groups-- if they weighed between 500 and 750 grams, 750 grams to 1,000, and let's say 1,000 to 1,500. So these are three weight groups.

- 24:20
MAHTASH ESFANDIARI [continued]: And so this is the between subject. And then the within subject is that the blood pressure was measured from the time that they were born to eight weeks after birth. So the question that they were interested in was whether the blood pressure of these infants-- the pattern of change of the blood pressure over time is different for the different weight groups.

- 24:47
MAHTASH ESFANDIARI [continued]: And interestingly enough, the pattern of change was very much the same. Which means from week one to week five or so, the blood pressure kept on increasing in all three groups. And then from five to eight, it sort of plateaued. So here we can say there is no interaction effect, because the pattern of change in the blood pressure is the same.
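
One descriptive way to see "same pattern of change" is to check whether the group profiles are parallel. The sketch below uses invented mean blood pressures (not the study's data) and a hypothetical helper, `interaction_residuals`, which removes the group effect and the time effect from each cell mean; if nothing is left, the profiles are parallel. This is a look at the means only, not a significance test.

```python
from statistics import mean

# Hypothetical mean blood pressure per weight group at four time points
# (for example weeks 1, 3, 5, and 8); values are made up for illustration.
profiles = {
    "500-750 g": [40.0, 44.0, 48.0, 48.0],
    "750-1000 g": [43.0, 47.0, 51.0, 51.0],
    "1000-1500 g": [46.0, 50.0, 54.0, 54.0],
}


def interaction_residuals(profiles):
    """Cell mean minus group mean minus time mean plus grand mean, per cell."""
    groups = list(profiles)
    n_times = len(profiles[groups[0]])
    grand = mean([v for vals in profiles.values() for v in vals])
    group_mean = {g: mean(profiles[g]) for g in groups}
    time_mean = [mean([profiles[g][t] for g in groups]) for t in range(n_times)]
    return {
        g: [profiles[g][t] - group_mean[g] - time_mean[t] + grand
            for t in range(n_times)]
        for g in groups
    }


residuals = interaction_residuals(profiles)
print(residuals)
```

In this made-up example every group rises and then plateaus by the same amounts, so all residuals are zero, while the group means still differ, which mirrors a group effect and a time effect without an interaction.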

- 25:14
MAHTASH ESFANDIARI [continued]: But if you look at the plot, you see that the ones that have a higher weight, on average, have a higher blood pressure. So the higher the weight, the higher the blood pressure. The ones in the weight group of 1,000 to 1,500 grams have a higher blood pressure than the ones that have 750 to 1,000, and higher than the ones who have 500 to 750.

- 25:37
MAHTASH ESFANDIARI [continued]: So we're testing three null hypotheses here. Number one, is the average blood pressure different for the three groups, weight groups? The answer is yes. Two-- did the average blood pressure change over time? The answer is yes, it went up. And the third one is the combined effect of the two, which means is the trend of change of blood pressure over time similar for the three weight groups?

- 26:04
MAHTASH ESFANDIARI [continued]: And the answer is yes. Now what I have to mention is that this is a very, very conceptual introduction to the more advanced methods of analysis of variance. The reason it's important to go over this subject is because in the real world, your data are not limited to two groups, and you're not interested in looking only at one factor.

- 26:29
MAHTASH ESFANDIARI [continued]: You're interested-- in the real world, you're interested in looking at multiple factors, and you're interested in looking at the combined effect of the multiple factors. Therefore, that was our major goal, to give you like a taste of these more advanced experimental designs and how important they can be in the real world in different areas, including medicine, education, psychology-- any area that involves some kind of experimentation or treatment.

- 27:05
MAHTASH ESFANDIARI [continued]: For reference, one of the references that I like is Roger E. Kirk's Experimental Design. Of course, the more important part is the data analysis techniques that you can use to analyze this data. And there are multiple software packages that I can recommend. You can use SPSS.

- 27:26
MAHTASH ESFANDIARI [continued]: That allows you to do a lot of analysis without having to do much programming, a lot of pointing and clicking. But I also recommend R highly, because R is a free software. You can just download the Mac version or the PC version and start working with it.

- 27:47
MAHTASH ESFANDIARI [continued]: I also can recommend SAS. And it all depends on what's available to you in your particular institution. Thank you very much. [MUSIC PLAYING]

### Video Info

**Publisher:** SAGE Publications Ltd.

**Publication Year:** 2017

**Video Type:** Tutorial

**Methods:** Analysis of variance, One-way analysis of variance, Two-way analysis of variance

**Keywords:** mathematical concepts; practices, strategies, and tools

### Segment Info

**Segment Num.:** 1

**Persons Discussed:**

**Events Discussed:**

**Keywords:**

## Abstract

Professor Mahtash Esfandiari discusses experimental designs and analysis of variance. Analysis of variance can be applied to non-repeated, repeated, and mixed designs. Esfandiari also discusses the F-test and offers examples of research using these methods.