Hello, welcome to this lesson in Mastering Statistics. Here we're going to start to use the central limit theorem to explore certain topics. And so for this lesson, and the next several lessons, we'll drill down into some problems of increasing complexity. And so by the end of it, I'm hoping, and I really do believe, that you'll start to see and understand why the central limit
theorem is actually useful. Unfortunately, it took a lot of groundwork to get here. So let's go ahead and do it and show ourselves that. So this is what we call applying the central limit theorem to population means. All right, so we know from previous lessons that when we have a sample size greater than 30, that's the magic number that's been studied.
And when we have a sample size greater than 30, what it's basically telling us is the sampling distribution of sample means will look approximately like a normal distribution. Doesn't matter what the original distribution looks like. If our sample size is greater than 30, we're going to be in good shape for our sampling distribution to be approximately normal. So what we're going to end up doing in this section
is learning how to use the normal tables in the back of your book to solve practical problems. But don't forget though that when we solved all of the normal distribution problems before, we used something called the z-score. Because if you remember, and this was all covered in previous volumes of Mastering Statistics, the normal distributions all look about the same as far as the shape.
But some can be fatter. Some could be skinnier. They have slightly different means and standard deviations. So in order to be able to solve problems, we have a standard normal distribution. That's in the back of your book. And it's in table form. And you can use it to solve problems. But in order to use it, you have to calculate a z-score that you can then look up in the chart
and get the answer you're searching for. So all of that background material is covered in the previous volume of Mastering Statistics. I'm assuming that you know how to use a z-chart table. If you don't, you need to go back to my previous lessons and really learn that to move on. But anyway, when we were using the z-score from before, if you remember, a z-score was equal to the value I was going to explore minus the mean divided
by the standard deviation. So I would plug in the mean of my distribution, the standard deviation of my distribution, the point of interest, and I would get a z-score. The z-score is what you look up in the chart. And you can find the probability, which is the area under the curve. But we're not actually doing problems of exactly the same type here because we're not looking at individual data points
as part of a normal distribution. What we're doing is we're sampling a population. And we get a new distribution called the sampling distribution. And we've learned some of the properties of the sampling distribution. We've already talked about that. But still, we know that if we choose a sample size greater than 30, we can use the normal chart.
So what we do is we replace this guy, for the sampling distribution problems, with the following z-score. Instead of x, you'll put x bar. I'll explain why in a minute. And you subtract, instead of the mean of the population, which we don't always know, the mean of the sample means, that is, the mean of the sampling distribution.
Now on the bottom, instead of the standard deviation of the population, we put the standard deviation of the sample means. Now you can understand why we talked so much about these concepts in the previous section. Because once you know the mean and the standard deviation of any normal curve, then you can calculate a z-score associated with it.
And you can use the charts and tables in the back of any statistics book to solve problems. All we've done is replace the mean with the sampling distribution mean, and the standard deviation with the sampling distribution standard deviation. The reason that this is x bar here instead of x just comes about because when we're doing sampling distributions, we're always looking at a sample.
And we're calculating a sample mean for every one of those guys. So really what you plug in here is a sample mean. You'll understand a little bit more as we solve some problems. Now this is 100% correct, right? But you could simplify it a little bit. And you could also say that x bar would stay the same.
We've already learned that the mean of the sample means is always going to basically be equal to the population mean. So we can replace this with the population mean if we want. And we also know what this is equal to. If you remember, it's the standard deviation of the population divided by the square root of the sample size. The sample size is n right here. So whichever one of these you want to use is fine by me.
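Written out, the two equivalent forms of the z-score described here look like the following. This is only a restatement of the spoken formulas; the subscripted symbols for the mean and standard deviation of the sampling distribution are notation assumed here, since the board itself isn't in the transcript.

```latex
z \;=\; \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}
  \;=\; \frac{\bar{x} - \mu}{\sigma / \sqrt{n}},
\qquad \text{since } \mu_{\bar{x}} = \mu
\text{ and } \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} .
```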
I don't really care. If you want to calculate mu x bar and sigma x bar and then just put them in here, that's what I do, to be honest with you. But if you don't want to do that, you could just stick the original data into here. And you'll still get a z-score. Either way, you're getting exactly the same thing. So what I want to do is kind of circle these. These are important. And then I think at this point, I
don't want to talk about them anymore. I want to solve a problem. Because a lot of times, you can talk about equations over and over again, but when you solve a problem, it becomes much easier to understand. So here's our first problem. Very practical, because everyone has some experience with this. Body temperatures of adults are normally distributed with a mean of 98.6 degrees Fahrenheit
and a standard deviation of 0.73 degrees Fahrenheit. Find the probability of a sample of 36 adults having an average body temp of less than 98.3 degrees Fahrenheit. Now this is a problem that you would not be able to solve unless you understood the central limit theorem. This is the kind of problem that would just be intractable.
You wouldn't know how to attack it. Notice that it's very similar to some of the problems we've done in the past. I've asked you problems many, many times in the past where I've told you, hey, the mean body temperature of the population is 98.6. The standard deviation's 0.73 or whatever it is. Find the probability that a person at random will have a body temp less than 98.1.
That you know how to do. We've done those kinds of problems before. But that's not what we're saying. We're saying we're given information about the population; this is one of those rare times when we actually know a lot about the population. But we're asked to find the probability that a sample of 36 adults has an average body temperature less than 98.3 degrees.
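For contrast, here is a minimal sketch of the older, single-person kind of problem just mentioned (one person at random below 98.1). It assumes Python's standard library and swaps the printed z-table for the error function; the helper name `normal_cdf` is made up for this illustration.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z (replaces the z-table)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Population information quoted in the lesson.
mu = 98.6      # population mean, degrees Fahrenheit
sigma = 0.73   # population standard deviation

# Single-person version: plain z-score, no square root of n involved.
x = 98.1
z_single = (x - mu) / sigma
p_single = normal_cdf(z_single)

print(f"z = {z_single:.2f}, P(X < {x}) = {p_single:.4f}")
```

The only thing that changes for the sample-mean version is the denominator, which becomes the standard deviation of the sampling distribution.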
Now this is a little confusing at first because it doesn't say anything about a sampling distribution in here. It doesn't tell you that you're doing a sampling distribution or anything like that. But what you need to realize is that the word sample is in there. And it has 36 adults. Now 36 is greater than 30. So that's good. So we know, hypothetically speaking, that if I go select 36 people from the population
and calculate their average body temperature, I'll get a sample mean for sample number one of 36 people. Then I go get 36 more people, for sample number two. I'll get a sample mean for that one. 36 for another sample, 36 for another sample, 36 for another. I'll keep going and going and going until I've created a sampling distribution of sample means. Now I expect that on average, especially because my sample
size is greater than 30, I'm going to get a normal distribution of sample means that I've calculated like that. And because it's normal, I can use the z-chart tables that are in the back of your book. But we need to make a little bit of a modification. Now what we're being asked is find the probability that the average of a sample of 36 people is less than 98.3.
That's another way of saying find the probability that a sample mean (where the sample size is defined as 36 people, in this case) will be less than 98.3 degrees. That's what it's basically boiling down to. So let's start drawing some things, and understanding some things, and see what we can get.
So we know that because n is greater than 30, we know that this sampling distribution is normal. OK, so we know that. So that's awesome. That's going to allow us to do this. All right? And we also know that the mean of the sampling distribution,
if we were to go calculate everything and get the sampling distribution, is the same as the population mean, which is 98.6. We know that because that's part of the central limit theorem. Then we also know that the standard deviation of the sampling distribution is equal to the standard deviation of the population over the square root of the sample size. The standard deviation of the population is 0.73.
The number of people that we're sampling at any given time is 36, and the square root of 36 is 6, so this is 0.73 divided by 6. So we should get 0.1217. Now notice that it never said in the problem that we're actually going to create this sampling distribution. But you can assume, for the sake of argument, that you did.
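The arithmetic just described can be laid out as a worked equation. The subscripted symbols are notation assumed here for the mean and standard deviation of the sampling distribution.

```latex
\mu_{\bar{x}} = \mu = 98.6,
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
                 = \frac{0.73}{\sqrt{36}}
                 = \frac{0.73}{6}
                 \approx 0.1217
```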
What if you did actually create this sampling distribution of sample means? What would it look like? Well, we know it looks normal. So let's just draw it. You don't have to draw this to solve the problem. But I'm trying to teach you here. What would it look like? We already know it looks normal, so it's going to look pretty much like this. Now this data here, this curve, this whole thing
that I'm drawing, this comes from samples of 36 people. This is not the curve of the population. This comes from samples of 36 people. In other words, I take 36 people. Find their average temperature. 36 people, find their average temperature. I do that for everybody, and then I can organize my answers.
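If you'd like to see that process rather than just imagine it, here is a rough simulation sketch using only Python's standard library. The number of repeated samples (10,000) and the random seed are arbitrary choices made for this illustration; they are not part of the problem.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

MU, SIGMA = 98.6, 0.73   # population mean and standard deviation from the problem
N = 36                   # people per sample
NUM_SAMPLES = 10_000     # how many times we repeat the sampling (assumed for the demo)

# Take 36 people, find their average temperature; repeat many times.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

center = statistics.fmean(sample_means)   # should land near the population mean
spread = statistics.stdev(sample_means)   # should land near sigma / sqrt(n)
theory = SIGMA / N ** 0.5

print(f"mean of sample means: {center:.3f}")
print(f"std dev of sample means: {spread:.4f} (theory: {theory:.4f})")
```

The simulated center and spread should come out close to 98.6 and 0.1217, though the exact digits depend on the seed and the number of repetitions.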
And I find out that, more often than not, the sample means land near 98.6, so the mean of the sample means that I get is actually 98.6. We talked about that. It comes from the central limit theorem. So basically, the center of this distribution is 98.6, the exact same mean as the population mean. But the standard deviation of the sampling distribution
is actually 0.122. I'm rounding 0.1217 up to 0.122 degrees here. Very, very tight, right, it's very, very tight. In other words, I've drawn it kind of broad here. But actually it's probably pretty tight, pretty narrow, pretty steep there. So it's normal. It's tightly packed around the mean, with a standard deviation of only 0.122.
And that's because I have a fairly large sample size. In this problem here, it's 36. I'm sampling 36 people at a time. Now if I go out on the street and sample 36 people, and do it over and over and over again, I can expect that the sample means that I calculate are not going to vary very much from one another, when sampling that many people at any given
time. But this is what it would look like. Now our problem is asking us, in fact it says right here, find the probability of a sample of 36 adults having an average body temp less than 98.3. So if this is the sampling distribution of sample means, then 98.3 is over here.
So over here, right there, 98.3 is here. What we're really asking for is what is the probability of getting an average value for those 36 people of less than 98.3? So here's 98.6, right in the center. 98.3 is a little bit below it. We're asking what is the area under this curve?
Because I've constructed this sampling distribution. And when I'm trying to calculate the probability here, you can visualize it as trying to find the area under the curve of everything to the left of this temperature. Notice the problem doesn't say you're actually going to construct the sampling distribution. But when you read it, you know that if you're trying to calculate the probability, then
you can refer to it as being a normal distribution because the sample size is greater than 30. So moving on to the punchline, what we basically have here is that in order to use this, we have to use a z-score, in order to actually calculate anything. And we talked about, for sampling distributions, this is what it looks like. We're just copying this straight from the board over here.
And what we're going to put in here is we're going to put the sample mean here, which is 98.3. And we'll subtract from it the mean of the sampling distribution, 98.6. And on the bottom, we're going to put the standard deviation of the sampling distribution, 0.122.
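With those three numbers plugged in, the substitution looks like this. The denominator is written with the unrounded 0.1217 rather than 0.122, which is an assumption about how the quoted z-score was computed.

```latex
z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}
  = \frac{98.3 - 98.6}{0.73 / \sqrt{36}}
  \approx \frac{-0.3}{0.1217}
  \approx -2.47
```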
When we subtract on the top and divide by the standard deviation on the bottom (keeping the unrounded 0.1217; with the rounded 0.122 you'd land closer to negative 2.46), what you're going to get is negative 2.47. That's the z-score here. And if you go back and look at your z-chart table, you look up the probability of z being less than negative 2.47, which is what we want, because we want the area underneath that body
temperature. When you look this thing up in the z-chart, you're going to get 0.0068. This is the probability. I'll just write it down. This is the probability that a sample of 36 people
have an average body temp of less than 98.3 degrees Fahrenheit. In other words, the probability that we sample 36 people, average their body temps, and get something less than 98.3 is very low. But think about it, the average body temp for humans is 98.6.
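As a cross-check on the table lookup, here is a short sketch of the whole first problem in Python's standard library. The function names `normal_cdf` and `prob_sample_mean_below` are made up for this illustration, and the error function stands in for the printed z-chart.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_below(x_bar, mu, sigma, n):
    """Return (z, probability) that a sample mean of n items falls below x_bar.

    Relies on the sampling distribution being (approximately) normal with
    mean mu and standard deviation sigma / sqrt(n).
    """
    standard_error = sigma / math.sqrt(n)
    z = (x_bar - mu) / standard_error
    return z, normal_cdf(z)

# Body temperature problem: mean 98.6, std dev 0.73, samples of 36, threshold 98.3.
z_temp, p_temp = prob_sample_mean_below(98.3, 98.6, 0.73, 36)
print(f"z = {z_temp:.2f}, P(sample mean < 98.3) = {p_temp:.4f}")
```

The computed probability should come out very close to the 0.0068 read from the table; any small difference comes from the table rounding z to two decimal places.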
I could totally see how I could go ask one person on the street what their body temp is and it'll be less than 98.3. I could fathom that. If I asked two people, it could happen, but it's not terribly likely. But if I ask 36 people, and I average those things together, then the odds of all those people
having such a low temperature to skew it down this far has got to be very, very low. Right? So this is basically what it's talking about. So this is problem number one. We're going to do another one in a second. But it's essential that you understand the concept of what's going on here. We know that when we sample a large population, if the sample size is greater than 30, then the sampling distribution
of sample means is going to be normal. So what we end up doing is we use the normal tables exactly as we have before, except for what goes in place of this variable. Before, if it was just one person and we were trying to see if their body temp was less than 98.3, we would just put 98 point whatever the temperature is for that one person minus the mean
over the standard deviation. And we would look this value up in the chart. What we're doing differently now is that we need to put the sampling distribution mean and the sampling distribution standard deviation in there, because that's the bell curve that we have, or the normal distribution that we have. The reason this is labeled x bar is because of what we're really asking when we formulate
a question that says, find the probability of a sample of 36 adults having an average body temp of less than 98.3. Another way to phrase that, which probably would be a little more direct (but you're not always going to have direct problems on your test), would be the following. You could phrase this question like this.
You sample 36 adults at a time and construct a sampling distribution of sample means. Find the probability that the sample mean of one set of 36 people is less than 98.3. That would be a little bit more direct because then you would say, well, I'm going to construct the sampling distribution. I'm going to draw the bell curve. It's the shaded area to the left. And so that's what I'm going to do.
But the problem doesn't say that. It just says, find the probability of a sample of 36 adults having an average body temp of less than 98.3. So you can assume, even though you don't actually do it in the problem, you can assume that a sampling distribution was constructed. And you can assume it's normal because the sample size is large enough. And then if you just plug in the appropriate information
into the formula for the z-score, using the sampling distribution information here and using the sample mean that I'm trying to figure out here, then I can get a z-score associated with this sampling distribution, which is normal. And then I can look that up and get the probability that the sample mean of these 36 people that I'm looking at
is actually less than that. Because that's what it is. I expect it to take a couple of problems for you to really wrap your brain around that. So let's do another. Hopefully a couple of examples will warm you up to this whole idea.
So we'll say it again. We know about the IQ scores for the US. We know the mean and standard deviation. That's the population information. And then we're saying our sample size is 50. What is the probability that the mean IQ of these 50 people will be less than 95? We know that since we're sampling 50 people at a time, if we were to construct a sampling distribution of sample
means, it would look normal. We know that. And since we know that it looks normal, then we just simply calculate a new z-score, and then find out what the probability is that a sample mean would be less than 95. That's what we're trying to do. The probability that a sample mean will be less than 95 from this sampling distribution. So what we can then write down is
that the mean of the sampling distribution, we know that it's 100 because it's equal to the population mean, always. And then we can calculate the standard deviation of the sample means. And that's going to basically be the standard deviation of the population over the square root of n. Now the population has a standard deviation of 15.
My sample size is 50. So it's 15 over the square root of 50. And when you take the square root of 50 and do this division here, you get 2.121. So here we have it for the sampling distribution. Notice the problem didn't say we're constructing one. But you can assume that one's been constructed. Even if you don't do it, you can still assume that it's been done. And you can say, well, the mean of that sampling distribution
would be 100. The standard deviation of that sampling distribution would be this. So again, to draw a picture (you don't have to do this, but to make it maybe a little bit more clear), the sampling distribution of sample means is going to look normal. It's going to look symmetric like this. The mean of this guy we've already written down here,
is 100. The standard deviation of this guy is 2.121. And again this entire curve is constructed of samples of 50 people. We take 50 people, get their average. 50 more people, get their average. 50 more people, get their average. I collect all that data, and I find out that the peak of this curve is right smack dab at 100, where it should be.
And it falls off on both sides, as a normal distribution does. It looks normal because of my sample size, which I haven't written on the board. But the sample size is 50. I have a large sample size. So I'm assured that my sampling distribution looks normal. But what am I actually asked to do? I'm saying, what is the probability that the mean IQ of 50 people, which means the mean IQ of one
sample, which means a sample mean (it's another way of saying that), is less than 95. So if this is 100, 95 is going to be somewhere over here. So I'll write 95 in blue. And I'm saying what is the probability that a sample mean would fall this direction? What is the probability that the mean of a sample of 50 people would be less than this? So this is a distribution of sample means.
So if I look on the chart, where the sample mean is 95, and I want to find the probability to the left, I'm finding the area under the curve. So then, in order to calculate it, I need to calculate a z-score, which is going to be x bar minus the mean of the sample means divided by the standard deviation of the sample means.
Now the sample mean I'm interested in here is 95. The mean of the entire distribution is 100. The standard deviation of this whole distribution is 2.121. So when I do the subtraction and do that division, then the z-score I'm going to get is negative 2.36.
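Laid out as a worked equation, the substitution for the IQ problem is the following. The subscripted symbols are notation assumed here for the sampling distribution's mean and standard deviation.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{50}} \approx 2.121,
\qquad
z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}
  = \frac{95 - 100}{2.121}
  \approx -2.36
```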
So if I looked this up on a chart, at negative 2.36 the probability that z would be less than negative 2.36 is going to correspond to exactly the area that I care about. And what I'm going to get is 0.0091. Again that's a pretty small probability. But look at what I'm asked to do.
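Here is one way to double-check that lookup, sketched with `statistics.NormalDist` from Python's standard library (available in Python 3.8 and later). It computes the area two ways: directly from the sampling distribution, and from the z-score rounded to two decimals the way a printed table forces you to.

```python
import math
from statistics import NormalDist

MU, SIGMA, N = 100.0, 15.0, 50   # population mean, population std dev, sample size
X_BAR = 95.0                     # sample mean we are asking about

standard_error = SIGMA / math.sqrt(N)   # std dev of the sampling distribution
z_iq = (X_BAR - MU) / standard_error

# Area to the left of 95 under the sampling distribution itself.
p_direct = NormalDist(mu=MU, sigma=standard_error).cdf(X_BAR)

# Same area via the standard normal curve, with z rounded as in a z-table.
p_table_style = NormalDist().cdf(round(z_iq, 2))

print(f"standard error = {standard_error:.3f}, z = {z_iq:.2f}")
print(f"direct: {p_direct:.4f}, table-style: {p_table_style:.4f}")
```

The two results should differ only slightly, and the table-style value is the one that corresponds to the 0.0091 quoted from the chart.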
I'm saying that the mean of this distribution is 100 for the IQ. And I'm asking, what are the odds, what's the probability that if I look at 50 people and average them that I would get actually lower than 95, or less than 95? It's a pretty low number because when you look at it, the mean is 100.
The standard deviation of this guy is 2.1, not a very wide standard deviation. I'm asked to figure out what is the probability of the average of 50 people being less than 95, which is pretty far away from the mean. Obviously it's going to be a low number. So after these two problems, I hope you understand a little bit more about why the central limit
theorem is so useful. Because notice that in this particular problem, actually I didn't tell you what the shape of the IQ distribution is for the population. I didn't tell you anything. I just told you that the mean value of the IQ in the country is 100. The standard deviation is 15.
That's what the problem said. I didn't tell you it was normal. I didn't tell you it was skewed. I didn't tell you anything. I just told you that there's a mean and there's a standard deviation. And from the fact that the sample size is large enough, we know from the central limit theorem that the sampling distribution will look normal. We know what the mean of the sampling distribution is. And we know what the standard deviation of the sampling distribution is.
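To see that the population's shape really doesn't have to be normal, here is a small simulation sketch in standard-library Python. The skewed population used here, an exponential distribution with mean 1 and standard deviation 1, is purely an assumption chosen for illustration; it has nothing to do with the IQ data. The seed and the number of repetitions are arbitrary too.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

N = 50                 # sample size, matching the lesson's second problem
NUM_SAMPLES = 10_000   # number of repeated samples (assumed for the demo)

# Hypothetical skewed population: exponential with mean 1 and std dev 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theory_sd = 1.0 / N ** 0.5   # sigma / sqrt(n)

# For a normal curve, roughly 68% of values sit within one standard deviation.
within_one = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means: {mean_of_means:.3f} (population mean: 1.0)")
print(f"std dev of sample means: {sd_of_means:.4f} (theory: {theory_sd:.4f})")
print(f"fraction within one std dev: {within_one:.3f}")
```

Even though the individual values are heavily skewed, the fraction of sample means within one standard deviation should sit near the 68% you'd expect from a bell curve.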
And so we can put those values into the z-score, into their appropriate spots. This equation is exactly as it was before. It's just that we replace these guys with the information from the sampling distribution. And the value we put here is the value that comes from the problem. Trying to find the probability less than 95, which is the same thing as saying what's the probability that a sample mean is less than 95?
That an average of these 50 people is going to be less than 95? So that value goes in for x bar. That's why there's an x bar here. In the original formula for the z-score, there's just an x there. Now I'm going to be the first to admit that applying the central limit theorem, when you first tackle it, is complicated, not because the math is complicated. I mean look, I've just got a couple of subtractions and divisions.
None of that's complicated. The actual math is not complicated. What's tricky is understanding really what's going on here. Understanding what the central limit theorem's telling you, understanding the standard deviation and the mean and what a sampling distribution is, and all that stuff. That's why I spent so much time covering that here. But we've done two examples in the section.
I hope that they've given you a little bit of insight into understanding how these kinds of problems work. We're not done yet. We're going to work a lot more problems with the central limit theorem. So stay tuned for that. But I want to make sure you understand this. Work these yourself. I highly encourage you to work them yourself. Make sure you understand the concepts and follow me on to the next section, where we will continue to gain more practice.
Jason Gibson explains how to apply the central limit theorem to population means. He also provides example problems to demonstrate how to work the statistical formulas involved in calculations of the central limit theorem.