The distribution of t according to the Central Limit Theorem
Last updated on 2024-03-12
Estimated time: 7 minutes
Overview
Questions
- What is the central limit theorem?
- What does it predict for the distribution of \(t\)?
Objectives
- Introduce the central limit theorem.
- Explain that it gives a theoretical distribution for \(t\).
Calculate a p-value using the central limit theorem
Let me sum up what we know now:
- We have a sample of mouse weights and a reference weight \(\mu_0\) to compare them to. We want to know whether the average weight differs from \(\mu_0\).
- We calculated \(t\) for this specific sample. \(t\) is a scaled difference between the sample mean and \(\mu_0\). If the sample mean equals \(\mu_0\), \(t\) is zero; if it is close to \(\mu_0\), \(t\) is close to zero.
- Moreover, the central limit theorem tells us that, if high-fat-diet mice in general don’t differ from \(\mu_0\) in their weight, we can expect \(t\) to come approximately from a standard normal distribution. This means that if we repeat the experiment again and again (20 mice are fed the special diet and weighed, and a \(t\) is calculated), we should measure a different \(t\) each time, and plotting a histogram or density of these \(t\) values will give a Gaussian bell shape.
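The repeated-sampling idea in the last point can be tried out with a small simulation. The following is only a sketch under stated assumptions: the individual weights are drawn from a normal distribution centred on \(\mu_0\), and the population standard deviation `assumed_sd`, the helper `simulate_t` and the number of repetitions are hypothetical choices made for illustration, not values from the experiment.

```r
set.seed(1)            # fixed seed so the sketch is reproducible
mu0 <- 23.89338        # reference weight used in this lesson
n <- 20                # mice per simulated experiment
assumed_sd <- 4.7      # assumption: hypothetical population standard deviation

# One simulated experiment in which the true mean weight equals mu0:
# draw n weights, then compute t exactly as for the real sample.
# Assumption: individual weights are normally distributed.
simulate_t <- function() {
  x <- rnorm(n, mean = mu0, sd = assumed_sd)
  (mean(x) - mu0) * sqrt(n) / sd(x)
}

# Repeat the experiment many times and collect the t values
tstats <- replicate(10000, simulate_t())

# Histogram of the simulated t values with the standard normal density on top
hist(tstats, breaks = 50, freq = FALSE,
     main = "Simulated t when the mean equals mu0", xlab = "t")
curve(dnorm(x), add = TRUE)
```

The histogram should be roughly bell-shaped and centred on zero. With only 20 mice per experiment the agreement with the standard normal curve is approximate rather than exact.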
This is enough knowledge to calculate a p-value. You can do this :)
The distribution of t
We start with the mouse weights and the resulting \(t\) from the last section.
Under a standard normal distribution (a normal distribution with mean 0 and standard deviation 1), what is the probability of observing a \(t\) at least as high as the one seen here?
Remember that in this scenario the question is not whether the weight is “higher than”, but “different from” \(\mu_0\). How do you have to adapt the above calculation to make this a two-sided test?
R
weights <- c(31.41, 28.29, 22.82, 26.07, 31.97, 22.60, 31.47, 29.18, 22.98, 23.26, 23.48, 20.88, 28.44, 30.34, 23.14, 22.80, 24.47, 39.73, 25.71, 22.74)
mu0 <- 23.89338
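# t: difference between sample mean and mu0, scaled by the standard error
tstat <- (mean(weights) - mu0) * sqrt(length(weights)) / sd(weights)
# upper-tail probability under the standard normal distribution
pnorm(tstat, mean = 0, sd = 1, lower.tail = FALSE)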
OUTPUT
[1] 0.005122719
- Just multiply by 2: this gives the probability of observing a value larger than the observed \(t\) or smaller than \(-t\). Multiplying works because the normal distribution is symmetric.
R
pnorm(tstat, mean = 0, sd = 1, lower.tail = FALSE) * 2
OUTPUT
[1] 0.01024544
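Doubling the upper tail as above works because the observed \(t\) is positive. For a negative \(t\), the same idea can be applied to its absolute value. A minimal sketch, where `two_sided_p` is a hypothetical helper introduced here for illustration and not a function from the lesson or from base R:

```r
# Hypothetical helper: two-sided p-value against a standard normal
# reference distribution, valid for positive and negative t
two_sided_p <- function(t) {
  2 * pnorm(abs(t), mean = 0, sd = 1, lower.tail = FALSE)
}

two_sided_p(1.96)    # close to 0.05
two_sided_p(-1.96)   # same value, because the normal distribution is symmetric
```

Taking `abs(t)` first means the function always doubles the smaller tail, so it gives the same answer regardless of the sign of \(t\).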