Similar to the previous lecture, where we discussed the two-sample test for proportions, we are going to use the same notation for the population parameter representing the difference of the population means:

$$ \delta = \mu_1 -\mu_2 $$

In the same way, our null and alternative hypotheses for a two-sided test are:

$$ \boldsymbol{H_0:} \ \;\delta = \mu_1 - \mu_2 = 0 \\ \boldsymbol{H_1:} \ \; \delta = \mu_1 - \mu_2 \neq 0 $$

Now the situation is a bit more complicated in terms of comparing two population means. We need to analyse different scenarios.

Paired Samples

In this scenario, the values from the two samples come in pairs, and it is only meaningful to compare the values within each pair. For example, if we want to see whether exercise raises one’s body temperature, we could measure the body temperature of the same individuals before and after exercising; it is only meaningful to compare the body temperatures of the same person. Or, if we want to see whether there is a difference in income between the two members of married couples, it is only meaningful to compare incomes within a couple, not between couples.

Under this scenario, we only care about the differences, so what we actually have here is just one-sample data: the differences $\boldsymbol{D}$. We are asking: is the mean of the differences ($\bar{\boldsymbol{D}}$) $0$ or not?

If the sample size is large enough, the central limit theorem tells us that:

$$ \bar{\boldsymbol{D}} \sim \mathcal{N} \left(\delta, \dfrac{\sigma_{\delta}^2}{n} \right) $$

Therefore, we have:

$$ \dfrac{\bar{\boldsymbol{D}} -\delta}{\sigma_{\delta}/\sqrt{n}} \sim \boldsymbol{\mathcal{N}} (0,1) $$

When we replace $\sigma_{\delta}$ with the sample standard deviation $s_d$, the standard normal becomes a $\boldsymbol{t}$-distribution. Under the null hypothesis, our test statistic would be:

$$ t = \dfrac{\bar{d}}{s_d/\sqrt{n}} \sim \boldsymbol{\mathcal{T}} (n-1) $$

We can use that to calculate the p-value.
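To make the paired procedure concrete, here is a minimal sketch in Python using hypothetical before/after body-temperature data (the numbers are made up for illustration). It computes the test statistic $t = \bar{d}/(s_d/\sqrt{n})$ by hand and cross-checks it against `scipy.stats.ttest_rel`:

```python
import math
from scipy import stats

# Hypothetical body temperatures (°C) for 8 individuals, before and after exercise
before = [36.5, 36.7, 36.4, 36.9, 36.6, 36.8, 36.5, 36.7]
after  = [37.0, 37.1, 36.9, 37.3, 36.8, 37.2, 36.9, 37.0]

# The paired design reduces to one-sample data: the differences
d = [a - b for a, b in zip(after, before)]
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))

# Test statistic t = d_bar / (s_d / sqrt(n)), with n - 1 degrees of freedom
t_stat = d_bar / (s_d / math.sqrt(n))
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# Cross-check against scipy's paired t-test
t_scipy, p_scipy = stats.ttest_rel(after, before)
```

The manual calculation and `ttest_rel` agree, since the paired test is exactly a one-sample $t$-test on the differences.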

Independent Samples

When we have independent samples, we are in a situation similar to the one introduced in the case of comparing two population proportions. We can analyse the problem using the same technique.

Let’s say we have two samples: sample 1, with size $n_1$, comes from population 1 with mean $\mu_1$ and variance $\sigma_1^2$; sample 2, with size $n_2$, comes from population 2 with mean $\mu_2$ and variance $\sigma_2^2$. The sample mean and sample variance of sample 1 are $\bar{x}_1$ and $s_1^2$, respectively; those of sample 2 are $\bar{x}_2$ and $s_2^2$. Under some assumptions, we have

$$ \bar{\boldsymbol{X}}_1 \sim \boldsymbol{\mathcal{N}} \left( \mu_1, \dfrac{\sigma_1^2}{n_1} \right) \text{ and } \bar{\boldsymbol{X}}_2 \sim \boldsymbol{\mathcal{N}} \left( \mu_2, \dfrac{\sigma_2^2}{n_2} \right) $$

Then our estimator for the population parameter $\delta$ is $\boldsymbol{D} = \bar{\boldsymbol{X}}_1 - \bar{\boldsymbol{X}}_2$, and it can be shown (in the same way as the previous lecture) that:

$$ \boldsymbol{D} \sim \boldsymbol{\mathcal{N}} \left( \mu_1 - \mu_2, \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2} \right) $$

Then we have:

$$ \dfrac{(\bar{\boldsymbol{X}}_1 - \bar{\boldsymbol{X}}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim \boldsymbol{\mathcal{N}} ( 0, 1 ) $$

Now we do not know $\sigma_1^2$ and $\sigma_2^2$. Intuitively, we should replace them with $s_1^2$ and $s_2^2$, respectively, and the resulting statistic should follow a $\boldsymbol{t}$-distribution with a certain number of degrees of freedom. In fact, we can do better.
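The distributional claim for $\boldsymbol{D}$ can be checked with a quick simulation, using assumed parameter values (the means, variances, and sample sizes below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, sigma1, n1 = 5.0, 2.0, 40
mu2, sigma2, n2 = 3.0, 1.5, 60

# Simulate many pairs of independent samples and record the
# difference of sample means, D = X1_bar - X2_bar
reps = 100_000
d = (rng.normal(mu1, sigma1, size=(reps, n1)).mean(axis=1)
     - rng.normal(mu2, sigma2, size=(reps, n2)).mean(axis=1))

# Theory: D has mean mu1 - mu2 and variance sigma1^2/n1 + sigma2^2/n2
empirical_mean = d.mean()
empirical_var = d.var(ddof=1)
theoretical_var = sigma1**2 / n1 + sigma2**2 / n2
```

The empirical mean and variance of the simulated differences should land very close to $\mu_1-\mu_2$ and $\sigma_1^2/n_1 + \sigma_2^2/n_2$, in line with the result above.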

Equal Variance

If we assume $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we have:

$$ \dfrac{(\bar{\boldsymbol{X}}_1 - \bar{\boldsymbol{X}}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma^2 \cdot \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim \boldsymbol{\mathcal{N}} ( 0, 1 ) $$

We can prove that

$$ \dfrac{(\bar{\boldsymbol{X}}_1 - \bar{\boldsymbol{X}}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \cdot \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim \boldsymbol{\mathcal{T}} ( n_1 +n_2 - 2 ) $$

where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 -1)s_2^2}{n_1 + n_2 - 2}$ is called the pooled estimate for the common variance, which is basically the weighted average of the two sample variances with the degrees of freedom as the weights.

Under the null hypothesis, $\mu_1 - \mu_2 =0$, so our test statistic would be:

$$ t=\dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \cdot \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim \boldsymbol{\mathcal{T}} ( n_1 +n_2 - 2 ) $$

We could use that to calculate the p-value. In this case we are performing a Student’s $\boldsymbol{t}$-test.
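As a sketch of the pooled procedure, the following Python code computes $s_p^2$ and the Student’s $t$ statistic by hand on two hypothetical samples (the numbers are made up), then cross-checks against `scipy.stats.ttest_ind` with `equal_var=True`, which performs exactly this pooled test:

```python
import math
from scipy import stats

# Hypothetical independent samples from two groups
x1 = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8, 5.3]
x2 = [4.6, 4.7, 4.5, 4.9, 4.4, 4.8]

n1, n2 = len(x1), len(x2)
xbar1, xbar2 = sum(x1) / n1, sum(x2) / n2
s1_sq = sum((x - xbar1) ** 2 for x in x1) / (n1 - 1)
s2_sq = sum((x - xbar2) ** 2 for x in x2) / (n2 - 1)

# Pooled variance: weighted average of the sample variances,
# with the degrees of freedom as the weights
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Student's t statistic, with n1 + n2 - 2 degrees of freedom
t_stat = (xbar1 - xbar2) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
p_value = 2 * stats.t.sf(abs(t_stat), df=n1 + n2 - 2)

# Cross-check: equal_var=True gives the pooled Student's t-test
t_scipy, p_scipy = stats.ttest_ind(x1, x2, equal_var=True)
```

Note that $s_p^2$ always lies between the two sample variances, as a weighted average must.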

Now perhaps a more important question is: how can we know whether the population variances are equal? The short answer is: we don’t. However, in practice, many treatments simply shift the mean without changing the dispersion, so it is often reasonable to use Student’s $\boldsymbol{t}$-tests. There are tests to check whether the variances are equal, but I generally don’t think pre-testing the variances is a good idea. In addition, Student’s $\boldsymbol{t}$-tests are robust: even if the equal variance assumption is violated, they still perform well.

Unequal Variance

In this case, we have no choice but to use the sample variances in place of the population variances. Our test statistic is:

$$ t=\dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim \boldsymbol{\mathcal{T}} (\nu) $$

where $\nu \approx \cfrac{\left( \cfrac{s_1^2}{n_1} + \cfrac{s_2^2}{n_2}\right)^2}{\cfrac{(s_1^2/n_1)^2}{n_1-1} + \cfrac{(s_2^2/n_2)^2}{n_2-1}}$ is the Welch–Satterthwaite approximation for the degrees of freedom.

In this case, we are doing a Welch’s $\boldsymbol{t}$-test.
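Here is a minimal sketch of Welch’s test on two hypothetical samples (invented numbers), computing the statistic and the Welch–Satterthwaite degrees of freedom by hand and cross-checking against `scipy.stats.ttest_ind` with `equal_var=False`, which is scipy’s Welch’s $t$-test:

```python
import math
from scipy import stats

# Hypothetical independent samples with visibly different spreads
x1 = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 12.4]
x2 = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1]

n1, n2 = len(x1), len(x2)
xbar1, xbar2 = sum(x1) / n1, sum(x2) / n2
s1_sq = sum((x - xbar1) ** 2 for x in x1) / (n1 - 1)
s2_sq = sum((x - xbar2) ** 2 for x in x2) / (n2 - 1)

# Welch's t statistic: unpooled standard error
se_sq = s1_sq / n1 + s2_sq / n2
t_stat = (xbar1 - xbar2) / math.sqrt(se_sq)

# Welch–Satterthwaite approximation for the degrees of freedom
nu = se_sq**2 / ((s1_sq / n1) ** 2 / (n1 - 1) + (s2_sq / n2) ** 2 / (n2 - 1))
p_value = 2 * stats.t.sf(abs(t_stat), df=nu)

# Cross-check: equal_var=False gives Welch's t-test
t_scipy, p_scipy = stats.ttest_ind(x1, x2, equal_var=False)
```

The approximate degrees of freedom $\nu$ is generally not an integer, and it always falls between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$.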