
# $$z$$ Tests and Confidence Intervals for a Difference Between Two Population Means

Posted by Beetle B. on Tue 18 July 2017


Assumptions:

1. $$X_{1},\dots,X_{m}$$ is a random sample from a population with mean $$\mu_{1}$$ and variance $$\sigma_{1}^{2}$$.
2. $$Y_{1},\dots,Y_{n}$$ is a random sample from a population with mean $$\mu_{2}$$ and variance $$\sigma_{2}^{2}$$.
3. The $$X$$ and $$Y$$ samples are independent.

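The expected value of $$\bar{X}-\bar{Y}$$ is $$\mu_{1}-\mu_{2}$$, so $$\bar{X}-\bar{Y}$$ is an unbiased estimator of $$\mu_{1}-\mu_{2}$$. Because the two samples are independent, the variances of the sample means add: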

\begin{equation*} \sigma_{\bar{X}-\bar{Y}}^{2}=\frac{\sigma_{1}^{2}}{m}+\frac{\sigma_{2}^{2}}{n} \end{equation*}

## Test Procedures For Normal Populations With Known Variances

Assume both distributions are normal, with $$\sigma_{1}^{2}$$ and $$\sigma_{2}^{2}$$ known.

Then $$\bar{X}-\bar{Y}$$ is normal, with $$E(\bar{X}-\bar{Y})=\mu_{1}-\mu_{2}$$. Standardizing:

\begin{equation*} Z=\frac{\bar{X}-\bar{Y}-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{m}+\frac{\sigma_{2}^{2}}{n}}} \end{equation*}

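The null hypothesis is $$H_{0}:\mu_{1}-\mu_{2}=\Delta_{0}$$ (usually $$\Delta_{0}=0$$). The test statistic value $$z$$ is obtained by substituting $$\Delta_{0}$$ for $$\mu_{1}-\mu_{2}$$ in $$Z$$. For a level $$\alpha$$ test, each alternative hypothesis implies the following rejection region: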

• $$H_{a}:\mu_{1}-\mu_{2}>\Delta_{0}\implies z\ge z_{\alpha}$$
• $$H_{a}:\mu_{1}-\mu_{2}<\Delta_{0}\implies z\le-z_{\alpha}$$
• $$H_{a}:\mu_{1}-\mu_{2}\ne\Delta_{0}\implies z\ge z_{\alpha/2}$$ or $$z\le-z_{\alpha/2}$$
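The three rejection regions above can be sketched in a few lines of Python. This is only an illustration: the function name `two_sample_z_test`, its parameters, and the summary statistics in the usage lines are made up for this example, and the critical values come from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar, ybar, var1, var2, m, n,
                      delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Test H0: mu1 - mu2 = delta0 when both population variances are known.

    Returns the computed z value and whether H0 is rejected at level alpha.
    """
    se = sqrt(var1 / m + var2 / n)      # standard deviation of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    std_normal = NormalDist()           # mean 0, standard deviation 1
    if alternative == "greater":        # Ha: mu1 - mu2 > delta0
        reject = z >= std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # Ha: mu1 - mu2 < delta0
        reject = z <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":    # Ha: mu1 - mu2 != delta0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject


# Made-up summary statistics, for illustration only.
z, reject = two_sample_z_test(xbar=29.8, ybar=34.7, var1=16.0, var2=25.0, m=20, n=25)
print(round(z, 2), reject)
```

Here `inv_cdf(1 - alpha)` plays the role of $$z_{\alpha}$$, the value with upper-tail area $$\alpha$$ under the standard normal curve.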

### Using a Comparison to Identify Causality

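Beware of studies that imply causality without randomizing. When subjects are not randomly assigned to the two groups, a significant difference in means may be due to confounding factors rather than to the treatment itself.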

### $$\beta$$ and the Choice of Sample Size

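When the true difference is $$\mu_{1}-\mu_{2}=\Delta'$$, the probability of a type II error, $$\beta(\Delta')$$, is:

• $$H_{a}:\mu_{1}-\mu_{2}>\Delta_{0}\implies \beta(\Delta')=\Phi\left(z_{\alpha}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)$$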
• $$H_{a}:\mu_{1}-\mu_{2}<\Delta_{0}\implies \beta(\Delta')=1-\Phi\left(-z_{\alpha}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)$$
• $$H_{a}:\mu_{1}-\mu_{2}\ne\Delta_{0}\implies \beta(\Delta')=\Phi\left(z_{\alpha/2}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)-\Phi\left(-z_{\alpha/2}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)$$
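As a rough sketch, the three $$\beta(\Delta')$$ formulas translate directly into code. The function name `type_ii_error` and its parameters are hypothetical, chosen for this example; `se` stands for $$\sigma_{\bar{X}-\bar{Y}}$$ and `NormalDist().cdf` for $$\Phi$$.

```python
from statistics import NormalDist


def type_ii_error(delta_prime, delta0, se, alpha=0.05, alternative="two-sided"):
    """Probability of a type II error when the true difference is delta_prime.

    se is the standard deviation of Xbar - Ybar, i.e. sqrt(var1/m + var2/n).
    """
    std_normal = NormalDist()
    shift = (delta_prime - delta0) / se
    if alternative == "greater":        # Ha: mu1 - mu2 > delta0
        z_a = std_normal.inv_cdf(1 - alpha)
        return std_normal.cdf(z_a - shift)
    if alternative == "less":           # Ha: mu1 - mu2 < delta0
        z_a = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(-z_a - shift)
    if alternative == "two-sided":      # Ha: mu1 - mu2 != delta0
        z_a2 = std_normal.inv_cdf(1 - alpha / 2)
        return std_normal.cdf(z_a2 - shift) - std_normal.cdf(-z_a2 - shift)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Illustrative values only: true difference 1, standard deviation 0.5.
print(type_ii_error(delta_prime=1.0, delta0=0.0, se=0.5, alternative="greater"))
```

A quick sanity check on the formulas: when $$\Delta'=\Delta_{0}$$, every case returns $$1-\alpha$$, and $$\beta$$ shrinks as $$\Delta'$$ moves further into the alternative.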

## Large Sample Tests

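If both $$m$$ and $$n$$ are large, then $$\bar{X}-\bar{Y}$$ is approximately normal (from the Central Limit Theorem), whatever the underlying population distributions. The unknown variances can then be replaced by the sample variances $$S_{1}^{2}$$ and $$S_{2}^{2}$$: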

\begin{equation*} Z=\frac{\bar{X}-\bar{Y}-(\mu_{1}-\mu_{2})}{\sqrt{\frac{S_{1}^{2}}{m}+\frac{S_{2}^{2}}{n}}} \end{equation*}

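This statistic is approximately standard normal, so the test procedures for known variances carry over, with $$\Delta_{0}$$ substituted for $$\mu_{1}-\mu_{2}$$ to compute $$z$$.

As a rule of thumb, this is appropriate for $$m\ge40$$ and $$n\ge40$$.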
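A minimal sketch of computing the large-sample statistic from raw data, using only the standard library. The function name `large_sample_z` is invented for this example, and the data in the usage lines are artificial; the function does not enforce the $$m\ge40,n\ge40$$ guideline.

```python
from math import sqrt
from statistics import mean, variance


def large_sample_z(x, y, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0, using sample variances.

    Intended for large samples (roughly 40 or more observations each).
    """
    m, n = len(x), len(y)
    s1_sq = variance(x)                 # sample variance, divisor m - 1
    s2_sq = variance(y)                 # sample variance, divisor n - 1
    se = sqrt(s1_sq / m + s2_sq / n)
    return (mean(x) - mean(y) - delta0) / se


# Artificial data, for illustration only.
y = [float(v) for v in range(40)]
x = [v + 5.0 for v in y]
print(large_sample_z(x, y))
```

The resulting value is compared against the same critical values $$z_{\alpha}$$ or $$z_{\alpha/2}$$ as in the known-variance case.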

### Confidence Intervals for $$\mu_{1}-\mu_{2}$$

If both $$m\ge40$$ and $$n\ge40$$, a confidence interval for $$\mu_{1}-\mu_{2}$$ with a confidence level of approximately $$100(1-\alpha)\%$$ is:

\begin{equation*} \bar{x}-\bar{y}\pm z_{\alpha/2}\sqrt{\frac{s_{1}^{2}}{m}+\frac{s_{2}^{2}}{n}} \end{equation*}

Note: Normality of the distribution is not required for this.
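The interval can be computed from summary statistics as sketched below. The function name `large_sample_ci`, its parameters, and the numbers in the usage lines are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, ybar, s1, s2, m, n, confidence=0.95):
    """Approximate confidence interval for mu1 - mu2 (large samples).

    s1 and s2 are the sample standard deviations.
    """
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    half_width = z_crit * sqrt(s1 ** 2 / m + s2 ** 2 / n)
    centre = xbar - ybar
    return centre - half_width, centre + half_width


# Made-up summary statistics, for illustration only.
low, high = large_sample_ci(xbar=10.0, ybar=10.0, s1=1.0, s2=1.0, m=50, n=50)
print(round(low, 3), round(high, 3))
```

If the interval excludes $$\Delta_{0}$$, the two-tailed test at the same $$\alpha$$ rejects $$H_{0}$$.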

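The required sample size per group, assuming equal sample sizes $$m=n$$ and known (or previously estimated) variances:

\begin{equation*} n=\frac{4z_{\alpha/2}^{2}(\sigma_{1}^{2}+\sigma_{2}^{2})}{w^{2}} \end{equation*}

where $$w$$ is the desired width of the interval. This follows from setting $$w=2z_{\alpha/2}\sqrt{(\sigma_{1}^{2}+\sigma_{2}^{2})/n}$$ and solving for $$n$$.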
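A small sketch of the sample size formula, rounding up to the next whole observation. The function name `required_sample_size` and the example inputs are hypothetical, and the calculation assumes equal sample sizes in both groups.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma1, sigma2, width, confidence=0.95):
    """Per-group sample size (m = n) for an interval of the given total width."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    n = 4 * z_crit ** 2 * (sigma1 ** 2 + sigma2 ** 2) / width ** 2
    return ceil(n)                                  # round up to a whole observation


# Illustrative inputs only: both standard deviations 1, desired width 1.
print(required_sample_size(sigma1=1.0, sigma2=1.0, width=1.0))
```

Halving the desired width roughly quadruples the required sample size, since $$n$$ is proportional to $$1/w^{2}$$.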