\(z\) Tests and Confidence Intervals for a Difference Between Two Population Means

Posted by Beetle B. on Tue 18 July 2017

\(\newcommand{\Cov}{\mathrm{Cov}}\) \(\newcommand{\Corr}{\mathrm{Corr}}\) \(\newcommand{\Sample}{X_{1},\dots,X_{n}}\)

Assumptions:

  1. \(X_{1},\dots,X_{m}\) is a random sample from a population with mean \(\mu_{1}\) and variance \(\sigma_{1}^{2}\)
  2. \(Y_{1},\dots,Y_{n}\) is a random sample from a population with mean \(\mu_{2}\) and variance \(\sigma_{2}^{2}\)
  3. The \(X\) and \(Y\) samples are independent.

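The expected value of \(\bar{X}-\bar{Y}\) is \(\mu_{1}-\mu_{2}\), so it is an unbiased estimator of the difference. Because the two samples are independent, the variance of the difference is the sum of the variances of the two sample means: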

\begin{equation*} \sigma_{\bar{X}-\bar{Y}}^{2}=\frac{\sigma_{1}^{2}}{m}+\frac{\sigma_{2}^{2}}{n} \end{equation*}

Test Procedures For Normal Populations With Known Variances

Assume both distributions are normal, with \(\sigma_{1}^{2}\) and \(\sigma_{2}^{2}\) known.

Then \(\bar{X}-\bar{Y}\) is normal, with \(E(\bar{X}-\bar{Y})=\mu_{1}-\mu_{2}\). Standardizing:

\begin{equation*} Z=\frac{\bar{X}-\bar{Y}-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{m}+\frac{\sigma_{2}^{2}}{n}}} \end{equation*}

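The null hypothesis is \(H_{0}:\mu_{1}-\mu_{2}=\Delta_{0}\) (usually \(\Delta_{0}=0\)). The test statistic \(z\) is the value of \(Z\) with \(\mu_{1}-\mu_{2}\) replaced by \(\Delta_{0}\). For a level \(\alpha\) test, each alternative hypothesis implies the following rejection region: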

  • \(H_{a}:\mu_{1}-\mu_{2}>\Delta_{0}\implies z\ge z_{\alpha}\)
  • \(H_{a}:\mu_{1}-\mu_{2}<\Delta_{0}\implies z\le-z_{\alpha}\)
  • \(H_{a}:\mu_{1}-\mu_{2}\ne\Delta_{0}\implies z\ge z_{\alpha/2}\) or \(z\le-z_{\alpha/2}\)
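The rejection regions above can be sketched in a few lines of Python. This is only an illustration using the standard library's `statistics.NormalDist` (Python 3.8+). The function name `two_sample_z_test`, its parameters, and the numbers in the usage lines are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar, ybar, var1, var2, m, n,
                      delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for H0: mu1 - mu2 = delta0 with known variances.

    var1 and var2 are the known population variances; m and n are the
    sample sizes. `alternative` is "greater", "less" or "two-sided".
    """
    # Standard deviation of Xbar - Ybar.
    se = sqrt(var1 / m + var2 / n)
    # Test statistic: Z with mu1 - mu2 replaced by delta0.
    z = (xbar - ybar - delta0) / se

    std_normal = NormalDist()
    if alternative == "greater":
        reject = z >= std_normal.inv_cdf(1 - alpha)        # z >= z_alpha
    elif alternative == "less":
        reject = z <= -std_normal.inv_cdf(1 - alpha)       # z <= -z_alpha
    elif alternative == "two-sided":
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject


# Hypothetical summary data: sample means 29.8 and 34.7, known variances
# 16 and 25, sample sizes 20 and 25.
z, reject = two_sample_z_test(29.8, 34.7, 16.0, 25.0, 20, 25)
print(z, reject)  # z is about -3.65, so H0 is rejected at alpha = 0.05
```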

Using a Comparison to Identify Causality

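Beware of studies that do not randomize yet imply causality. Without random assignment, an observed difference in means may be due to confounding factors rather than the treatment itself.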

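\(\beta\) and the Choice of Sample Size

For the known-variance test, \(\beta(\Delta')\) is the probability of a type II error when the true difference is \(\mu_{1}-\mu_{2}=\Delta'\):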

  • \(H_{a}:\mu_{1}-\mu_{2}>\Delta_{0}\implies \beta(\Delta')=\Phi\left(z_{\alpha}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)\)
  • \(H_{a}:\mu_{1}-\mu_{2}<\Delta_{0}\implies \beta(\Delta')=1-\Phi\left(-z_{\alpha}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)\)
  • \(H_{a}:\mu_{1}-\mu_{2}\ne\Delta_{0}\implies \beta(\Delta')=\Phi\left(z_{\alpha/2}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)-\Phi\left(-z_{\alpha/2}-\frac{\Delta'-\Delta_{0}}{\sigma_{\bar{X}-\bar{Y}}}\right)\)
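As a rough sketch, the three \(\beta\) formulas above translate directly into code. The function name `beta_two_sample_z` and its parameters are hypothetical. The standard normal cdf \(\Phi\) is taken from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def beta_two_sample_z(delta_prime, var1, var2, m, n,
                      delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Type II error probability when the true difference is delta_prime."""
    phi = NormalDist().cdf
    inv = NormalDist().inv_cdf
    # sigma of Xbar - Ybar, with known population variances.
    se = sqrt(var1 / m + var2 / n)
    shift = (delta_prime - delta0) / se

    if alternative == "greater":
        return phi(inv(1 - alpha) - shift)
    if alternative == "less":
        return 1 - phi(-inv(1 - alpha) - shift)
    if alternative == "two-sided":
        z_half = inv(1 - alpha / 2)
        return phi(z_half - shift) - phi(-z_half - shift)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical numbers: variances 16 and 25, sample sizes 20 and 25.
# beta shrinks as the true difference moves away from delta0 = 0.
for d in (0.0, 1.0, 2.0, 4.0):
    print(d, beta_two_sample_z(d, 16.0, 25.0, 20, 25))
```

A quick sanity check on the formulas: when \(\Delta'=\Delta_{0}\), each expression reduces to \(1-\alpha\), the probability of not rejecting a true null hypothesis.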

Large Sample Tests

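If both \(m\) and \(n\) are large, then \(\bar{X}-\bar{Y}\) is approximately normal (from the Central Limit Theorem), whatever the population distributions, and the sample variances \(S_{1}^{2}\) and \(S_{2}^{2}\) can stand in for the unknown population variances: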

\begin{equation*} Z=\frac{\bar{X}-\bar{Y}-(\mu_{1}-\mu_{2})}{\sqrt{\frac{S_{1}^{2}}{m}+\frac{S_{2}^{2}}{n}}} \end{equation*}

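This statistic is approximately standard normal, so the rejection regions from the known-variance case carry over unchanged.

As a rule of thumb, this is appropriate for \(m\ge40\) and \(n\ge40\).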
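A minimal sketch of the large-sample statistic, computed from raw data rather than summary values. The function name `large_sample_z` is hypothetical, and the data in the usage lines is artificial, chosen only so that both samples have 40 observations.

```python
from math import sqrt
from statistics import mean, variance


def large_sample_z(xs, ys, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 using sample variances.

    Intended for large samples (roughly m >= 40 and n >= 40), where the
    statistic is approximately standard normal under H0.
    """
    m, n = len(xs), len(ys)
    # statistics.variance is the sample variance (divides by size - 1).
    se = sqrt(variance(xs) / m + variance(ys) / n)
    return (mean(xs) - mean(ys) - delta0) / se


# Artificial data: 40 observations per sample.
xs = [0.0, 2.0] * 20
ys = [1.0, 3.0] * 20
print(large_sample_z(xs, ys))
```

The resulting value is compared against \(z_{\alpha}\) or \(z_{\alpha/2}\) exactly as in the known-variance case.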

Confidence Intervals for \(\mu_{1}-\mu_{2}\)

If both \(m\ge40\) and \(n\ge40\), a confidence interval for \(\mu_{1}-\mu_{2}\) with a confidence level of approximately \(100(1-\alpha)\%\) is:

\begin{equation*} \bar{x}-\bar{y}\pm z_{\alpha/2}\sqrt{\frac{s_{1}^{2}}{m}+\frac{s_{2}^{2}}{n}} \end{equation*}

Note: The populations do not need to be normally distributed for this interval to be valid.
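The interval can be computed from summary statistics as in the sketch below. The function name `diff_means_ci`, its parameters, and the numbers in the usage lines are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(xbar, ybar, s1_sq, s2_sq, m, n, conf=0.95):
    """Large-sample confidence interval for mu1 - mu2.

    s1_sq and s2_sq are the sample variances; conf is the confidence
    level as a fraction, so alpha = 1 - conf.
    """
    alpha = 1 - conf
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z_half * sqrt(s1_sq / m + s2_sq / n)
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Hypothetical summary data: means 10 and 8, sample variances 4 and 9,
# 100 observations in each sample.
lo, hi = diff_means_ci(10.0, 8.0, 4.0, 9.0, 100, 100)
print(lo, hi)  # roughly (1.29, 2.71)
```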

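The sample size required in each group, assuming equal sample sizes \(m=n\) and setting \(w=2z_{\alpha/2}\sqrt{(\sigma_{1}^{2}+\sigma_{2}^{2})/n}\):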

\begin{equation*} n=\frac{4z_{\alpha/2}^{2}(\sigma_{1}^{2}+\sigma_{2}^{2})}{w^{2}} \end{equation*}

where \(w\) is the desired width.
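A small sketch of the sample size formula, assuming equal sample sizes and that values for \(\sigma_{1}^{2}\) and \(\sigma_{2}^{2}\) are available in advance. The function name `sample_size_for_width` is hypothetical. The result is rounded up, since a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_width(var1, var2, width, conf=0.95):
    """Per-group sample size n (with m = n) for a CI of the given width."""
    z_half = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    n = 4 * z_half ** 2 * (var1 + var2) / width ** 2
    # Round up so the interval is no wider than requested.
    return ceil(n)


# Hypothetical inputs: variances 4 and 9, desired width 1, 95% confidence.
print(sample_size_for_width(4.0, 9.0, 1.0))  # 200
```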