Tests Concerning a Population Proportion

Posted by Beetle B. on Tue 18 July 2017

\(\newcommand{\Cov}{\mathrm{Cov}}\) \(\newcommand{\Corr}{\mathrm{Corr}}\) \(\newcommand{\Sample}{X_{1},\dots,X_{n}}\)

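Let \(p\) be the proportion of successes in a population of size \(N\), and let \(X\) be the number of successes in a random sample of size \(n\).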

If \(n\ll N\), then \(X\) is approximately binomial (that is, we can treat the sampling as if it were done with replacement). If \(n\) is also large, both \(X\) and the estimator \(\hat{p}=X/n\) are approximately normal.

Large Sample Tests

This test is appropriate when \(np_{0}\ge10\) and \(n(1-p_{0})\ge10\). The null hypothesis is:

\(H_{0}:p=p_{0}\)

The test statistic is:

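\begin{equation*} z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0}(1-p_{0})}{n}}} \end{equation*}

For a level \(\alpha\) test, the rejection region depends on the alternative hypothesis:

  • \(H_{a}:p>p_{0}\): reject \(H_{0}\) if \(z\ge z_{\alpha}\)
  • \(H_{a}:p<p_{0}\): reject \(H_{0}\) if \(z\le-z_{\alpha}\)
  • \(H_{a}:p\ne p_{0}\): reject \(H_{0}\) if \(z\ge z_{\alpha/2}\) or \(z\le-z_{\alpha/2}\)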
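The large-sample test above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `proportion_z_test` and the `alternative` labels are assumptions made for this example, not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05, alternative="two-sided"):
    """Large-sample z test of H0: p = p0. Returns (z, reject)."""
    # The normal approximation is only recommended under these conditions.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1-p0) >= 10")

    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    # z_alpha is the upper-alpha critical value of the standard normal.
    inv_cdf = NormalDist().inv_cdf
    if alternative == "greater":        # Ha: p > p0
        reject = z >= inv_cdf(1 - alpha)
    elif alternative == "less":         # Ha: p < p0
        reject = z <= -inv_cdf(1 - alpha)
    elif alternative == "two-sided":    # Ha: p != p0
        reject = abs(z) >= inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, reject
```

For example, `proportion_z_test(60, 100, 0.5, alternative="greater")` gives a \(z\) of about 2, which falls in the rejection region at \(\alpha=0.05\).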

\(\beta\) and Sample Size Determination

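When \(H_{0}\) is true, the test statistic \(Z\) above is approximately standard normal. If instead \(p=p'\ne p_{0}\), then \(Z\) is still approximately normal, because it is a linear function of \(\hat{p}\), but its mean is no longer 0 and its variance is no longer 1. Since \(E(\hat{p})=p'\) and \(V(\hat{p})=p'(1-p')/n\):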

\begin{equation*} E(Z)=\frac{p'-p_{0}}{\sqrt{\frac{p_{0}(1-p_{0})}{n}}} \end{equation*}
\begin{equation*} V(Z)=\frac{p'(1-p')/n}{p_{0}(1-p_{0})/n} \end{equation*}

Let:

\begin{equation*} A=\frac{p_{0}-p'}{\sqrt{\frac{p'(1-p')}{n}}},\qquad B_{\alpha}=\frac{z_{\alpha}\sqrt{p_{0}(1-p_{0})/n}}{\sqrt{p'(1-p')/n}} \end{equation*}

Then the probability of a Type II error at \(p=p'\) is:

  • \(H_{a}:p>p_{0}\): \(\beta(p')=\Phi(A+B_{\alpha})\)
  • \(H_{a}:p<p_{0}\): \(\beta(p')=1-\Phi(A-B_{\alpha})\)
  • \(H_{a}:p\ne p_{0}\): \(\beta(p')=\Phi(A+B_{\alpha/2})-\Phi(A-B_{\alpha/2})\)
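The three expressions for \(\beta(p')\) translate directly into code. The sketch below is only an illustration; the function name `type_ii_error` and its argument names are hypothetical, and \(\Phi\) is taken from Python's standard `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(p0, p_alt, n, alpha=0.05, alternative="greater"):
    """Approximate beta(p') for the large-sample test of H0: p = p0."""
    std_normal = NormalDist()
    phi = std_normal.cdf          # standard normal cdf, Phi
    z = std_normal.inv_cdf        # z(1 - a) is the upper-a critical value

    sd0 = sqrt(p0 * (1 - p0) / n)            # sd of p_hat under H0
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)   # sd of p_hat when p = p'
    a = (p0 - p_alt) / sd_alt                # the quantity A

    if alternative == "greater":             # Ha: p > p0
        return phi(a + z(1 - alpha) * sd0 / sd_alt)
    if alternative == "less":                # Ha: p < p0
        return 1 - phi(a - z(1 - alpha) * sd0 / sd_alt)
    if alternative == "two-sided":           # Ha: p != p0
        b = z(1 - alpha / 2) * sd0 / sd_alt  # the quantity B_{alpha/2}
        return phi(a + b) - phi(a - b)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
```

A quick sanity check: setting `p_alt` equal to `p0` makes \(A=0\), so each formula returns \(1-\alpha\), the probability of not rejecting a true \(H_{0}\).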

For a one-tailed level \(\alpha\) test, the sample size \(n\) needed so that \(\beta(p')=\beta\) is:

\begin{equation*} n=\left[\frac{1}{p'-p_{0}}\left(z_{\alpha}\sqrt{p_{0}(1-p_{0})}+z_{\beta}\sqrt{p'(1-p')}\right)\right]^{2} \end{equation*}

For a two-tailed test, replace \(\alpha\) with \(\alpha/2\); the resulting \(n\) is then only approximate.
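As a rough sketch, the sample-size formula can be evaluated as follows. The function name `required_sample_size` and the `two_sided` flag are assumptions for this example; the result is rounded up because \(n\) must be an integer.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(p0, p_alt, alpha=0.05, beta=0.10, two_sided=False):
    """Smallest integer n giving beta(p') of about `beta` at level `alpha`."""
    z = NormalDist().inv_cdf
    # For a two-tailed test, alpha is replaced by alpha / 2.
    z_alpha = z(1 - (alpha / 2 if two_sided else alpha))
    z_beta = z(1 - beta)

    numerator = (z_alpha * sqrt(p0 * (1 - p0))
                 + z_beta * sqrt(p_alt * (1 - p_alt)))
    n = (numerator / (p_alt - p0)) ** 2
    return ceil(n)  # round up to the next whole observation
```

Because the difference \(p'-p_{0}\) is squared, the same function works whether \(p'\) lies above or below \(p_{0}\).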

Small Sample Tests

When \(n\) is small, use the binomial distribution directly. If \(H_{0}\) is true, then \(p=p_{0}\) and \(X\sim\textrm{Bin}(n,p_{0})\), so the probability of a Type I error can be calculated exactly for any rejection region.

If \(p=p'>p_{0}\), then \(X\sim\textrm{Bin}(n,p')\), and the probability of a Type II error can likewise be calculated directly.
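To illustrate the exact calculation, the sketch below handles the upper-tailed case \(H_{a}:p>p_{0}\) with a rejection region of the form \(X\ge c\). The helper names (`upper_tail`, `exact_upper_tailed_test`, `type_ii_error_exact`) and the rule of picking the smallest \(c\) whose Type I error does not exceed \(\alpha\) are assumptions made for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(c, n, p):
    """P(X >= c) for X ~ Bin(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


def exact_upper_tailed_test(n, p0, alpha):
    """Smallest c with P(X >= c | p = p0) <= alpha, and that exact probability."""
    for c in range(n + 2):  # c = n + 1 means "never reject", with size 0
        size = upper_tail(c, n, p0)
        if size <= alpha:
            return c, size


def type_ii_error_exact(c, n, p_alt):
    """P(X < c | p = p'), the chance of failing to reject when p = p'."""
    return 1 - upper_tail(c, n, p_alt)
```

Because \(X\) is discrete, the Type I error actually achieved is usually smaller than the nominal \(\alpha\), which is why the function returns it alongside the cutoff \(c\).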