Large Sample Confidence Intervals for a Population Mean and Proportion

Posted by Beetle B. on Sun 16 July 2017

\(\newcommand{\Cov}{\mathrm{Cov}}\) \(\newcommand{\Corr}{\mathrm{Corr}}\) \(\newcommand{\Sample}{X_{1},\dots,X_{n}}\)

For any distribution, if \(n\) is large, use \(Z=\frac{\bar{X}-\mu}{S/\sqrt{n}}\). By the Central Limit Theorem, \(Z\) is approximately standard normal. Then \(\bar{x}\pm z_{\alpha/2}\,s/\sqrt{n}\) is a confidence interval for \(\mu\) with confidence level approximately \(100(1-\alpha)\%\).

As a rule of thumb, \(n>40\) is large enough to justify this approximation. Note that \(Z\) now involves two random variables, \(\bar{X}\) and \(S\), which is why the required \(n\) is a bit larger than our previous rule for the CLT.
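A rough sketch of this interval using only Python's standard library. The function name `large_sample_mean_ci` and the sample data are made up for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def large_sample_mean_ci(data, confidence=0.95):
    """Approximate two-sided CI for the mean: x_bar +/- z_{alpha/2} * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of n = 49 observations (n > 40, so the rule of thumb applies).
data = [50 + (i % 7) for i in range(49)]
low, high = large_sample_mean_ci(data)
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

The interval is centered on \(\bar{x}\) and widens as the confidence level increases.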

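There is no way to compute the \(n\) needed for a given interval width in advance, though, because the width depends on \(s\), which is unknown until the data are collected. Just be conservative: plan with a generous guess for \(\sigma\).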
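If a guess for \(\sigma\) is available, the width relation \(w=2z_{\alpha/2}\,s/\sqrt{n}\) can be inverted to get a planning value for \(n\). A minimal sketch, assuming a hypothetical user-supplied `sigma_guess` (the function name is also made up):

```python
import math
from statistics import NormalDist


def sample_size_for_mean(width, sigma_guess, confidence=0.95):
    """Planning value of n so that 2 * z_{alpha/2} * sigma_guess / sqrt(n) <= width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve width = 2 * z * sigma / sqrt(n) for n, then round up.
    return math.ceil((2 * z * sigma_guess / width) ** 2)


# A larger (more conservative) guess for sigma gives a larger n.
print(sample_size_for_mean(width=1.0, sigma_guess=2.0))
print(sample_size_for_mean(width=1.0, sigma_guess=3.0))
```

The result is only as good as the guess, so the actual width after sampling may differ.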

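For large \(n\), an upper confidence bound for \(\mu\) is \(\mu<\bar{x}+z_{\alpha}\,s/\sqrt{n}\).

For large \(n\), a lower confidence bound for \(\mu\) is \(\mu>\bar{x}-z_{\alpha}\,s/\sqrt{n}\).

Both one-sided bounds use \(z_{\alpha}\) rather than \(z_{\alpha/2}\), and each has confidence level approximately \(100(1-\alpha)\%\).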
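The one-sided bounds can be sketched the same way. The function names and the summary statistics below are hypothetical.

```python
import math
from statistics import NormalDist


def upper_confidence_bound(x_bar, s, n, confidence=0.95):
    """Large-sample upper bound: mu < x_bar + z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(confidence)  # all of alpha sits in one tail
    return x_bar + z_alpha * s / math.sqrt(n)


def lower_confidence_bound(x_bar, s, n, confidence=0.95):
    """Large-sample lower bound: mu > x_bar - z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(confidence)
    return x_bar - z_alpha * s / math.sqrt(n)


# Made-up summary statistics: x_bar = 53, s = 2, n = 49.
print(upper_confidence_bound(53, 2, 49))
print(lower_confidence_bound(53, 2, 49))
```

Because \(z_{\alpha}<z_{\alpha/2}\), each one-sided bound sits closer to \(\bar{x}\) than the corresponding endpoint of the two-sided interval at the same confidence level.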

A General Large Sample Confidence Interval

Need to estimate \(\theta\). Let \(\hat{\theta}\) satisfy:

  1. It is approximately normal.
  2. It is unbiased.
  3. The expression for \(\sigma_{\hat{\theta}}\) is known.

Then standardizing \(\hat{\theta}\) gives:

\begin{equation*} P\left(-z_{\alpha/2}<\frac{\hat{\theta}-\theta}{\sigma_{\hat{\theta}}}<z_{\alpha/2}\right)\approx1-\alpha \end{equation*}

If \(\sigma_{\hat{\theta}}\) involves an unknown other than \(\theta\) (e.g. \(\sigma\)), see if substituting the estimate \(s_{\hat{\theta}}\) works. If \(\sigma_{\hat{\theta}}\) involves \(\theta\) itself, try replacing \(\theta\) with \(\hat{\theta}\).
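A sketch of the general recipe. The Poisson example is an assumption chosen for illustration and is not from these notes: there \(\hat{\theta}=\bar{x}\) is unbiased for the rate \(\lambda\), and \(\sigma_{\hat{\theta}}=\sqrt{\lambda/n}\) involves \(\lambda\), so \(\lambda\) is replaced by \(\bar{x}\).

```python
import math
from statistics import NormalDist, mean


def general_large_sample_ci(theta_hat, se_hat, confidence=0.95):
    """theta_hat +/- z_{alpha/2} * (estimated standard deviation of theta_hat)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return theta_hat - z * se_hat, theta_hat + z * se_hat


# Hypothetical Poisson counts (made-up data, n = 50).
counts = [3, 5, 4, 6, 2] * 10
n = len(counts)
lam_hat = mean(counts)            # estimator of the Poisson rate
se_hat = math.sqrt(lam_hat / n)   # sqrt(lambda / n) with lambda replaced by lam_hat
low, high = general_large_sample_ci(lam_hat, se_hat)
print(f"Approximate 95% CI for lambda: ({low:.3f}, {high:.3f})")
```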

A Large Sample Confidence Interval for a Population Proportion

Recall that if \(n\ll N\), \(np\ge10\), and \(nq\ge10\), then \(X\), the number of successes in the sample, is approximately normally distributed. \(p\) is unknown, but the estimator \(\hat{p}=\frac{X}{n}\) is then also approximately normal. We know that \(E(\hat{p})=p\) and \(\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}\). Since \(\sigma_{\hat{p}}\) involves \(p\), standardizing \(\hat{p}\) gives an inequality that is quadratic in \(p\). Solving it for \(p\) gives the endpoints of the CI:

\begin{equation*} p=\frac{\hat{p}+\frac{z_{\alpha/2}^{2}}{2n}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}+\frac{z_{\alpha/2}^{2}}{4n^{2}}}}{1+\frac{z_{\alpha/2}^{2}}{n}} \end{equation*}

If \(n\) is large, the terms involving \(z_{\alpha/2}^{2}/n\) are negligible and this is roughly \(\hat{p}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\).

But the full form should be used. It is often accurate even when the conditions \(n\hat{p}\ge 10\) and \(n\hat{q}\ge 10\) are not both met.
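A sketch comparing the full form (often called the score or Wilson interval) with the large-\(n\) shortcut. The function names and the counts are made up for illustration.

```python
import math
from statistics import NormalDist


def score_interval(x, n, confidence=0.95):
    """Full-form CI for p, from solving the standardized inequality for p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = x / n
    q_hat = 1 - p_hat
    center = p_hat + z**2 / (2 * n)
    half = z * math.sqrt(p_hat * q_hat / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - half) / denom, (center + half) / denom


def traditional_interval(x, n, confidence=0.95):
    """Large-n shortcut: p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


# Hypothetical counts: the two forms nearly agree for large n ...
print(score_interval(500, 1000), traditional_interval(500, 1000))
# ... but with 0 successes in 20 trials the shortcut collapses to a single point,
# while the full form still gives an interval of positive width.
print(score_interval(0, 20), traditional_interval(0, 20))
```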

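If \(w\) is the desired width, then \(n\approx\frac{4z_{\alpha/2}^{2}\hat{p}\hat{q}}{w^{2}}\). But we don’t know \(\hat{p}\hat{q}\) before taking the sample. However, we do know it is largest when \(\hat{p}=\hat{q}=0.5\), so \(n\approx\frac{z_{\alpha/2}^{2}}{w^{2}}\) is a conservative choice that guarantees the width is at most \(w\) whatever \(\hat{p}\) turns out to be.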
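A sketch of the sample-size calculation for a proportion. The function name and the `p_guess` parameter are hypothetical, and the default of 0.5 is the worst case for \(\hat{p}\hat{q}\).

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(width, confidence=0.95, p_guess=0.5):
    """n ~ 4 * z_{alpha/2}^2 * p * q / w^2, rounded up; p_guess = 0.5 maximizes p * q."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(4 * z**2 * p_guess * (1 - p_guess) / width**2)


# Worst-case n for a 95% interval of total width 0.10.
print(sample_size_for_proportion(0.10))
# A prior guess away from 0.5 (an assumption) lowers the required n.
print(sample_size_for_proportion(0.10, p_guess=0.2))
```

Halving the desired width roughly quadruples \(n\), since \(n\) scales with \(1/w^{2}\).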