The T-Distribution

Posted by Beetle B. on Sun 16 July 2017

Let \(X_{1},\ldots,X_{n}\) be an independent and identically distributed sample from \(N(\mu,\sigma^{2})\), with sample mean \(\bar{X}\). Define the following random variable:

\begin{equation*} Y'=\frac{\bar{X}-\mu}{\sigma / \sqrt{n}} \end{equation*}

Then \(Y'\) has a standard normal distribution.

Now instead of \(\sigma\), we use the sample standard deviation \(S\):

\begin{equation*} Y=\frac{\bar{X}-\mu}{S / \sqrt{n}} \end{equation*}

This has a t-distribution with \(n-1\) degrees of freedom. One interesting feature of this random variable is that the numerator and denominator are independent, despite arising from the same sample!
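A small simulation can make the difference between \(Y'\) and \(Y\) concrete. This is only a sketch using the Python standard library; the sample size, \(\mu\), \(\sigma\), cutoff, and seed are arbitrary choices for illustration, not values from the post. It estimates how often each statistic exceeds 2 in absolute value. For a standard normal that probability is about 0.0455, and for \(Y\) with a small sample it should come out noticeably larger.

```python
import math
import random
import statistics


def y_prime(sample, mu, sigma):
    """Standardized mean using the true sigma (standard normal)."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))


def y_stat(sample, mu):
    """Standardized mean using the sample standard deviation S (n-1 divisor)."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))


def tail_fractions(n=5, mu=10.0, sigma=2.0, reps=20000, cutoff=2.0, seed=0):
    """Fraction of simulated samples where |Y| and |Y'| exceed the cutoff."""
    rng = random.Random(seed)
    exceed_y = 0
    exceed_y_prime = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(y_stat(sample, mu)) > cutoff:
            exceed_y += 1
        if abs(y_prime(sample, mu, sigma)) > cutoff:
            exceed_y_prime += 1
    return exceed_y / reps, exceed_y_prime / reps


frac_y, frac_y_prime = tail_fractions()
print("P(|Y|  > 2) estimate:", frac_y)
print("P(|Y'| > 2) estimate:", frac_y_prime)
```

Increasing `n` should bring the two estimates closer together.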

Below is a plot of the t-distribution:

[Figure: t-distribution]

The blue curve is the standard Gaussian distribution. The other curves are t-distributions for various degrees of freedom. As the degrees of freedom increase, the t-distribution approaches the normal distribution. You can see that the t-distribution has fatter tails than the normal.
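The fatter tails can also be checked numerically. The sketch below uses the standard closed-form density of the t-distribution, \(f(x)=\frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1+x^{2}/\nu\right)^{-(\nu+1)/2}\), which is not derived in this post. The evaluation point \(x=3\) and the degrees of freedom are arbitrary choices for illustration.

```python
import math


def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    coeff = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


x = 3.0
densities = {nu: t_pdf(x, nu) for nu in (1, 2, 5, 30)}
for nu, density in densities.items():
    print(f"t density at x={x}, nu={nu}: {density:.5f}")
print(f"normal density at x={x}: {normal_pdf(x):.5f}")
```

The density out in the tail shrinks toward the normal value as the degrees of freedom grow.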

(This image was taken from Wikipedia.)

Definition

Let \(\nu\) be a positive integer. Define the following random variable:

\begin{equation*} T=\frac{Z}{\sqrt{V / \nu}}=Z \sqrt{\frac{\nu}{V}} \end{equation*}

where:

  • \(Z\) is a standard normal with expected value 0 and variance 1
  • \(V\) has a chi-squared distribution with \(\nu\) degrees of freedom
  • \(Z\) and \(V\) are independent

Then \(T\) has a t-distribution with \(\nu\) degrees of freedom.
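The definition translates directly into a way of drawing t-distributed values. The sketch below assumes the standard fact that a chi-squared variable with \(\nu\) degrees of freedom can be built as the sum of \(\nu\) squared independent standard normals. The choice \(\nu=10\), the number of draws, and the seed are arbitrary. As a sanity check it compares the sample variance with the known variance of the t-distribution, \(\nu/(\nu-2)\) for \(\nu>2\).

```python
import math
import random
import statistics


def draw_t(nu, rng):
    """One draw of T = Z / sqrt(V / nu), with Z and V independent."""
    z = rng.gauss(0.0, 1.0)
    # V: chi-squared with nu degrees of freedom, built from fresh normals
    # so that it is independent of z.
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    return z / math.sqrt(v / nu)


rng = random.Random(1)
nu = 10
draws = [draw_t(nu, rng) for _ in range(50000)]
sample_var = statistics.variance(draws)
print("sample variance:", sample_var)
print("nu / (nu - 2): ", nu / (nu - 2))
```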

Let \(\bar{X}_{n}=\frac{1}{n}\left(X_{1}+\cdots+X_{n}\right)\) be the sample mean and \(S_{n}^{2}=\frac{1}{n-1} \sum_{i=1}^{n}\left(X_{i}-\bar{X}_{n}\right)^{2}\) be the sample variance, so that \(S_{n}\) is the sample standard deviation. It can be shown that:

\begin{equation*} V=(n-1) \frac{S_{n}^{2}}{\sigma^{2}} \end{equation*}

has a chi-squared distribution with \(n-1\) degrees of freedom. This follows from Cochran’s Theorem (i.e. it’s not a trivial derivation).
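Since the derivation is not shown here, a quick simulation is a reasonable sanity check. A chi-squared variable with \(k\) degrees of freedom has mean \(k\) and variance \(2k\), so for normal samples \(V\) should have mean close to \(n-1\) and variance close to \(2(n-1)\). The values of \(n\), \(\mu\), \(\sigma\), and the seed below are arbitrary choices for illustration.

```python
import random
import statistics


def scaled_sample_variance(n, mu, sigma, rng):
    """V = (n - 1) * S^2 / sigma^2 for one normal sample of size n."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    # statistics.variance uses the n - 1 divisor, matching S_n^2 above.
    return (n - 1) * statistics.variance(sample) / sigma ** 2


rng = random.Random(2)
n, mu, sigma = 6, 3.0, 1.5
values = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(20000)]
mean_v = statistics.fmean(values)
var_v = statistics.variance(values)
print("mean of V:    ", mean_v, " (chi-squared mean:     ", n - 1, ")")
print("variance of V:", var_v, " (chi-squared variance: ", 2 * (n - 1), ")")
```

Matching the first two moments does not prove the distribution is chi-squared, but a mismatch would show that something is wrong.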

Now:

\begin{equation*} Z=\left(\bar{X}_{n}-\mu\right) \frac{\sqrt{n}}{\sigma} \end{equation*}

has a standard normal distribution. If we now plug \(Z\) and \(V\) into \(T\) above, with \(\nu=n-1\), we’ll get:

\begin{equation*} T \equiv \frac{Z}{\sqrt{V / \nu}}=\left(\bar{X}_{n}-\mu\right) \frac{\sqrt{n}}{S_{n}} \end{equation*}

Note that any dependence on \(\sigma\) has canceled out.
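Written out step by step, with \(\nu=n-1\) so that \(V / \nu=S_{n}^{2} / \sigma^{2}\), the cancellation looks like this:

\begin{equation*} T=\frac{Z}{\sqrt{V /(n-1)}}=\frac{\left(\bar{X}_{n}-\mu\right) \sqrt{n} / \sigma}{\sqrt{S_{n}^{2} / \sigma^{2}}}=\left(\bar{X}_{n}-\mu\right) \frac{\sqrt{n}}{\sigma} \cdot \frac{\sigma}{S_{n}}=\left(\bar{X}_{n}-\mu\right) \frac{\sqrt{n}}{S_{n}} \end{equation*}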

I didn’t show that the sample mean and sample variance are independent. But do note that this property is unique to the normal distribution.
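A simulation cannot prove independence, but it can illustrate the contrast. The sketch below estimates the correlation between the sample mean and the sample variance for normal samples and for exponential samples. The exponential distribution is just one convenient non-normal example, and the sample size, repetition count, and seed are arbitrary. Keep in mind that zero correlation is necessary for independence but not sufficient, so the normal case is only consistent with the claim, while the exponential case shows clear dependence.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mean_variance_correlation(draw, n=5, reps=20000):
    """Correlation between sample mean and sample variance over many samples."""
    means, variances = [], []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))
    return pearson(means, variances)


rng = random.Random(3)
corr_normal = mean_variance_correlation(lambda: rng.gauss(0.0, 1.0))
corr_exponential = mean_variance_correlation(lambda: rng.expovariate(1.0))
print("normal samples:     ", corr_normal)
print("exponential samples:", corr_exponential)
```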

Now even though \(\mu\) shows up in \(T\), the distribution of \(T\) does not depend on \(\mu\). This is because \(Z\) is a standard normal whatever the value of \(\mu\), and \(V\) does not involve \(\mu\) at all.
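One way to see this concretely: write each observation as \(X_{i}=\mu+\sigma Z_{i}\) with \(Z_{i}\) standard normal. Then \(\bar{X}_{n}-\mu\) and \(S_{n}\) both scale with \(\sigma\) and the shift \(\mu\) drops out, so \(T\) is a function of the \(Z_{i}\) alone. The sketch below checks this on one set of draws. The helper names, the two \((\mu,\sigma)\) pairs, and the seed are made up for illustration.

```python
import math
import random
import statistics


def t_value(sample, mu):
    """T = (sample mean - mu) * sqrt(n) / S for one sample."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) * math.sqrt(n) / statistics.stdev(sample)


rng = random.Random(4)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]


def t_from(mu, sigma):
    """Build a N(mu, sigma^2) sample from the fixed z draws, then compute T."""
    sample = [mu + sigma * zi for zi in z]
    return t_value(sample, mu)


t_a = t_from(0.0, 1.0)
t_b = t_from(100.0, 7.5)
print(t_a, t_b)  # equal up to floating-point rounding
```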

It is important to point out that the underlying distribution need not be normal: \(T\) has a t-distribution as long as the numerator \(Z\) is standard normal (which the standardized sample mean gives approximately for large sample sizes, by the Central Limit Theorem), \(V\) in the denominator is chi-squared, and the two are independent. I honestly don’t know if such an application exists, though.