\(\newcommand{\Cov}{\mathrm{Cov}}\) \(\newcommand{\Corr}{\mathrm{Corr}}\) \(\newcommand{\Sample}{X_{1},\dots,X_{n}}\)

Things you should think about:

- What are the implications of your choice of \(\alpha\)?
- How much of a role did intuition play in choosing the test statistic used to reject \(H_{0}\)?
- If two or more tests are appropriate, how do you decide which to use?
- If you assumed anything about the distribution or population, what is the implication if your assumption is wrong?

## Statistical vs Practical Significance

Beware of P-values when \(n\) is large! As an example, consider \(H_{0}:\mu=100\) versus \(H_{a}:\mu>100\), and assume a normal population whose true mean is 101. If you make \(n\) large enough, you will observe \(\bar{x}\approx101\), and the probability of rejecting \(H_{0}\) is high.

But if practically we’re OK with \(\mu=101\), we should *not* reject
\(H_{0}\). A P-value result with this kind of \(H_{a}\) is
statistically significant, but not *practically significant*.
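A minimal simulation of this effect, using hypothetical numbers (known \(\sigma=10\), and \(\bar{x}\) landing exactly on the true mean of 101): the same one-unit difference goes from "not significant" to overwhelmingly significant purely by increasing \(n\).

```python
import math
from statistics import NormalDist

# Hypothetical setup: H0: mu = 100, Ha: mu > 100, but the true mean is 101.
mu0, true_mean, sigma = 100.0, 101.0, 10.0

for n in (25, 400, 10_000):
    # z statistic when x-bar happens to equal the true mean
    z = (true_mean - mu0) / (sigma / math.sqrt(n))
    # One-sided P-value for Ha: mu > 100
    p = 1 - NormalDist().cdf(z)
    print(f"n={n:>6}  z={z:6.2f}  P-value={p:.3g}")
```

At \(n=25\) the P-value is about 0.31; at \(n=10{,}000\) it is astronomically small, even though the practical difference (one unit) never changed.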

## The Likelihood Ratio Principle

Let \(x_{1},\dots,x_{n}\) be \(n\) observations from a
distribution \(f(x;\theta)\). The joint distribution is
\(f(x_{1};\theta)\cdots f(x_{n};\theta)\). Let \(H_{0}:\theta\in\Omega_{0}\) and
\(H_{a}:\theta\in\Omega_{a}\), where \(\Omega_{0}\) and
\(\Omega_{a}\) are disjoint. The **likelihood ratio principle** for
test construction proceeds as follows:

- Find the largest value of the likelihood for any \(\theta\) in \(\Omega_{0}\). You do this by finding the MLE within \(\Omega_{0}\) and substituting it back into the likelihood function.
- Find the largest value of the likelihood for any \(\theta\) in \(\Omega_{a}\).
- Form the ratio \(\lambda(x_{1},\dots,x_{n})=\dfrac{\max\textrm{ likelihood in }\Omega_{0}}{\max\textrm{ likelihood in }\Omega_{a}}\)

Reject \(H_{0}\) if \(\lambda\le k\), where \(k\) is selected to yield the desired Type I error probability \(\alpha\).

In words: holding the sample fixed, we maximize the likelihood over each of the two parameter regions, then take the ratio of the two maximized likelihoods.

We can use this test even when the \(X_{i}\) have different distributions and even when they are dependent.

This method often minimizes \(\beta\) for a given \(\alpha\). The drawback is that you need to know the distribution.