\(\newcommand{\Cov}{\mathrm{Cov}}\) \(\newcommand{\Corr}{\mathrm{Corr}}\)

## Expected Value

The expected value of a function \(h(X,Y)\) of two jointly continuous random variables with joint density \(f(x,y)\) is \(E[h(X,Y)]=\iint h(x,y)f(x,y)\,dx\,dy\).
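As a worked example (with a hypothetical joint density, not one from the text), take \(f(x,y)=4xy\) on \([0,1]^2\) and \(h(x,y)=xy\):

\[E(XY)=\int_0^1\!\int_0^1 xy\cdot 4xy\,dx\,dy=4\left(\int_0^1 x^2\,dx\right)\left(\int_0^1 y^2\,dy\right)=4\cdot\tfrac{1}{3}\cdot\tfrac{1}{3}=\tfrac{4}{9}\]

The double integral factors here only because this particular density separates into a function of \(x\) times a function of \(y\).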

## Covariance

The **covariance** between \(X\) and \(Y\) is:

\[\Cov(X,Y)=E\big[(X-E(X))(Y-E(Y))\big]\]

It is a measure of how closely the two variables move together. You could also think of it as a generalization of the variance to a pair of variables.

- A strong positive relationship gives a large positive number
- A strong negative relationship gives a large negative number
- No relationship gives a number close to 0
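A quick Monte Carlo sketch of the three cases (the data here is simulated, not from the text):

```python
import random

def sample_cov(xs, ys):
    # Sample covariance: average of (x - mean_x) * (y - mean_y)
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

random.seed(0)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]

ys_pos = [x + e for x, e in zip(xs, noise)]          # strong positive relationship
ys_neg = [-x + e for x, e in zip(xs, noise)]         # strong negative relationship
ys_ind = [random.gauss(0, 1) for _ in range(n)]      # no relationship

print(sample_cov(xs, ys_pos))  # clearly positive
print(sample_cov(xs, ys_neg))  # clearly negative
print(sample_cov(xs, ys_ind))  # close to 0
```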

Another way of calculating it (the shortcut formula):

\[\Cov(X,Y)=E(XY)-E(X)E(Y)\]

Note that \(\Cov(X,X)=\sigma_{X}^{2}\)

Note that \(\Cov(X,Y)^{2}\le\sigma_{X}^{2}\sigma_{Y}^{2}\)

This follows from the Cauchy-Schwarz inequality, which applies because the covariance satisfies all the properties of an inner product.
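A numerical sanity check of the Cauchy-Schwarz bound on an arbitrary (made-up) data set — the squared sample covariance never exceeds the product of the sample variances:

```python
def mean(v):
    return sum(v) / len(v)

def cov(xs, ys):
    # Sample covariance; cov(xs, xs) is the sample variance
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [3.0, -1.0, 2.0, 8.0, 5.0]

lhs = cov(xs, ys) ** 2
rhs = cov(xs, xs) * cov(ys, ys)  # sigma_X^2 * sigma_Y^2
print(lhs <= rhs)  # True
```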

## Correlation Coefficient

The problem with the covariance is that it depends on the units of the
variables. So instead we use the **correlation coefficient**:

\[\rho=\Corr(X,Y)=\frac{\Cov(X,Y)}{\sigma_X\sigma_Y}\]

If \(a,c\) have the same sign, then \(\Corr(aX+b,cY+d)=\Corr(X,Y)\) (if they have opposite signs, the correlation flips sign).

Also, \(-1\le\Corr(X,Y)\le1\)
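Both properties can be checked numerically. A sketch with simulated data: rescaling with \(a,c>0\) leaves the sample correlation unchanged, and the value stays in \([-1,1]\).

```python
import random

def corr(xs, ys):
    # Sample correlation: covariance divided by the product of standard deviations
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

random.seed(1)
xs = [random.gauss(0, 1) for _ in range(10_000)]
ys = [x + random.gauss(0, 1) for x in xs]

r = corr(xs, ys)
# a = 5, c = 2 (same sign), arbitrary shifts b = 3, d = -7
r_scaled = corr([5 * x + 3 for x in xs], [2 * y - 7 for y in ys])

print(abs(r - r_scaled) < 1e-9)  # True, up to floating point
print(-1 <= r <= 1)              # True
```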

- If \(|\rho|\ge0.8\), we say the correlation is strong.
- If \(0.5<|\rho|<0.8\), we say the correlation is moderate.
- If \(|\rho|\le0.5\), we say the correlation is weak.

These are rules of thumb, and they vary from discipline to discipline.

If \(X,Y\) are independent, then the correlation coefficient is 0. However, the converse need not hold: two strongly dependent random variables can still have a correlation coefficient of 0.

When \(\rho=0\), the variables are called **uncorrelated**, even if they are highly dependent.
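A classic toy example (not from the text): take \(X\) uniform on \(\{-1,0,1\}\) and \(Y=X^2\). Then \(Y\) is completely determined by \(X\), yet the covariance is exactly 0:

```python
# X uniform on {-1, 0, 1}, Y = X^2: fully dependent, yet Cov(X, Y) = 0.
xs = [-1, 0, 1]            # equally likely values of X
ys = [x ** 2 for x in xs]  # Y = X^2

ex = sum(xs) / 3                              # E(X) = 0
ey = sum(ys) / 3                              # E(Y) = 2/3
exy = sum(x * y for x, y in zip(xs, ys)) / 3  # E(XY) = E(X^3) = 0

print(exy - ex * ey)  # 0.0 -> uncorrelated, though fully dependent
```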

\(\rho=\pm 1\) iff \(Y=aX+b\) for some \(a\neq 0\), with the sign of \(\rho\) matching the sign of \(a\). Thus, \(\rho\) is a measure
of the degree of **linear** relationship.
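A sketch of the exact-linear case with made-up data (any \(Y=aX+b\) would do):

```python
def corr(xs, ys):
    # Sample correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(corr(xs, [3 * x + 2 for x in xs]))   # 1.0  (a = 3 > 0)
print(corr(xs, [-3 * x + 2 for x in xs]))  # -1.0 (a = -3 < 0)
```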

Note that if \(X,Y\) are independent, then \(E(XY)=E(X)E(Y)\). Combined with \(\Cov(X,Y)=E(XY)-E(X)E(Y)\), this shows that \(\Corr(X,Y)=0\) for independent variables.