Continuous Random Variables and Probability Distributions

Posted by Beetle B. on Tue 06 June 2017

Continuous distributions are given by a probability density function (pdf):

\begin{equation*} P\left(a\le X\le b\right)=\int_{a}^{b}f(x)\ dx \end{equation*}

\(f(x)\) is the pdf.

We require:

\begin{equation*} f(x)\ge0 \end{equation*}
\begin{equation*} \int_{-\infty}^{\infty}f(x)\ dx=1 \end{equation*}
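
As a quick numerical sketch of these requirements (assuming SciPy and NumPy are available; the rate-1 exponential pdf \(f(x)=e^{-x}\) for \(x\ge0\) is just an illustrative choice):

```python
import numpy as np
from scipy.integrate import quad

# Illustrative pdf: rate-1 exponential, f(x) = e^{-x} for x >= 0
def f(x):
    return np.exp(-x) if x >= 0 else 0.0

# Requirement: the pdf integrates to 1 over the whole real line
# (split at 0, where this pdf has a kink, to help the integrator)
total = quad(f, -np.inf, 0)[0] + quad(f, 0, np.inf)[0]
print(total)  # ~1.0

# P(a <= X <= b) is the area under f between a and b
a, b = 0.5, 2.0
prob, _ = quad(f, a, b)
print(prob)  # e^{-0.5} - e^{-2} ≈ 0.4712
```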

For any single point, this definition gives probability 0: \(P(X=c)=\int_{c}^{c}f(x)\ dx=0\). This may seem counterintuitive, but suppose it were nonzero. The axioms of probability require countable additivity: the probability of a countable union of disjoint sets must equal the sum of their probabilities. Take the sets to be single points of a uniform distribution; assigning each point the same nonzero probability would make the total probability infinite, not 1.

Unlike discrete distributions, we usually don’t derive the continuous distribution from probabilistic arguments.

We sometimes treat a discrete random variable as a continuous one because continuous functions are easier to analyze.

Cumulative Distribution Functions

\begin{equation*} F(x)=P(X\le x)=\int_{-\infty}^{x}f(y)\ dy \end{equation*}

And

\begin{equation*} f(x)=\frac{dF}{dx} \end{equation*}

To compute the probability of an interval, use \(P(a\le X\le b)=F(b)-F(a)\). This is a useful formula. Also, since single points carry zero probability, it is irrelevant whether the endpoints are included.
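
A minimal sketch of both facts, assuming SciPy's built-in exponential distribution (an illustrative choice, continuing the example above):

```python
from scipy.stats import expon

# F(b) - F(a) for the rate-1 exponential: matches integrating the pdf
a, b = 0.5, 2.0
print(expon.cdf(b) - expon.cdf(a))  # ≈ 0.4712

# f(x) = dF/dx: a centered finite difference on F recovers the pdf
x, h = 1.0, 1e-6
print((expon.cdf(x + h) - expon.cdf(x - h)) / (2 * h))  # ≈ 0.3679
print(expon.pdf(x))                                     # e^{-1} ≈ 0.3679
```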

Let \(0\le p\le1\). The \((100p)\)th percentile is obtained by solving for \(\eta(p)\) in

\begin{equation*} p=F\left(\eta(p)\right)=\int_{-\infty}^{\eta(p)}f(y)\ dy \end{equation*}

The median \(\tilde{\mu}\) is given by \(0.5=F(\tilde{\mu})\).
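
Here is a sketch of solving for \(\eta(p)\) numerically by root-finding, again assuming the rate-1 exponential, where the median is known to be \(\ln 2\):

```python
from scipy.optimize import brentq
from scipy.stats import expon

def percentile(p):
    # Solve p = F(eta) on a bracket known to contain the root
    return brentq(lambda eta: expon.cdf(eta) - p, 0.0, 50.0)

print(percentile(0.5))  # median: ln 2 ≈ 0.6931
print(expon.ppf(0.5))   # SciPy's built-in inverse cdf agrees
```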

Tip: To find the pdf of a transformed variable, it is often easier to calculate the cdf first and then differentiate.
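
For example (a small worked case): let \(X\) be uniform on \((0,1)\) and \(Y=X^{2}\). Then for \(0<y<1\),

\begin{equation*} F_{Y}(y)=P\left(X^{2}\le y\right)=P\left(X\le\sqrt{y}\right)=\sqrt{y} \end{equation*}

and differentiating gives the pdf:

\begin{equation*} f_{Y}(y)=\frac{dF_{Y}}{dy}=\frac{1}{2\sqrt{y}} \end{equation*}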

Expected Value, Variance and Mode

The mean is given by:

\begin{equation*} \mu_{X}=\int_{-\infty}^{\infty}xf(x)\ dx \end{equation*}

and the expected value of a function \(h(X)\) is:

\begin{equation*} \mu_{h(X)}=\int_{-\infty}^{\infty}h(x)f(x)\ dx \end{equation*}

The variance is given by:

\begin{equation*} \sigma_{X}^{2}=\int_{-\infty}^{\infty}\left(x-\mu\right)^{2}f(x)\ dx=E\left(X^{2}\right)-\left[E(X)\right]^{2} \end{equation*}

The mode of a continuous distribution is the value that maximizes \(f(x)\).
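
A numerical sketch of all three quantities, assuming SciPy and the rate-1 exponential (whose mean and variance are both 1, with mode 0):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

f = lambda x: np.exp(-x)  # rate-1 exponential pdf on [0, inf)

# Mean: integral of x f(x)
mu = quad(lambda x: x * f(x), 0, np.inf)[0]

# E(X^2): an instance of E(h(X)) with h(x) = x^2
ex2 = quad(lambda x: x**2 * f(x), 0, np.inf)[0]

# Variance two ways: the definition and the shortcut E(X^2) - [E(X)]^2
var_def = quad(lambda x: (x - mu)**2 * f(x), 0, np.inf)[0]
print(mu, var_def, ex2 - mu**2)  # ≈ 1.0, 1.0, 1.0

# Mode: the value maximizing f (here the boundary point 0)
res = minimize_scalar(lambda x: -f(x), bounds=(0, 10), method="bounded")
print(res.x)  # ≈ 0
```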

Terminology

Consider a family of distributions with two parameters, \(\theta_{1}\) and \(\theta_{2}\). If changing \(\theta_{1}\) merely shifts the pdf, it is a location parameter. If changing \(\theta_{2}\) stretches or compresses the pdf, it is a scale parameter.
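
A small sketch, assuming the usual convention that a location-scale family has pdf \(\frac{1}{\theta_{2}}f\left(\frac{x-\theta_{1}}{\theta_{2}}\right)\) where \(f\) is the standard pdf; SciPy exposes these as loc and scale:

```python
from scipy.stats import norm

# Shift the standard normal pdf by loc (theta_1), stretch by scale (theta_2)
loc, scale, x = 2.0, 3.0, 4.0
manual = norm.pdf((x - loc) / scale) / scale
print(manual, norm.pdf(x, loc=loc, scale=scale))  # identical
```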

Properties

Let \(Y=h(X)\), with \(h\) invertible and inverse \(x=k(y)\). Then the pdf of \(Y\) is \(g(y)=f(k(y))\left|k'(y)\right|\).
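
A numerical check of this formula, taking for illustration \(X\) standard normal and \(h(x)=e^{x}\), so that \(k(y)=\ln y\) and \(\left|k'(y)\right|=1/y\); the result should be the lognormal pdf:

```python
import numpy as np
from scipy.stats import lognorm, norm

# Y = exp(X) with X ~ N(0,1), so k(y) = ln y and |k'(y)| = 1/y
y = 2.5
g = norm.pdf(np.log(y)) / y
print(g)                    # ≈ 0.1049
print(lognorm.pdf(y, s=1))  # the lognormal pdf agrees
```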

Jensen’s Inequality

Let \(g(x)\) be convex and differentiable. Then \(g(E(X))\le E(g(X))\).
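
A quick check with the convex function \(g(x)=x^{2}\), so the inequality reads \(\left[E(X)\right]^{2}\le E\left(X^{2}\right)\) (rate-1 exponential again, as an illustrative choice):

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-x)  # rate-1 exponential pdf
ex = quad(lambda x: x * f(x), 0, np.inf)[0]      # E(X) = 1
egx = quad(lambda x: x**2 * f(x), 0, np.inf)[0]  # E(g(X)) = E(X^2) = 2
print(ex**2, egx)  # g(E(X)) = 1 <= 2 = E(g(X))
```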

Chebyshev’s Inequality

Chebyshev’s inequality, \(P\left(\left|X-\mu\right|\ge k\sigma\right)\le\frac{1}{k^{2}}\), holds for continuous distributions as well.
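
A numerical sketch checking the bound for the rate-1 exponential (where \(\mu=\sigma=1\)):

```python
from scipy.stats import expon

mu = sigma = 1.0  # mean and standard deviation of the rate-1 exponential
for k in (1.5, 2, 3):
    # P(|X - mu| >= k sigma) = P(X <= mu - k sigma) + P(X >= mu + k sigma)
    tail = expon.cdf(mu - k * sigma) + expon.sf(mu + k * sigma)
    print(k, tail, 1 / k**2)  # the tail never exceeds 1/k^2
```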