Probability

Sample Spaces and Events

The sample space \(\mathcal{S}\) is the set of all possible outcomes of an experiment. An event is a subset of the sample space. A simple event is an event with only one element.

Two events \(A\) and \(B\) are mutually exclusive if \(A\bigcap B=\emptyset\), that is, if they cannot both occur.

Axioms, Interpretations and Properties of Probability

The axioms of probability:

For an event \(A\), \(P(A)\ge 0\)
\(P(\mathcal{S})=1\)
For pairwise disjoint sets \(A_{i}\), \(P(\bigcup_{i}A_{i})=\sum_{i}P(A_{i})\). This applies even for a countably infinite sequence of sets.

The frequentist interpretation defines \(P(A)\) as the limiting relative frequency of \(A\) over \(n\) repetitions of the experiment. It appears objective, because the assignment of probabilities is independent of the observer. But it is still subjective in its belief that the relative frequency converges for large \(n\). Frequentist methods can be used only for repeatable experiments.
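The idea of a relative frequency settling down for large \(n\) can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency`, the success probability `p` and the fixed `seed` are assumptions made for the illustration, and a finite simulation can suggest convergence but never prove it.

```python
import random

def relative_frequency(n, p=0.5, seed=0):
    """Fraction of n simulated trials in which an event of probability p occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(rng.random() < p for _ in range(n))
    return hits / n

# The relative frequency tends to move closer to p as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```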

It should always be clear what the experiment is, and what the event is, for a given problem. Many “paradoxes” (e.g. the two-card problem) arise because the experiment is ambiguous.

Various convenient tricks:

\(P(A)=1-P(A')\)
\(P(A\bigcap B)=0\) for mutually exclusive events
\(P(A\bigcup B)=P(A)+P(B)-P(A\bigcap B)\)
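These properties can be checked on a small finite sample space. The sketch below assumes a single roll of a fair six-sided die with equally likely outcomes; the events `A`, `B` and `C` and the helper `prob` are made up for the illustration.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
sample_space = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of sample_space) with equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = frozenset({2, 4, 6})  # "the roll is even"
B = frozenset({4, 5, 6})  # "the roll is at least 4"
C = frozenset({1, 3})     # shares no outcome with A

# Complement rule: P(A) = 1 - P(A')
assert prob(A) == 1 - prob(sample_space - A)

# Mutually exclusive events: the intersection is empty, so its probability is 0
assert prob(A & C) == 0
assert prob(A | C) == prob(A) + prob(C)

# Inclusion-exclusion for events that overlap
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(prob(A | B))  # 2/3
```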

Note that if \(\mathcal{S}\) is uncountable, then in general we cannot assign a probability to every subset of \(\mathcal{S}\) (i.e. not every subset will have a probability defined for it).

Conditional Probability

For \(P(B)>0\), the conditional probability of \(A\) given \(B\) is:

\begin{equation*} P(A|B)=\frac{P(A\bigcap B)}{P(B)} \end{equation*}

The Law of Total Probability: If \(A_{i}\) are mutually exclusive, and \(\mathcal{S}=\bigcup_{i}A_{i}\), then:

\begin{equation*} P(B)=\sum_{i}P(B|A_{i})P(A_{i}) \end{equation*}

Bayes’ Theorem:

\begin{equation*} P(A_{j}|B)=\frac{P(A_{j}\bigcap B)}{P(B)}=\frac{P(B|A_{j})P(A_{j})}{\sum_{i}P(B|A_{i})P(A_{i})} \end{equation*}

\(P(A|B)+P(A'|B)=1\). Obviously, this assumes \(P(B)>0\).
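A small numeric sketch of the Law of Total Probability and Bayes’ Theorem follows. The numbers are invented for the illustration: three machines \(A_{1},A_{2},A_{3}\) produce 50%, 30% and 20% of all parts, with defect rates of 1%, 2% and 3%, and \(B\) is the event “the part is defective”.

```python
from fractions import Fraction

# Hypothetical partition A_1, A_2, A_3: which machine made the part.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]            # P(A_i)
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]  # P(B | A_i)

def total_probability(priors, likelihoods):
    """P(B) = sum_i P(B | A_i) P(A_i)."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def posterior(j, priors, likelihoods):
    """P(A_j | B) = P(B | A_j) P(A_j) / P(B)."""
    return likelihoods[j] * priors[j] / total_probability(priors, likelihoods)

print(total_probability(priors, likelihoods))  # 17/1000
print(posterior(0, priors, likelihoods))       # 5/17
```

Since the \(A_{i}\) partition the sample space, the posteriors \(P(A_{j}|B)\) sum to 1.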

Counting

Some useful identities in combinatorics:

\begin{equation*} \binom{n}{k}=\binom{n}{n-k} \end{equation*}

Mnemonic: Assume you’re choosing 3 people from 10. For every way you have of choosing 3 people, you are fixing the other 7 (and vice versa).

\begin{equation*} \binom{n}{k}=\binom{n-1}{k-1}+\binom{n-1}{k} \end{equation*}

Mnemonic: Say you need to pick 3 people out of 10. Fix a person John. You either choose him or don’t. If you picked him, there are \(\binom{9}{2}\) ways to pick the remaining 2. If you didn’t pick him, you have to pick 3 from the remaining 9.
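Both mnemonics so far can be checked by brute-force enumeration. In this sketch the people are simply numbered 0 to 9, and person 0 plays the role of John; these labels are assumptions for the illustration.

```python
from itertools import combinations
from math import comb

people = list(range(10))  # person 0 stands in for "John"
teams = list(combinations(people, 3))

# Split the teams by whether John was picked.
with_john = [t for t in teams if 0 in t]
without_john = [t for t in teams if 0 not in t]

assert len(teams) == comb(10, 3)           # 120
assert len(with_john) == comb(9, 2)        # 36
assert len(without_john) == comb(9, 3)     # 84

# Symmetry: each team of 3 fixes exactly one group of 7 left behind.
left_behind = {tuple(p for p in people if p not in t) for t in teams}
assert len(left_behind) == comb(10, 7) == comb(10, 3)
```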

\begin{equation*} \binom{2n}{n}=\sum_{k=0}^{n}\binom{n}{k}^{2} \end{equation*}

Mnemonic: You have to pick 10 people out of 20 (e.g. to split them into two teams of equal size). 10 of the people are women, and 10 are men. You could pick no women (only 1 way to do this, since you must then pick all 10 men). Or you could pick exactly one woman (10 ways to pick the woman, and 10 ways to choose the one man to omit). Or you could pick 2 women, and then choose 2 men to exclude, and so on. Picking \(k\) women and excluding \(k\) men can be done in \(\binom{10}{k}^{2}\) ways.

This is a special case of the Vandermonde identity, \(\binom{m+n}{r}=\sum_{k=0}^{r}\binom{m}{k}\binom{n}{r-k}\), with \(m=n=r\).

\begin{equation*} \binom{n}{k}=\sum_{l=k}^{n}\binom{l-1}{k-1} \end{equation*}

Mnemonic: Suppose you need to pick 4 objects from 11, numbered 1 to 11. Partition the ways to do this by the highest-numbered object picked. If it is object 11, you need to pick 3 more out of the remaining 10. If it is object 10, you need to pick 3 more out of the 9 objects below it (object 11 is excluded, as all combinations involving it have already been counted). Keep going until the highest-numbered object is object 4, which leaves exactly one way to pick the other 3.
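The last two identities are easy to check numerically with `math.comb` (Python 3.8 or later). The function names below are made up for this sketch, and a numeric check for particular values is of course not a proof.

```python
from math import comb

def sum_of_squares(n):
    """Right-hand side of C(2n, n) = sum_{k=0}^{n} C(n, k)^2."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))

def sum_over_largest(n, k):
    """Right-hand side of C(n, k) = sum_{l=k}^{n} C(l-1, k-1)."""
    return sum(comb(l - 1, k - 1) for l in range(k, n + 1))

# 10 people out of 20, split into 10 women and 10 men
assert sum_of_squares(10) == comb(20, 10)      # 184756

# 4 objects out of 11, grouped by the highest-numbered object picked
assert sum_over_largest(11, 4) == comb(11, 4)  # 330
```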

Independence

Two events \(A\) and \(B\) with \(P(B)>0\) are independent if \(P(A|B)=P(A)\). We can show that this holds if and only if \(P(A\bigcap B)=P(A)P(B)\), which is usually taken as the definition because it also covers the case \(P(B)=0\).

Be careful when you have more than 2 events. It is possible to have \(P(A_{i}\bigcap A_{j})=P(A_{i})P(A_{j})\) for all \(i\ne j\) (in which case the events are pairwise independent), and yet not have mutual independence (e.g. \(P(A\bigcap B\bigcap C)\ne P(A)P(B)P(C)\)).

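An example is as follows. Place 4 cards in a box and draw one at random. If you draw card 1, you win prize 1. Likewise for cards 2 and 3. But if you draw card 4, you win prizes 1, 2 and 3. Let \(A_{n}\) be the event “Win prize \(n\)”. Each \(A_{n}\) has probability 0.5, and any two of them occur together only when card 4 is drawn, which has probability 0.25 = 0.5 × 0.5, so the events are pairwise independent. But all three occur together with probability 0.25, not 0.125!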
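The four-card example is small enough to enumerate exactly. The sketch below assumes each card is equally likely to be drawn; the `prizes` mapping and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

cards = [1, 2, 3, 4]
# Which prizes each card wins: card 4 wins all three.
prizes = {1: {1}, 2: {2}, 3: {3}, 4: {1, 2, 3}}

def win(n):
    """The event "win prize n", as the set of cards that produce it."""
    return frozenset(c for c in cards if n in prizes[c])

def prob(event):
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(cards))

events = {n: win(n) for n in (1, 2, 3)}

# Every pair satisfies the product rule: 1/4 == 1/2 * 1/2.
for i, j in combinations(events, 2):
    assert prob(events[i] & events[j]) == prob(events[i]) * prob(events[j])

# The triple does not: 1/4 on the left, 1/8 on the right.
triple = events[1] & events[2] & events[3]
print(prob(triple))                                           # 1/4
print(prob(events[1]) * prob(events[2]) * prob(events[3]))    # 1/8
```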

For events \(A_{1},\dots,A_{n}\) to be mutually independent, we need:

\begin{equation*} P(\bigcap_{j=1}^{k} A_{i_{j}})=\prod_{j=1}^{k}P(A_{i_{j}}) \end{equation*}

for every choice of distinct indices \(i_{1},\dots,i_{k}\), and for every number of terms \(k\) from 2 to \(n\).
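On a finite sample space with equally likely outcomes, this condition can be tested directly by looping over every sub-collection of events. The following is a sketch under that assumption; `mutually_independent` is a hypothetical helper, not a library function, and it needs Python 3.8 or later for `math.prod`.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def prob(event, space):
    """Probability of an event when all outcomes in space are equally likely."""
    return Fraction(len(event), len(space))

def mutually_independent(events, space):
    """Check the product rule for every sub-collection of 2 or more events."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            joint = frozenset(space).intersection(*subset)
            if prob(joint, space) != prod(prob(e, space) for e in subset):
                return False
    return True

# Three fair coin tosses: "toss i is heads" for i = 0, 1, 2.
space3 = frozenset(product("HT", repeat=3))
heads = [frozenset(w for w in space3 if w[i] == "H") for i in range(3)]
assert mutually_independent(heads, space3)

# Two fair coin tosses: first is heads, second is heads, both tosses agree.
# These are pairwise independent but fail the product rule for all three.
space2 = frozenset(product("HT", repeat=2))
first = frozenset(w for w in space2 if w[0] == "H")
second = frozenset(w for w in space2 if w[1] == "H")
agree = frozenset(w for w in space2 if w[0] == w[1])
assert not mutually_independent([first, second, agree], space2)
```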

Mutually exclusive events with nonzero probabilities are never independent: \(P(A\bigcap B)=0\), while \(P(A)P(B)>0\).