Consider the Monty Hall Dilemma example from the textbook (Dekking et al. 2005, sec. 1.3, p. 4). You are asked the following question:
Suppose you’re on a game show, and you’re given the choice of three doors; behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice? – Craig F. Whitaker. ‘Ask Marilyn’. 1990. Columbia, Md.
The question was inspired by Monty Hall’s “Let’s Make a Deal” game show. Here are some assumptions we will make as participants in the game show.
However, we also know the host always opens a door with a goat. When you first pick a door with a goat behind it, the host is forced to reveal the other goat, so you win the car when you switch. You can only win by not switching if your initial choice was correct.
As illustrated, you have twice the chance of winning the car if you switch. To quantify our chances of winning, we used probabilities.
The probabilities helped us express how likely it was that we chose the door with the car. It allowed us to make an informed decision to have a better chance of winning the car by quantifying our level of uncertainty about what’s behind each door. Another way of thinking about probability is in terms of long-term relative frequency. For example, if you play the Monty Hall game repeatedly over 1,000,000 times and switched the doors each time, you would expect to win approximately \(1,000,000 \times 2/3 \approx 666,667\) cars.
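The long-run relative frequency interpretation can be checked by simulation. Below is a minimal sketch in Python (the function name `play_monty_hall` and the choice of 100,000 rounds are illustrative, not from the text):

```python
import random

def play_monty_hall(switch, rng=random):
    """Play one round of the Monty Hall game; return True if the player wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)    # car is placed uniformly at random
    pick = rng.choice(doors)   # player's initial pick
    # The host opens a goat door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

n = 100_000
wins = sum(play_monty_hall(switch=True) for _ in range(n))
print(wins / n)  # close to 2/3
```

Running the simulation with `switch=False` instead gives a winning proportion close to 1/3, matching the analysis above.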
Probability is the science of uncertainty. It provides precise mathematical rules for understanding and analyzing our own ignorance (Evans and Rosenthal 2004).
The term probability refers to the study of randomness and uncertainty…the theory of probabilty provides methods for quantifying the chances, or likelihoods, associated with the various outcomes (Devore and Berk 2012).
So far, we discussed what a probability represents. We will now formulate the mathematical framework to formally define probability.
We use the term (random) experiment in a very general sense to describe mechanisms and phenomena where the outcomes are unpredictable, or random.
A sample space is the collection of all possible outcomes from an experiment. It’s often denoted \(\Omega\) (Omega).
An event is a subset of the sample space.
In the Monty Hall example, we can think of the game as a random experiment with the location of the car as the outcome. The location of the car is unknown and unpredictable to the participant until the car is revealed at the end. The sample space is \(\Omega=\){Door 1, Door 2, Door 3}. Assuming the participant chooses Door 2 at the end of the game, the event that the participant wins is {Door 2} and the event of losing is {Door 1, Door 3}.
Events are represented as subsets of the sample space. We will define notations and rules that are useful when working with such sets. To demonstrate the notions, we will use Venn diagrams where two events, \(A\) and \(B\), are represented with circles enclosed by a rectangle representing their sample space \(\Omega\).
The event that \(A\) or \(B\) occurs is called the union of \(A\) and \(B\). \(\cup\) is the set operator that represents a union.
\(A \cup B\)
The event that \(A\) and \(B\) both occur is called the intersection of \(A\) and \(B\). \(\cap\) is the set operator that represents an intersection.
\(A \cap B\)
The event that an event does not occur is called the complement of the event. \({(\phantom{A})}^c\) is the set operator that represents a complement. Another way to represent this is \(\Omega\setminus A\), where \(\setminus\) is the set minus operator. The complement operation is a special case of the set minus operation.
\(A^c\)

Suppose you roll a regular die once. Let \(A\) represent the event that you roll an even number and \(B\) the event that you roll a number less than 3.

\(\Omega\) is the collection of possible numbers from a regular six-faced die. That is,
$$\Omega=\left\{1,2,3,4,5,6\right\}.$$

\(A\cup B\) is the event that you roll a number that is even or less than 3. That is,

$$A\cup B=\left\{1,2,4,6\right\}.$$

\(A\cap B\) is the event that you roll a number that is even and less than 3. That is,

$$A \cap B=\left\{2\right\}.$$

\(\left(A\cup B\right)^c\) is the event that you do not roll a number that is even or less than 3. That is,

$$\left(A\cup B\right)^c = \left\{1,2,4,6\right\}^c=\left\{3,5\right\}.$$

Note that the last event can be rephrased as the event that you roll a number that is not even and not less than 3. In other words,
$$\left(A\cup B\right)^c = A^c \cap B^c.$$
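The worked die example can be verified directly with Python's built-in set operations (a small illustrative sketch; the variable names are chosen here, not from the text):

```python
# Events from the die-rolling example, represented as Python sets.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # roll an even number
B = {1, 2}      # roll a number less than 3

print(A | B)             # union: {1, 2, 4, 6}
print(A & B)             # intersection: {2}
print(omega - (A | B))   # complement of the union: {3, 5}
# The identity (A ∪ B)^c = A^c ∩ B^c holds on this example:
print(omega - (A | B) == (omega - A) & (omega - B))  # True
```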
The relationship is demonstrated with Venn diagrams below.
[Venn diagrams: the union \(A\cup B\); its complement \((A\cup B)^c\); the complements \(A^c\) and \(B^c\); and their intersection \(A^c\cap B^c\), which shades the same region as \((A\cup B)^c\).]
DeMorgan’s Laws generalize this relationship.
DeMorgan’s Laws state that we have
$$\left(A\cup B\right)^c=A^c \cap B^c\text{ and } \left(A\cap B\right)^c=A^c\cup B^c$$
for any two events \(A\) and \(B\).
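As a quick sanity check, both laws can be verified on small example sets (the particular sets below are arbitrary, chosen only for illustration):

```python
# Illustrative sample space and events; complement is taken relative to omega.
omega = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}

def comp(S):
    return omega - S

assert comp(A | B) == comp(A) & comp(B)   # (A ∪ B)^c = A^c ∩ B^c
assert comp(A & B) == comp(A) | comp(B)   # (A ∩ B)^c = A^c ∪ B^c
print("DeMorgan's Laws hold on this example")
```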
Disjoint events and subsets are two other relationships that are useful when discussing probability.
For any two events \(A\) and \(B\), we say they are disjoint, or mutually exclusive if they have no outcomes in common. Their intersection is an empty set denoted as \(\emptyset\).
\(A \cap B = \emptyset\)
For any two events \(A\) and \(B\), we say \(A\) is a subset of \(B\), or \(A\) implies \(B\), if all outcomes of \(A\) lie within \(B\). Their intersection is \(A\).
\(A \subset B\)

We are now ready to formally define probability. We provide two separate versions based on the size of the sample space: one for the simple case where the sample space is finite, or \(\lvert \Omega\rvert<\infty\), and one for the case where the sample space is infinite, or \(\lvert \Omega\rvert=\infty\).
For any event \(A\), \(\lvert A\rvert\) is called the cardinality of the event \(A\), or the size of the event. It is the count of all outcomes that belong to the event.
A probability function \(P\) defined on a finite sample space \(\Omega\) assigns each event \(A\) in \(\Omega\) a number \(P(A)\) such that
1. \(0\le P(A)\le 1\),
2. \(P(\Omega)=1\), and
3. \(P(A\cup B)=P(A) + P(B)\) if \(A\) and \(B\) are disjoint.

The number \(P(A)\) is called the probability that \(A\) occurs.
A probability function \(P\) defined on an infinite sample space \(\Omega\) assigns each event \(A\) in \(\Omega\) a number \(P(A)\) such that

1. \(0\le P(A)\le 1\),
2. \(P(\Omega)=1\), and
3. \(P(A_1\cup A_2\cup A_3\cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots\) if \(A_1,A_2,A_3,\ldots\) are disjoint.

The number \(P(A)\) is called the probability that \(A\) occurs.
Suppose we are interested in the probability of the union of two events \(A\) and \(B\) in sample space \(\Omega\). To compute the probability using the definition, we may start by identifying disjoint subsets of \(A\cup B\).
[Venn diagrams: the disjoint pieces \(A\cap B^c\), \(A^c\cap B\), and \(A\cap B\), and their union \(A \cup B\).]
\begin{equation} \implies P(A\cup B)=P(A\cap B^c) + P(A^c \cap B) + P(A\cap B) \tag{1} \end{equation}
It’s important to identify subsets that are disjoint.
Similarly, we can decompose events \(A\) and \(B\) into two disjoint subsets respectively.
\begin{equation} P(A) = P(A\cap B) + P(A\cap B^c) \end{equation}
\begin{equation} \implies P(A\cap B^c) = P(A) - P(A\cap B) \tag{2} \end{equation}
\begin{equation} P(B) = P(A\cap B) + P(A^c\cap B) \end{equation}
\begin{equation} \implies P(A^c \cap B) = P(B) - P(A\cap B) \tag{3} \end{equation}
Substituting Equations (2) and (3) into Equation (1), we can derive the following result.
\begin{equation} P(A\cup B)=P(A) - P(A\cap B) + P(B) - P(A\cap B) + P(A\cap B) \end{equation}
\begin{equation} \implies P(A\cup B)=P(A) + P(B) - P(A\cap B) \tag{4} \end{equation}
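Because probabilities on a finite sample space with equally likely outcomes reduce to counting, Equation (4) can be checked on the earlier die example by counting directly (a sketch; the events are the ones from that example):

```python
omega = {1, 2, 3, 4, 5, 6}
A, B = {2, 4, 6}, {1, 2}   # even; less than 3

# |A ∪ B| = |A| + |B| - |A ∩ B|; dividing through by |Ω| gives Equation (4).
assert len(A | B) == len(A) + len(B) - len(A & B)
print(len(A | B), len(A), len(B), len(A & B))  # 4 3 2 1
```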
This result for the union of any two sets holds true in general.

The Probability of a Union states that we have
$$P(A \cup B) = P(A) + P(B) - P(A\cap B)$$
for any two events \(A\) and \(B\).
A similar investigation between any event \(A\) and its sample space \(\Omega\) combined with the definition \(P(\Omega)=1\) yields the following general result for the probability of the complement of \(A\).
The Probability of a Complement states that we have
$$P(A^c) = 1 - P(A)$$
for any event \(A\).
Let’s consider the example of rolling a fair die. Recall, the sample space \(\Omega\) is \(\{1,2,3,4,5,6\}\).
Note that the events with a single outcome are disjoint from each other. Therefore,
$$P(\{1\}) + P(\{2\}) + P(\{3\}) + P(\{4\}) + P(\{5\}) + P(\{6\}) = P(\Omega) = 1.$$
Since it’s a fair die, all 6 outcomes have the same chance of being rolled. Let’s denote this common probability by \(p\). Then, we have
$$6p = 1 \implies p = \frac{1}{6}.$$
In general, we can compute \(P(A)\) for any event \(A\) of the sample space \(\Omega\) using
\begin{equation} P(A)=\frac{\text{number of outcomes that belong to }A}{ \text{total number of outcomes in }\Omega } \end{equation}
if the following conditions are satisfied:

1. \(\Omega\) is a finite sample space, and
2. all outcomes in \(\Omega\) are equally likely.

Recall \(B\) is the event that you roll a number less than 3 in the example of rolling a fair die. Using the counting method, we can compute the probability as:
$$P(B) = \frac{\lvert B\rvert}{\lvert\Omega\rvert} =\frac{2}{6}=\frac{1}{3}.$$
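The counting method and the complement rule can be wrapped in a small helper. This is an illustrative sketch; the name `prob` is chosen here and `Fraction` keeps the results exact:

```python
from fractions import Fraction

def prob(event, omega):
    """Counting method: P(A) = |A| / |Ω|, valid for a finite sample space
    whose outcomes are equally likely."""
    return Fraction(len(event), len(omega))

omega = {1, 2, 3, 4, 5, 6}
B = {1, 2}                      # roll a number less than 3
print(prob(B, omega))           # 1/3
print(prob(omega - B, omega))   # 2/3, agreeing with 1 - P(B)
```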
So far, we considered outcomes of running a single random experiment. Often, we are interested in outcomes of running multiple experiments.
Suppose you toss a coin twice, where the coin has an equal chance of landing heads and landing tails. How should we define the sample space?
We will assume it is not possible for the coin to land on its side. While one may argue it is possible, the probability is so small that it is often negligible.
We can first examine the sample space of each of the two tosses individually. Let \(\Omega_1\) be the sample space for the first toss and \(\Omega_2\) be the sample space for the second. Since the coin can only land heads or tails from the first toss, we have
$$\Omega_1=\left\{H,T\right\}.$$
We are using the same coin for the second toss and the sample space does not change:
$$\Omega_2=\left\{H,T\right\}.$$
If the coin lands heads on both tosses, we can denote the outcome as \((H,H)\). If the first heads is followed by tails, we can denote \((H,T)\), and so on. Note that for each outcome from the first toss, we have the same set of outcomes for the second. We can thus see that
$$\Omega = \Omega_1\times\Omega_2 =\{H,T\}\times\{H,T\}=\left\{ \left(H,H\right),\left(H,T\right), \left(T,H\right),\left(T,T\right) \right\}.$$
In general, the sample space of multiple experiments is the Cartesian product of the sample spaces for the individual experiments. In the case of two experiments, we have
$$\Omega=\Omega_1\times\Omega_2= \left\{\left(\omega_1,\omega_2\right): \omega_1\in\Omega_1,\omega_2\in\Omega_2\right\}.$$
Note that \(\lvert\Omega\rvert=\lvert\Omega_1\rvert\cdot\lvert\Omega_2\rvert\) when \(\Omega_1\) and \(\Omega_2\) are finite.
For the coin tossing example, we have
$$\lvert \Omega \rvert = \left| \{H,T\}\times\{H,T\} \right| = \left|\left\{ \left(H,H\right),\left(H,T\right), \left(T,H\right),\left(T,T\right) \right\}\right| = 4 = 2\cdot 2.$$
We also note that each outcome in \(\Omega\) is equally likely, so we can compute the probability of any event from the sample space by counting. For example,
$$P\left(\left\{\left(H,T\right)\right\}\right)=\frac{1}{4}.$$
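The two-toss sample space and this counting computation can be reproduced with `itertools.product` (a sketch; `Fraction` is used for an exact probability):

```python
from itertools import product
from fractions import Fraction

omega1 = ["H", "T"]                      # sample space of a single toss
omega = list(product(omega1, repeat=2))  # Cartesian product Ω1 × Ω2
print(omega)        # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(omega))   # 4 = 2 * 2
# Outcomes are equally likely, so counting gives the probability:
print(Fraction(1, len(omega)))           # 1/4 = P({(H, T)})
```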
In general, you can extend the counting method to multiple experiments when the combined sample space is finite and all of its outcomes are equally likely.
Dekking, Frederik Michel, Cornelis Kraaikamp, Hendrik Paul Lopuhaä, and Ludolf Erwin Meester. 2005. A Modern Introduction to Probability and Statistics: Understanding Why and How. Springer Science & Business Media.
Devore, Jay L, and Kenneth N Berk. 2012. Modern Mathematical Statistics with Applications. Springer.
Evans, Michael J, and Jeffrey S Rosenthal. 2004. Probability and Statistics: The Science of Uncertainty. Macmillan.