What is the difference between a probability and a probability space?

A probability space is a triplet $(\Omega,\mathcal F,P)$, where $\Omega$ is a set, $\mathcal F$ is a sigma-algebra on $\Omega$, and $P$ is a probability measure on $(\Omega,\mathcal F)$. Thus, a probability is the third element in a triplet defining a probability space.

Example: To model one throw of a die, one could use $\Omega=\{1,2,3,4,5,6\}$, $\mathcal F=2^\Omega$ and $P:A\mapsto\frac16\#A$ for every $A\subseteq\Omega$.
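This die model is easy to encode directly; a minimal sketch (the names `Omega` and `P` just mirror the notation above, they are not from any library):

```python
from fractions import Fraction

# Sample space and measure for one throw of a fair die:
# P(A) = |A| / 6 for every subset A of Omega.
Omega = {1, 2, 3, 4, 5, 6}

def P(A):
    """Probability of the event A (a subset of Omega)."""
    assert A <= Omega, "events must be subsets of Omega"
    return Fraction(len(A), 6)

print(P({2, 4, 6}))  # probability of an even number: 1/2
print(P(Omega))      # P(Omega) = 1
```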


You're a beginner in probability? In that case, I don't think you should be learning about probability spaces yet. You should be learning about the Kolmogorov axioms, counting techniques, distribution functions, Bayes' theorem, etc.

Anyway, a probability space is often denoted $(\Omega, \mathfrak{F}, P)$, a triple consisting of:

$\Omega$ is the set of possible outcomes. It is nonempty. Here $\Omega = \{H, TH, TTH, TTTH, \dots\}$

$\mathfrak{F}$ is a set of subsets of $\Omega$ that represents "information". The elements of $\mathfrak{F}$ are called "events" meaning we know the probability of any element of $\mathfrak{F}$ (probability of any event). You might want to check this Wikipedia section out.

Sometimes, we can choose $\mathfrak{F} = 2^{\Omega}$, meaning we know the probabilities of each of the subsets of $\Omega$. If $\Omega$ is countable, we can always choose $\mathfrak{F} = 2^{\Omega}$.

$2^{\Omega}$ then is the largest possible $\mathfrak{F}$.

The smallest possible $\mathfrak{F}$ is $\{\emptyset, \Omega\}$, meaning you know only that the probability that nothing happens is zero and the probability that something happens is one. In other words, you don't know the probabilities of any other subsets of $\Omega$; or, as Ygritte would say, you know nothing.

$\mathfrak{F}$ has certain properties:

1. $\Omega \in \mathfrak{F}$. Obviously we know this probability: the probability that anything at all happens is $1$.

2. $A \in \mathfrak{F} \implies \Omega \setminus A \in \mathfrak{F}$. Obviously, if we know the probability of an event happening, we also know the probability that said event doesn't happen. (Note that $A \subseteq \Omega$, since $\mathfrak{F}$ is a set of subsets of $\Omega$.)

3. $A_1, A_2, \dots \in \mathfrak{F} \implies \bigcup_{n=1}^{\infty} A_n \in \mathfrak{F}$. If we know the probabilities of the events $A_1, A_2, \dots$, then we know the probability that at least one of them happens, but not vice versa.
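For a finite $\Omega$ (where countable unions reduce to finite ones), these three properties can be checked mechanically; a small sketch, with names of my own choosing:

```python
def is_sigma_algebra(Omega, F):
    """Check the three sigma-algebra properties for a family F of
    subsets of a finite set Omega (checking pairwise unions suffices,
    since closure under them gives closure under all finite unions)."""
    F = {frozenset(A) for A in F}
    if frozenset(Omega) not in F:                        # property 1
        return False
    if any(frozenset(Omega) - A not in F for A in F):    # property 2
        return False
    if any(A | B not in F for A in F for B in F):        # property 3
        return False
    return True

Omega = {1, 2, 3, 4}
print(is_sigma_algebra(Omega, [set(), Omega]))                  # True
print(is_sigma_algebra(Omega, [set(), Omega, {1, 2}, {3, 4}]))  # True
print(is_sigma_algebra(Omega, [set(), Omega, {1}]))             # False: no complement of {1}
```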

In our case, here are some possible $\mathfrak{F}$'s (check that they satisfy the above properties):

$\mathfrak{F} = \{\emptyset, \Omega, \{H\}, \{TH, TTH, TTTH, \dots\}\}$

$\mathfrak{F} = \{\emptyset, \Omega, \{H, TH\}, \{TTH, TTTH, \dots\}\}$

$\mathfrak{F} = \{\emptyset, \Omega, \{H, TTH\}, \{TH, TTTH, \dots\}\}$

$\mathfrak{F} = \{\emptyset, \Omega, \{H, TTH, TTTTH, \dots\}, \{TH, TTTH, TTTTTH, \dots\}\}$

and of course

$\mathfrak{F} = \{\emptyset, \Omega\}$

$\mathfrak{F} = 2^{\Omega}$

Finally, $P$ is the probability measure, which is the rule by which you compute the probability of events: it assigns a number (the probability) to each event.

It has the properties:

1. $P$ is a mapping from $\mathfrak{F}$ to $[0,1]$. By definition, we can't compute the probability of sets not in $\mathfrak{F}$. (There are generalized theories with negative probabilities, but let's just consider $[0,1]$ for now.)

2. $P(\Omega) = 1$, by assumption.

3. $P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n)$ if $A_1, A_2, \dots \in \mathfrak{F}$ are pairwise disjoint. If we know the probabilities of each of the pairwise disjoint events $A_1, A_2, \dots$, then we know the probability that at least one of them happens.
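Countable additivity can be illustrated numerically for the coin example. In the sketch below (my own encoding, not from the text), the outcome "first heads on toss $n$" is represented by the integer $n$, so $P(\{n\}) = 1/2^n$:

```python
# Events are sets of integers n >= 1, where n encodes
# "first heads on toss n", so P({n}) = 1 / 2**n.
def P(event):
    """Probability of an event, given as a set of integers n >= 1."""
    return sum(2.0 ** -n for n in event)

# Two disjoint events: odd vs even number of tosses (truncated at 99).
odds = set(range(1, 100, 2))
evens = set(range(2, 100, 2))
assert odds.isdisjoint(evens)

print(P(odds | evens))     # essentially P(Omega) = 1
print(P(odds) + P(evens))  # the same value, by additivity
```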

In our case, if we choose $\mathfrak{F} = 2^{\Omega}$, we can say that

$$P(\{H\}) = 1/2,\quad P(\{TH\}) = 1/4,\quad P(\{TTH\}) = 1/8,\ \dots$$

since anything we put inside $P(\cdot)$ is then an element of $\mathfrak{F}$.

It can be seen that $$P(\{H\}^{c}) = P(\{TH, TTH, TTTH, \dots\})$$

$$= P(\{TH\}) + P(\{TTH\}) + P(\{TTTH\}) + \dots = 1/4 + 1/8 + \dots = 1/2$$
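This geometric series is easy to check numerically; a quick sketch:

```python
# P({first heads on toss n}) = 1/2**n; check that the tail sums to 1/2.
p = lambda n: 0.5 ** n

tail = sum(p(n) for n in range(2, 61))  # P(TH) + P(TTH) + ... (truncated)
print(tail)      # approximately 1/2
print(1 - p(1))  # the same value via P(H^c) = 1 - P(H)
```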

If we choose $\mathfrak{F} = \{\emptyset, \Omega, \{H, TTH\}, \{TH, TTTH, \dots\}\}$, then there's no way to assign a value to $P(\{TH\})$, as $\{TH\}$ is not even a valid input for $P$, since $\{TH\} \notin \mathfrak{F}$.

It's helpful to consider random variables when explaining probability spaces.

A random variable can be thought of as a payoff of a game.

Formally, it is a mapping from $\Omega$ to $\mathbb{R}$ such that its "preimages" (see examples below) are events, i.e. elements of $\mathfrak{F}$.

Say we flip two coins.

Let $X$ be the random variable equal to $1$ if the result is two heads and $0$ otherwise. So you gain, say, \$1 if the result is two heads and nothing otherwise.

The outcomes that result in $X=1$: $\{HH\}$

The outcomes that result in $X=0$: $\{TH, HT, TT\}$

Thus, the preimages of $X$ are: $\emptyset,\ \Omega,\ \{HH\},\ \{TH, HT, TT\}$

Now consider the probability space $(\Omega, \mathfrak{F}, P)$

where

$\Omega = \{HH, TH, HT, TT\}$

$\mathfrak{F} = \{\emptyset, \Omega\}$

$X$ is not a "random variable in $(\Omega, \mathfrak{F}, P)$" since some of its preimages are not in $\mathfrak{F}$.

The intuitive explanation: Again, $\{\emptyset, \Omega\}$ is the smallest possible $\sigma$-algebra on $\Omega$. So, if $\mathfrak{F} = \{\emptyset, \Omega\}$, all we know is that $P(\{HH, TH, HT, TT\}) = 1$ and $P(\emptyset) = 0$. We have no idea how the game will turn out.

Now what happens if we rig the game to make sure that the second toss is heads, i.e. that the event $\{HH, TH\}$ will happen? We have $\mathfrak{F_1} = \{\emptyset, \Omega, \{HH, TH\}, \{HT, TT\}\}$

$X$ is also not a "random variable in $(\Omega, \mathfrak{F_1}, P)$": we don't know the probabilities of all the preimages of $X$ given our information (based on our rigging).

However if we let $Y$ be the random variable equal to 1 if the result is $HH$ or $TH$ and $0$ otherwise, then we have:

The outcomes that result in $Y = 1$: $\{HH, TH\}$

The outcomes that result in $Y = 0$: $\{HT, TT\}$

$Y$ is a random variable in $(\Omega, \mathfrak{F_1}, P)$.

$\mathfrak{F_1}$ represents our information under the assumption that the rigging will succeed. Since we have rigged the game to turn up heads on the second toss, we know how it will turn out, so we know the probabilities of the preimages of $Y$: $P(Y = 1) = P(\{HH, TH\}) = 1$ and $P(Y = 0) = P(\{HT, TT\}) = 0$.

If we know the probability of two heads is $0.5$, then we have $(\Omega, \mathfrak{F_2}, P)$ where $\mathfrak{F_2} = \{\emptyset, \Omega, \{HH\}, \{HT, TH, TT\}\}$.

$X$ is a random variable in $(\Omega, \mathfrak{F_2}, P)$.

Of course, $X$ and $Y$ are random variables in $(\Omega, 2^{\Omega}, P)$. If we know the probabilities of all possible events (all subsets of $\Omega$), then we know the probabilities of all the preimages of $X$ and $Y$.
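For finite spaces, the preimage checks above can be automated; a sketch (the helper names `preimages` and `is_measurable` are mine, and `X`, `Y` are the indicator variables from the text):

```python
def preimages(Omega, X):
    """Preimages of a {0,1}-valued X: its level sets,
    together with the empty set and Omega."""
    levels = {}
    for w in Omega:
        levels.setdefault(X(w), set()).add(w)
    return {frozenset(), frozenset(Omega)} | {frozenset(s) for s in levels.values()}

def is_measurable(Omega, F, X):
    """Is X a random variable in (Omega, F, P)?
    I.e., are all its preimages events (elements of F)?"""
    F = {frozenset(A) for A in F}
    return all(A in F for A in preimages(Omega, X))

Omega = {"HH", "TH", "HT", "TT"}
X = lambda w: 1 if w == "HH" else 0          # 1 on two heads
Y = lambda w: 1 if w in {"HH", "TH"} else 0  # 1 when the second toss is heads

F1 = [set(), Omega, {"HH", "TH"}, {"HT", "TT"}]
F2 = [set(), Omega, {"HH"}, {"TH", "HT", "TT"}]

print(is_measurable(Omega, F1, X))  # False: {HH} is not in F1
print(is_measurable(Omega, F1, Y))  # True
print(is_measurable(Omega, F2, X))  # True
```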



I will try to explain it in very simple words:

Describing the probability space means listing each possible event and its probability.

Let's do this for the specific question at hand:

  • Event #$1$:
    • You toss the coin once and it lands on heads
    • The probability of this event is $\frac{1}{2}$
  • Event #$2$:
    • You toss the coin once and it lands on tails
    • You toss the coin again and it lands on heads
    • The probability of this event is $\frac{1}{2}\cdot\frac{1}{2}=\frac{1}{4}$
  • $\dots$
  • Event #$k$:
    • You toss the coin $k-1$ times and it lands on tails each time
    • You toss the coin again and it lands on heads
    • The probability of this event is $\frac{1}{2^{k-1}}\cdot\frac{1}{2}=\frac{1}{2^{k}}$
  • $\dots$

Of course, you have an infinite number of events in this probability space.

You may also verify that the sum of the probabilities of these events is equal to $1$:

$$\sum\limits_{n=1}^{\infty}P(\text{number of tosses}=n)= \sum\limits_{n=1}^{\infty}\frac{1}{2^{n}}=1$$
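These probabilities can also be checked by simulation; a quick Monte Carlo sketch (seeded for reproducibility):

```python
import random

random.seed(0)

def tosses_until_heads():
    """Flip a fair coin until it lands heads; return the number of tosses."""
    n = 1
    while random.random() >= 0.5:  # >= 0.5 counts as tails
        n += 1
    return n

N = 100_000
counts = {}
for _ in range(N):
    n = tosses_until_heads()
    counts[n] = counts.get(n, 0) + 1

for n in (1, 2, 3):
    print(n, counts[n] / N, 1 / 2 ** n)  # empirical frequency vs theoretical 1/2^n
```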