Suppose $(\Omega, \mathcal{F}, \mathbb{P})$ is a [[probability|probability space]] and $\mathcal{G} \subset \mathcal{F}$ is a sub-[[σ-algebra]]. For an event $A \in \mathcal{F}$, the **conditional probability of $A$ given $\mathcal{G}$** is a $\mathcal{G}$-[[measurable function|measurable]] [[random variable]] $Z: \Omega \to [0,1]$ satisfying $\int _{G} Z \, d\mathbb{P}=\mathbb{P}(A \cap G) \text{ for all }G \in \mathcal{G}.$ We write $Z=\mathbb{P}(A |\mathcal{G})$; it is unique up to modification [[almost-everywhere|on null sets]].

If $B \in \mathcal{F}$ is an event, $\sigma(B)$-measurability implies that $\mathbb{P}(A |\sigma(B))$ is constant on $B$ and on $B^{c}$, so $\mathbb{P}(A |\sigma(B))=\underbrace{ c_{1} }_{ =: \mathbb{P}(A |B)}\boldsymbol 1_{B}+ \underbrace{ c_{0} }_{ =:\mathbb{P}(A |B^{c}) }\boldsymbol 1_{B^{c}} \text{ a.s.}$ When $\mathbb{P}(B)>0$, we can [[integral|integrate]] both sides over $B$ resp. $B^{c}$ and divide by $\mathbb{P}(B)$ resp. $\mathbb{P}(B^{c})$ to recover the classical definitions $\mathbb{P}(A |B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} \text{ and } \mathbb{P}(A |B^{c})= \frac{\mathbb{P}(A \cap B^{c})}{\mathbb{P}(B^{c})}.$

Next consider a [[random variable]] $X:(\Omega, \mathcal{F}) \to (E, \mathcal{E})$. A **(regular) conditional distribution** of $X$ given $\mathcal{G}$ is a [[transition kernel|probability kernel]] $k=\mathbb{P}_{X |\mathcal{G}}: (\Omega, \mathcal{G}) \to (E , \mathcal{E})$ that [[good measurable space|disintegrates]] $\mathbb{P}$ over $\mathcal{G}$, in the sense that [[category|in]] $\mathsf{Stoch}$ one has $\mu^{\mathcal{G}}=\mathbb{P} |_{\mathcal{G}} \ltimes k,$ where $\mu^{\mathcal{G}}=\mathbb{P}_{*}(\mathrm{id}_{(\Omega, \mathcal{G})} \times X)$.
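On a finite sample space the defining integrals $\int_{G} Z \, d\mathbb{P}=\mathbb{P}(A \cap G)$ reduce to sums, so the characterization can be checked directly. A minimal Python sketch (the die, the events $A, B$, and the uniform measure are illustrative assumptions, not from the text):

```python
from fractions import Fraction

# Uniform die: Ω = {1,...,6}, P({ω}) = 1/6 (illustrative choice).
omega = set(range(1, 7))
P = {w: Fraction(1, 6) for w in omega}

A = {2, 4, 6}   # "even"
B = {1, 2, 3}   # conditioning event; σ(B) = {∅, B, Bᶜ, Ω}
Bc = omega - B

def prob(S):
    return sum(P[w] for w in S)

# Z = P(A|B)·1_B + P(A|Bᶜ)·1_{Bᶜ}, a version of P(A | σ(B)).
c1 = prob(A & B) / prob(B)    # =: P(A|B)
c0 = prob(A & Bc) / prob(Bc)  # =: P(A|Bᶜ)
Z = {w: (c1 if w in B else c0) for w in omega}

# Check ∫_G Z dP = P(A ∩ G) for every G ∈ σ(B).
for G in [set(), B, Bc, omega]:
    assert sum(Z[w] * P[w] for w in G) == prob(A & G)

print(c1, c0)  # prints: 1/3 2/3
```

Since $\sigma(B)$ has only four members, $\mathcal{G}$-measurability of $Z$ is exactly the constancy on $B$ and $B^{c}$ used above.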
Explicitly, for all $G \in \mathcal{G}$ and $A \in \mathcal{E}$, $\int _{G} \, k(\omega, A)\, d\mathbb{P}(\omega)=\mathbb{P}( G \cap \{ X \in A \}).$ Taking $\mathcal{G}=\sigma(Y)$ for [[σ-algebra generated by a set collection|another]] [[random variable]] $Y:(\Omega, \mathcal{F}) \to (S, \mathcal{S})$, we obtain the notion of the **(regular) conditional distribution $\mathbb{P}_{X |Y}$ of $X$ given $Y$**. For [[good measurable space|good state spaces]] $(E, \mathcal{E})$ the required disintegration always exists.

> [!definition]
> Suppose $(\Omega, \mathcal{F}, \mathbb{P})$ is a [[probability|probability space]] and $B \in \mathcal{F}$ is an [[probability|event]] with $\mathbb{P}(B)>0$. Define a new [[probability|probability measure]] $\mu=\mathbb{P}(\cdot | B)$ on $\mathcal{F}$ by $\mu(A)=\mathbb{P}(A |B)=\frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}.$ We call $\mathbb{P}(A |B)$ the **conditional probability that $A$ occurs given that $B$ occurs**. We say
> - "$B$ **attracts** $A$" if $\mathbb{P}(A \vert B)>\mathbb{P}(A)$;
> - "$B$ **repels** $A$" if $\mathbb{P}(A \vert B)<\mathbb{P}(A)$.
>
> With $\sigma(B)=\{ B, B^{c}, \Omega, \emptyset \}$ the [[σ-algebra]] [[σ-algebra generated by a set collection|generated by]] $B$, we define
> $\begin{align}
> \mathbb{P}\big(A | \sigma(B)\big) (\omega)&:=\mathbb{P}(A | B)\boldsymbol 1_{B}(\omega) + \mathbb{P}(A |B^{c})\boldsymbol 1_{B^{c}}(\omega) \\
> &= \begin{cases}
> \mathbb{P}(A |B) & \omega \in B \\
> \mathbb{P}(A | \text{not } B) & \omega \not \in B.
> \end{cases}
> \end{align}$
> This gives rise to:
> - For fixed $\omega \in \Omega$, a [[probability|probability measure]] $\mathbb{P} \big( \cdot | \sigma(B) \big)(\omega)$.
> - For fixed $A \in \mathcal{F}$, a [[random variable]] $\mathbb{P}\big( A | \sigma(B) \big)(\cdot)$ taking two values.
>
> The latter point of view is what brings probability alive (and sets it apart from analysis).

*Interpretation via [[transition kernel]]?*
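When everything is finite, the kernel $\mathbb{P}_{X|Y}$ is just the table of conditional probabilities $k(\omega, \cdot)=\mathbb{P}(X \in \cdot \,|\, Y = Y(\omega))$, and the disintegration identity can be verified by summation over the fibres $\{Y=y\}$ that generate $\sigma(Y)$. A hedged Python sketch (the joint distribution below is an arbitrary illustrative choice):

```python
from fractions import Fraction
from itertools import product

# Ω = pairs (x, y); X and Y are the coordinate projections (illustrative).
joint = {  # P({(x, y)})
    (0, "a"): Fraction(1, 8), (1, "a"): Fraction(3, 8),
    (0, "b"): Fraction(2, 8), (1, "b"): Fraction(2, 8),
}
X = lambda w: w[0]
Y = lambda w: w[1]

def prob(pred):
    return sum(p for w, p in joint.items() if pred(w))

# Kernel k(ω, A) = P(X ∈ A | Y = Y(ω)): σ(Y)-measurable in ω, i.e.
# constant on each fibre {Y = y}, and a probability measure in A.
def k(w, A):
    y = Y(w)
    return prob(lambda v: Y(v) == y and X(v) in A) / prob(lambda v: Y(v) == y)

# Check ∫_G k(ω, A) dP(ω) = P(G ∩ {X ∈ A}) on the generators G = {Y = y};
# both sides are additive in G, so this suffices for all G ∈ σ(Y).
for y, A in product(["a", "b"], [set(), {0}, {1}, {0, 1}]):
    lhs = sum(p * k(w, A) for w, p in joint.items() if Y(w) == y)
    assert lhs == prob(lambda v: Y(v) == y and X(v) in A)

print("disintegration verified on the generators of σ(Y)")
```

For uncountable $E$ this table no longer exists pointwise, which is exactly why good state spaces are needed for the kernel to exist.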
> [!basicproperties]
> - $\mathbb{P}(A\, | \,B)=\mathbb{P}(A)$ if and only if $A,B$ are [[independent|independent events]].
>
> > [!proof]- Proof.
> > If $A,B$ are [[independent|independent events]], then
> > $\begin{align}
> > \mathbb{P}(A\, | \,B)= \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}=\frac{\mathbb{P}(A) \mathbb{P}(B)}{\mathbb{P}(B)}=\mathbb{P}(A).
> > \end{align}$
> > Conversely, if $\mathbb{P}(A)=\mathbb{P}(A \, | \,B)$, then $\mathbb{P}(A \cap B)=\mathbb{P}(A \, | \,B)\mathbb{P}(B)=\mathbb{P}(A)\mathbb{P}(B).$

> [!intuition]
> How do we quantify statements like "If it rains tomorrow, then the probability of the bus being late is $p$"?
>
> Repeat an [[experiment]] $N$ times, and on each occasion observe the occurrence or non-occurrence of two [[event]]s $A$ and $B$. Now suppose we take an interest in *only* those trials on which $B$ occurs; all others are disregarded. There are $N(B)$ such trials.
>
> The proportion of these $N(B)$ trials on which $A$ also occurs is $\frac{N(A \cap B)}{N(B)}$. Passing from frequencies to probabilities, it is reasonable to make the above definition.
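The frequency heuristic in the intuition box is easy to simulate: run the experiment $N$ times, discard the trials on which $B$ fails, and compare the empirical ratio $N(A \cap B)/N(B)$ with the classical formula $\mathbb{P}(A \cap B)/\mathbb{P}(B)$. A minimal sketch, with die-roll events chosen purely for illustration:

```python
import random

random.seed(0)
N = 100_000

# One trial: roll a fair die; A = "even", B = "at most 3" (illustrative).
n_B = n_AB = 0
for _ in range(N):
    roll = random.randint(1, 6)
    in_A, in_B = roll % 2 == 0, roll <= 3
    n_B += in_B
    n_AB += in_A and in_B

empirical = n_AB / n_B        # N(A ∩ B) / N(B)
exact = (1 / 6) / (3 / 6)     # P(A ∩ B) / P(B) = 1/3
print(f"empirical {empirical:.3f} vs exact {exact:.3f}")
```

By the law of large numbers the empirical ratio converges to $\mathbb{P}(A\,|\,B)$ as $N \to \infty$, which is the content of the heuristic.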