Suppose $(\Omega, \mathcal{F}, \mathbb{P})$ is a [[probability|probability space]] and $\mathcal{G} \subset \mathcal{F}$ is a sub-[[σ-algebra]]. For an event $A \in \mathcal{F}$, the **conditional probability of $A$ given $\mathcal{G}$** is the $\mathcal{G}$-[[measurable function|measurable]] [[random variable]] $Z: \Omega \to [0,1]$ satisfying $\int _{G} Z \, d\mathbb{P}=\mathbb{P}(A \cap G) \text{ for all }G \in \mathcal{G}.$
We write $Z=\mathbb{P}(A |\mathcal{G})$. It is unique up to modification [[almost-everywhere|on null sets]].
If $B \in \mathcal{F}$ is an event, $\sigma(B)$-measurability implies $\mathbb{P}(A |\sigma(B))$ is constant on $B$ and on $B^{c}$, so $\mathbb{P}(A |\sigma(B))=\underbrace{ c_{1} }_{ = : \mathbb{P}(A |B)}\boldsymbol 1_{B}+ \underbrace{ c_{0} }_{ =:\mathbb{P}(A |B^{c}) }\boldsymbol 1_{B^{c}} \text{ a.s.}$
When $\mathbb{P}(B)>0$ and $\mathbb{P}(B^{c})>0$, we can [[integral|integrate]] both sides over $B$ resp. $B^{c}$ and divide by $\mathbb{P}(B)$ resp. $\mathbb{P}(B^{c})$ to recover the classical definitions $\mathbb{P}(A |B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} \text{ and } \mathbb{P}(A |B^{c})= \frac{\mathbb{P}(A \cap B^{c})}{\mathbb{P}(B^{c})}.$
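A minimal numerical sketch of the defining property, on a hypothetical finite space (two fair dice and events chosen purely for illustration): the $\sigma(B)$-measurable $Z$ built from the classical conditional probabilities does satisfy $\int_G Z \, d\mathbb{P} = \mathbb{P}(A \cap G)$ for every $G \in \sigma(B)$.

```python
import itertools

# Hypothetical finite space: two fair dice, Ω = {1,…,6}², uniform P.
omega = list(itertools.product(range(1, 7), repeat=2))
P = {w: 1 / 36 for w in omega}

def prob(E):
    return sum(P[w] for w in E)

A = {w for w in omega if w[0] + w[1] == 7}  # "the sum is 7"
B = {w for w in omega if w[0] == 3}         # "the first die shows 3"
Bc = set(omega) - B

# Classical definitions, as recovered above.
p_A_given_B = prob(A & B) / prob(B)
p_A_given_Bc = prob(A & Bc) / prob(Bc)

# Z = P(A | σ(B)) is constant on B and on B^c; the defining property
# requires ∫_G Z dP = P(A ∩ G) for every G ∈ σ(B) = {∅, B, B^c, Ω}.
Z = {w: (p_A_given_B if w in B else p_A_given_Bc) for w in omega}
for G in (set(), B, Bc, set(omega)):
    assert abs(sum(Z[w] * P[w] for w in G) - prob(A & G)) < 1e-12
```

Here $Z$ happens to be constant ($=1/6$) on all of $\Omega$, since the sum being $7$ is independent of the first die.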
Next consider a [[random variable]] $X:(\Omega, \mathcal{F}) \to (E, \mathcal{E})$. A **(regular) conditional distribution** of $X$ given $\mathcal{G}$ is a [[transition kernel|probability kernel]] $k=\mathbb{P}_{X |\mathcal{G}}: (\Omega, \mathcal{G}) \to (E , \mathcal{E})$ that [[good measurable space|disintegrates]] $\mathbb{P}$ over $\mathcal{G}$, in the sense that [[category|in]] $\mathsf{Stoch}$ one has $\mu^{\mathcal{G}}=\mathbb{P} |_{\mathcal{G}} \ltimes k,$
where $\mu^{\mathcal{G}}=\mathbb{P}_{*}(\id_{(\Omega, \mathcal{G}) \to (\Omega, \mathcal{G})} \times X)$. Explicitly, for all $G \in \mathcal{G}$ and $A \in \mathcal{E}$, $\int _{G} \, k(\omega, A)\, d\mathbb{P}(\omega)=\mathbb{P}( G \cap \{ X \in A \}).$
Taking $\mathcal{G}=\sigma(Y)$ for $Y:(\Omega, \mathcal{F}) \to (S, \mathcal{S})$ [[σ-algebra generated by a set collection|another]] [[random variable]], we obtain the notion of **(regular) conditional distribution $\mathbb{P}_{X |Y}$ of $X$ given $Y$**. For [[good measurable space|good state spaces]] $(E, \mathcal{E})$ the required disintegration always exists.
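On a finite space the kernel $\mathbb{P}_{X|Y}$ can be written down by hand, and the disintegration identity checked directly. A sketch under hypothetical choices (two fair dice, $X$ the sum, $Y$ the first die): $\sigma(Y)$-measurability forces $k(\omega,\cdot)$ to depend on $\omega$ only through $Y(\omega)$, so $k(\omega,\cdot)$ is just the elementary conditional distribution of $X$ on the fiber $\{Y = Y(\omega)\}$.

```python
import itertools
from fractions import Fraction

# Hypothetical finite setup: Ω = {1,…,6}², X = sum of the dice, Y = first die.
omega = list(itertools.product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}
X = lambda w: w[0] + w[1]
Y = lambda w: w[0]

def kernel(w):
    """k(ω, ·) = P_{X|Y}(ω, ·): the distribution of X on the fiber {Y = Y(ω)}."""
    fiber = [v for v in omega if Y(v) == Y(w)]
    mass = sum(P[v] for v in fiber)
    dist = {}
    for v in fiber:
        dist[X(v)] = dist.get(X(v), Fraction(0)) + P[v] / mass
    return dist  # a probability measure on E = {2, …, 12}

# Disintegration: ∫_G k(ω, A) dP(ω) = P(G ∩ {X ∈ A}) for G ∈ σ(Y), A ∈ E.
G = {w for w in omega if Y(w) in {1, 2}}  # a set in σ(Y)
A = {7, 11}                               # an event in E
lhs = sum(sum(p for x, p in kernel(w).items() if x in A) * P[w] for w in G)
rhs = sum(P[w] for w in G if X(w) in A)
assert lhs == rhs
```

Using `Fraction` keeps the check exact rather than relying on floating-point tolerance.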
> [!definition]
> Suppose $(\Omega, \mathcal{F}, \mathbb{P})$ is a [[probability|probability space]] and $B \in \mathcal{F}$ is an [[probability|event]] with $\mathbb{P}(B)>0$. Define a new [[probability|probability measure]] $\mu=\mathbb{P}(\cdot | B)$ on $\mathcal{F}$ by $\mu(A)=\mathbb{P}(A |B)=\frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}.$
> We call $\mathbb{P}(A |B)$ the **conditional probability that $A$ occurs given that $B$ occurs**. We say
> - "B **attracts** A" if $\mathbb{P}(A \vert B)>\mathbb{P}(A)$;
>- "B **repels** A" if $\mathbb{P}(A \vert B)<\mathbb{P}(A)$.
>
> With $\sigma(B)=\{ B, B^{c}, \Omega, \emptyset \}$ the [[σ-algebra]] [[σ-algebra generated by a set collection|generated by]] $B$, we define $\begin{align}
> \mathbb{P}\big(A | \sigma(B)\big) (\omega)&:=\mathbb{P}(A | B)\boldsymbol 1_{B}(\omega) + \mathbb{P}(A |B^{c})\boldsymbol 1_{B^{c}}(\omega) \\
> &= \begin{cases}
> \mathbb{P}(A |B) & \omega \in B \\
> \mathbb{P}(A | B^{c}) & \omega \not \in B.
> \end{cases}
> \end{align}$ This gives rise to:
> - For fixed $\omega \in \Omega$, a [[probability|probability measure]] $\mathbb{P} \big( \cdot | \sigma(B) \big)(\omega)$.
> - For fixed $A \in \mathcal{F}$, a [[random variable]] $\mathbb{P}\big( A | \sigma(B) \big)(\cdot)$ taking two values.
>
> The latter point of view is what brings probability to life (and sets it apart from analysis).
^ interpretation with [[transition kernel]]?
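The two-values picture, and the attracts/repels terminology, in a toy numerical sketch (one fair die; events chosen hypothetically so that $B$ attracts $A$): for fixed $A$, the random variable $\omega \mapsto \mathbb{P}(A|\sigma(B))(\omega)$ takes exactly two values, and its integral over $\Omega$ recovers $\mathbb{P}(A)$.

```python
from fractions import Fraction

# Hypothetical toy choice: one fair die, A = "even", B = "at least 4".
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
A, B = {2, 4, 6}, {4, 5, 6}
Bc = set(omega) - B

prob = lambda E: sum(P[w] for w in E)
p_A_given_B = prob(A & B) / prob(B)     # 2/3 > P(A) = 1/2, so B attracts A
p_A_given_Bc = prob(A & Bc) / prob(Bc)  # 1/3, so B^c repels A

# Fixed A, varying ω: the random variable ω ↦ P(A | σ(B))(ω), two values.
Z = {w: (p_A_given_B if w in B else p_A_given_Bc) for w in omega}
assert sorted(set(Z.values())) == [Fraction(1, 3), Fraction(2, 3)]

# Integrating Z over all of Ω recovers P(A) (the defining property with G = Ω).
assert sum(Z[w] * P[w] for w in omega) == prob(A)
```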
> [!basicproperties]
> - $\mathbb{P}(A\, | \,B)=\mathbb{P}(A)$ if and only if $A,B$ are [[independent|independent events]].
>
> > [!proof]- Proof.
> > If $A,B$ are [[independent|independent events]], then $\begin{align}
> > \mathbb{P}(A\, | \,B)= \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}=\mathbb{P}(A) \mathbb{P}(B)/\mathbb{P}(B)=\mathbb{P}(A).
> > \end{align}$
> > Conversely, if $\mathbb{P}(A)=\mathbb{P}(A \, | \,B)$ then $\mathbb{P}(A \cap B)=\mathbb{P}(A \, | \,B)\mathbb{P}(B)=\mathbb{P}(A)\mathbb{P}(B),$ so $A, B$ are independent.
> >
>
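A quick numerical instance of the equivalence, with a hypothetical independent pair on one fair die:

```python
from fractions import Fraction

# Hypothetical check: on one fair die, A = {1, 2} and B = "odd" are independent.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
A, B = {1, 2}, {1, 3, 5}

prob = lambda E: sum(P[w] for w in E)

# Both sides of the equivalence hold simultaneously:
assert prob(A & B) / prob(B) == prob(A)   # P(A|B) = P(A)
assert prob(A & B) == prob(A) * prob(B)   # A, B independent
```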
> [!intuition]
> How do we quantify statements like 'If it rains tomorrow, then the probability of the bus being late is $p$'?