----
> [!definition] Definition. ([[expectation]])
> Suppose $(\Omega, \mathcal{F}, \mathbb{P})$ is a [[probability|probability space]]. [[Lp-norm|If]] $X \in \mathcal{L}^{1}(\mathbb{P})$, then the **expectation** or **expected value** of the [[random variable]] $X$ is defined by the [[integral]] $\int _{\Omega} X \, d\mathbb{P}. $
> It is variously denoted $\mathbb{E}X$, $\mathbb{E}[X]$, $\mathbb{E}(X)$.
^definition
> [!basicproperties]
> - If $X,Y$ are both integrable random variables (i.e. $X,Y \in \mathcal{L}^{1}(\mathbb{P})$), then $\mathbb{E}(cX+Y)=c\mathbb{E}X+\mathbb{E}Y$ for all $c \in \mathbb{R}$, by linearity of the [[integral]].
> - [[Expectation is multiplicative for independent random variables]]
^properties
----
####
#analysis/probability-statistics
## Motivation: Chuck-a-Luck
*You bet on an integer between 1 and 6 and roll 3 fair dice. If your number comes up 3 times, you get \$3; twice, \$2; once, \$1; if it comes up 0 times, you pay \$1 (get $-\$1$).*
Let $X$ be the payout [[random variable]]. What values can $X$ take?
![[CleanShot 2022-09-25 at 15.44.11.jpg]]
We say that the [[expectation]] of $X$, $E(X)$, is $E(X) = 3\cdot \frac{1}{216} + 2 \cdot \frac{15}{216} + 1\cdot \frac{75}{216} - 1\cdot \frac{125}{216} = -\frac{17}{216} \approx -0.079.$
What does it mean? We can give a *frequency interpretation*: If you play $216$ times, you expect to get
![[CleanShot 2022-09-25 at 15.48.32.jpg]],
i.e., you expect to lose \$17 if you play 216 times, averaging out to a loss of $17/216 \approx \$0.079$, about 8 cents per game!
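The frequency interpretation can be checked by exact enumeration of the $6^3 = 216$ equally likely rolls (a minimal sketch; the function name is mine, not from the note):

```python
from fractions import Fraction
from itertools import product

def chuck_a_luck_expectation(pick=1):
    """Exact expected payout: enumerate all 6^3 equally likely rolls."""
    total = Fraction(0)
    for roll in product(range(1, 7), repeat=3):
        matches = roll.count(pick)
        # payout is the number of matches, or -1 if the number never comes up
        payout = matches if matches > 0 else -1
        total += Fraction(payout, 6**3)
    return total

print(chuck_a_luck_expectation())  # -17/216
```

By symmetry the answer does not depend on which number you pick.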
# 2-Stage Definition
## Stage 1
We say that a [[random variable]] $X: \Omega \to \mathbb{R}$ is *simple* if it takes on finitely many values (e.g. $3,2,1,-1$). For a simple $X$ taking values $a_1, \dots, a_n$, we define $E(X) = \sum_{i=1}^n a_iP(X=a_i)$
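The Stage 1 formula is a finite weighted sum, so it translates directly to code. A sketch (helper name and dict representation are my own choices), applied to the Chuck-a-Luck payout from the motivation section:

```python
def expectation_simple(dist):
    """E(X) = sum of a * P(X = a) over the finitely many values a of a
    simple random variable, given as a dict {value: probability}."""
    return sum(a * p for a, p in dist.items())

# Chuck-a-Luck payout distribution: values 3, 2, 1, -1 with these probabilities
X = {3: 1/216, 2: 15/216, 1: 75/216, -1: 125/216}
print(expectation_simple(X))  # about -0.0787
```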
### Properties
- If $X \geq 0$ then $E(X) \geq 0$
- $E(bX) = bE(X)$
- $E(X+Y) = E(X)+E(Y)$ (not trivial!)
## Stage 2
Let $X: \Omega \to \mathbb{R}$ be a [[random variable]]. We say that $X$ is *integrable* if $\exists$ a [[sequence]] $X_n: \Omega \to \mathbb{R}$ of **simple** random variables s.t. the following holds:
1. $P\{\omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\} = 1$
2. For any $\varepsilon > 0, \exists N = N(\varepsilon) \in \mathbb{N}$ s.t. if $m,n \geq N$ then $E \vert X_m - X_n \vert \leq \varepsilon$
**If this happens,** then $\lim_{n \to \infty} E(X_n)$ exists and *does not* depend on the choice of $(X_n)$; we define $E(X) = \lim_{n \to \infty}E(X_n)$
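To see the definition in action, one standard choice of approximating simple random variables is the dyadic truncation $X_n = \lfloor 2^n X \rfloor / 2^n$: each $X_n$ is simple, $X_n \to X$ pointwise, and $\vert X_m - X_n \vert \leq 2^{-\min(m,n)}$ for bounded $X$, so condition 2 holds. A numeric sketch (my own example, not from the note) for $X = U^2$ with $U$ uniform on $[0,1]$, where $E(X_n)$ is computed by a Riemann sum:

```python
import math

def E_simple_approx(n, grid=100_000):
    """E(X_n) for the simple random variable X_n = floor(2^n * X) / 2^n,
    where X = U^2 and U ~ Uniform(0,1); computed by a midpoint Riemann sum."""
    h = 1.0 / grid
    return sum(math.floor(2**n * ((k + 0.5) * h) ** 2) / 2**n
               for k in range(grid)) * h

for n in (1, 4, 8, 12):
    print(n, E_simple_approx(n))  # increases toward E(X) = E(U^2) = 1/3
```

The values $E(X_n)$ increase to $1/3 = \int_0^1 t^2\,dt$, matching the limit the definition promises.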
### Properties
- If $X \geq 0$ then $E(X) \geq 0$
- $E(bX) = bE(X)$ $\ \forall b \in \mathbb{R}$
- $E(X+Y) = E(X) + E(Y)$ (if $X,Y$ integrable, then so is $X+Y$)
## What happens if $X$ is a [[continuous random variable]]?
- Suppose $X$ has [[probability density function]] $f_X(t)$. Then $X \text{ is integrable} \iff \int_{-\infty}^\infty \vert t \vert f_X(t)\,dt < +\infty$, and in that case $E(X)=\int_{-\infty}^\infty tf_X(t)\,dt.$
- ([[TODO]]: more examples)
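As one concrete example (my own, not from the note): for $X \sim \text{Exponential}(\lambda)$ with density $f_X(t) = \lambda e^{-\lambda t}$ on $[0,\infty)$, the formula gives $E(X) = \int_0^\infty t\lambda e^{-\lambda t}\,dt = 1/\lambda$. A numeric sanity check via a midpoint Riemann sum over a truncated support:

```python
import math

def expectation_continuous(f, a, b, steps=100_000):
    """Approximate E(X) = integral of t * f(t) dt by a midpoint Riemann sum.
    f is the density; [a, b] truncates the (possibly infinite) support."""
    h = (b - a) / steps
    return sum((a + (k + 0.5) * h) * f(a + (k + 0.5) * h)
               for k in range(steps)) * h

lam = 2.0
exp_density = lambda t: lam * math.exp(-lam * t)  # Exponential(lambda) density
print(expectation_continuous(exp_density, 0.0, 40.0))  # about 1/lam = 0.5
```

The tail beyond $t = 40$ contributes on the order of $e^{-80}$, so the truncation is harmless here.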
# Composition Property
If $X$ is a *simple* [[random variable]] taking on values $a_1, \dots, a_n$, and if $g: \mathbb{R} \to \mathbb{R}$ is a function, then $g(X)$ is also a [[random variable]] and $E(g(X)) = \sum_{i=1}^n g(a_i)P(X=a_i).$
If $X$ is a *[[continuous random variable]]* with [[probability density function]] $f_X(t)$, and if $g: \mathbb{R} \to \mathbb{R}$ is [[borel measurable|Borel measurable]], then $g(X)$ is a [[random variable]] and $E(g(X)) = \int_{-\infty}^\infty g(t)f_X(t)\,dt,$ provided the [[integral]] [[converge]]s absolutely (see [[absolute convergence]]).
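The continuous composition formula can be sketched the same way (helper name is mine). For $X \sim \text{Exponential}(\lambda)$ and $g(t) = t^2$, the formula gives $E(X^2) = \int_0^\infty t^2 \lambda e^{-\lambda t}\,dt = 2/\lambda^2$:

```python
import math

def lotus(g, f, a, b, steps=100_000):
    """E(g(X)) = integral of g(t) * f(t) dt, approximated by a midpoint
    Riemann sum over a truncated support [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (k + 0.5) * h) * f(a + (k + 0.5) * h)
               for k in range(steps)) * h

lam = 2.0
f = lambda t: lam * math.exp(-lam * t)  # Exponential(2) density
print(lotus(lambda t: t * t, f, 0.0, 40.0))  # E(X^2) = 2 / lam^2 = 0.5
```

Note that no density for $g(X)$ itself is ever needed, which is the whole point of the composition property.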
# Computing Expectation
*Important skill: being able to compute expectation by writing [[random variable]]s as [[linear combination]]s of [[indicator random variable]]s of [[event]]s*... *often it's much easier to get the expectation than it is to get the [[probability]]!*
- EXAMPLE: see [[the problem of letters#With Expectation]]
- EXAMPLE: counting records, below:
### Counting Records
Let $a_1, \dots, a_n$ be a [[permutation]] of the numbers from $1$ to $n$. We say that $a_k$ is a *record* of that [[permutation]] if $a_k > a_i$ for all $i<k$. ![[CleanShot 2022-10-01 at [email protected]]]
Let $X$ be the *number of records* in the random [[permutation]]. **What's $E(X)?$**
The trick is to *split $X$ into [[indicator random variable]]s*: $X_k = \begin{cases} 1 & \text{ if $a_k$ is a record} \\ 0 & \text{ otherwise} \end{cases}.$
Then the number of records is $X = X_1 + \dots + X_n$ and therefore $E(X) = E(X_1) + \dots + E(X_n)$. For a given $X_k$ we have $P(X_k=1) = \frac{1}{k}$ (by symmetry, $a_k$ is the largest of $a_1, \dots, a_k$ with probability $\frac{1}{k}$), so we're left with $E(X) = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} = H_n \approx \ln n + \gamma,$
where $\gamma$ is the [[euler-mascheroni constant]]. (As seen in [[math 297]]!)
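The harmonic-number answer is easy to confirm by simulation (a quick sketch; function names are mine, not from the note):

```python
import random

def count_records(perm):
    """A position k is a record when perm[k] beats everything before it."""
    best, records = float("-inf"), 0
    for a in perm:
        if a > best:
            best, records = a, records + 1
    return records

def expected_records_exact(n):
    """E(X) = H_n = 1 + 1/2 + ... + 1/n, from the indicator decomposition."""
    return sum(1 / k for k in range(1, n + 1))

random.seed(0)
n, trials = 100, 20_000
sim = sum(count_records(random.sample(range(n), n)) for _ in range(trials)) / trials
print(sim, expected_records_exact(n))  # both near ln(100) + gamma, about 5.19
```

With $n = 100$ the simulated mean and $H_{100} \approx 5.187$ agree to a couple of decimal places.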
# Also see:
- [[conditional expectation]]
#notFormatted
----
#### References
> [!backlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM [[]]
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```
> [!frontlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM outgoing([[]])
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```