----
> [!definition] Definition. ([[Erdos-Renyi random graph model]])
> The **Erdos-Renyi random graph model** refers to one of two (closely related) models for generating [[simple graph|simple]] [[random graph|random graphs]] of $n$ nodes.
> ## Fixed number of edges $m$: $G(n,m)$
> In the $G(n,m)$ model, a [[network]] is chosen [[random variable|uniformly at random]] from the collection of all [[network]]s having $n$ labeled nodes and $m$ edges.
> ## Edges exist with [[probability distribution|probability]] $p$: $G(n,p)$
> In the $G(n,p)$ model, a [[network]] is constructed by connecting labeled nodes at random: each edge is included in the [[network]] with [[probability distribution|probability]] $p$, [[independent|independently]] of every other edge. That is, the probability of generating any particular graph with $n$ nodes and $m$ edges is $\overbrace{p^{m}}^{p\text{($m$ edges present)}} \times \overbrace{(1-p)^{{n \choose 2} - m}}^{p\text{(no other edges present)}}.$
>
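Both models are straightforward to sample directly. Below is a minimal stdlib-Python sketch; the helper names `sample_gnp` and `sample_gnm` are ours, not from any library:

```python
import random
from itertools import combinations

def sample_gnp(n: int, p: float, rng: random.Random) -> set[frozenset[int]]:
    """G(n, p): include each of the C(n, 2) possible edges
    independently with probability p."""
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                edges.add(frozenset((i, j)))
    return edges

def sample_gnm(n: int, m: int, rng: random.Random) -> set[frozenset[int]]:
    """G(n, m): choose m edges uniformly at random from the
    collection of all C(n, 2) possible edges."""
    pairs = [frozenset(e) for e in combinations(range(n), 2)]
    return set(rng.sample(pairs, m))
```

For instance, `sample_gnp(5, 1.0, rng)` returns the complete graph $K_{5}$ (all $10$ edges), while `sample_gnm(5, 4, rng)` always returns exactly $4$ edges.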
> [!basicproperties] The structure of $G(n,p)$.
> 1. The [[expectation|expected]] number of edges in $G(n,p)$ is ${n \choose 2}p$;
> 2. The [[expectation|expected]] [[mean degree]] is $c:=(n-1)p$;
> 3. $G(n,p)$ has a [[binomial random variable|binomial]] [[degree distribution]] in general, which under sparsity assumptions becomes [[Poisson random variable|Poisson]] with parameter $c$ in the [[function limit]] of large $n$;
> 4. The [[clustering coefficient]] is (trivially) $C:= \frac{c}{n-1}$;[^1]
> 5. [[characterization of giant component in Erdos-Renyi random graph model]]
> 6. [[small components of Erdos-Renyi random graph model are trees]]
> 7. [[size of small components in Erdos-Renyi random graph model]] (nuanced)
> 8. [[the small-world effect in the Erdos-Renyi random graph model]]
> [!proof] Proof of Properties.

We first compute the [[expectation|expected]] number of edges in $G(n,p)$. Define a [[random variable]] $X=\text{``number of edges''}$; we want $\mathbb{E}[X]$. For each pair $i,j \in [n]$ define an [[indicator random variable]] $X_{ij}=\begin{cases}1 & \text{edge exists between nodes }i,j \\ 0 & \text{otherwise.}\end{cases}$

Then $2X=\sum_{i \neq j}^{} X_{ij}$, since the sum over ordered pairs counts each edge twice and so has $n(n-1)$ terms. Using [[expectation|linearity of expectation]] we obtain $\begin{align}
2\mathbb{E}[X]= & \mathbb{E}\left[ \sum_{i \neq j}^{} X_{ij} \right] \\
= & \sum_{i \neq j}^{} \mathbb{E} [X_{ij}] \\
= & \sum_{i \neq j}^{} (1 \cdot p + 0 \cdot (1-p)) \\
= & n(n-1)p .
\end{align}$
Hence $\mathbb{E}[X]={n \choose 2}p$.
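As a sanity check on $\mathbb{E}[X]={n \choose 2}p$, here is a quick Monte Carlo sketch in stdlib Python (the helper names are ours; the tolerance is deliberately loose since the estimate is random):

```python
import random

def expected_edges(n: int, p: float) -> float:
    """Analytic value E[X] = C(n, 2) * p, from linearity of expectation."""
    return n * (n - 1) / 2 * p

def mean_edge_count(n: int, p: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean number of edges in G(n, p)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # count the edges of one freshly sampled G(n, p) graph
        total += sum(1 for i in range(n) for j in range(i + 1, n)
                     if rng.random() < p)
    return total / trials
```

For $n=30$, $p=0.2$ the analytic value is ${30 \choose 2}(0.2)=87$, and the Monte Carlo mean settles near it as `trials` grows.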

It follows that the [[expectation|expected]] [[mean degree]] is $\frac{2 \cdot {n \choose 2}p}{n}=(n-1)p=: c$.

Next consider the [[degree distribution]] of $G(n,p)$. Fix a node $i$; we want the [[probability distribution|probability]] $p_{k}$ that $i$ has degree $k$, for any $k < n$. We use [[independent random variables|independence]] heavily. The probability that $i$ is connected to a particular set of $k$ of the other $n-1$ nodes and to none of the remainder is $p^{k}(1-p)^{n-k-1}$. Since these events are disjoint for different sets, the disjoint specialization of [[the exclusion rule]] gives $\begin{align}
p_{k}= & \; p(i \text{ is connected to some set of } k \text{ nodes}) \\
= & \sum_{{n-1 \choose k}\text{ choices of set}}^{} p^{k}(1-p)^{n-k-1} \\
= & {n-1 \choose k} p^{k}(1-p)^{n-k-1}.
\end{align}$

To see the relationship with the [[Poisson random variable|Poisson distribution]], suppose $G$ is sparse, so that $p=\frac{c}{n-1}$ becomes vanishingly small as $n$ grows, allowing us to write $\begin{align}
\ln[(1-p)^{n-1-k}]= & (n-1-k)\ln \left( 1- \frac{c}{n-1} \right) \\
\approx & -(n-1-k) \frac{c}{n-1} \\
\approx & -c,
\end{align}$
where we have [[taylor series|Taylor-expanded]] $\ln(1-x) \approx -x$, and the approximations become equalities as $n \to \infty$ with $k$ fixed. Exponentiating both sides gives $(1-p)^{n-1-k}=e^{-c}$ in the large-$n$ [[function limit]]. Also, for large $n$ we have ${n-1 \choose k} = \frac{(n-1)!}{k! (n-1-k)!} \approx \frac{(n-1)^{k}}{k!},$
and thus our binomial expression above becomes $p_{k}=\frac{(n-1)^{k}}{k!}p^{k}e^{-c}=\frac{(n-1)^{k}}{k!} \left( \frac{c}{n-1} \right)^{k} e ^{-c}=e^{-c} \frac{c^{k}}{k!}.$
This is the [[Poisson random variable|Poisson distribution]] with parameter $c$.
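The binomial-to-Poisson convergence is easy to check numerically. A small stdlib-Python sketch (function names ours) comparing the exact degree distribution with its Poisson limit at fixed $c$:

```python
import math

def binom_pk(n: int, p: float, k: int) -> float:
    """Exact degree distribution in G(n, p): C(n-1, k) p^k (1-p)^(n-1-k)."""
    return math.comb(n - 1, k) * p**k * (1 - p)**(n - 1 - k)

def poisson_pk(c: float, k: int) -> float:
    """Poisson limit with parameter c: e^{-c} c^k / k!."""
    return math.exp(-c) * c**k / math.factorial(k)
```

With $c=4$ and $n=10001$ (so $p=c/(n-1)=4\times 10^{-4}$), the two pmfs already agree closely at every $k$.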

Finally, let's calculate the [[clustering coefficient]] $C$ of the [[network]]: the probability that two given neighbors of a node $i$ are themselves connected. Since all connections in $G(n,p)$ are [[independent random variables|independently made]], this is simply the probability $p=c/(n-1)$ that any two given nodes are connected, so $C=\frac{c}{n-1}$.
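This too can be sanity-checked empirically. A stdlib-Python sketch (names ours) that estimates $C$ as the fraction of connected wedges (paths of length two) that close into triangles, which for $G(n,p)$ should come out near $p$:

```python
import random
from itertools import combinations

def clustering_estimate(n: int, p: float, trials: int, seed: int = 0) -> float:
    """Empirical clustering coefficient of G(n, p): the fraction of
    wedges (pairs of neighbors of a common node) that are themselves
    connected, pooled over several sampled graphs."""
    rng = random.Random(seed)
    closed = wedges = 0
    for _ in range(trials):
        # sample one G(n, p) adjacency matrix
        adj = [[False] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                if rng.random() < p:
                    adj[i][j] = adj[j][i] = True
        # count wedges centered at each node, and how many are closed
        for j in range(n):
            nbrs = [i for i in range(n) if adj[i][j]]
            for i, k in combinations(nbrs, 2):
                wedges += 1
                closed += adj[i][k]
    return closed / wedges
```

With $n=60$, $p=0.2$ the estimate lands close to $0.2$, as the independence argument predicts.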
[^1]: This is perhaps one of the most unrealistic consequences of the assumptions $G(n,p)$ makes. Real networks often have extremely high [[clustering coefficient]]s, while this value of $C$ tends to $0$ as $n\to \infty$!
> [!basicexample] Examples and Exercises.
> - (Newman 11.1) [[(no) clustering in Erdos-Renyi random graph]]
> - (Newman 11.2) [[size of small components in Erdos-Renyi random graph model|small component degree distribution in ERRG]] or [[fraction of nodes in small component of given degree in Erdos-Renyi random graph model]] (same thing)
> - (Newman 11.4) [[degree distribution within giant component of Erdos-Renyi random graph]]
> - (Newman 11.5) [[giant bicomponent in Erdos-Renyi random graph]]
> - (Newman 11.6) [[when small components imply existence of giant component]]
> - (Newman 11.7) [[directed Erdos-Renyi random graph]]
> - (Newman 11.8) [[the cascade model]]
> - [[random graph with nontrivial clustering coefficient]]
----
#### References
> [!backlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM [[]]
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```
> [!frontlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM outgoing([[]])
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```