degree distribution - Jacob Hume

---- > [!definition] Definition. ([[degree distribution]]) > The **degree distribution** of a [[network]] is the [[discrete random variable|probability mass function]] over $\begin{align} p_{k}:= & (\text{probability that a random node }i \text{ has degree }k) \\ = & \frac{\#\text{ of nodes of degree }k}{\#\text{ of nodes in network}}. \end{align}$ ![[CleanShot 2023-10-15 at [email protected]]] Nodes with exceptionally high [[degree]] (tail-end of image) are called **hubs**. > [!basicproperties] > - [[degree distribution]]s are generally right-skewed, as illustrated above. > [!basicexample] > > **Exercise.** A network has an exponential degree distribution of the form $p_k = Ca^k$, where $C$ and $a$ are positive constants and $a < 1$. > > #### a) Assuming the distribution is properly normalized, find $C$ as a function of $a$. > > **Solution.** We require $\sum_{k=0}^{\infty} p_{k}=C\sum_{k=0}^{\infty} a ^{k}=1$, hence $C=1 / \sum_{k=0}^{\infty}a^{k}=1-a$. > > #### b) Suppose you are given the degrees $k_1, \dots, k_N$ of a representative set of $N$ nodes in the network. Show that the log-likelihood that these degrees are drawn from the exponential distribution with parameter $a$ is $\log P(k_1, \dots, k_N|a) = N\log(1-a) + \sum_{i=1}^{N} k_i \log a.$ > > **Solution.** Assuming independence we compute $\begin{align} > \log P(k_{1},\dots,k_{N} | a)= & \log \prod_{i=1}^{N} Ca^{k_{i}} \\ > = & \sum_{i=1}^{N} \log Ca^{k_{i}} \\ > = & \sum_{i=1}^{N} \log \textcolor{Skyblue}{C} + \sum_{i=1}^{N} k_{i} \log a \\ > = & N \log (\textcolor{Skyblue}{1-a}) + \sum_{i=1}^{N} k_{i} \log a. > \end{align}$ > > #### Using the method of maximum likelihood, derive a formula for the most likely value of $a$ in terms of the degrees. > This is a [[convex function|convex]] [[optimization problem]], so we can solve it by [[derivative|differentiating]] $(b)$ and setting to $0$ (see [[characterization of extrema for differentiable convex functions]]) then solving for $a$. We set $\begin{align} > \frac{d}{da} \left[ N \log (1-a) + \sum_{i=1}^{N} k_{i} \log a \right] = & N\frac{d}{da}[ \log (1-a)] + \sum_{i=1}^{N} k_{i}\frac{d}{da}[ \log a] \\ > = & -\frac{N}{1-a} + \sum_{i=1}^{N} \frac{k_{i}}{a} \\ > =0. > \end{align}$ > The optimal parameter is $a=\frac{\sum_{i=1}^{N}k_{i}}{N + \sum_{i=1}^{N}k_{i}}.$ > > #### If the observed degrees are as follows, what is the value of $a$? > $ > \begin{matrix} > 11 & 1 & 2 & 8 & 3 \\ > 2 & 4 & 6 & 3 & 8 \\ > 23 & 5 & 1 & 30 & 2 \\ > 1 & 12 & 1 & 39 & 8 > \end{matrix} > $ > > Here $\sum_{i=1}^{N}k_{i}=170$ and $N=20$. So $a=\frac{17}{19}$. ---- #### ---- #### References > [!backlink] > ```dataview > TABLE rows.file.link as "Further Reading" > FROM [[]] > FLATTEN file.tags as Tag > WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom" > GROUP BY Tag > ``` > [!frontlink] > ```dataview > TABLE rows.file.link as "Further Reading" > FROM outgoing([[]]) > FLATTEN file.tags as Tag > WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom" > GROUP BY Tag > ```