# Branching processes¶

## Stochastic and brancing processes¶

This chapter deals with branching processes, which is a particular type of stochastic process. We make a brief digression to define stochastic processes[Wei20] for completeness.

**Definition (Stochastic process)** A stochastic process is a family of random variables \(\{X(t), t\in \mathcal{I}\}\) on a probability space \((\Omega, \mathcal{F}, \mathbb{P})\), where \(\mathcal{I}\) is a set called the index set.

Stochastic processes can take discrete or continuous values and can have a discrete or continuous index set \(\mathcal{I}\). The branching process describes how a population evolves at discrete time intervals, and therefore takes discrete values and has a discrete index set \(\mathcal{I}\).

**Definition (Branching process)** A branching process is a collection of random variables \(X_0, X_1, X_2, ...\), representing a population of *nomads* which evolves as follows:

At \(t = 0\) there is one nomad, \(X_0 = 1\).

At every time-step, each nomad dies after giving birth to a

*family*of new nomads of random size \(C\).The family sizes descending from each nomad are i.i.d. and are distributed according to the pmf \(\mathbb{P}(C = k) = p_k, k = 0, 1, 2 ...\), called the

*family-size distribution*.

## Probability generating functions¶

We are interested in the distribution of the nomad population at future times. The recursive nature of the process gives a neat expression in terms of the generating function of the family-size distribution.

**Theorem (Probability generating functions of a branching process)** Let \(X_t\) be a branching process with family-size distribution \(\mathbb{P}(C = k) = p_k\), whose probability generating function is \(G\). Then the probability generating functions \(G_0, G_1, ...\) of \(X_0, X_1, ...\) are

## Proof: Proability generating functions of a branching process

Let \(C_{t, n}\) be the family size born from the \(n^{th}\) nomad at time \(t\). At \(t = 0\), the population is \(X_0 = 1\) so that \(G_0(s) = s\). For \(t > 0\) we have

where the subscripts in the expectations give the variables over which the expectation is taken. From the first to the second line we used the law of iterated expectations; from the second to the third we used the fact that the \(C_{t, 1}, C_{t, 2}..., C_{t, X_t}\) variables are i.i.d. so the pgf of their sum is equal to the product of their pgfs.

## Mean population¶

**Theorem (Mean population of a branching process)** The mean value of a branching process \(X_t\) is

where \(\mu = \mathbb{E}(C)\) is the mean family size. From this, it follows that as \(t \to \infty\)

This result is not surprising since each nomad will give birth to \(\mu\) other nomads on average. It can be proved quickly using probability generating functions.

## Proof: Mean population of a branching process

We have

and considering that \(G_{t - 1}'(1) = \mathbb{E}(X_{t-1})\), the result follows by induction.

## Ultimate extinction¶

We are also interested in the behaviour of the process as \(t \to \infty\). If the population ever reaches \(X_t = 0\), it will have become extinct.

**Definition (Extinction probability)** Given a branching process \(X_t\), the extinction probability \(e\) is

Alternatively, defining \(E_t = \{X_t = 0\}\) as the event that the process is extinct by time \(t\) and \(e_t = \mathbb{P}(E_t)\) we have

and since \(E_t \subseteq E_{t+1}\) we have \(e = \lim_{t \to \infty} e_t\) by the continuity of probability measures. Thus \(e\) can equivalently be defined as the limit of \(e_t\) as \(t \to \infty\).

We are interested in the probability of ultimate extinction, that is the probability that \(X_t = 0\) for some \(t\). Again, generating functions give a neat answer for this quantity too.

**Theorem (Extinction probability theorem)** Let \(X_t\) be a branching process, whose family sizes \(C\) have common probability generating function \(G\). The probability of extinction \(e\) is the smallest non-negative root of the equation

## Proof: Extinction proability theorem

We have \(e_t = \mathbb{P}(X_t = 0) = G_t(0)\) and \(G_{t + 1}(s) = G(G_t(s))\) so that

Taking the limit \(t \to \infty\) with initial condition \(e_0 = 0\), we obtain

where we have used the fact that \(G\) is continuous on \([0, 1]\). To show that \(e\) is the smallest non-negative root, suppose \(\eta \geq 0\) is another non-negative root. Since \(G\) is non-decreasing on \([0, 1]\), we have

The following theorem gives a necessary and sufficient condition for ultimate extinction to be certain.

**Theorem (Extinction survival theorem)** Let \(X_t\) be a branching process whose family size distribution satisfies \(\mathbb{P}(C = 1) \neq 1\). The probability of ultimate extinction satisfies \(e = 1\) if and only if the mean family size satisfies \(\mu \leq 1\).

## Proof: Extinction survival theorem

In the trivial case where \(\mathbb{P}(C = 1) = 1\) we have \(X_t = 1\) for all \(t\) and \(e = 0\). Now considering the facts that on the interval \([0, 1]\), the function \(G\) is

continuous, since it is a power series with a radius of convergence of at least \(1\),

non-decreasing, since \(G'(x) \geq 0\),

convex, since \(G''(x) \geq 0\),

\(G(1) = 1\),

we see that the graph of \(y = G(x)\) will intersect the graph of \(y = x\) at \(x = 1\), and that the two graphs will have a smaller non-negative point of intersection if and only if \(G'(1) > 1\). The condition \(e = 1\) holds if and only if the smallest non-negative point of intesection is \(1\), which is equivalent to the condition \(G'(1) = \mu \leq 1\), arriving at the result.

From the definition of the probability of extinction, it follows that if \(\mu \leq 0\), then \(X_t \to 0\) in probability. Further, if the inequality is strict, \(\mu < 1\), then \(X_t \to 0\) in mean square - this does not hold for \(\mu = 1\).

## Proof: Convergence of a branching process in probability and in mean-square

To show that \(X_t \to 0\) in probability, suppose \(\mu \leq 1\), let \(\epsilon > 0\) and consider

From the extinction survival theorem, \(\mathbb{P}(X_t = 0) \to 1\) as \(t \to \infty\) so

and \(X_t \to 0\) in probability. We now treat the mean-square case. Consider

Using the law of total variance

Proceeding recursively, we obtain

which is a geometric sum. In the special case of \(\mu = 1\), we see that \(\sigma_t^2 = t \sigma^2\). If \(\mu \neq 1\), we can use the geometric sum to obtain

Substituting these results into the expression for \(\mathbb{E}(X_t^2)\) we obtain

Therefore \(\mathbb{E}(X_t^2) \to 0\), implying that \(X_t \to 0\) in mean-square if and only if \(\mu < 1\).

## References¶

- Wei20
Eric W. Weisstein. Stochastic process. 2020. [From MathWorld–A Wolfram Web Resource. Online; accessed 28-August-2020]. URL: https://mathworld.wolfram.com/StochasticProcess.html.