Measures

This chapter builds up measures. First, we establish a notion of length on the real line, which we need in order to build up integration. This notion of length is described by the outer measure. The outer measure has several good properties which align with our expectations of what an appropriate definition of length on the real line should be.

However, the outer measure lacks an important property, namely additivity: the outer measure of the union of two disjoint sets is not necessarily the sum of the outer measures of the two sets. This is a problem, because we need additivity in order to prove useful theorems about integration. We will see that this defect is not specific to our definition of the outer measure: no function that is defined on the whole power set of \(\mathbb{R}\) and shares the good properties of the outer measure can be countably additive.

A solution to this issue, which does not give up the good properties of the outer measure, is to relax the requirement that our notion of length is defined on all subsets of \(\mathbb{R},\) and instead only require that it is defined on a certain collection of subsets of \(\mathbb{R}.\) This leads us to the definitions of \(\sigma\)-algebras, measurable sets, and measurable functions. We then introduce measures, and specifically the Lebesgue measure, which we will use later to define integration.

Outer measure#

First, we define and study the outer measure, which is a notion of length on the real line. We will prove a few good properties that one would hope to hold for a notion of length on the real line. Then, we will show that the outer measure is not additive.

Definition of the outer measure#

To define the outer measure, we first need a definition of the length of an open interval.

Definition 55 (Length of an open interval)

The length \(\ell(I)\) of an open interval \(I \subseteq \mathbb{R}\) is defined by

\[\begin{split}\begin{align} \ell(I) = \begin{cases} b - a & \text{if } I = (a, b) \text{ for some } a, b \in \mathbb{R} \text{ with } a < b, \\ 0 & \text{if } I = \emptyset, \\ \infty & \text{if } I = (-\infty, a) \text{ or } I = (a, \infty) \text{ for some } a \in \mathbb{R}, \\ \infty & \text{if } I = (-\infty, \infty). \end{cases} \end{align}\end{split}\]
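
As a minimal sketch of Definition 55, the following Python function computes the length of an open interval from its endpoints. The name `interval_length` and the encoding are assumptions made for illustration: unbounded endpoints are passed as `-math.inf` or `math.inf`, and any pair with `a >= b` stands for the empty interval.

```python
import math


def interval_length(a: float, b: float) -> float:
    """Length of the open interval (a, b); endpoints may be -math.inf or math.inf."""
    if a >= b:
        # (a, b) is empty when a >= b, and the empty set has length 0.
        return 0.0
    if math.isinf(a) or math.isinf(b):
        # Half-lines and the whole real line have infinite length.
        return math.inf
    return b - a
```

For instance, `interval_length(1, 4)` gives the length of \((1, 4)\), while `interval_length(-math.inf, 0)` corresponds to the half-line \((-\infty, 0).\)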

Given lengths of open intervals, we can define the outer measure of a set as the infimum of the total lengths of all countable collections of open intervals that cover the set.

Definition 56 (Outer measure)

The outer measure \(|A|\) of a subset \(A \subseteq \mathbb{R}\) is defined by

\[|A| = \inf \left\{ \sum_{j=1}^\infty \ell(I_j) : I_1, I_2, \ldots \text{ are open intervals such that } A \subseteq \bigcup_{j=1}^\infty I_j \right\}.\]

Good properties#

The outer measure has a number of good properties. First, the outer measure of countable subsets of \(\mathbb{R}\) is zero.

Theorem 60 (Countable sets have outer measure zero)

Every countable subset of \(\mathbb{R}\) has outer measure \(0.\)
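
The usual argument for this theorem covers the \(k\)-th point of the set by an open interval of length \(\varepsilon / 2^k,\) so that the total length of the cover is at most \(\varepsilon.\) The sketch below checks this bookkeeping with exact rational arithmetic; the function name `cover_total_length` and the choice \(\varepsilon = 1/100\) are illustrative assumptions, not part of the source.

```python
from fractions import Fraction


def cover_total_length(num_points: int, eps: Fraction) -> Fraction:
    """Total length of intervals of length eps / 2**k for k = 1, ..., num_points."""
    return sum((eps / 2**k for k in range(1, num_points + 1)), Fraction(0))


eps = Fraction(1, 100)
totals = [cover_total_length(n, eps) for n in (1, 5, 20)]

# Each partial sum equals eps * (1 - 2**-n), so it stays strictly below eps
# no matter how many points are covered.
assert all(total < eps for total in totals)
```

Since \(\varepsilon > 0\) is arbitrary, the infimum in Definition 56 must be \(0.\)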

Second, the outer measure preserves order: the outer measure of a subset of \(\mathbb{R}\) is less than or equal to the outer measure of any of its supersets.

Theorem 61 (Outer measure preserves order)

If \(A\) and \(B\) are subsets of \(\mathbb{R}\) with \(A \subseteq B,\) then \(|A| \leq |B|.\)

The third good property of the outer measure is translation invariance. To establish this property, we first need a definition of the translation of a set.

Definition 57 (Translation)

For \(A \subseteq \mathbb{R}\) and \(t \in \mathbb{R},\) the translation \(t + A\) is defined by

\[t + A = \{ t + a : a \in A \}.\]

With this definition in place, we can now state the translation invariance property of the outer measure. Specifically, the outer measure of a set is the same as the outer measure of its translation.

Theorem 62 (Outer measure is translation invariant)

For every subset \(A\) of \(\mathbb{R}\) and every \(t \in \mathbb{R},\) we have \(|t + A| = |A|.\)

Another useful property of the outer measure is countable subadditivity. This property will also turn out to be true of more general measures which we will define later.

Theorem 63 (Outer measure is countably subadditive)

Suppose \(A_1, A_2, \ldots\) are subsets of \(\mathbb{R}.\) Then

\[\left| \bigcup_{k=1}^\infty A_k \right| \leq \sum_{k=1}^\infty |A_k|.\]
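
To see the inequality at work on a simple case, the sketch below compares the length of a finite union of bounded closed intervals with the sum of the individual lengths. The helper `union_length` is hypothetical, and it takes for granted that the outer measure of such a union equals the total length of its merged pieces, a fact that rests on results stated later in the chapter.

```python
def union_length(intervals):
    """Total length of a finite union of bounded closed intervals [a, b]."""
    total = 0.0
    current_start, current_end = None, None
    for a, b in sorted(intervals):
        if current_end is None or a > current_end:
            # Gap found: close off the previous merged interval, if any.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = a, b
        else:
            # Overlap or touching endpoints: extend the current merged interval.
            current_end = max(current_end, b)
    if current_end is not None:
        total += current_end - current_start
    return total


pieces = [(0, 2), (1, 3), (5, 6)]
lhs = union_length(pieces)            # length of the union
rhs = sum(b - a for a, b in pieces)   # sum of the individual lengths
assert lhs <= rhs
```

The inequality is strict here because \([0, 2]\) and \([1, 3]\) overlap, so their common part is counted twice on the right-hand side.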

Heine-Borel theorem#

Another important property that we want to show for the outer measure is that its value on a closed bounded interval (rather than an open one) is equal to the difference between the endpoints of the interval, that is, \(|[a, b]| = b - a\) for \(a, b \in \mathbb{R}\) with \(a < b.\) To show this, we will use the Heine-Borel theorem, which is a theorem of independent interest beyond measure theory. The Heine-Borel theorem is a statement about open covers, an idea which we now define.

Definition 58 (Open cover, finite subcover)

Suppose \(A \subseteq \mathbb{R}.\) A collection \(\mathcal{C}\) of open intervals is called an open cover of \(A\) if \(A\) is contained in the union of the intervals in \(\mathcal{C}.\) An open cover \(\mathcal{C}\) of \(A\) is said to have a finite subcover if \(A\) is contained in the union of some finite list of sets in \(\mathcal{C}.\)

The Heine-Borel theorem states that every open cover of a closed bounded subset of \(\mathbb{R}\) has a finite subcover.

Theorem 64 (Heine-Borel)

Every open cover of a closed bounded subset of \(\mathbb{R}\) has a finite subcover.

Using the Heine-Borel theorem, we can now show that the outer measure of a closed interval is equal to the difference between the endpoints of the interval.

Theorem 65 (Outer measure of a closed interval)

Suppose \(a, b \in \mathbb{R}\) with \(a < b.\) Then \(|[a, b]| = b - a.\)
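
One half of this theorem follows directly from Definition 56. As a sketch: for any \(\varepsilon > 0,\) the sequence consisting of the open interval \((a - \varepsilon, b + \varepsilon)\) followed by empty intervals covers \([a, b],\) and \(\ell(\emptyset) = 0\) by Definition 55, so

\[|[a, b]| \leq \ell\big((a - \varepsilon, b + \varepsilon)\big) = b - a + 2\varepsilon.\]

Since \(\varepsilon > 0\) is arbitrary, \(|[a, b]| \leq b - a.\) The reverse inequality is the harder half, and it is there that the Heine-Borel theorem is used, to replace a countable cover of \([a, b]\) by a finite one.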

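A nice consequence of the previous theorems is that nontrivial intervals in \(\mathbb{R}\) are uncountable: a countable set has outer measure \(0,\) whereas an interval containing two distinct points \(a < b\) contains \([a, b]\) and so has outer measure at least \(b - a > 0.\) Interestingly, this proof does not use the diagonal argument, which is the argument usually used to show that the real numbers are uncountable.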

Theorem 66 (Nontrivial intervals are uncountable)

Every interval in \(\mathbb{R}\) that contains at least two distinct elements is uncountable.

Nonadditivity of the outer measure#

Now we come to the negative result of the outer measure, namely that it is not additive. Additivity is an important property that we would like our notion of length to have, because it allows us to prove good theorems about integration.

The proof of nonadditivity of the outer measure relies on constructing a particular subset of a closed interval from an equivalence relation. This equivalence relation is useful beyond the nonadditivity of the outer measure, so we give it a special name.

Definition 59 (Rational difference equivalence relation)

Suppose \(S \subseteq \mathbb{R}.\) Let \(\sim\) be the equivalence relation on \(S\) defined by \(x \sim y \iff x - y \in \mathbb{Q},\) for any \(x, y \in S.\) We call this the rational difference equivalence relation.

Theorem 67 (Nonadditivity of the outer measure)

There exist disjoint subsets \(A, B\) of \(\mathbb{R}\) such that

\[|A \cup B| \neq |A| + |B|.\]

Shortly, we will show that this negative result is not specific to our definition of the outer measure. Before doing so, however, we give a positive result that is useful in some contexts. Specifically, we show that given a sequence of sets that are contained in disjoint open intervals, the outer measure of the union of the sets is equal to the sum of the outer measures of the sets.

Theorem 68 (Outer measure is additive if sets are contained in disjoint open intervals)

Suppose \(S_1, S_2, \ldots\) is a sequence of subsets of \(\mathbb{R}\) and \(A_1, A_2, \ldots\) is a sequence of disjoint open intervals with \(S_k \subseteq A_k\) for every \(k.\) Then

\[\left|\bigcup_{n=1}^\infty S_n\right| = \sum_{n=1}^\infty |S_n|.\]

This result highlights that if there is a sequence of sets on which the outer measure is not additive, then the sets cannot be separated by disjoint open intervals in the sense described above.

Measurable spaces and functions#

Theorem 69 (Nonexistence of extension of length to all subsets of \(\mathbb{R}\))

There does not exist a function \(\mu\) with the following properties:

(a) \(\mu\) is a function from the set of subsets of \(\mathbb{R}\) to \([0, \infty],\)

(b) \(\mu(I) = \ell(I)\) for all open intervals \(I \subseteq \mathbb{R},\)

(c) For every disjoint sequence \(A_1, A_2, \ldots\) of subsets of \(\mathbb{R},\) \(\mu \left( \bigcup_{k=1}^\infty A_k \right) = \sum_{k=1}^\infty \mu(A_k),\)

(d) \(\mu(t + A) = \mu(A)\) for all \(A \subseteq \mathbb{R}\) and \(t \in \mathbb{R}.\)

Sigma algebras#

Definition 60 (\(\sigma\)-algebra)

Suppose \(X\) is a set and \(S\) is a set of subsets of \(X.\) Then \(S\) is called a \(\sigma\)-algebra on \(X\) if it satisfies:

\(\emptyset \in S,\)
if \(E \in S,\) then \(X \setminus E \in S,\)
if \(E_1, E_2, \ldots\) is a sequence of elements of \(S,\) then \(\bigcup_{k=1}^\infty E_k \in S.\)
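
On a finite set \(X\) the three conditions can be checked mechanically, because every countable union of subsets of a finite set is a finite union. The following is a minimal sketch under that finiteness assumption; the function name `is_sigma_algebra` is hypothetical.

```python
from itertools import combinations


def is_sigma_algebra(X, S):
    """Check the sigma-algebra conditions for a family S of subsets of a finite set X."""
    X = frozenset(X)
    S = {frozenset(E) for E in S}
    if any(not E <= X for E in S):
        return False  # every member of S must be a subset of X
    if frozenset() not in S:
        return False  # the empty set must belong to S
    if any(X - E not in S for E in S):
        return False  # S must be closed under complements
    # On a finite X, closure under pairwise unions gives closure under
    # all finite unions, and hence under all countable unions.
    return all(D | E in S for D, E in combinations(S, 2))


X = {1, 2, 3, 4}
print(is_sigma_algebra(X, [set(), {1, 2}, {3, 4}, X]))  # a valid example
print(is_sigma_algebra(X, [set(), {1}, X]))             # complement of {1} is missing
```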

Theorem 70 (Other properties of \(\sigma\)-algebras)

Suppose \(S\) is a \(\sigma\)-algebra on a set \(X.\) Then

(a) \(X \in S,\)

(b) if \(D, E \in S,\) then \(D \cup E \in S, D \cap E \in S\) and \(D \setminus E \in S,\)

(c) if \(E_1, E_2, \ldots\) is a sequence of elements of \(S,\) then \(\bigcap_{k = 1}^\infty E_k \in S.\)
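
Each part reduces to the three defining conditions of a \(\sigma\)-algebra. As a sketch of the reasoning, part (a) holds because \(X = X \setminus \emptyset,\) and part (c) follows from De Morgan's law:

\[\bigcap_{k=1}^\infty E_k = X \setminus \bigcup_{k=1}^\infty (X \setminus E_k).\]

For part (b), the union \(D \cup E\) is the countable union of the sequence \(D, E, \emptyset, \emptyset, \ldots,\) and the remaining two sets can be written as

\[D \cap E = X \setminus \big( (X \setminus D) \cup (X \setminus E) \big), \qquad D \setminus E = D \cap (X \setminus E).\]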

Definition 61 (Measurable space, measurable set)

A measurable space is an ordered pair \((X, S),\) where \(X\) is a set and \(S\) is a \(\sigma\)-algebra on \(X.\) An element of \(S\) is called an \(S\)-measurable set, or simply a measurable set if \(S\) is clear from the context.

Theorem 71 (Smallest \(\sigma\)-algebra containing a collection of subsets)

Suppose \(X\) is a set and \(A\) is a set of subsets of \(X.\) Then, the intersection of all \(\sigma\)-algebras on \(X\) that contain \(A\) is a \(\sigma\)-algebra on \(X.\)
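
When \(X\) is finite, this smallest \(\sigma\)-algebra can be computed by repeatedly adding complements and pairwise unions until nothing new appears. The sketch below assumes a finite \(X\); the name `generated_sigma_algebra` is hypothetical.

```python
def generated_sigma_algebra(X, A):
    """Smallest sigma-algebra on a finite set X containing every set in A."""
    X = frozenset(X)
    S = {frozenset(), X} | {frozenset(E) for E in A}
    while True:
        # Everything added here must belong to any sigma-algebra containing A.
        new_sets = {X - E for E in S} | {D | E for D in S for E in S}
        if new_sets <= S:
            return S  # closed under complements and unions: done
        S |= new_sets


S = generated_sigma_algebra({1, 2, 3}, [{1}])
print(sorted(sorted(E) for E in S))
```

The loop terminates because a finite set has only finitely many subsets. Every set it adds is forced to lie in each \(\sigma\)-algebra containing \(A,\) so the result coincides with the intersection described in the theorem.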

Definition 62 (Borel set)

The smallest \(\sigma\)-algebra on \(\mathbb{R}\) that contains all the open subsets of \(\mathbb{R}\) is called the collection of Borel subsets of \(\mathbb{R}.\) An element of this \(\sigma\)-algebra is called a Borel set.

Definition 63 (Inverse image)

If \(f: X \to Y\) is a function and \(A \subseteq Y,\) then the inverse image of \(A\) under \(f\) is defined by

\[f^{-1}(A) = \{ x \in X : f(x) \in A \}.\]

Theorem 72 (Inverse image of a composition)

Suppose \(f: X \to Y\) and \(g: Y \to Z\) are functions. Then

\[(g \circ f)^{-1}(A) = f^{-1}(g^{-1}(A))\]

for all \(A \subseteq Z.\)
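
For functions on finite sets, both the definition of the inverse image and the composition identity can be checked directly. In the sketch below the helper `preimage` and the particular choices of `f`, `g` and `A` are illustrative assumptions.

```python
def preimage(f, domain, A):
    """Inverse image of A under f: the set of x in domain with f(x) in A."""
    return {x for x in domain if f(x) in A}


def f(x):
    return x * x


def g(y):
    return y % 3


X = range(-3, 4)          # domain of f
Y = {f(x) for x in X}     # a set containing the range of f, used as domain of g
A = {0}                   # a subset of the codomain of g

left = preimage(lambda x: g(f(x)), X, A)    # (g o f)^{-1}(A)
right = preimage(f, X, preimage(g, Y, A))   # f^{-1}(g^{-1}(A))
assert left == right
```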

Measurable functions#

Definition 64 (Measurable function)

Suppose \((X, S)\) is a measurable space. A function \(f: X \to \mathbb{R}\) is called \(S\)-measurable if

\[f^{-1}(B) \in S\]

for all Borel sets \(B \subseteq \mathbb{R}.\)
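
On a finite measurable space this definition becomes a finite check. A function on a finite set takes finitely many values, so the inverse image of any Borel set is a finite union of level sets \(f^{-1}(\{c\}),\) and each singleton \(\{c\}\) is itself a Borel set. The sketch below relies on that observation and assumes that `S` really is a \(\sigma\)-algebra on `X`; the name `is_measurable` is hypothetical.

```python
def is_measurable(f, X, S):
    """Check S-measurability of a real-valued f on a finite set X.

    Assumes S is a sigma-algebra on X, given as a collection of sets.
    """
    S = {frozenset(E) for E in S}
    values = {f(x) for x in X}
    # f is S-measurable exactly when every level set f^{-1}({c}) lies in S.
    return all(frozenset(x for x in X if f(x) == c) in S for c in values)


X = {1, 2, 3, 4}
S = [set(), {1, 2}, {3, 4}, X]
print(is_measurable(lambda x: 0 if x <= 2 else 1, X, S))  # constant on {1, 2} and on {3, 4}
print(is_measurable(lambda x: x, X, S))                   # separates 1 from 2
```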

Theorem 73 (Condition for measurable function)

Suppose \((X, S)\) is a measurable space and \(f: X \to \mathbb{R}\) is a function such that

\[f^{-1}((a, \infty)) \in S\]

for all \(a \in \mathbb{R}.\) Then \(f\) is \(S\)-measurable.
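
As a worked example of this criterion, consider the characteristic function of a set \(E \subseteq X,\) written here as \(\chi_E\) (notation assumed for this example), which equals \(1\) on \(E\) and \(0\) on \(X \setminus E.\) Its inverse images of half-lines are

\[\chi_E^{-1}((a, \infty)) = \begin{cases} X & \text{if } a < 0, \\ E & \text{if } 0 \leq a < 1, \\ \emptyset & \text{if } a \geq 1. \end{cases}\]

Since \(X\) and \(\emptyset\) always belong to \(S,\) the criterion shows that \(\chi_E\) is \(S\)-measurable whenever \(E \in S.\) Conversely, if \(\chi_E\) is \(S\)-measurable then \(E = \chi_E^{-1}((0, \infty)) \in S,\) because \((0, \infty)\) is open and hence a Borel set.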

In the special case that \(X\) is a Borel subset of the reals and \(S\) is the set of Borel subsets of \(\mathbb{R}\) that are contained in \(X,\) we use the term Borel measurable to refer to \(S\)-measurable functions.

Definition 65 (Borel measurable function)

Suppose \(X \subseteq \mathbb{R}.\) A function \(f: X \to \mathbb{R}\) is called Borel measurable if \(f^{-1}(B)\) is a Borel set for every Borel set \(B \subseteq \mathbb{R}.\)

Theorem 74 (Every continuous function is Borel measurable)

Every continuous real-valued function defined on a Borel subset of \(\mathbb{R}\) is a Borel measurable function.

Theorem 75 (Every increasing function is Borel measurable)

Every increasing function defined on a Borel subset of \(\mathbb{R}\) is a Borel measurable function.

Theorem 76 (Composition of measurable functions)

Suppose \((X, S)\) is a measurable space and \(f: X \to \mathbb{R}\) is an \(S\)-measurable function. Suppose that \(g\) is a real-valued Borel measurable function defined on a subset of \(\mathbb{R}\) that includes the range of \(f.\) Then \(g \circ f: X \to \mathbb{R}\) is an \(S\)-measurable function.

Theorem 77 (Algebraic operations with measurable functions)

Suppose \((X, S)\) is a measurable space and \(f, g: X \to \mathbb{R}\) are \(S\)-measurable functions. Then

(a) \(f + g, f - g, f g\) are \(S\)-measurable functions,

(b) if \(g(x) \neq 0\) for all \(x \in X,\) then \(f / g\) is an \(S\)-measurable function.

Theorem 78 (Pointwise limit of \(S\)-measurable functions is \(S\)-measurable)

Suppose \((X, S)\) is a measurable space and \(f_1, f_2, \ldots\) are \(S\)-measurable functions from \(X\) to \(\mathbb{R}.\) Suppose \(\lim_{k \to \infty} f_k(x)\) exists for each \(x \in X.\) Define \(f: X \to \mathbb{R}\) by

\[f(x) = \lim_{k \to \infty} f_k(x).\]

Then \(f\) is an \(S\)-measurable function.

Definition 66 (Borel subsets of \([-\infty, \infty]\))

A subset of \([-\infty, \infty]\) is called a Borel subset if its intersection with \(\mathbb{R}\) is a Borel set.

Theorem 79 (Measurable function on \([-\infty, \infty]\))

Suppose \((X, \mathcal{S})\) is a measurable space. A function \(f: X \to [-\infty, \infty]\) is \(\mathcal{S}\)-measurable if

\[f^{-1}(B) \in \mathcal{S}\]

for every Borel set \(B \subseteq [-\infty, \infty].\)

Theorem 80 (Sufficient condition for measurable function)

Suppose \((X, \mathcal{S})\) is a measurable space and \(f: X \to [-\infty, \infty]\) is a function such that

\[f^{-1}((a, \infty]) \in \mathcal{S}\]

for all \(a \in \mathbb{R}.\) Then \(f\) is \(\mathcal{S}\)-measurable.

Theorem 81 (Infimum and supremum of a sequence of measurable functions are measurable)

Suppose \((X, \mathcal{S})\) is a measurable space and \(f_1, f_2, \ldots\) is a sequence of \(\mathcal{S}\)-measurable functions from \(X\) to \([-\infty, \infty].\) Define \(g, h: X \to [-\infty, \infty]\) by

\[g(x) = \inf\{f_k(x) : k \in \mathbb{Z}^+\} \text{ and } h(x) = \sup\{f_k(x) : k \in \mathbb{Z}^+\}.\]

Then \(g\) and \(h\) are \(\mathcal{S}\)-measurable functions.
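
The supremum case can be sketched in one line using the sufficient condition of Theorem 80. For \(a \in \mathbb{R},\) the supremum \(h(x)\) exceeds \(a\) exactly when some \(f_k(x)\) exceeds \(a,\) so

\[h^{-1}((a, \infty]) = \bigcup_{k=1}^\infty f_k^{-1}((a, \infty]).\]

Each set in the union belongs to \(\mathcal{S}\) and \(\mathcal{S}\) is closed under countable unions, so \(h^{-1}((a, \infty]) \in \mathcal{S}.\) One route to the infimum case, assuming that \(-f\) is \(\mathcal{S}\)-measurable whenever \(f\) is, uses the identity

\[g(x) = -\sup\{-f_k(x) : k \in \mathbb{Z}^+\}.\]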

Measures and their properties#

Now we come to the definition of measures. Our original motivation for the following definition came from trying to extend the notion of the length of an interval to the length of more general sets. However, the following definition allows us to generalise the notion of size to other contexts, such as areas, volumes and beyond.

Definition 67 (Measure)

Suppose \(X\) is a set and \(\mathcal{S}\) is a \(\sigma\)-algebra on \(X.\) A measure on \((X, \mathcal{S})\) is a function \(\mu: \mathcal{S} \to [0, \infty]\) such that \(\mu(\emptyset) = 0\) and \(\mu\) is countably additive, that is

\[\mu\left( \bigcup_{k=1}^\infty E_k \right) = \sum_{k=1}^\infty \mu(E_k)\]

for every disjoint sequence \(E_1, E_2, \ldots\) of sets in \(\mathcal{S}.\)

Countable additivity of measures is a key property that allows us to prove useful limit theorems. Note that countable additivity implies finite additivity, that is, if \(\mu\) is a measure on \((X, \mathcal{S})\) and \(E_1, \ldots, E_n\) are disjoint sets in \(\mathcal{S},\) then

\[\mu(E_1 \cup \cdots \cup E_n) = \mu(E_1) + \cdots + \mu(E_n).\]
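
A simple concrete measure assigns a nonnegative weight to each point of a finite set and measures a subset by adding up the weights of its points. The sketch below builds such a measure on the \(\sigma\)-algebra of all subsets and checks finite additivity on one disjoint pair; the helper `make_measure` and the weights are illustrative assumptions.

```python
def make_measure(weights):
    """Return mu with mu(E) = sum of weights[x] over x in E.

    Assumes every weight is nonnegative, so mu maps subsets of
    weights.keys() into [0, infinity).
    """
    def mu(E):
        return sum(weights[x] for x in E)
    return mu


mu = make_measure({"a": 1, "b": 2, "c": 4})
E1, E2 = {"a"}, {"b", "c"}   # two disjoint sets

assert mu(set()) == 0                    # the empty set has measure 0
assert mu(E1 | E2) == mu(E1) + mu(E2)    # finite additivity on a disjoint pair
```

Taking every weight equal to \(1\) gives the measure that simply counts the elements of a set.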

The following terminology is often very useful.

Definition 68 (Measure space)

A measure space is an ordered triple \((X, \mathcal{S}, \mu),\) where \(X\) is a set, \(\mathcal{S}\) is a \(\sigma\)-algebra on \(X\) and \(\mu\) is a measure on \((X, \mathcal{S}).\)

Properties of measures#

Now we discuss several useful properties of measures.

Theorem 82 (Measure preserves order; measure of a set difference)

Suppose \((X, \mathcal{S}, \mu)\) is a measure space and \(D, E \in \mathcal{S}\) with \(D \subseteq E.\) Then

(a) \(\mu(D) \leq \mu(E),\)

(b) \(\mu(E \setminus D) = \mu(E) - \mu(D)\) provided that \(\mu(D) < \infty.\)
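
Both parts come from a single decomposition. As a sketch: since \(D \subseteq E,\) the set \(E\) is the disjoint union of \(D\) and \(E \setminus D,\) so finite additivity gives

\[\mu(E) = \mu(D) + \mu(E \setminus D) \geq \mu(D).\]

If \(\mu(D) < \infty,\) subtracting \(\mu(D)\) from both sides yields (b). The finiteness hypothesis cannot be dropped, because for \(\mu(D) = \infty\) the right-hand side of (b) would be the undefined expression \(\infty - \infty.\)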

The countable additivity property of measures applies to disjoint countable unions. The following countable subadditivity property applies to countable unions that may not be disjoint unions.

Theorem 83 (Countable subadditivity)

Suppose \((X, \mathcal{S}, \mu)\) is a measure space and \(E_1, E_2, \ldots \in \mathcal{S}.\) Then

\[\mu\left( \bigcup_{k=1}^\infty E_k \right) \leq \sum_{k=1}^\infty \mu(E_k).\]

Just as countable additivity implies finite additivity, countable subadditivity implies finite subadditivity. That is, if \(\mu\) is a measure on \((X, \mathcal{S})\) and \(E_1, \ldots, E_n\) are sets in \(\mathcal{S},\) then

\[\mu(E_1 \cup \cdots \cup E_n) \leq \mu(E_1) + \cdots + \mu(E_n).\]

Now we show two very useful results about limits on measures. Note that the countable additivity property of measures is crucial for the following results.

Theorem 84 (Measure of an increasing union)

Suppose \((X, \mathcal{S}, \mu)\) is a measure space and \(E_1 \subseteq E_2 \subseteq \cdots\) is an increasing sequence of sets in \(\mathcal{S}.\) Then

\[\mu\left( \bigcup_{k=1}^\infty E_k \right) = \lim_{k \to \infty} \mu(E_k).\]
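
As an illustration, take \(X = \mathbb{Z}^+\) with all subsets measurable and let \(\mu(E)\) be the number of elements of \(E,\) with \(\mu(E) = \infty\) when \(E\) is infinite (this counting measure is an example chosen here, not one introduced above). For \(E_k = \{1, \ldots, k\}\) we get

\[\mu\left( \bigcup_{k=1}^\infty E_k \right) = \mu(\mathbb{Z}^+) = \infty = \lim_{k \to \infty} k = \lim_{k \to \infty} \mu(E_k).\]

The same measure shows that a corresponding statement for decreasing sequences needs a finiteness hypothesis such as \(\mu(F_1) < \infty\): for \(F_k = \{k, k + 1, \ldots\}\) every \(\mu(F_k)\) equals \(\infty,\) yet

\[\mu\left( \bigcap_{k=1}^\infty F_k \right) = \mu(\emptyset) = 0.\]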

We conclude this section with another useful intuitive result, namely that the measure of the union of two sets is the sum of the measures of the sets minus the measure of their intersection, which has been counted twice.

Theorem 86 (Measure of the union of two sets)

Suppose \((X, \mathcal{S}, \mu)\) is a measure space and \(D, E \in \mathcal{S}\) with \(\mu(D \cap E) < \infty.\) Then

\[\mu(D \cup E) = \mu(D) + \mu(E) - \mu(D \cap E).\]
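
A sketch of why the formula holds, assuming \(\mu(D \cap E) < \infty\) so that the subtractions below are meaningful: the union splits into three disjoint pieces,

\[D \cup E = \big( D \setminus (D \cap E) \big) \cup \big( E \setminus (D \cap E) \big) \cup (D \cap E).\]

Applying finite additivity and then Theorem 82(b) to the first two pieces gives

\[\mu(D \cup E) = \big( \mu(D) - \mu(D \cap E) \big) + \big( \mu(E) - \mu(D \cap E) \big) + \mu(D \cap E) = \mu(D) + \mu(E) - \mu(D \cap E).\]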