14 Convergence
14.1 Almost sure
Let (\Omega, \mathcal H, \mathbb P) be a probability space. A sequence of random variables X_n converges almost surely if \mathbb P\{\omega : \liminf_n X_n(\omega) = \limsup_n X_n(\omega) \in \mathbb R\} = 1.
We say X_n \stackrel{a.s.}{\to} X if X_n converges almost surely and \lim X_n \stackrel{a.s.}{=} X.
\mathbb P(X(\omega) = \lim X_n(\omega)) = 1
For general metric spaces, X_n \stackrel{a.s.}{\to} X \iff d(X_n, X) \stackrel{a.s.}{\to} 0.
An equivalent condition: for every \varepsilon > 0, |X_n - X| exceeds \varepsilon for only finitely many n, almost surely: X_n \stackrel{a.s.}{\to} X \iff \sum_n \mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{<} \infty \quad \forall \varepsilon > 0
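As a numerical illustration (a minimal sketch added here, not from the original notes): by the strong law of large numbers, running means of i.i.d. coin flips converge almost surely to 1/2, so along a typical sample path the exceedance count above is finite for each \varepsilon.

```python
import numpy as np

rng = np.random.default_rng(0)

# One sample path of running means of i.i.d. Bernoulli(1/2) draws;
# by the SLLN these converge almost surely to 1/2.
n = 100_000
flips = rng.integers(0, 2, size=n)
running_mean = np.cumsum(flips) / np.arange(1, n + 1)

# Count how many n have |X_n - 1/2| > eps; almost sure convergence says
# this count is finite along almost every path, for every eps > 0.
for eps in [0.1, 0.01, 0.001]:
    count = int(np.sum(np.abs(running_mean - 0.5) > eps))
    print(f"eps={eps}: {count} exceedances out of {n}")
```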
14.1.1 Cauchy criterion
To check almost sure convergence without knowledge of the limit, there is an analogue of the Cauchy criterion. The following are equivalent:
- X_n converges almost surely
- \sup_{i, j \geq n} |X_i - X_j| \stackrel{a.s.}{\to} 0
- \sup_{k} |X_{n+k} - X_n| \stackrel{a.s.}{\to} 0
14.1.2 Continuous mapping theorem
Define f: \mathbb R^K \to \mathbb R, K \in \mathbb N. If
- X_{kn} \stackrel{a.s.}{\to} X_k for all k and
- f is continuous on a set A \subset \mathbb R^K such that \mathbb P((X_1, \dots, X_K) \in A) = 1
then f(X_{1n}, \dots, X_{Kn}) \stackrel{a.s.}{\to} f(X_1, \dots, X_K)
Note that the continuity set only needs to almost surely contain the limiting random vector, not the entire sequence.
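For instance (an illustrative example added here): take f = \mathrm{sgn}, which is continuous everywhere except at 0. If X_n \stackrel{a.s.}{\to} X and \mathbb P(X = 0) = 0, then \mathrm{sgn}(X_n) \stackrel{a.s.}{\to} \mathrm{sgn}(X), even though individual X_n may put positive mass at the discontinuity point.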
14.1.3 Borel-Cantelli theorem
Let (H_n) be a sequence of events.
\sum_n \mathbb P(H_n) < \infty \implies \sum_n \mathbb 1_{H_n}(\omega) \stackrel{a.s.}{<} \infty
Proof
We have \infty > \sum_n \mathbb P(H_n) = \sum_n \mathbb E [\mathbb 1_{H_n}] = \mathbb E[\sum_n \mathbb 1_{H_n}], where the exchange of sum and expectation is justified by monotone convergence.
And \mathbb E[X] < \infty \implies X \stackrel{a.s.}{<} \infty for any nonnegative random variable X; apply this to X = \sum_n \mathbb 1_{H_n}.
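A quick simulation of the statement (a sketch with arbitrarily chosen probabilities): take independent events H_n with \mathbb P(H_n) = 1/n^2, so \sum_n \mathbb P(H_n) < \infty, and count how many occur along a sampled path.

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent events H_n with P(H_n) = 1/n^2 (summable probabilities);
# Borel-Cantelli: only finitely many occur, almost surely.
n = 1_000_000
p = 1.0 / np.arange(1, n + 1) ** 2
occurred = rng.random(n) < p

print("events that occurred:", int(occurred.sum()))
print("last index that occurred:", int(np.flatnonzero(occurred).max()) + 1)
```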
14.1.4 Sufficient conditions
Using Borel-Cantelli and the characterisation we saw earlier (setting H_n = \{|X_n - X| > \varepsilon\}), we have a nice sufficient condition: \sum_n \mathbb P(|X_n - X| > \varepsilon) < \infty \quad \forall \varepsilon > 0 \implies X_n \stackrel{a.s.}{\to} X
Alternatively, if we don’t know the limit in advance, we can use another sufficient condition: if there exists a sequence \varepsilon_n > 0 such that \sum_n \varepsilon_n < \infty and
\sum_n \mathbb P(|X_{n+1} - X_n| > \varepsilon_n) < \infty
then X_n converges almost surely. Indeed, Borel-Cantelli gives that almost surely |X_{n+1} - X_n| \leq \varepsilon_n for all large n, and since \sum_n \varepsilon_n < \infty the sequence is almost surely Cauchy.
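A concrete example (added for illustration): let X_n = \sum_{k=1}^n Z_k/k^2 with Z_k i.i.d. standard normal. Taking \varepsilon_n = n^{-3/2}, we have \sum_n \varepsilon_n < \infty and \mathbb P(|X_{n+1} - X_n| > \varepsilon_n) = \mathbb P(|Z_{n+1}| > (n+1)^2 n^{-3/2}) \leq \mathbb P(|Z_1| > \sqrt{n}), which is summable by the Gaussian tail bound, so this random series converges almost surely.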
14.2 In probability
14.2.1 Definition
X_n \stackrel{p}{\to} X if for all \varepsilon >0, \lim \mathbb P(|X_n - X| > \varepsilon) = 0
For general metric spaces, the condition becomes \lim \mathbb P(d(X_n, X) > \varepsilon) = 0 for all \varepsilon > 0.
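This is strictly weaker than almost sure convergence. A standard counterexample (added here): take independent events H_n with \mathbb P(H_n) = 1/n and X_n = \mathbb 1_{H_n}. Then \mathbb P(|X_n| > \varepsilon) = 1/n \to 0, so X_n \stackrel{p}{\to} 0, but \sum_n 1/n = \infty and the independence half of Borel-Cantelli (not stated above) gives that infinitely many H_n occur almost surely, so X_n does not converge almost surely.

```python
import numpy as np

rng = np.random.default_rng(2)

# X_n = 1 on an independent event of probability 1/n, else 0.
n = 1_000_000
x = (rng.random(n) < 1.0 / np.arange(1, n + 1)).astype(int)

# In probability: P(X_n = 1) = 1/n -> 0.
# Not almost surely: ones keep appearing arbitrarily late along the path.
print("number of ones:", int(x.sum()))
print("latest index with X_n = 1:", int(np.flatnonzero(x).max()) + 1)
```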
14.3 L^p convergence
X_n \stackrel{L^p}{\to} X if \mathbb E[|X_n - X|^p] \to 0
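A classical example showing that almost sure convergence does not imply L^1 convergence (added as motivation for what follows): let U \sim \mathrm{Uniform}(0,1) and X_n = n \mathbb 1[U < 1/n]. For every \omega with U(\omega) > 0 we have X_n(\omega) = 0 once n > 1/U(\omega), so X_n \stackrel{a.s.}{\to} 0, yet \mathbb E|X_n| = n \cdot \frac{1}{n} = 1 for all n. The missing ingredient is uniform integrability, defined next.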
14.3.1 Uniform integrability
A family of random variables (X_t)_{t \in T} is uniformly integrable if
\lim_{b \to \infty} \sup_{t \in T} \mathbb E[|X_t| \mathbb 1(|X_t| > b)] = 0
Theorem 14.1 Uniform integrability implies that the family is bounded in L^1, i.e. \sup_{t\in T} \mathbb E[|X_t|] < \infty; in particular every X_t is integrable. The converse is not true in general.
Theorem 14.2 If \sup_{t \in T} \mathbb E[|X_t|^p] < \infty for some p > 1 then (X_t)_{t \in T} is uniformly integrable.
A family of random variables (X_t)_{t \in T} is uniformly integrable if and only if there exists a convex function \phi: \mathbb R_+ \to \mathbb R_+ such that \lim_{x \to \infty} \frac{\phi(x)}{x} = \infty and \sup_{t \in T} \mathbb E[\phi(|X_t|)] < \infty (the de la Vallée Poussin criterion).
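As a check (an added remark), Theorem 14.2 is the special case \phi(x) = x^p: for p > 1 this \phi is convex and \phi(x)/x = x^{p-1} \to \infty, so a uniform L^p bound gives uniform integrability. The family X_n = n \mathbb 1[U < 1/n] above is bounded in L^1 but not uniformly integrable, showing why p = 1 is excluded.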
14.3.2 L^1 convergence
The following are equivalent:
- X_n \stackrel{L^1}{\to} X
- X_n \stackrel{p}{\to} X and (X_n) is uniformly integrable
- X_n \in L^1, X \in L^1, X_n \stackrel{p}{\to} X and \mathbb E|X_n| \to \mathbb E|X|
14.4 Weak convergence
14.4.1 Definition
A sequence of measures converges weakly \mu_n \stackrel{w}{\to} \mu if for every bounded continuous function f:
\mu_n[f] \to \mu[f]
For convergence in distribution of a sequence of random variables, X_n \stackrel{d}{\to} X, we require weak convergence of the pushforwards \mathbb P \circ X_n^{-1}, i.e. for every bounded continuous function f: \mathbb E[f(X_n)] \to \mathbb E[f(X)]
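A quick sanity check (numbers chosen arbitrarily): standardized \mathrm{Binomial}(n, 1/2) variables converge in distribution to a standard normal, and for the bounded continuous test function f = \cos we can compare \mathbb E[f(X_n)] against \mathbb E[\cos Z] = e^{-1/2} for Z \sim N(0,1).

```python
import numpy as np

rng = np.random.default_rng(3)

# X_n = standardized Binomial(n, 1/2) converges in distribution to N(0, 1).
# f = cos is bounded and continuous; E[cos Z] = exp(-1/2) for Z ~ N(0, 1).
target = np.exp(-0.5)
for n in [10, 100, 1000, 10000]:
    counts = rng.binomial(n, 0.5, size=200_000)
    x = (counts - 0.5 * n) / np.sqrt(0.25 * n)
    print(f"n={n}: E[cos(X_n)] ~ {np.cos(x).mean():.4f} (target {target:.4f})")
```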
14.4.2 Portmanteau theorem
The following are equivalent:
- \mu_n \stackrel{w}{\to} \mu
- \limsup \mu_n(A) \leq \mu(A) for every closed set A
- \liminf \mu_n(A) \geq \mu(A) for every open set A
- \lim \mu_n(A) = \mu(A) for every Borel set A with \mu(\partial A) = 0
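The boundary condition in the last item matters (an illustrative example added here): let \mu_n = \delta_{1/n} and \mu = \delta_0, so \mu_n \stackrel{w}{\to} \mu. For A = \{0\} we have \mu(\partial A) = \mu(\{0\}) = 1 \neq 0, and indeed \mu_n(A) = 0 for all n while \mu(A) = 1.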
14.4.3 Continuous mapping theorem
If X_n \stackrel{d}{\to} X and h is a measurable function whose set of discontinuity points D_h satisfies \mathbb P(X \in D_h) = 0, then h(X_n) \stackrel{d}{\to} h(X).
Proof
By Skorokhod’s representation theorem (see 14.5.4 below) there exist, on some new probability space, random variables Y_n \stackrel{d}{=} X_n and Y \stackrel{d}{=} X with Y_n \stackrel{a.s.}{\to} Y. In this new space, \mathbb P(Y \in D_h) = \mathbb P(X \in D_h) = 0 i.e. the set of discontinuity points is negligible. Thus, for all \omega in its complement,
Y_n \stackrel{a.s.}{\to} Y \implies h(Y_n) \stackrel{a.s.}{\to} h(Y)
Since almost sure convergence implies convergence in distribution, and h(Y_n) \stackrel{d}{=} h(X_n) and h(Y) \stackrel{d}{=} h(X), we conclude h(X_n) \stackrel{d}{\to} h(X).
14.4.4 Uniqueness
If \mu_n \stackrel{w}{\to} \mu and \mu_n \stackrel{w}{\to} \nu then \mu = \nu.
14.4.5 Prokhorov’s theorem
If (\mu_n) is tight, then every subsequence has a weakly convergent subsubsequence.
In practice, if we can show:
- Any subsequence of \mu_n has a subsubsequence that converges weakly.
- The weak limit is the same for each subsubsequence, say \mu.
Then, for any bounded continuous f, each subsequence of \mu_n f has a subsubsequence converging to \mu f, and thus \mu_n f \to \mu f, i.e. \mu_n \stackrel{w}{\to} \mu.
We use Prokhorov’s theorem to establish the first condition; the requirement is tightness, i.e. for every \varepsilon > 0 there exists a compact set K such that \mu_n(K) > 1 - \varepsilon for all n.
14.4.6 Lévy’s convergence theorem
Let f_n(r) = \int \exp(irx) \mu_n(dx), the characteristic function of \mu_n. The sequence (\mu_n) is weakly convergent if and only if the pointwise limit f(r) = \lim f_n(r) exists for every r and the function f is continuous at 0. In this situation, \mu_n \stackrel{w}{\to} \mu where:
f(r) = \int \exp(irx) \mu(dx)
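A numerical illustration (parameters chosen arbitrarily): for \mu_n the law of a standardized sum of n i.i.d. \mathrm{Uniform}(0,1) variables, the empirical characteristic functions approach e^{-r^2/2}, the characteristic function of the standard normal, consistent with the central limit theorem.

```python
import numpy as np

rng = np.random.default_rng(4)

# mu_n = law of a standardized sum of n i.i.d. Uniform(0,1) draws;
# its characteristic function f_n(r) should approach exp(-r^2 / 2).
r = np.array([0.5, 1.0, 2.0])
for n in [1, 5, 50]:
    u = rng.random((200_000, n))
    x = (u.sum(axis=1) - n / 2) / np.sqrt(n / 12.0)
    f_n = np.exp(1j * np.outer(x, r)).mean(axis=0)  # Monte Carlo estimate
    print(f"n={n}: Re f_n(r) ~ {np.round(f_n.real, 4)}")
print("target exp(-r^2/2):", np.round(np.exp(-r ** 2 / 2), 4))
```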
14.5 Relations
14.5.1 Relation of almost sure convergence and convergence in probability
- X_n \stackrel{a.s.}{\to} X \implies X_n \stackrel{p}{\to} X
Proof
X_n \stackrel{a.s.}{\to} X implies for all \varepsilon > 0:
\sum_n \mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{<} \infty
and thus
\mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{\to} 0
Now, for \varepsilon > 0
\begin{align*} \lim \mathbb P(|X_n - X| > \varepsilon) &= \lim \mathbb E \mathbb 1[|X_n - X| > \varepsilon]\\ &= \mathbb E \lim \mathbb 1[|X_n - X| > \varepsilon] \quad \text{dominated convergence, dominated by the constant } 1\\ &= 0 \quad \text{by the previous display} \end{align*}
If X_n \stackrel{p}{\to} X then there exists a subsequence (n_i) such that X_{n_i} \stackrel{a.s.}{\to} X
If every subsequence of X_n has a further subsequence that converges to X almost surely then X_n \stackrel{p}{\to} X
14.5.2 Convergence in L^p implies convergence in probability
Proof
For every \varepsilon>0:
\begin{align*} \lim \mathbb P(|X_n - X| > \varepsilon) &= \lim \mathbb P(|X_n - X|^p > \varepsilon^p)\\ &\leq \lim \frac{\mathbb E|X_n - X|^p}{\varepsilon^p} \quad \text{Markov's inequality}\\ &= 0 \end{align*}
14.5.3 Convergence in probability implies weak convergence
Proof
Assume X_n \stackrel{p}{\to} X and let f be a bounded continuous function. Then, for every subsequence (n_i) there exists a subsubsequence (n_{i_j}) such that X_{n_{i_j}} \stackrel{a.s.}{\to} X.
Here,
\begin{align*} \lim X_{n_{i_j}}(\omega) &\stackrel{a.s.}{=} X(\omega) \\ \lim f(X_{n_{i_j}}) &= f(\lim X_{n_{i_j}}) \quad \text{continuity of $f$}\\ &\stackrel{a.s.}{=} f(X(\omega)) \\ \lim \mathbb E[ f \circ X_{n_{i_j}} ] &= \mathbb E[\lim f(X_{n_{i_j}})] \quad \text{DCT, boundedness of $f$}\\ &= \mathbb E [f \circ X] \end{align*}
Thus every subsequence of \mathbb E[f \circ X_n] has a subsubsequence that converges to \mathbb E[f \circ X], so \mathbb E[f \circ X_n] \to \mathbb E[f \circ X], and hence X_n \stackrel{d}{\to} X.
14.5.4 Skorokhod representation
Let (\Omega, \mathcal H, \mathbb P) be our probability space, with random variables (X_n) and X. Then X_n \stackrel{d}{\to} X if and only if there exists another probability space (\Omega', \mathcal H', \mathbb P') and random variables on that space (Y_n) and Y such that
X_n \stackrel{d}{=} Y_n, \quad X \stackrel{d}{=} Y, \quad Y_n \stackrel{a.s.}{\to} Y
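For real-valued random variables the construction is explicit (a sketch of the standard quantile coupling, which these notes do not spell out): put a single U \sim \mathrm{Uniform}(0,1) on the new space and set Y_n = F_n^{-1}(U) and Y = F^{-1}(U), where F_n, F are the distribution functions of X_n, X. A quick check with X_n \sim \mathrm{Exp}(1 + 1/n) \stackrel{d}{\to} \mathrm{Exp}(1):

```python
import numpy as np

rng = np.random.default_rng(5)

# One uniform draw per omega drives the whole coupled sequence:
# Y_n = F_n^{-1}(U) for Exp(1 + 1/n), Y = F^{-1}(U) for Exp(1).
u = rng.random(100_000)
y = -np.log(1 - u)  # Exp(1) quantile transform

for n in [1, 10, 100, 1000]:
    y_n = -np.log(1 - u) / (1 + 1 / n)  # Exp(1 + 1/n) quantile transform
    print(f"n={n}: max |Y_n - Y| over sampled omegas = {np.abs(y_n - y).max():.4f}")
```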