9  Convergence

Published December 9, 2025

9.1 Almost sure

Let (\Omega, \mathcal H, \mathbb P) be a probability space. A sequence of random variables X_n converges almost surely if \mathbb P \{\liminf_n X_n(\omega) = \limsup_n X_n(\omega) \in \mathbb R\} = 1.

We say X_n \stackrel{a.s.}{\to} X if X_n converges almost surely and \lim X_n \stackrel{a.s.}{=} X.

\mathbb P( X(\omega) = \lim_n X_n(\omega)) = 1

For general metric spaces, X_n \stackrel{a.s.}{\to} X \iff d(X_n, X) \stackrel{a.s.}{\to} 0.

An equivalent condition: for every \varepsilon > 0, only finitely many of the deviations |X_n - X| exceed \varepsilon: X_n \stackrel{a.s.}{\to} X \iff \sum_n \mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{<} \infty \quad \forall \varepsilon > 0
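As a quick illustration, here is a minimal numpy sketch of this characterisation; the Uniform(0, 1) running-mean example and all constants are illustrative choices, not from the text. By the strong law of large numbers the running mean converges a.s. to 1/2, so along a simulated path the deviations beyond a fixed \varepsilon should stop occurring.

```python
import numpy as np

rng = np.random.default_rng(0)

# One sample path of running means of iid Uniform(0, 1) draws;
# by the strong law of large numbers this converges a.s. to 1/2.
N = 100_000
draws = rng.uniform(size=N)
running_mean = np.cumsum(draws) / np.arange(1, N + 1)

# The characterisation above: for fixed eps, only finitely many
# deviations |X_n - 1/2| should exceed eps along the path.
eps = 0.01
exceed = np.abs(running_mean - 0.5) > eps
print("number of exceedances:", exceed.sum())
print("index of last exceedance:", np.flatnonzero(exceed).max() if exceed.any() else None)
```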

9.1.1 Cauchy criterion

To check almost sure convergence without knowledge of the limit, there is an analogue of the Cauchy criterion (a simulation sketch follows the list). The following are equivalent:

  1. X_n converges almost surely
  2. \sup_{i, j \geq n} |X_i - X_j| \stackrel{a.s.}{\to} 0
  3. \sup_{k} |X_{n+k} - X_n| \stackrel{a.s.}{\to} 0
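To illustrate criterion 3, a small sketch on a random series; the series \sum_k \xi_k 2^{-k} with iid random signs \xi_k is a hypothetical example chosen because its partial sums converge a.s., so the (finite-horizon) Cauchy supremum should shrink with n.

```python
import numpy as np

rng = np.random.default_rng(1)

# Partial sums S_n of the random series sum_k xi_k / 2^k with iid signs;
# the series converges a.s., so sup_k |S_{n+k} - S_n| -> 0 a.s.
N = 60
signs = rng.choice([-1.0, 1.0], size=N)
S = np.cumsum(signs * 0.5 ** np.arange(1, N + 1))

for n in [5, 10, 20, 40]:
    # finite-horizon proxy for sup_k |S_{n+k} - S_n|
    print(n, np.abs(S[n:] - S[n - 1]).max())
```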

9.1.2 Continuous mapping theorem

Let f: \mathbb R^K \to \mathbb R, K \in \mathbb N. If

  1. X_{kn} \stackrel{a.s.}{\to} X_k for all k, and
  2. f is continuous on a set A \subset \mathbb R^K such that \mathbb P((X_1, \dots, X_K) \in A) = 1,

then f((X_{1n}, \dots, X_{Kn})) \stackrel{a.s.}{\to} f((X_{1}, \dots, X_{K}))

Note that the continuity set only needs to almost surely contain the limiting random vector, not the entire sequence.
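A toy check of exactly this point, with all choices constructed for illustration: take X standard normal, X_n = X + 1/n \to X a.s., and f(x) = 1/x, which is continuous on A = \mathbb R \setminus \{0\} with \mathbb P(X \in A) = 1. The theorem then gives 1/X_n \to 1/X a.s. even though f is discontinuous at 0.

```python
import numpy as np

rng = np.random.default_rng(2)

# f(x) = 1/x is continuous on R \ {0}, and a standard normal X avoids
# 0 with probability 1, so 1/(X + 1/n) -> 1/X pointwise a.s.
X = rng.standard_normal(5)  # a handful of sample points omega
for n in [10, 100, 10_000]:
    Xn = X + 1.0 / n
    print(n, np.max(np.abs(1.0 / Xn - 1.0 / X)))
```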

9.1.3 Borel–Cantelli theorem

Let (H_n) be a sequence of events.

\sum_n \mathbb P(H_n) < \infty \implies \sum_n \mathbb 1_{H_n}(\omega) \stackrel{a.s.}{<} \infty

Proof

\infty > \sum_n \mathbb P(H_n) = \sum_n \mathbb E [\mathbb 1_{H_n}] = \mathbb E\left[\sum_n \mathbb 1_{H_n}\right], where the last equality swaps the sum and the expectation by monotone convergence.

And for a non-negative random variable, \mathbb E[X] < \infty \implies X \stackrel{a.s.}{<} \infty; apply this to X = \sum_n \mathbb 1_{H_n}.
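A quick Monte Carlo illustration; the independent events with \mathbb P(H_n) = n^{-2} are a convenient choice (independence is not needed for this direction of Borel–Cantelli). The probabilities are summable, so across simulated paths only a handful of the H_n ever occur.

```python
import numpy as np

rng = np.random.default_rng(3)

# Independent events H_n with P(H_n) = 1/n^2 (summable). Count how
# many occur along each simulated path.
N, paths = 10_000, 500
n = np.arange(1, N + 1)
occur = rng.uniform(size=(paths, N)) < 1.0 / n**2
counts = occur.sum(axis=1)
print("max occurrences over all paths:", counts.max())
print("mean occurrences:", counts.mean())  # ~ sum 1/n^2 = pi^2/6 ~ 1.64
```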

9.1.4 Sufficient conditions

Using Borel–Cantelli and the characterisation we saw earlier (setting H_n = \{|X_n - X| > \varepsilon\}), we have a nice sufficient condition: \sum_n \mathbb P(|X_n - X| > \varepsilon) < \infty \ \forall \varepsilon > 0 \implies X_n \stackrel{a.s.}{\to} X

Alternatively, if we don’t know the limit in advance, we can use another sufficient condition:

There exists a sequence \varepsilon_n > 0 such that \sum_n \varepsilon_n < \infty and

\sum_n \mathbb P(|X_{n+1} - X_n| > \varepsilon_n) < \infty

Then X_n converges almost surely.
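A sketch of why this suffices, chaining Borel–Cantelli with the Cauchy criterion above: almost surely |X_{n+1} - X_n| \leq \varepsilon_n for all but finitely many n, so for large n, by the triangle inequality,

\sup_k |X_{n+k} - X_n| \leq \sum_{m \geq n} \varepsilon_m \to 0

and criterion 3 of the Cauchy criterion applies.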

9.2 In probability

9.2.1 Definition

X_n \stackrel{p}{\to} X if for all \varepsilon >0, \lim \mathbb P(|X_n - X| > \varepsilon) = 0

For general metric spaces, the condition becomes \lim \mathbb P(d(X_n, X) > \varepsilon) = 0 for all \varepsilon > 0.
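A minimal Monte Carlo sketch; the Gaussian noise with variance 1/n is an arbitrary illustrative choice. With X_n = X + N(0, 1/n) noise, the deviation probability \mathbb P(|X_n - X| > \varepsilon) shrinks with n.

```python
import numpy as np

rng = np.random.default_rng(7)

# X_n = X + N(0, 1/n) noise: P(|X_n - X| > eps) -> 0, i.e. X_n -> X
# in probability. Monte Carlo estimate of the deviation probability.
m = 1_000_000
eps = 0.1
for n in [1, 10, 100, 1000]:
    noise = rng.normal(scale=1.0 / np.sqrt(n), size=m)
    print(n, (np.abs(noise) > eps).mean())
```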

9.2.2 Relation with almost sure convergence

  1. X_n \stackrel{a.s.}{\to} X \implies X_n \stackrel{p}{\to} X
Proof

X_n \stackrel{a.s.}{\to} X implies for all \varepsilon > 0:

\sum_n \mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{<} \infty

and thus

\mathbb 1[|X_n - X| > \varepsilon](\omega) \stackrel{a.s.}{\to} 0

Now, for \varepsilon > 0

\begin{align*} \lim \mathbb P(|X_n - X| > \varepsilon) &= \lim \mathbb E\, \mathbb 1[|X_n - X| > \varepsilon]\\ &= \mathbb E \lim \mathbb 1[|X_n - X| > \varepsilon] && \text{dominated convergence: the indicators are bounded by the constant 1}\\ &= 0 \end{align*}

  2. If X_n \stackrel{p}{\to} X then there exists a subsequence (n_i) such that X_{n_i} \stackrel{a.s.}{\to} X

  3. If every subsequence of X_n has a further subsequence that converges to X almost surely then X_n \stackrel{p}{\to} X
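The converse of 1 fails, which is why 2 only recovers a subsequence. The classic counterexample is the "typewriter" sequence, sketched below on \Omega = [0, 1) with \mathbb P the Lebesgue measure: at stage k the indicator of a dyadic interval of length 2^{-k} sweeps across [0, 1), so \mathbb P(X_n \neq 0) \to 0, yet every \omega is hit once per stage and X_n(\omega) converges for no \omega.

```python
import numpy as np

# "Typewriter" sequence on Omega = [0, 1): at stage k there are 2^k
# indices, and X_n is the indicator of the j-th dyadic interval of
# length 2^-k. P(X_n = 1) = 2^-k -> 0 (convergence in probability to 0),
# but every omega lands in one interval per stage, so X_n(omega)
# equals 1 infinitely often: no almost sure convergence.

def typewriter(n, omega):
    k = int(np.floor(np.log2(n + 1)))   # stage
    j = n + 1 - 2**k                    # position within stage
    return 1.0 if j / 2**k <= omega < (j + 1) / 2**k else 0.0

omega = 0.3
hits = [n for n in range(1, 2**10) if typewriter(n, omega) == 1.0]
print("indices where X_n(0.3) = 1:", hits)
```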

9.3 L^p convergence

For p \geq 1, X_n \stackrel{L^p}{\to} X if \mathbb E[|X_n - X|^p] \to 0

9.3.1 Convergence in L^p implies convergence in probability

Proof

For every \varepsilon>0:

\begin{align*} \lim \mathbb P(|X_n - X| > \varepsilon) &= \lim \mathbb P(|X_n - X|^p > \varepsilon^p)\\ &\leq \lim \frac{\mathbb E[|X_n - X|^p]}{\varepsilon^p} && \text{Markov's inequality}\\ &= 0 \end{align*}
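A numeric sanity check of the Markov step; the standard normal and the constants are arbitrary illustrative choices. For simulated Z, the tail probability sits below \mathbb E|Z|^p / \varepsilon^p.

```python
import numpy as np

rng = np.random.default_rng(4)

# Numeric check of the Markov step: P(|Z| > eps) <= E|Z|^p / eps^p.
Z = rng.standard_normal(1_000_000)
for eps, p in [(0.5, 2), (1.0, 2), (2.0, 4)]:
    lhs = (np.abs(Z) > eps).mean()
    rhs = (np.abs(Z) ** p).mean() / eps**p
    print(f"eps={eps}, p={p}: P={lhs:.4f} <= bound={rhs:.4f}")
```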

9.3.2 L^1 convergence

The following are equivalent:

  1. X_n \stackrel{L^1}{\to} X
  2. X_n \stackrel{p}{\to} X and (X_n) is uniformly integrable (i.e. \lim_{b \to \infty} \sup_{n \in \mathbb N} \mathbb E[|X_n| \mathbb 1(|X_n| > b)] = 0)
  3. X_n \in L^1, X \in L^1, X_n \stackrel{p}{\to} X and \mathbb E|X_n| \to \mathbb E|X|
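Both 2 and 3 fail in the classic counterexample below, a textbook construction sketched here for illustration: X_n = n \mathbb 1(U < 1/n) with U \sim \text{Uniform}(0,1) satisfies X_n \stackrel{p}{\to} 0 but \mathbb E|X_n| = 1 for every n, so the family is not uniformly integrable and L^1 convergence fails.

```python
import numpy as np

rng = np.random.default_rng(5)

# X_n = n * 1(U < 1/n) for a single U ~ Uniform(0,1): X_n -> 0 in
# probability (P(X_n != 0) = 1/n -> 0) but E|X_n| = 1 for every n,
# so (X_n) is not uniformly integrable and does not converge in L^1.
U = rng.uniform(size=1_000_000)
for n in [10, 1000, 100_000]:
    Xn = n * (U < 1.0 / n)
    print(n, "P(X_n > 0) ~", (Xn > 0).mean(), " E|X_n| ~", Xn.mean())
```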

9.4 Weak convergence

9.4.1 Definition

A sequence of measures converges weakly \mu_n \stackrel{w}{\to} \mu if for every bounded continuous function f:

\mu_n[f] \to \mu[f]

For convergence in distribution of a sequence of random variables X_n \stackrel{d}{\to} X, we require weak convergence of the pushforwards \mathbb P \circ X_n^{-1}, i.e. for every bounded continuous function f: \mathbb E[f(X_n)] \to \mathbb E [f(X)]
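A deterministic illustration; the test function f = \cos and the laws are arbitrary choices. X_n uniform on \{1/n, \dots, n/n\} converges in distribution to U \sim \text{Uniform}(0,1), and \mathbb E[f(X_n)] is exactly the Riemann sum of \int_0^1 f.

```python
import numpy as np

# X_n uniform on {1/n, 2/n, ..., 1} converges in distribution to
# U ~ Uniform(0, 1): E[f(X_n)] is a Riemann sum of E[f(U)] for any
# bounded continuous f, here f = cos.
f = np.cos
for n in [10, 100, 10_000]:
    grid = np.arange(1, n + 1) / n
    print(n, f(grid).mean())      # -> integral of cos over [0, 1]
print("limit:", np.sin(1.0))      # E[cos(U)] = sin(1)
```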

9.4.2 Portmanteau theorem
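The following are equivalent (the standard statement, for probability measures on a metric space):

  1. \mu_n \stackrel{w}{\to} \mu
  2. \limsup_n \mu_n(F) \leq \mu(F) for every closed set F
  3. \liminf_n \mu_n(G) \geq \mu(G) for every open set G
  4. \mu_n(A) \to \mu(A) for every \mu-continuity set A, i.e. every Borel set A with \mu(\partial A) = 0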

9.4.3 Skorokhod representation

Let (\Omega, \mathcal H, \mathbb P) be our probability space, with random variables (X_n) and X. Then X_n \stackrel{d}{\to} X if and only if there exists another probability space (\Omega', \mathcal H', \mathbb P') and random variables on that space (Y_n) and Y such that

X_n \stackrel{d}{=} Y_n, \quad X \stackrel{d}{=} Y, \quad Y_n \stackrel{a.s.}{\to} Y
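On \mathbb R the standard construction behind the proof is quantile coupling: take a single U \sim \text{Uniform}(0,1) on the new space and set Y_n = F_n^{-1}(U), Y = F^{-1}(U). A small sketch of this coupling below; scipy is assumed, and the laws X_n \sim N(1/n, 1) \to N(0, 1) are chosen for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)

# Quantile coupling: a single U ~ Uniform(0,1) drives every Y_n = F_n^{-1}(U)
# and Y = F^{-1}(U). Each Y_n has the law of X_n ~ N(1/n, 1), Y ~ N(0, 1),
# and the quantile functions converge pointwise, so Y_n -> Y a.s.
U = rng.uniform(size=5)               # a few sample points omega'
Y = norm.ppf(U)                       # F^{-1}(U)
for n in [1, 10, 1000]:
    Yn = norm.ppf(U, loc=1.0 / n)     # F_n^{-1}(U) = F^{-1}(U) + 1/n here
    print(n, np.max(np.abs(Yn - Y)))  # -> 0
```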

9.4.4 Continuous mapping theorem

Let h be a measurable function whose set of discontinuity points is D_h. If X_n \stackrel{d}{\to} X and \mathbb P(X \in D_h) = 0, then h(X_n) \stackrel{d}{\to} h(X).

Proof

Let X_n \stackrel{d}{\to} X. Then there exists a Skorokhod probability space (\Omega', \mathcal H', \mathbb P') and random variables on that space (Y_n) and Y such that X_n \stackrel{d}{=} Y_n, X \stackrel{d}{=} Y, Y_n \stackrel{a.s.}{\to} Y.

In this new space, \mathbb P(Y \in D_h) = \mathbb P(X \in D_h) = 0, i.e. the set of discontinuity points of h is negligible for the limit. Thus for every \omega outside a null set, Y_n(\omega) \to Y(\omega) and Y(\omega) \notin D_h, so

h(Y_n) \stackrel{a.s.}{\to} h(Y)

Almost sure convergence implies convergence in distribution, so h(Y_n) \stackrel{d}{\to} h(Y). Since h(X_n) \stackrel{d}{=} h(Y_n) and h(X) \stackrel{d}{=} h(Y), we conclude h(X_n) \stackrel{d}{\to} h(X).

9.4.5 Convergence in probability implies weak convergence

Proof

Assume X_n \stackrel{p}{\to} X. Then, for every subsequence (n_i) there exists a subsubsequence (n_{i_j}) such that X_{n_{i_j}} \stackrel{a.s.}{\to} X.

Fix a bounded continuous f. Then

\begin{align*} \lim X_{n_{i_j}}(\omega) &\stackrel{a.s.}{=} X(\omega) \\ \lim f(X_{n_{i_j}}(\omega)) &= f(\lim X_{n_{i_j}}(\omega)) && \text{continuity of $f$}\\ &\stackrel{a.s.}{=} f(X(\omega)) \\ \lim \mathbb E[ f \circ X_{n_{i_j}} ] &= \mathbb E[\lim f \circ X_{n_{i_j}}] && \text{DCT, boundedness of $f$}\\ &= \mathbb E [f \circ X] \end{align*}

Thus every subsequence of the real sequence \mathbb E[f \circ X_n] has a further subsequence converging to \mathbb E[f \circ X]; a real sequence with this property converges, so \mathbb E[f \circ X_n] \to \mathbb E [f \circ X], i.e. X_n \stackrel{d}{\to} X.