5 Joint distributions
5.1 Definitions
Let (\Omega, \mathcal H, \mathbb P) be a probability space.
5.1.1 Random variable
A random variable is a measurable function X: (\Omega, \mathcal H) \to (E, \mathcal E). The distribution of X is the pushforward measure \mathbb P \circ X^{-1}.
\mathbb P(X \in A) = \mathbb P(X^{-1} (A))
\mu is the distribution of X \iff \forall f \in \mathcal E_+, \mathbb E[f\circ X] = \mu[f], where \mathcal E_+ denotes the non-negative \mathcal E-measurable functions.
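As a sanity check, here is a minimal sketch of the pushforward construction on a finite space; the fair die and its parity are illustrative assumptions, not part of the notes:

```python
import numpy as np

# Minimal sketch of a pushforward distribution on a finite space.
# Omega (a fair die) and X (its parity) are illustrative assumptions.
omega = np.arange(1, 7)           # sample space {1, ..., 6}
p = np.full(6, 1 / 6)             # uniform probability measure on Omega
X = omega % 2                     # random variable X(w) = w mod 2

# P(X in A) = P(X^{-1}(A)), checked for A = {1} (odd outcomes)
A = [1]
print(p[np.isin(X, A)].sum())     # approximately 0.5

# mu is the distribution of X iff E[f o X] = mu[f] for non-negative f;
# checked for f(x) = x**2 against the pushforward mu
mu = {e: p[X == e].sum() for e in np.unique(X)}
f = lambda x: x ** 2
assert np.isclose(sum(f(e) * m for e, m in mu.items()),   # mu[f]
                  np.sum(f(X) * p))                        # E[f o X]
```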
5.1.2 Expectation
The expectation is the Lebesgue integral of a random variable with respect to the probability measure: \mathbb E[X] = \mathbb P[X] = \int_\Omega X \, d\mathbb P.
Some handy relations between expectation and probability:
- \mathbb P(A) = \mathbb E[1_A]
- For a non-negative random variable X, \mathbb E[X] = \int_0^\infty \mathbb P(X > t) \, dt
Proof
\begin{align*}
\int_0^\infty \mathbb P(X > t) \, dt &= \int_{t = 0}^\infty \int_x \mathbb I(x > t) \, dP_X(x) \, dt \\
&= \int_x \int_{t = 0}^x dt \, dP_X(x) \quad \text{(Fubini)} \\
&= \int_x x \, dP_X(x) = \mathbb E[X]
\end{align*}
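The identity is easy to check numerically. Below is a hedged Monte Carlo sketch; the Exponential(1) distribution is an illustrative choice, since its survival function e^{-t} and mean 1 are known in closed form:

```python
import numpy as np

# Monte Carlo check of E[X] = \int_0^inf P(X > t) dt.
# Exponential(1) is an illustrative assumption: P(X > t) = e^{-t}, E[X] = 1.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)        # samples of X >= 0

t = np.linspace(0.0, 20.0, 401)                     # truncate the tail at 20
survival = np.array([(x > ti).mean() for ti in t])  # empirical P(X > t)

lhs = x.mean()                                      # Monte Carlo E[X]
# trapezoid rule for \int_0^20 P(X > t) dt
rhs = np.sum((survival[:-1] + survival[1:]) / 2 * np.diff(t))
print(lhs, rhs)                                     # both approximately 1
```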
5.1.3 Moment generating function
For a random variable X and s \in \mathbb R,
M_X(s) = \mathbb E[\exp(sX)]
For a random vector X \in \mathbb R^d and t \in \mathbb R^d,
M_X(t) = \mathbb E[\exp(t^\top X)]
Properties:
- For independent RVs X and Y, M_{X+Y}(s) = M_X(s) M_Y(s)
- If the MGF is finite in a neighbourhood of 0, then \mathbb E|X^n| < \infty for all n and M_X(s) = \sum_{n=0}^\infty \frac{s^n}{n!} \mathbb E [X^n].
- Thus, the r-th derivative at zero gives the r-th moment, M_X^{(r)}(0) = \mathbb E[X^r] (see the symbolic sketch after the proof below).
Proof
\begin{align*} M_X(s) &= \mathbb E[\exp sX] = \mathbb E[ \sum_{n=0}^\infty \frac{(sX)^n}{n!} ]\\ &= \mathbb E[ \lim_{N \to \infty} \sum_{n=0}^N \frac{(sX)^n}{n!} ]\\ &= \lim_{N \to \infty} \mathbb E[ \sum_{n=0}^N \frac{(sX)^n}{n!} ]\\ &= \sum_{n=0}^\infty \frac{s^n}{n!} \mathbb E[X^n] \end{align*}
For the third step, we need to swap the limit and the expectation. We can use the dominated convergence theorem with the dominating function \exp|sX|, since every partial sum satisfies \left| \sum_{n=0}^N \frac{(sX)^n}{n!} \right| \le \exp|sX|. This dominating function is integrable because, by assumption, M_X is finite on a neighbourhood (-s_0, s_0) of zero, so for |s| < s_0:
\mathbb E[\exp|sX|] \le \mathbb E[e^{sX} + e^{-sX}] = M_X(s) + M_X(-s) < \infty
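Both the independence and moment properties can be checked symbolically. The sketch below (using sympy, an illustrative choice) assumes X, Y iid N(0, 1), whose MGF \exp(s^2/2) is finite for all s, so the series property applies:

```python
import sympy as sp

# Symbolic check of the MGF properties for X, Y iid N(0, 1); the closed
# form M(s) = exp(s**2 / 2) is standard and finite everywhere.
s = sp.symbols('s')
M = sp.exp(s**2 / 2)                    # MGF of a standard normal

# Independence: M_{X+Y}(s) = M_X(s) M_Y(s) = exp(s**2), the MGF of N(0, 2)
assert sp.simplify(M * M - sp.exp(s**2)) == 0

# Moments from derivatives at zero: M^{(r)}(0) = E[X^r]
moments = [sp.diff(M, s, r).subs(s, 0) for r in range(5)]
print(moments)                          # [1, 0, 1, 0, 3]
```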
5.1.4 Characteristic function
For a random variable X and t \in \mathbb R,
\varphi_X(t) = \mathbb E[\exp(itX)]
Unlike the MGF, the characteristic function exists for every random variable, since |\exp(itX)| = 1 is bounded and hence integrable.
5.2 Product spaces
Let (E, \mathcal E) and (F, \mathcal F) be two measurable spaces.
5.2.1 Product sigma algebra
A measurable rectangle is a Cartesian product A \times B of measurable sets A \in \mathcal E and B \in \mathcal F. The sigma algebra generated by all measurable rectangles is called the product sigma algebra.
\mathcal E \otimes \mathcal F = \sigma \{ A \times B: A \in \mathcal E, B \in \mathcal F \}
This extends directly to a product of finitely many sigma algebras \mathcal E_1, \dots, \mathcal E_n. Note that the set of all measurable rectangles is a p-system.
\mathcal E_1 \otimes \dots \otimes \mathcal E_n = \sigma \{ A_1 \times A_2 \times \dots \times A_n : A_i \in \mathcal E_i, \ i = 1, \dots, n \}
The concatenation \omega \mapsto (f_1(\omega), \dots, f_n(\omega)) of functions f_i: \Omega \to E_i, each measurable with respect to \mathcal E_i, is measurable with respect to the product sigma algebra.
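On finite spaces the generated sigma algebra can be computed by brute force, which makes the definition concrete. The two-point factors below are an illustrative assumption; on finite sets, closure under complement and binary union suffices:

```python
from itertools import combinations, product

# Brute-force construction of a product sigma algebra on a toy space
# E = F = {0, 1} (an illustrative assumption, not from the notes).
E, F = [0, 1], [0, 1]
space = frozenset(product(E, F))                 # E x F has 4 points

def subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

# measurable rectangles A x B with A in powerset(E), B in powerset(F)
rectangles = {frozenset(product(a, b)) for a in subsets(E) for b in subsets(F)}

def sigma(space, generators):
    """Close the generators under complement and binary union."""
    out = {frozenset(), space} | set(generators)
    changed = True
    while changed:
        changed = False
        for a in list(out):
            for candidate in [space - a] + [a | b for b in list(out)]:
                if candidate not in out:
                    out.add(candidate)
                    changed = True
    return out

# Singletons {(e, f)} = {e} x {f} are rectangles, so here the product
# sigma algebra is the full powerset of the 4-point space: 2**4 sets.
print(len(sigma(space, rectangles)))             # 16
```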
5.2.2 Infinite product spaces
Let T be an arbitrary index set such that for each t \in T, (E_t, \mathcal E_t) is a measurable space. A measurable rectangle is a Cartesian product of the form \times_{t \in T} A_t = \{ x \in \times_{t \in T} E_t : x_t \in A_t \ \forall t \in T \}
where A_t \in \mathcal E_t for every t, and A_t \subsetneq E_t for only finitely many t.
The sigma algebra generated by all measurable rectangles is the product sigma algebra.
Again, the arbitrary “concatenation” of measurable functions f_t: \Omega \to E_t each measurable with respect to some sigma algebra \mathcal E_t is measurable with respect to the product sigma algebra.
5.3 Sigma algebra generated by random variables
Let X be \mathcal H/\mathcal E-measurable. Then:
\sigma X = X^{-1} \mathcal E = \{X^{-1} A : A \in \mathcal E\}
\sigma X is the smallest sigma algebra that X is measurable with respect to.
For an arbitrary collection of random variables X_t: \Omega \to E_t, t \in T:
\sigma \{X_t: t \in T \} = \sigma \left( \cup_{t \in T} \sigma X_t \right)
5.3.1 Measurability with random variable sigma algebra
A function V is measurable with respect to \sigma X if and only if V = f \circ X for some f \in \mathcal E, i.e. some \mathcal E-measurable function (the Doob–Dynkin lemma).
For a stochastic process, V is measurable with respect to \sigma \{X_t: t \in T \} if and only if there exists a sequence (t_n) of indices in T and a function f \in \otimes_n \mathcal E_{t_n} such that:
V = f(X_{t_1}, X_{t_2}, \dots)
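In the finite case the factorization criterion is easy to verify directly: V is \sigma X-measurable exactly when V is constant on the atoms \{X = e\}. A sketch with illustrative choices of \Omega, X, and V:

```python
import numpy as np

# Finite sketch of the factorization V = f o X: on a finite Omega,
# sigma(X) is generated by the atoms {X = e}, and V is sigma(X)-measurable
# exactly when V is constant on each atom. Omega, X, V are illustrative.
omega = np.arange(8)              # Omega = {0, ..., 7}
X = omega // 2                    # atoms of sigma(X): {0,1}, {2,3}, {4,5}, {6,7}
V = (omega // 2) ** 2             # constant on every atom, so V = f(X)

# recover f from V: f(e) is the common value of V on the atom {X = e}
f = {e: V[X == e][0] for e in np.unique(X)}
assert all(V[i] == f[X[i]] for i in range(len(omega)))

# counterexample: W = identity separates points inside atoms, so it is
# not sigma(X)-measurable and no f with W = f(X) can exist
W = omega
assert any(len(set(W[X == e])) > 1 for e in np.unique(X))
```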