6 Exponential family
6.1 Definition
The exponential family is the set of distributions whose probability density or mass functions can be written in the form: \[ p( x | \eta ) = h(x) \exp \left( \eta^\top t(x) - a(\eta) \right)\]
Here, \(t(x)\) is the sufficient statistic, a vector of functions of \(x\); \(\eta\) is the “natural parameter” of the exponential family; \(h(x)\) is the base measure; and \(a(\eta)\) is the log normalizer, which makes the density integrate to one: \[ a(\eta) = \log \int h(x) \exp \left( \eta^\top t(x) \right) dx \] (the integral becomes a sum when \(x\) is discrete).
6.1.1 Example: Bernoulli distribution
For example, consider the Bernoulli distribution \[\begin{align*} p(x | \pi) &= \pi ^ x (1 - \pi)^{1-x} \\ &= \exp \left( x \log \pi + (1-x) \log (1- \pi) \right) \\ &= \exp \left( ( \log \pi - \log (1- \pi)) x + \log (1- \pi) \right) \\ &= \exp \left( x \log \frac \pi {1- \pi} + \log (1- \pi) \right) \\ \end{align*}\]
Matching terms against the exponential family template gives \(h(x) = 1\), \(t(x) = x\), \(\eta = \log \frac \pi {1- \pi}\), and \(a(\eta) = -\log (1 - \pi) = \log (1 + e^{\eta})\). Notice that the natural parameter here is the log-odds of success.
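As a quick numerical check of this decomposition, the sketch below evaluates the Bernoulli mass function both in its usual form and in exponential family form. The function names and the value \(\pi = 0.3\) are illustrative choices, not part of the original notes.

```python
import math

def bernoulli_pmf(x, pi):
    # Standard parameterisation: pi^x (1 - pi)^(1 - x).
    return pi ** x * (1 - pi) ** (1 - x)

def bernoulli_expfam(x, eta):
    # Exponential family form with h(x) = 1, t(x) = x,
    # and a(eta) = log(1 + exp(eta)) = -log(1 - pi).
    a = math.log1p(math.exp(eta))
    return math.exp(eta * x - a)

pi = 0.3                          # illustrative success probability
eta = math.log(pi / (1 - pi))     # natural parameter: the log-odds

for x in (0, 1):
    print(x, bernoulli_pmf(x, pi), bernoulli_expfam(x, eta))
```

Both columns should agree up to floating-point error for \(x = 0\) and \(x = 1\).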
6.1.2 Example: Gaussian distribution
Another example is the Gaussian distribution: \[\begin{align*} p(x | \mu, \sigma^2) &= \frac{1}{\sqrt {2 \pi \sigma^2}} \exp \left( \frac{- (x- \mu)^2}{2 \sigma^2} \right) \\ &= \frac{1}{\sqrt {2 \pi \sigma^2}} \exp \left( \frac{- x^2 + 2 \mu x - \mu^2}{2 \sigma^2} \right) \\ \end{align*}\]
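Here, \(t(x) = \begin{bmatrix} x^2 \\ x \end{bmatrix}\), \(h(x) = \frac{1}{\sqrt{2 \pi}}\), \(\eta = \begin{bmatrix} \frac{-1}{2\sigma^2}\\ \frac{\mu}{\sigma^2} \end{bmatrix}\), and \(a(\eta) = \frac{1}{2} \log \sigma^2 + \frac{\mu^2}{2\sigma^2}\). Written in terms of the natural parameters \(\eta = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}\), this is \(a(\eta) = -\frac{\eta_2^2}{4 \eta_1} - \frac{1}{2} \log (-2 \eta_1)\).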
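The Gaussian decomposition can be checked the same way. The sketch below assumes \(h(x) = 1/\sqrt{2\pi}\) and \(a(\eta) = \frac{1}{2} \log \sigma^2 + \frac{\mu^2}{2\sigma^2}\), which is what expanding the square in the density gives; the function names and test values are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    # Standard parameterisation of the Gaussian density.
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def gaussian_expfam(x, mu, sigma2):
    # Exponential family form: h(x) * exp(eta^T t(x) - a(eta)).
    eta = (-1.0 / (2 * sigma2), mu / sigma2)   # natural parameters
    t = (x ** 2, x)                            # sufficient statistics
    h = 1.0 / math.sqrt(2 * math.pi)           # base measure
    a = 0.5 * math.log(sigma2) + mu ** 2 / (2 * sigma2)  # log normalizer
    return h * math.exp(eta[0] * t[0] + eta[1] * t[1] - a)

mu, sigma2 = 1.5, 0.8   # illustrative mean and variance
for x in (-2.0, 0.0, 1.5, 3.0):
    print(x, gaussian_pdf(x, mu, sigma2), gaussian_expfam(x, mu, sigma2))
```

The two functions should return the same value at every \(x\), up to floating-point error.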
6.2 Moments of exponential family distributions
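An interesting property of exponential family distributions is that derivatives of the log normalizer \(a(\eta)\) give the moments of the sufficient statistic \(t(x)\): the gradient is the mean \(\mathbb E[t(X)]\), and the matrix of second derivatives is the covariance of \(t(X)\). More generally, the \(i\)th order derivative of \(a(\eta)\) gives the \(i\)th cumulant of \(t(X)\), which coincides with the central moment for orders two and three. The first-order case follows by differentiating under the integral sign: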
\[\begin{align*} \nabla_\eta a(\eta) &= \nabla_\eta \left(\log \int h(x) \exp (\eta^\top t(x)) dx\right)\\ &= \frac{\nabla_\eta \int h(x) \exp (\eta^\top t(x)) dx }{\int h(x) \exp (\eta^\top t(x)) dx } \\ &= \int t(x) \frac{ h(x) \exp (\eta^\top t(x)) }{\int h(x) \exp (\eta^\top t(x)) dx }dx \\ &= \mathbb E _{X\sim p(\cdot|\eta)}[t(X)] \end{align*}\]
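The identity can be illustrated numerically for the Bernoulli distribution, where \(a(\eta) = \log(1 + e^{\eta})\) and \(t(x) = x\). The sketch below approximates the derivatives of \(a\) with central finite differences; the helper names, step sizes, and the value \(\pi = 0.3\) are illustrative assumptions.

```python
import math

def log_normalizer(eta):
    # Bernoulli log normalizer: a(eta) = log(1 + exp(eta)).
    return math.log1p(math.exp(eta))

def derivative(f, x, eps=1e-5):
    # Central finite-difference approximation of f'(x).
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def second_derivative(f, x, eps=1e-4):
    # Central finite-difference approximation of f''(x).
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps ** 2

pi = 0.3                          # illustrative success probability
eta = math.log(pi / (1 - pi))     # corresponding natural parameter

mean_from_a = derivative(log_normalizer, eta)        # approximates E[X] = pi
var_from_a = second_derivative(log_normalizer, eta)  # approximates Var[X] = pi (1 - pi)

print(mean_from_a, pi)
print(var_from_a, pi * (1 - pi))
```

The first derivative should land close to the mean \(\pi\), and the second derivative close to the variance \(\pi(1 - \pi)\), with small discrepancies due to the finite-difference approximation.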