21 Matrix inversion
21.1 Woodbury identity
If we have matrices A_{n \times n}, U_{n\times k}, C_{k \times k}, V_{k \times n}, then
( A + UCV)^{-1} = A^{-1} - A^{-1} U (C^{-1} + VA^{-1}U)^{-1} VA^{-1}
Derivation
Consider matrices X and Y.
\begin{align*} X(I + YX) &= (I+XY)X\\ (I+XY) ^{-1} X &= X(I + YX) ^{-1} \end{align*}
\begin{align*} I &= (I+P)(I+P)^{-1} \\ I &= (I+P)^{-1} + P(I+P)^{-1} \\ (I+P)^{-1} &= I - P(I+P)^{-1} \quad \text{(Rearranging terms)}\\ (I + YX)^{-1} &= I - YX(I+YX)^{-1} \quad \text{(Substituting } P = YX \text{)}\\ (I + YX)^{-1} &= I - Y (I+XY)^{-1}X \quad \text{(Using (1)) } \end{align*}
We substitue Y := A^{-1} U and X := C V
\begin{align*} (I + A^{-1} UCV)^{-1} &= I - A^{-1} U (I + CVA^{-1}U)^{-1} CV\\ (A^{-1} A + A^{-1} UCV)^{-1} &= I - A^{-1} U (C C^{-1} + CVA^{-1}U)^{-1} CV\\ [A^{-1}( A + UCV)]^{-1} &= I - A^{-1} U [ C (C^{-1} + VA^{-1}U)]^{-1} CV\\ ( A + UCV)^{-1} A &= I - A^{-1} U (C^{-1} + VA^{-1}U)^{-1} C^{-1} CV\\ ( A + UCV)^{-1} &= A^{-1} - A^{-1} U (C^{-1} + VA^{-1}U)^{-1} VA^{-1} \end{align*}
21.2 Block matrix inversion using Schur complements
Say we have a matrix M_{(m+ n) \times (m+n)} consisting of four blocks A_{m \times m}, B_{m \times n}, C_{n \times m}, D_{n \times n}.
Then, \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =\begin{bmatrix} (M/D)^{-1} & - (M/D)^{-1} BD^{-1}\\ -D^{-1}C (M/D)^{-1} & D^{-1} + D^{-1} C (M/D)^{-1} BD^{-1} \end{bmatrix}
Derivation
\begin{bmatrix} A_{m \times m} & B_{m \times n} \\ C_{n \times m} & D_{n \times n} \end{bmatrix} \begin{bmatrix} X & Y \\ Z & W \\ \end{bmatrix} = \mathbf I_{m+n}
\begin{bmatrix} A X + B Z & A Y + B W \\ C X + D Z & C Y + D W \end{bmatrix}= \begin{bmatrix} \mathbf I_m & \mathbf 0_{m \times n} \\ \mathbf 0_{n \times m} & \mathbf I_n \\ \end{bmatrix}
We can separate these matrix equations and do some straightforward linear equation solving (assuming A and D are invertible):
\begin{align*} A Y + B W &= \mathbf 0 \\ A Y &= - B W \\ Y &= - A^{-1} B W \end{align*}
\begin{align*} C X + D Z &= \mathbf 0 \\ D Z &= - C X\\ Z &= - D^{-1} C X \end{align*}
\begin{align*} D W + C Y &= \mathbf I \\ D W - C A^{-1} BW &= \mathbf I \\ (D- C A^{-1} B) W &= \mathbf I\\ W &= (D- C A^{-1} B)^{-1} \end{align*}
\begin{align*} Y &= A^{-1} BW\\ Y &= -A^{-1} B(D - C A^{-1} B)^{-1} \end{align*}
\begin{align*} A X + B Z &= \mathbf I \\ AX - B D^{-1} C X &= \mathbf I\\ (A- BD^{-1} C) X &= \mathbf I\\ X &= (A- BD^{-1} C)^{-1} \end{align*}
\begin{align*} Z &= - D^{-1} C (A - BD^{-1} C)^{-1} \end{align*} \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} X & Y \\ Z & W \\ \end{bmatrix} = \begin{bmatrix} (A- BD^{-1} C)^{-1} & -A^{-1} B(D - C A^{-1} B)^{-1} \\ - D^{-1} C (A - BD^{-1} C)^{-1} & (D- C A^{-1} B)^{-1} \end{bmatrix}
Substituting Q := D, U := -C, R := A^{-1}, V :=B into the Woodbury Matrix Identity, we get W = (D- CA^{-1}B)^{-1} = D^{-1} + D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1}
\begin{align*} Y &= -A^{-1} B(D - C A^{-1} B)^{-1} \\ &= -A^{-1} B [D^{-1} + D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1}]\\ &= - [A^{-1} B D^{-1} + A^{-1} B D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1}]\\ &= - [A^{-1} + A^{-1} B D^{-1} C (A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1}[I + B D^{-1} C (A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1}[I - (A- B D^{-1} C ) (A - BD^{-1}C)^{-1} + A(A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1} A(A - BD^{-1}C)^{-1} BD^{-1}\\ &= - (A - BD^{-1}C)^{-1} BD^{-1} \end{align*}
If D is invertible, we define the Schur complement of D as M /D = A - BD^{-1}C. We can easily express the block matrix inverse in terms of the inverse of this Schur complement: \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =\begin{bmatrix} (M/D)^{-1} & - (M/D)^{-1} BD^{-1}\\ -D^{-1}C (M/D)^{-1} & D^{-1} + D^{-1} C (M/D)^{-1} BD^{-1} \end{bmatrix}
Equivalently, we can rephrase this in terms of the Schur complement of A, M /A = D - CA^{-1}B: \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =\begin{bmatrix} A^{-1} + A^{-1} B (M/A)^{-1} CA^{-1} & -A^{-1}B (M/A)^{-1}\\ - (M/A)^{-1} CA^{-1} & (M/A)^{-1} \end{bmatrix}A convenient mnemonic is mentioned in Power (2022).