2 Schur complement
3 Inverting block matrices using Schur complement
Let’s learn to invert a matrix \(M_{(m+ n) \times (m+n)}\) consisting of four blocks \(A_{m \times m}, B_{m \times n}, C_{n \times m}, D_{n \times n}\).
\[ \begin{bmatrix} A_{m \times m} & B_{m \times n} \\ C_{n \times m} & D_{n \times n} \end{bmatrix} \begin{bmatrix} X & Y \\ Z & W \\ \end{bmatrix} = \mathbf I_{m+n} \]
\[ \begin{bmatrix} A X + B Z & A Y + B W \\ C X + D Z & C Y + D W \end{bmatrix}= \begin{bmatrix} \mathbf I_m & \mathbf 0_{m \times n} \\ \mathbf 0_{n \times m} & \mathbf I_n \\ \end{bmatrix} \]
We can separate these matrix equations and do some straightforward linear equation solving (assuming \(A\) and \(D\) are invertible):
\[\begin{align*} A Y + B W &= \mathbf 0 \\ A Y &= - B W \\ Y &= - A^{-1} B W \end{align*}\]
\[\begin{align*} C X + D Z &= \mathbf 0 \\ D Z &= - C X\\ Z &= - D^{-1} C X \end{align*}\]
\[\begin{align*} D W + C Y &= \mathbf I \\ D W - C A^{-1} BW &= \mathbf I \\ (D- C A^{-1} B) W &= \mathbf I\\ W &= (D- C A^{-1} B)^{-1} \end{align*}\]
\[\begin{align*} Y &= A^{-1} BW\\ Y &= -A^{-1} B(D - C A^{-1} B)^{-1} \end{align*}\]
\[\begin{align*} A X + B Z &= \mathbf I \\ AX - B D^{-1} C X &= \mathbf I\\ (A- BD^{-1} C) X &= \mathbf I\\ X &= (A- BD^{-1} C)^{-1} \end{align*}\]
\[\begin{align*} Z &= - D^{-1} C (A - BD^{-1} C)^{-1} \end{align*}\] \[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} X & Y \\ Z & W \\ \end{bmatrix} = \begin{bmatrix} (A- BD^{-1} C)^{-1} & -A^{-1} B(D - C A^{-1} B)^{-1} \\ - D^{-1} C (A - BD^{-1} C)^{-1} & (D- C A^{-1} B)^{-1} \end{bmatrix} \]
Substituting \(Q := D\), \(U := -C\), \(R := A^{-1}\), \(V :=B\) into the Woodbury Matrix Identity, we get \[W = (D- CA^{-1}B)^{-1} = D^{-1} + D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1} \]
\[\begin{align*} Y &= -A^{-1} B(D - C A^{-1} B)^{-1} \\ &= -A^{-1} B [D^{-1} + D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1}]\\ &= - [A^{-1} B D^{-1} + A^{-1} B D^{-1} C (A - BD^{-1}C)^{-1} BD^{-1}]\\ &= - [A^{-1} + A^{-1} B D^{-1} C (A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1}[I + B D^{-1} C (A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1}[I - (A- B D^{-1} C ) (A - BD^{-1}C)^{-1} + A(A - BD^{-1}C)^{-1} ]BD^{-1}\\ &= - A^{-1} A(A - BD^{-1}C)^{-1} BD^{-1}\\ &= - (A - BD^{-1}C)^{-1} BD^{-1} \end{align*}\]
If D is invertible, we define the Schur complement of D as \(M /D = A - BD^{-1}C\). We can easily express the block matrix inverse in terms of the inverse of this Schur complement: \[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =\begin{bmatrix} (M/D)^{-1} & - (M/D)^{-1} BD^{-1}\\ -D^{-1}C (M/D)^{-1} & D^{-1} + D^{-1} C (M/D)^{-1} BD^{-1} \end{bmatrix} \]
Equivalently, we can rephrase this in terms of the Schur complement of \(A\), \(M /A = D - CA^{-1}B\): \[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} =\begin{bmatrix} A^{-1} + A^{-1} B (M/A)^{-1} CA^{-1} & -A^{-1}B (M/A)^{-1}\\ - (M/A)^{-1} CA^{-1} & (M/A)^{-1} \end{bmatrix} \]