---- $f(Q)=\sum_{i=1}^{T}\|B_{i}-QA_{i}\|_{F}^{2}$ > [!theorem] Theorem. ([[orthogonal procrustes problem]]) > The **orthogonal procrustes problem** finds an [[orthogonal matrix]] $\hat{Q} \in \mathbb{R}^{M \times M}$ that makes two other matrices $B,A \in \mathbb{R}^{M \times N}$ as similar as possible by (possibly improperly)"rotating" each column of $A$: $\hat{Q}= \arg \min_{Q: Q'Q=I_{M}} f(Q), \ \ f(Q):= \|B-QA\|_{F}^{2},$ > where $\|\cdot\|_{F}$ is the [[Frobenius norm]]. > \ > Let $C:=BA'$ and using the [[Singular Value Decomposition of a Matrix|SVD]] write $C=U\Sigma V'$. The problem can be equivalently formulated as $\hat{Q} = \arg \max _{Q: Q'Q=I}g(Q), \ \ g(Q):= \text{Tr}(Q'C),$ > > > for which a solution is $\hat{Q}=UV'.$ \ This solution is unique whenever $BA'$ is [[rank|full rank]] and [[group-invariant function|invariant]] under scaling of $A$ and $B$. > [!proof]- Proof. ([[orthogonal procrustes problem]]) > We first analyze the cost function thus[[trace of a matrix|:]]$\begin{align} > f(Q)= & \|B-QA\|_{F}^{2} = \text{Tr}\big( (B-QA)' (B-QA) \big) \\ > = & \text{Tr}\big( B'B-B'QA - A'Q'B + A'A \big) \\ > = & \text{Tr}(B'B) + \text{Tr}(A'A) - \overbrace{\text{Tr}(B'QA)}^{=\text{Tr}\big( (B'QA)' \big)} - {\text{Tr}\big( (A'Q'B)' \big)} \\ > = & \text{Tr}(B'B) + \text{Tr}(A'A) - 2\text{Tr}(A'Q'B) \\ > = & \text{Tr}(B'B) + \text{Tr}(A'A) - 2\text{Tr}(Q'\textcolor{Thistle}{BA'}) \\ > = & \text{Tr}(B'B) + \text{Tr}(A'A) - 2 \text{Tr}(Q'\textcolor{Thistle}{C}), > \end{align}$ > where in the penultimate step we use the [[cyclic-commutative property of trace]] and $C=U\Sigma V'$ is as in the theorem statement > > $B'B$ and $A'A$ are each [[positive semidefinite matrix|PSD]] and hence their [[trace of a matrix|trace]]s are nonnegative, so the minimizer of $f(Q)$ will be the same as the maximizer of $g(Q):= \text{Tr}(Q'\textcolor{Thistle}{C})=\text{Tr}(Q'\textcolor{Thistle}{U \Sigma V'}).$ > We further analyze $g(Q)$ (in order to bring the [[orthogonal matrix|orthogonal matrices]] together) as $g(Q)=\text{Tr}(Q'U\Sigma V')=\text{Tr}(V'Q'U\Sigma )=\text{Tr}(W\Sigma)=\sum_{m=1}^{M}w_{mm}\sigma_{m}.$ > where $W:=W(Q)=V'Q'U$ is [[general orthogonal group|orthogonal]]. Because columns of $W$ have [[orthonormal|unit norm]], $w_{mm} \leq 1$ for all $m$. This yields the upper bound $g(Q)=\text{Tr}(W \Sigma) \leq \sum_{m=1}^{M} \sigma _{m} = \text{Tr}(I \Sigma)$ > where we see equality is achieved when $W=I$, i.e., when $V'Q'U=I$. Thus rearranging: $\hat{Q}=UV',$ > which is what we wanted to show. > > The uniqueness condition is left as an exercise (see reference in lecture notes). > > We now verify the property of scale-invariance. Suppose $B$ is exactly a rotated version of the columns of $A$, along with an additional spatial scale factor, i.e. $B=\alpha \tilde{Q} A$ for [[orthogonal matrix]] $Q$. (So $A=\frac{1}{\alpha} \tilde{Q}' B$). We show that Procrustes method finds the correct rotation $\hat{Q}=\tilde{Q}$. Let $B=\tilde{U}\tilde{\Sigma}\tilde{V}'$ denote a [[Singular Value Decomposition of a Matrix|SVD]] of $B$. Then an SVD of $C$ is evident by inspection: $C=BA'=\frac{1}{\alpha}BB'\tilde{Q} = \frac{1}{\alpha} \tilde{U}\tilde{\Sigma}\tilde{V}' \tilde{V}\tilde{\Sigma}'\tilde{U}'\tilde{Q}=\underbrace{\tilde{U}}_{U} \underbrace{\frac{1}{\alpha}\tilde{\Sigma}\tilde{\Sigma}'}_{\Sigma} \underbrace{\tilde{U}' \tilde{Q}}_{V'}.$ > Now $\hat{Q}=UV'=(\tilde{U})(\tilde{U}' \tilde{Q})=\tilde{Q}$ > as claimed. ---- #### ----- #### References > [!backlink] > ```dataview TABLE rows.file.link as "Further Reading" FROM [[]] FLATTEN file.tags GROUP BY file.tags as Tag > [!frontlink] > ```dataview > TABLE rows.file.link as "Further Reading" > FROM outgoing([[]]) > FLATTEN file.tags as Tag > WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom" > GROUP BY Tag > ```