-----
> [!proposition] Proposition. ([[the vec trick]])
> For $A \in \mathbb{F}^{P \times M}, X \in \mathbb{F}^{M \times N}, B \in \mathbb{F}^{N \times Q}$, we have $\text{vec}(AXB)=(B^{\top} \otimes A)\text{vec}(X),$
> where
> - $\text{vec}$ denotes the [[vectorize operation]];
> - $B^\top$ denotes the [[matrix transpose]] of $B$ (*not* the [[conjugate transpose]]!);
> - $\otimes$ denotes the [[Kronecker product]].

> [!note] Remark on Time Complexity.
> Observe that computing the LHS is much faster than computing the RHS; the RHS is mostly just for analysis.
> **LHS.** Computing $XB$ takes around $MNQ$ scalar multiplications, and computing $A(XB)$ then takes around $PMQ$ more, for a total of about $MNQ + PMQ$. ([[vectorize operation|Vectorizing]] can be done in $O(1)$ time.) With all dimensions of order $N$, this is $O(N^{3})$.
> **RHS.** Computing the [[Kronecker product]] $B^{\top} \otimes A$ ultimately requires multiplying $a_{ij}b_{k\ell}$ as $i,j,k,\ell$ range over $P,M,N,Q$; hence $PMNQ$ scalar multiplications. Multiplying the resulting $PQ \times MN$ matrix by $\text{vec}(X)$ then requires another $PQMN$ scalar multiplications, for a total of about $2PMNQ$. With all dimensions of order $N$, this is $O(N^{4})$.

> [!proof]- Proof.
> ([[the vec trick]]) Let
> $X=\begin{bmatrix} | & | & | \\ \b x_{1} & \dots & \b x_{N} \\ | & | & | \end{bmatrix} ,\text{ such that }\text{vec}(X)=\begin{bmatrix} \b x_{1} \\ \vdots \\ \b x_N \end{bmatrix}.$
> Then
> $\begin{align} (B^{\top} \otimes A)\text{vec}(X)= & \begin{bmatrix} b_{11}A & \dots & b_{N1}A \\ \vdots & \ddots & \vdots \\ b_{1Q}A & \dots & b_{NQ}A \end{bmatrix} \begin{bmatrix} \b x_{1} \\ \vdots \\ \b x_{N} \end{bmatrix} \\ = & \begin{bmatrix} \sum_{i=1}^{N} b_{i1}A\b x_{i} \\ \vdots \\ \sum_{i=1}^{N} b_{iQ}A \b x_{i} \end{bmatrix} \\ = & \begin{bmatrix} A \sum_{i=1}^{N}b_{i 1}\b x_{i} \\ \vdots \\ A\sum_{i=1}^{N} b_{iQ}\b x_{i} \end{bmatrix} \\ = & \begin{bmatrix} A \sum_{i=1}^{N}\b b_{1_{i}}\b x_{i} \\ \vdots \\ A\sum_{i=1}^{N} \b b_{Q_{i}}\b x_{i} \end{bmatrix} \\ = & \begin{bmatrix} AX\b b_{1} \\ \vdots \\ AX\b b_{Q} \end{bmatrix} \\ = & \begin{bmatrix} (AXB)_{:,1} \\ \vdots \\ (AXB)_{:,Q} \end{bmatrix} \\ = & \text{vec}(AXB), \end{align}$
> where we note that the $i^{th}$ entry of the [[vector]] $\b b_{j}$, written $\b b_{j_{i}}$, (perhaps unintuitively) corresponds to entry $ij$ of the [[matrix]] $B$. This is a consequence of naming the columns of the [[matrix]] $B$ as $\b b_{1}, \dots, \b b_{Q}$ rather than the more verbose $B_{:, 1},\dots,B_{:,Q}$.

-----
#### References
> [!backlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM [[]]
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```

> [!frontlink]
> ```dataview
> TABLE rows.file.link as "Further Reading"
> FROM outgoing([[]])
> FLATTEN file.tags as Tag
> WHERE Tag = "#definition" OR Tag = "#theorem" OR Tag = "#MOC" OR Tag = "#proposition" OR Tag = "#axiom"
> GROUP BY Tag
> ```
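As a quick numerical sanity check of the proposition, the following NumPy sketch verifies $\text{vec}(AXB)=(B^{\top} \otimes A)\text{vec}(X)$ on random matrices. The dimensions, seed, and the helper name `vec` are arbitrary choices for illustration; note that `vec` must stack *columns*, i.e. column-major (`order="F"`) flattening.

```python
import numpy as np

rng = np.random.default_rng(0)
P, M, N, Q = 3, 4, 5, 2
A = rng.standard_normal((P, M))
X = rng.standard_normal((M, N))
B = rng.standard_normal((N, Q))

def vec(Y):
    # Column-stacking vectorize: concatenate the columns of Y top to bottom.
    return Y.flatten(order="F")

# LHS: the cheap route — two ordinary matrix products, then vectorize.
lhs = vec(A @ X @ B)

# RHS: the expensive route — form the (PQ x MN) Kronecker product explicitly.
rhs = np.kron(B.T, A) @ vec(X)

assert np.allclose(lhs, rhs)
```

Note that `np.kron(B.T, A)` has shape `(P*Q, M*N)` here, which makes concrete why the RHS route scales so much worse than evaluating `A @ X @ B` directly.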