The standard algorithm for multiplying two nΓn matrices
requires n3 multiplications, but, remarkably, in 1969, Strassen
showed it could be done in βn2.8 multiplications. This is
fascinating to me, but the algorithm itself is arcane and seemingly
comes out of the blue. In this little post I hope to demystify it by
showing how you could have discovered Strassenβs algorithm using a
computer-algebra system.
While ostensibly about 2Γ2 matrices, this technique can be
used to multiply arbitrary nΓn matricies as well. The basic
idea is to break the nΓn matricies into four n/2Γn/2
quadrant matrices and use Strassen's algorithm to compute the
nΓn product using seven matrix-multiplies over the
n/2Γn/2 quadrants. Letting t(n) denote the number of
multiplies required to multiply nΓn matricies using Strassen's
algorithm, this gives the following recurrence relation for the value
of t(n):
t(n)={17t(n/2)βifΒ n=1otherwiseβ
When n is not a power of two, we can pad the matrices with zeros to
make the recurrence terminating. Doing so gives
t(n)=7log2βn=nlog2β7βn2.8.
I find this remarkable: It wouldn't have even occurred to me the
typical n3 algorithm for matrix multiplication is
sub-optimal. Moreover, the expressions for Piβ seemingly comes from
thin air. How would one go about "discovering" Strassen's algorithm?
In 1970 Richard Brent gave an algebraic
method (p. 33)
which we will work with for the remainder of this essay. At a high
level, we will set up an algebra problem, the solutions to which
correspond to ways to multiply 2Γ2 matrices using 7
multiplications. This will have 64 equations with 84 variables, so to
avoid solving this by hand we will use a SMT
solver.
First, let C=AB and to simplify indexing let
aΛ=(a11β,a12β,a21β,a22β) be A in row-major form and
let bΛ and cΛ be likewise. Because aijβaklβ nor
bijβbklβ appear in the expression for C, the kth product
will be of the form
where Ξ±i(k)β,Ξ²i(k)ββ{β1,0,1} are variables
indicating the leading coefficent of aΛiβ/bΛjβ in the
kth product. For example, in Strassen's algorithm P3β=a11β(b12ββb22β). As aΛ1β appears in P3β,
Ξ±1(3)β=1. As βbΛ4β appears in P3β,
Ξ²4(3)β=β1.
cΛlβ will be a sum of Pkβs, so letting
Ξ³l(k)ββ{β1,0,1} denote Pkβ's coefficient in the expression
for cΛlβ,
cΛlβ=kββΞ³l(k)βPkβ.
To make this an algebra problem, we need another formula for
cΛlβ. We can use the standard expression for C=AB
an algebra problem, the solutions to which correspond to ways to
compute C=AB using k multiplies. Though, as hinted before, because
our matrices have 4 entries each, the i, j, and l indicies
range over 4 values, making 4β 4β 4=64 equations. Adding to
our complications, to recover Strassen's algorithm we'd set k=7,
meaning there are 3β 4β 7=84wl(k)β, Ξ±i(k)β,
and Ξ²j(k)β variables. We will need a computer to solve this
algebra problem.
I used Z3 for this, an SMT solver
written at Microsoft. The
code is
straightforward, you can read it if you like, but trying to solve the
system of equations above will fail at first blush. To narrow the
search space, I added two constraints to the equations:
Swapping Piβ with Pjβ yields a new, but uninteresting,
solution. To stop the solver from finding these duplicates, I
require the Pkβs to be lexographically ordered, viewing them as
strings of Ξ±i(k)β and Ξ²j(k)β variables.
Negating Pkβ and all its corresponding wl(k)β values, or
negating both of Pkβ's summations yields a new, but
uninteresting, solution. To stop the solver from finding these
duplicates, I require the first non-zero Ξ±i(k)β and
Ξ²j(k)β values in Pkβ's summations to be positive. This
picks a sign for Pkβ, as exactly one of (β1)(Ξ£β¦) and
(Ξ£β¦) has a positive first non-zero term.
With these symmetries removed, the solver finds several Strassen-like
schemes in a minute. Nice - we've rediscovered Strassen's algorithm
from first principles.
Thanks to Pete, Jacob, Groceries, and Elam for interesting
conversations, and Entrepreneurs First for for hosting the weekend
hackathon where I got to weave this
together.