Hosting your own web site - 654 ANSWERS TO EXERCISES 4.6.4 It is easy

654 ANSWERS TO EXERCISES 4.6.4 It is easy to verify that at most n extra bits of precision are needed for the intermediate variables in this calculation; i.e., if ]si] 2 M for 0 2 i < 2n at the beginning of the algorithm, then all of the 5 and X variables will be bounded by 2 M throughout. Algorithm N performs A, addition-subtractions, D, halvings, and M, multiplica- tions, where A1 = 5, D1 = 0, MI = 3; for n > 1 we have A, = Ln/2]2 + + 2 n z +1A,+, + ([n/2] + 1)2 + + 2 , D, = 2 n z +1DLn,zJ + ([n/2] + 1)2 +l, and M, = 21n 2J+1 ML+J. The solutions are A, = 11. 2n- +r1gn1 -3.2 + 6. 2nS,, D, = 4. 2 - +bn1 -2.2 + 2. 2 S,, M, = 3 . 2n-1+r gnl; here S, satisfies the recurrence Sr = 0, S, = 2Sl,/zl+ [n/2], and it is not difficult to prove the inequalities +n[lgnl 2 S, 5 inlgn + n. Algorithm C does approximately the same amount of work as Algorithm N. It would be interesting to find a simpler way to carry out the additions and subtractions in step N3 (and the reverse operations in N5), perhaps analogous to Yates s method. The operation Xi + Xi + wkXl sketched above can be done with a procedure that generalizes the data-rotation algorithm of Fletcher and Silver in CACM 9 (1966), 326, but there might be a better way. 60. (a) In Cl, for example, we can group all terms having a common value of j and k into a single trilinear term; this gives y2 trilinear terms when (j, k) E E xE, plus y2 when (j, k) E E X0 and y2 when (j, k) E OX E. When 3 = k we can also include -xjj y3j z?j in Cl, free of charge. [In the case n = 10, the method multiplies 10 by 10 matrices with 710 noncommutative multiplications; this is fewer than required by any other known method, although Winograd s scheme (35) uses only 600 when commutativity is allowed.] (b) Here we simply let S be all the indices (i, j, k) of one problem, S the indices of the other. [When m = n = s = 10, the result is quite surprising: We can multiply two separate 10 X 10 matrices with 1300 noncommutative multiplications, while no scheme is known that would multiply each of them with 650.1 (c) Corresponding to the left-hand side of the stated identity we get the terms summed over (i, j, k) E S and 0 5 E, 5, n 5 1, so we get all the trilinear terms of the form Xz3 yjk ski except when [i/2] = [j/2] = [k/2]; h owever, these missing terms can all be included in Cl, Cz, or Es. The sum Cr turns out to include terms of the form ~i+~,j+< yi+v,j+, times some sum of z s, so it contributes 8v2 terms to the trilinear realization; and CZ, Cs are similar. To verify that the aBC terms cancel out, note that they are z(-l)Sfq xz+r,j+c Yk+c,i+c Zj+c,k+f, so 7 = 1 cancels with 7 = 0. [This technique leads to asymptotic improvements over Strassen s method whenever ijn +6n2 -4n < nlg7, namely when 36 5 n 5 184, and it was the first construction known to break the lg 7 barrier. Reference: SIAM J. Computing 9 (1980), 321-342.1 61. (a) Replace oij(U) by uail(~). (b) Let Q(U) = a+~~, etc., in a polynomial realization of length r = rankd(tijk). Then tijk = xCl+Y+O=d cl

Leave a Reply