Archive for September, 2007

Web site developers - 542 ANSWERS TO EXERCISES 3.3.3 ,l, 1) y=x-;

Sunday, September 30th, 2007

542 ANSWERS TO EXERCISES 3.3.3 ,l, 1) y=x-; Y++;-; / y=r-a! / l,l-a) yEz-I-a 2 2 M= g -cy 2 / / I,;-; > 1, c9 Fig. A-3. Permutation regions for a generator with potency 2; (Y = (u -l)c/m. 28. Fig. A-3 shows the various regions in the general case. The 213 region means UZ < UI < Us, if WI and UZ are chosen at random; the 321 region means that Us < U2 < Ul, etc. The probabilities for 123 and 321 are $ -cy/2 + ~ 12; the probabilities for all other cases are Q + (r/4 -cu2/4. To have all equal to &, we must have 1 -6cu + 6cu2 = 0. [This exercise establishes a theorem due to J. N. Franklin, Math. Comp. 17 (1963), 28-59, Theorem 13; other results of Franklin s paper are related to exercises 22 and 23.1 SECTION 3.3.4 1. VI is always m and ~1 = 2, for generators of maximum period. 2. Let V be the matrix whose rows are VI, . . . , vt. To minimize Y . Y, subject to the condition that Y # (0,. . . , 0) and VY is an integer column vector X, is equivalent to minimizing (V-lx). (V-lx), subject to the condition that X is a nonzero integer column vector. The columns of V- are UI, . . . , Ut.

3.3.3 ANSWERS TO EXERCISES 541 Fig. A-l. Permutation (Web hosting rating)

Sunday, September 30th, 2007

3.3.3 ANSWERS TO EXERCISES 541 Fig. A-l. Permutation regions for Fibonacci generator. Fig. A-2. Run-length regions for Fibonacci generator. 24. Proceed as in the previous exercise; the sum of the interval lengths is 5 c at- (a -1) = at&)(a+:-2). ol k>l The value for a truly random sequence would be e -1; and our value is e -1 + (e/2 -1)/a + 0(1/a ). [Note: The same result holds for an ascending run, since we have U,, > Un+I if and only if 1 -U,, < 1 -Un+I. This would lead us to suspect that runs in linear congruential sequences might be slightly longer than normal, so the run test should be applied to such generators.] 25. z must be in the interval [(k + a -Q/a, (k + p -0)/a) for some k, and also in the interval [a, p). Let ko = [aa: + 0 -p l, ICI = lab + 0 -/3 l. With due regard to boundary conditions, we get the probability (kl -ko)(P -a/)/a -I- max(0, P -(kl + a -0)/a) -max(O, cx-(k. + a -.9)/a). This is (/3 -cu)(p -CY ) + E, where 1~1 < 2(p -&)/a. 26. See Fig. A-l; the orderings U1 < Us < Uz and Uz < U3 < UI are impossible; the other four each have probability f . 27. U, = {F,-~UO $ F,Ul}. We need to have both F~--~UO + Fkul < 1 and FkUo + Fk+l VI > 1. The half-unit-square in which UO > UI is broken up as shown in Fig. A-2, with various values of k indicated. The probability for a run of length k is 3, if k = 1; l/Fk-l Fk+l -l/Fk Fk+2, if k > 1. The corresponding probabilities for a random sequence are 2k/(k + l)! -2(k + l)/(k + 2)!; the following table compares the first few values. k: 1234 5 Probability in Fibonacci case: f 4 & & & Probability in random case: B s2 ti3%i&

540 ANSWERS TO EXERCISES 3.3.3 Ss = –a(a (a (Web hosting resellers)

Saturday, September 29th, 2007

540 ANSWERS TO EXERCISES 3.3.3 Ss = –a(a (a -l), m, (a ) ~), if a a E 1 (modulo m); and finally ss=12 c O -1) > which is i + (1 -30 + 3e2)/6a + 0(1/a ) for large a. Note that 1 -38 + 319 2 a, so 8 can t be chosen to make this probability come out right.

Web server type - 3.3.3 ANSWERS TO EXERCISES 539 At the conclusion

Friday, September 28th, 2007

3.3.3 ANSWERS TO EXERCISES 539 At the conclusion of this algorithm, p will be equal to the original value /CO of Ic, so the desired answer will be A+ B/p. The final value of p will be h if s < 0, otherwise p will be ko -h . It would be possible to maintain B in the range 0 < B < ko , by making appropriate adjustments to A, thereby requiring only single-precision operations (with double-precision products and dividends) if ko is a single-precision number. 18. A moment s thought shows that the formula S(h, kc, 2) = _ (b/k1 lb - z)lkJ) (((W + d/k)) Co<,<, - is in fact valid for all z, not only when k 2 z. Writing [j/kJ -[(j -z)/kJ = $ + ((y)) -((f)) + fS,o -$6(q) and carrying out the sums yields S(h, k, c, z) = zd((c/d))/k + &a(!~, k, hz + c) -&o(h, k, c) + i((c/k)) -d(((hz + c)/k)), where d = gcd(h, k). [This formula allows us to express the probability that Xn+l < X, < (Y in terms of generalized Dedekind sums, given a.1 19. The desired probability is c OIz

538 ANSWERS TO EXERCISES 3.3.3 8. See Duke

Thursday, September 27th, 2007

538 ANSWERS TO EXERCISES 3.3.3 8. See Duke Math. J. 21 (1954), 391-397. 9. Begin with the handy identity ~O 0, return to D2. 1

3.3.3 ANSWERS TO EXERCISES 537 18. (a) The (Web hosting script)

Thursday, September 27th, 2007

3.3.3 ANSWERS TO EXERCISES 537 18. (a) The numerator is -(Ue -UI)*, the denominator is (Uc -UI)*. (b) The nu- merator in this case is -(iYE + UT + U , -UOUI -VI& -U2i3); the denominator is 2(Ui +…–U~UO). (c) The denominator always equals ~o13,c-sm 2rnx)/n, which converges for all z. (The representation in Eq. (24) may be considered a finite Fourier series, for the case when x is rational.) 4. d = 21 5. Note that we have Xn+l < X, with probability a + 6, where 161 < d/(2. lOlo) = l/(2. 5 ); hence every potency-l0 generator is respectable from the standpoint of Theorem P. 5. An intermediate result: x 4×1 ;~(a, m, c) + T -& -&. c mm = O

536 ANSWERS TO (Web page design) EXERCISES 3.3.2 by the previous

Wednesday, September 26th, 2007

536 ANSWERS TO EXERCISES 3.3.2 by the previous exercise and Eq. 1.2.9-28. The mean and variance are readily computed using Theorem 1.2.10A and exercise 3.4.1-17. We find that mean(G)=w+(&-l)+…+(d-:+I–I)=d(Hd–Hdw)=ii: var(G) = d (Hy) -H&2_1,) -d(Hd -H–W) = oz. The number of U s examined, as the search for a coupon set is repeated n times, therefore has the characteristics (min wn, ave pn, max 00, dev a&z). 11. 111219 8 5 31617 0141. 12. Algorithm R (Data for run test). Rl. [Initialize.] Set j + -1, and set COUNT[l] + COUNT[2] + . . . + COUNT[S] + 0. Also set U, + U,-i, for convenience in terminating the algorithm. R2. [Set r zero.] Set r + 0. R3. [Is U, < U,+I?] Increase r and j by 1. If U, < U3+1, repeat this step. R4. [Record the length.] If r 2 6, increase COUNT[G] by one, otherwise increase COUNT[r] by one. R5. [Done?] If j < n -1, return to step R2. 1 13. There are (p+ 4 + l)( t ) ways to have Ui-1 2 Ui < . . . < Uz+p-~ 3 U,++, < < Uz+p+q-1; subtract ( ~~- ; ) of th ese in which lJ-1 < Ui, and subtract ( pf;l+l) for those in which Uz+p--l < Ui+,; then add in 1 for the case that both U,-I < U, and Uz+p--l < U,+,, since this case has been subtracted out twice. (This is a special case of the inclusion-exclusion principle, which is explained further in Section 1.3.3.) 14. A run of length r occurs with probability l/r! -l/(r + l)!, assuming distinct U s. 15. This is always true of F(X) when F is continuous and S has distribution F; see Section 3.3.1C. 16. (a) Z i, = max(Z,(t-l),Z(j+l)(t-l)). If the Qt-i) are stored in memory, it is therefore a simple matter to transform this array into the set of Z,, with no auxiliary storage required. (b) With his improvement, each of the V s should indeed have the stated distribution, but the observations are no longer independent. In fact, when U, is a relatively large value, all of Z,,, Z(j+l)t, . , Z(j-t+l)t will be equal to Uj; so we almost have the effect of repeating the same data t times (and that would multiply V by t, cf. exercise 3.3.1-10). 17. (b) By Lagrange s identity, the difference is ~O has rank < 2, so its rows are linearly dependent. (A more elementary proof can be given, using the fact that UhVj -l.JiVi = 0 for 1 2 j < n implies the existence of constants cy, p such that cdJi + /3Vi = 0 for all j, provided that Vi and VA are not both zero; the latter case can be avoided by a suitable renumbering.)

3.3.2 ANSWERS TO EXERCISES 535 3. The probability

Tuesday, September 25th, 2007

3.3.2 ANSWERS TO EXERCISES 535 3. The probability that j values are examined, i.e., the probability that U,-, is the nth element of the sequence lying in the range a < Uj-1 < ,0, is easily seen to be nj-l _ 1 PV - Pd- 7 ( > by enumeration of the possible places in which the other n -1 occurrences can appear and by evaluating the probability of such a pattern. The generating function is G(z) = (PZN -(1 -Pk)) , which makes sense since the given distribution is the n-fold convolution of the same thing for n = 1. Hence the mean and variance are proportional to n; the number of U s to be examined is now easily found to have the characteristics (min n, ave n/p, max co, dev d-/p). A more detailed discussion of this probability distribution when n = 1 may be found in the answer to exercise 3.4.1-17; see also the considerably more general results of exercise 2.3.4.2-26. 4. The probability of a gap of length 2 r is the probability that r consecutive U s lie outside the given range, i.e., (1 -p) . The probability of a gap of length exactly r is the above value for length 2 r minus the value for length 2 (r + 1). 5. As N goes to infinity, so does n (with probability l), hence this test is just the same as the gap test described in the text except for the length of the very last gap, And the text s gap test certainly is asymptotic to the chi-square distribution stated, since the length of each gap is independent of the length of the others. [Notes: A quite complicated proof of this result by E. Bofinger and V. J. Bofinger appears in Annals Math. Stat. 32 (1961), 524-534. Their paper is noteworthy because it discusses several interesting variations of the gap test; they show, for example, that the quantity (X -(NP)P~) c 0<7it (NP)P~ does not approach a chi-square distribution, although others had suggested this statistic as a stronger test because Np is the expected value of n.] 7. 5, 3, 5, 6, 5, 5, 4. 8. See exercise 10, with w = d. 9. (Change d to w in steps Cl and C4.) We have d(d -1). . . (d -w + 1) p, = for w < r < t; d 10. As in exercise 3, we really need consider only the case n = 1. The generating function for the probability that a coupon set has length r is

Web hosting contract - 3.3.1 ANSWERS TO EXERCISES 531 27. Let J,

Tuesday, September 18th, 2007

3.3.1 ANSWERS TO EXERCISES 531 27. Let J, = LkX,,/m]. Lemma: After the (k* + 7k -2)/2 consecutive values 0 k+2 10 ktl 2 Ok . (k -1) O3 occur in the (Jn) sequence, Algorithm B will have V[i] < m/k for 0 2 j < k, and also Y < mfk. Proof. Let S, be the set of positions i such that V[i] < mfk just before X, is generated, and let & be the index such that V[$] t X,. If j, @ S, and J, = 0, then S,+r = S, U {j,}; if j, E S, and J, = 0, then S,+i = S, and j,+r = 0. After k + 2 successive O s, we must therefore have 0 E S, and $+I = 0. Then after 1 Ok+ we must have (0, l} c S, and j,+i = 0; after 2 Ok we must have (0, 1,2} c S, and j,+r = 0; and so on. Corollary: For X 2 2(k2+7k-2)k(k2-t7k-2)/2, either Algorithm B yields a period of length X or the sequence (X,) is poorly distributed. Proof. The probability that any given length-l pattern of J s does not occur in a random sequence of length X is less than (1 -k- )x/L < exp(-k-lX/J). For 1 = (k2 + i k -2)/2 this is at most e - ; hence the stated pattern should appear. After it does, the subsequent behavior of Algorithm B will be the same each time it reaches this part of the period. SECTION 3.3.1 1. There are k = 11 categories, so the line v = 10 should be used. 2.2 A lLi9&Zj_Qix2 49, 49, ib, &a, 491 49, 4gr 4gr 491 491 49 3. V = 7$$$, only very slightly higher than that obtained from the good dice! There are two reasons why we do not detect the weighting: (a) The new probabilities (cf. exercise 2) are not really very far from the old ones in Eq. (1). The sum of the two dice tends to smooth out the probabilities; if we considered instead each of the 36 possible pairs of values, and counted these, we would probably detect the difference quite rapidly (assuming that the two dice are distinguishable). (b) A far more important reason is that n is too small for a significant difference to be detected. If the same experiment is done for large enough n, the faulty dice will be discovered (see exercise 12). 4. p, = & for 2 5 s 5 12 and s # 7; pr = &. The value of V is 16$, which falls between the 75% and 95% entries in Table 1; so it is reasonable, in spite of the fact that not too many sevens actually turned up. 5. K&, = 1.15; KG = 0.215; these do not differ significantly from random behavior (being at about the 94% and 86% levels), but they are mighty close. (The data values in this exercise come from Appendix A, Table 1.) 6. The probability that X, 2 z is F(z), so we have the binomial distribution discussed in Section 1.2.10. Fn(s) = s/ n with probability (y) F(z) (l -F(s)) - ; the mean is F(s); the standard deviation is d/F(z)(l -F(z))/n. [Cf. Eq. 1.2.10-19. This suggests that a slightly better statistic would be to define see exercise 22. We can calculate the mean and standard deviation of Fn(z) -Fn(y), for z < y, and obtain the covariance of Fn(z),Fn(y). Using these facts, it can be

Abyss web server - 530 ANSWERS TO EXERCISES 3.2.2 (al, . ,

Tuesday, September 18th, 2007

530 ANSWERS TO EXERCISES 3.2.2 (al, . , a,, 0,. . . , 0) has appeared m times, and all m possible predecessors appear; this means (a,al,…, a,,0 ,…, 0) appears for 0 5 a < m. The proof is now complete by induction. The result also follows from Theorem 2.3.4.2D, using the directed graph of exercise 2.3.4.2-23; the set of arcs from (~1,. . . , z3, 0, , 0) to (~2, . . . , x3, 0, . . . , 0, 0), where Z~ # 0 and 1 2 j < k, forms an oriented subtree related neatly to Dewey decimal notation. 18. The third-most-significant bit of U,+l is completely determined by the first and third bits of U,, so only 32 of the 64 possible pairs (LSU,], [SV,+,J) occur. (Notes: If we had used, say, 11-bit numbers U, = (.XI~~X~~~+~ . . .Xlln+lO)z, the sequence would be satisfactory for many applications. If another constant appears in A having more one bits, the generalized spectral test might give some indication of its suitability. See exercise 3.3.4-24; we could examine it in dimensions t = 36,37,38, . . .) 21. [J. London Math. Sot. 21 (1946), 169-172.1 Any sequence of period length mk -1 with no k consecutive zeros leads to a sequence of period length mk by inserting a zero in the appropriate place, as in exercise 7; conversely, we can start with a sequence of period length mk and delete an appropriate zero from the period, to form a sequence of the other type. Let us call these (m, k) sequences of types A and B. The hypothesis assures us of the existence of (p, k) sequences of type A, for all primes p and all k 2 1; hence we have (p, k) sequences of type B for all such p and k. To get a (p , k) sequence of type B, let e = qr, where q is a power of p and r is not a multiple of p. Start with a (p, qrk) sequence of type A, namely X0, X1, X2, . ; then (using the p-ary number system) the grouped digits (X0.. . Xq–l)p, (X, . . . XZ~–~)~, . . . form a (pq, rk) sequence of type A, since q is relatively prime to pqrk -1 and the sequence therefore has a period length of pqrk -1. This leads to a (pq,rk) sequence (Y,) of type B; and (YoK . . Y7–l)pq, (Y,Y,+1 . . Y27–l)p4, . . . is a (pqr, k) sequence of type B by a similar argument, since r is relatively prime to pqk. To get an (m, k) sequence of type B for arbitrary m, we can combine (p , k) sequences for each of the prime power factors of m using the Chinese remainder theorem; but a simpler method is available. Let (Xn) be an (r, k) sequence of type B, and let (Y%) be an (s, k) sequence of type B, where r and s are relatively prime; then (sX, + Y,) is an (TS, k) sequence of type B. A simple, uniform construction that yields (2, k) sequences for arbitrary k has been discovered by A. Lempel [IEEE Bans. C-19 (1970), 1204-12091. 22. By the Chinese remainder theorem, we can find constants al, . . . , ak having desired residues mod each prime divisor of m. If m = plp2.. .pt, the period length will be lcm(pf -1, . . . , p: -1). In fact, we can achieve reasonably long periods for arbitrary m (not necessarily squarefree), as shown in exercise 11. 23. Period length at least 255 -1; possibly faster than (7), see exercise 3.2.1.1-5. Furthermore, R. Brent has pointed out that the calculations can be done exactly on floating point numbers in [ 0,l). 24. Run the sequence backwards. In other words, if 2, = Y-, we have 2, = (2%~m+k + zn-m) mod 2. 25. This actually would be slower and more complicated, unless it can be used to save subroutine-calling overhead in high-level languages. (See the FORTRAN program in Section 3.6.)