The RSA System and Factoring 116 4.3 The RSA Cryptosystem ...crypto.cs.mcgill.ca/~crepeau/CS547/BOOK/Theory4.pdf · This and related systems are based on the difficulty of the subset

The RSA System and Factoring 116...............4.1 Introduction to Public-key 116...................4.2 More Number Theory 116.........................

4.2.1 The Euclidean Algorithm 116.....................FIGURE 4.1 119.............................................

4.2.2 The Chinese Remainder Theorem 119......4.2.3 Other Useful Facts 122...............................

FIGURE 4.2 124.............................................4.3 The RSA Cryptosystem 124.....................4.4 Implementing RSA 125.............................

FIGURE 4.3 126..................................................FIGURE 4.4 127..................................................

4.5 Probabilistic Primality Testing 129............FIGURE 4.5 130..................................................FIGURE 4.6 130..................................................FIGURE 4.7 133..................................................FIGURE 4.8 136..................................................FIGURE 4.9 137..................................................

4.6 Attacks On RSA 138.................................4.6.1 The Decryption Exponent 139....................

FIGURE 4.10 141...........................................4.6.2 Partial Information Concerning 144............

FIGURE 4.11 145...........................................4.7 The Rabin Cryptosystem 145...................

FIGURE 4.12 146................................................FIGURE 4.13 146................................................FIGURE 4.14 149................................................

4.8 Factoring Algorithms 150..........................FIGURE 4.15 151................................................4.8.1 The p - 1 Method 151.................................4.8.2 Dixon�s Algorithm and the 153...................4.8.3 Factoring Algorithms in Practice 155..........

4.9 Note-s and References 156......................Exercises 157.................................................

TABLE 4.1 158....................................................TABLE 4.2 159....................................................

FIGURE 4.16 160................................................

The RSA System and Factoring

4.1 Introduction to Public-key Cryptography

In the classical model of cryptography that we have been studying up until now, Alice and Bob secretly choose the key I<. I< then gives rise to an encryption rule eK and a decryption rule dK. In the cryptosystems we have seen so far, dK is either the same as eK, or easily derived from it (for example, DES decryption is identical to encryption, but the key schedule is reversed). Cryptosystems of this type are known as private-key systems, since exposure of eK renders the system insecure.

One drawback of a private-key system is that it requires the prior communication of the key I( between Alice and Bob, using a secure channel, before any ciphertext is transmitted. In practice, this may be very difficult to achieve. For example, suppose Alice and Bob live far away from each other and they decide that they want to communicate electronically, using e-mail. In a situation such as this, Alice and Bob may not have access to a reasonable secure channel.

The idea behind a public-key system is that it might be possible to find a cryptosystem where it is computationally infeasible to determine dK given eK. If so, then the encryption rule eK could by made public by publishing it in a directory (hence the term public-key system). The advantage of a public-key system is that Alice (or anyone else) can send an encrypted message to Bob (without the prior communication of a secret key) by using the public encryption rule eK. Bob will be the only person that can decrypt the ciphertext, using his secret decryption rule dK.

Consider the following analogy: Alice places an object in a metal box, and then locks it with a combination lock left there by Bob. Bob is the only person who can open the box since only he knows the combination.

The idea of a public-key system was due to Diffie and Hellman in 1976. The first realization of a public-key system came in 1977 by Rivest, Shamir, and Adleman, who invented the well-known RSA Cryptosystem which we study in this chapter. Since then, several public-key systems have been proposed, whose

I14

4.1. INTRODUCTION TO PUBLIC-KEY CRYPTOGRAPHY 115

security rests on different computational problems. Of these, the most important are the following:

RSA

The security of RSA is based on the difficulty of factoring large integers. This system is described in Section 4.3.

Merkle-Hellman Knapsack

This and related systems are based on the difficulty of the subset sum problem (which is NP-complete’); however, all of the various knapsack systems have been shown to be insecure (with the exception of the Chor- Rivest Cryptosystem mentioned below). See Chapter 5 for a discussion of this cryptosystem.

McEliece

The McEliece Cryptosystem is based on algebraic coding theory and is still regarded as being secure. It is based on the problem of decoding a linear code (which is also NP-complete). (See Chapter 5.)

ElGamal

The ElGamal Cryptosystem is based on the difficulty of the discrete loga- rithm problem for finite fields. (See Chapter 5.)

Chor-Rivest

This is also referred to as a “knapsack” type system, but it is still regarded as being secure.

Elliptic Curve

The Elliptic Curve Cryptosystems are modifications of other systems (such as the ElGamal Cryptosystem, for example) that work in the domain of elliptic curves rather than finite fields. The Elliptic Curve Cryptosystems appear to remain secure for smaller keys than other public-key cryptosys- terns. (See Chapter 5.)

One very important observation is that a public-key cryptosystem can never provide unconditional security. This is because an opponent, on observing a ciphertext y, can encrypt each possible plaintext in turn using the public encryption rule eK until he finds the unique 2: such that y = eK (z). This a! is the decryption of y. Consequently, we study the computational security of public-key systems.

It is helpful conceptually to think of a public-key system in terms of an ab- straction called a trapdoor one-way function. We informally define this notion now.

Bob’s public encryption function, eK, should be easy to compute. We have just noted that computing the inverse function (i.e., decrypting) should be hard (for

‘The NP-complete problems are a large class of problems for which no polynomial-time algorithms are known.

116 CHAPTER 4. THE RSA SYSTEM AND FACTORING

anyone other than Bob). This property of being easy to compute but hard to invert is often called the one-way property . Thus, we desire that eK be an (injective) one-way function.

One-way functions play a central role in cryptography; they are important for constructing public-key cryptosystems and in various other contexts. Unfortu- nately, although there are many functions that are believed to be one-way, there currently do not exist functions that can be proved to be one-way.

Here is an example of a function which is believed to be one-way. Suppose n is the the product of two large primes p and q, and let b be a positive integer. Then define f : Z,, + Z,, to be

f(x) = xb mod n.

(For a suitable choice of b and n, this is in fact the RSA encryption function; we will have much more to say about it later.)

If we are to construct a public-key cryptosystem, then it is not sufficient to find a one-way function. We do not want eK to be a one-way function from Bob’s point of view, since he wants to be able to decrypt messages that he receives in an efficient way. Thus, it is necessary that Bob possesses a trapdoor, which consists of secret information that permits easy inversion of eK. That is, Bob can decrypt efficiently because he has some extra secret knowledge about K. So, we say that a function is a trapdoorone-way function if it is a one-way function, but it becomes easy to invert with the knowledge of a certain trapdoor.

We will see in Section 4.3 how to find a trapdoor for the function f defined above. This will lead to the RSA Cryptosystem.

4.2 More Number Theory

Before describing how RSA works, we need to discuss some more facts concerning modular arithmetic and number theory. Two fundamental results that we require are the Euclidean algorithm and the Chinese remainder theorem.

4.2.1 The Euclidean Algorithm

We already observed in Chapter 1 that Z, is a ring for any positive integer n. We also proved there that b E Z,, has a multiplicative inverse if and only if gcd(b, n) = 1, and that the number of positive integers less than n and relatively prime ton is d(n).

The set of residues modulo n that are relatively prime to n is denoted Z,* . It is not hard to see that Z, * forms an abelian group under multiplication. We already have stated that multiplication modulo n is associative and commutative, and that 1 is the multiplicative identity. Any element in Z,,* will have a multiplicative

4.2. MORE NUMBER THEORY 117

inverse (which is also in &*). Finally, Zn * is closed under multiplication since xy is relatively prime to n whenever x and y are relatively prime to n (prove this!).

At this point, we know that any b E Zn* has a multiplicative inverse, b- t, but we do not yet have an efficient algorithm to compute b-‘. Such an algorithm exists; it is called the extended Euclidean algorithm.

First, we describe the Euclidean algorithm, in its basic form, which is used to compute the greatest common divisor of two positive integers, say ra and rt, where ra > rt. The Euclidean algorithm consists of performing the following sequence of divisions:

r0 = qtrt + 7-2, 0 < r2 < 73 r1 = q2r2+r3, 0 < rg < r2

r,-2 = qm-lrm-l + r,, 0 < r, < r,-I rm-t = qmrm.

Then it is not hard to show that

gcd(re, r1) = gcd(rr , r2) = . . . = gcd(r,- 1, rm) = r,.

Hence, it follows that gcd(re, rl) = r,,,. Since the Euclidean algorithm computes greatest common divisors, it can be

used to determine if a positive integer b < n has a multiplicative inverse modulo n, by starting with ro = n and r1 = b. However, it does not compute the value of the multiplicative inverse (if it exists).

Now, suppose we define a sequence of numbers to, tl , . . . , t, according to the following recurrence (where the qj’s are defined as above):

to = 0 t1 = 1 tj = tj-2 - qj-ltj-1 mod re, if j 2 2.

Then we have the following useful result.

THEOREM 4.1 For 0 5 j 5 m, we have that rj f tjrl (mod ro), where the qj ‘S and rj ‘S are deJined as in the Euclidean algorithm, and the tj ‘s are defined in the above recurrence.

PROOF The proof is by induction on j. The assertion is trivially true for j = 0 and j = 1. Assume the assertion is true for j = i - 1 and i - 2, where i 2 2; we will prove the assertion is true for j = i. By induction, we have that

T;-2 E ti-zrl (mod TO)

and

ri-1 E ti-lrl (mod ra).


Now, we compute:

ri = rj-2 - pi-lri-1

E ti-zrl - qi-lti-lrl (mod t-0)

q (ti-2 - qi-lti-I)rI (mod ra)

E tirl (mod r-0).

Hence, the result is true by induction.

The next corollary is an immediate consequence.

COROLLARY 4.2 Suppose gcd(re, r-1) = 1. Then t, = r1-l mod TO.

Now, the sequence of numbers to, t 1, . . . t, can be calculated in the Euclidean algorithm at the same time as the qj’s and the rj’s. In Figure 4.1, we present the extended Euclidean algorithm to compute the inverse of b modulo n, if it exists. In this version of the algorithm, we do not use an array to keep track of the qj’s, rj’s and tj’s, since it suffices to remember only the “last” two terms in each of these sequences at any point in the algorithm.

In step 10 of the algorithm, we have written the expression for temp in such a way that the reduction modulo n is done with a positive argument. (We mentioned earlier that modular reductions of negative numbers yield negative results in many computer languages; of course, we want to end up with a positive result here.) We also mention that at step 12, it is always the case that tb - r (mod n) (this is the result proved in Theorem 4.1).

Here is a small example to illustrate:

Example 4.1 Suppose we wish to compute 28-i mod 75. The Extended Euclidean algorithm proceeds as follows:

75 = 2 x 28 + 19 step 6 73 x 28 mod 75 = 19 step 12 28=1x19+9 step 16 3 x 28 mod 75 = 9 step 12 19=2x9+1 step 16 67 x 28 mod 75 = 1 step 12 9=9x 1 steo 16

Hence, 28-l mod 75 = 67. 0


FIGURE 4.1 Extended Euclidean algorithm

1. no = n P d. b. = b

3. to = 0

t. t=1

5. r = no -qxbo 7. while r > 0 do

8. temp=to-qxt

a. if temp >_ 0 then temp = temp mod n

10. if temp < 0 then temp = n - ((-temp) mod n)

11. to = t

12. t = temp

13. no = bo

14. bo = r

15. Q = l$y 16.

17.

r=no-qxbo

if bo # 1 then

b has no inverse modulo n

else

b-‘=tmodn

4.2.2 The Chinese Remainder Theorem

The Chinese remainder theorem is really a method of solving certain systems of congruences. Suppose ml, . . . , m, are pairwise relatively prime (that is, gcd(mi, mj) = 1 if i # j). Suppose at,. . . , a, are integers, and consider the following system of congruences:

x~al (modml)

x G a2 (mod mz)

x E a, (mod m,).


The Chinese remainder theorem asserts that this system has a unique solution moduloM = mt x m2 x . . . x m,.. We will prove this result in this section, and also describe an efficient algorithm for solving systems of congruences of this type.

It is convenient to study the function A : ZM + Z!,, x . . . x &,, which we define as follows:

Example 4.2 Suppose r = 2, ml = 5 and rn2 = 3, so M = 15. Then the function T has the following values:

40) = (O,O) r(l) = (171) n(2) = (2,4 n(3) = (3,O) 44) = (471) r(5) = (0,2) ~(6) = (l,O) r(7) = (2,l) 48) = (3,~) r(9) = (470) n(l0) = (0,l) ?r(ll) = (1,2)

7r(12) = (2,O) 7r(13) = (3,l) n(14) = (4,2).

0

Proving the Chinese remainder theorem amounts to proving that this function ?r we have defined is a bijection. In Example 4.2 this is easily seen to be the case. In fact, we will be able to give an explicit general formula for the inverse function 7r-‘.

For 1 5 i 5 r, define

Mj=E. mi

Then it is not difficult to see that

gcd(Mi, mi) = 1

for 1 5 i 5 r. Next, for 1 5 i 5 r, define

yi = Mi-’ mod mi.

(This inverse exists since gcd(Mi , mi) = 1, and it can be found using the Euclidean algorithm .) Note that

Miyi E 1 (mod mi)

for 1 5 i 5 r.


Now, define a function p : Z,, x . . . x L!&, -+ ZM as follows:

p(al, . . . , a,.) = c aiMiyi mod M. i=l

We will show that the function p = r-t , i.e., it provides an explicit formula for solving the original system of congruences.

DenoteX = ~(a,,..., a,.), and let 1 5 j 5 r. Consider a term a;M;yi in the above summation, reduced modulo mj : If i = j, then

UiMiyi E ai (mod mi)

since

Miyj s 1 (mod mi).

On the other hand, if i # j, then

UiMiyi E 0 (mod mj)

since mj 1 Mi in this case. Thus, we have that

X E ka;Mjyi (mod mj) i=l

E aj (mod mj).

Since this is true for all j, 1 5 j 5 r, X is a solution to the system of congruences. At this point, we need to show that the solution X is unique modulo M. But

this can be done by simple counting. The function ?r is a function from a domain of cardinality M to a range of cardinality M. We have just proved that rr is a surjective (i.e., onto) function. Hence, T must also be injective (i.e., one-to-one), since the domain and range have the same cardinality. It follows that ?r is a bijection and n-’ = p. Note also that rr-t is a linear function of its arguments Ul,...,U,.

Here is a bigger example to illustrate.

Example 4.3 Suppose r = 3, ml = 7, m2 = 11 and ml = 13. Then M = 1001. We compute MI = 143, M2 = 91 and M3 = 77, and then yt = 5, y2 = 4 and ys = 12. Then the function IT’ : Z& x Ztt x Zi3 + Ztmt is the following:

A-‘(al, ~2, ~3) = 715~1 + 364~2 + 924~3 mod 1001.

For example, if 2: = 5 (mod 7), 2 E 3 (mod 11) and c z 10 (mod 13), then this formula tells us that

2 = 715 x 5 + 364 x 3 + 924 x 10 mod 1001


= 13907 mod 1001

= 894 mod 1001.

This can be verified by reducing 894 modulo 7, 11 and 13. i]

For future reference, we record the results of this section as a theorem.

THEOREM 4.3 (Chinese Remainder Theorem) Suppose ml, . . . , m, are pairwise relatively pn’me positive integers, and let

a, be integers. Then, the system of r congruences x c ai (mod mi) (4’ ‘<‘i ‘5 r) has a unique solution module M = ml x . . . x m,, which is given by

r x= c aiMiyi mod M,

i=l

where Mi = M/mi and yi = Mi-’ mod mi, for 1 2 i 2 r.

4.2.3 Other Useful Facts

We next mention another result from elementary group theory, called Lagrange’s Theorem, that will be relevant in our treatment of the RSA Cryptosystem. For a (finite) multiplicative group G, define the order of an element g E G to be the smallest positive integer m such that g”’ = 1. The following result is fairly simple, but we will not prove it here.

THEOREM 4.4 (Lagrange) Suppose G is a multiplicative group of order n, and g E G. Then the order of g divides n.

For our purposes, the following corollaries are essential.

COROLLARY 4.5 Ifb E Z&z*, then b+(“) 3 1 (mod n).

PROOF ;Z, * is a multiplicative group of order 4(n).

COROLLARY 4.6 (Fermat) Suppose p is prime and b E Zr. Then bp E b (mod p).

PROOF If p is prime, then 4(p) = p - 1. So, for b $ 0 (mod p), the result follows from Corollary 4.5. For b E 0 (mod p), the result is also true since 0” E 0 (mod p). 1


At this point, we know that if p is prime, then Z$,* is a group of order p - 1, and any element in Z$,* has order dividing p - 1. However, if p is prime, then the group Zr* is in fact cyclic: there exists an element cr E ZP* having order equal to p - 1. We will not prove this very important fact, but we do record it for future reference:

THEOREM 4.7 If p is prime, then &,* is a cyclic group.

An element (Y having order p- 1 is called aprimitive element modulo p. Observe that (Y is a primitive element if and only if

{d : 0 5 i 5 p - 2) = ;Z,*.

Now, suppose p is prime and (Y is a primitive element modulo p. Any element p E i&* can be written as p = cryi, where 0 5 i 5 p - 2, in a unique way. It is not difficult to prove that the order of /3 = (~j is

P-1 gcd(p- 1,i)’

Thus p is itself a primitive element if and only if gcd(p - 1, i) = 1. It follows that the number of primitive elements modulo p is 4(p - 1).

Example 4.4 Suppose p = 13. By computing successive powers of 2, we can verify that 2 is a primitive element modulo 13:

2’mod13= 1

2’ mod 13 = 2

22 mod 13 = 4

23 mod 13 = 8

24 mod 13 = 3

2’ mod 13 = 6

26 mod 13 = 12

27mod13= 11

2’ mod 13 = 9

29 mod 13 = 5

2” mod 13 = 10

2” mod 13 = 7.


FIGURE 4.2 RSA Cryptosystem

Let n = pq, where p and q are primes. Let P = C = Z,,, and define

K = {(n,p, q,u, b) : n = pq,p, q prime, ub s 1 (mod 4(n))}.

For Ii = (n, p, q, a, b), define

eK(x) = xb mod n

and

dK(y) = ya mod n

(x, y E &). The values n and b are public, and the values p, q, a are secret.

The element 2i is primitive if and only if gcd(i, 12) = 1; i.e., if and only if i = 1,5,7 or 11. Hence, the primitive elements modulo 13 are 2,6,7 and 11. 0

4.3 The RSA Cryptosystem

We can now describe the RSA Cryptosystem. This cryptosystem uses computations in &, where n is the product of two distinct odd primes p and q. For such n, note that d(n) = (p - l)(q - 1).

The formal description of the cryptosystem is given in Figure 4.2. Let’s verify that encryption and decryption are inverse operations. Since

we have that

ub f 1 (mod d(n)),

ub = t+(n) + 1

for some integer t 2 1. Suppose that x E Z,’ ; then we have

(xb)” E x@(~)+’ (mod n)

E (x~(~))~x (mod n)

3 ltx (mod n)

E x (mod n),

4.4. IMPLEMENTING RSA 125

as desired. We leave it as an exercise for the reader to show that (xb)” - x (mod n) if x E a,\&*.

Here is a small (insecure) example of the RSA Cryptosystem.

Example 4.5 Suppose Bob chooses p = 101 and q = 113. Then n = 11413 and 4(n) = 100 x 112 = 11200. Since 11200 = 26527, an integer b can be used as an encryption exponent if and only if b is not divisible by 2,5 or 7. (In practice, however, Bob will not factor 4(n). He will verify that gcd(4(n), b) = 1 using the Euclidean algorithm.) Suppose Bob chooses b = 3533. Then the Extended Euclidean algorithm will yield

b-’ = 6597 mod 11200.

Hence, Bob’s secret decryption exponent is a = 6597. Bob publishes n = 11413 and b = 3533 in a directory. Now, suppose Alice

wants to send the plaintext 9726 to Bob. She will compute

97263533 mod 11413 = 5761

and send the ciphertext 5761 over the channel. When Bob receives the ciphertext 5761, he uses his secret decryption exponent to compute

57616597 mod 11413 = 9726.

(At this point, the encryption and decryption operations might appear to be very complicated, but we will discuss efficient algorithms for these operations in the next section.) 0

The security of RSA is based on the hope that the encryption function eK (x) = xb mod n is one-way, so it will be computationally infeasible for an opponent to decrypt a ciphertext. The trapdoor that allows Bob to decrypt is the knowledge of the factorization n = pq. Since Bob knows this factorization, he can compute $(n) = (p - l)(q - 1) and then compute the decryption exponent a using the Extended Euclidean algorithm. We will say more about the security of RSA later on.

4.4 Implementing RSA

There are many aspects of the RSA Cryptosystem to discuss, including the details of setting up the cryptosystem, the efficiency of encrypting and decrypting, and security issues. In order to set up the system, Bob follows the steps indicated in


FIGURE 4.3 Setting up RSA

1. Bob generates two large primes, p and q

2. Bob computes n = pq and 4(n) = (p - l)(q - 1)

3. Bob chooses a random b (0 < b < 4(n)) such that gcd(b, 4(n)) = 1

4. Bob computes a = b-’ mod 4(n) using the Euclidean algorithm

5. Bob publishes n and b in a directory as his public key.

Figure 4.3. How Bob carries out these steps will be discussed later in this chapter.

One obvious attack on the cryptosystem is for a cryptanalyst to attempt to factor n. If this can be done, it is a simple manner to compute 4(n) = (p - l)(q - 1) and then compute the decryption exponent a from b exactly as Bob did. (It has been conjectured that breaking RSA is polynomially equivalent2 to factoring n, but this remains unproved.)

Hence, if the RSA Cryptosystem is to be secure, it is certainly necessary that n = pq must be large enough that factoring it will be computationally infeasible. Current factoring algorithms are able to factor numbers having up to 130 decimal digits (for more information on factoring, see Section 4.8). Hence, it is recommended that, to be on the safe side, one should choose p and q to each be primes having about 100 digits; then n will have 200 digits. Several hardware implementations of RSA use a modulus which is 512 bits in length. However, a 512-bit modulus corresponds to about 154 decimal digits (since the number of bits in the binary representation of an integer is log, 10 times the number of decimal digits), and hence it does not offer good long-term security.

Leaving aside for the moment the question of how to find 100 digit primes, let us look now at the arithmetic operations of encryption and decryption. An encryption (or decryption) involves performing one exponentiation modulo n. Since n is very large, we must use multiprecision arithmetic to perform computations in ZZ&, and the time required will depend on the number of bits in the binary representation ofn.

Suppose n has k bits in its binary representation; i.e., k = [log, n] + 1. Using standard “grade-school” arithmetic techniques, it is not difficult to see that an addition of two k-bit integers can be done in time O(k), and a multiplication can be done in time O(k2). Also, a reduction modulo n of an integer having at most

*Two problems are. said to be. polynomially equivalent if the existence of a polynomial-time algorithm for either problem implies the existence of a polynomial-time algorithm for the other problem.

4.4. IMPLEMENTING RSA 127

FIGURE 4.4 The square-and-multiply algorithm to compute xb mod n

1. %=I

2. fori = !- 1 downtoOdo

3. z = .z2 mod n

4. ifbi=lthenz=zxxmodn

2k bits can be performed in time 0( k2) (this amounts to doing long division and retaining the remainder). Now, suppose that x, y E Z,, (where we are assuming that 0 5 x, y 5 n - 1). Then xy mod n can be computed by first calculating the product xy (which is a 2k-bit integer), and then reducing it modulo n. These two steps can be peformed in time O(k2). We call this computation modular multiplication.

We now consider modular exponentiation, i.e., computation of a function of the form xc mod n. As noted above, both the encryption and the decryption operations in RSA are modular exponentiations. Computation of xc mod n can be done using c - 1 modular multiplications; however, this is very inefficient if c is large. Note that c might be as big as 4(n) - 1, which is exponentially large compared to k.

The well-known “square-and-multiply” approach reduces the number of modular multiplications required to compute xc mod n to at most 2& where !Y is the number of bits in the binary representation of c. Since e 5 k, it follows that 2’ mod n can be computed in time O(k3). Hence, RSA encryption and decryption can both be done in polynomial time (as a function of k, which is the number of bits in one plaintext (or ciphertext) character).

Square-and-multiply assumes that the exponent, b say, is represented in binary notation, say

t-1 b = Cbi2’,

i=O

where bi = 0 or 1, 0 5 i < C - 1. The algorithm to compute z = x6 mod n is presented in Figure 4.4. It is easy to count the number of modular multiplications performed by the square-and-multiply algorithm. There are always e squarings performed (step 3). The number of modular multiplcations in step 4 is equal to the number of l’s in the binary representation of b, which is an integer between 0 and -!?. Thus, the total number of modular multiplications is at least e and at most 2c

We will illustrate the use of square-and-multiply by returning to Example 4.5.


Example 4.5 (Cont.) Recall that n = 11413, and the public encryption exponent is b = 3533. Alice encrypts the plaintext 9726 by computing 97263533 mod 11413, using the square- and multiply-algorithm, as follows:

i lb;1 z

11 1 l2 x 9726 = 9726 10 1 97262 x 9726 = 2659 9 0 26592 = 5634 8 1 56342 x 9726 = 9167 7 1 91672 x 9726 = 4958 6 1 495g2 x 9726 = 7783 5 0 77832 = 6298 4 0 62982 = 4629 3 1 46292 x 9726 = 10185 2 1 1O1852 x 9726 = 105 1 0 1O52 = 11025 0 1 110252 x 9726 = 5761

Hence, as stated earlier, the ciphertext is 5761. 0

It should be emphasized that the most efficient current hardware implementations of RSA achieve encryption rates of about 600 Kbits per second (using a 512 bit modulus n), as compared to 1 Gbit per second for DES. Stated another way, RSA is roughly 1500 times slower than DES.

At this point we have discussed the encryption and decryption operations for RSA. In terms of setting up RSA, the generation of the primes p and q (Step 1) will be discussed in the next section. Step 2 is straightforward and can be done in time 0( (log n)2). Steps 3 and 4 involve the Euclidean algorithm, so let’s briefly consider its complexity.

Suppose we compute the greatest common divisor of r-0 and r-1, where PO > ~1. In each iteration of the algorithm, we compute a quotient and remainder, which can be done in time O((log ~0)~). If we can obtain an upper bound on the number of iterations, then we will have a bound on the complexity of the algorithm. There is a well-known result, known as Lame’s Theorem, that provides such a bound. It asserts that ifs is the number of iterations, then fs+2 5 TO, where fi denotes the ith Fibonacci number. Since

1+Js i fi= -y-- ,

( ) it follows that s is O(logr0).

This shows that the running time of the Euclidean algorithm is O((logn)3). (Actually, a more careful analysis can be used to show that the running time is, in fact, O((logn)2).)

4.5. PROBABILISTIC PRIMALITY TESTING 129

4.5 Probabilistic Primality Testing

In setting up the RSA Cryptosystem, it is necessary to generate large (e.g., 80 digit) “random primes.” In practice, the way this is done is to generate large random numbers, and then test them for primality using a probabilistic polynomial-time Monte Carlo algorithm such as the Solovay-Strassen or Miller-Rabin algorithm , both of which we will present in this section. These algorithms are fast (i.e., an integer n can be tested in time that is polynomial in log, n, the number of bits in the binary representation of n), but there is a possibility that the algorithm may claim that n is prime when it is not. However, by running the algorithm enough times, the error probability can be reduced below any desired threshold. (We will discuss this in more detail a bit later.)

The other pertinent question is how many random integers (of a specified size) will need to be tested until we find one that is prime. A famous result in number theory, called the Prime number theorem, states that the number of primes not exceeding N is approximately N/In N. Hence, if p is chosen at random, the probability that it is prime is about l/lnp. For a 512 bit modulus, we have l/lnp x 1/ 177. That is, on average, of 177 random integers p of the appropriate size, one will be prime (of course, if we restrict our attention to odd integers, the probability doubles, to about 2/177). So it is indeed practical to generate sufficiently large random numbers that are “probably prime,” and hence it is practical to set up the RSA Cryptosystem. We proceed to describe how this is done.

A decision problem is a problem in which a question is to be answered “yes” or “no.” A probabilistic algorithm is any algorithm that uses random numbers (in contrast, an algorithm that does not use random numbers is called a deterministic algorithm). The following definitions pertain to probabilistic algorithms for decision problems.

DEFINITION4.1 A yes-biased Monte Carlo algorithm is a probabilistic algorithm for a decision problem in which a “‘yes” answer is (always) correct, but a “no” answer may be incorrect. A no-biased Monte Carlo algorithm is defined in the obvious way. We say that a yes-biased Monte Carlo algorithm has errorprob- ability equal to e if for any instance in which the answer is “yes,” the algorithm will give the (incorrect) answer “no” with probability at most e. (This probability is computed over all possible random choices made by the algorithm when it is run with a given input.)

The decision problem called Composites is described in Figure 4.5. Note that an algorithm for a decision problem only has to answer “yes” or

“no.” In particular, in the case of the problem Composites, we do not require the algorithm to find a factorization in the case that n is composite.

We will first describe the Solovay-Strassen algorithm, which is a yes-biased


FIGURE 4.5 Composites

Problem Instance A positive integer n > 2.

Question Is n composite?

FIGURE 4.6 Quadratic Residues

Problem Instance An odd prime p, and an integer x such that 0 5 x:Ip-1.

Question Is x a quadratic residue modulo p?

Monte Carlo algorithm for Composites with error probability l/2. Hence, if the algorithm answers “yes,” then n is composite; conversely, if n is composite, then the algorithm answers “yes” with probability at least l/2.

Although the Miller-Rabin algorithm (which we will discuss later) is faster than Solovay-Strassen, we begin by looking at the Solovay-Strassen algorithm because it is easier to understand conceptually and because it involves some number- theoretic concepts that will be useful in later chapters of the book. We begin by developing some further background from number theory before describing the algorithm.

DEFINITION 4.2 Suppose p is an odd prime and x is an integel; 1 5 x 5 p - 1. x is dejined to be a quadratic residue modulo p if the congruence y2 q x (mod p) has a solution y E i&,. x is dejined to be a quadratic non-residue modulo p if x $0 (mod p) and x is not a quadratic residue modulo p.

Example 4.6 The quadratic residues modulo 11 are 1,3,4,5 and 9. Note that (j~l)~ = 1, (*4)2 =03, (zl~2)~ = 4, (~k4)~ = 5 and (&3)2 = 9 (where all arithmetic is in &I).

The decision problem Quadratic Residues is defined in Figure 4.6 in the obvious way:

We prove a result, known as Euler’s criterion, that will give rise to a polynomial- time deterministic algorithm for Quadratic Residues.


THEOREM 4.8 (Euler’s Criterion) Let p be prime. Then x is a quadratic residue modulo p if and only if

x(~-‘)/~ E 1 (mod p).

PROOF First, suppose x - y2 (mod p). Recall from Corollary 4.6 that if p is prime, then xp-i E 1 (mod p) for any x $0 (mod p). Thus we have

x(P-1)/2 E (y2)(P-‘)/2 (mod p)

E Yp-’ (mod p)

= 1 (modp).

Conversely, suppose x(P-I)/~ E 1 (mod p). Let b be a primitive element modulo p. Then x = b’ (mod p) for some i. Then we have

x(P-1)/2 E (b’)(P-‘)/2 (mod p)

E bi(P-‘)/2 (mod p).

Since b has order p - 1, it must be the case that p - 1 divides i(p - 1)/2. Hence, i is even, and then the square roots of x are fb’i2. 1

Theorem 4.8 yields a polynomial-time algorithm for Quadratic Residues, by using the “square-and-multiply” technique for exponentiation modulo p. The complexity of the algorithm will be O((logp)3).

We now need to give some further definitions from number theory.

DEFINITION 4.3 Sup ose p is an odd prime. For any integer a 2 0, we define the Legendre symbol % as follows:

Q,

0 ifaEO(modp) 1 if a is a quadratic residue modulo p

- 1 if a is a quadratic non-residue modulo p.

We have already seen that a(P-1)/2 E 1 (mod p) if and only if a is a quadratic residue modulo p. If a is a multiple of p, then it is clear that a(P-1)/2 E 0 (mod p). Finally, if a is a quadratic non-residue modulo p, then a(P-1)/2 E -1 (mod p) since ap-’ E 1 (mod p), Hence, we have the following result, which provides an efficient algorithm to evaluate Legendre symbols:

THEOREM 4.9 Suppose p is prime. Then

a 0 i q a(P-‘)/2 (mod p).


Next, we define a generalization of the Legendre symbol.

DEFINITION4.4 Suppose n is an odd positive integet; and the prime power factorization of n is pie’ . . .pkek. Let a > 0 be an integer The Jacobi symbol (a) is defined to be

Example 4.7 Consider the Jacobi symbol (H). Th e p rime power factorization of 9975 is 9975 = 3 x 5* x 7 x 19. Thus we have

(z2) = (yt) (y!)*(!y) (!g!)

= ($ (g)‘(t) (+I)

= (-1)(-1)2(-1)(-l)

= -1.

0

Suppose n > 1 is odd. If n is prime then (i) E a(“-‘)/* (mod n) for any a. On the other hand, if n is composite, it may or may not be the case that (Ll) G aCnwl)/* (mod n). If this equation holds, then a is called an Euferpseudo- prime to the base n. For example, 10 is an Euler pseudo-prime to the base 91, since

= -1 = 104’ mod91.

However, it can be shown that, for any odd composite n, at most half of the integers a such that 1 5 a 5 n - 1 are Euler pseudo-primes to base n (see the exercises). This fact shows that the Solovay-Strassen primality test, which we present in Figure 4.7, is a yes-biased Monte Carlo algorithm with error probability at most l/2. At this point it is not clear that the algorithm is a polynomial-timealgorithm. We already know how to evaluate a(“-‘)/* mod n in time O((logn)3), but how do we compute Jacobi symbols efficiently? It might appear to be necessary to first factor n, since the Jacobi symbol (t) is defined in terms of the factorization of n. But, if we could factor n, we would already know if it is prime, so this approach ends up in a vicious circle.

Fortunately, we can evaluate a Jacobi symbol without factoring n by using some results from number theory, the most important of which is a generalization of


FIGURE 4.7 The Solovay-Strassen primality test for an odd integer n

1. choose a random integer a, 1 5 a 5 n - 1

2. if (b) = a(n-‘)/2 (mod n) then

answer “n is prime”

else

answer “n is composite”

the law of quadratic reciprocity (property 4 below). We now enumerate these properties without proof:

I. If n is an odd integer and mt G rn2 (mod n), then

(T) = (?).

2. If n is an odd integer, then

1 ifnsfl (mod8) -1 ifnEf3(mod8).

3. If n is an odd integer then

(y2) = (X) (?).

In particular, if m = 2kt, where t is odd, then

(3 = (:)” (t).

4. Suppose m and n are odd integers. Then

Example 4.8 As an illustration of the application of these properties, we evaluate the Jacobi


symbol (z) as follows:

by property 4

by property 1

= - (k)*(g) bypropefiy3 117

= - 0 7411

7411

= - (3 117

40

= - ( - > 117

by property 2

by property 4

by property 1

= - (&)‘(A) byproperty

by property 2

by property 4

by property 1

= -1 by property 2

Notice that we successively apply properties 4, 1, 3, and 2 in this computation. II

In general, by applying these four properties, it is possible to compute a Jacobi symbol (z) in polynomial time. The only arithmetic operations that are required are modular reductions and factoring out powers of two. Note that if an integer is represented in binary notation, then factoring out powers of two amounts to determining the number of trailing zeroes. So, the complexity of the algorithm is determined by the number of modular reductions that must be done. It is not difficult to show that at most O(logn) modular reductions are performed, each of which can be done in time O((logn)*). This shows that the complexity is 0( (log n)3), which is polynomial in log n. (In fact, the complexity can be shown to be 0( (log n)‘) by more precise analysis.)


Suppose that we have generated a random number n and tested it for primality using the Solovay-Strassen algorithm. If we have run the algorithm m times, what is our confidence that n is prime? It is tempting to conclude that the probability that such an integer n is prime is 1 - 2-m. This conclusion is often stated in both textbooks and technical articles, but it cannot be inferred from the given data.

We need to be careful about our use of probabilities. We will define the following random variables: a denotes the event

“a random odd integer n of a specified size is composite,”

and h denotes the event

“the algorithm answers ‘n is prime’ m times in succession.”

It is certainly the case that prob(bla) 5 2-“. However, the probability that we are really interested is prob(alb), which is usually not the same as prob(bla).

We can compute prob(alb) using Bayes’ theorem (Theorem 2.1). In order to do this, we need to know prob(a). Suppose N 5 n < 2N. Applying the Prime number theorem, the number of (odd) primes between N and 2N is approximately

2N N N m--= 1nN 1nN

n %--l.

Since there are N/2 M n/2 odd integers between N and 2N, we will use the estimate

2 prob(a) e 1 - -.

Inn Then we can compute as follows:

prob(a,,,) = prob(bia)prob(a) ProO)

prob(bla)prob(a) = prob(bla)prob(a) + prob(bIZ)prob(Z)

proWa) (I- &) a prob(bla) (I- &) + & prob(bja)(lnn - 2)

= prob(bla)(lnn - 2) + 2

< 2-m(lnn - 2) - 2-m(lnn-2)+2

Inn-2 = lnn-2+2m+t’

Note that in this computation, Z denotes the event


FIGURE 4.8 Error probabilities for the Solovay-Strassen test

m

1 2 5

10 20 30 50

100

2-m 1’/5

175 + 2m+t .500 .978 .250 .956

.312 x lo-’ .732

.977 x 10-s .787 x 10-l

.954x 10-6 .834 x lO-4

.931 x lo-9 .815 x lO-7 .888 x lo-‘5 .777 x 10-13 .789 x lO-3o .690 x lO-28

“a random odd integer n is prime.”

It is interesting to compare the two quantities (Inn - 2)/(lnn - 2 + 2m+‘) and 2-m as a function of m. Suppose that n M 2256 = e’77, since these are the sizes of primes that we seek for use in RSA. Then the first function is roughly 175/( 175 + 2mt’). We tabulate the two functions for some values of m in Figure 4.8.

Although 175/( 175 + 2”+’ ) approaches zero exponentially quickly, it does not do so as quickly as 2-“. In practice, however, one would take m to be something like 50 or 100, which will reduce the probability of error to a very small quantity.

We conclude this section with another Monte Carlo algorithm for Composites which is known as the Miller-Rabin algorithm (it is also known as the “strong pseudo-prime test”). This algorithm is presented in Figure 4.9. It is clearly a polynomial-time algorithm: its complexity is O((logn)3), which is the same as the Solovay-Strassen test. In fact, the Miller-Rabin algorithm performs better in practice than the Solovay-Strassen algorithm.

We show now that this algorithm cannot answer “n is composite” if n is prime, i.e., the algorithm is yes-biased.

THEOREM 4.10 The Miller-Rabin algorithm for Composites is a yes-biased Monte Carlo algorithm.

PROOF We will prove this by assuming the algorithm answers “n is composite” for some prime integer n, and obtain a contradiction. Since the algorithm answers “n is composite,” it must be the case that urn $ 1 (mod n). Now consider the sequence of values b tested in step 2 of the algorithm. Since b is squared in each iteration of the for loop, we are testing the values urn, a2m, . . . , a 2k-‘m. Since the


FIGURE 4.9 The Miller-Rabin primality test for an odd integer n

1. write n - 1 = 2”m, where m is odd

2. choose a random integer a, 1 5 a 5 n - 1

3. compute b = a”’ mod n

4. if b E 1 (mod n) then

answer “n is prime” and quit

5. fori=Otok- ldo

6. if b E - 1 (mod n) then

answer “n is prime” and quit

else

b = b2 mod n

7. answer “n is composite”

algorithm answers “n is composite,” we conclude that

u21m $ -1 (mod n)

forO<is k- 1. Now, using the assumption that n is prime, Fermat’s theorem (Corollary 4.6)

tells us that

u2km z 1 (mod n)

since n - 1 = 2km. Then u2*-lm is a square root of 1 modulo n. Since n is prime, there are only two square roots of 1 modulo n, namely, f 1 mod n. This can be seen as follows: z is a square root of 1 modulo n if and only if

n 1 (2 - l)(z + 1).

Since n is prime, either n 1 (z - 1) (i.e., t G 1 (mod n)) or n \ (z + 1) (i.e., 2 s -1 (mod n)).

We have that

so it follows that

u~*-‘~ $ -1 (mod n),

u2*-lm G 1 (mod n).


Then u~“-‘~ must be a square root of 1. By the same argument,

u~“-*~ s 1 (mod n).

Repeating this argument, we eventually obtain

urn z 1 (mod n),

which is a contradiction, since the algorithm would have answered “n is prime” in this case. 1

It remains to consider the error probability of the Miller-Rabin algorithm. Al- though we will not prove it here, the error probability can be shown to be, at most, l/4.

4.6 Attacks On RSA

In this section, we address the question: are there possible attacks on RSA other than factoring n? Let us first observe that it is sufficient for the cryptanalyst to compute 4(n). For, if n and $( ) n are known, and n is the product of two primes p, q, then n can be easily factored, by solving the two equations

n = m 4(n) = (P - I)(9 - 1)

for the two “unknowns”p and q. If we substitute9 = n/p into the second equation, we obtain a quadratic equation in the unknown value p:

p*-(n-$(n)+l)p+n=O.

The two roots of this equation will be p and q, the factors of n. Hence, if a cryptanalyst can learn the value of 4(n), then he can factor n and break the system. In other words, computing 4(n) is no easier than factoring n.

Here is an example to illustrate.

Example 4.9 Suppose the cryptanalyst has learned that n = 84773093 and 4(n) = 84754668. This information gives rise to the following quadratic equation:

p2 - 18426~ + 84773093 = 0.

This can be solved by the quadratic formula, yielding the two roots 9539 and 8887. These are the two factors of n. 0

4.6. ATTACKS ON RSA 139

4.6.1 The Decryption Exponent

We will now prove the very interesting result that any algorithm which computes the decryption exponent a can be used as a subroutine (or oracle) in a probabilistic algorithm that factors n. So we can say that computing a is no easier than factoring n. However, this does not rule out the possibility of breaking the cryptosystem without computing a.

Notice that this result is of much more than theoretical interest. It tells us that if a is revealed, then the value n is also compromised. If this happens, it is not sufficient for Bob to choose a new encryption exponent; he must also choose a new modulus n.

The algorithm we are going to describe is a probabilistic algorithm of the Las Vegas type. Here is the definition:

DEFZNZTZON4.5 Suppose 0 5 c < 1 is a real number A Las Vegas algorithm is a probabilistic algorithm such that, for any problem instance I, the algorithm may fail to give an answer with some probability E (i.e., it can terminate with the message “no answer”). However; if the algorithm does return an answec then the answer must be correct.

REMARK A Las Vegas algorithm may not give an answer, but any answer it gives is correct. In contrast, a Monte Carlo algorithm always gives an answer, but the answer may be incorrect. 1

If we have a Las Vegas algorithm to solve a problem, then we simply run the algorithm over and over again until it finds an answer. The probability that the algorithm will return “no answer” m times in succession is em. The average (i.e., expected) number of times the algorithm must be run in order to obtain an answer is in fact 1 /e (see the exercises).

Suppose that A is a hypothetical algorithm that computes the decryption exponent a from b. We will describe a Las Vegas algorithm that uses A as an oracle. This algorithm will factor n with probability at least l/2. Hence, if the algorithm is run m times, then n will be factored with probability at least 1 - l/2m.

The algorithm is based on certain facts concerning square roots of 1 modulo n, where n = pq is the product of two distinct odd primes. Recall that the congruence x2 E 1 (mod p) has two solutions modulo p, namely 2 = fl mod p. Similarly, the congruence x2 E 1 (mod q) has two solutions, namely x = fl mod q.

Now, since x2 G 1 (mod n) if and only if x2 E 1 (mod p) and x2 G 1 (mod q), it follows that x2 E 1 (mod n) if and only if x = fl mod p and x = f 1 mod q. Hence, there are four square roots of 1 modulo n, and they can be found using the Chinese remainder theorem. ‘Iwo of these solutions are x = f 1 mod n; these are called the trivial square roots of 1 modulo p. The other two square roots are called non-trivial, and they are negatives of each other modulo n.


Here is a small example to illustrate.

Example 4.10 Suppose n = 403 = 13 x 3 1. The four square roots of 1 modulo 403 are 1,92,3 11 and 402. The square root 92 is obtained by solving the system x = 1 (mod 13), x c - 1 (mod 3 1) using the Chinese remainder theorem. Having found this non- trivial square root, the other non-trivial square root must be 403 - 92 = 3 11. It is the solution to the system x - -1 (mod 13), x - 1 (mod 31). 0

Suppose x is a non-trivial square root of 1 modulo n. Then we have

n I (x - 1)(x + 11,

but n divides neither factor on the right side. It follows that gcd(x + 1, n) = p or q (and similarly, gcd(x - 1, n) = p or q). Of course, a greatest common divisor can be computed using the Euclidean algorithm, without knowing the factorization of n. Hence, knowledge of a non-trivial square root of 1 modulo n yields the factorization of n with only a polynomial amount of computation. This important fact is the basis of many results in cryptography.

In Example 4.10 above, gcd(93,403) = 31 and gcd(312,403) = 13. In Figure 4.10, we present an algorithm which, using the hypothetical algorithm

A as a subroutine, attempts to factor n by finding a non-trivial square root of 1 modulo n. (Recall that A computes the decryption exponent a corresponding to the encryption exponent b.) We first do an example to illustrate the application of this algorithm.

Example 4.11 Suppose n = 89855713, b = 34986517 and a = 82330933, and the random value w = 5. We have

ub - 1 = 23 x 360059073378795.

In step 6, TJ = 85877701, and in step 10, v = 1. In step 12, we compute

gcd(85877702, n) = 9103.

This is one factor of n; the other is n/9103 = 9871. [I

Let’s now proceed to the analysis of the algorithm. First, observe that if we are lucky enough to choose w to be a multiple of p or q, then we can factor n immediately. This is detected in step 2. If w is relatively prime to n, then we compute wr, w2’, w4r, . . ., by successive squaring, until

w2” E 1 (mod n)


FIGURE 4.10 Factoring algorithm, given the decryption exponent a

1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.

choose w at random such that 1 5 w 5 n - 1

compute x = gcd(w, n)

if 1 < x < n then quit (success: x = p or x = q)

compute a = A(b)

write ab - 1 = 2’r, T odd

compute v = 20’ mod n

if v E 1 (mod n) then quit (failure)

while v $ 1 (mod n) do

V’J = v

v = v2 mod n

if vo s - 1 (mod n) then

quit (failure)

else

compute x = gcd( ve + 1, n) (success: x = p or x = q)

for some t. Since

a6 - 1 = 2’~ E 0 (mod 4(n)),

we know that w2*’ E 1 (mod n). Hence, the while loop terminates after at most s iterations. At the end of the while loop, we have found a value va such that vi G 1 (mod n) but va $ 1 (mod n). If ve 3 -1 (mod n), then the algorithm fails; otherwise, va is a non-trivial square root of 1 modulo n and we are able to factor n (step 12).

The main task facing us now is to prove that the algorithm succeeds with probability at least l/2. There are two ways in which the algorithm can fail to factor 12:

1. wp E 1 (mod n) (step 7)

2 . W”~E-1 (modn)forsomet,O<t<s-l(stepl1)

We have s + 1 congruences to consider. If a random value w is a solution to at least one of these s + 1 congruences, then it is a “bad” choice, and the algorithm fails. So we proceed by counting the number of solutions to each of these congruences.


First, consider the congruence w’ E 1 (mod n). The way to analyze a congruence such as this is to consider solutions modulo p and modulo q separately, and then combine them using the Chinese remainder theorem. Observe that x E 1 (mod n) if and only if x E 1 (mod p) and x E 1 (mod q).

So, we first consider wp E 1 (mod p). Since p is prime, Z&,* is a cyclic group by Theorem 4.7. Let g be a primitive element modulo p. We can write w = g” for a unique integer u, 0 5 u 2 p - 2. Then we have

wr E 1 (mod p)

9 “r E 1 (mod p)

(P- 1) I ur.

Let us write

where pl is odd, and

where q1 is odd. Since

p - 1 = 2’p,

q - 1 = 2jq1

4(n) = (p - l)(q - 1) 1 (ub - 1) = 2’r,

we have that

Hence

and

Now, the condition (p - 1) I ur becomes 2’pl I ur. Since pl I T and r is odd, it is necessary and sufficient that 2’ 1 u. Hence, u = k2’, 0 5 k 5 pl - 1, and the number of solutions to the congruence w’ E 1 (mod p) is pl .

By an identical argument, the congruence w’ E 1 (mod q) has exactly q1 solutions. We can combine any solution modulo p with any solution modulo q to obtain a unique solution modulo n, using the Chinese remainder theorem. Consequently, the number of solutions to the congruence wp E 1 (mod n) is plql.

The next step is to consider a congruence w”’ E - 1 (mod n) for a fixed value t (where 0 5 t 5 s - 1). Again, we first look at the congruence modulo p and then modulo q (note that w”’ E -1 (mod n) if and only if w2” - -1 (mod p) and w2’? = -1 (mod q)). First, consider w2 ’ E -1 (mod p). Writing w = gU, as above, we get

9 u2’r G - 1 (mod p).


Since g(P-1)/2 f - 1 (mod p), we have that

dr f 9 (mod p- 1)

(P-1) I ( u2tr - p-l

2 )

2(p - 1) 1 (u2t+‘r - (p - 1)).

Since p - 1 = 2ipl, we get

2’+‘pl 1 (u2t+‘r - 2’Pl).

Taking out a common factor of pl, this becomes

2’+’ 1 (“2;” 2’) .

Now, if t 2 i, then there can be no solutions since 2’+’ 1 2t+’ but 2’+’ X2’. On the other hand, if t 5 i - 1, then u is a solution if and only if u is an odd multiple of 2”-t-’ (note that r/p, is an odd integer). So, the number of solutions in this case is

p-l 1 ,i-t-1 x 5 =2tp1.

By similar reasoning, the congruence w2’r E -1 (mod q) has no solutions if t >_ j, and 2tqt solutions if t 5 j - 1. Using the Chinese remainder theorem, we see that the number of solutions of w2’r E - 1 (mod n) is

0 if t >_ min{i, j} 22tp1q1 if t 5 min{i, j} - 1.

Now, t can range from 0 to s - 1. Without loss of generality, suppose i 5 j; then the number of solutions is 0 if t > i. The total number of “bad” choices for w is at most

p,ql+plg,(l+22+24+...+22i-2)=plq1 1+ ( 3

2 22’ = PI91 ( > 7+3 .

Recall that p - 1 = 2”pl and q - 1 = 2jql. Now, j > i 2 1, so plql < n/4. We also have that

22iplql 2 2’+jplql = (p- l)(q - 1) < 12.

Hence, we obtain

2 22’ PI91 ( > 5+3 <:+;


n = -.

2

Since at most (n - 1)/2 choices for w are “bad,” it follows that at least (n - I)/2 choices are “good” and hence the probability of success of the algorithm is at least l/2.

4.6.2 Partial Information Concerning Plaintext Bits

The other result we will discuss concerns partial information about the plaintext that might be “leaked” by an RSA encryption. Two examples of partial information that we consider are the following:

I. given y = eK(x), computepariry(y), whereparizy(y) denotes thelow-order bit of x

2. given y = eK(X), compute half(y), where half(y) = 0 if 0 5 2 < n/2 and half(y) = 1 if n/2 < 2 5 n - 1.

We will prove that, given y = eK (x), any algorithm that computes purify(y) or half(y) can be used as an oracle to construct an algorithm that computes the plaintext x. What this means is that, given a ciphertext, computing the low-order bit of the plaintext is polynomially equivalent to determining the whole plaintext!

First, we prove that computingpurity is polynomially equivalent to computing half(y). This follows from the following two easily proved identities (see the exercises):

half(y) = parity(y X eK(2) mod n) (4.1)

parity(y) = hdf(y X eK(2-‘) mod n) (4.2)

and from the multiplicativerule eK(xt)eK(E*) = eK(xtx2). We will show how to compute x = dK(y), given a hypothetical algorithm

(oracle) which computes half(y). The algorithm is presented in Figure 4.11. In steps 2-4, we compute

yj = hdfy x (eK(2))‘) = hdfleK(X x 2’)),

for 0 < i 5 log, n. We observe that

hCdf(eK(x)) = 0 @ x E [o, 5>

hdf(eK(h)) = 0 e x E [o, a) u [f, $)

hdf(eK(4x)) = 0 ($ x E 0 ’ u ’ it!! u ’ ’ u 3” E [ ‘8) [4’ 8) [27 8) [ 4’ 8)’

and so on. Hence, we can find x by a binary search technique, which is done in steps 7-l 1. Here is a small example to illustrate.

4.7. THE RABIN CRYPTOSYSTEM 145

FIGURE 4.11 Decrypting RSA ciphertext, given an oracle for computing hafly)

1. denote k = [log, n]

2. fori=Otokdo

3. yi = W(y) 4. y = (y X eK(2)) mod 72

5. lo = 0

6. hi = n

7. fori=Otokdo

8. mid = (hi + Zo)/2

9. if = 1 then yi lo = mid

else hi = mid

10. 2 = [hij

Example 4.12 Suppose n = 1457, b = 779, and we have a ciphertext y = 722. eK(2) is computed to be 946. Suppose, using our oracle for half, that we obtain the following values yi in step 3 of the algorithm:

i 0 1 2 3 4 5 6 7 8 9 10 yi 1 0 1 0 1 1 1 1 1 0 0

Then the binary search proceeds as shown in Figure 4.12. Hence, the plaintext is 2 = 1999.551 = 999. II

4.7 The Rabin Cryptosystem

In this section, we describe the Rabin Cryptosystem, which is computationally secure against a chosen-plaintext attack provided that the modulus n = pq cannot be factored. The system is described in Figure 4.13.

We will show that the encryption function eK is not an injection, so decryption cannot be done in an unambiguous fashion. In fact, there are four possible


FIGURE 4.12 Binary search for RSA decryption

% 1 2 3 4 5 6 7 8 9

10

lo mid hi 0.00 728.50 1457.00

728.50 1092.75 1457.00 728.50 910.62 1092.75 910.62 1001.69 1092.75 910.62 956.16 1001.69 956.16 978.92 1001.69 978.92 990.30 1001.69 990.30 996.00 1001.69 996.00 998.84 1001.69 998.84 1000.26 1001.69 998.84 999.55 1000.26 998.84 999.55 999.55

FIGURE 4.13 Rabin Cryptosystem

Let n be the product of two distinct primes p and q, p, q E 3 (mod 4) Let P = C = &, and define

K={(n,p,q,B):O<B<n-1).

For Ii = (n, p, q, B), define

eK(x) = x(x + B) mod n

md

k(Y) =

l%e values n and B are public, while p and q are secret.


plaintexts that could be the encryption of any given cipher-text. More precisely, let w be one of the four square roots of 1 modulo n. Let x E i&. Then, we can verify the following equations:

=w2(x+f)2- (f)’ B2 B2

=x~+Bx+~-~

=x2+Bx

= eK(X).

(Note that all arithmetic is being done in Z&, and division by 2 and 4 is the same as multiplication by 2-t and 4-l modulo n, respectively.)

The four plaintexts that encrypt to eK(x) are x, --I - B, w(x + B/2) and -w(x + B/2), where w is a non-trivial square root of 1 modulo n. In general, there will be no way for Bob to distinguish which of these four possible plaintexts is the “right” plaintext, unless the plaintext contains sufficient redundancy to eliminate three of these four possible values.

Let us look at the decryption problem from Bob’s point of view. He is given a ciphertext y and wants to determine z such that

x2 + bx E y (mod n).

This is a quadratic equation in the unknown x. We can eliminate the linear term by making the substitution xt = x + B/2, or equivalently, x = xt - B/2. Then the equation becomes

B2 xf-Bx,+-+BxI---

4 B2 yEO(modn), 2

or

xi G B2 4 + y (mod n).

If we define C = B2/4 + y, then we can rewrite the congruence as

xi E C (mod n).

So, decryption reduces to extracting square roots modulo n. This is equivalent to solving the two congruences

xf 3 C (mod p)

and

xf E C (mod q).


(There are two square roots of C modulo p and two square roots modulo q. Using the Chinese remainder theorem, these can be combined to yield four solutions modulo n.) We can use Euler’s criterion to determine if C is a quadratic residue modulo p (and modulo q). In fact, C will be a quadratic residue modulo p and modulo q if encryption was performed correctly. But Euler’s criterion does not help us find the square roots of C; it yields only an answer “yes” or “no.”

When p E 3 (mod 4), there is a simple formula to compute square roots of quadratic residues modulop. Suppose C is aquadratic residue andp = 3 (mod 4). Then we have that

(kC(P+')/4)2 G C(P+'w (mod p)

E C(P-‘)12C (mod p)

E C (mod p).

Here we again make use of Euler’s criterion, which says that if C is a quadratic residue modulo p, then C(P-t)/2 G 1 (mod p). Hence, the two square roots of C modulo p are IIIC(P+‘)/~ mod p. In a similar fashion, the two square roots of C modulo q are HY(q+1)/4 mod q. It is then straightforward to obtain the four square roots x t of C modulo n using the Chinese remainder theorem.

REMARK It is interesting that for p 3 1 (mod 4) there is no known polynomial- time deterministic algorithm to compute square roots of quadratic residues modulo p. There is a polynomial-time Las Vegas algorithm, however. 1

Once we have determined the four possible values for xt , we compute 2 from the equation I = zt - B/2 to get the four possible plaintexts. This yields the decryption formula

&c(Y) = J ;+y+

Example 4.13 Let’s illustrate the encryption and decryption procedures for the Rabin Gyp- tosystem with a toy example. Suppose n = 77 = 7 x 11 and B = 9. Then the encryption function is

eK (x) = x2 + 9x mod 77

and the decryption function is

dK(y) = m - 43 mod 77.

Suppose Bob wants to decrypt the ciphertext y = 22. It is first necessary to find the square roots of 23 modulo 7 and modulo 11. Since 7 and 11 are both congruent


FIGURE 4.14 Factoring a Rabin modulus, given a decryption oracle

1. choosearandomr,l<r<n-1

2. compute y = r2 - B2/4 mod n

3. call A(y), obtaining a decryption x

4. compute 21 = x + B/2

5. if x1 E fr (mod n) then

quit (failure)

else

gcd(zt + r, n) = p or q (success)

to 3 modulo 4, we use our formula:

23(7+‘)/“ = 22 E 4 mod 7 -

and

23(“+‘)/4 = 13 f 1 mod 11, -

Using the Chinese remainder theorem, we compute the four square roots of 23 modulo 77 to be f 10, f32 mod 77. Finally, the four possible plaintexts are:

lo-43mod77=44

67 - 43 mod 77 = 24

32-43mod77=66

45 - 43 mod 77 = 2.

It can be verified that each of these plaintexts encrypts to the ciphertext 22. 0

We now discuss the security of the Rabin Cryptosystem. We will prove that any hypothetical decryption algorithm A can be used as an oracle in a Las Vegas algorithm that factors the modulus n with probability at least l/2. This algorithm is depicted in Figure 4.14.

There are several points of explanation needed. First, observe that


so a value x will be returned in step 3. Next, we look at step 4 and note that XT E r2 (mod n). It follows that xt E fr (mod n) or xt E *wr (mod n), where w is one of the non-trivial square roots of 1 modulo n. In the second case, we have

n I (21 - ~)(xI + ~1,

but n does not divide either factor on the right side. Hence, computation of gcd(xt + r, n) (or gcd(xt - r, n)) must yield either p or q, and the factorization of n is accomplished.

Let’s compute the probability of success of this algorithm, over all n - 1 choices for the random value r. For two non-zero residues r1 and r2, define

rt -r2~r~~r~(modn).

It is easy to see that r - r for all r; rI - r2 implies r2 - r1; and r1 - r2 and r:! - rs together imply r1 - r3. This says that the relation - is an equivalence relation. The equivalence classes of Z,, \{ 0) all have cardinality four: the equivalence class containing r is the set

[r] = {hr, fwr mod n},

where w is a non-trivial square root of 1 modulo n. In the algorithm presented in Figure 4.14, any two values r in the same equiva-

lence class will yield the same value y. Now consider the value x returned by the oracle A when given y. We have

[Yl = {fY>fwYl.

If r = fy, then the algorithm fails; while it succeeds if r = fwy. Since r is chosen at random, it is equally likely to be any of these four possible values. We conclude that the probability of success of the algorithm is l/2.

It is interesting that the Rabin Cryptosystem is provably secure against a chosen plaintext attack. However, the system is completely insecure against a chosen ciphertext attack. In fact the algorithm in Figure 4.14, that we used to prove security against a chosen plaintext attack, also can be used to break the Rabin Cryptosystem in a chosen ciphertext attack! In the chosen ciphertext attack, the oracle A is replaced by Bob’s decryption algorithm.

4.8 Factoring Algorithms

There is a huge amount of literature on factoring algorithms, and a careful treatment would require more pages than we have in this book. We will just try to give a brief overview here, including an informal discussion of the best current factoring algorithms and their use in practice. The three algorithms that are most effective

4.8. FACTORING ALGORITHMS 151

FIGURE 4.15 The p - 1 factoring algorithm

Input: n and B

a=2

forj = 2toBdo

a = aj mod n

d= gcd(a- 1,n)

1 < d < n then

d is a factor of n (success)

else

no factor of n is found (failure)

on very large numbers are the quadratic sieve, the elliptic curve algorithm and the number field sieve. Other well-known algorithms that were precursors include Pollard’s rho-method and p- 1 algorithm, Williams’ p+ 1 algorithm, the continued fraction algorithm, and of course, trial division.

Throughout this section, we suppose that the integer n that we wish to factor is odd. Trial division consists of dividing n by every odd integer up to [fi]. If n < IO’*, say, this is a perfectly reasonable factorization method, but for larger n we generally need to use more sophisticated techniques.

4.8.1 The p - 1 Method

As an example of a simple algorithm that can sometimes be applied to larger integers, we describe Pollard’s p - 1 algorithm, which dates from 1974. This algorithm, presented in Figure 4.15, has two inputs: the (odd) integer n to be factored, and a “bound” B. Here is what is taking place in the p - 1 algorithm: Suppose p is a prime divisor of n, and q 5 B for every prime power q 1 (p - 1). Then it must be the case that

b-1) IB!

At the end of the for loop (step 2),

a E 2B! (mod n),

so

a G 2B! (mod p)


sincep 1 n. Now,

2p-’ = 1 (mod p)

by Fermat’s theorem. Since (p - 1) 1 B!, we have that

a s 1 (mod p)

(in step 3). Thus, in step 4,

and

P I (a - 1)

P I 12, so

pId=gcd(a-l,n).

The integer d will be a non-trivial divisor of n (unless a = 1 in step 3). Having found a non-trivial factor d, we would then proceed to attempt to factor d and n/d if they are composite.

Here is an example to illustrate.

Example 4.14 Suppose n = 15770708441. If we apply the p - 1 algorithm with B = 180, then we find that a = 11620221425 in step 3, and d is computed to be 135979. In fact, the complete factorization of n into primes is

15770708441= 135979 x 115979.

in this case, the factorization succeeds because 135978 has only “small” prime factors:

135978 = 2 x 3 x 131 x 173.

Hence, by taking B 2 173, it will be the case that 135978 I B!, as desired. 0

In the algorithm, there are B - 1 modular exponentiations, each requring at most 2 log, B modular multiplications using square-and-multiply. The gcd computation can be done in time O((logn)3) using the Euclidean algorithm. Hence, the complexity of the algorithm is O(B log B(log n)* + (log n)3). If B is 0( (log n)i) for some fixed integer i, then the algorithm is indeed a polynomial- time algorithm; however, for such a choice of B the probability of success will be very small. On the other hand, if we increase the size of B drastically, say to fi, then the algorithm will be successful, but it will be no faster than trial division.

Thus, the drawback of this method is that it requires n to have a prime factor p such that p - 1 has only “small” prime factors. It would be very easy to construct


an RSA modulus n = pq which would resist factorization by this method. One would start by finding a large prime pl such that p = 2~1 + 1 is also prime, and a large prime q1 such that q = 2ql + 1 is also prime (using one of the Monte Carlo primality testing algorithms discussed in Section 4.5). Then the RSA modulus n = pq will be resistant to factorization using the p - 1 method.

The more powerful elliptic curve algorithm, developed by Lenstra in the mid- 1980’s, is in fact a generalization of the p - 1 method. We will not discuss the theory at all here, but we do mention that the success of the elliptic curve method depends on the more likely situation that an integer “close to” p has only “small” prime factors. Whereas the p - 1 method depends on a relation that holds in the group ZP, the elliptic curve method involves groups defined on elliptic curves modulo p.

4.8.2 Dixon’s Algorithm and the Quadratic Sieve

Dixon’s algorithm is based on a very simple idea that we already saw in connection with the Rabin Cryptosystem. Namely, if we can find z $ fy (mod n) such that x2 q y* (mod n), then gcd(z - y, n) is a non-trivial factor of n.

The method uses a factor base, which is a set f3 of “small” primes. We first obtain several integers x such that all the prime factors of z2 mod n occur in the factor base B. (How this is done will be discussed a bit later.) The idea is to then take the product of several of these x’s in such a way that every prime in the factor base is used an even number of times. This then gives us a congruence of the desired type x2 q y* (mod n), which (we hope) will lead to a factorization of n.

We illustrate with a carefully contrived example.

Example 4.15

Suppose n = 15770708441 (this was the same n that we used in Example 4.14). Let f3 = {2,3,5,7,11, 13). Consider the three congruences:

8340934156* G 3 x 7 (mod n)

12044942944* E 2 x 7 x 13 (mod n)

2773700011* E 2 x 3 x 13 (mod n).

If we take the product of these three congruences, then we have

(8340934156 x 12044942944 x 2773700011)* s (2 x 3 x 7 x 13)* (mod n).

Reducing the expressions inside the parentheses modulo n, we have

9503435785* = 546* (mod n),

Then we compute

gcd(9503435785 - 546,15770708441) = 115759,


finding the factor 115759 of n. 0

Suppose f? = (~1, . . . , pi} is the factor base. Let C be slightly larger than B

(say C = B + lo), and suppose we have obtained C congruences:

xj* z plalj x p2a2j . . . x pBasj (mod n),

for 1 5 j 5 C. For each j, consider the vector

aj = (‘Ylj mod 2,. . . , 'YBj mod 2) E (ZZ)~.

If we can find a subset of the aj’s that sum modulo 2 to the vector (0, . . . , 0), then the product of the corresponding xj’s will use each factor in B an even number of times.

We illustrate by returning to Example 4.15, where there exists a dependence even though C < B in this case.

Example 4.15 (Cont.) The three vectors al, a*, a3 are as follows:

al = (O,l,O, 1,&O)

a2 = (l,O,O, l,O, 1)

aj = (1, l,O,O,O, 1).

It is easy to see that

al + a2 + ag = (0, 0, O,O,O, 0) mod 2.

This gives rise to the congruence we saw earlier that successfully factored n.

Observe that finding a subset of the C vectors al, . . . , ac that sums modulo 2 to the all-zero vector is nothing more than finding a linear dependence (over Z2) of these vectors. Provided C > B, such a linear dependence must exist, and it can be found easily using the standard method of Gaussian elimination. The reason why we take C > B + 1 is that there is no guarantee that any given congruence will yield the factorization of n. Approximately 50% of the time it will turn out that x E fy (mod n). But if C > B + 1, then we can obtain several such congruences (arising from different linear dependencies among the aj’s). Hopefully, at least one of the resulting congruences will yield the factorization.

It remains to discuss how we obtain integers xj such that the values xj* mod n factor completely over the factor base B. There are several methods of doing this. One common approach is the Quadratic Sieve due to Pomerance, which uses integers of the form xj = j + LfiJ, j = 1,2, . . . . The name “quadratic sieve” comes from a sieving procedure (which we will not describe here) that is used to determine those zj ‘s that factor over f3.


There is, of course, a trade-off here: if B = IBI is large, then it is more likely that an integer zj factors over B. But the larger B is, the more congruences we need to accumulate before we are able to find a dependence relation. The optimal choice for B is approximately

and this leads to an expected running time of

OCe (I+o(l))~lnnlnlnn

>.

The number field sieve is a more recent factoring algorithm from the late 1980’s. It also factors n by constructing a congruence x2 E y* (mod n), but it does so by means of computations in rings of algebraic integers.

4.8.3 Factoring Algorithms in Practice

The asymptotic running times of the quadratic sieve, elliptic curve and number field sieve are as follows:

quadratic sieve 0 (et 1+o(l))~iaaiG >

elliptic curve

number field sieve 0 (e(1.92+o(l))(lnn)““(lnInn)“3) 1

The notation o( 1) denotes a function of n that approaches 0 as n -+ co, and p denotes the smallest prime factor of n.

In the worst case, p M ,/ii and the asymptotic running times of the quadratic sieve and elliptic curve algorithms are essentially the same. But in such a situation, quadratic sieve generally outperforms elliptic curve. The elliptic curve method is more useful if the prime factors of n are of differing size. One very large number that was factored using the elliptic curve method was the Fermat number 2*” - 1 in 1988 by Brent.

For factoring RSA moduli (where n = pq, p, q are prime, andp and q are roughly the same size), the quadratic sieve is currently the most successful algorithm. Some notable milestones have included the following factorizations. In 1983, the quadratic sieve successfully factored a 69-digit number that was a (composite) factor of 225’ - 1 (this computation was done by Davis, Holdredge, and Simmons). Progress continued throughout the 1980’s, and by 1989, numbers having up to 106 digits were factored by this method by Lenstra and Manasse, by distributing the computations to hundreds of widely separated workstations (they called this approach “factoring by electronic mail” ).


More recently, in April 1994, a 129-digit number known as RSA-129 was factored by Atkins, Graff, Lenstra, and Leyland using the quadratic sieve. (The numbers RSA-100, RSA-110, . . ., RSA-500 are a list of RSA moduli publicized on the Internet as “challenge” numbers for factoring algorithms. Each number RSA-d is a d-digit number that is the product of two primes of approximately the same length.) The factorization of RSA-129 required 5000 MIPS-years of computing time donated by over 600 researchers around the world.

The number field sieve is the most recent of the three algorithms. It seems to have great potential since its asymptotic running time is faster than either quadratic sieve or the elliptic curve. It is still in developmental stages, but people have speculated that number field sieve might prove to be faster for numbers having more than about 125-130 digits. In 1990, the number field sieve was used by Lenstra, Lenstra, Manasse, and Pollard to factor 229 - 1 into three primes having 7,49 and 99 digits.

4.9 Note-s and References

The idea of public-key cryptography was introduced by Diffie and Hellman in 1976. Although [DH~~A] is the most cited reference, the conference paper [DH76] actually appeared a bit earlier. The RSA Cryptosystem was discov- ered by Rivest, Shamir and Adleman [RSA78]. The Rabin Cryptosystem was described in Rabin [RA79]; a similar provably secure system in which decryption is unambiguous was found by Williams [WISO]. For a general survey article on public-key cryptography, we recommend Diffie [D192].

The Solovay-Strassen test was first described in [SS77]. The Miller-Rabin test was given in [M176] and [RA~O]. Our discussion of error probabilities is motivated by observations of Brassard and Bratley [BB~~A, $8.61 (see also [BBCGPSS]). The best current bounds on the error probability of the Miller-Rabin algorithm can be found in [DLP93].

The material in Section 4.6 is based on the treatment by Salomaa [SA90, pp. 143-1541. The factorization of n given the decryption exponent was proved in [DE84]; the results on partial information revealed by RSA is from [GMT82].

As mentioned earlier, there are many sources of information on factoring algorithms. Pomerance [P090] is a good survey on factoring, and Lenstra and Lenstra [LL90] is a good article on number-theoretic algorithms in general. Bres- soud [BR89] is an elementary textbook devoted to factoring and primality testing. Cryptography textbooks that emphasize number theory include Koblitz [Ko87] and Kranakis [KRS~]. Lenstra and Lenstra [LL93] is a monograph on the number field sieve.

Exercises 4.7-4.9 give some examples of protocol failures. For a nice article on this subject, see Moore [Mo92].

Exercises 157

Exercises

4.1 Use the Extended Euclidean algorithm to compute the following multiplicative inverses:

(a) 17-l mod 101 (b) 357-l mod 1234 (c) 3 125-l mod 9987.

4.2 Solve the following system of congruences:

2~ 12(mod25)

x E 9 (mod 26)

x z 23 (mod 27).

4.3 Solve the following system of congruences:

132 = 4 (mod 99)

1% G 56 (mod 101).

HINT First use the Extended Euclidean algorithm, and then apply the Chinese remainder theorem.

4.4 Here we investigate some properties of primitive roots. (a) The integer 97 is prime. Prove that z # 0 is a primitive root modulo 97 if

andonly ifx3* f 1 (mod 97) andza $ 1 (mod 97). (b) Use this method to find the smallest primitive root modulo 97. (c) Suppose p is prime, and p - 1 has prime power factorization

n p- 1 = rI pie’,

i=l

where thepi’s are distinct primes. Prove that x # 0 is a primitive root modulo p if and only if xtp-‘l/p* $ 1 (mod p) for 1 5 i 5 n.

4.5 Suppose that n = pq, where p and q are distinct odd primes and ab I 1 (mod (p -

l)(q - 1)). The RSA encryption operation is e(x) = xb mod n and the decryption operation is d(y) = ya mod n. We proved that d(e(x)) = x if z E Z,,*. Prove that the same statement is true for any x E Z,.

HINT Use the fact that ZI = x2 (mod pq) if and only if xi ss 22 (mod p) and xi = 22 (mod q). This follows from the Chinese remainder theorem.

4.6 ‘Iwo samples of RSA ciphertext are presented in Tables 4.1 and 4.2. Your task is to decrypt them. The public parameters of the system are n = 18923 and b = 1261 (for Table 4.1) and n = 31313 and b = 4913 (for Table 4.2). This can be accomplished as follows. First, factor n (which is easy because it is so small). Then compute the exponent a from 4( n , and, finally, decrypt the ciphertext. Use ) the square-and-multiply algorithm to exponentiate modulo n.

In order to translate the plaintext back into ordinary English text, you need to know how alphabetic characters are “encoded’as elements in &. Each element of


TABLE 4.1 RSA Ciphertext

12423 11524 7243 7459 14303 6127 10964 16399 9792 13629 14407 18817 18830 13556 3159 16647 5300 13951 81 8986 8007 13167 10022 17213 2264 961 17459 4101 2999 14569 17183 15827

12693 9553 18194 3830 2664 13998 12501 18873 12161 13071 16900 7233 8270 17086 9792 14266 13236 5300 13951 8850 12129 6091 18110 3332 15061 12347 7817 7946 11675 13924 13892 18031 2620 6276 8500 201 8850 11178 16477 10161 3533 13842 7537 12259 18110 44 2364 15570 3460 9886 8687 4481 11231 7547 11383 17910

12867 13203 5102 4742 5053 15407 2976 9330 12192 56 2471 15334 841 13995 17592 13297 2430 9741 11675 424 6686 738 13874 8168 7913 6246 14301 1144 9056 15967 7328 13203

796 195 9872 16979 15404 14130 9105 2001 9792 14251 1498 11296 1105 4502 16979 1105

56 4118 11302 5988 3363 15827 6928 4191 4277 10617 874 13211 11821 3090 18110 44 2364 15570 3460 9886 9988 3798 1158 9872

16979 15404 6127 9872 3652 14838 7437 2540 1367 2512 14407 5053 1521 297 10935 17137 2186 9433 13293 7555 13618 13000 6490 5310

18676 4782 11374 446 4165 11634 3846 14611 2364 6789 11634 4493 4063 4576 17955 7965

11748 14616 11453 17666 925 56 4118 18031 9522 14838 7437 3880 11476 8305 5102 2999

18628 14326 9175 9061 650 18110 8720 15404 2951 722 15334 841 15610 2443 11056 2186

&, represents three alphabetic characters as in the following examples:

DOG + 3x26*+14x26+6 = 2398 CAT + 2x26*+0x26+19 1371 ZZZ + 25 x26*+25x26+25 1 17575.

You will have to invert this process as the final step in your program. The first plaintext was taken from ‘The Diary of Samuel Marchbanks,” by Robert-

son Davies, 1947, and the second was taken from “Lake Wobegon Days,” by Garrison Keillor, 1985.

4.7 This exercise exhibits what is called a protocol failure. It provides an example where ciphertext can be decrypted by an opponent, without determining the key, if a cryptosystem is used in a careless way. (Since the opponent does not determine the key, it is not accurate to call it cryptanalysis.) The moral is that it is not sufficient to use a “secure” cryptosystem in order to guarantee “secure” communication.

Suppose Bob has an RSA Cryptosystem with a large modulus n for which the factorization cannot be found in a reasonable amount of time. Suppose Alice sends

Exercises 159

TABLE 4.2 RSA Ciphertext

6340 8309 14010 23614 7135 24996 27584 14999 4517 25774 7647 23901 7908 8635 2149 4082 11803 5314

15698 30317 4685 1417 26905 25809

12437 1108 27106 23005 8267 9917 15930 29748 8635 27486 9741 2149 18154 22319 27705 2149 16975 16087

19554 23614 7553 3183 17347 25234 6000 31280 29413

25973 4477 30989

8936 27358 25023 16481 25809 30590 27570 26486 30388 9395 12146 29421 26439 1606 17881 7372 25774 18436 12056 13547 1908 22076 7372 8686 1304 107 7359 22470 7372 22827

14696 30388 8671 29956 15705 28347 26277 7897 20240 21519 18743 24144 10685 25234 30155 7994 9694 2149 10042 27705

23645 11738 24591 20240 27212 29329 2149 5501 14015 30155 20321 23254 13624 3249 5443 14600 27705 19386 7325 26277 4734 8091 23973 14015 107 4595 21498 6360 19837 8463 2066 369 23204 8425 7792

a message to Bob by representing each alphabetic character as an integer between 0 and 25 (i.e., A t) 0, J3 +) 1, etc.), and then encrypting each residue modulo 26 as a separate plaintext character.

(a) Describe how Oscar can easily decrypt a message which is encrypted in this way.

(b) Illustrate this attack by decrypting the following ciphertext (which was encrypted using an RSA Cryptosystem with n = 18721 and b = 25) without factoring the modulus:

365,0,4845,14930,2608,2608,0.

4.8 This exercise illustrates another example of a protocol failure (due to Simmons) involving RSA; it is called the common modulus protocol failure. Suppose Bob has an RSA Crpyotsystem with modulus n and encryption exponent bi, and Charlie has an RSA Cryptosystem with (the same) modulus n and decryption exponent b2. Suppose also that gcd(bi, b2) = 1. Now, consider the situation that arises if Alice encrypts the same plaintext x to send to both Bob and Charlie. Thus, she computes yi = x’r mod n and y2 = x62 mod n, and then she sends yt to Bob and yz to Charlie. Suppose Oscar intercepts yi and yz, and performs the computations indicated in Figure 4.16.

(a) Prove that the value x I computed in step 3 of Figure 4.16 is in fact Alice’s plaintext, z. Thus, Oscar can decrypt the message Alice sent, even though the cryptosystem may be “secure.”

(b) Illustrate the attack by computing x by this method if n = 18721, bt = 945, b2 = 7717, yt = 12677 and yr = 14702.

4.9 We give yet another protocol failure involving RSA. Suppose that three users in a network, say Bob, Bart and Bert, all have public encryption exponents b = 3.


FIGURE 4.16 RSA common modulus protocol failure

Input: n, h, b2, YI, ~2 1. compute ct = bt -’ mod b2 2. computec2 = (ctbt - 1)/b* 3. compute 21 = ytcl (~2~~)~' mod n

4.10

4.11

4.12

4.13

4.14

Let their moduli be denoted by nt, n2, na. Now suppose Alice encrypts the same plaintext 2 to send to Bob, Bart and Bert. That is, Alice computes yi = x3 mod ni, 1 5 i 5 3. Describe how Oscar can compute x, given yt,y2 and ys, without factoring any of the moduli. A plaintext I is said to beJiredif eK(z) = x. Show that, for the RSA Cryptosystem, the number of fixed plaintexts z E Z n*isequaltogcd(b-l,p-l)xgcd(b-l,q-l).

HINT ConsiderthesystemoftwocongruenceseK(z) =x (modp),eK(z) 3 x (mod 4.

Suppose A is a deterministic algorithm which is given as input an BSA modulus n and a ciphertext y. A will either decrypt y or return no answer. Supposing that there are e x n ciphertexts which A is able to decrypt, show how to use A as an oracle in a Las Vegas decryption algorithm having failure probability e.

HINT Use the multiplicative property of RSA that eK(xt)eK(x2) = eK(ztz2). where all arithmetic operations are modulo n.

Write a program to evaluate Jacobi symbols using the four properties presented in Section 4.5. The program should not do any factoring, other than dividing out powers of two. Test your program by computing the following Jacobi symbols:

(3 ’ GG%? 1 (:::p:lTl> . Write a program that computes the number of Euler pseudo-primes to the bases 837, 851,and 1189. The purpose of this question is to prove that the error probability of the Solovay- Strassen primality test is at most l/2. Let z,* denote the group of units modulo n. Define

G(n) = { a : a E &‘, ; E &4/* 0

(mod n)} .

(a) Prove that G(n) is a subgroup of z,‘. Hence, by Lagrange’s theorem, if G(n) # &*, then

IG(n)I 5 v 5 q.

(b) Suppose n = pkq, wherep and q are odd, p is prime, k 2 2, and gcd(p, q) =

1. Let 0 = 1 +pk--Iq. Prove that

0 z $ a(“-‘)‘* (mod n).

HINT Use the binomial theorem to compute o(R-1)/2.

Exercises 161

(c) Suppose n = p1 . . .ps, where the pi’s are distinct odd primes. Suppose a = u (modpt) ando = 1 (modmm . . . p,), where u is a quadratic non- residue modulo p1 (note that such an a exists by the Chinese remainder theorem). Prove that

a 0

- n

z -1 (mod n),

but a (n-‘)‘2 z 1 (mod ~2~3.. .pd),

so a(n-‘)‘2 f -1 (mod n).

(d) If n is odd and composite, prove that IG(n)I <_ (n - 1)/2. (e) Summarize the above: prove that the error probability of the Solovay-Strassen

primality test is at most l/2. 4.15 Suppose we have a Las Vegas algorithm with failure probability e.

(a) Prove that the probability of first achieving success on the nth trial is pn =

en-y 1 - e). (b) The average (expected) number of trials to achieve success is

m

C( n x p,). VI=1

Show that this average is equal to l/( 1 - e). (c) Let 6 be a positive real number less than 1. Show that the number of iterations

required in order to reduce the probaility of failure to at most 6 is

log2 6 L I G’

4.16 Suppose Bob has carelessly revealed his decryption exponent to be a = 14039 in an RSA Cryptosystem with public key n = 36581 and b = 4679. Implement the probablistic algorithm to factor n given this information. Test your algorithm with the “random” choices w = 9983 and w = 13461. Show all computations.

4.17 Prove Equations 4.1 and 4.2 relating the functions half and parity. 4.18 Supposep = 199, q = 211 and B = 1357 in the Rabin Cryptosystem. Perform

the following computations. (a) Determine the four square roots of 1 modulo n, where n = pq. (b) Compute the encryption y = eK(32767). (c) Determine the four possible decryptions of this given ciphertext y.

4.19 Factor 262063 and 9420457 using the p - 1 method. How big does B have to be in each case to be successful?

Documents

The RSA System and Factoring 116 4.3 The RSA Cryptosystem ...crypto.cs.mcgill.ca/~crepeau/CS547/BOOK/Theory4.pdf · This and related systems are based on the difficulty of the subset