Notes on Cryptography 2014


    POLITECNICO DI TORINO

    Notes on Cryptography

    by MICHELE ELIA

Academic Year 2014-2015


my teaching, including in obscure times and in very difficult conditions. Unfortunately, their hope of having written notes has been satisfied too late; however, they have my sincere and permanent gratitude for their anonymous, unrewarded but warm and tangible support. I am indebted to Frances Cooper for her professional and friendly revision of the English, a revision that has also greatly improved the presentation from a logical point of view. I also want to thank Dr. Guglielmo Morgari (of Telsy) for his careful reading and for pointing out a great many typos and mistakes. The final technical quality owes much to his professional and deep knowledge of the subject. Obviously, any error or questionable viewpoint is my responsibility alone, and due to my own many limitations.

Turin, September 2014
Michele Elia


Texts

J. Hoffstein, J. Pipher, J.H. Silverman, An Introduction to Mathematical Cryptography, Springer, New York, 2008.

N. Koblitz, A Course in Number Theory and Cryptography, Springer, New York, 1987.

F. Fabris, Teoria dell'Informazione, Codici, Cifrari, Boringhieri, Torino, 2001.

R. Mollin, An Introduction to Cryptography, CRC, New York, 2007.

Manuals

A.J. Menezes, P.C. van Oorschot, S.A. Vanstone, Handbook of Applied Cryptography, CRC, New York, 1997.


Contents

1 Cryptography from Art to Science - 1.1 -
  1.1 Introduction - 1.1 -
  1.2 Information Protection - 1.2 -
    1.2.1 The goals of Information Protection - 1.2 -
    1.2.2 Aims - 1.4 -
    1.2.3 Summary - 1.4 -
  1.3 Historical glimpses - 1.5 -
    1.3.1 Cryptography from diplomacy to commerce - 1.5 -
    1.3.2 From art to science - 1.8 -

2 The Shannon theory of secrecy systems - 2.1 -
  2.1 Introduction - 2.1 -
  2.2 Uncertainty: Entropy and Mutual Information - 2.3 -
  2.3 Uncertainty and Secrecy - 2.4 -
    2.3.1 Binary message encryption - 2.8 -
  2.4 Cryptology - 2.8 -
  2.5 Cryptography - 2.9 -
  2.6 Steganography - 2.12 -

3 Random Sequences and Statistics - 3.1 -
  3.1 Introduction - 3.1 -
    3.1.1 Sample Spaces - 3.3 -
  3.2 Statistical Tests for Binary Sequences - 3.6 -
    3.2.1 Linear Complexity Profile - 3.13 -

4 Secret-Key Cryptography - Act I: Block ciphers - 4.1 -
  4.1 Introduction - 4.1 -
  4.2 The role of the secret key - 4.1 -
  4.3 Historical Encryption Systems - 4.3 -
    4.3.1 Substitution encryption - 4.3 -
    4.3.2 Transposition encryption - 4.3 -
    4.3.3 Alberti's disk - 4.4 -
    4.3.4 Vigenère cipher - 4.5 -
    4.3.5 Hill Cipher - 4.5 -


    4.3.6 Francis Bacon Cipher - 4.6 -
    4.3.7 One-time pad - 4.6 -
    4.3.8 Enigma - 4.7 -
  4.4 Block ciphers - 4.7 -
    4.4.1 Common structure of block ciphers - 4.7 -
    4.4.2 Modes - 4.8 -
  4.5 DES - 4.11 -
    4.5.1 DES transformations - 4.13 -
    4.5.2 Local key generation - 4.15 -
  4.6 AES - 4.17 -
    4.6.1 Round Transformations - 4.19 -
    4.6.2 Local Key generation - 4.20 -

5 Secret-Key Cryptography - Act II: Stream ciphers - 5.1 -
  5.1 Introduction - 5.1 -
    5.1.1 The structure - 5.2 -
    5.1.2 Finite State Machines - 5.3 -
  5.2 Output functions - Boolean functions - 5.3 -
  5.3 Periodic generators and LFSRs - 5.5 -
    5.3.1 The mathematics of LFSRs - 5.7 -
  5.4 Linear Codes and Binary sequences - 5.9 -
    5.4.1 BCH codes - 5.11 -
    5.4.2 Goppa codes - 5.12 -
  5.5 Nonlinear Feedback Shift Registers - 5.13 -
    5.5.1 Clock-controlled LFSR - 5.13 -
    5.5.2 Self-Clock-controlled LFSR - 5.14 -
    5.5.3 Clock-controlling and puncturing - 5.15 -
    5.5.4 LCP of clock-controlled LFSR sequences - 5.15 -
  5.6 Encryption with rate less than 1 - 5.17 -
  5.7 Appendix I - Representation of Finite Fields - 5.20 -
  5.8 Appendix II - Linear recurrent equations in F_q - 5.21 -
    5.8.1 Generating functions - 5.23 -
    5.8.2 Characteristic equation methods - 5.27 -
  5.9 Appendix III - Tridiagonal matrices and LFSRs - 5.31 -

6 Public-key Cryptography - 6.1 -
  6.1 Introduction - 6.1 -
    6.1.1 One-way functions - 6.3 -
  6.2 The RSA Scheme - 6.6 -
  6.3 The Rabin Scheme - 6.10 -
  6.4 The El Gamal Scheme - 6.12 -
  6.5 The McEliece Scheme - 6.14 -


7 Electronic signatures - 7.1 -
  7.1 Introduction - 7.1 -
    7.1.1 Electronic signature of an electronic document - 7.4 -
  7.2 Components of Electronically Signed Documents - 7.5 -
    7.2.1 Document - 7.5 -
    7.2.2 Standard hash function SHA-1 - 7.8 -
  7.3 Signature based on RSA - 7.8 -
  7.4 Signature based on the Rabin scheme - 7.9 -
  7.5 Signature based on El Gamal - 7.13 -
  7.6 Blind signature - 7.14 -
  7.7 Secret Sharing - Shamir - 7.16 -

8 Complexity - 8.1 -
  8.1 Introduction - 8.1 -
    8.1.1 A heuristic view of computational complexity - 8.3 -
  8.2 Complexity: the Heart of Cryptography - 8.5 -
    8.2.1 One-way functions - 8.7 -
  8.3 Arithmetic complexity - 8.8 -
    8.3.1 Complexity of product and exponentiation - 8.8 -
    8.3.2 Finite field Arithmetics - 8.9 -
  8.4 Factorization complexity - 8.10 -
    8.4.1 Factorization in Z - 8.11 -
  8.5 Discrete logarithm - 8.11 -
    8.5.1 Discrete logarithm as one-way function - 8.13 -
    8.5.2 Discrete Logarithm Complexity - 8.14 -
    8.5.3 Shanks' Bound - 8.16 -
  8.6 Searching Unsorted Data (SUD) - 8.17 -

9 ECC - 9.1 -
  9.1 Introduction - 9.1 -
  9.2 Elliptic Curves and Group Law - 9.3 -
    9.2.1 Group Law - 9.4 -
  9.3 EC over Finite Fields - 9.7 -
  9.4 EC Public-key Schemes - 9.10 -
  9.5 Arithmetics and complexity in ECC - 9.11 -
  9.6 Historical Notes - 9.14 -
    9.6.1 The origins - 9.15 -

10 Cryptanalysis - 10.1 -
  10.1 Introduction - 10.1 -
  10.2 Axioms - 10.2 -
  10.3 Cryptanalysis of secret-key systems - 10.3 -
    10.3.1 Cryptanalysis of classic schemes - 10.4 -
  10.4 DES Cryptanalysis - 10.20 -
  10.5 Cryptanalysis of Public Key Systems - 10.23 -


    10.5.1 Factorization - 10.23 -
    10.5.2 Discrete logarithms - 10.26 -

11 Cryptography in GSM - 11.1 -
  11.1 Evolution of cellular systems - 11.2 -
  11.2 GSM - 11.3 -
    11.2.1 Origins - 11.3 -
    11.2.2 Communication aspects - 11.5 -
    11.2.3 Security and Protections - 11.6 -
  11.3 Conclusions - 11.9 -

12 Steganography - 12.1 -
  12.1 Introduction - 12.1 -
  12.2 Some historical notes - 12.2 -
  12.3 Steganographic channel models - 12.4 -
  12.4 Concealment issues - 12.6 -
    12.4.1 Examples, Simulation, and Results - 12.7 -
  12.5 Conclusions - 12.11 -


Chapter 1

Cryptography from Art to Science

Some people are so busy
learning the tricks of the trade
that they never learn the trade.

    VERNON LAW (Pittsburgh Pirates pitcher)

1.1 Introduction

It is a fact of recent history that, in the last two decades of the twentieth century, a scientific, technological, and cultural revolution swept through the communication systems of high-technology countries. Satellite telecommunications, cellular telephony, digital television, the Internet and personal computers show that the convergence of telecommunications and computer technology has overturned the entire world order of Information Technology. This atypical revolution has had unforeseeable repercussions also on the traditional methods of knowledge production and transmission. However, the effects in these fields will only be observed in the coming decades, and they will probably turn out to be much more far-reaching than the highly visible modifications already produced on the world economy and finance. Commerce is increasingly based on the Internet, with sometimes disturbing effects on the consolidated systems of dealing in and handling goods. In the banking world, thanks to the Internet, the traditional branch has expanded to enter the homes of net customers, modifying both the way users relate to the banking system and the inner organization of the banks themselves.

Whereas in one respect these perhaps irreversible phenomena have improved the quality of life, they have conversely made the system as a whole more fragile and more sensitive to any recession. Adversaries of all types, compatriots or foreigners, governmental or private bodies, can order and scan plain text they have intercepted and selected, based on details of your address or on convenient key words present in the message. This improper monitoring activity has been going on for decades, obviously even before the computer made the job so much easier. The novelty comes from the proportions and the number of customers who entrust their personal transactions and secrets to fiber optics, to copper cables, or to the ether. The more technologically advanced a country is, the more susceptible it will usually be to interception of electronic traffic. Therefore, protection of information is becoming an unavoidable necessity to assure a society's operative life.

The technologies for protecting information have been developed in the discipline known as cryptology. For millennia, cryptology had as its main objective the confidentiality of information, but in recent times the technological evolution, together with the creation of a world-wide society with integrated services and global systems of communication, has delegated much more extensive, wider-ranging and more complex objectives to cryptology. Specifically, the number of services that need some form of information protection is continuously growing. Any list would fail to be complete, but would be topped by the telephone, e-mail, e-commerce, tele-working, remote monitoring, and tele-medicine, and could continue almost indefinitely.

1.2 Information Protection

It does not appear that the definition of a system for protecting information can be formulated in a definitive manner through authoritative statements. Rather, security comes from the concurrence of needs, situations, and purposes that contribute to defining the scenario in which information plays the role of principal actor.

A system for the protection of information depends on:

1. Accuracy of the principles.

2. Robustness of the mathematical procedure used to transform the information.

3. Physical security of the technological equipment that processes the information and of the environments where such devices reside.

4. Discipline of employees, by discipline meaning the mental attitude and the behavioral attention to details whose neglect could make even the most technically-secure system vulnerable.

As just noted, security systems bring together many components of a human and technical nature. Among these, an important role is played by cryptology and related mathematical techniques.

    1.2.1 The goals of Information Protection

The objectives for protecting information against deliberate manipulation should, in general, respond to four basic questions:

    1) What information to protect?


i) The message as such, keeping it confidential;

ii) The integrity of the message, that is, guaranteeing it is received correctly by the recipient, whether privately or not;

iii) The authenticity of the message, that is, reassuring the recipient about the identity of the message's author;

iv) The very existence of the message.

    2) Why protect the information?

(a) To ensure integrity: information should be preserved in its original form. It must not be fraudulently altered and passed off as authentic.

(b) To ensure availability: information should be usable when required, without delay or uncertainty.

(c) To ensure confidentiality: the information must be kept as private as the owner wants. Only authorized persons or entities can have access.

(d) To ensure privacy: it should not be possible to trace the source of information.

    3) Against whom to protect the information?

(a) Against opponents determined to steal it;

(b) Against accidental or deliberate destruction;

(c) Against improper or unauthorized use.

    4) How to protect the information?

(a) In a physical manner, i.e. endowing physical locations or equipment with defenses difficult to crack;

(b) In a logical manner, that is, by transforming the information so that it cannot be stolen, understood, or manipulated by any opponent;

(c) In a virtual way, namely by preventing persons from locating the information in real terms.

Although these statements may sound authoritative, it is not in any way possible to give definite and final answers to the above four questions, if such responses even exist. Rather, these questions and their partial answers direct the presentation of cryptography and related mathematical techniques, to give security managers the most valuable tools available at the current state of knowledge. With reference to how to protect the information, the techniques developed to hide the very existence of the message have had a somewhat more esoteric development than cryptographic techniques proper, and fell into the discipline known as steganography (a word of Greek origin that means "covered writing"). The first recorded use of the word is in the title of a book by Trithemius. Steganography has recently experienced a great revival, mainly thanks to the Internet, and a short overview will be given in the last chapter of these Notes.


    1.2.2 Aims

The situation that pits the defender against the attacker has a dual aspect, which characterizes the two main branches into which Cryptology is partitioned: cryptography/steganography and cryptanalysis.

Cryptography/steganography pursues five main goals:

- To protect against intruders, ensuring that access to the information is reserved to authorized persons, entities, or devices.

- To protect from deliberate destruction or alteration, ensuring the data's integrity, both logical (the meaning of the texts) and physical (the supporting paper, magnetic tapes, CD-ROMs, etc.).

- To prevent shadowing (authenticity), namely to ensure recognition of the source of information.

- To prevent repudiation (signature), ensuring the impossibility of denying the origin of a message.

- To prevent tracking, ensuring anonymity of the source and route of messages, objects, or people.

The purposes of cryptanalysis are the converse of the aims in the above list, namely:

    - To determine the contents of a message.

- To destroy a message, i.e. to deliberately prevent communication between two parties.

- To falsify, that is, to send a message as if it were from another author, such as launching a communication with a party and being accepted as a legitimate counterpart.

- To deny being the author of one's own message.

    - To trace the origin and path of messages, objects, or people.

    1.2.3 Summary

The five situations considered above are at the core of modern cryptology, and can all be incorporated into a mathematical description in the framework of Shannon's information theory. However, for practical purposes, it has been preferred to develop a discipline that is apparently independent, referring to information theory only for the basic principles. This will be the subject of the following chapters.


    1.3 Historical glimpses

The millenary history of cryptology began in ancient Egypt at the Court of the Pharaohs where, between sphinxes, pyramids, and plots, for millennia the power game was played. But it was the warrior soul of Greece, with its oligarchic system, kingdoms, and ambitions of military and cultural domination, that first systematically applied a cryptographic method of which we have any certain knowledge. In the silent palaces of Sparta, King Agide encrypted messages directed to his distant generals in charge of controlling the eastern Mediterranean by rolling a strip of papyrus helicoidally around a skytale (command baton) and writing his message along the length of the roll. The straightened strip of papyrus with the encrypted message looked like a chaotic set of symbols. To read the message, the general rolled the strip around a baton of the same diameter. Today, these procedures for exchanging secret messages may move us to smile. Nevertheless, they solved the problem of private communication in an acceptable way, compatible with the available technology.
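Mechanically, the skytale implements a columnar transposition: the diameter of the baton, i.e. the number of letters per wrap, plays the role of the key. The following is only an illustrative sketch (the function names and the padding convention are ours, not part of the historical procedure):

```python
def skytale_encrypt(plaintext: str, turns: int) -> str:
    """Write the text in rows of `turns` letters (one row per wrap of the strip),
    then read the unrolled strip column by column."""
    text = plaintext.replace(" ", "").upper()
    # Pad so the strip is completely filled (the padding letter is arbitrary).
    while len(text) % turns != 0:
        text += "X"
    rows = [text[i:i + turns] for i in range(0, len(text), turns)]
    return "".join(row[c] for c in range(turns) for row in rows)

def skytale_decrypt(ciphertext: str, turns: int) -> str:
    """Re-roll the strip around a baton of the same diameter: invert the transposition."""
    nrows = len(ciphertext) // turns
    cols = [ciphertext[i:i + nrows] for i in range(0, len(ciphertext), nrows)]
    return "".join(col[r] for r in range(nrows) for col in cols)
```

For example, `skytale_encrypt("HELP ME I AM UNDER ATTACK", 4)` yields the scrambled strip `"HMMETEEURALINACPADTK"`, and decryption with the same baton diameter recovers the text. Note that, as a pure transposition, the key is only the diameter: an interceptor can simply try batons of increasing size.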

    1.3.1 Cryptography from diplomacy to commerce

    From the Spartan hegemony on the Aegean sea, through the grandeur of theRoman Empire, the effervescent political and cultural milieu of the Italian Re-naissance, down to the modern supra-national governments, cryptography has been variously, but almost exclusively, used in affairs of power. The impulse toits development was almost always given by the exigencies of war. Surely, thecomplex needs of the Roman army to exchange secret messages at the time of Gaius Julius Caesar promoted the invention and diffusion of a method for con-cealing information that was relatively secure, and at the same time operativelyeasy. The cryptographic method known as Caesars cipher consisted in substitut-ing each letter with a letter three positions onward (in the natural alphabeticalorder from A to Z of the letters). For example, the letter A is substituted with D,B with E, and so on, W being replaced by Z. The last three letters X, Y, and Z aresubstituted with A, B, and C, respectively. The rule was very easy and number3 was the secret key for enciphering and deciphering. The decryption operationto recover the original message from the encrypted text consisted of the inversesubstitution, which can be described similarly: each letter is substituted with theletter three positions before it. Technically, in jargon, this encryption rule is called

    mono-alphabetic, while its generalization is called polyalphabetic substitution.This general and relatively strong encryption rule (i.e. polyalphabetic substitu-tion) was perfected by Blaise de Vigen` ere and reported in his Traict e des chiffres,ou secretes manieres d escrire published in 1586, where a square table that bearshis name appeared with a certain emphasis, for the rst time. However, this ta- ble had already been reported in De Furtivis Literarum Notis by Giovanni BattistaDella Porta, published in 1563. The Vigen` ere polyalphabetic enciphering waslong considered impossible to crack. In polyalphabetic encryption, the key con-sists of an ordered set of numbers (or letters), for example, encrypting with a key


consisting of the numbers 3 and 12, the letters of the text, starting from the first, are alternately transformed by mono-alphabetic substitution as in the Caesar cipher, with keys 3 and 12.

The principle of substituting a message with a second message, according to a rule governed by a secret key that is easy to remember, in such a way that only the person who has the secret key may go from the encrypted message back to the original message, constitutes the essential part of private-key encryption. The first treatises about these cryptographic techniques appeared around the sixteenth century in Italy, although the first known manual of cryptography had already been published in 1379 by one Gabriele de Lavinde of Parma, possibly a cryptographer who served in the secretariat of Clemente VII, the antipope.

A prominent position in the literature on cryptography is occupied by De Componendis Cyfris (1466) by Leon Battista Alberti, a work in which, together with the principle of polyalphabetic encryption, the first encryption disc is described and the concept of cryptanalysis is introduced. In particular, several methods for attacking encrypted messages are proposed. Cryptanalysis was highly prized, as testified by many books of the same period, written by scientists of the time who also acted as court cryptographers, such as Cicco Simonetta with the Sforza family in Milan, Giovanni Soro serving the Venetian Republic, and Giovanni Battista Argenti, who served the Pope. Interest in the art of secret writing was certainly stimulated by the quarrelsome nature of the princes of the time, and by the typical Italian taste for political manoeuvring. Meanwhile, connoisseurs of cryptography were artists, scientists, and politicians working directly in other sectors, like, for example, the mathematicians of the Bologna school. In this cultural environment, the contribution to cryptography made by the great Lombard mathematician Girolamo Cardano was both varied and remarkable. In his work De Subtilitate libri XXI (1550), Cardano describes, among other subjects, a lock with rotors that could be unlocked only by a given letter combination (the letters being written on the external side of the rotors). Among the eminent Europeans who were interested in cryptography during the fortunate Renaissance period are men of vast cultural interests, like the already-cited Leon Battista Alberti and Giacomo Casanova (who first intuited how to break polyalphabetic encryption), famous mathematicians like John Wallis and François Viète, and Francis Bacon, philosopher and statesman. In the seventeenth and eighteenth centuries, progress in the cryptographic field was slow and insignificant. The evolution restarted, quite suddenly, around the middle of the nineteenth century, sustained this time by the industrial revolution and by the economic and governmental interests of the great modern States. In 1863, the important work Die Geheimschriften und die Dechiffrir-Kunst by Friedrich W. Kasiski was published. This book describes in detail how to break, with cryptanalysis, the Vigenère encryption. Despite general works like the Traité de cryptographie by Lange and Soudart, published in 1925, or the important Manuale di crittografia by General Luigi Sacco, published in 1936, which is one of the best-known and most interesting cryptography treatises of the early twentieth century, the true great leap forward toward a mathematical theory of cryptography only occurred at the end of the second world war. The


most significant progress in the nineteenth century comprised the introduction of encryption machines, the natural evolution of the enciphering discs introduced in the Renaissance. In fact, the diffusion of cryptography, especially in military contexts, due to the size and composition of armies, entailed using operators who were not skilled as designers of secret codes, imposing de facto the use of automatic systems for encrypting and decrypting, that is, enciphering machines. The demands of war gave a big boost to the improvement of such equipment. Obviously, cryptographic technology had a development parallel to that of the principles and techniques for masking (protecting the secrecy of) information. Nevertheless, to become effective, fast, and reliable instruments, encryption machines needed the mechanical and electrical technology that only became available with the advance of industrial development. In La Cryptographie militaire (1883) Auguste Kerckhoffs von Nieuwenhof formulated the main principles of cryptology that must be satisfied by any encryption equipment used by the army. In 1891, the Frenchman Étienne Bazeries invented an encryption machine that was used (at least) until the second world war.

In 1917, the American Gilbert S. Vernam of AT&T invented an encryption machine for teletypes based on the polyalphabetic enciphering of Vigenère, which was adopted by the U.S. Army Signal Corps. Vernam's great innovation was the way that the enciphered text was obtained by combining, bit by bit, two binary sequences, introducing de facto the modern stream ciphers. The encryption machine designed by the Swede Boris Hagelin around 1920 became famous, and was also used by the American army. The Hagelin enciphering machine was a competitor of the Enigma encryption machine used by the German army. The Enigma machine was invented by the German Arthur Scherbius, an electrical engineer, and patented in April 1918. It was composed of four rotating discs (rotors), whose initial position was part of the secret key. It was first used by the German post for encrypting telegrams. After long cryptanalytical studies, it was adopted by the German Navy in 1926. The structure of Enigma was further improved, until it achieved satisfactory strength against any kind of cryptanalytical attack. Without management mistakes, or partial private information on the secret keys, it could be considered definitively secure.

Enigma represents the apex of the evolution of the electro-mechanical encryption machines based on rotors, and fully motivates the great efforts made by the Allies to decrypt its messages. During World War Two, attacks against Enigma were organized by the English secret services with an extraordinary deployment of resources. At Bletchley Park, a town 75 km north-west of London, a group of cryptographers worked for the entire duration of the war, trying to decrypt, with alternating fortunes and acceptable success, the messages encrypted by Enigma. In these efforts, the first electronic computers were employed to implement mathematical attack criteria. Attacks were first developed by Polish mathematicians, and later by a collaboration among famous mathematicians including Alan Turing.

After the second world war, encryption machines continued to be introduced, implemented with the new electronic technologies, but the underlying algorithms


    were still based on the old principle of rotors, borrowed from Albertis disc. Tomeet the requirements of the globally-expanding economy, several standardiza-tion processes were started. In the 1970s, the most widely-debated system forprivate key encryption was the DES (Data Encryption Standard) proposed by theAmerican National Bureau of Standards, and developed on an initial project by

IBM. DES, and its successor AES (Advanced Encryption Standard), may represent the last step of rotor-machine development.
In the subsequent evolution, the word machine is still maintained, but it must be understood as a mathematical computing procedure realized by means of algorithms. All these machines are commonly known as encrypting machines. They realize private-key encryption and represent the modern variant of the Caesar cipher, improved with tricks aimed at achieving the perfect encryption system known as the one-time pad. Actually, this system, notoriously used in the fascinating world of spying, encrypted a message by substituting the message letters through a combination with letters found at special positions in the pages of a booklet (pad) used only one time. The system is practically unbreakable without knowing the book.

    1.3.2 From art to science

Cryptography was treated as an art for centuries. From invisible ink to mechanisms combined with key words to open secret doors. From rings that should combine to show incredible secret passages, to mysterious combinations of carillon notes that open fabulous strongboxes. From the vaguely cryptic love messages of Cyrano de Bergerac to the beautiful Roxane, to the light signals between lovers in the Certosa di Parma by Stendhal, every action, instrument, or event contributed to making cryptography a mysterious art.
However, the needs of the governments of the great modern states called for something more than a reliance on experts, however loyal, or accredited men skilled in cryptography. Thus, it is not entirely by chance that the English philosopher and statesman Francis Bacon formulated the basic criteria that should be met by good encryption systems. Bacon made a major contribution to the rise of a scientific theory of cryptography. But only relatively recently has a complete axiomatic formulation of cryptography been achieved, by merit of Claude Elwood Shannon with the publication of his paper Communication Theory of Secrecy Systems in 1949. Actually, this paper was already completed in 1945, but it was classified material, and only after its declassification could it appear in a publicly-distributed journal. Key to this rigorous description of encryption/decryption operations was Information Theory, a mathematical theory of communication also due to Shannon. As a result of this approach, all impossible expectations of cryptographic protection were abandoned. With Shannon's proof that certainty in any cryptographic protection does not exist, the dream of perfect secrecy finally waned. All protection is of a probabilistic nature. We may only reduce the probability of violating a secret system, but we will never achieve the certainty of absolute inviolability.


The axiomatic formulation made cryptography a discipline similar to mathematics. Today, the prophetic words pronounced by Adrian A. Albert at the opening of the 382nd meeting of the American Mathematical Society in 1941 are astonishingly concrete:

We shall see that cryptography is more than a subject permitting mathematical formulation, for indeed it would not be an exaggeration to state that abstract cryptography is identical with abstract mathematics.


Chapter 2

The Shannon theory of secrecy systems

There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success, than to take the lead in the introduction of a new order of things.

Niccolò Machiavelli

    2.1 Introduction

The theoretical foundation of modern cryptography can indisputably be attributed to Claude Elwood Shannon, with his 1949 publication of the paper Communication Theory of Secrecy Systems in the Bell System Technical Journal [68]. The paper, almost surely completed in 1945, was considered classified material (a document or piece of information is classified when it is considered important for national security) and was declassified only four years later, just before its appearance in the open literature. This paper, which founded cryptology, came one year after an even more important paper by Shannon, A Mathematical Theory of Communication, which appeared in 1948 in the same Bell System Technical Journal and was decisive for the theoretical and practical development of telecommunications systems [69]. In it, after having introduced

axiomatically a measure of information, Shannon developed a totally new theory by means of which to describe all problems concerning the transmission, storage, and transformation of information. Shannon's measure of information has the character of a physical quantity, analogous to the measures of surface, energy, time, speed, etc.
This definition of a measure of information has enabled the true nature of information to be established: tied to uncertainty, and discrete in character. It has also permitted a better comprehension of the mechanisms conveying information and, in cryptography, an understanding of the achievable limits of data security.


All information useful to mankind is represented by means of a finite number of symbols; in particular, the binary form, consisting of the use of two symbols, typically represented as 0 and 1, is ubiquitous and has been universally adopted. The signals, or waveforms, used to carry information are continuous in time and amplitude, that is, they are well described by continuous functions. Nevertheless, today, most of them may be seen as digital objects, in the sense that they are chosen from among a finite set of waveforms. However, many information signals are still continuous, or analog, like audio signals or some broadcast video signals, and the corresponding telecommunications systems are still said to be analog systems.
This situation is rapidly changing, as Shannon's information theory is imposing the digital view, its philosophical significance, and its axiomatic formulation, which, being strictly mathematical, is gradually dominating over every naive approach. Shannon's fundamental theorems are restricted to telecommunications systems; however, two theorems emerging from the theory are of great importance in many fields of science.

    Theorem 2.1. The nature of information is inherently discrete.

The formal proof of this theorem is based on the notion of differential entropy. However, a heuristic argument is the following: the information associated with analog signals is infinite, that is, if we want to describe an analog signal of stochastic character we need an infinite amount of information; on the other hand, over any continuous channel conveying an analog signal, the amount of information given by the received signal about the transmitted signal is finite, that is, channel noise destroys an infinite amount of information. In conclusion, the received information,

being finite, can be described with a finite set of symbols.
The second theorem is more useful in computer science, although its consequences are also important in communications systems and cryptography.

Theorem 2.2. (Data Processing Theorem). Each time digital data are processed (i.e. transmitted, transformed, or stored), the amount of information carried may only diminish.

Paradoxically, it often occurs that, after processing, the information is apparently more meaningful, because it is in a form compliant with what we may perceive or understand. This undoubted advantage induces the erroneous impression that the transformations have extracted all the information contained in the raw data. Nevertheless, the information contained in the friendlier data is inevitably diminished. In other words, we have lost something.

It took several decades before the consequences of these theorems produced visible effects on the evolution of telecommunications and computer science. Now that the transition to digital is completed, the value of these theorems is practically only philosophical and cultural, although their in-depth comprehension could still aid all sciences that inevitably deal with information.


[Figure: source S → ENCODER → noisy channel (noise N; output E = G(M, N)) → DECODER → user U]

Figure 2.1: Shannon's model of a communication channel

2.2 Uncertainty: Entropy and Mutual Information

Claude Shannon considered information as a reduction of uncertainty, thus connecting the measure of information to the physical measure of uncertainty, namely entropy. With admirable mathematical rigor (although Shannon was entering completely new territory) he deduced the measure of information from a set of axioms composed of the axioms that are at the basis of the theory of mathematical measure, completed by a specific axiom for dealing with information. The resulting measure was the classic entropy that had been introduced, in the nineteenth century, for quantifying the uncertainty of systems of particles like molecules, atoms, photons, etc.

In the following we recall some results from information theory, as they were derived by Shannon, which are indispensable for cryptographic uses.

Let $\mathcal{A} = \{a_i\}_{i=1}^N$ be an alphabet of $N$ symbols, and let $\mathcal{M} = \mathcal{A}^m$ be the set of messages of length $m$ over the alphabet $\mathcal{A}$.
Assuming that the messages $M \in \mathcal{M}$ are random events of a stationary stochastic process characterized by a probability distribution $p(M)$, the entropy of the messages $H(\mathcal{M})$ is defined as the average

\[ H(\mathcal{M}) = \sum_{M \in \mathcal{A}^m} p(M) \ln \frac{1}{p(M)} \;, \qquad (2.1) \]

and the source entropy, i.e. the entropy of the alphabet $H(\mathcal{A})$, is defined as the limit

\[ H(\mathcal{A}) = \lim_{m \to \infty} \frac{1}{m} \sum_{M \in \mathcal{A}^m} p(M) \ln \frac{1}{p(M)} \;. \qquad (2.2) \]

The entropy $H(\mathcal{A})$ is a nonnegative number that evaluates the uncertainty reduction relative to a symbol of the alphabet $\mathcal{A}$ as a consequence of its emission. Shannon's interpretation of the entropy was that $H(\mathcal{A})$ represents the amount of information that, on average, a symbol of the stream $M$ can support. If the stochastic process that governs generation of the message $M$ is memoryless and stationary, then it can be shown that the entropy $H(\mathcal{A})$ is a finite sum

\[ H(\mathcal{A}) = \sum_{a \in \mathcal{A}} p(a) \ln \frac{1}{p(a)} \;. \qquad (2.3) \]

The entropy $H(\mathcal{A})$, intended as a function of the vector $P = (p(a_1), \ldots, p(a_N))$, with $a_i \in \mathcal{A}$, of dimension $N$, is a concave function that assumes its maximum value $H(\mathcal{A}) = \ln N$ when all symbols are used with the same probability, that is $p(a) = \frac{1}{N}$. In this model we have

\[ H(\mathcal{M}) = H(\mathcal{A}^m) = m \, H(\mathcal{A}) \;. \]

When a message $M \in \mathcal{M}$ is sent over a real communication channel, it is corrupted by some noise, given by a block $N$ of symbols from the same alphabet $\mathcal{A}$. The received message is $E = G(M, N) \in \mathcal{E}$. The noisy channel may ultimately be seen as a discrete memoryless source of random symbols from the same alphabet $\mathcal{A}$, which corrupt the message $M$. The received message $E$ may be different from $M$; nevertheless, $E$ yields some information on the sent message $M$. The most important parameter introduced by Shannon for describing these situations was the mutual information $I(\mathcal{M}, \mathcal{E})$, defined as

\[ I(\mathcal{M}, \mathcal{E}) = \sum_{M \in \mathcal{A}^m,\, E \in \mathcal{A}^m} p(M, E) \ln \frac{p(M|E)}{p(M)} \;, \qquad (2.4) \]

where $p(M, E)$ is the joint probability distribution of the sent and received messages, and $p(M|E)$ is the corresponding conditional probability distribution.

$I(\mathcal{M}, \mathcal{E})$ represents the amount of information that each received symbol gives, on average, about the transmitted symbol. Mutual information is non-negative: if it is zero, the channel is useless, that is, instead of transmitting, the symbols may be randomly produced at the receiver side. The following equations connect mutual information and entropies:

\[ I(\mathcal{M}, \mathcal{E}) = H(\mathcal{M}) + H(\mathcal{E}) - H(\mathcal{M}, \mathcal{E}) = H(\mathcal{M}) - H(\mathcal{M}|\mathcal{E}) \;, \]

where $H(\mathcal{M}, \mathcal{E})$ and $H(\mathcal{M}|\mathcal{E})$ are, respectively, the joint entropy and the conditional entropy of the input and output symbols of the channel. $H(\mathcal{M}|\mathcal{E})$ may be interpreted as the amount of information that is still needed to specify $M$ completely, when the received symbol $E$ is known.
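These identities can be checked numerically. The sketch below uses an arbitrary two-symbol joint distribution, chosen purely for illustration (it is not taken from the text):

```python
from math import log

# Arbitrary joint distribution p(M, E) over a two-symbol alphabet
# (illustrative assumption; any valid joint distribution works).
p_joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    """Entropy in nats, matching the natural logarithm used in the text."""
    return sum(p * log(1 / p) for p in dist.values() if p > 0)

# Marginal distributions p(M) and p(E)
p_M = {m: sum(p for (mm, _), p in p_joint.items() if mm == m) for m in (0, 1)}
p_E = {e: sum(p for (_, ee), p in p_joint.items() if ee == e) for e in (0, 1)}

# I(M, E) = sum over (M, E) of p(M, E) ln [ p(M|E) / p(M) ]
I = sum(p * log((p / p_E[e]) / p_M[m]) for (m, e), p in p_joint.items())

# Check I(M, E) = H(M) + H(E) - H(M, E) = H(M) - H(M|E)
H_M_given_E = H(p_joint) - H(p_E)   # H(M|E) = H(M, E) - H(E)
assert abs(I - (H(p_M) + H(p_E) - H(p_joint))) < 1e-12
assert abs(I - (H(p_M) - H_M_given_E)) < 1e-12
```

Both identities hold to machine precision for any joint distribution, which is a convenient sanity check when experimenting with channels.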

2.3 Uncertainty and Secrecy

In search of perfect secrecy, Shannon conceived the idea of modeling encryption operations as transmission over a very noisy channel, where the messages $M \in \mathcal{M}$ from an information source are corrupted by noise to such an extent that they cannot be recognized by any unauthorized observer, but this noise should


[Figure: sender S → E = F(M, K) → public channel → M = G(E, K) → user U; the key K is shared over a private channel]

Figure 2.2: Shannon's model of the cryptographic channel

be reproducible by the intended recipient, in order to allow reconstruction of the valid message. This noisy transformation is specified by a function $F(\cdot,\cdot)$ defined from $\mathcal{A}^m \times \mathcal{A}^k$ into $\mathcal{A}^r$:

\[ E = F(M, K) \;, \qquad (2.5) \]

with the constraint of being invertible, i.e. given $E$ and knowing $K$ we must be able to compute $M$ as

\[ M = G(E, K) \;, \qquad (2.6) \]

where $G(\cdot,\cdot)$ is a function from $\mathcal{A}^r \times \mathcal{A}^k$ into $\mathcal{A}^m$. The elements of $\mathcal{K} = \mathcal{A}^k$ are called keys, and play the same role as noise in the transmission channel.
The keys thus define the transmission channel, whose aim, in this case, is to modify transmitted messages in such a way that they can only be understood by the intended recipients, each with their own reading key.
In this context, the mutual information $I(\mathcal{M}, \mathcal{E})$ is defined as the average amount of information that the encrypted messages $E$ give about the original messages $M$, when the message key belonging to $\mathcal{K}$ is unknown.
The mutual information is defined through the conditional entropy, which is defined as the amount of information that the messages in $\mathcal{A}^m$ can still give once a received (encrypted) message belonging to $\mathcal{A}^r$ is known. We have

\[ H(\mathcal{M}|\mathcal{E}) = \sum_{M \in \mathcal{A}^m,\, E \in \mathcal{A}^r} p(M, E) \ln \frac{1}{p(M|E)} \;. \qquad (2.7) \]

In Shannon's model of cryptologic channels, a key role is played by the amount of information that the encrypted messages give about the original messages, in the two contrasting hypotheses that the key word is known or not. In other words, the mutual information $I(\mathcal{M}, \mathcal{E})$ and the conditional mutual information defined as

\[ I(\mathcal{M}, \mathcal{E}|\mathcal{K}) = \sum_{M \in \mathcal{A}^m,\, E \in \mathcal{A}^r,\, K \in \mathcal{A}^k} p(M, E|K) \ln \frac{p(M|E, K)}{p(M|K)} \qquad (2.8) \]


are equally of interest in defining the ideal cryptographic situation, which may be taken as a reference.

Definition 2.1. An encryption process is called perfect when the mutual information $I(\mathcal{M}, \mathcal{E})$ is zero, while the conditional mutual information $I(\mathcal{M}, \mathcal{E}|\mathcal{K})$ yields all the information of the original message, that is

\[ I(\mathcal{M}, \mathcal{E}) = 0 \;, \qquad I(\mathcal{M}, \mathcal{E}|\mathcal{K}) = H(\mathcal{M}) \;. \qquad (2.9) \]

In other words, assuming that the key is not known, the encryption of a message is perfect when the received message does not give any information on the original message. In terms of entropies, equation (2.9) implies

\[ H(\mathcal{E}) = H(\mathcal{E}|\mathcal{M}) \;, \qquad (2.10) \]

which means that the entropy of the encrypted message is not reduced by the knowledge of the original message. Furthermore,

\[ H(\mathcal{E}|\mathcal{K}) - H(\mathcal{E}|\mathcal{M}, \mathcal{K}) = H(\mathcal{M}) \;, \]

that is, $H(\mathcal{E}|\mathcal{K}) = H(\mathcal{M})$, since $H(\mathcal{E}|\mathcal{M}, \mathcal{K}) = 0$. In conclusion, knowledge of the key must enable the original message to be obtained from the encrypted message, without any further uncertainty. Operatively, this condition assures that the decrypting operation is possible, i.e. that the function $G(\cdot,\cdot)$ exists. Since the relation between the key and the encrypted message is described by equation (2.5), the entropy of the encrypted messages, knowing the original messages, is not greater than the entropy of the keys, $H(\mathcal{K}) \geq H(\mathcal{E}|\mathcal{M})$; then, using (2.10), we have the chain of inequalities

\[ H(\mathcal{K}) \geq H(\mathcal{E}|\mathcal{M}) = H(\mathcal{E}) \geq H(\mathcal{E}|\mathcal{K}) = H(\mathcal{M}) \;. \qquad (2.11) \]

The interpretation of this last equation is that, to achieve perfect secrecy, the entropy of the keys should not be smaller than the entropy of the original messages. Very frequently, messages, keys, and encrypted messages are all obtained by concatenating statistically-independent symbols from the same alphabet $\mathcal{A}$. In this case we have $H(\mathcal{M}) = m H(\mathcal{A})$ and $H(\mathcal{K}) = k H(\mathcal{A})$, thus the inequality among entropies implies $k \geq m$ for perfect secrecy. We may collect the previous result in a significant statement.

Proposition 2.1. In order to achieve perfect secrecy, it is necessary that the entropy of the keys not be smaller than the entropy of the original messages. Equivalently, in terms of alphabet symbols, the length of the key should be not shorter than the length of the original message.

Historically, perfect secrecy was achieved by the classical encryption systems adopted by spies, known as one-time pads, where a book (pad) was used only once, with a precise algorithm specifying which character to use on which page for encrypting the characters of the original message.


is the net encryption rate, while the ratio

\[ \frac{m}{r + k} \]

is called the full encryption rate.

Equation (2.11), consistently with Shannon's source coding theorem [54], implies that the net encryption rate is at most 1, a necessary condition for the invertibility of the encryption function $F(\cdot,\cdot)$ once the key is known. However, the full encryption rate yields a more faithful measure of the loss (or load) deriving from encryption. Perfect secrecy requires that $m = k$, because the key should be of the same length as the message; therefore the full encryption rate (corresponding to the full transmission rate) is 0.5.
Chapter 4 will describe an encryption procedure achieving perfect secrecy without using a secret key of the same length as the plain message, the price to pay being a net encryption rate not greater than 0.3867.

    2.3.1 Binary message encryption

The typical encryption operation of binary sequences is the simple and fast binary sum modulo 2

\[ e_i = m_i + k_i \bmod 2 \;. \]

This is the standard encryption operation of stream ciphers, which are machines generating the keystream $k_i$ starting from a short key (the secret key) $K$. This encryption procedure will be described in greater detail in Chapter 4.
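A minimal sketch of this operation in Python follows. The seeded `random` module stands in for a real keystream generator, an illustrative assumption made only to keep the sketch short; it is not a secure design:

```python
import random

def keystream(secret_key: int, n: int):
    """Toy keystream: n pseudo-random bits expanded from a short seed.
    A real stream cipher would use a cryptographic generator here."""
    rng = random.Random(secret_key)
    return [rng.randrange(2) for _ in range(n)]

def xor_cipher(bits, secret_key):
    """e_i = m_i + k_i mod 2; applying it twice restores the message."""
    ks = keystream(secret_key, len(bits))
    return [(b + k) % 2 for b, k in zip(bits, ks)]

message = [0, 1, 1, 0, 1, 0, 0, 1]
cipher = xor_cipher(message, secret_key=42)
assert xor_cipher(cipher, secret_key=42) == message
```

Since addition and subtraction coincide modulo 2, the same function both encrypts and decrypts.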

2.4 Cryptology

Cryptology is the scientific discipline that deals with the methodological and mathematical aspects of protecting information against deliberate alterations or intrusions.
It is sub-divided into two great branches with opposite aims:

- cryptography and steganography aim to develop protection methods of different sorts;

- cryptanalysis aims to break protection systems and methods.

The two branches are closely correlated, and the design of good cryptographic functions cannot avoid in-depth cryptanalysis. Nevertheless, the mathematical methods are quite different.

Cryptography deals with methods for protecting information, in particular to achieve:


1. Confidentiality in any sort of communication, that is, to assure the privacy of the exchanged messages.

2. Integrity in any sort of communication, that is, to keep a not necessarily secret message unchanged against deliberate alterations.

3. Authenticity of the interlocutors, that is, to guarantee the identity of the conversation partners, namely the sender and recipient of a message.

4. Non-repudiation of authorship by the authentic signer, that is, to guarantee the recipient that the signer cannot reject the authorship.

Steganography deals with the protection achieved by hiding the existence of the message, which may be embedded in innocent information, or in any unsuspected object.

Cryptanalysis deals with offensive problems, that is, with developing attack methods to violate the characteristics of messages protected by cryptography. In particular, typical cryptanalysis actions are:

1. Retrieving the text of a message protected by cryptography, having at one's disposal only partial information.

2. Altering or destroying plain or encrypted messages.

3. Fraudulently impersonating the legitimate interlocutor.

    2.5 Cryptography

Cryptography includes the study of encryption algorithms, and of the protocols used to achieve security objectives. After the publication of Shannon's paper, and the burst of interest that it immediately raised, the evolution of cryptology seemed to return to the traditional discretion, and to an evolution without great steps forward. However, the progress that occurred in error-correcting codes affected cryptography. Most frequently, it was the same people who carried out different duties in the public domain of communication systems and in the private, elitist branches typical of cryptographic offices.
Probably, the great economic interests handled by computers, automatic bank counters, and the need for control over information flows led to increasingly global, fast, and significant applications of cryptography outside of the traditional military and diplomatic fields.
However, the paper New Directions in Cryptography by Whitfield Diffie and Martin Hellman, which appeared in the November 1976 issue of the IEEE Transactions on Information Theory, was surprising and caused a sensation, especially among experts of secret cryptography. It proposed a new and challenging paradigm to the cryptographers:

To transfer private information on a public channel without a previous agreement.


The abstract solution of this problem, in Diffie and Hellman's conception, formally introduced the new concept of one-way function or, better, a new way to interpret the traditional cryptographic functions. These traditional cryptographic functions, they said, should exhibit an asymmetric complexity, not in the way they are used, but between the modes of application and their cryptanalysis. Diffie and Hellman introduced a function $F(\cdot)$ that should satisfy the following properties:

1) To be easily specified together with its inverse function $F^{-1}(\cdot)$;

2) To be easy to compute (message encryption): $C = F(M)$, given a plain message $M$;

3) To be hard to compute $F^{-1}(\cdot)$ from the sole knowledge of $F(\cdot)$;

4) To be easy to compute the plain message $M = F^{-1}(C)$, given the encrypted message $C$.

Conditions 3) and 4) are not contradictory: the goal is that, having designed both $F(\cdot)$ and $F^{-1}(\cdot)$, one can easily retrieve the message, while the uninitiated cannot obtain the inverse function simply from the knowledge of the function $F(\cdot)$.
This problem originated public-key cryptography, which was immediately useful in telecommunications systems, in the management of computing resources, and in many banking operations.

One-way functions. The notion of one-way function is fundamental in public-key cryptography, although the existence of such functions is still questionable, as it has not been formally proved. A large number of systems thus base their security on a mathematical notion whose significance has not been rigorously proved.
Many potential one-way functions have been proposed; however, the only surviving candidates are borrowed from elementary number theory and coding theory. Precisely, the only known putative one-way functions are based on

1. the difficulty of factoring integer numbers;

2. the difficulty of computing the discrete logarithm in certain representations of cyclic groups of prime order;

3. the difficulty of decoding linear error-correcting codes that lack symmetry.

The idea of using the difficulty of computing the discrete logarithm in convenient cyclic groups was introduced by Diffie and Hellman. The idea of using the difficulty of factoring was introduced by Rivest, Shamir, and Adleman with the description of the algorithm known as RSA. Lastly, the idea of using the difficulty of decoding error-correcting codes is due to McEliece.
All these problems are the object of stringent research, but the main objective remains an axiomatic proof of the existence of one-way functions.
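The asymmetry behind the second candidate can be illustrated with a toy sketch: modular exponentiation is fast even for huge numbers (square-and-multiply, built into Python's three-argument `pow`), while the inverse problem is solved here only by exhaustive search. The tiny prime and base below are illustrative assumptions; real systems use groups of enormous order:

```python
# Easy direction: y = g^x mod p via fast modular exponentiation.
p, g = 1019, 2          # toy prime and base, for illustration only
x = 345                 # the "secret" exponent
y = pow(g, x, p)

def discrete_log(y, g, p):
    """Brute-force search for k with g^k = y (mod p): feasible only
    because p is tiny; utterly infeasible at cryptographic sizes."""
    acc = 1
    for k in range(p - 1):
        if acc == y:
            return k
        acc = (acc * g) % p
    return None

k = discrete_log(y, g, p)
assert k is not None and pow(g, k, p) == y
```

The forward computation takes a handful of multiplications regardless of size; the backward search grows with the group order, which is exactly the asymmetry a one-way function requires.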


Secret-key cryptography, despite its limitations, is unavoidable for encrypting large amounts of stored data, or for fast secret data transmission. What is remarkable is that all these enciphering schemes are derived from the historical Caesar's cipher, with the variants imposed by two thousand years of history and by the evolution of technology.

Caesar's cipher consisted of a simple substitution of a letter with the letter three positions forward in the alphabet; the last three letters were substituted with the first three letters of the alphabet, in order. The number 3 was the secret key. The procedure can be mathematically interpreted by encoding the letters with numbers from 0 to 25 and introducing the operations modulo 26, that is, working in the residue ring $\mathbb{Z}_{26}$.
The encryption operation first converts the letters into numbers (encoding operation), then the number 3 is added to each number modulo 26. These numbers may be converted back to letters for transmission; in this event, before decryption the letters are re-converted to numbers. The decryption operation consists in subtracting the number 3 from each number of the encrypted message, and then in converting (decoding) the numbers back into letters.
In spite of its apparent simplicity, to state that such a scheme has conserved its validity is not superficial. Caesar's cipher, viewed in mathematical terms, has all the ingredients for defining the encryption of a message, namely, the concept of transformation, the notion of a secret key exchanged in a secure way, and the encoding notion.
Actually, the source of weakness in Caesar's cipher is the very short key; this awkward limitation was avoided by a method using different keys to encrypt symbols in different positions of the message. The key symbols were taken from a pad, to be used only once, which was kept secret and only known to sender and

receiver. This scheme was typically used by spies, and became known as one-time pad encryption. Practically, it achieved perfect secrecy, as shown by Shannon.
Formally, the one-time pad encryption procedure may be described as follows. Let the plain text be the sequence of numbers in $\mathbb{Z}_N = \{0, 1, 2, \ldots, N-1\}$

\[ M_1, M_2, \ldots, M_i, \ldots \]

The secret encryption key is a sequence from the same set $\mathbb{Z}_N$, and of the same length

\[ K_1, K_2, \ldots, K_i, \ldots \]

The cipher text is obtained by composition on a symbol-by-symbol basis

\[ E_i = M_i + K_i \bmod N \;, \quad \forall\, i \;. \]

    The plain text is easily recovered knowing the secret key

\[ M_i = E_i - K_i \bmod N \;, \quad \forall\, i \;. \]

Conversely, assuming that the key is a sequence of equally-probable and statistically-independent symbols, it is impossible to obtain the plain text from a knowledge


of the cipher text only. We have $p\{E_i = j\} = p\{M_i + K_i = j\}$ for every $j \in \mathbb{Z}_N$, that is

\[ p\{E_i = j\} = \sum_{\ell=0}^{N-1} p\{M_i = j - \ell \bmod N\}\; p\{K_i = \ell\} = \frac{1}{N} \;, \]

which means that the symbols in the encrypted message are also equally probable and statistically independent.
Shannon's perfect secrecy is thus possible. However, it is not practical for use by a large number of people in unrestricted scenarios. In the real world, the implementation of cryptographic schemes calls for more pragmatic solutions, and thus some basic principles that should be followed by any good cryptographic system have been formulated.
In 1883, Auguste Kerckhoffs wrote two journal articles on La Cryptographie Militaire, in which he stated six design principles for military ciphers. Most of them are now redundant because of computers, but two are still valid:

1. Encryption and decryption rules should be easy to implement. Actually, this rule had already been formulated hundreds of years earlier by Francis Bacon.

2. The encryption rule must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience. In other words, security should lie entirely in the secrecy of the key. This is now known as Kerckhoffs' principle.
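Caesar's cipher and the one-time pad described earlier in this section can be sketched over $\mathbb{Z}_{26}$ (the sample message and the random key are arbitrary illustrative choices):

```python
import random

N = 26  # alphabet size, as in the Caesar example

def encrypt(msg, key):
    """E_i = M_i + K_i mod N, one key symbol per message symbol."""
    return [(m + k) % N for m, k in zip(msg, key)]

def decrypt(cipher, key):
    """M_i = E_i - K_i mod N."""
    return [(e - k) % N for e, k in zip(cipher, key)]

msg = [ord(c) - ord('A') for c in "ATTACKATDAWN"]

# One-time pad: a uniform random key as long as the message
key = [random.randrange(N) for _ in msg]
assert decrypt(encrypt(msg, key), key) == msg

# Caesar's cipher is the degenerate case key = (3, 3, ..., 3)
caesar_key = [3] * len(msg)
assert decrypt(encrypt(msg, caesar_key), caesar_key) == msg
```

The constant key is what makes Caesar's cipher weak; the uniform one-time key, used once, is what makes the pad perfectly secret.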

2.6 Steganography

The protection of information achieved by concealing the very existence of the

message itself is studied by the discipline of steganography. The term was first used by Johannes Trithemius (1462-1516) and derives from the composition of two Greek terms meaning, literally, covered writing. History offers several examples of information hiding with folk appeal; however, in modern times steganography is usually interpreted as hiding information by means of other information. Examples include sending a message by altering the pixels of an image or the bits in digital voice recordings or music, introducing artificial errors in encoded transmissions, or the linguistic methods called acrostics. The most famous acrostic author, namely Boccaccio, deserves to be mentioned: he wrote three sonnets, then wrote other poems such that the initials of the successive tercets corresponded exactly to the letters of the sonnets [44].
Today, the more general term information hiding is used to indicate any discipline that is directed, honestly or dishonestly, to goals that are based on concealing, to different degrees, the very existence of the information. In this context, Petitcolas [44] considers four sub-disciplines:

Steganography, properly speaking, conceals the existence of the message by methods that include both physical and logical techniques. In principle, attacks are not possible, because the existence of the message is not known and is not considered.


Watermarking is used to protect proprietary rights, authorship, or any kind of ownership of a product. It is weaker than steganography, because an attacker expects, or at least suspects, that some protection is active on the product. A simple attack may be limited to making the mark undetectable.

Covert channels are typically used by untrustworthy programs to leak information to their owners while performing a service.

Anonymity is a way to conceal the identity of the partners in a game: for instance, to guarantee the secrecy of the vote in e-voting systems, or to hide the meta-content of a message, or the sender and recipient of a message. The goal may be different depending on whether anonymity should concern the sender, the receiver, or both. Web applications have focused on receiver anonymity, while email applications are more concerned with sender anonymity.


Chapter 3

Random Sequences and Statistics

The very name calculus of probabilities is a paradox. Probability, opposed to certainty, is what we do not know, and how can we calculate what we do not know?

H. Poincaré, Science and Hypothesis

3.1 Introduction

In this chapter we will consider sequences that may look like random sequences or, more technically, discrete stochastic sequences. Random sequences, besides playing a key role in cryptography, have many applications in other fields: in spread-spectrum techniques, in communication theory, in testing digital devices,

in satellite navigation and localization systems, and in computer simulation, to limit the list to some important applications.
Consider, for example, the following binary sequences:

1) . . . , 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, . . .
2) . . . , 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, . . .
3) . . . , 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, . . .
4) . . . , 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, . . .

we may ask whether they have been generated by some random process, or have a deterministic origin. In fact:

    - sequence 1) was generated by a deterministic mechanism

- sequence 2) was generated by flipping a coin

- sequence 3) was generated considering the binary representations of the first 6 digits in the decimal part of

    - sequence 4) was generated by a primitive linear feedback shift register of length 4.



Unfortunately, Lehmer's definition of a random sequence does not exclude paradoxical situations, because the tests are always done on a finite number of instances and suffer from the limits of this kind of statistics. However, a choice must be made, in spite of possible errors. Lehmer's view is unavoidable if we want to obtain conclusions of practical value.

A key role is always played by the sample space, which depends on the uses to which the sequence is to be put. The sample space specifies the set with respect to which we want the statistics.

    3.1.1 Sample Spaces

A sample space is the set of all possible outcomes of a random experiment. A random variable is a function defined on a sample space. A sample space may be finite or infinite; infinite sample spaces may be discrete or continuous. We will now look at some important examples of sample spaces.

Drawing a card. The experiment is drawing a card from a standard deck of 52 cards. The cards are of two colors, black (spades and clubs) and red (diamonds and hearts); four suits, spades (S), clubs (C), diamonds (D), hearts (H); and 13 values (2, 3, 4, 5, 6, 7, 8, 9, 10, Jack (J), Queen (Q), King (K), Ace (A)). There are 52 possible outcomes with the sample space

{2S, 2C, 2D, 2H, 3S, 3C, 3D, 3H, . . . , AS, AC, AD, AH} .

Of course, if we are only interested in the color of a drawn card, or its suit, or perhaps its value, then it would be natural to consider other sample spaces:

{b, r} , {S, C, D, H} , {2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A} .

Assuming that deck and drawing are fair, so that the probability of drawing a given card is 1/52, we may easily compute the probability distributions over the various sample spaces.

Choosing a birthday and the birthday paradox. The experiment is to select a single date during a given year. This can be done, for example, by picking a random person and inquiring about his or her birthday. Disregarding leap years, for simplicity's sake, there are 365 possible birthdays, which may be enumerated as

{1, 2, 3, 4, . . . , 365} ,

then the probability that a given person was born on a given day of the year is 1/365. This probability leads to the so-called birthday paradox that arises from the question
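The classical question asks how likely it is that, in a group of n people, at least two share a birthday; numerically, the probability exceeds 1/2 already for n = 23. A small sketch (the function name is mine):

```python
def shared_birthday_prob(n: int, days: int = 365) -> float:
    """Probability that, among n people with uniformly random birthdays,
    at least two share a birthday (leap years disregarded)."""
    p_distinct = 1.0
    for i in range(n):
        # The i-th person must avoid the i birthdays already taken.
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct
```

For n = 23 the probability is about 0.507, while for n = 22 it is still below 1/2.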


Coin tossing. The experiment of tossing a coin, which lands on either one or the other of its two sides, called head and tail, has two possible outcomes. In the case of a single toss, the sample space has two elements that will be denoted as {H, T}. Let p be the probability of getting H and 1 − p the probability of getting a tail; if p = 1/2 we say that the coin is fair. Consider the case of two experiments. One may toss two indistinguishable coins simultaneously, or one coin twice. The difference is that in the second case we can easily differentiate between the two throws. If two indistinguishable coins are tossed simultaneously, there are just three possible outcomes, {H, H}, {H, T}, and {T, T}. If one coin is tossed twice, there are four distinct outcomes: HH, HT, TH, TT. Thus, depending on the nature of the experiment, there are 3 or 4 outcomes, with the sample spaces

{H, H}, {H, T}, {T, T}    Indistinguishable coins
HH, HT, TH, TT            Distinguishable coins .

Repeated throwing yields an example of an infinite discrete sample space, that is, the first-tail experiment: a coin is repeatedly tossed until the first tail shows up. Possible outcomes are sequences of Hs that, if finite, end with a single T, plus the infinite sequence of Hs:

{T, HT, HHT, HHHT, . . . , HHH . . .} .

This is a space that contains an event (not impossible) whose probability is 0. A random variable is naturally defined as the length of an outcome. It draws values from the set of integer numbers including the symbol of infinity:

{1, 2, 3, 4, . . . , n, . . . , ∞} .

The sample space may be equipped with the probability distribution p{HH . . . HT} = p^{#(H)}(1 − p) induced by the probabilities of H and T.
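The induced distribution is the geometric one; a quick check that the probabilities of the finite outcomes sum to 1 (so the infinite all-heads outcome indeed has probability 0) can be sketched as follows (the function name is mine):

```python
from fractions import Fraction

def first_tail_prob(n: int, p: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability of the outcome H...HT of length n, i.e. that the
    first tail appears at toss n, when H has probability p."""
    return p ** (n - 1) * (1 - p)

# The finite outcomes exhaust the probability mass: the partial sums
# over the first N outcomes equal 1 - p**N, which tends to 1 for p < 1.
partial = sum(first_tail_prob(n) for n in range(1, 21))
```

With p = 1/2 the partial sum over the first 20 outcomes is exactly 1 − 2⁻²⁰.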

Rolling dice. The experiment is rolling a cubic die whose faces show the numbers 1, 2, 3, 4, 5, 6 one way or another. These may be the real digits or arrangements of an appropriate number of dots, e.g. an arrangement of five dots for the number 5. There are six possible outcomes and the sample space consists of 6 elements: {1, 2, 3, 4, 5, 6}. To each face is associated a probability p{i}; the die is said to be fair if p{i} = 1/6. A second experiment is rolling two dice. If the dice are distinct, or if they are rolled successively, there are 36 possible outcomes, i.e. the sample space is:

{11, 12, . . . , 16, 21, 22, . . . , 66} .


1. The symbols are statistically independent.

2. The symbols are equi-probable, that is p{b_i = 0} = p{b_i = 1} = 1/2.

With these hypotheses, the probability that a block of N bits contains n_0 = k 0s and n_1 = N − k 1s, given by the binomial distribution, is

    \binom{N}{k} \frac{1}{2^N} .

The average number of 0s or of 1s in a block of N symbols is

    E[n_0] = E[n_1] = \frac{N}{2} ,

while the standard deviation is the same for both statistics

    \sqrt{E[(n_0 - E[n_0])^2]} = \frac{\sqrt{N}}{2} .

Let 99% be a confidence (or probability) level; assume N = n_0 + n_1, with N ≥ 100 to avoid inconsistencies; then with a probability of 0.99, both n_0 and n_1 are included in the interval I_99

    \left[ \frac{N}{2} - 3\frac{\sqrt{N}}{2} , \; \frac{N}{2} + 3\frac{\sqrt{N}}{2} \right] .

Test: Given a block of N binary symbols, count the total numbers n_0 and n_1 of 0s and 1s, respectively. The test of randomness is passed, with 99% confidence, if both n_0 and n_1 are included in I_99.
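The test above can be sketched in a few lines (function name is mine; the 3-sigma interval is the I_99 of the text):

```python
import math

def frequency_test(bits) -> bool:
    """Count 0s and 1s in a block of N bits and check that both counts
    fall in [N/2 - 3*sqrt(N)/2, N/2 + 3*sqrt(N)/2]."""
    N = len(bits)
    n1 = sum(bits)
    n0 = N - n1
    half_width = 3 * math.sqrt(N) / 2
    return abs(n0 - N / 2) <= half_width and abs(n1 - N / 2) <= half_width
```

A perfectly balanced block passes, while a heavily biased one does not.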

Group counting. This test is based on counting the numbers of the same patterns of bits, taken k by k. Since a pattern b_0, b_1, . . . , b_{k−1} of k bits can be interpreted as an integer number m_i = \sum_j b_j 2^j lying between 0 and 2^k − 1, the test is equivalently based on counting the number N_i of occurrences of m_i for every i. Assuming that N is sufficiently large to avoid inconsistencies, if the given sequence is a truly random sequence, the numbers m_i are uniformly distributed, and the expected number of occurrences of each of them is N/2^k. The standard deviation is the same for every statistic N_i

    \sqrt{E[(N_i - E[N_i])^2]} = \frac{\sqrt{N(2^k - 1)}}{2^k} .

The confidence interval is defined in exactly the same way as in the previous case. Let 99% be a confidence (or probability) level; then with a probability of 0.99, every N_i is included in the interval I_99

    \left[ \frac{N}{2^k} - 3\frac{\sqrt{N(2^k - 1)}}{2^k} , \; \frac{N}{2^k} + 3\frac{\sqrt{N(2^k - 1)}}{2^k} \right] .

Test: Given a block of N binary symbols, count the number N_i of patterns corresponding to the same number m_i, for every i. The test of randomness is passed, with 99% confidence, if every N_i is included in I_99.
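A sketch of the group-counting test, here reading the block as non-overlapping k-bit patterns (one possible reading of the text), with each count treated as binomial with success probability 1/2^k:

```python
import math

def group_counting_test(bits, k: int = 2) -> bool:
    """Count how many times each k-bit pattern (read as an integer,
    least significant bit first) occurs among the non-overlapping k-bit
    blocks, and check each count against its mean with a 3-sigma tolerance."""
    M = len(bits) // k                      # number of k-bit patterns examined
    counts = [0] * (2 ** k)
    for j in range(M):
        value = sum(b << i for i, b in enumerate(bits[j * k:(j + 1) * k]))
        counts[value] += 1
    mean = M / 2 ** k
    sd = math.sqrt(M * (2 ** k - 1)) / 2 ** k
    return all(abs(c - mean) <= 3 * sd for c in counts)
```

A block in which all four 2-bit patterns occur equally often passes, while the strictly alternating block fails because two patterns never occur.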


Run counting. A run of 0s of length k is a pattern of k consecutive 0s included between two 1s. A run of 1s is defined conversely. The two statistics are the numbers of runs of equal symbols, namely, the total number of runs of 1s, and the total number of runs of 0s.
Let X_1, X_2, . . . , X_N be a block of N 0s and 1s, considered as integer numbers. It is convenient to introduce two statistics:

S: the number of 1s in the block; it is easily obtained as the sum

    S = \sum_{i=1}^{N} X_i .

R_1: the autocorrelation function obtained as

    R_1 = \sum_{i=2}^{N} X_i X_{i-1} .

It is easily seen that the difference D = S − R_1 gives the number of runs of 1s: in fact a run of length k of 1s is a pattern of the form 01 . . . 10, where we have k consecutive 1s. Its R_1 correlation is the pattern obtained as

0 1 1 . . . . . . 1 0 . . .
. . . 0 1 1 . . . 1 1 0
. . . 0 1 1 . . . 1 0 0

in the third row the run of 1s has length k − 1. It follows that the difference between the number of 1s in the original pattern and the number of 1s in the pattern obtained by shifting and multiplication is exactly 1 for every run, thus D counts exactly the number of runs of 1s. In a truly random sequence, the symbols X_i are statistically independent and equiprobable. Thus the joint probability distribution of the pair S and R_1 is [45, p.11]

    p\{S, R_1\} = \binom{N - S + 1}{S - R_1} \binom{S - 1}{R_1} \left(\frac{1}{2}\right)^N .

Since we are interested only in the statistic D, we must sum over the values of S and R_1 whose difference is D, obtaining

    p\{D\} = \binom{N + 1}{2D} \frac{1}{2^N} .

The average value of D can be computed directly, and turns out to be N/8, while its variance is (N + 1)/16. Assuming a confidence level of 99%, the number of runs of 1s is included in the interval I_99

    \left[ \frac{N}{8} - 3\frac{\sqrt{N + 1}}{4} , \; \frac{N}{8} + 3\frac{\sqrt{N + 1}}{4} \right] .


Test: Given a block of N binary symbols, count the number S of 1s and the autocorrelation R_1, and obtain D = S − R_1. The test of randomness is passed, with 99% confidence, if D is included in I_99. The same test can be done with respect to the runs of 0s.
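The statistic D is straightforward to compute (function names are mine; the interval check mirrors the text's I_99 with mean N/8 and variance (N + 1)/16):

```python
import math

def runs_of_ones(bits) -> int:
    """D = S - R1: the number of 1s minus the sum of adjacent products,
    which equals the number of runs of 1s in the block."""
    S = sum(bits)
    R1 = sum(bits[i] * bits[i - 1] for i in range(1, len(bits)))
    return S - R1

def run_test(bits) -> bool:
    """Pass if D falls in [N/8 - 3*sqrt(N+1)/4, N/8 + 3*sqrt(N+1)/4]."""
    N = len(bits)
    return abs(runs_of_ones(bits) - N / 8) <= 3 * math.sqrt(N + 1) / 4
```

For example, the block 0 1 1 0 1 0 0 1 1 1 0 contains three runs of 1s, and indeed D = 6 − 3 = 3.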

Up and down runs. Let us consider a sequence of 18 numbers

1 3 10 4 2 7 12 16 15 9 7 6 4 5 6 3 2 12
+ + − − + + + − − − − − + + − − +

where the i-th symbol + or − denotes whether the (i+1)-th number is greater or smaller than the previous one. We assume that adjacent numbers are always different, or that the probability of the event that two adjacent numbers are equal is zero. A sequence of N numbers has N − 1 changes. Assuming that the numbers are identically distributed, the number r of runs up plus runs down (in the example above r = 7) is asymptotically distributed according to the normal distribution, with average (2N − 1)/3 and variance (16N − 29)/90. We obtain a test on the hypothesis that the numbers in the sequence are uniformly distributed, assuming the statistic r (i.e. the number of runs up and runs down) is normally distributed with the given average and variance.
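Counting r for the example above can be sketched as (the function name is mine):

```python
def up_down_runs(xs) -> int:
    """Number of runs of consecutive + and - signs in the sequence of
    pairwise comparisons (adjacent values assumed distinct)."""
    signs = [1 if b > a else -1 for a, b in zip(xs, xs[1:])]
    # Each sign change starts a new run; the first sign starts the first run.
    return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)

example = [1, 3, 10, 4, 2, 7, 12, 16, 15, 9, 7, 6, 4, 5, 6, 3, 2, 12]
```

Applied to the 18 numbers of the text, the function returns r = 7.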

Test: Given a block consisting of N digits, the test of randomness is passed, with 99.7% confidence, if the statistic r is included in the interval

    \left[ \frac{2N - 1}{3} - 3\sqrt{\frac{16N - 29}{90}} , \; \frac{2N - 1}{3} + 3\sqrt{\frac{16N - 29}{90}} \right] .

Monte Carlo tests. The set of numerical techniques employing random numbers to evaluate integrals or functions otherwise difficult to compute bears the name of Monte Carlo methods. The randomness tests are defined considering the numerical evaluation by Monte Carlo techniques of integrals whose value is exactly known. The discrepancy between the exact and the Monte Carlo evaluation of the integral is a measure of the goodness of random sequences. The integral

    I_f = \int_0^1 f(x)\,dx

may be intended as the expectation E[f] of a function f(·) of a random variable x_i uniformly distributed in the [0, 1] interval. The standard deviation is

    D = \sqrt{E[f^2] - E[f]^2} = \sqrt{\int_0^1 f^2(x)\,dx - E[f]^2} .

Given a putative random sequence of N numbers X_i ∈ [0, 1], we define the sum

    f_N = \frac{1}{N} \sum_{i=1}^{N} f(X_i) ,


which represents an estimation of the value of the integral I_f since, taking the expectation of f_N, we have

    E[f_N] = \frac{1}{N} \sum_{i=1}^{N} E[f(X_i)] = E[f] .

Applying the Tschebyscheff inequality, we have

    |E[f] - f_N| \leq \frac{D}{\sqrt{\varepsilon N}}

with probability 1 − ε. In the following Table, we report some functions that can be used in testing the randomness of a sequence:

f                E[f]                            D
x                1/2                             1/\sqrt{12}
\sin 2\pi u x    \sin^2(\pi u)/(\pi u)           \sqrt{\frac{1}{2} - \frac{\sin 4\pi u}{8\pi u} - \frac{\sin^4 \pi u}{\pi^2 u^2}}

Test: Given a block consisting of N binary symbols, and assuming a confidence level of 99% (which means that, with probability 1 − ε = 0.99), the difference between the exact and the estimated value will be less than 10 D/√N, i.e.

    |E[f] - f_N| < 10 \frac{D}{\sqrt{N}} .
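For the first row of the Table (f(x) = x, E[f] = 1/2, D = 1/√12) the test reads as follows (a sketch; the function name is mine):

```python
import math
import random

def monte_carlo_test(xs) -> bool:
    """Monte Carlo test with f(x) = x: pass at 99% confidence if the
    sample mean is within 10*D/sqrt(N) of E[f] = 1/2, with D = 1/sqrt(12)."""
    N = len(xs)
    f_N = sum(xs) / N
    D = 1 / math.sqrt(12)
    return abs(0.5 - f_N) < 10 * D / math.sqrt(N)

# A sample from Python's built-in generator, used here only to exercise the test.
random.seed(0)
sample = [random.random() for _ in range(10_000)]
```

Such a sample passes, while a constant sequence of 0.9s does not.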

It is remarked that multiple integrals may advantageously be used to strengthen the test, for example, using double integrals to evaluate the area of a region

    \int\!\!\int_{D} dx\,dy ,

where D is defined by a function f(x, y) ≥ 0 in the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. Two sequences x_1, x_2, . . . , x_N and y_1, y_2, . . . , y_N are obtained by splitting the given sequence under test of length 2N, and the following summation is evaluated

    S_N = \sum_{i=1}^{N} u(x_i, y_i) ,

where u(x, y) = 1 if f(x, y) ≥ 0, and u(x, y) = 0 otherwise.


The auto-correlation coefficient. The periodic auto-correlation coefficient ρ(τ) of a binary sequence of length N is defined as

    \rho(\tau) = \frac{1}{N} \sum_{i=0}^{N-1} (-1)^{a_i} (-1)^{a_{i+\tau}} ,    (3.1)

where the index i + τ should be evaluated modulo N. Since E[ρ(τ)] = 0 and E[ρ(τ)^2] = 1/N for τ ≠ 0, to test whether a given block of N binary symbols consists of equally distributed and statistically independent symbols, the expression (3.1) is computed. Considering the inequality

    p\left\{ |\rho(\tau)| \leq \frac{k}{\sqrt{N}} \right\} \geq 1 - \frac{1}{k^2} ,

obtained using the Tschebyscheff inequality, the following test is run.

Test: Given a block consisting of N binary symbols, the test of randomness is passed, with 99% confidence, if ρ(1) is included in the interval

    \left[ -\frac{10}{\sqrt{N}} , \; \frac{10}{\sqrt{N}} \right] .
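The coefficient (3.1) and the corresponding check can be sketched as (function names are mine):

```python
import math

def rho(bits, tau: int) -> float:
    """Periodic autocorrelation coefficient of a binary sequence,
    with the index i + tau taken modulo N as in (3.1)."""
    N = len(bits)
    return sum((-1) ** bits[i] * (-1) ** bits[(i + tau) % N]
               for i in range(N)) / N

def autocorrelation_test(bits, tau: int = 1) -> bool:
    """Pass at 99% confidence if rho(tau) lies in [-10/sqrt(N), 10/sqrt(N)]."""
    return abs(rho(bits, tau)) <= 10 / math.sqrt(len(bits))
```

The strictly alternating sequence has ρ(1) = −1 and fails, while the sequence with period 0011 has ρ(1) = 0 and passes.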

The Kolmogorov-Smirnov test. This test concerns continuous random variables. That is, given a sequence of real random numbers, the test ascertains whether the sequence is associated to a random variable with a given probability density [47]. However, the test may be adapted to test the uniform distribution of discrete random variables, in particular binary variables.
A major advantage of the Kolmogorov-Smirnov test is that it is independent of the probability density f(x) to which it is applied, because it is referred to a random variable defined through the cumulative probability density

    F(x) = \int_{-\infty}^{x} f(t)\,dt .

Given N samples X_1, X_2, . . . , X_N of a random variable putatively with probability density f(x), and cumulative probability density F(x), let the samples be re-ordered in an increasing way, i.e. X_{i+1} ≥ X_i; then define the two statistics

    K_N^+ = \sqrt{N} \max_j \left[ \frac{j}{N} - F(X_j) \right] , \quad j = 1, 2, \ldots, N
    K_N^- = \sqrt{N} \max_j \left[ F(X_j) - \frac{j-1}{N} \right] .

Note that the random variable η = F(ξ) is uniformly distributed in the [0, 1] interval, therefore the probability distributions of K_N^+ and K_N^- can be deduced referring to a random variable with uniform density in the [0, 1] interval. The test consists of looking at the values K_N^+ and K_N^- to determine whether they are sufficiently high or low. These values should be compared with those given


in the following Table, borrowed from [47, page 48]. For example, the probability is 50% that K_10^+ is 0.5426 or less. The entries in the table were computed using an expression that asymptotically is

    y_p - \frac{1}{6\sqrt{N}} + O\!\left(\frac{1}{N}\right) \quad \text{where} \quad y_p^2 = \frac{1}{2} \ln \frac{1}{1-p} .

Since in any reasonable test N is large, with probability p, K_N^+ or K_N^- asymptotically are y_p − 1/(6√N) or less.

         p = 1%   p = 5%   p = 50%  p = 95%  p = 99%
n = 1    0.0100   0.0500   0.5000   0.9500   0.9900
n = 10   0.0291   0.1147   0.5426   1.1658   1.4440
n = 20   0.0381   0.1298   0.5547   1.1839   1.4698
n = 30   0.0435   0.1351   0.5605   1.1916   1.4801

Percentage points of the distributions of K_N^+ and K_N^-

Although the test has been defined for continuous random variables, it can be applied to test the uniform distribution of binary sequences. The binary stream x_1, x_2, . . . , x_N may be partitioned into blocks x_{μk+1}, x_{μk+2}, . . . , x_{μk+k} of k bits, and each block may be considered as the positional representation of an integer base 2

    X_\mu = \sum_{i=1}^{k} x_{\mu k + i} \, 2^{i-1} .

The aim is to test the uniform distribution of these integers in the [0, 2^k − 1] interval; equivalently, to look for the uniform distribution in the [0, 1] interval of the rational numbers X_μ/(2^k − 1). Note that, in the case of the uniform probability density in the [0, 1] interval, the cumulative probability density is F(x) = x.

Test: Given a sequence consisting of N = kL binary symbols, the sequence is partitioned into L blocks of k symbols, which are interpreted as integers; then they are divided by 2^k − 1 and sorted into ascending magnitude. The randomness test is passed, with 97% confidence, if both K_N^+ and K_N^- are included in the interval

    \left[ 1.22 - \frac{1}{6\sqrt{L}} , \; 1.52 - \frac{1}{6\sqrt{L}} \right] ,

with L > 200.
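The two statistics, with F(x) = x for the uniform distribution, can be sketched as (the function name is mine):

```python
import math

def ks_statistics(samples, F=lambda x: x):
    """K_N^+ and K_N^- of the samples against a cumulative distribution F;
    the default F(x) = x is the uniform distribution on [0, 1]."""
    xs = sorted(samples)          # re-order the samples increasingly
    N = len(xs)
    # enumerate gives j = 0, ..., N-1, so (j + 1)/N realizes the 1-based j/N.
    k_plus = math.sqrt(N) * max((j + 1) / N - F(x) for j, x in enumerate(xs))
    k_minus = math.sqrt(N) * max(F(x) - j / N for j, x in enumerate(xs))
    return k_plus, k_minus
```

For the five equally spaced samples 0.1, 0.3, 0.5, 0.7, 0.9, both statistics equal √5 · 0.1.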

3.2.1 Linear Complexity Profile

The linear complexity profile, strictly speaking, is not a statistical test. However, it can be seen as a Monte Carlo method, because we compare the actual linear complexity profile with the average (expected) linear complexity profile of a truly random sequence. Linear complexity is defined by referring to the linear sequences generated by Linear Feedback Shift Registers (LFSR) [34, 65].


Figure 3.1: LCP for a clock-controlled Fibonacci LFSR of length 22 (reference line of slope 1/2)

Definition 3.2. The linear complexity λ(M) of a sequence X of length M is the minimum length of an LFSR that generates X.

An LFSR of length m has a period of length not greater than 2^m − 1: that is, the linear complexity of the sequence is small with respect to its length. In general, given a sequence X of length M, its linear complexity is computed by the Berlekamp-Massey algorithm, which yields the length of the shortest LFSR generating X [4, 55, 54]. If X is an m-sequence (i.e. a binary sequence generated by an LFSR of length m), then λ(M) ≤ m, and λ(M) = m for every M ≥ 2m; then an interesting question is

    What is the linear complexity of a genuinely random sequence?

For each sub-sequence of length n, the approach is to compute, starting from the beginning, its linear complexity, a process that yields the linear complexity profile. Since, if X is a genuine random sequence, then λ(M) = M/2 on average, the linear complexity profile is a straight line of slope 1/2.

Let X_n be a subsequence of length n of an infinite truly random sequence X, and let λ = λ(X_n) denote the length of the shortest recurrence generating X_n. Let g(x) = x^λ + a_1 x^{λ−1} + · · · + a_{λ−1} x + a_λ be the generator polynomial of X_n. The linear complexity profile of X_n is defined as the function λ(n; k) = λ(X_k) for every k from 0 to n. In order to compute the expectation of λ(X_k), for every n, it is necessary to define a probability measure over {X_n}, the set of all binary sequences of length n. We say that a sequence X_n is randomly generated if it


is picked at random from {X_n} with probability p{X_n} = 1/2^n, since |{X_n}| = 2^n. This definition is tantamount to considering a sequence X_n as produced bit by bit, with bit probability 1/2. Let c(n; k) denote the number of sequences in {X_n} that are generated by a recurrence of order k; therefore the expectation of λ(X_n) can be written as

    E[\lambda(X_n)] = \frac{1}{2^n} \sum_{X_n} \lambda(X_n) = \frac{1}{2^n} \sum_k k \, c(n; k) .

The last summation is easily computed, taking into account the following observations:

- Every generator polynomial of degree k is allowed [55], including x^k, which is assumed to generate the all-zero sequence.

- The sequence 0···01 composed of n − 1 zeros followed by a 1 is necessarily generated by a recurrence of degree n [55]; therefore c(n; n) = 1, since any other sequence of length n is generated by some LFSR of length less than n [55].

- c(n; 0) = 1 is a consequence of the previous observation.

- c(1; 1) = 1, since we have only the sequence 1, given that the sequence 0 is generated by a recurrence of order 0; and c(n; 1) = 2 for n > 1, since the recurrence with generator polynomial x + 1 generates two sequences, namely the all-zero and the all-one sequences, but by definition the all-zero sequence is generated by a recurrence of order 0, and the sequence 0···01 cannot be generated by a recurrence of order 1.

- If n > 2k and k ≥ 0, then

    c(n; k) = c(n − 1; k) = c(2k; k) ,

  because any periodic sequence generated by an LFSR of length k is specified by its first 2k digits, and sequences longer than 2k are the periodic extensions of some sequence generated by an LFSR of length k.

- c(2; 2) = 1 and accounts for the sequence 01. c(3; 2) = 4 is obtained as the difference

    c(3; 2) = 2^3 − [c(3; 0) + c(3; 1) + c(3; 3)] = 4 .

  Moreover, c(n; 2) = c(4; 2) = 8 for every n ≥ 4, where c(4; 2) is obtained by direct counting, or repeating the same argument used above for evaluating c(3; 2).

- We have the recurrence c(2k; k) = 4 c(2(k − 1); k − 1) because, adding one cell to an LFSR, we have one more initial condition and one more tap available; therefore

    c(2k; k) = 2^{2k−1} , \quad k ≥ 1 .


- If 2k > n and n > 2, then c(n; k) = 4^{n−k} for every k ≤ n − 1.

An initial set of values of c(n; k) is reported in the Table.

n\k   0   1   2   3   4   5
1     1   1   -   -   -   -
2     1   2   1   -   -   -
3     1   2   4   1   -   -
4     1   2   8   4   1   -
5     1   2   8  16   4   1

The average (or expectation) of λ(X_n) is obtained as

    E[\lambda(X_n)] = \frac{1}{2^n} \sum_{k=0}^{n} k \, c(n; k) =
    \begin{cases}
      \dfrac{n}{2} + \dfrac{4}{18} - \dfrac{3n + 2}{9 \cdot 2^n} & \text{even } n \\[2mm]
      \dfrac{n}{2} + \dfrac{5}{18} - \dfrac{3n + 2}{9 \cdot 2^n} & \text{odd } n
    \end{cases}
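The closed form can be checked against the table of c(n; k) by brute force: compute λ for all 2^n sequences with the Berlekamp-Massey algorithm and average. A sketch (function names are mine; the convention that 0···01 has complexity n and the all-zero sequence complexity 0 is the one used above):

```python
from itertools import product

def linear_complexity(s):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR
    generating the binary sequence s."""
    n = len(s)
    c, b = [0] * (n + 1), [0] * (n + 1)
    c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(n):
        d = s[i]                          # next discrepancy
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            for j in range(i - m, n + 1):
                c[j] ^= b[j - (i - m)]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L

def expected_complexity(n: int) -> float:
    """E[lambda(X_n)] from the closed form derived in the text."""
    r = 4 if n % 2 == 0 else 5
    return n / 2 + r / 18 - (3 * n + 2) / (9 * 2 ** n)

def brute_force_average(n: int) -> float:
    """Average linear complexity over all 2**n binary sequences."""
    total = sum(linear_complexity(list(s)) for s in product((0, 1), repeat=n))
    return total / 2 ** n
```

For n = 4 both give 34/16 = 2.125 and for n = 5 both give 87/32 = 2.71875, in agreement with the sums of k·c(n; k) over the rows of the table.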