Pseudo Random and Random Numbers Vivek Bhatnagar and Chaitanya Cheruvu

  • Contents
      • Introduction to Pseudorandom Numbers
      • Theory Behind Pseudorandom Numbers
      • Some Pseudorandom Number Generators
      • Attacks on Pseudorandom Generators
      • Tests for Pseudorandom Functions
      • True Random Generators
      • Conclusions

  • Introduction
      • Truly random: exhibiting "true" randomness, such as the time between "ticks" from a Geiger counter exposed to a radioactive element.
      • Pseudorandom: having the appearance of randomness, but nevertheless exhibiting a specific, repeatable pattern.
      • Numbers calculated by a computer through a deterministic process cannot, by definition, be random.

  • Introduction
    Given knowledge of the algorithm used to create the numbers and its internal state (i.e. the seed), you can predict all the numbers returned by subsequent calls to the algorithm. With genuinely random numbers, knowledge of one number, or of an arbitrarily long sequence of numbers, is of no use whatsoever in predicting the next number to be generated. Computer-generated "random" numbers are therefore more properly referred to as pseudorandom numbers, and sequences of them as pseudorandom sequences.

  • Introduction: Usage
      • Almost all network security protocols rely on the randomness of certain parameters:
          • nonces, used to prevent replay;
          • session keys;
          • unique parameters in digital signatures.
      • Monte Carlo simulation: a numerical technique that randomly generates scenarios and collects statistics over them, used for example to solve differential equations numerically.

  • Introduction: (Desirable) Properties of Pseudorandom Numbers
      • Uncorrelated sequences: the sequences of random numbers should be serially uncorrelated.
      • Long period: the generator should have a long period. Ideally the generator should not repeat; in practice, repetition should occur only after a very large set of random numbers has been generated.
      • Uniformity: the sequence of random numbers should be uniform and unbiased. That is, equal fractions of random numbers should fall into equal "areas" in space. For example, if random numbers on [0,1) are to be generated, it would be poor practice were more than half to fall into [0, 0.1), presuming the sample size is sufficiently large.
      • Efficiency: the generator should be efficient, with low overhead for massively parallel computations.

  • The Random Number Cycle
    Almost all random number generators have as their basis a sequence of pseudorandom integers. The integers, or "fixed point" numbers, are manipulated arithmetically to yield floating point, or "real", numbers.
    The nature of the cycle:
      • the sequence has a finite number of integers;
      • the sequence gets traversed in a particular order;
      • the sequence repeats if the period of the generator is exceeded;
      • the integers need not be distinct; that is, they may repeat.

  • Introduction: Testing Pseudorandom Generators
      • Clever algorithms have been developed that generate sequences of numbers which pass the statistical tests commonly used to distinguish random sequences from those containing some pattern or internal order.
      • Tests check the different properties discussed above.
      • Tests include mean and variance checks. For uniformly distributed pseudorandom numbers on [0,1), the mean should be close to 0.5 and the variance close to 1/12 ≈ 0.083.
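As a minimal sketch of the mean and variance check, the Python snippet below draws samples from the standard library's `random.Random` (a stand-in for whatever generator is under test) and compares the sample statistics with the ideal values 1/2 and 1/12. The helper name `mean_and_variance`, the seed, and the sample size are illustrative assumptions, not part of the original slides.

```python
import random


def mean_and_variance(samples):
    """Return the sample mean and the (population) variance of a sequence."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


# Stand-in for the generator under test (assumption): Python's built-in
# generator, seeded so that the run is reproducible.
rng = random.Random(12345)
samples = [rng.random() for _ in range(100_000)]

mean, variance = mean_and_variance(samples)
print(f"mean     = {mean:.4f}  (ideal 0.5)")
print(f"variance = {variance:.4f}  (ideal {1 / 12:.4f})")
```

Passing this check is necessary but not sufficient: a strongly patterned sequence can still have the right mean and variance.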

  • Theory of Pseudorandom Numbers

  • Theory
      • Computational indistinguishability: consider an ensemble (i.e. a sequence of numbers). If we can generate a second ensemble that cannot be told apart from the first by any efficient (polynomial-time) procedure, the second ensemble is, for all practical purposes, equivalent to the first.
      • Ensembles that are computationally indistinguishable from a uniform ensemble are called pseudorandom.
      • General definition of a pseudorandom generator: a deterministic polynomial-time algorithm G that satisfies the following two conditions:
          • Expansion: for every s ∈ {0,1}*, |G(s)| > |s|.
          • Pseudorandomness: the ensemble { G(s) } is pseudorandom.

  • Theory: Significance of Pseudorandom Generators
      • They are efficient amplifiers/expanders of randomness.
      • Using very little randomness (a randomly chosen seed), they produce very long sequences which look random to any efficient observer.
      • They make it possible to produce high-quality random sequences at low cost, which makes them very useful in cryptography.
      • They produce unpredictable sequences, i.e. no efficient algorithm can guess the next bit given a prefix of the sequence.

  • Theory: One-way Functions and Pseudorandom Numbers
      • The existence of one-way functions is a necessary and sufficient condition for the existence of pseudorandom generators.
      • Some candidate one-way functions:
          • the RSA function;
          • the discrete logarithm.

  • Theory: Some Practical Constructions Based on Collections of Permutations
      • The intractability of the Discrete Logarithm Problem: based on the fact that it is hard to predict, given a prime P, a primitive element G, and an element Y of the group, whether there exists 0 < x < P/2 such that Y = G^x mod P.
      • The difficulty of inverting RSA: based on the fact that the least significant bit constitutes a hard-core for the RSA collection.
      • The intractability of factoring Blum integers: based on the fact that the least significant bit constitutes a hard-core for the Rabin collection.

  • Linear Congruential Generators
    We begin by discussing the linear congruential generator (LCG), the one most commonly used for generating random integers.
    We generate the next random integer $X_{n+1}$ using the previous random integer $X_n$, the integer constants $a$ and $c$, and the integer modulus $m$:
        $X_{n+1} = (a X_n + c) \bmod m$
    To get started, the algorithm requires an initial "seed" $X_0$, which must be provided by some means. We refer to the sequence generated as LCG($a$, $c$, $m$, $X_0$).
    The appearance of randomness is provided by performing modulo arithmetic, or remaindering.
    Note that the next result, $X_{n+1}$, depends only upon the previous integer $X_n$. This is a characteristic of linear congruential generators which minimizes storage requirements but, at the same time, imposes restrictions on the period.

  • Linear Congruential Generators
    With $X_n$ determined, we generate a corresponding real number as follows:
        $R_n = X_n / m$
    When dividing by $m$, the values $R_n$ are distributed on [0,1). We desire uniformity, where any particular $R_n$ is just as likely to appear as any other $R_n$, and the average of the $R_n$ is very close to 0.5.

  • Linear Congruential Generators: Example 1, LCG(5, 1, 16, 1)
    Let us consider a simple example with a = 5, c = 1, m = 16, and X0 = 1. The sequence of pseudorandom integers generated by this algorithm is:
        1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, ...
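The example can be reproduced with a few lines of Python. This is a minimal sketch that assumes the usual LCG recurrence X_{n+1} = (a·X_n + c) mod m and the real-number mapping R_n = X_n / m; the function name `lcg` is an illustrative choice.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield X_0, X_1, ... with X_{n+1} = (a * X_n + c) mod m and X_0 = seed."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m


# Example 1: LCG(5, 1, 16, 1). Take 18 values to see the cycle wrap around.
sequence = list(islice(lcg(5, 1, 16, 1), 18))
print(sequence)
# [1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6]

# Corresponding real numbers on [0, 1) for one full period.
reals = [x / 16 for x in sequence[:16]]
print(reals[:4])
# [0.0625, 0.375, 0.9375, 0.75]
```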

  • Linear Congruential Generators
    We observe:
      • The period P (the number of integers before the sequence repeats) is 16, exactly equal to the modulus m. Thus, for m = 16, this sequence has a long period (the longest possible) and is uniform (it completely fills the space of integers from 0 to 15).
      • Throughout its period the sequence exhibits the pattern of alternating odd and even integers.
      • It is readily apparent that the sequence is serially correlated. Because of this lack of randomness, the integers themselves should not be used as random digits.
      • The real numbers generated from the integer sequence are generally sufficiently random in the higher-order (most significant) bits to be used in many application codes.

  • Linear Congruential Generators
    Next, we infer the following:
      • Because each random integer results from the previous integer alone, selecting any initial seed from 0 to 15 would just cyclically shift the above sequence. Thus, all that a different choice of the initial seed does is shift the starting point in the sequence already determined by a, c and m.
      • Finally, we note that the average of the real numbers is 0.4688 and the variance is 0.0830. The departure of these values from the ideal ones of 1/2 and 1/12 is due to the short period of this sequence and the rather coarse resolution of the generated real numbers.
      • Average and variance approaching the theoretical values are necessary but not sufficient conditions for a good random number generator.
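Both inferences can be checked numerically. The sketch below assumes the LCG recurrence X_{n+1} = (a·X_n + c) mod m with a = 5, c = 1, m = 16, and divides by m to obtain real numbers; the helper name `lcg_cycle` and the alternative seed 13 are illustrative choices.

```python
def lcg_cycle(a, c, m, seed):
    """Return the values visited from seed until the sequence returns to it.

    Assumes the seed lies on a cycle (true for these parameters); the
    length guard simply prevents an endless loop otherwise.
    """
    cycle = [seed]
    x = (a * seed + c) % m
    while x != seed and len(cycle) < m:
        cycle.append(x)
        x = (a * x + c) % m
    return cycle


M = 16
cycle = lcg_cycle(5, 1, M, 1)
reals = [x / M for x in cycle]

mean = sum(reals) / len(reals)
variance = sum((r - mean) ** 2 for r in reals) / len(reals)
print(f"period = {len(cycle)}, mean = {mean:.4f}, variance = {variance:.4f}")
# period = 16, mean = 0.4688, variance = 0.0830

# A different seed only shifts the starting point within the same cycle.
shifted = lcg_cycle(5, 1, M, 13)
start = cycle.index(13)
print(shifted == cycle[start:] + cycle[:start])
# True
```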

  • Linear Congruential Generators: Example 2, LCG(5, 0, 16, 1)
    Next, we take the case of c = 0. This is termed a multiplicative congruential random number generator:
        $X_{n+1} = a X_n \bmod m$

  • Linear Congruential Generators: Observations
      • The low-order bits are not random.
      • The sequence is correlated, as all successive integers differ by 4 from their predecessors.
      • At coarse granularity, the sequence is uniform. For example, if we divide [0,1) equally into quarter segments, then exactly one random number falls into each segment: [0, 0.25), [0.25, 0.5), [0.5, 0.75) and [0.75, 1). However, at finer granularity this uniformity breaks down; consider dividing the domain into 8 equal segments, for example.
    There are two separate issues to consider here:
      • the finite precision existing in all computers, which results in a round-off error to the precision with which integers can be represented, or with which the floating point divide is accomplished;
      • the interaction of the sequence of random numbers produced by our generator with our application. This is particularly troublesome when an application requires n-tuples of random numbers, instead of just one random number at a time.
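These observations can be verified directly for LCG(5, 0, 16, 1). The following sketch assumes the multiplicative recurrence X_{n+1} = a·X_n mod m and real numbers R_n = X_n / m; the helper names `lcg_cycle` and `occupied_bins` are illustrative.

```python
def lcg_cycle(a, c, m, seed):
    """Return the values visited from seed until the sequence returns to it."""
    cycle = [seed]
    x = (a * seed + c) % m
    while x != seed and len(cycle) < m:
        cycle.append(x)
        x = (a * x + c) % m
    return cycle


def occupied_bins(values, bins):
    """Return the sorted indices of the equal-width bins on [0, 1) that are hit."""
    return sorted({int(v * bins) for v in values})


M = 16
cycle = lcg_cycle(5, 0, M, 1)
print(cycle)                      # [1, 5, 9, 13]: period 4 only

# Successive integers always differ by 4 (mod 16).
diffs = [(cycle[(i + 1) % len(cycle)] - cycle[i]) % M for i in range(len(cycle))]
print(diffs)                      # [4, 4, 4, 4]

# The two low-order bits never change.
low_bits = [x & 0b11 for x in cycle]
print(low_bits)                   # [1, 1, 1, 1]

reals = [x / M for x in cycle]
print(occupied_bins(reals, 4))    # [0, 1, 2, 3]: uniform over quarters
print(occupied_bins(reals, 8))    # [0, 2, 4, 6]: every other eighth is empty
```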

  • Linear Congruential Generators: Initial Seed
      • When debugging, it is important to implement the algorithm so that it reproduces the same stream of random numbers on successive runs.
      • Otherwise, the initial seed should be set to a "random" odd value.

  • Linear Congruential Generators: Characteristics of Good LCGs
      • A large value of a is desirable to provide sufficient randomness.
      • A large value of m is also desired, so that the period is kept long.
    Summary of the salient features and the recommendations:
      • Multiplicative congruential generators are adequate to good for many applications.
      • They are not acceptable ... for high-dimensional work.
      • They can be very good if speed is a major consideration.
      • Prime moduli are best. However, moduli of the form $2^n$ are faster on binary computers.

  • R250
    Uses a shift register sequence. It has several advantages over a linear congruential generator:
      • Long period (about $2^{250}$).
      • The period does not depend upon the number of bits used in the random number generator.
      • Generally much faster than an LCG implementation.
    The generator is built from a one-bit random generator based on the following equation:
        $a_k = c_1 a_{k-1} + c_2 a_{k-2} + \dots + c_p a_{k-p} \pmod 2$
    The maximum period is $2^p - 1$. We will use the value p = 250.

  • R250
    Choosing most of the $c_i$ terms to be 0, we get the equation:
        $a_k = a_{k-q} + a_{k-p} \pmod 2$
    If we choose q = 103, each new bit is obtained by adding (mod 2) the bits generated 103 and 250 steps earlier. To generate a random number of 16 or 32 bits, the above one-bit addition is done for each bit in the desired random number. Since exclusive-or is the same as bitwise addition mod 2, all the bit operations can be done in parallel. This gives the speed advantage.
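A word-wide version of this idea can be sketched in Python, assuming the two-tap recurrence x[n] = x[n-103] XOR x[n-250] applied to whole 32-bit words. The class name `R250`, the method `next_word`, and in particular the seeding from Python's `random` module are assumptions made for illustration; production implementations seed the 250-word state more carefully.

```python
import random

P, Q = 250, 103      # shift register length and tap position
WORD_BITS = 32       # width of each generated word


class R250:
    """Sketch of a shift-register generator: x[n] = x[n-103] XOR x[n-250]."""

    def __init__(self, seed=1):
        # ASSUMPTION: the initial 250 words come from Python's own generator.
        # This is only a convenient way to fill the state for the sketch.
        seeder = random.Random(seed)
        self.state = [seeder.getrandbits(WORD_BITS) for _ in range(P)]
        self.index = 0   # position of the oldest word, x[n-250]

    def next_word(self):
        i = self.index
        j = (i + P - Q) % P          # position of x[n-103] in the ring buffer
        new = self.state[i] ^ self.state[j]   # XOR acts on all 32 bits at once
        self.state[i] = new          # overwrite the oldest word
        self.index = (i + 1) % P
        return new


gen = R250(seed=42)
print([gen.next_word() for _ in range(3)])
```

The ring buffer keeps only the last 250 words, so each output costs one XOR and a few index updates.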

  • Shuffling Numbers
    Sometimes it is desirable to randomize a small set of numbers so that a non-repeating sequence is obtained. Examples:
      • games;
      • oceanographic RAFOS floats.
    It is important not to repeat numbers. Taking the modulus of the output of a generator like R250 will not work, as the numbers could repeat.
    One way to do this is to put the values to be shuffled into an array and use a random number generator to generate indices into the array to shuffle the numbers. The array is then accessed sequentially.
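One common way to realize the array-based approach is the Fisher-Yates shuffle, sketched below in Python. The slides do not name a specific algorithm, so treating it as Fisher-Yates is an assumption; the function name `shuffle_in_place` and the use of `random.Random` as the index generator are likewise illustrative.

```python
import random


def shuffle_in_place(values, rng):
    """Fisher-Yates shuffle: swap each position with a random earlier-or-equal one."""
    for i in range(len(values) - 1, 0, -1):
        j = rng.randrange(i + 1)     # random index with 0 <= j <= i
        values[i], values[j] = values[j], values[i]


rng = random.Random(2024)            # any generator of random indices would do
deck = list(range(1, 11))
shuffle_in_place(deck, rng)

# Access the shuffled array sequentially: every value appears exactly once.
for value in deck:
    print(value, end=" ")
print()
```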

  • Quasi Random Numbers
      • For some applications pseudorandom numbers are a little too random: some portions of the domain are relatively undersampled and other portions are oversampled.
      • Quasi random number generators give up serial independence of subsequently generated values in order to maintain a uniform density of coverage over the entire domain.
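As one illustration of a quasi random sequence, the sketch below generates the base-2 van der Corput sequence in Python. The slides do not name a particular quasi random generator, so this choice is an assumption; the function name `van_der_corput` is illustrative.

```python
def van_der_corput(n, base=2):
    """Return the n-th element (n >= 1) of the van der Corput sequence in [0, 1).

    The digits of n in the given base are mirrored about the radix point.
    """
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value


points = [van_der_corput(i) for i in range(1, 9)]
print(points)
# [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]

# Successive points are clearly not independent, but they fill [0, 1) evenly:
# the first 8 points land in 8 different eighths of the interval.
print(sorted(int(p * 8) for p in points))
# [0, 1, 2, 3, 4, 5, 6, 7]
```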

  • Cryptanalytic Attacks on Random Number Generators
    Examples of random parameters in cryptography:
      • session keys;
      • numbers to be hashed with passwords;
      • parameters in digital signatures;
      • nonces.
    Most of the above are approximated using PRNGs.
    Sources of true randomness:
      • noise in electrical circuits;
      • radioactive decay, etc.

  • Classes of Attacks on PRNGs:
      • Direct cryptanalytic attack: the attacker can directly distinguish between PRNG numbers and random numbers (cryptanalyze the PRNG).
      • Input-based attack: the attacker is able to use knowledge and control of PRNG inputs to cryptanalyze the PRNG.
      • State compromise extension attack: the attacker can guess some information due to an earlier breach of security. The advantage of a previous attack is extended.

  • Direct Cryptanalytic Attacks:
      • The attacker can directly cryptanalyze the PRNG.
      • Applicable to most PRNGs.
      • Not applicable when the attacker is not able to directly see the output of the PRNG. Example: a PRNG used to generate triple-DES keys. Here the output of the PRNG is never directly seen by an attacker.

  • Input-Based Attacks:
    The attacker uses knowledge or control of the inputs to cryptanalyze the PRNG output.
    Types:
      • Known input: the inputs to the PRNG, which are designed to be difficult for a user to guess, turn out to be easily deducible. Example: disk latency time. When the user is accessing a network disk, the attacker can observe the latency time.
      • Chosen input: practical against smartcards and against applications that feed incoming messages (username/password etc.) to the PRNG as entropy samples.
      • Replayed input: similar to chosen input, except that it requires less sophistication on the part of the attacker.

  • State Compromise Extension Attacks:
    These attacks attempt to extend the advantage of a temporary security breach.
    These breaches can be:
      • an inadvertent leak;
      • a previous cryptographic success.
    The attack is successful when the attacker learns the internal state S of the system and is then able to:
      • recover unknown PRNG outputs from before S was compromised; or
      • recover outputs from after the PRNG has collected a sequence of inputs that the attacker cannot otherwise guess.
    These attacks usually succeed when the system is started in a guessable state (due to lack of entropy).

  • State Compromise Extension Attacks (cont.):
    These attacks are classified as:
      • Backtracking attacks: use the compromise of PRNG state S to learn about all previous PRNG outputs.
      • Permanent compromise attacks: once S has been compromised, all future and past outputs of the PRNG are vulnerable.
      • Iterative guessing attacks: use the knowledge of state S that was compromised at time t, together with the intervening PRNG outputs, to guess the state S at a later time t + ε.
      • Meet-in-the-middle attacks: a combination of iterative guessing and backtracking.
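To make the idea of a permanent compromise concrete, the sketch below uses a toy linear congruential generator, X_{n+1} = (a·X_n + c) mod m with a = 5, c = 1, m = 16, whose output is its entire internal state. This generator is chosen only for illustration and is not one of the PRNGs analyzed in the slides; the function names are hypothetical, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
def lcg_next(x, a, c, m):
    """Advance the toy generator one step."""
    return (a * x + c) % m


def lcg_prev(x, a, c, m):
    """Step the toy generator backwards (needs gcd(a, m) == 1)."""
    a_inv = pow(a, -1, m)            # modular inverse of a
    return (a_inv * (x - c)) % m


A, C, M = 5, 1, 16
victim_stream = [1, 6, 15, 12, 13, 2, 11, 8]

# Suppose the attacker learns the single state value 13 (position 4).
compromised = victim_stream[4]

# Future outputs follow by running the generator forwards ...
future, x = [], compromised
for _ in range(3):
    x = lcg_next(x, A, C, M)
    future.append(x)
print(future)    # [2, 11, 8]

# ... and past outputs by running it backwards (backtracking).
past, x = [], compromised
for _ in range(4):
    x = lcg_prev(x, A, C, M)
    past.append(x)
print(past)      # [12, 15, 6, 1]
```

Cryptographic PRNGs try to prevent exactly this by making the state update hard to invert and by mixing in fresh, unguessable inputs.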

  • Some Examples:
      • ANSI X9.17 PRNG: vulnerable to input-based attacks and state compromise extension attacks.
      • DSA PRNG: vulnerable only to state compromise extension attacks.
      • RSAREF PRNG: vulnerable to input-based attacks and state compromise extension attacks.

  • Tests for Randomness in Random Numbers:
    Quantitative tests:
      • χ² tests;
      • lagged correlation.
    Qualitative tests:
      • Scatter plots: plot pairs of random numbers. Clumps of numbers, gaps and patterns are easily visible.
      • Random walk.

  • χ² tests:
    Measure how well the presumed distribution (usually uniform) is represented.
    Algorithm for the test:
      • Divide the whole interval within which the random numbers fall into a finite number of bins (class intervals). Assume they have the same size.
      • Count the number of random numbers within each interval and calculate the expected number of observations, which for uniform intervals is (number of random numbers used) / (number of class intervals).
      • Calculate: $\chi^2 = \sum_{i=1}^{m} \frac{(\text{observed}_i - \text{expected}_i)^2}{\text{expected}_i}$
      • Compare the value of χ² with critical values from a χ² table to decide whether the generated numbers are consistent with the chosen distribution.
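The steps of the χ² test can be sketched in Python as follows. The function name `chi_square_uniform`, the choice of 10 bins, and the use of `random.Random` as the generator under test are illustrative assumptions; the critical value 16.92 is the tabulated 5% value for 9 degrees of freedom.

```python
import random


def chi_square_uniform(samples, bins):
    """Chi-square statistic for samples on [0, 1) against a uniform distribution."""
    observed = [0] * bins
    for x in samples:
        # Equal-width bins; the min() guards against a sample equal to 1.0.
        observed[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((o - expected) ** 2 / expected for o in observed)


rng = random.Random(1)               # stand-in for the generator under test
samples = [rng.random() for _ in range(10_000)]

stat = chi_square_uniform(samples, 10)
# With 10 bins there are 9 degrees of freedom; a statistic far above the
# tabulated critical value suggests the samples are not uniform.
print(f"chi-square = {stat:.2f}; compare with 16.92 (9 d.o.f., 5% level)")
```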

  • Lagged Correlation:
    This test reveals the relationship between the numbers at one time and at another (autocorrelation).
    It reveals trends and periodicity in the numbers.
    Properties of an ideal random number generator:
      • Autocorrelation value = 1 for lag 0.
      • Autocorrelation value = 0 for any other lag.
    If the autocorrelation values drop only slowly to 0 as the lag increases, then the random numbers generated are not very independent of each other.
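A lagged correlation check can be sketched in Python as below. The normalization used here (lagged covariance divided by the lag-0 sum of squares) is one common convention and is an assumption, as are the function name `autocorrelation` and the use of `random.Random` as the generator under test.

```python
import random


def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at the given non-negative lag."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    lagged = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return lagged / total


rng = random.Random(99)              # stand-in for the generator under test
samples = [rng.random() for _ in range(10_000)]

for lag in range(4):
    print(f"lag {lag}: {autocorrelation(samples, lag):+.4f}")
# Lag 0 gives 1 by construction; the other lags should be close to 0.
```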

  • Random Walk
    Algorithm:
      • Divide the range of the random number generator into equal intervals (4 intervals for a random walk in two dimensions).
      • Generate a number. If the number falls in the:
          • first interval, increment X;
          • second interval, increment Y;
          • third interval, decrement X;
          • fourth interval, decrement Y.
      • Generate n walks of t steps each.
      • Calculate the mean squared distance reached.
      • Plot this distance against time. A plot of the distance for several values of t should be roughly linear; otherwise the random numbers are not correctly distributed.
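The random walk test can be sketched in Python as follows. The function name `mean_squared_distance`, the step counts, and the number of walks are illustrative assumptions, and `random.Random` stands in for the generator under test. For an unbiased walk with unit steps, the mean squared distance after t steps should be close to t.

```python
import random


def mean_squared_distance(rng, steps, walks):
    """Average squared end-to-end distance over several 2-D lattice walks."""
    total = 0
    for _ in range(walks):
        x = y = 0
        for _ in range(steps):
            r = rng.random()         # split [0, 1) into four equal intervals
            if r < 0.25:
                x += 1
            elif r < 0.5:
                y += 1
            elif r < 0.75:
                x -= 1
            else:
                y -= 1
        total += x * x + y * y
    return total / walks


rng = random.Random(7)               # stand-in for the generator under test
for t in (10, 20, 40, 80):
    msd = mean_squared_distance(rng, t, walks=2000)
    print(f"t = {t:3d}  mean squared distance = {msd:6.2f}")
# The printed values should grow roughly linearly with t.
```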

  • Truly Random Numbers:
      • Must rely on external physical quantities.
      • Computers require special hardware, and few computers have access to this kind of hardware. Example: sensors (heat, pressure, etc.).
      • Randomness without relying on external data: some way to measure internal activity of the computer such that the activity is quantifiable and genuinely random. Example: timing of keystrokes as a user enters a password.

  • Some physical quantities used in the real world for true random number generation:
      • Timing of keystrokes when a user enters a password.
      • Measurement of air turbulence due to the movement of hard drive heads.
      • Timings of memory accesses under artificially induced thrashing conditions.
      • Precise measurement of current leakage from a CPU or any other system component.
      • Measurement of timing skew between two system timers: a hardware timer and a software timer.

  • Conclusions:
      • Random numbers are the basis for many cryptographic applications.
      • No deterministic function can, on its own, reliably generate truly random numbers.
      • Present-day computers can only approximate random numbers, using pseudorandom numbers generated by pseudorandom number generators (PRNGs).
      • Many cryptographic applications can be attacked by attacking their PRNGs.
      • Computer applications are increasingly turning towards physical data (external or internal) to obtain truly random numbers.