www.csd.abdn.ac.uk/~jhunter/teaching/CS1512/lectures/
CS1512
CS1512 Foundations of Computing Science 2
Lecture 24
Probability and statistics (5): Random number generators
© J R W Hunter, 2006
After Easter
Lectures
• Dr Kees van Deemter will take over
• Logic and HCI
• Same times and places
Tutorials
• Logic and HCI
• Same times and places
Practicals
• Java programming simulation – Robocode ‘take-home assessment’
• Same times and places
Continuous Assessment
Week 9 test
• week 9 = week after Easter vacation;
• worth 10% of the marks of the course;
• as for week 5 – test under practical exam conditions;
• will test your knowledge of inheritance.
‘Practical exam’
• completed in your own time;
• worth 30% of the marks of the course;
• handed out in week 10; hand in by the end of week 12.
Both
• conditional on AUT dispute being resolved;
• safest course is to assume that they will go ahead;
• if you are worried about the possible effects of the AUT action, write to the Principal and express those concerns.
Remember: Continuous data
• Divide range of observations into non-overlapping intervals (bins)
• Count number of observations in each bin
• Enzyme concentration data:
121 25 83 110 60 101
95 81 123 67 113 78
85 145 100 70 93 118
119 57 64 151 48 92
62 104 139 201 68 95
• Range: 25 to 201
• 10 bins of width 20
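The binning step can be sketched in Java roughly as follows. This is a minimal illustration rather than code from the lecture: the class name EnzymeBins and the method countBins are hypothetical, and the lower edge 19.5 is taken from the frequency table for this data set.

```java
import java.util.Arrays;

class EnzymeBins {
    // Enzyme concentration data from the slide (30 observations).
    static final int[] DATA = {
        121, 25, 83, 110, 60, 101,
        95, 81, 123, 67, 113, 78,
        85, 145, 100, 70, 93, 118,
        119, 57, 64, 151, 48, 92,
        62, 104, 139, 201, 68, 95
    };

    // Count the observations falling in each bin
    // [lower + k*width, lower + (k+1)*width) for k = 0 .. numberOfBins-1.
    // Values outside the covered range are ignored.
    static int[] countBins(int[] data, double lower, double width, int numberOfBins) {
        int[] counts = new int[numberOfBins];
        for (int value : data) {
            int k = (int) Math.floor((value - lower) / width);
            if (k >= 0 && k < numberOfBins) {
                counts[k]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // 10 bins of width 20, starting at 19.5 (assumed lower edge).
        int[] counts = countBins(DATA, 19.5, 20.0, 10);
        System.out.println(Arrays.toString(counts));
    }
}
```

Dividing each count by the number of observations (30) would then give the relative frequencies.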
Remember: Enzyme concentrations
Concentration            Freq.   Rel. Freq.
 19.5 ≤ c <  39.5          1       0.033
 39.5 ≤ c <  59.5          2       0.067
 59.5 ≤ c <  79.5          7       0.233
 79.5 ≤ c <  99.5          7       0.233
 99.5 ≤ c < 119.5          7       0.233
119.5 ≤ c < 139.5          3       0.100
139.5 ≤ c < 159.5          2       0.067
159.5 ≤ c < 179.5          0       0.000
179.5 ≤ c < 199.5          0       0.000
199.5 ≤ c < 219.5          1       0.033
Totals                    30       1.000
Relative Frequency Histogram
[Figure: histogram; y-axis: relative frequency, 0 to 0.25.]
The height of each bar gives the relative frequency.
Density Histograms
Plot relative frequency / width of the column (bin width) so that the area of the bar now gives the relative frequency
[Figure: density histogram; x-axis: bin edges 19.5 to 199.5 in steps of 20; y-axis: relative frequency density, 0 to 0.014.]
relative frequency = relative frequency density × bin width
e.g. 0.01165 × 20 = 0.233
Addition of areas
[Figure: density histogram; x-axis: bin edges 19.5 to 199.5; y-axis: relative frequency density.]
The relative frequency of values between two bin edges = the total area of the bars between them.
Increase the number of samples
... and decrease the width of the bin ...
Relative frequency as area under the curve
relative frequency of values between a and b = area
Continuous random variable
Consider a large population of individuals e.g. all males in the UK over 16
Consider a continuous attribute e.g. Height: X
Select an individual at random so that any individual is as likely to be selected as any other
X is said to be a continuous random variable
Probability density function
The probability distribution of X is described by its probability density function, defined such that:
P(a ≤ X < b) = area under the curve between a and b
NB total area under curve must be 1.0
The ‘normal’ distribution
Very common distribution:
• often called a Gaussian distribution
• variable measured for large number of nominally identical objects;
• variation assumed to be caused by a large number of factors;
• each factor exerts a small random positive or negative influence;
• e.g. height: age, diet, bone structure, genetic influences, etc.
Symmetric about mean
Unimodal
[Figure: bell-shaped normal curve.]
Mean
Mean determines the centre of the curve:
[Figure: two normal curves, one centred at Mean = 10 and one at Mean = 30.]
Remember: Variance
Measure of spread: variance
[Figure: two frequency bar charts; x-axis: 1 to 18; y-axis: 0 to 45.]
Remember: Variance
sample variance = s²
sample standard deviation = s = √(variance)
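As a reminder of how these two quantities might be computed, here is a minimal Java sketch. The slide's formula did not survive extraction, so the divisor n – 1 used below is an assumption (it is the usual convention for a sample variance); the class and method names are hypothetical.

```java
class SampleStats {
    // Arithmetic mean of the observations.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sample variance s^2, using the divisor n - 1 (assumed convention).
    // Requires at least two observations.
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double sumOfSquares = 0.0;
        for (double x : xs) {
            sumOfSquares += (x - m) * (x - m);
        }
        return sumOfSquares / (xs.length - 1);
    }

    // Sample standard deviation s = square root of the sample variance.
    static double sampleStdDev(double[] xs) {
        return Math.sqrt(sampleVariance(xs));
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("variance = " + sampleVariance(xs));
        System.out.println("std devn = " + sampleStdDev(xs));
    }
}
```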
Standard deviation
Standard deviation determines the ‘width’ of the curve:
[Figure: two normal curves, one with Std. Devn. = 2 and one with Std. Devn. = 1.]
Remember: Cumulative frequencies
Number of piglets in a litter (discrete data):
Litter size   Frequency   Cum. Freq.
     5            1           1
     6            0           1
     7            2           3
     8            3           6
     9            3           9
    10            9          18
    11            8          26
    12            5          31
    13            3          34
    14            2          36
Total            36
c_K = n (the last cumulative frequency equals the total number of observations)
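The cumulative column is just a running total of the frequency column. A minimal Java sketch (the class and method names are hypothetical, not part of the lecture code):

```java
import java.util.Arrays;

class CumulativeFrequency {
    // Returns the running totals of the given frequencies:
    // result[k] = freq[0] + freq[1] + ... + freq[k].
    static int[] cumulative(int[] freq) {
        int[] cum = new int[freq.length];
        int runningTotal = 0;
        for (int k = 0; k < freq.length; k++) {
            runningTotal += freq[k];
            cum[k] = runningTotal;
        }
        return cum;
    }

    public static void main(String[] args) {
        // Frequencies for litter sizes 5 to 14, as in the table.
        int[] freq = {1, 0, 2, 3, 3, 9, 8, 5, 3, 2};
        System.out.println(Arrays.toString(cumulative(freq)));
    }
}
```

The last entry of the result is the total number of litters, 36.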
Remember: Plotting
[Figures: frequency plot; cumulative frequency plot.]
Cumulative normal distribution
[Figure: cumulative normal distribution curve; y-axis: 0 to 1.]
For good demo, go to: http://www.vertex42.com/ExcelArticles/mc/NormalDistribution-Excel.html and download the Excel file
Relationship between the two distributions
[Figure: normal density curve and cumulative normal curve.]
area under the density curve to the left of a point (here 0.84) = height of the cumulative curve at that point
Probability of sample lying within mean ± 2 standard deviations
Mean (μ) = 10.0    Std. devn (σ) = 2.0

   x     Prob Dist       Cum Prob Dist
  2.0    0.00013383         0.003%
  2.4    0.000291947        0.007%
  2.8    0.000611902        0.016%
  3.2    0.001232219        0.034%
  3.6    0.002384088        0.069%
  4.0    0.004431848        0.135%
  4.4    0.007915452        0.256%
  4.8    0.013582969        0.466%
  5.2    0.02239453         0.820%
  5.6    0.035474593        1.390%
  6.0    0.053990967        2.275%
  6.4    0.078950158        3.593%
  6.8    0.110920835        5.480%
  7.2    0.149727466        8.076%
  7.6    0.194186055       11.507%
  8.0    0.241970725       15.866%
  8.4    0.289691553       21.186%
  8.8    0.333224603       27.425%
  9.2    0.36827014        34.458%
  9.6    0.391042694       42.074%
 10.0    0.39894228        50.000%

P(X < μ – σ) = 15.866%
P(X < μ – 2σ) = 2.275%
P(μ – 2σ < X < μ + 2σ) = (100 – 2 × 2.275)% = 95.45%
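The probability of lying within two standard deviations of the mean can also be checked numerically, by adding up the area under the normal density between μ – 2σ and μ + 2σ. The Java sketch below is only an illustration: the class NormalArea, its method names and the use of the trapezoid rule are assumptions, not part of the lecture.

```java
class NormalArea {
    // Normal probability density with the given mean and standard deviation.
    static double pdf(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    // Approximate area under the density between a and b,
    // using the trapezoid rule with the given number of steps.
    static double area(double a, double b, double mean, double sd, int steps) {
        double h = (b - a) / steps;
        double sum = 0.5 * (pdf(a, mean, sd) + pdf(b, mean, sd));
        for (int i = 1; i < steps; i++) {
            sum += pdf(a + i * h, mean, sd);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        double mean = 10.0;
        double sd = 2.0;
        // Area between mean - 2 sd and mean + 2 sd.
        double p = area(mean - 2 * sd, mean + 2 * sd, mean, sd, 10000);
        System.out.println("P(within 2 std devns) is approximately " + p);
    }
}
```

With μ = 10 and σ = 2 the result should be close to 0.9545, in line with (100 – 2 × 2.275)%.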
Probability of sample lying within mean ± 2 standard deviations
[Figure: normal curve with 2.275% of the area in each tail (below μ – 2σ and above μ + 2σ) and 95.45% between μ – 2σ and μ + 2σ.]
Uniform probability distribution
Also called rectangular distribution
[Figure: uniform density; x-axis: x from 0.0 to 1.0; y-axis: P(X), height 1.0; a point y marked on the x-axis.]
P(X < y) = 1.0 × y = y
Uniform probability distribution
[Figure: uniform density; x-axis: x from 0.0 to 1.0 with points a and b marked; y-axis: P(X), height 1.0.]
P(X < a) = a
P(X < b) = b
P(a ≤ X < b) = b – a
Sampling from a distribution
Suppose we want a stream of numbers sampled from a given distribution.
Previously we had the sample data and wanted the distribution.
Now we have the distribution and want sample data.
Simplest to sample from the uniform distribution between 0.0 and 1.0:
0.6282666143546787  0.1874836450093842  0.1345094277951323
0.0720166704579284  0.5892161544310359  0.9375335692470778
0.6377396244822982  0.6832029056956863  0.8196076287840244
0.3689553414430091  0.6597233555218959  0.9969146442986877
0.0867381632942044  0.4262198006313059  0.3064954363270612
0.7706191731891433  0.7327364126731544  0.6184114671445469
0.4400410508617185  0.7270704022602184  ...
Use a random number generator
Random Number Generators
From a given starting number (the seed) there are algorithms which will generate a series of pseudo-random numbers which are uniformly distributed:
• linear congruential pseudorandom number generator
• but you didn’t want to know this!
Computers are deterministic:
• from a given starting point they always do the same thing;
• how do we get different series?
• start from different seeds:
  - choose the seed yourself
  - derive it from the computer clock (date and time of day)
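For the curious, a linear congruential generator can be sketched in a few lines of Java. Each new state is (a × state + c) mod m, and dividing by m gives a value in [0.0, 1.0). The class SimpleLcg is hypothetical, and the constants are one well-known textbook choice (a = 1664525, c = 1013904223, m = 2^32), used here only for illustration; they are not the constants used by the Java library, and a generator like this should not replace java.util.Random in real programs.

```java
class SimpleLcg {
    // Illustrative constants (assumed, not those of java.util.Random).
    private static final long A = 1664525L;
    private static final long C = 1013904223L;
    private static final long M = 1L << 32;   // modulus 2^32

    private long state;

    // The seed fixes the whole series: same seed, same numbers.
    SimpleLcg(long seed) {
        state = Math.floorMod(seed, M);
    }

    // Advance the state and scale it to a double in [0.0, 1.0).
    double nextDouble() {
        state = (A * state + C) % M;
        return (double) state / M;
    }

    public static void main(String[] args) {
        // A fixed seed gives a repeatable series ...
        SimpleLcg fixed = new SimpleLcg(12345L);
        for (int i = 0; i < 3; i++) {
            System.out.println(fixed.nextDouble());
        }
        // ... while a seed from the clock gives a different series each run.
        SimpleLcg clocked = new SimpleLcg(System.currentTimeMillis());
        for (int i = 0; i < 3; i++) {
            System.out.println(clocked.nextDouble());
        }
    }
}
```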
Java support – class Math
static double random()
• returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0 (0.0 ≤ x < 1.0);
• when this method is first called, it creates a single new pseudo-random-number generator (seed derived automatically) which is used thereafter for all calls to this method and is used nowhere else.

public void randGen()   // demo of Math.random()
{
    for (int i = 0; i < 20; i++)
        System.out.println(Math.random());
}
Java support – class java.util.Random
Construct a random number generator:
• public Random()
  Creates a new random number generator; this constructor sets the seed of the random number generator to a value very likely to be distinct from any other invocation of this constructor.
• public Random(long seed)
  Creates a new random number generator using a single long seed.
Java support – class java.util.Random
Get the (next) random number:
• public double nextDouble()
  just like Math.random()
• public int nextInt(int n)
  returns a pseudo-random, uniformly distributed int value between 0 (inclusive) and the specified value (exclusive)
• public double nextGaussian()
  returns the next pseudo-random, Gaussian ("normally") distributed double value with mean 0.0 and standard deviation 1.0 from this random number generator's sequence.
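A short Java sketch of how these methods might be used together. The class RandomDemo and its helper gaussian are hypothetical; the helper rescales nextGaussian() to a chosen mean and standard deviation, which the slides do not cover explicitly.

```java
import java.util.Random;

class RandomDemo {
    // nextGaussian() has mean 0.0 and standard deviation 1.0;
    // scale and shift it to get another mean and standard deviation.
    static double gaussian(Random random, double mean, double sd) {
        return mean + sd * random.nextGaussian();
    }

    public static void main(String[] args) {
        // Two generators built from the same seed produce the same series.
        Random first = new Random(42L);
        Random second = new Random(42L);
        System.out.println(first.nextDouble() == second.nextDouble());

        // No seed given: very likely a different series on every run.
        Random random = new Random();
        System.out.println(random.nextDouble());        // 0.0 <= x < 1.0
        System.out.println(random.nextInt(6));          // one of 0, 1, ..., 5
        System.out.println(gaussian(random, 10.0, 2.0)); // normal, mean 10, std devn 2
    }
}
```

Fixing the seed is useful when a simulation has to be repeatable, for example while debugging.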
Testing the uniformity
public void testUniform(int numberOfSamples, int numberOfBins)
{
    double[] hist = new double[numberOfBins];       // number of samples in each bin
    double sample;
    int binNumber;
    for (int i = 0; i < numberOfSamples; i++) {
        sample = Math.random();
        binNumber = (int) (sample * numberOfBins);  // bin index 0 .. numberOfBins-1
        hist[binNumber]++;
    }
    double relativeFrequency;
    for (int k = 0; k < numberOfBins; k++) {
        relativeFrequency = hist[k] / numberOfSamples;
        System.out.println(relativeFrequency);
    }
    System.out.println();
}
Simulating coin toss
// requires: import java.util.Random;
public String coinToss(int n)
{
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++) {
        int t = random.nextInt(2);   // i.e. t = 0 or 1
        if (t == 0)                  // we want this to happen with probability 0.5
            s = s + "T ";
        else
            s = s + "H ";
    }
    return s;
}
Simulating dice throw
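// requires: import java.util.Random;
public String diceThrow(int n)
{
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++)
        // nextInt(6) gives 0..5; add 1 to get a die face 1..6
        s = s + (random.nextInt(6) + 1) + " ";
    return s;
}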
Simulating picking balls
public String pickABall(int n)
{
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++) {
        double b = random.nextDouble();
        if (b < 0.3)   // we want this to happen with probability 0.3
            s = s + "R ";
        else
            s = s + "W ";
    }
    return s;
}
‘Foxes and rabbits’ simulation
Rabbits and foxes in an enclosed field:
• example of a “predator-prey” simulation;
• see Barnes and Kölling, Objects first with Java, Chapter 10.
The field:
• has a fixed number of square cells arranged in a square grid;
• each cell can be occupied by only one animal;
• animals can’t leave the field.
Animals
All animals have:
• a state (alive or dead!)
• an age
• a location in the field
All animals do:
• get older
• breed
• try to move to a new location
• die of old age
• die of overcrowding
Foxes:
• die of hunger
Rabbits:
• die from being eaten by a fox
Breeding
Rabbits:
BREEDING_PROBABILITY = 0.15;
MAX_LITTER_SIZE = 5;
private int breed()   // returns size of litter (if any)
{
    int births = 0;
    if (rand.nextDouble() <= BREEDING_PROBABILITY) {
        births = rand.nextInt(MAX_LITTER_SIZE) + 1;
    }
    return births;
}
Foxes:
BREEDING_PROBABILITY = 0.09;
MAX_LITTER_SIZE = 3;
Breeding probability
[Figure: uniform density; x-axis: x from 0.0 to 1.0; y-axis: P(X), height 1.0.]
if (rand.nextDouble() <= BREEDING_PROBABILITY) ...
Number of births with each litter size equally likely
[Figure: litter sizes 1, 2, 3, ... MAX_LITTER_SIZE on the x-axis; y-axis: P(X); each size has probability 1 / MAX_LITTER_SIZE.]
births = rand.nextInt(MAX_LITTER_SIZE) + 1
Number of births with different probabilities of litter sizes
[Figure: litter sizes 1, 2, 3, ... MAX_LITTER_SIZE on the x-axis; y-axis: P(X); the sizes have different probabilities p1, p2, p3, ... pMAX.]
Given p1, p2, ... pMAX, how do you use a random number generator to generate a litter size?
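One possible answer, sketched in Java: draw a uniform number u between 0.0 and 1.0 and walk along the cumulative probabilities p1, p1 + p2, ... until u falls below the running total. This is a minimal sketch under stated assumptions, not the lecture's own solution: the class WeightedLitter, the method litterSize and the example probabilities are all hypothetical.

```java
import java.util.Random;

class WeightedLitter {
    // probs[k] is the probability of a litter of size k + 1;
    // the entries are assumed to be non-negative and to sum to 1.0.
    // u is a uniform random number with 0.0 <= u < 1.0.
    static int litterSize(double[] probs, double u) {
        double cumulative = 0.0;
        for (int k = 0; k < probs.length; k++) {
            cumulative += probs[k];
            if (u < cumulative) {
                return k + 1;
            }
        }
        // Only reached if rounding error leaves the total just below 1.0.
        return probs.length;
    }

    public static void main(String[] args) {
        double[] probs = {0.5, 0.3, 0.2};   // example values for p1, p2, p3
        Random rand = new Random();
        for (int i = 0; i < 10; i++) {
            System.out.print(litterSize(probs, rand.nextDouble()) + " ");
        }
        System.out.println();
    }
}
```

Passing u in as a parameter, rather than drawing it inside the method, keeps the mapping from u to litter size easy to check by hand.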
Have a good Easter!