Probability Distributions
2014/04/07 Maiko Narahara
http://en.wikibooks.org/wiki/R_Programming/Probability_Distributions
Probability density function (PDF)
• A function that defines probabilities for continuous variables.
• Because the variable is continuous, the probability of observing any one exact value is zero.
• So we read the area under the curve over a range as the probability of observing any value in that range.
• --> The total area under the curve must be 1.
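The area interpretation can be checked numerically. The following is a minimal sketch, assuming the standard normal distribution as the example density and base R's integrate() for the numerical integration; the range -1 to 1 is an arbitrary choice for illustration.

```r
# Total area under the standard normal density: should be 1
# (up to numerical integration error).
total <- integrate(dnorm, lower = -Inf, upper = Inf)
total$value

# Probability of observing a value between -1 and 1
# = area under the density curve over that range.
area <- integrate(dnorm, lower = -1, upper = 1)
area$value

# The same probability obtained from the cumulative distribution function.
pnorm(1) - pnorm(-1)
```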
Probability mass function (PMF)
• A function that defines probabilities for discrete variables.
• Unlike continuous variables, the probability of observing an exact value is defined (y-axis = probability).
• The sum of the y values over all possible x values must be 1.
http://en.wikipedia.org/wiki/File:Binomial_distribution_pmf.svg
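As a small check of the "sums to 1" property, here is a sketch using a binomial distribution. The parameters (5 trials, success probability 0.3) are assumptions chosen only for illustration.

```r
# Hypothetical example parameters: 5 trials, success probability 0.3.
n <- 5
p <- 0.3

# All possible outcomes of a binomial variable: 0, 1, ..., n.
x <- 0:n

# dbinom() returns the probability of each exact value.
probs <- dbinom(x, size = n, prob = p)
probs

# The probabilities over all possible values add up to 1.
sum(probs)
```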
R functions for probability distributions
• [rdpq]name_of_distribution(), e.g. rnorm, dbinom
– r: random generation
  • generates random numbers from the distribution
– d: density function (probability mass function for discrete distributions)
  • returns the density for a given value
– p: cumulative distribution function
  • returns the cumulative probability
  • also used when we calculate a p-value
– q: quantile function
  • returns the values that correspond to given quantiles
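The naming scheme can be tried on a few distributions. This is an illustrative sketch only; the particular values (0, 1.96, 0.25, 10 trials) are arbitrary choices, not taken from the slides.

```r
# d: height of the standard normal density at 0, i.e. 1 / sqrt(2 * pi).
d <- dnorm(0)

# p: cumulative probability P(X <= 1.96) for the standard normal.
p <- pnorm(1.96)

# q: the quantile function is the inverse of p, so this recovers 1.96.
q <- qnorm(p)

# r: three random draws from the standard normal (values differ per run).
r <- rnorm(3)

# The same prefixes work for other distributions.
u <- punif(0.25, min = 0, max = 1)       # uniform on [0, 1]: P(X <= 0.25)
m <- qbinom(0.5, size = 10, prob = 0.5)  # median of a binomial(10, 0.5)
```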
rnorm
n <- 1000
x <- rnorm(n, mean=0, sd=1)
hist(x)
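Because rnorm draws random numbers, each run gives a different sample. A sketch of how to make a run repeatable with set.seed() is shown here; the seed value 42 is an arbitrary assumption.

```r
# Fix the random number generator state so the sample is reproducible.
set.seed(42)  # arbitrary seed, chosen only for illustration
n <- 1000
x <- rnorm(n, mean = 0, sd = 1)

# With a large sample, the sample mean and standard deviation
# should be close to the parameters mean = 0 and sd = 1.
mean(x)
sd(x)

# Resetting the same seed reproduces exactly the same numbers.
set.seed(42)
y <- rnorm(n, mean = 0, sd = 1)
identical(x, y)
```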
dnorm
dnorm(x=1, mean=0, sd=1)
gives the density that corresponds to the given value x.
[Figure: normal density curve; x-axis: X, y-axis: Density]
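To see what dnorm computes, the value can be compared with the normal density formula written out by hand. The helper normal_density below is a hypothetical name introduced for this sketch and is not part of R.

```r
# Hand-written normal density (hypothetical helper, for comparison only):
# f(x) = exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))
normal_density <- function(x, mean = 0, sd = 1) {
  exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))
}

# Both calls give the same density at x = 1.
dnorm(x = 1, mean = 0, sd = 1)
normal_density(1)

# A density is not a probability: with a small sd it can exceed 1.
narrow <- dnorm(0, mean = 0, sd = 0.1)
narrow
```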
pnorm
pnorm(q=0, mean=0, sd=1)
gives the cumulative probability for the given value of x.
How to compute a p-value
Z-test statistic: 2.5
pnorm(2.5, lower.tail=FALSE)
*note: one-tail test
[Figure: normal cumulative distribution curve; x-axis: X, y-axis: Cumulative probability]
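The p-value above is for a one-tail test. A sketch of the two-tail counterpart follows, assuming the same Z statistic of 2.5 and a symmetric (standard normal) null distribution.

```r
z <- 2.5  # Z-test statistic from the example above

# One-tail p-value: probability of observing a value greater than z.
p_one <- pnorm(z, lower.tail = FALSE)

# Equivalent, but less accurate for very large z: 1 - pnorm(z).
p_one_alt <- 1 - pnorm(z)

# Two-tail p-value: both tails beyond |z|; by symmetry, twice the upper tail.
p_two <- 2 * pnorm(abs(z), lower.tail = FALSE)

p_one
p_two
```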
qnorm
qnorm(0.975)
returns the x that corresponds to the given quantile value.
This example calculates the upper critical value at alpha=0.05 (two-tail).
[Figure: normal cumulative distribution curve; x-axis: X, y-axis: Cumulative probability]
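The critical values for other test setups follow the same pattern. This is a minimal sketch, assuming a standard normal test statistic and alpha = 0.05.

```r
alpha <- 0.05

# Two-tail test: alpha is split between the two tails.
upper <- qnorm(1 - alpha / 2)  # upper critical value, about 1.96
lower <- qnorm(alpha / 2)      # lower critical value, about -1.96

# One-tail (upper) test: all of alpha sits in one tail.
one_sided <- qnorm(1 - alpha)  # about 1.645

# qnorm is the inverse of pnorm: this gives back 0.975.
pnorm(upper)
```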
Tips 1
Handling vectors
• rnorm(10, mean=1:10, sd=1:10)
• rnorm(5, mean=c(1, 1, 2, 2, 2))
– # sampling from different distributions
• dnorm(0, mean=1:2)
• dnorm(c(0, 1), mean=1:2)
– # similarly, qnorm and pnorm can handle vectors
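How the vector arguments are paired up can be made explicit by comparing against element-by-element calls. This sketch reuses the calls from the bullets above; the variable names are introduced only for illustration.

```r
# A single x with two means: x is recycled, giving one density per mean.
d1 <- dnorm(0, mean = 1:2)
# Same as c(dnorm(0, mean = 1), dnorm(0, mean = 2)).

# Two x values with two means: arguments are paired element by element.
d2 <- dnorm(c(0, 1), mean = 1:2)
# Same as c(dnorm(0, mean = 1), dnorm(1, mean = 2)).

# pnorm and qnorm accept vectors in the same way.
p <- pnorm(c(-1.96, 0, 1.96))
q <- qnorm(c(0.025, 0.5, 0.975))

# rnorm: the i-th draw uses the i-th mean (here 1, 1, 2, 2, 2).
r <- rnorm(5, mean = c(1, 1, 2, 2, 2))
```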
Tips 2
Drawing curve of d/p function
• Syntax: curve(function, from, to)
curve(dnorm, from=-3, to=3)
– # draws a nice curve for the standard normal distribution
• But how do you change the parameters of the distribution?
curve(dnorm, mean=1, sd=2) # does not work: mean and sd are not passed on to dnorm
a <- function(x) dnorm(x, mean=1, sd=2)
curve(a, from=-3, to=5)
• Similarly, you can draw a cumulative curve
curve(pnorm, from=-3, to=3)
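As an alternative to defining a wrapper function, curve() also accepts an expression written in terms of x. The sketch below uses that form and overlays a second curve with add=TRUE; the second distribution (mean 1, sd 1) and the ylim value are assumptions chosen for illustration.

```r
# Pass an expression in x instead of a function name, so the
# distribution parameters can be written inline.
# ylim is set so that the taller curve added below is not clipped.
res <- curve(dnorm(x, mean = 1, sd = 2), from = -3, to = 5,
             ylim = c(0, 0.4), ylab = "Density")

# Overlay a second density on the same plot as a dashed line.
curve(dnorm(x, mean = 1, sd = 1), add = TRUE, lty = 2)

# curve() invisibly returns the plotted points as a list with x and y.
str(res)
```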
Note about lower.tail=FALSE for discrete distributions
pbinom(1, 5, prob=0.3)
--> 0.52822
--> P(X <= 1): includes the probability of x=1
pbinom(1, 5, prob=0.3, lower.tail=FALSE)
--> 0.47178
--> P(X > 1): does not include x=1
Note that setting lower.tail=FALSE is equivalent to 1 - pbinom(1, 5, prob=0.3).
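A practical consequence is that "x or more" needs the cutoff shifted down by one. The following sketch uses the same binomial example (5 trials, prob=0.3) and checks the results against sums of dbinom values.

```r
n <- 5
p <- 0.3

# P(X > 1): the upper tail excludes x = 1.
upper_gt1 <- pbinom(1, n, prob = p, lower.tail = FALSE)
# Same probability, summing the individual probabilities of 2, 3, 4, 5.
check_gt1 <- sum(dbinom(2:5, n, prob = p))

# P(X >= 1): to include x = 1, use q = 0 instead of q = 1.
upper_ge1 <- pbinom(0, n, prob = p, lower.tail = FALSE)
# Same probability, as 1 minus the probability of x = 0.
check_ge1 <- 1 - dbinom(0, n, prob = p)

upper_gt1
upper_ge1
```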