Introduction to Probability
Last Updated: 20 March 2015
Slideshare: http://www.slideshare.net/marinasantini1/introduction-to-probability-theory
Mathematics for Language Technology http://stp.lingfil.uu.se/~matsd/uv/uv15/mfst/
Marina Santini
Department of Linguistics and Philology Uppsala University, Uppsala, Sweden
Spring 2015
Acknowledgements Several slides borrowed from Prof Joakim Nivre. Practical Activity by Mats Dahllöf
Required Reading:
• E&G (2013): Ch. 5 (pp. 105-110)
• Compendium (4): 9.1
• E&G (2013): Ch. 5.2-5.3 (self-study)
Recommended Reading:
• Sections 1-3 in Goldsmith J. (2007) Probability for Linguists. The University of Chicago. The Department of Linguistics: http://hum.uchicago.edu/~jagoldsm/Papers/probability.pdf
Why study probability and statistics?
Developments in NLP have led to the exploitation of language corpora to refine and develop computational models of language. Many of these models exploit basic axioms, theorems and approximations from the field of probability theory and statistical inference.
Deterministic vs Non-Deterministic
Generally speaking, a deterministic system is a system in which no randomness is involved in the development of future states of the system. That is, a deterministic model will always produce the same behaviour from a given state. In automata theory, a deterministic finite automaton (DFA) is a finite state machine that accepts/rejects finite strings of symbols and only produces a unique computation (or run) of the automaton for each input string. A nondeterministic finite automaton (NFA), or nondeterministic finite state machine, needn't obey these restrictions.
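The deterministic case can be sketched in a few lines of Python. The automaton below is a hypothetical example that is not taken from the slides: a DFA over the alphabet {a, b} that accepts strings ending in "ab". The state names q0, q1 and q2 are assumptions made for illustration. Because every (state, symbol) pair has exactly one successor, the same input string always produces the same run.

```python
# Hypothetical DFA over the alphabet {"a", "b"} accepting strings that end in "ab".
# The state names q0, q1, q2 are illustrative assumptions.
DFA_TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "a"): "q1",
    ("q2", "b"): "q0",
}
START_STATE = "q0"
ACCEPTING_STATES = {"q2"}


def dfa_run(string):
    """Return the unique sequence of states visited while reading the string."""
    state = START_STATE
    run = [state]
    for symbol in string:
        # Exactly one successor per (state, symbol): no choice, no randomness.
        state = DFA_TRANSITIONS[(state, symbol)]
        run.append(state)
    return run


def dfa_accepts(string):
    """Accept if the unique run ends in an accepting state."""
    return dfa_run(string)[-1] in ACCEPTING_STATES
```

An NFA would map a (state, symbol) pair to a set of possible successor states, so one input string could have several runs.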
Probability Theory
Probability theory is the branch of mathematics concerned with probability, i.e. the analysis of random (non-deterministic) phenomena.
Statistics
Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.
Probability Theory and Statistics
We use probability theory to build models of uncertainty and we can use statistics to ground these models in empirical data.
Probability, Event and Sample Space
Ex 1: we have a sample space consisting of words. An event in that sample space can be the set of NOUNS, i.e. all the words that belong to the category NOUN. One way of describing this subset is to say that the property PartOfSpeech has the value NOUN.
Ex 2: we have a sample space of sentences and we are interested in the length of these sentences. A relevant event would be the set of all sentences that contain exactly 8 words. Again, we can describe this set as the set of outcomes for which the variable "numberOfWords" takes the value 8.
Notation: ∈ = is an element of
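The two examples can be sketched in Python by treating the sample space as a collection and an event as a subset picked out by the value of a variable. The words, tags and sentences below are invented for illustration, and the `probability` helper assumes that all outcomes are equally likely, which the examples themselves do not state.

```python
from fractions import Fraction

# Toy sample space of words with a PartOfSpeech property.
# The words and tags are invented for illustration.
WORDS = {
    "dog": "NOUN",
    "house": "NOUN",
    "run": "VERB",
    "eat": "VERB",
    "red": "ADJ",
    "idea": "NOUN",
}

# Event: the subset of outcomes for which PartOfSpeech has the value NOUN.
nouns = {word for word, pos in WORDS.items() if pos == "NOUN"}

# Toy sample space of sentences (also invented).
SENTENCES = [
    "the dog runs",
    "we saw a dog",
    "the old house on the hill was quiet",
]


def number_of_words(sentence):
    """The variable numberOfWords: count whitespace-separated tokens."""
    return len(sentence.split())


# Event: the subset of outcomes for which numberOfWords takes the value 8.
eight_word_sentences = {s for s in SENTENCES if number_of_words(s) == 8}


def probability(event, sample_space):
    """Probability of an event, ASSUMING all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


print("dog" in nouns)              # membership test: "dog" is an element of the event
print(probability(nouns, WORDS))   # 3 nouns out of 6 words
```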
Formula & Calculations
If A is an event and x1 to xn are its individual outcomes, then the probability of A can be computed by summing the probabilities of the outcomes, because they are disjoint (mutually exclusive):
P(A) = Σ (i=1 to n) P(xi) = P(x1) + P(x2) + ... + P(xn)
Read as: sum from i=1 to n, or sum over all the elements of the set.
There are 26 ways of choosing the first letter, 26 ways of choosing the 2nd letter and 26 ways of choosing the third letter, i.e. 26*26*26 = 26^3 = 17576 strings.
But there are only 6 ways of choosing the first vowel…
Since we assume that all strings are equally possible, the probability of each string is simply 1 over the total number of strings.
In order to get the probability of the 3-vowel string, we can simply add the probabilities of the strings that contain exactly 3 vowels. So 6 to the power of 3 divided by 26 to the power of 3 gives us approximately 0.012.
Calculations: 6x6x6=216; 26x26x26=17576; 216/17576=0.01228949
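The calculation can be checked by brute force. The sketch below enumerates every three-letter string, keeps those made up of vowels only, and sums the outcome probabilities. It assumes that the six vowels are a, e, i, o, u and y, which is a guess consistent with the "6 ways" in the text rather than something the slides spell out.

```python
from fractions import Fraction
from itertools import product
from string import ascii_lowercase

# ASSUMPTION: the six vowels are a, e, i, o, u and y.
VOWELS = set("aeiouy")

# Sample space: all three-letter strings over the 26-letter alphabet.
sample_space = ["".join(letters) for letters in product(ascii_lowercase, repeat=3)]

# All strings are taken to be equally likely, so each has probability 1/26^3.
outcome_probability = Fraction(1, len(sample_space))

# Event A: strings that consist of three vowels.
event = [s for s in sample_space if all(ch in VOWELS for ch in s)]

# P(A) is the sum of the probabilities of the outcomes in A.
p_event = sum(outcome_probability for _ in event)

print(len(event), len(sample_space))   # 216 17576
print(round(float(p_event), 3))        # 0.012
```

Using `Fraction` keeps the arithmetic exact, so the sum equals 216/17576 without rounding error.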
In sum
The probability of an event is the SUM of the probabilities of its outcomes.
An event can be described as the set of outcomes for which a variable takes a given value.
Theorems
A theorem is a statement that has been proven on the basis of previously established statements, such as axioms.
Addition Rule: A method for finding the probability that either or both of two events occur
In other words:
If events A and B are mutually exclusive (disjoint), then: P(A or B) = P(A) + P(B)
Otherwise: P(A or B) = P(A) + P(B) - P(A and B)
Say that A is the set of people who have glasses and B is the set of people who are blond. We are interested in the set of people who are blond OR have glasses. If we simply add the probabilities of the two simple events, we count blond people with glasses twice. Therefore, in order to get the correct probability, we have to subtract the probability of the intersection, P(A and B).
Think of the axiom about disjoint events as a special case where the intersection is empty. Therefore it is not added in the first place, and it does not have to be subtracted.
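The addition rule can be illustrated with sets. The population below is hypothetical: ten numbered people with invented traits, each assumed equally likely to be picked. The sketch shows that subtracting the intersection gives the probability of the union, and that nothing needs to be subtracted when the two events are disjoint.

```python
from fractions import Fraction

# Hypothetical population of ten people, identified by number; traits are invented.
PEOPLE = set(range(10))
glasses = {0, 1, 2, 3}      # event A: people who have glasses
blond = {2, 3, 4, 5, 6}     # event B: people who are blond
bald = {7, 8}               # invented event that is disjoint from blond


def p(event):
    """Probability of an event, ASSUMING each person is equally likely to be picked."""
    return Fraction(len(event), len(PEOPLE))


# General addition rule: subtract the intersection, which was counted twice.
p_union = p(glasses) + p(blond) - p(glasses & blond)

# Disjoint special case: the intersection is empty, so nothing is subtracted.
p_disjoint_union = p(bald) + p(blond)

print(p(glasses) + p(blond))   # 9/10, too large: people 2 and 3 are counted twice
print(p_union)                 # 7/10, equal to p(glasses | blond)
print(p_disjoint_union)        # 7/10, equal to p(bald | blond)
```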
Quiz 1: Solution
1. 0.01 - incorrect. The probability of an event and its complement must sum to 1.
2. 0.99 - correct. The complement of A has probability 1 - P(A).
3. Impossible to tell - incorrect. The complement of A must have probability 1 - P(A).
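The complement rule used in these answers follows from the axiom about disjoint events. The derivation below is a sketch: the symbols Ω (the sample space) and A^c (the complement of A) are notation introduced here, and P(A) = 0.01 is inferred from the answer options rather than quoted from the quiz question.

```latex
\begin{align*}
A \cup A^{c} &= \Omega, \qquad A \cap A^{c} = \emptyset \\
P(A) + P(A^{c}) &= P(A \cup A^{c}) = P(\Omega) = 1 \\
P(A^{c}) &= 1 - P(A) = 1 - 0.01 = 0.99
\end{align*}
```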
Quiz 2: Solutions
1. P(A or B) < P(A and B) - incorrect. Since the union includes the intersection, it can never have lower probability.
2. P(A or B) = P(A and B) - correct. This is possible as a limiting case, for example, when A = B.
3. P(A or B) > P(A and B) - correct. This holds as soon as there is some outcome with a positive probability in A or B that is not in the intersection.
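These three answers can be checked exhaustively on a small sample space. The sketch below assumes one cast of a fair six-sided die (an assumption chosen for illustration, not part of the quiz), builds all 64 possible events, and records which relation between P(A or B) and P(A and B) occurs over every pair of events.

```python
from fractions import Fraction
from itertools import combinations

# ASSUMED sample space: one cast of a fair six-sided die.
OUTCOMES = (1, 2, 3, 4, 5, 6)


def all_events(outcomes):
    """Yield every subset of the sample space as a frozenset."""
    for size in range(len(outcomes) + 1):
        for subset in combinations(outcomes, size):
            yield frozenset(subset)


def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(OUTCOMES))


EVENTS = list(all_events(OUTCOMES))

# Collect which of the three relations actually occur over all pairs (A, B).
relations = set()
for a in EVENTS:
    for b in EVENTS:
        union, intersection = p(a | b), p(a & b)
        if union < intersection:
            relations.add("<")
        elif union == intersection:
            relations.add("=")
        else:
            relations.add(">")

print(sorted(relations))   # "<" never occurs
```

In this setting, where every outcome has positive probability, equality occurs exactly when A = B.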
Practical Activity
We have a regular die. We cast the die twice and we get a two and a four. Therefore, A = {2,4}. Calculate:
1. The probability of the event A = {2,4}
2. The probability that the first number is a 6
3. The probability that the second number is a 5 or a 6
4. The probability that the first and the second number are the same
5. The probability that the first number is an odd number
6. The probability that the first and the second number are both odd numbers
Practical Activity: Solutions
1. The probability of the event A = {2,4} [1/36 ≈ 0.03 if the order is fixed (a two first, then a four); 2/36 ≈ 0.06 if either order counts]
2. The probability that the first number is a 6 [1/6 ≈ 0.17]
3. The probability that the second number is a 5 or a 6 [1/3 ≈ 0.33]
4. The probability that the first and the second number are the same [1/6 ≈ 0.17]
5. The probability that the first number is an odd number [1/2 = 0.5]
6. The probability that the first and the second number are both odd numbers [1/4 = 0.25]
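The six answers can be verified by enumerating the 36 equally likely ordered outcomes of two casts. This is a sketch under the assumption of a fair die. Item 1 is computed under both possible readings of "a two and a four" (fixed order and either order), since the exercise does not say which is meant. The helper `p` and the dictionary keys are names chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first cast, second cast) of a fair die.
SAMPLE_SPACE = list(product(range(1, 7), repeat=2))


def p(condition):
    """Probability of the event defined by a predicate over (first, second)."""
    favourable = [pair for pair in SAMPLE_SPACE if condition(*pair)]
    return Fraction(len(favourable), len(SAMPLE_SPACE))


answers = {
    "1 (2 then 4)": p(lambda first, second: (first, second) == (2, 4)),
    "1 (2 and 4, either order)": p(lambda first, second: {first, second} == {2, 4}),
    "2": p(lambda first, second: first == 6),
    "3": p(lambda first, second: second in (5, 6)),
    "4": p(lambda first, second: first == second),
    "5": p(lambda first, second: first % 2 == 1),
    "6": p(lambda first, second: first % 2 == 1 and second % 2 == 1),
}

# Print each answer as an exact fraction and as a rounded decimal.
for question, value in answers.items():
    print(f"{question}: {value} = {float(value):.3f}")
```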