Dirichlet Processes
Machine Learning: Jordan Boyd-Graber
University of Maryland
INTRODUCTION
Machine Learning: Jordan Boyd-Graber | UMD Dirichlet Processes | 1 / 1
Content Questions
Administrivia
• Feedback on projects
• Work on first deliverable
DPMM
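• Don’t know how many clusters there are
• Gibbs sampling: resample the cluster assignment of one point, conditioned on the assignments of all other points
• Convergence harder to detect
• Equation
\[
p(z_i = k \mid \vec{z}_{-i}, \vec{x}, \{\theta_k\}, \alpha) \propto
\begin{cases}
\left(\dfrac{n_k}{n_\cdot + \alpha}\right) \mathcal{N}\left(x, \dfrac{n \bar{x}}{n+1}, 1\right) & \text{existing} \\[1ex]
\dfrac{\alpha}{n_\cdot + \alpha}\, \mathcal{N}(x, 0, 1) & \text{new}
\end{cases}
\tag{1}
\]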
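The prior part of Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `crp_prior_weights` and the dictionary layout are assumptions, and the counts are chosen to match the worked example later in the slides (six other points in six singleton clusters, α = 0.25).

```python
def crp_prior_weights(cluster_counts, alpha):
    """Prior weight of each existing cluster and of a new cluster.

    cluster_counts maps a cluster id to the number of *other* points in it;
    the point being resampled is excluded. (Hypothetical helper.)
    """
    n_total = sum(cluster_counts.values())  # n_. in Equation (1)
    denom = n_total + alpha
    weights = {k: n_k / denom for k, n_k in cluster_counts.items()}
    weights["new"] = alpha / denom
    return weights


# Six other points, each alone in its own cluster, with alpha = 0.25.
weights = crp_prior_weights({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}, alpha=0.25)
for label, w in weights.items():
    print(label, round(w, 2))
```

Each existing cluster gets weight 1 / 6.25 and the new cluster gets 0.25 / 6.25, so the weights sum to one before the likelihood term is applied.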
Simplification
We’ll assume that:
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\tag{2}
\]
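A minimal Python sketch of Equation (2) may make the shrinkage term concrete. The function name `simplified_likelihood` is an assumption introduced here. With n = 0 the shrunk mean is the origin, which matches the zero-mean "new cluster" case of Equation (1).

```python
import math


def simplified_likelihood(x, cluster_mean, n):
    """exp(-Euclidean distance between x and the shrunk mean n/(n+1) * cluster_mean)."""
    shrink = n / (n + 1)
    d1 = x[0] - shrink * cluster_mean[0]
    d2 = x[1] - shrink * cluster_mean[1]
    return math.exp(-math.sqrt(d1 * d1 + d2 * d2))


# Point (8, 9) scored against a cluster holding the single point (10, 10):
# the shrunk mean is (5, 5), the distance is 5, so the value is exp(-5).
value = simplified_likelihood((8.0, 9.0), (10.0, 10.0), n=1)
print(f"{value:.5f}")
```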
Data
[Figure: scatter plot of the data points; axes span roughly -10 to 11.]
Sampling point 0
Compute the (proportional) probability of assigning data point 0 to a new cluster and to cluster 1.
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10
1     8     9     1
2     7     6     2
3    -9   -10     3
4    -5   -10     4
5    -7    -7     5
6     1     1     6
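Sampling point 0
• There are currently 6 clusters
\begin{align}
p(z_0 = \text{new} \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00000 \tag{3} \\
p(z_0 = 1 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 4.00 \\ 4.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00029 \tag{4} \\
p(z_0 = 2 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 3.50 \\ 3.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00007 \tag{5} \\
p(z_0 = 3 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -4.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{6} \\
p(z_0 = 4 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -2.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{7} \\
p(z_0 = 5 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -3.50 \\ -3.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{8} \\
p(z_0 = 6 \mid \vec{z}_{-0}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 10.00 \\ 10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{9}
\end{align}
• After normalization: {new: 0.00, 1: 0.80, 2: 0.19, 3: 0.00, 4: 0.00, 5: 0.00, 6: 0.00}
• New assignment = 1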
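The numbers for point 0 can be reproduced with a short script. This is a sketch under stated assumptions: the helper names `shrunk_mean` and `likelihood` are introduced here, the data are the seven points from the table, and the likelihood is the simplified exponential form the slides assume rather than a normalized Gaussian density.

```python
import math

ALPHA = 0.25
x = (10, 10)  # point 0, the one being resampled
# Clusters formed by the other six points: cluster id -> members.
clusters = {
    1: [(8, 9)],
    2: [(7, 6)],
    3: [(-9, -10)],
    4: [(-5, -10)],
    5: [(-7, -7)],
    6: [(1, 1)],
}


def shrunk_mean(members):
    """n/(n+1) times the cluster mean, i.e. sum(members)/(n+1); (0, 0) when empty."""
    n = len(members)
    return tuple(sum(p[d] for p in members) / (n + 1) for d in range(2))


def likelihood(point, members):
    m = shrunk_mean(members)
    return math.exp(-math.hypot(point[0] - m[0], point[1] - m[1]))


n_total = sum(len(members) for members in clusters.values())  # 6 other points
scores = {"new": ALPHA / (n_total + ALPHA) * likelihood(x, [])}
for k, members in clusters.items():
    scores[k] = len(members) / (n_total + ALPHA) * likelihood(x, members)

total = sum(scores.values())
posterior = {k: s / total for k, s in scores.items()}
for k, p in posterior.items():
    print(k, f"{p:.2f}")
```

Almost all of the mass lands on clusters 1 and 2 because their shrunk means are the closest to (10, 10); every other term is several orders of magnitude smaller.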
Assignments after sampling point 0
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 1
Compute the (proportional) probability of assigning data point 1 to clusters 1 and 2.
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9
2     7     6     2
3    -9   -10     3
4    -5   -10     4
5    -7    -7     5
6     1     1     6
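Sampling point 1
• There are currently 6 clusters
\begin{align}
p(z_1 = \text{new} \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00001 \tag{10} \\
p(z_1 = 1 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 5.00 \\ 5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00674 \tag{11} \\
p(z_1 = 2 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 3.50 \\ 3.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00055 \tag{12} \\
p(z_1 = 3 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -4.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{13} \\
p(z_1 = 4 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -2.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{14} \\
p(z_1 = 5 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -3.50 \\ -3.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{15} \\
p(z_1 = 6 \mid \vec{z}_{-1}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 8.00 \\ 9.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00001 \tag{16}
\end{align}
• After normalization: {new: 0.00, 1: 0.92, 2: 0.08, 3: 0.00, 4: 0.00, 5: 0.00, 6: 0.00}
• New assignment = 1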
Assignments after sampling point 1
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 2
Compute the (proportional) probability of assigning data point 2 to cluster 1 (but nothing else; there won’t be other options).
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9     1
2     7     6
3    -9   -10     3
4    -5   -10     4
5    -7    -7     5
6     1     1     6
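Sampling point 2
• There are currently 5 clusters
\begin{align}
p(z_2 = \text{new} \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00010 \tag{17} \\
p(z_2 = 1 \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{2.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 6.00 \\ 6.33 \end{bmatrix}, 1 \right) = 0.32 \times 0.34851 \tag{18} \\
p(z_2 = 3 \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -4.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{19} \\
p(z_2 = 4 \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -2.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{20} \\
p(z_2 = 5 \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -3.50 \\ -3.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{21} \\
p(z_2 = 6 \mid \vec{z}_{-2}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 7.00 \\ 6.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00020 \tag{22}
\end{align}
• After normalization: {new: 0.00, 1: 1.00, 3: 0.00, 4: 0.00, 5: 0.00, 6: 0.00}
• New assignment = 1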
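As a worked check of the cluster 1 term for point 2, the arithmetic can be unpacked as follows. Cluster 1 holds points 0 and 1 at this stage, so n = 2. The intermediate mean (9, 9.5) is computed here from the data table and does not appear on the slide.

```latex
\begin{align*}
\bar{x} &= \tfrac{1}{2}\big[(10, 10) + (8, 9)\big] = (9,\ 9.5) \\
\tfrac{n}{n+1}\,\bar{x} &= \tfrac{2}{3}\,(9,\ 9.5) = \left(6,\ \tfrac{19}{3}\right) \approx (6.00,\ 6.33) \\
p(x \mid \bar{x}) &\propto \exp\left\{-\sqrt{(7 - 6)^2 + \left(6 - \tfrac{19}{3}\right)^2}\right\}
  = \exp\left\{-\sqrt{1 + \tfrac{1}{9}}\right\} \approx \exp(-1.054) \approx 0.3485 \\
\frac{n_k}{n_\cdot + \alpha} &= \frac{2}{6 + 0.25} = 0.32
\end{align*}
```

The product 0.32 × 0.3485 is roughly 0.11, which dwarfs every other term in the list and explains why the normalized probability of cluster 1 rounds to 1.00.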
Assignments after sampling point 2
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 3
Compute the (proportional) probability of assigning data point 3 to clusters 4 and 5.
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9     1
2     7     6     1
3    -9   -10
4    -5   -10     4
5    -7    -7     5
6     1     1     6
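Sampling point 3
• There are currently 4 clusters
\begin{align}
p(z_3 = \text{new} \mid \vec{z}_{-3}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -9.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00000 \tag{23} \\
p(z_3 = 1 \mid \vec{z}_{-3}, \vec{x}, \alpha) &\propto \frac{3.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -9.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 6.25 \\ 6.25 \end{bmatrix}, 1 \right) = 0.48 \times 0.00000 \tag{24} \\
p(z_3 = 4 \mid \vec{z}_{-3}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -9.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -2.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00027 \tag{25} \\
p(z_3 = 5 \mid \vec{z}_{-3}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -9.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -3.50 \\ -3.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00020 \tag{26} \\
p(z_3 = 6 \mid \vec{z}_{-3}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -9.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00000 \tag{27}
\end{align}
• After normalization: {new: 0.00, 1: 0.00, 4: 0.58, 5: 0.42, 6: 0.00}
• New assignment = 4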
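Point 3 is the first step where the outcome is genuinely uncertain. A Gibbs sampler draws the new assignment from the normalized distribution rather than taking the largest entry, so cluster 4 is one possible outcome of that draw. The sketch below uses Python's standard `random.choices` to illustrate this; the seed and the number of draws are arbitrary choices made here, not values from the slides.

```python
import random

# Normalized probabilities for point 3, copied from the slide.
posterior = {"new": 0.00, 1: 0.00, 4: 0.58, 5: 0.42, 6: 0.00}

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
labels = list(posterior)
draws = rng.choices(labels, weights=list(posterior.values()), k=10_000)

share_4 = draws.count(4) / len(draws)
share_5 = draws.count(5) / len(draws)
print(f"cluster 4: {share_4:.2f}, cluster 5: {share_5:.2f}")
```

Over many draws the shares approach 0.58 and 0.42, so a different run of the sampler could just as plausibly have moved point 3 into cluster 5.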
Assignments after sampling point 3
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 4
Compute the (proportional) probability of assigning data point 4 to clusters 4 and 5.
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9     1
2     7     6     1
3    -9   -10     4
4    -5   -10
5    -7    -7     5
6     1     1     6
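Sampling point 4
• There are currently 4 clusters
\begin{align}
p(z_4 = \text{new} \mid \vec{z}_{-4}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -5.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00001 \tag{28} \\
p(z_4 = 1 \mid \vec{z}_{-4}, \vec{x}, \alpha) &\propto \frac{3.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -5.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 6.25 \\ 6.25 \end{bmatrix}, 1 \right) = 0.48 \times 0.00000 \tag{29} \\
p(z_4 = 4 \mid \vec{z}_{-4}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -5.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -4.50 \\ -5.00 \end{bmatrix}, 1 \right) = 0.16 \times 0.00657 \tag{30} \\
p(z_4 = 5 \mid \vec{z}_{-4}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -5.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -3.50 \\ -3.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00127 \tag{31} \\
p(z_4 = 6 \mid \vec{z}_{-4}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -5.00 \\ -10.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00001 \tag{32}
\end{align}
• After normalization: {new: 0.00, 1: 0.00, 4: 0.84, 5: 0.16, 6: 0.00}
• New assignment = 4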
Assignments after sampling point 4
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 5
Compute the (proportional) probability of assigning data point 5 to cluster 4 (but nothing else is viable).
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9     1
2     7     6     1
3    -9   -10     4
4    -5   -10     4
5    -7    -7
6     1     1     6
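Sampling point 5
• There are currently 3 clusters
\begin{align}
p(z_5 = \text{new} \mid \vec{z}_{-5}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -7.00 \\ -7.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.00005 \tag{33} \\
p(z_5 = 1 \mid \vec{z}_{-5}, \vec{x}, \alpha) &\propto \frac{3.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -7.00 \\ -7.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 6.25 \\ 6.25 \end{bmatrix}, 1 \right) = 0.48 \times 0.00000 \tag{34} \\
p(z_5 = 4 \mid \vec{z}_{-5}, \vec{x}, \alpha) &\propto \frac{2.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -7.00 \\ -7.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -4.67 \\ -6.67 \end{bmatrix}, 1 \right) = 0.32 \times 0.09470 \tag{35} \\
p(z_5 = 6 \mid \vec{z}_{-5}, \vec{x}, \alpha) &\propto \frac{1.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} -7.00 \\ -7.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.50 \\ 0.50 \end{bmatrix}, 1 \right) = 0.16 \times 0.00002 \tag{36}
\end{align}
• After normalization: {new: 0.00, 1: 0.00, 4: 1.00, 6: 0.00}
• New assignment = 4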
Assignments after sampling point 5
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
Sampling point 6
Compute the (proportional) probability of assigning data point 6 to a new cluster and to cluster 1.
Recall that α = 0.25 and
\[
p(x \mid \bar{x}) \propto \exp\left\{ -\sqrt{ \left(x_1 - \frac{n}{n+1}\bar{x}_1\right)^2 + \left(x_2 - \frac{n}{n+1}\bar{x}_2\right)^2 } \right\}
\]
i    x1    x2    zi
0    10    10     1
1     8     9     1
2     7     6     1
3    -9   -10     4
4    -5   -10     4
5    -7    -7     4
6     1     1
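Sampling point 6
• There are currently 2 clusters
\begin{align}
p(z_6 = \text{new} \mid \vec{z}_{-6}, \vec{x}, \alpha) &\propto \frac{0.25}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 1.00 \\ 1.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 0.00 \\ 0.00 \end{bmatrix}, 1 \right) = 0.04 \times 0.24312 \tag{37} \\
p(z_6 = 1 \mid \vec{z}_{-6}, \vec{x}, \alpha) &\propto \frac{3.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 1.00 \\ 1.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} 6.25 \\ 6.25 \end{bmatrix}, 1 \right) = 0.48 \times 0.00060 \tag{38} \\
p(z_6 = 4 \mid \vec{z}_{-6}, \vec{x}, \alpha) &\propto \frac{3.00}{6 + 0.25}\, \mathcal{N}\left( \begin{bmatrix} 1.00 \\ 1.00 \end{bmatrix} \,\middle|\, \begin{bmatrix} -5.25 \\ -6.75 \end{bmatrix}, 1 \right) = 0.48 \times 0.00005 \tag{39}
\end{align}
• After normalization: {new: 0.97, 1: 0.03, 4: 0.00}
• New assignment = 0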
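The normalization step for point 6 can be written out from the rounded values on the slide. Z is a symbol introduced here for the normalizing constant; it does not appear in the original.

```latex
\begin{align*}
Z &= 0.04 \times 0.24312 + 0.48 \times 0.00060 + 0.48 \times 0.00005 \\
  &= 0.0097248 + 0.000288 + 0.000024 = 0.0100368 \\
p(z_6 = \text{new} \mid \cdot) &= 0.0097248 / Z \approx 0.97 \\
p(z_6 = 1 \mid \cdot) &= 0.000288 / Z \approx 0.03 \\
p(z_6 = 4 \mid \cdot) &= 0.000024 / Z \approx 0.00
\end{align*}
```

Although the new-cluster prior weight (0.04) is twelve times smaller than that of either existing cluster (0.48), the point (1, 1) sits so close to the origin that the likelihood term dominates.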
Assignments after sampling point 6
[Figure: scatter plot of the data with the current cluster assignments; axes span roughly -10 to 11.]
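The whole sweep over points 0 to 6 can be sketched as one loop. Several things here are assumptions rather than the lecture's code: the function names, the use of the string "new" as the label for a freshly created cluster (the slides report it as 0), and, most importantly, taking the most probable cluster at each step instead of drawing a random sample, which is done only so that the run is deterministic and follows the sequence of assignments shown on the slides. The common factor 1/(n. + α) is dropped because it cancels in the normalization.

```python
import math

ALPHA = 0.25
points = [(10, 10), (8, 9), (7, 6), (-9, -10), (-5, -10), (-7, -7), (1, 1)]
# Initial state from the first table: point 0 unassigned, points 1-6 in clusters 1-6.
z = [None, 1, 2, 3, 4, 5, 6]


def likelihood(x, members):
    """Simplified likelihood: exp(-distance to the shrunk mean sum(members)/(n+1))."""
    n = len(members)
    m = [sum(p[d] for p in members) / (n + 1) for d in range(2)]
    return math.exp(-math.hypot(x[0] - m[0], x[1] - m[1]))


def conditional(i):
    """Normalized probabilities for point i given all other assignments."""
    others = {}
    for j, k in enumerate(z):
        if j != i and k is not None:
            others.setdefault(k, []).append(points[j])
    scores = {"new": ALPHA * likelihood(points[i], [])}
    for k, members in others.items():
        scores[k] = len(members) * likelihood(points[i], members)
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


for i in range(len(points)):
    post = conditional(i)
    z[i] = max(post, key=post.get)  # assumption: mode instead of a random draw
    print(i, z[i], {k: round(p, 2) for k, p in post.items()})
```

A real Gibbs sampler would replace the `max` call with a random draw from `post` and would repeat the sweep many times, so individual runs can differ from the path traced on the slides.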