NPYLM for people who do not care about the … and just want to build it. Denso IT Laboratory @uchumik

DSIRNLP06 Nested Pitman-Yor Language Model


DESCRIPTION

An explanation of NPYLM for implementers


  • 1. NPYLM @uchumik

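  • 2. Self-introduction: @uchumik. DSIRNLP2 (DSIRNLP02 LT). AI, Lua, ROAI.
  • 3. Existing morphological analyzers: MeCab, ChaSen, Juman, KyTea, WebAPI.
  • 4. Yes.
  • 8. Nested Pitman-Yor Language Model [Daichi Mochihashi, ACL2009]. 2009. HPYLM, VPYLM. Teh.
  • 9. Outline: NPYLM, n-gram, CRP, PYP, HPYLM, VPYLM.
  • 10. NPYLM: Nested Pitman-Yor Language Model, built from two n-gram models.
  • 11. n-gram language model. A sentence s = {a b c d e} consists of the words {a, b, c, d, e}, and P(s) is its probability: $P(s) = P(a)P(b \mid a)P(c \mid a, b)P(d \mid a, b, c)P(e \mid a, b, c, d)$. An n-gram model keeps only the previous n-1 words, e.g. n=2 (bigram model): $P(s) = \prod_{i=1}^{l} P(w_i \mid w_1, \ldots, w_{i-1}) \approx \prod_{i=1}^{l} P(w_i \mid w_{i-1})$.
  • 12. The bigram probability $P(w_i \mid w_{i-1})$ is the n-gram probability that NPYLM also needs. Estimated from counts: $P(w_i \mid w_{i-1}) = \frac{c(w_{i-1} w_i)}{\sum_{w_i} c(w_{i-1} w_i)}$.
  • 13. [Figure: segmentation lattice from B (sentence start) to E (sentence end); the edges into E carry the probabilities P(E|·).]
  • 14. Sampling the edge into E. Two candidates with P(E|·)=0.2 and P(E|·)=0.8, so 1.0=0.2+0.8. (1) Draw a random number r between 0 and 1.0. (2) Return the first i at which the running sum of t[i] reaches r, where t is the table of candidate probabilities P(E|·).
  • 15. [Figure: character positions 0 to 9; a word of length 2 that ends at position 8 follows a word that ends at position 8-2.]
  • 16. Forward variable, for end position t and word length k, summing over the previous word length j: $\alpha[t][k] = \sum_{j} P(c_{t-k+1}^{t} \mid c_{t-k-j+1}^{t-k})\, \alpha[t-k][j]$. Example: $\alpha[8][2] = P(c_{7}^{8} \mid c_{5}^{6})\, \alpha[6][2] + P(c_{7}^{8} \mid c_{6}^{6})\, \alpha[6][1]$.
  • 17. Forward-filtering, then Backward-sampling from E back to B.
  • 18. NPYLM training. (1) Treat each sentence as one word and add it to the word n-gram and the character n-gram. (2) Pick one sentence. (3) Remove that sentence's words from the word n-gram and the character n-gram. (4) Sample a new segmentation of the sentence. (5) Add the new words to NPYLM.
  • 20. $P(w_i \mid w_{i-1})$
  • 21. HPYLM, VPYLM
  • 22. NPYLM nests two n-gram models. HPYLM: Hierarchical Pitman-Yor Language Model (fixed-order n-gram). VPYLM: Variable-order Pitman-Yor Language Model. Nested Pitman-Yor.
  • 23. The zero-frequency problem of count-based n-grams: for a pair (u, w) that never occurred, $P(w \mid u) = \frac{c(u, w)}{\sum_{w'} c(u, w')} = 0$, and a single zero factor makes the whole product $P(\cdot \mid \cdot)P(\cdot \mid \cdot)P(\cdot \mid \cdot)$ equal to 0.
  • 25. n-gram contexts (words are stored as IDs): bigram, trigram, 4-gram.
  • 26. Chinese Restaurant Process (CRP). With c customers seated at tables of 3, 2 and 1 customers, a new customer joins them with probability $\frac{3}{\theta + c}$, $\frac{2}{\theta + c}$, $\frac{1}{\theta + c}$ and opens a new table with probability $\frac{\theta}{\theta + c}$. Hence $P(w \mid \cdot) = \frac{c_w}{\theta + c} + \frac{\theta}{\theta + c} G_0(w)$.
  • 27. Pitman-Yor Process: CRP (Chinese Restaurant Process) with a discount d; d=0 gives back the CRP. With tables of 1, 2 and 3 customers the probabilities become $\frac{1 - d}{\theta + c}$, $\frac{2 - d}{\theta + c}$, $\frac{3 - d}{\theta + c}$, and $\frac{\theta + 3d}{\theta + c}$ for a new table, where 3d = d times the number of tables.
  • 28. Pitman-Yor Process (PYP).
  • 29. n-gram hierarchy. $G_0 = \frac{1}{V}$ is the base measure of the unigram; the unigram is the base measure of the bigram, and the bigram that of the trigram ($G_1$, $G_2$).
  • 30. $P(w \mid context) = \frac{c_{w \mid context} - d}{\theta + c_\cdot} + \frac{\theta + 3d}{\theta + c_\cdot} G_2(w)$. Each PYP in the n-gram hierarchy ($G_1$, $G_2$, with $G_0 = \frac{1}{V}$) uses the table probabilities $\frac{1 - d}{\theta + c}$, $\frac{2 - d}{\theta + c}$, $\frac{3 - d}{\theta + c}$, $\frac{\theta + 3d}{\theta + c}$: one d is subtracted per occupied table.
  • 31. General n-gram probability, backing off towards $G_0$: $P(w \mid context) = \frac{c_w - d\, t_w}{\theta + c_\cdot} + \frac{\theta + d\, t_\cdot}{\theta + c_\cdot} P(w \mid \pi(context))$, where $\pi(context)$ is the context shortened by 1 word. This is HPYLM.
  • 32. Nested: instead of $G_0 = 1/V$, the $G_0$ of the word n-gram HPYLM is itself a character n-gram HPYLM.
  • 34. HPYLM
  • 35. Context Tree: the n-gram contexts are stored as a tree (figure: trigram).
  • 36. HPYLM with n=3 (trigram): a customer sits at an existing table k with weight $c_k - d$ and at a new table with weight $(\theta + d\, t_\cdot) P(w \mid \pi(u))$.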
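As a minimal sketch of the back-off probability on slide 31, the C++ below follows one Context Tree path from a context node up to G_0 = 1/V. The names `Restaurant`, `PYParams` and `pyp_probability` are illustrative and do not come from the slides. The sketch assumes θ > 0 and one (d, θ) pair per context length, and it recounts customers and tables on every call instead of caching them in `C` and `T` as the node class on slide 38 does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical restaurant for one context u. For each word ID it stores the
// customer count of every table serving that word.
struct Restaurant {
    std::map<int, std::vector<int>> arrangement;
    const Restaurant* parent = nullptr;  // shorter context pi(u); nullptr at root

    int customers(int w) const {  // c_w
        auto it = arrangement.find(w);
        if (it == arrangement.end()) return 0;
        int c = 0;
        for (int n : it->second) c += n;
        return c;
    }
    int tables(int w) const {  // t_w
        auto it = arrangement.find(w);
        return it == arrangement.end() ? 0 : static_cast<int>(it->second.size());
    }
    int total_customers() const {  // c_.
        int c = 0;
        for (const auto& kv : arrangement)
            for (int n : kv.second) c += n;
        return c;
    }
    int total_tables() const {  // t_.
        int t = 0;
        for (const auto& kv : arrangement) t += static_cast<int>(kv.second.size());
        return t;
    }
};

// Discount d and strength theta for one context length (assumed layout).
struct PYParams {
    double d;
    double theta;
};

// P(w|u) = (c_w - d t_w)/(theta + c.) + (theta + d t.)/(theta + c.) * P(w|pi(u)),
// with P(w | parent of root) = G_0 = 1/V.
double pyp_probability(const Restaurant* r, int w,
                       const std::vector<PYParams>& params, int vocab_size) {
    assert(vocab_size > 0);
    if (r == nullptr) return 1.0 / vocab_size;  // base measure G_0

    // Context length |u| = number of ancestors of this node.
    std::size_t depth = 0;
    for (const Restaurant* p = r->parent; p != nullptr; p = p->parent) ++depth;
    assert(depth < params.size());

    const double d = params[depth].d;
    const double theta = params[depth].theta;
    assert(theta > 0.0 && d >= 0.0 && d < 1.0);

    const double parent_p = pyp_probability(r->parent, w, params, vocab_size);
    const double denom = theta + r->total_customers();
    return (r->customers(w) - d * r->tables(w)) / denom +
           (theta + d * r->total_tables()) / denom * parent_p;
}
```

For a root restaurant with tables {2, 1} for word 0 and {1} for word 1, d = 0.5, θ = 1 and V = 4, this gives P(word 0) = 0.525, and the probabilities of the four words sum to 1. An empty child restaurant returns exactly its parent's probability, which is the back-off behaviour that removes the zero-frequency problem of slide 23.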
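  • 37. HPYLM training. Text = {w_1, w_2, ..., w_T}.
      for i = 1 to Convergence
        for t = 1 to T
          read {context: u_t, word: w_t}    (u_t = w_{t-1}, ..., w_{t-n+1})
          if i > 1 then remove_customer(u_t, w_t)
          add_customer(u_t, w_t)
    add_customer: seats w_t in the restaurant of context u_t. remove_customer: removes w_t from the restaurant of context u_t.
  • 38. Data structure (words are IDs; customer: w_t, context: u_t).
      typedef vector<int> tables;          // customers per table (element type assumed)
      class node {
        map<int, node*> restaurants;       // child nodes by word ID (types assumed)
        map<int, tables> arrangement;      // seating arrangement by word ID (types assumed)
        int T;                             // number of tables
        int C;                             // number of customers
        node *p;                           // parent node
      };
    [Figure: a root node holding the entry "5: 0", where 5 is a word ID.]
  • 39. AddCustomer, addCust(context u, word w), customer: w_t, context: u_t. Descend the Context Tree from root along the context (figure: context "1 3", root to 3, then 3 to "3 1").
  • 40. AddCustomer, addCust(context u, word w). At the node reached, seat w. For each existing table k serving w, with count $c_{wk}$: weight $(c_{wk} - d_{|u|})$, then c_wk++. For a new table: weight $(\theta_{|u|} + d_{|u|} t_\cdot) P(w \mid \pi(u))$, then c_wk_new = 1, t_w++, and add_customer(π(u), w) sends a customer to the parent.
  • 41. AddCustomer, addCust(context u, word w). The same rule applies at every node up to root: the customer w in context u sits at an existing table k with weight $(c_{wk} - d_{|u|})$, or at a new table with weight $(\theta_{|u|} + d_{|u|} t_\cdot) P(w \mid \pi(u))$, which triggers add_customer(π(u), w).
  • 42. RemoveCustomer, remCust(context u, word w), customer: w_t, context: u_t. In the node for u, pick a table k serving w and decrease $c_{wk}$ by 1 (figure: tables "0: 2" and "1: 1"). When that table becomes empty, delete it and call remCust(π(u), w).
  • 43. NPYLM: the $G_0$ of the word n-gram is a character n-gram HPYLM.
  • 44. Choosing the n-gram order: 2-gram or 3-gram? HPYLM needs n to be fixed.
  • 45. Variable-order PYLM: the order n is a variable of each n-gram.
  • 46. $q_i$: probability of stopping at node i; $1 - q_i$: probability of passing through it. Going down from root: stop at root with $q_0$, pass with $1 - q_0$; stop at depth 1 with $q_1 (1 - q_0)$; stop at depth 2 with $q_2 (1 - q_0)(1 - q_1)$. In general $P(n = l \mid context) = q_l \prod_{i=0}^{l-1} (1 - q_i)$.
  • 47. $q_i \sim \mathrm{Be}(\alpha, \beta)$, with $\mathrm{Be}(\alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} q^{\alpha - 1} (1 - q)^{\beta - 1}$ and $E[q] = \frac{\alpha}{\alpha + \beta}$.
  • 48. Data structure with stop/pass counts.
      typedef vector<int> tables;          // customers per table (element type assumed)
      class node {
        map<int, node*> restaurants;       // child nodes by word ID (types assumed)
        map<int, tables> arrangement;      // seating arrangement by word ID (types assumed)
        int T;                             // number of tables
        int C;                             // number of customers
        node *p;                           // parent node
        int a;                             // number of stops at this node
        int b;                             // number of passes through this node
      };
    $a_i$: number of times a customer stopped at node i; $b_i$: number of times a customer passed through node i. $E[q_i] = \frac{a_i + \alpha}{a_i + b_i + \alpha + \beta}$.
  • 49. The order n is the context length: n = 0 for |u| = 0, n = 1 for |u| = 1, n = 2 for |u| = 2. $P(n = l \mid \cdot) = \frac{a_l + \alpha}{a_l + b_l + \alpha + \beta} \prod_{i=0}^{l-1} \frac{b_i + \beta}{a_i + b_i + \alpha + \beta}$. Each customer is a pair (w_t, n_t). VPYLM probability for order n: $P(w \mid u, \cdot, n) = \frac{c_w - d_n t_w}{\theta_n + c_\cdot} + \frac{\theta_n + d_n t_\cdot}{\theta_n + c_\cdot} P(w \mid \pi(u))$.
  • 50. VPYLM training. Text = {w_1, w_2, ..., w_T, n_1, n_2, ..., n_T}.
      for i = 1 to Convergence
        for t = 1 to T
          read {context: u_t, word: w_t, order: n_t}
          if i > 1 then remove_customer(u_t, w_t, n_t)
          n_t = sample_order(u_t, w_t)
          add_customer(u_t, w_t, n_t)
  • 51. sample_order(u_t, w_t), called after remove_customer on the VPYLM. For each candidate order 0, 1, 2 compute a weight t[i] using $P(n = l \mid \cdot) = \frac{a_l + \alpha}{a_l + b_l + \alpha + \beta} \prod_{i=0}^{l-1} \frac{b_i + \beta}{a_i + b_i + \alpha + \beta}$, and let $z = \sum_i t[i]$. (1) Draw a random number r between 0 and z. (2)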
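Return the first i at which the running sum of t[i] reaches r. The weights follow $P(n \mid w, \cdot) \propto P(w \mid u, \cdot, n)\, P(n = l \mid \cdot)$, with $P(w \mid u, \cdot, n) = \frac{c_w - d_n t_w}{\theta_n + c_\cdot} + \frac{\theta_n + d_n t_\cdot}{\theta_n + c_\cdot} P(w \mid \pi(u))$.
  • 52. AddCustomer, addCust(context u, word w, order n_t), customer: w_t, context: u_t. Descend the Context Tree n_t levels from root (figure: n_t=1, one level below root).
  • 53. AddCustomer, addCust(context u, word w, order n_t). Add 1 to b at every node passed through and add 1 to a at the node of depth n_t. For n_t=1: b_0++, a_1++.
  • 54. AddCustomer, addCust(context u, word w, order n_t). Seat w as in HPYLM. For each existing table k serving w, with count $c_{wk}$: weight $(c_{wk} - d_{|u|})$, then c_wk++. For a new table: weight $(\theta_{|u|} + d_{|u|} t_\cdot) P(w \mid \pi(u))$, then c_wk_new = 1, t_w++, and add_customer(π(u), w, n_t-1).
  • 55. RemoveCustomer, remCust(context u, word w, order n_t), customer: w_t, context: u_t. Descend along u to depth n_t, subtract 1 from b at the nodes passed through and from a at the node of depth n_t. Pick a table k serving w and decrease $c_{wk}$ by 1 (figure: tables "0: 2" and "1: 1"). When that table becomes empty, delete it and call remCust(π(u), w, n_t-1).
  • 56. Probability of w after u when n is unknown. (1) Compute the probability of each order n. (2) Compute the n-gram probability for each n. (3) Sum the product of (1) and (2) over n. $P(n \mid context) = \frac{a_n + \alpha}{a_n + b_n + \alpha + \beta} \prod_{i=0}^{n-1} \frac{b_i + \beta}{a_i + b_i + \alpha + \beta}$, $P(w \mid context, n) = \frac{c_w - d_n t_w}{\theta_n + c_\cdot} + \frac{\theta_n + d_n t_\cdot}{\theta_n + c_\cdot} P(w \mid \pi(context), n - 1)$.
  • 57. n-gram
  • 58. HPYLM/VPYLM: Gibbs sampling.
  • 59. NPYLM: d, θ. NPYLM: HPYLM, unigram; VPYLM, n-gram.
  • 60. NPYLM training. (1) Treat each sentence as one word and add it to the word n-gram and the character n-gram (addCustomer). (2) Pick one sentence. (3) Remove that sentence's words from the word n-gram and the character n-gram (removeCustomer). (4) Sample a new segmentation (forward-filtering/backward-sampling). (5) Add the new words to NPYLM (addCustomer).
  • 61. HPYLM/VPYLM: 1 at a time. Blocked Gibbs Sampling.
  • 62. NPYLM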
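The order probability of slides 49 and 56 and the two-step draw of slide 51 can be sketched in C++ as follows. This is only an illustration under stated assumptions: `StopCounts`, `order_prior` and `sample_index` are hypothetical names, and this `sample_order` takes the stop/pass counts along the context path and the per-order word probabilities P(w|u,·,n) as precomputed inputs, which is a different signature from the `sample_order(u_t, w_t)` named on the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Stop count a_i and pass count b_i of one node on the context path
// (index 0 is root). Hypothetical helper type.
struct StopCounts {
    int a;
    int b;
};

// P(n = l) = (a_l + alpha)/(a_l + b_l + alpha + beta)
//            * prod_{i<l} (b_i + beta)/(a_i + b_i + alpha + beta)
// for l = 0 .. path.size()-1. The values sum to less than 1 because deeper
// orders are cut off; the sampler below renormalises by z.
std::vector<double> order_prior(const std::vector<StopCounts>& path,
                                double alpha, double beta) {
    assert(alpha > 0.0 && beta > 0.0);
    std::vector<double> p(path.size());
    double pass = 1.0;  // probability of having passed every shallower node
    for (std::size_t l = 0; l < path.size(); ++l) {
        const double denom = path[l].a + path[l].b + alpha + beta;
        p[l] = (path[l].a + alpha) / denom * pass;
        pass *= (path[l].b + beta) / denom;
    }
    return p;
}

// Draw r from [0, z) with z = sum of t, then return the first index whose
// running sum exceeds r.
template <class Rng>
std::size_t sample_index(const std::vector<double>& t, Rng& rng) {
    assert(!t.empty());
    double z = 0.0;
    for (double x : t) z += x;
    assert(z > 0.0);
    std::uniform_real_distribution<double> uniform(0.0, z);
    const double r = uniform(rng);
    double running = 0.0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        running += t[i];
        if (r < running) return i;
    }
    return t.size() - 1;  // only reachable through floating-point rounding
}

// Weights t[n] = P(w | u, n) * P(n), i.e. the unnormalised posterior over n.
template <class Rng>
std::size_t sample_order(const std::vector<StopCounts>& path,
                         const std::vector<double>& word_prob,
                         double alpha, double beta, Rng& rng) {
    assert(path.size() == word_prob.size());
    std::vector<double> t = order_prior(path, alpha, beta);
    for (std::size_t n = 0; n < t.size(); ++n) t[n] *= word_prob[n];
    return sample_index(t, rng);
}
```

With counts (a, b) = (1, 2), (3, 0), (0, 0) and α = β = 1, `order_prior` returns 0.4, 0.48 and 0.06. Keeping the prior and the word probability as separate inputs mirrors steps (1) to (3) of slide 56, where the same two factors are summed over n instead of sampled.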