Posterior Regularization for Structured Latent Variable Models

Li Zhonghua
I2R SMT Reading Group

Outline

• Motivation and Introduction
• Posterior Regularization
• Application
• Implementation
• Some Related Frameworks

Motivation and Introduction

Prior Knowledge

We possess a wealth of prior knowledge about most NLP tasks.

Motivation and Introduction -- Prior Knowledge

Motivation and Introduction

Leveraging Prior Knowledge: possible approaches and their limitations

Motivation and Introduction -- Limitations of Existing Approaches

Bayesian Approach: Encode prior knowledge with a prior on parameters.

Limitation: our prior knowledge is not about parameters! Parameters are difficult to interpret, so it is hard to get the desired effect.

Augmenting the Model: Encode prior knowledge with additional variables and dependencies.

Motivation and Introduction -- Limitations of Existing Approaches

Limitation: this may make exact inference intractable.

Posterior Regularization

• A declarative language for specifying prior knowledge

-- Constraint Features & Expectations

• Methods for learning with knowledge in this language

-- EM-style learning algorithm (objective and updates sketched below)
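
(The equations defining this language did not survive the transcript. For reference, the PR paper (Ganchev et al., JMLR 2010) encodes prior knowledge as a constraint set Q over posteriors and maximizes a regularized likelihood:)

\mathcal{Q} = \{\, q(\mathbf{Y}) : \mathbb{E}_{q}[\boldsymbol{\phi}(\mathbf{X}, \mathbf{Y})] \le \mathbf{b} \,\}

J_{\mathcal{Q}}(\theta) = \mathcal{L}(\theta) - \min_{q \in \mathcal{Q}} \mathrm{KL}\left( q(\mathbf{Y}) \,\|\, p_{\theta}(\mathbf{Y} \mid \mathbf{X}) \right)

Here \phi are the constraint features and b their expectation bounds.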

Posterior Regularization

Original Objective:
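
(The equation itself is missing from the transcript; in the paper, the original objective is the marginal log-likelihood with a parameter prior:)

\mathcal{L}(\theta) = \log p(\theta) + \sum_{\mathbf{x}} \log \sum_{\mathbf{y}} p_{\theta}(\mathbf{x}, \mathbf{y})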

Posterior Regularization -- EM-style Learning Algorithm
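
(The update equations are likewise missing; following the paper, the M-step is standard EM, while the E-step becomes a KL-projection onto the constraint set:)

\text{E}'\text{-step:} \quad q^{t+1} = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\left( q(\mathbf{Y}) \,\|\, p_{\theta^{t}}(\mathbf{Y} \mid \mathbf{X}) \right)

\text{M-step:} \quad \theta^{t+1} = \arg\max_{\theta} \; \mathbb{E}_{q^{t+1}}\left[ \log p_{\theta}(\mathbf{X}, \mathbf{Y}) \right] + \log p(\theta)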

Posterior Regularization

Computing the Posterior Regularizer
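
(The math for this slide is not in the transcript. Per the paper, the projection is solved in the dual, with one multiplier \lambda \ge 0 per expectation constraint:)

\max_{\lambda \ge 0} \; -\mathbf{b}^{\top} \lambda - \log \sum_{\mathbf{y}} p_{\theta}(\mathbf{y} \mid \mathbf{x}) \exp\{ -\lambda^{\top} \boldsymbol{\phi}(\mathbf{x}, \mathbf{y}) \}

with primal solution q^{*}(\mathbf{y}) \propto p_{\theta}(\mathbf{y} \mid \mathbf{x}) \exp\{ -\lambda^{*\top} \boldsymbol{\phi}(\mathbf{x}, \mathbf{y}) \}. Because q^{*} keeps the model's factorization whenever the features decompose, the needed expectations can still be computed with dynamic programming.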

Application

Statistical Word Alignments: IBM Model 1 and HMM
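
(Standard definitions, not from the slides: both models generate sentence x = x_1..x_J given sentence e = e_1..e_I, with y_j the position in e that x_j is aligned to; sentence-length terms omitted.)

\text{IBM Model 1:} \quad p(\mathbf{x}, \mathbf{y} \mid \mathbf{e}) = \frac{1}{(I+1)^{J}} \prod_{j=1}^{J} t(x_j \mid e_{y_j})

\text{HMM:} \quad p(\mathbf{x}, \mathbf{y} \mid \mathbf{e}) = \prod_{j=1}^{J} p_{d}(y_j \mid y_{j-1}) \, t(x_j \mid e_{y_j})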

Application

One feature for each source word m that counts how many times it is aligned to a target word in the alignment y.
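
(A formalization of this bijectivity constraint, following Graça et al. (CL 2010); notation mine:)

\phi_{m}(\mathbf{x}, \mathbf{y}) = \sum_{j=1}^{J} \mathbf{1}[\, y_j = m \,], \qquad \mathbb{E}_{q}[\phi_{m}(\mathbf{x}, \mathbf{y})] \le 1

In expectation under q, each source word is then aligned to at most one target word.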

Application

Define a feature for each target-source position pair (i, j). The feature takes the value zero in expectation if the word pair (i, j) is aligned with equal probability in both directions.
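
(One way to write this symmetry constraint, with q defined jointly over the forward and backward directional alignments; notation mine:)

\phi_{ij}(\mathbf{x}, \mathbf{y}) = \mathbf{1}[\, \overrightarrow{y}_{j} = i \,] - \mathbf{1}[\, \overleftarrow{y}_{i} = j \,], \qquad \mathbb{E}_{q}[\phi_{ij}(\mathbf{x}, \mathbf{y})] = 0

Driving this expectation to zero forces the two directional models to agree on each link in expectation.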

Application

Graça, Ganchev, and Taskar, "Learning Tractable Word Alignment Models with Complex Constraints" (Computational Linguistics, 2010)

Application

• Six language pairs
• Both types of constraints (B-HMM: bijectivity; S-HMM: symmetry) improve over the HMM in terms of both precision and recall
• They improve over the HMM by 10% to 15%
• S-HMM performs slightly better than B-HMM, beating it in 10 out of 12 cases
• They improve over IBM Model 4 in 9 out of 12 cases

Implementation

• http://code.google.com/p/pr-toolkit/
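
(The toolkit implements the full framework. Below is only a minimal Python sketch of the projection step from the "Computing the Posterior Regularizer" slide, not the toolkit's API; it enumerates all assignments y, so it is for toy problems only, and the function name is mine.)

import numpy as np

def pr_project(log_p, phi, b, lr=0.5, n_iters=300):
    """KL-project p onto {q : E_q[phi] <= b} via gradient ascent on the dual.

    log_p: (K,) log-probabilities of the K assignments y under p(y | x)
    phi:   (K, F) constraint-feature values phi(x, y)
    b:     (F,) expectation bounds
    """
    lam = np.zeros(len(b))                   # dual variables, lam >= 0
    for _ in range(n_iters):
        # q_lam(y) is proportional to p(y | x) * exp(-lam . phi(x, y))
        scores = log_p - phi @ lam
        scores -= scores.max()               # numerical stability
        q = np.exp(scores)
        q /= q.sum()
        # dual gradient is E_q[phi] - b; ascend, then clip to lam >= 0
        lam = np.maximum(0.0, lam + lr * (q @ phi - b))
    return q

# Toy check: two assignments, one feature, constrain E_q[phi] <= 0.5.
log_p = np.log(np.array([0.9, 0.1]))         # p favors the assignment with phi = 1
phi = np.array([[1.0], [0.0]])
print(pr_project(log_p, phi, b=np.array([0.5])))   # approx [0.5, 0.5]

For structured models such as the HMM, E_q[phi] is computed with forward-backward rather than by enumerating y.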

Some Related Frameworks

More info: http://sideinfo.wikkii.com (many of these slides are taken from there).

Thanks!
