SVM Chapter 11. Structural SVM
Waseda Univ. Hamada Lab. Taikai Takeda
Twitter: @bigsea_t
Structural SVM (SSVM): an extension of the SVM to structured outputs
Applications: NLP [Yue et al. 2007], Bioinformatics [Yu et al. 2006]
(cutting plane training)
[John Yu et al. 2009] SSVM implementation: http://www.cs.cornell.edu/~cnyu/latentssvm/ (written in C; Cornell Univ., Prof. Thorsten Joachims: http://www.cs.cornell.edu/People/tj/)
"This is an implementation of latent structural SVM accompanying the ICML '09 paper "Learning Structural SVMs with Latent Variables". It was developed under Linux and compiles under gcc, built upon the SVM^light software by Thorsten Joachims. There are two versions available. The standalone version using the SVM^light QP solver is available below. Another version using the Mosek quadratic program solver is also available. It has been developed and tested for a longer period of time but requires the separate installation of the solver."
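Formulate SSVM
SSVM extends the SVM, which maps an input x to an output y, to the case where the output y is a structured object.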
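Formulate SSVM
Notations
Decision function: $f(x, y) = w^\top \Psi(x, y)$
$x \in \mathcal{X}$: space of input, $\Psi(x, y)$: feature vector, $w$: parameter vector
Classifier: $h(x) = \operatorname{argmax}_{y \in \mathcal{Y}} f(x, y)$
$\mathcal{Y}$: space of (structural) output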
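The classifier is an argmax of a linear score over candidate outputs. The sketch below illustrates this for the simplest structured case, multiclass classification. The block-wise feature map `multiclass_feature`, the weights, and the input are assumptions chosen for illustration and do not come from the slides.

```python
from typing import Callable, Iterable, List, Sequence


def multiclass_feature(x: Sequence[float], y: int, num_classes: int) -> List[float]:
    """Hypothetical joint feature map Psi(x, y): copy x into the block of class y."""
    psi = [0.0] * (len(x) * num_classes)
    start = y * len(x)
    psi[start:start + len(x)] = list(x)
    return psi


def score(w: Sequence[float], psi: Sequence[float]) -> float:
    """Decision function f(x, y) = w^T Psi(x, y)."""
    return sum(wi * pi for wi, pi in zip(w, psi))


def predict(w: Sequence[float], x: Sequence[float], outputs: Iterable[int],
            feature: Callable[[Sequence[float], int], List[float]]) -> int:
    """Classifier h(x) = argmax over y of f(x, y), by exhaustive search."""
    return max(outputs, key=lambda y: score(w, feature(x, y)))


# Illustrative weights: one 2-dimensional block per class.
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]
x = [0.2, 0.9]
prediction = predict(w, x, range(3), lambda x_, y: multiclass_feature(x_, y, 3))
print(prediction)
```

For genuinely structured outputs (sequences, trees, rankings) the output space is too large to enumerate, so the exhaustive `max` would be replaced by a problem-specific inference routine such as dynamic programming.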
Formulate SSVM
Hard-margin problem
Constraint
Max-Margin
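The equations of this slide did not survive extraction. For reference, the standard hard-margin form in the style of [Tsochantaridis et al. 2005] is given below, assuming training pairs $(x_i, y_i)$, $i = 1, \dots, n$, a joint feature map $\Psi$ and a parameter vector $w$. The slide's own notation may have differed.

```latex
% Constraint: the correct output must beat every other output by a margin of 1
\forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}:\quad
  w^\top \Psi(x_i, y_i) - w^\top \Psi(x_i, y) \ge 1

% Max-margin problem
\min_{w}\ \frac{1}{2}\|w\|^2
\quad \text{s.t.}\quad
\forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}:\
  w^\top \bigl[\Psi(x_i, y_i) - \Psi(x_i, y)\bigr] \ge 1
```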
Formulate SSVM
Soft-Margin Problem
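The slide's equation is missing. A standard n-slack soft-margin form is sketched below, with one slack variable $\xi_i$ per training example. The $C/n$ scaling is an assumption, chosen to match the convention of [Joachims et al. 2009].

```latex
\min_{w,\ \xi \ge 0}\ \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n} \xi_i
\quad \text{s.t.}\quad
\forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}:\
  w^\top \bigl[\Psi(x_i, y_i) - \Psi(x_i, y)\bigr] \ge 1 - \xi_i
```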
Formulate SSVM
Lagrange Function
Dual Problem
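The Lagrange function and dual did not survive extraction. The derivation below is a sketch for the n-slack soft-margin problem $\min \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_i \xi_i$ with constraints $w^\top \delta\Psi_i(y) \ge 1 - \xi_i$ and $\xi_i \ge 0$. The shorthand $\delta\Psi_i(y)$ and the multiplier names $\alpha_{iy}$, $\mu_i$ are notational assumptions.

```latex
\delta\Psi_i(y) := \Psi(x_i, y_i) - \Psi(x_i, y)

% Lagrange function, with multipliers \alpha_{iy} \ge 0 and \mu_i \ge 0
L(w, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i
  - \sum_{i=1}^{n}\sum_{y \ne y_i} \alpha_{iy}\bigl(w^\top \delta\Psi_i(y) - 1 + \xi_i\bigr)
  - \sum_{i=1}^{n} \mu_i \xi_i

% Stationarity
\frac{\partial L}{\partial w} = 0 \ \Rightarrow\ w = \sum_{i=1}^{n}\sum_{y \ne y_i} \alpha_{iy}\,\delta\Psi_i(y),
\qquad
\frac{\partial L}{\partial \xi_i} = 0 \ \Rightarrow\ \sum_{y \ne y_i} \alpha_{iy} = \frac{C}{n} - \mu_i \le \frac{C}{n}

% Dual problem
\max_{\alpha \ge 0}\ \sum_{i=1}^{n}\sum_{y \ne y_i} \alpha_{iy}
  - \frac{1}{2}\sum_{i,\,y \ne y_i}\ \sum_{j,\,y' \ne y_j}
    \alpha_{iy}\,\alpha_{jy'}\ \delta\Psi_i(y)^\top \delta\Psi_j(y')
\quad \text{s.t.}\quad \forall i:\ \sum_{y \ne y_i} \alpha_{iy} \le \frac{C}{n}
```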
Formulate SSVM
Kernel Function
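The slide's formula is missing. Since the dual depends on the feature vectors only through inner products, a joint kernel can replace them. The sketch below shows the usual form, with $\alpha_{iy}$ denoting the dual variables of the n-slack problem (an assumed notation).

```latex
% Joint kernel over input-output pairs
K\bigl((x, y), (x', y')\bigr) = \Psi(x, y)^\top \Psi(x', y')

% Decision function expressed through the kernel, using
% w = \sum_i \sum_{y' \ne y_i} \alpha_{iy'} [\Psi(x_i, y_i) - \Psi(x_i, y')]
f(x, y) = \sum_{i=1}^{n}\sum_{y' \ne y_i} \alpha_{iy'}
  \Bigl[ K\bigl((x_i, y_i), (x, y)\bigr) - K\bigl((x_i, y'), (x, y)\bigr) \Bigr]
```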
Optimize SSVM
cutting plane training
[Joachims et al. 2009]
1-Slack Formulation
1-slack OP
N-slack OP (previous one)
1-slack OP and N-slack OP are equivalent
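The two optimization problems are missing from the extracted text. They are written out below in the form used by [Joachims et al. 2009], with a generic loss $\Delta(y_i, \hat{y})$. Taking $\Delta(y_i, \hat{y}) = 1$ for $\hat{y} \ne y_i$ and $0$ otherwise recovers the formulation without a task loss. The exact notation on the slide is an assumption.

```latex
% 1-slack OP: a single slack variable, one constraint per joint assignment
\min_{w,\ \xi \ge 0}\ \frac{1}{2}\|w\|^2 + C\,\xi
\quad \text{s.t.}\quad
\forall (\hat{y}_1, \dots, \hat{y}_n) \in \mathcal{Y}^n:\
\frac{1}{n}\, w^\top \sum_{i=1}^{n} \bigl[\Psi(x_i, y_i) - \Psi(x_i, \hat{y}_i)\bigr]
  \ \ge\ \frac{1}{n}\sum_{i=1}^{n} \Delta(y_i, \hat{y}_i) - \xi

% N-slack OP: one slack variable per example
\min_{w,\ \xi_i \ge 0}\ \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n} \xi_i
\quad \text{s.t.}\quad
\forall i,\ \forall \hat{y} \in \mathcal{Y}:\
w^\top \bigl[\Psi(x_i, y_i) - \Psi(x_i, \hat{y})\bigr] \ \ge\ \Delta(y_i, \hat{y}) - \xi_i
```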
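1-Slack Formulation
Theorem 1. Any solution $w^*$ of the 1-slack OP is also a solution of the N-slack OP (and vice versa), with $\xi^* = \frac{1}{n}\sum_{i=1}^{n} \xi_i^*$. (Proved later.)
Proof sketch. Here $\Delta(y_i, \hat{y})$ is the loss of predicting $\hat{y}$ for example $i$ (1 if $\hat{y} \ne y_i$ and 0 otherwise when no task loss is used). For a fixed $w$, the optimal n-slack variables are $\xi_i = \max_{\hat{y} \in \mathcal{Y}} \bigl\{ \Delta(y_i, \hat{y}) - w^\top [\Psi(x_i, y_i) - \Psi(x_i, \hat{y})] \bigr\}$.
The optimal 1-slack variable is $\xi = \max_{(\hat{y}_1, \dots, \hat{y}_n) \in \mathcal{Y}^n} \frac{1}{n}\sum_{i=1}^{n} \bigl\{ \Delta(y_i, \hat{y}_i) - w^\top [\Psi(x_i, y_i) - \Psi(x_i, \hat{y}_i)] \bigr\} = \frac{1}{n}\sum_{i=1}^{n} \xi_i$, because the maximum decomposes over the examples.
Therefore, the objective functions are equal for any $w$.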
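The key step of the proof sketch, that the single slack of the 1-slack OP equals the mean of the per-example slacks for any fixed parameter vector, can be checked numerically by brute force on a toy problem. Everything in the block below (the feature map `psi`, the data, the weights, the 0/1 loss) is made up for illustration.

```python
from itertools import product

NUM_CLASSES = 3


def psi(x, y):
    """Hypothetical joint feature map: x copied into the block of class y."""
    out = [0.0] * (len(x) * NUM_CLASSES)
    out[y * len(x):(y + 1) * len(x)] = x
    return out


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def violation(w, x, y_true, y_hat):
    """Delta(y, y_hat) - w^T [Psi(x, y) - Psi(x, y_hat)] with the 0/1 loss."""
    loss = 0.0 if y_hat == y_true else 1.0
    return loss - (dot(w, psi(x, y_true)) - dot(w, psi(x, y_hat)))


data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2), ([0.5, -1.0], 0)]
w = [0.3, -0.2, 0.1, 0.4, -0.5, 0.2]  # an arbitrary fixed parameter vector
n = len(data)
labels = range(NUM_CLASSES)

# n-slack: one maximisation per example.
xi_i = [max(violation(w, x, y, y_hat) for y_hat in labels) for x, y in data]
mean_n_slack = sum(xi_i) / n

# 1-slack: one maximisation over all |Y|^n joint assignments.
xi = max(
    sum(violation(w, x, y, y_hat) for (x, y), y_hat in zip(data, y_hats)) / n
    for y_hats in product(labels, repeat=n)
)

print(mean_n_slack, xi)  # the two values coincide
```

The brute-force search over 3^4 joint assignments is only feasible because the example is tiny. The point of the theorem is that the joint maximum never has to be enumerated, since it decomposes into n independent maximisations.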
1-Slack Formulation
Dual Problem
Lagrange function
Dual problem
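The slide's formulas are missing. The sketch below derives the dual of the 1-slack problem $\min \frac{1}{2}\|w\|^2 + C\xi$ with an explicit $\xi \ge 0$ constraint. The shorthand $g_{\bar{y}}$ and $\bar{\Delta}_{\bar{y}}$ for a joint assignment $\bar{y} = (\hat{y}_1, \dots, \hat{y}_n)$ is an assumption introduced here for compactness.

```latex
g_{\bar{y}} := \frac{1}{n}\sum_{i=1}^{n}\bigl[\Psi(x_i, y_i) - \Psi(x_i, \hat{y}_i)\bigr],
\qquad
\bar{\Delta}_{\bar{y}} := \frac{1}{n}\sum_{i=1}^{n}\Delta(y_i, \hat{y}_i)

% Lagrange function, with multipliers \alpha_{\bar{y}} \ge 0 and \mu \ge 0
L(w, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C\,\xi
  - \sum_{\bar{y} \in \mathcal{Y}^n} \alpha_{\bar{y}}
    \bigl(w^\top g_{\bar{y}} - \bar{\Delta}_{\bar{y}} + \xi\bigr) - \mu\,\xi

% Stationarity
w = \sum_{\bar{y}} \alpha_{\bar{y}}\, g_{\bar{y}},
\qquad
\sum_{\bar{y}} \alpha_{\bar{y}} = C - \mu \le C

% Dual problem
\max_{\alpha \ge 0}\ \sum_{\bar{y}} \alpha_{\bar{y}}\,\bar{\Delta}_{\bar{y}}
  - \frac{1}{2}\sum_{\bar{y}}\sum_{\bar{y}'} \alpha_{\bar{y}}\,\alpha_{\bar{y}'}\
    g_{\bar{y}}^\top g_{\bar{y}'}
\quad \text{s.t.}\quad \sum_{\bar{y}} \alpha_{\bar{y}} \le C
```

Although the dual has $|\mathcal{Y}|^n$ variables, only the few constraints collected in the working set by cutting plane training ever receive a non-zero $\alpha_{\bar{y}}$.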
Cutting Plane Training
[Joachims et al. 2009]
Cutting Plane Training
Algorithm
Cutting Plane Training
Initialization: $\mathcal{W} = \emptyset$; $\xi = 0$
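The algorithm slide did not survive extraction. Below is a minimal sketch of 1-slack cutting plane training with margin rescaling, following the outline in [Joachims et al. 2009]: start from an empty working set, find the most violated joint constraint by loss-augmented inference, add it, and re-solve the restricted problem until no constraint is violated by more than `eps`. Two things are assumptions rather than the slides' or the paper's method: the restricted dual is solved approximately with a simple Frank-Wolfe loop instead of a proper QP solver, and the toy feature map `psi`, the 0/1 loss and the data are made up.

```python
def psi(x, y, num_classes=2):
    """Hypothetical joint feature map: x copied into the block of class y."""
    out = [0.0] * (len(x) * num_classes)
    out[y * len(x):(y + 1) * len(x)] = x
    return out


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def zero_one(y_true, y_hat):
    return 0.0 if y_true == y_hat else 1.0


def combine(coeffs, vectors, dim):
    """Return sum_j coeffs[j] * vectors[j]."""
    out = [0.0] * dim
    for c, v in zip(coeffs, vectors):
        for k in range(dim):
            out[k] += c * v[k]
    return out


def solve_restricted_dual(G, d, alpha, C, iters=1000, tol=1e-9):
    """Frank-Wolfe on max_a sum_j a_j d_j - 0.5 ||sum_j a_j g_j||^2
    subject to a >= 0 and sum(a) <= C (stand-in for a QP solver)."""
    dim = len(G[0])
    for _ in range(iters):
        w = combine(alpha, G, dim)
        grad = [dj - dot(gj, w) for gj, dj in zip(G, d)]
        best = max(range(len(G)), key=lambda j: grad[j])
        target = [0.0] * len(G)          # vertex 0 unless a gradient is positive
        if grad[best] > 0:
            target[best] = C
        step = [t - a for t, a in zip(target, alpha)]
        gap = dot(grad, step)            # Frank-Wolfe duality gap
        dw = combine(step, G, dim)
        curvature = dot(dw, dw)
        if gap <= tol or curvature == 0.0:
            break
        t = min(1.0, gap / curvature)    # exact line search, clipped to [0, 1]
        alpha = [a + t * s for a, s in zip(alpha, step)]
    return alpha


def train(data, labels, C=1.0, eps=1e-3, max_rounds=50):
    n, dim = len(data), len(psi(*data[0]))
    G, d, alpha = [], [], []             # working set and its dual variables
    w, xi = [0.0] * dim, 0.0
    for _ in range(max_rounds):
        # Separation oracle: loss-augmented inference for every example.
        y_hats = [max(labels, key=lambda y: zero_one(y_true, y) + dot(w, psi(x, y)))
                  for x, y_true in data]
        g = [0.0] * dim
        for (x, y_true), y_hat in zip(data, y_hats):
            for k, (a, b) in enumerate(zip(psi(x, y_true), psi(x, y_hat))):
                g[k] += (a - b) / n
        delta = sum(zero_one(y, yh) for (_, y), yh in zip(data, y_hats)) / n
        if delta - dot(w, g) <= xi + eps:  # no constraint violated by more than eps
            break
        G.append(g)
        d.append(delta)
        alpha.append(0.0)
        alpha = solve_restricted_dual(G, d, alpha, C)
        w = combine(alpha, G, dim)
        xi = max(0.0, max(dj - dot(gj, w) for gj, dj in zip(G, d)))
    return w, xi


data = [([1.0], 0), ([2.0], 0), ([-1.0], 1), ([-2.0], 1)]
w, xi = train(data, labels=[0, 1])
predictions = [max([0, 1], key=lambda y: dot(w, psi(x, y))) for x, _ in data]
print(w, xi, predictions)
```

Because the slack is recomputed as the largest violation over the working set, a constraint that is already in the set can never trigger another round, so the outer loop stops even when the inner solver is only approximate. A production implementation would use an exact QP solver, as SVM^struct does.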
Loss functions
SSVM hinge loss
For example, in natural language parsing, a parse tree that is almost correct and differs from the correct parse in only one or a few nodes should be treated differently from a parse tree that is completely different. [Tsochantaridis et al. 2005]
Margin rescaling and slack rescaling: two ways to build a loss function into the SSVM
[Tsochantaridis et al. 2005]
Loss functions
n-slack formulation
Margin rescaling
Slack rescaling
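The constraint sets for the two variants are missing from the extracted text. In the n-slack problem $\min \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_i \xi_i$ with $\xi_i \ge 0$, they are usually written as below, following [Tsochantaridis et al. 2005], where $\Delta(y_i, \hat{y})$ is the loss of predicting $\hat{y}$ when $y_i$ is correct. The $C/n$ scaling is an assumption.

```latex
% Margin rescaling: the required margin grows with the loss
\forall i,\ \forall \hat{y} \in \mathcal{Y}:\quad
w^\top \bigl[\Psi(x_i, y_i) - \Psi(x_i, \hat{y})\bigr] \ \ge\ \Delta(y_i, \hat{y}) - \xi_i

% Slack rescaling: the slack is scaled by the inverse loss
\forall i,\ \forall \hat{y} \in \mathcal{Y} \setminus \{y_i\}:\quad
w^\top \bigl[\Psi(x_i, y_i) - \Psi(x_i, \hat{y})\bigr] \ \ge\ 1 - \frac{\xi_i}{\Delta(y_i, \hat{y})}
```

With margin rescaling, an output with a large loss must be separated by a large margin. With slack rescaling, the margin stays at 1 but violations by high-loss outputs are penalized more heavily.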
Application: learning to rank
IR (Information Retrieval): given a query, return a ranking of documents that places the relevant documents first.
Application: learning to rank
Evaluation Measure
Average Precision (AP)
Loss Function
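Average Precision is the mean, over the relevant documents, of the precision at the rank where each one appears. The sketch below computes it and the corresponding loss. Taking the loss as 1 minus AP follows the usual choice when optimizing for this measure and is an assumption about what the slide showed. The document ids are made up.

```python
def average_precision(ranking, relevant):
    """AP of a ranking (list of doc ids, best first) given the set of relevant ids."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at the rank of this relevant document
    return total / len(relevant) if relevant else 0.0


def ap_loss(ranking, relevant):
    """Loss of a ranking: 1 - AP, which is 0 for a perfect ranking."""
    return 1.0 - average_precision(ranking, relevant)


relevant = {"d1", "d3"}
ap = average_precision(["d1", "d2", "d3", "d4"], relevant)  # (1/1 + 2/3) / 2
perfect_loss = ap_loss(["d1", "d3", "d2", "d4"], relevant)
print(ap, perfect_loss)
```

Unlike the 0/1 loss, this loss does not decompose over individual documents, which is why it has to enter the SSVM through the loss term of margin or slack rescaling.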
Application: learning to rank
Notations
= Q, , |T|: :
:
;< = _1 ; >