ichigaku-takigawa
/401
94
/402
[Figure: graphs g1, g2, g3, g4, g5, ..., gn with responses y1, y2, y3, y4, y5, ..., yn (example values 0.1, 0.7, 1.2, 0.2, 1.3, 0.9)]
/403
(1) (2) (3)
(3)
/404
[Figure: graphs g1, ..., g5 with responses y1, ..., y5 (example values 0.1, 0.7, 1.2, 0.2, 1.3, 0.9), plus a graph g6 with response y6; labels y and g]
/405
/406
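[Figure: graphs with response values 0.1, 0.7, 1.2, 0.2, 1.3, 0.9]
Hansch-QSAR
(,,2001)10.3
Structure-property relationship (SPR), structure-activity relationship (SAR)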
/407
) (SAR)
(Hansch-QSAR) (LogP), HOMO/LUMO
()/
2013(199815) ()
/408
0.1 0.7 1.2 0.2 1.3 0.9
) (SAR)
/
/
(ADME)
/409
[Figure: molecular graph representations, with panels: Structure diagram; Skeletal topology; Atom/bond labeled graph; KEGG atom labeled graph (KCF); Pharmacophore type labeled graph (ChemAxon Screen); Reduced graph]
graph=
The term "graph" (Biggs, Lloyd, Wilson, 1986)
PubChem: unique > 5200 (2014/7/22)
(S. Nowozin, Learning with Structured Data: Applications to Computer Vision, PhD Thesis, 2009)
?
/4012
(/)
(?)
/
/4013
(1) (2) (3)
(3)
/4014
y     g     x1  x2  x3  x4  x5  x6
0.1   g1    0   0   1   1   1   0
0.7   g2    1   0   0   0   0   1
0.9   g3    1   1   0   1   1   0
...
1.2   gn    1   0   1   1   1   0
n graphs; each feature is 0 or 1 (bag-of-features)
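A minimal sketch of how such a 0/1 table can be built, assuming the features are single labeled edges (a much smaller pattern family than general subgraphs). The toy molecules and the helper names `edge_patterns` and `indicator_vector` are hypothetical and chosen only for illustration.

```python
# Toy molecules: node labels plus undirected edges (hypothetical data).
graphs = {
    "g1": ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    "g2": ({0: "C", 1: "N"}, [(0, 1)]),
    "g3": ({0: "C", 1: "O", 2: "N"}, [(0, 1), (0, 2)]),
}


def edge_patterns(labels, edges):
    """Return the set of unordered label pairs that occur as edges."""
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}


# Feature dictionary: every single-edge pattern seen in the data.
patterns = sorted(set().union(*(edge_patterns(*g) for g in graphs.values())))


def indicator_vector(labels, edges):
    """0/1 vector: does the graph contain each pattern at least once?"""
    present = edge_patterns(labels, edges)
    return [1 if p in present else 0 for p in patterns]


# One row per graph, one column per pattern.
X = {name: indicator_vector(*g) for name, g in graphs.items()}
for name, row in X.items():
    print(name, row)
```

Each column answers "does pattern x occur in graph g", so any standard learner for binary vectors can be applied to the rows.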
/4015
Data-Driven Fingerprints
  Extended Connectivity Fingerprint (Rogers and Hahn, 2010)
  Frequent and/or Bounded-Size Subgraphs (Wale et al, 2008)
Sparse Learning
  Graph AdaBoost (Kudo et al, 2004)
  Graph LPBoost (gBoost) (Saigo et al, 2009)
  Graph LARS/LASSO (Tsuda et al, 2007)
Discriminative Subgraph Mining
  LEAP (Yan et al, 2008)
  GraphSig (Ranu et al, 2009)
  CORK (Thoma et al, 2009)
Graph Kernels
  Marginalized Kernels (Kashima et al, 2003, 2004; Mahé et al, 2005)
  Walk Kernels (Gärtner et al, 2003; Borgwardt et al, 2005; Vishwanathan et al, 2010)
  Weighted Decomposition Kernels (Menchetti et al, 2005)
  Subtree Kernels (Mahé and Vert, 2009)
  Weisfeiler-Lehman Kernel (Shervashidze et al, 2011)
/4016
Data-Driven Fingerprints
1.
2.
PubChem fingerprint, MACCS keys
fingerprint
0-1 fingerprint
/4017
Data-Driven Fingerprints
(Wale et al, KAIS, 2008)
1. fp: Hashed Fingerprint
2. ECFP
3. MK: MACCS keys, 166 bit
4. FS: frequent subgraphs
5. GF: bounded-size subgraphs (graph fragments)
GF, ECFP, FS (!)
ROC50 AUC (AUC up to the first 50 false positives)
/4018
Hashed Fingerprint (Wale et al; ChemAxon)
Data-Driven Fingerprints
Daylight, ChemAxon
https://docs.chemaxon.com/display/jchembase/User's+Guide
/4019 Data-Driven Fingerprints
Extended Connectivity Fingerprints
http://chembioinfo.com/2011/10/30/revisiting-molecular-hashed-fingerprints/
1. 03
2. (SMARTS/SMILES) Hashed Fingerprint
(variation)
Morgan algorithm (Morgan: J. Chem. Doc. 5, 107-113, 1965)
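The Morgan-style idea behind circular fingerprints can be sketched as follows. This is a simplified illustration, not the ECFP specification: it assumes atoms carry only an element label, ignores bond orders and other atom invariants, and uses a hypothetical `stable_hash` helper with an arbitrary 64-bit vector length.

```python
import hashlib


def stable_hash(obj, n_bits):
    """Map any printable identifier to a bit position, stable across runs."""
    digest = hashlib.sha1(repr(obj).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_bits


def circular_fingerprint(labels, adjacency, radius=2, n_bits=64):
    """Set one bit per atom neighborhood of radius 0..radius."""
    ids = dict(labels)  # radius 0: the atom label itself
    bits = {stable_hash(v, n_bits) for v in ids.values()}
    for _ in range(radius):
        new_ids = {}
        for atom, nbrs in adjacency.items():
            # Combine the atom's identifier with its sorted neighbor identifiers.
            new_ids[atom] = (ids[atom], tuple(sorted(repr(ids[n]) for n in nbrs)))
        ids = new_ids
        bits |= {stable_hash(v, n_bits) for v in ids.values()}
    return [1 if i in bits else 0 for i in range(n_bits)]


# Heavy-atom skeleton C-C-O, written with two different atom orderings.
ethanol = ({0: "C", 1: "C", 2: "O"}, {0: [1], 1: [0, 2], 2: [1]})
reordered = ({0: "O", 1: "C", 2: "C"}, {0: [1], 1: [0, 2], 2: [1]})
fp_a = circular_fingerprint(*ethanol)
fp_b = circular_fingerprint(*reordered)
print(fp_a == fp_b)
```

Because identifiers depend only on labels and sorted neighbor identifiers, the fingerprint does not depend on how the atoms are numbered.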
/4020 Graph Kernels
implicit
Walk, Path, Tree
Subgraph (NP-hard)
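/4021 Graph Kernels
[Figure: example graphs g1, g2, g3, ... with node labels a, b, c, each mapped to a count vector (x1, ..., x5) over the label pairs aa, ab, ac, bb, bc]
x1 (aa)  x2 (ab)  x3 (ac)  x4 (bb)  x5 (bc)
0        1        0        0        1
0        0        1        0        1
1        2        0        0        0
0        0        0        1        2
Pairwise inner products of these rows:
2  1  2  2
1  2  0  2
2  0  5  0
2  2  0  5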
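The count vectors and their inner products can be reproduced with a short sketch. The four toy graphs below are assumptions: two paths and two triangles chosen so that their label-pair counts match the four count rows on the slide. `PAIRS` and `pair_counts` are hypothetical names.

```python
from collections import Counter

PAIRS = ["aa", "ab", "ac", "bb", "bc"]  # feature order x1..x5


def pair_counts(labels, edges):
    """Count how often each unordered label pair occurs as an edge."""
    counts = Counter("".join(sorted(labels[u] + labels[v])) for u, v in edges)
    return [counts[p] for p in PAIRS]


# (node labels by index, edge list); shapes are assumed for illustration.
toy = [
    ("abc", [(0, 1), (1, 2)]),          # path a-b-c
    ("acb", [(0, 1), (1, 2)]),          # path a-c-b
    ("aab", [(0, 1), (1, 2), (0, 2)]),  # triangle a, a, b
    ("bbc", [(0, 1), (1, 2), (0, 2)]),  # triangle b, b, c
]

vectors = [pair_counts(labels, edges) for labels, edges in toy]

# Kernel (Gram) matrix: inner products of the explicit count vectors.
K = [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

for row in K:
    print(row)
```

Here the feature map is small enough to write out explicitly. Graph kernels matter when it is not, and the inner product is computed without materializing the vectors.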
/4022 Graph Kernels
Marginalized Kernels (Kashima et al, 2003, 2004; Mahé et al, 2005)
Walk Kernels (Gärtner et al, 2003; Borgwardt et al, 2005; Vishwanathan et al, 2010)
Weighted Decomposition Kernels (Menchetti et al, 2005)
Subtree Kernels (Mahé and Vert, 2009)
Weisfeiler-Lehman Kernel (Shervashidze et al, 2011)
Vk(g,g)
V (Hilbert)
V(k)?
/4023 Graph Kernels
e.g. Weisfeiler-Lehman Kernel
ECFP
Isomorphism check for 2 graphs: Weisfeiler-Lehman test (1968)
http://www.cc.gatech.edu/~lsong/teaching/8803ML/lecture22.pdf
(Shervashidze et al, JMLR 2011)
[Figure: Weisfeiler-Lehman relabeling example; nodes carry labels x1, x2, x3, x5]
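A minimal sketch of the Weisfeiler-Lehman subtree kernel, assuming undirected graphs with string node labels. For readability it keeps the expanded label strings instead of compressing them to short identifiers as in Shervashidze et al; the two toy paths are hypothetical.

```python
from collections import Counter


def wl_features(labels, adjacency, iterations=2):
    """Count node labels over `iterations` rounds of WL relabeling."""
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        # New label = old label plus the sorted multiset of neighbor labels.
        current = {
            v: current[v] + "(" + ",".join(sorted(current[n] for n in adjacency[v])) + ")"
            for v in adjacency
        }
        counts.update(current.values())
    return counts


def wl_kernel(g1, g2, iterations=2):
    """Inner product of the WL label-count vectors of two graphs."""
    f1 = wl_features(*g1, iterations)
    f2 = wl_features(*g2, iterations)
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())


path_abc = ({0: "a", 1: "b", 2: "c"}, {0: [1], 1: [0, 2], 2: [1]})
path_acb = ({0: "a", 1: "c", 2: "b"}, {0: [1], 1: [0, 2], 2: [1]})

print(wl_kernel(path_abc, path_abc))  # 9: three labels match in each of 3 rounds
print(wl_kernel(path_abc, path_acb))  # 3: only the initial labels a, b, c match
```

The relabeling step is the same neighborhood aggregation used by circular fingerprints, which is why the slide lists ECFP next to this kernel.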
/4024 Sparse Learning: Boosting
()boosting
/4025 Sparse Learning: Boosting
e.g. AdaBoost (Kudo et al, NIPS 2004)
iteration ()
TT
(gSpan) Branch and Bound, LPboost
(AdaBoost, Arc-GV, soft-margin boosting)
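A minimal AdaBoost sketch over 0/1 indicator features, assuming the feature columns are already enumerated. In the graph setting the best stump is instead found by a branch-and-bound search over subgraph patterns. The matrix `X` reuses the 0/1 rows shown earlier in the deck; the class labels `y` are made up for illustration.

```python
import math


def adaboost_stumps(X, y, rounds=5):
    """AdaBoost with stumps h(x) = s * (2 * x[j] - 1), s in {+1, -1}."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n
    model = []  # list of (alpha, feature index, sign)
    for _ in range(rounds):
        # Pick the stump with maximal gain sum_i w_i * y_i * h(x_i).
        gain, j, s = max(
            ((sum(wi * yi * sg * (2 * xi[k] - 1) for wi, yi, xi in zip(w, y, X)), k, sg)
             for k in range(d) for sg in (+1, -1)),
            key=lambda t: t[0],
        )
        # Weighted error, clipped away from 0 and 1 to keep alpha finite.
        err = min(max((1.0 - gain) / 2.0, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, j, s))
        # Re-weight: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * s * (2 * xi[j] - 1)) for wi, yi, xi in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    score = sum(alpha * s * (2 * x[j] - 1) for alpha, j, s in model)
    return 1 if score >= 0 else -1


X = [
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 1, 0],
]
y = [-1, +1, +1, -1]  # hypothetical class labels

model = adaboost_stumps(X, y)
print([predict(model, x) for x in X])
```

On this toy data a single stump on the third feature with negative sign already separates the classes, so every round selects it again.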
/4026 Sparse Learning: Boosting
e.g. LPboost (gBoost) (Saigo et al, Mach Learn, 2009)
AdaBoost, LPboost
L1 SVM with hinge loss (1-norm SVM)
SVM
hinge loss + L1
Totally corrective boosting
Dantzig-Wolfe LP = boosting; LPboost (Coordinate Descent?)
AdaBoost
LPboost (Demiriz et al, 2002)
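For reference, the soft-margin linear program behind LPBoost can be written as below, following the standard formulation of Demiriz et al (2002). The symbols are assumptions introduced here since the slide's own notation did not survive: $h_j$ are the weak hypotheses (for graphs, subgraph-indicator stumps), $\alpha_j$ their weights, $\rho$ the margin, $\xi_i$ the slacks, $w_i$ the example weights and $D$ the trade-off constant.

```latex
% Primal: maximize the soft margin over convex combinations of hypotheses
\begin{aligned}
\max_{\rho,\,\alpha,\,\xi}\quad & \rho - D \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i \sum_{j} \alpha_j h_j(g_i) + \xi_i \ge \rho, \quad i = 1, \ldots, n, \\
& \sum_{j} \alpha_j = 1, \quad \alpha_j \ge 0, \quad \xi_i \ge 0.
\end{aligned}

% Dual: one constraint per hypothesis, which is what column generation exploits
\begin{aligned}
\min_{\gamma,\,w}\quad & \gamma \\
\text{s.t.}\quad & \sum_{i=1}^{n} w_i\, y_i\, h_j(g_i) \le \gamma \quad \text{for all } j, \\
& \sum_{i=1}^{n} w_i = 1, \quad 0 \le w_i \le D.
\end{aligned}
```

Each boosting round adds the hypothesis whose dual constraint is most violated, that is, the one maximizing $\sum_i w_i y_i h_j(g_i)$, and then re-solves the LP over all hypotheses found so far, which is why the method is totally corrective.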
/4027
1. Data-Driven Fingerprint: Hashed Fingerprint, ECFP
2. Graph Kernels: Weisfeiler-Lehman
3. Boosting: AdaBoost, LPboost
3
AdaBoost, LPboost (loss)
/4028
(1) (2) (3)
(3)
/4029 Sparse Learning
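$\lambda_1, \lambda_2 > 0$, loss $L$
$\beta = (\beta_1, \beta_2, \ldots)$
$$g \mapsto \mu(g \mid \beta, \beta_0) := \beta_0 + \sum_{j=1}^{\infty} \beta_j I(x_j \subseteq g)$$
$$\min_{\beta, \beta_0} \; \sum_{i=1}^{n} L\bigl(y_i, \mu(g_i \mid \beta, \beta_0)\bigr) + \lambda_1 \|\beta\|_1 + \frac{\lambda_2}{2} \|\beta\|_2^2$$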
AdaBoost, LPboost (Coordinate Descent)
or Gradient Boosting, Boosting
/4030
$$g \mapsto \mu(g \mid \beta, \beta_0) := \beta_0 + \sum_{j=1}^{\infty} \beta_j I(x_j \subseteq g)$$
Sparse Learning:
AdaBoost, LPboost (graph kernel)
0-1: Pseudo-boolean function
Linear threshold function, boolean cube
/4031 bounding
(gSpan) Branch and Bound, LPboost
[Figure: subgraph patterns, each with a 0/1 indicator per training graph]
/4032
$z_i \in \{0, 1\}$, $f(z_1, z_2, \ldots, z_n) \in \mathbb{R}$
x: n-dimensional Boolean vector
1 → 1 or 0: max/min bounding
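/4033 bounding
Morishita-Kudo Bounds for Separable Functions
$$f : \{0, 1\}^n \to \mathbb{R}, \qquad f(u_1, u_2, \ldots, u_n) = \sum_{i=1}^{n} f_i(u_i)$$
$$\underline{f}(u) \le f(v) \le \overline{f}(u) \quad \text{for all } v \text{ s.t. } 1(v) \subseteq 1(u)$$
$$\overline{f}(u) := \sum_{i \in 1(u)} \max\{f_i(0), f_i(1)\} + \sum_{i \in 0(u)} f_i(0)$$
$$\underline{f}(u) := \sum_{i \in 1(u)} \min\{f_i(0), f_i(1)\} + \sum_{i \in 0(u)} f_i(0)$$
Example ($n$ bits): $v = 001000110 \cdots 0$, $u = 011001110 \cdots 1$
Gain (Morishita, 2001; Kudo et al, 2005): $\sum_{i=1}^{n} w_i y_i \bigl(2 I(x \subseteq g_i) - 1\bigr)$
Weighted error count: $\sum_{i=1}^{n} w_i I\bigl(I(x \subseteq g_i) = y_i\bigr)$
Correlation with response: $\sum_{i=1}^{n} y_i I(x \subseteq g_i)$
0-1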
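The bounds for a separable function can be checked numerically with a small sketch. The representation of $f$ as a list of pairs `(f_i(0), f_i(1))` and the names `mk_bounds` and `separable` are assumptions for illustration; the example terms follow the gain formula with unit weights and made-up labels y = (+1, -1, +1, -1).

```python
from itertools import product


def separable(f_terms, v):
    """Evaluate f(v) = sum_i f_i(v_i), with f_terms[i] = (f_i(0), f_i(1))."""
    return sum(f1 if vi else f0 for (f0, f1), vi in zip(f_terms, v))


def mk_bounds(f_terms, u):
    """Lower and upper bounds on f(v) over all v whose 1-positions lie inside u's."""
    upper = sum(max(f0, f1) if ui else f0 for (f0, f1), ui in zip(f_terms, u))
    lower = sum(min(f0, f1) if ui else f0 for (f0, f1), ui in zip(f_terms, u))
    return lower, upper


# Gain terms f_i(z) = w_i * y_i * (2z - 1) with w_i = 1 (hypothetical labels).
y = [+1, -1, +1, -1]
f_terms = [(-yi, yi) for yi in y]

u = (1, 1, 0, 1)
lower, upper = mk_bounds(f_terms, u)

# Brute force over every v with 1(v) contained in 1(u).
values = [
    separable(f_terms, v)
    for v in product((0, 1), repeat=len(u))
    if all(vi <= ui for vi, ui in zip(v, u))
]
print(lower, min(values), max(values), upper)
```

Since a superpattern can only occur in a subset of the graphs its subpattern occurs in, these bounds hold for the whole subtree below a pattern in the search tree, which is what makes pruning possible.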
/4034 Boosting
Iteration 1: $\beta_1 I(x_1 \subseteq g)$
Iteration 2: $\beta_1 I(x_1 \subseteq g) + \beta_2 I(x_2 \subseteq g)$
Iteration 3: $\beta_1 I(x_1 \subseteq g) + \beta_2 I(x_2 \subseteq g) + \beta_3 I(x_3 \subseteq g)$
...
Main Trick: MK Bounds, Branch & Bound (pruning)
(subtree)
xi
boosting
iteration
k-best (multiple pricing)
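/4035
Tseng-Yun's BCGD (BCGD: Block Coordinate Gradient Descent)
$$\min_{\theta} \; f(\theta) + R(\theta), \qquad \theta = (\beta_0, \beta), \quad R \text{ nonsmooth}$$
Iterations: $\theta(t+1) \leftarrow \theta(t) + \alpha(t)\, d(t)$, $\quad d(t) := T(\theta(t)) - \theta(t)$
$$T(\theta(t)) := \arg\min_{\theta} \Bigl[ \langle \nabla f(\theta(t)), \theta - \theta(t) \rangle + \tfrac{1}{2} \langle \theta - \theta(t), H(t)(\theta - \theta(t)) \rangle + R(\theta) \Bigr]$$
(the first two terms: 2nd-order approx of $f(\theta)$ at $\theta(t)$)
Coordinate block: Gauss-Southwell; $d(t)_j = 0$ for $|d(t)_j| \le C \|d(t)\|_\infty$ (Gauss-Southwell-r rule)
step length $\alpha(t)$ selected by Armijo rule at each iteration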
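As a simplified stand-in for the update above, the following sketch runs plain cyclic coordinate descent on the elastic-net objective with squared loss $L(y, \mu) = \tfrac{1}{2}(y - \mu)^2$ over a fixed, finite 0/1 feature matrix. It is not Tseng-Yun's BCGD: there is no Gauss-Southwell block selection, no Armijo step and no subgraph search. The function names are hypothetical, and the data reuse the 0/1 rows and responses shown earlier in the deck.

```python
def soft_threshold(a, lam):
    """Proximal operator of lam * |.|, the source of exact zeros."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def predict(beta0, beta, x):
    return beta0 + sum(b * xj for b, xj in zip(beta, x))


def elastic_net_cd(X, y, lam1, lam2, sweeps=100):
    n, d = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * d  # start from the zero vector
    for _ in range(sweeps):
        # Intercept is unpenalized: set it to the mean partial residual.
        beta0 = sum(yi - predict(0.0, beta, xi) for xi, yi in zip(X, y)) / n
        for j in range(d):
            col_sq = sum(xi[j] ** 2 for xi in X)
            if col_sq == 0:
                continue
            # Correlation of feature j with the residual that ignores beta_j.
            rho = sum(
                xi[j] * (yi - predict(beta0, beta, xi) + beta[j] * xi[j])
                for xi, yi in zip(X, y)
            )
            beta[j] = soft_threshold(rho, lam1) / (col_sq + lam2)
    return beta0, beta


def objective(X, y, beta0, beta, lam1, lam2):
    loss = 0.5 * sum((yi - predict(beta0, beta, xi)) ** 2 for xi, yi in zip(X, y))
    return loss + lam1 * sum(abs(b) for b in beta) + 0.5 * lam2 * sum(b * b for b in beta)


X = [
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 1, 0],
]
y = [0.1, 0.7, 0.9, 1.2]

b0, b = elastic_net_cd(X, y, lam1=0.01, lam2=0.1)
print(objective(X, y, 0.0, [0.0] * 6, 0.01, 0.1), objective(X, y, b0, b, 0.01, 0.1))
```

The soft-thresholding step drives many coefficients to exactly zero, so only a small set of features stays active at any time.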
/4036
1) zero vector
2) iterate: BCGD iteration; (iteration) MK bounds (=)
3) BCGD
(boosting) (elastic-net)
iteration; iteration, boosting
/4037
() (+) LPboost, L1-
Boosting (test graph) () fingerprint ()
(L2)
()
pseudo-boolean function, unique ()
/4038
(1)
(2)
(3)
(3)