2. HIDO Shohei. TwitterID: @sla. 2006-2012: IBM. 2012-: Jubatus.
2013-: Preferred Infrastructure America, Inc., Chief Research
Officer.
10. Grubbs, 1969: An outlying observation, or outlier, is one that
appears to deviate markedly from other members of the sample in
which it occurs. Hawkins, 1980: An observation that deviates so
much from other observations as to arouse suspicion that it was
generated by a different mechanism. Barnett & Lewis, 1994: An
observation (or subset of observations) which appears to be
inconsistent with the remainder of that set of data.
11. HDD; DDoS
12. Three categories: 1. Outlier detection 2. Change point
detection 3. Anomaly detection, etc.
13. i.i.d. [Figure: plot with both axes spanning -5 to 5]
14. [Figure: plot with x-axis from 0 to 30 and y-axis from -5 to 5]
15. [Figure: two plots, each with x-axis from 0 to 30 and y-axis
from -5 to 5]
16. Agenda
17. i.i.d.; Unix
18. (1/3) Minimum Volume Ellipsoid estimation [Rousseeuw, 1985];
Minimum Covariance Determinant [Rousseeuw, 1999]
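The Minimum Covariance Determinant idea named above can be sketched for the one-dimensional case, where the minimizing subset is always a contiguous window of the sorted data. This is an illustration only, not the estimator from the cited papers: the function names `univariate_mcd` and `robust_distances`, the default subset size, and the toy data are all assumptions, and no consistency correction is applied to the scale.

```python
import statistics

def univariate_mcd(values, h=None):
    """Return (center, scale) of the h-subset with the smallest variance.

    In one dimension the minimizing subset is a contiguous window of the
    sorted values, so it is enough to scan the n - h + 1 windows.
    """
    n = len(values)
    if h is None:
        h = (n + 2) // 2  # assumed default: floor((n + p + 1) / 2) with p = 1
    xs = sorted(values)
    windows = (xs[i:i + h] for i in range(n - h + 1))
    best = min(windows, key=statistics.pvariance)
    scale = statistics.pstdev(best)
    if scale == 0:
        raise ValueError("selected subset has zero spread")
    return statistics.fmean(best), scale

def robust_distances(values, h=None):
    """Distance of each value from the robust center, in robust-scale units."""
    center, scale = univariate_mcd(values, h)
    return [abs(v - center) / scale for v in values]

# Toy data (hypothetical): six readings near 10 and one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
scores = robust_distances(data)
```

Because the center and scale come only from the tightest half of the data, the value 25.0 does not inflate the scale it is measured against, which is the property that makes this family of estimators useful for outlier detection.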
19. (2/3) RIPPER [Cohen, Fast effective rule induction, Machine
Learning, 1995]
21. Iris33 USPS10256
22. (3/3) One-class SVM; Local Outlier Factor (LOF)
23. One-class SVM [Schoelkopf et al., 1999]. Support Vector
Machine (SVM); OC-SVM. Manevitz et al., One-Class SVMs for
Document Classification, 2001.
24. LOF: Identifying Density-Based Local Outliers [Breunig,
SIGMOD 2000]. LOF for example points X1, X2, X3.
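A minimal sketch of the LOF score, following the definitions in the cited paper (k-distance, reachability distance, local reachability density). The function name `lof_scores` and the toy data are hypothetical, and the sketch assumes the data contains no duplicate points, since duplicates can make a reachability sum zero.

```python
import math

def lof_scores(points, k):
    """Local Outlier Factor of every point, using Euclidean distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist = []  # distance from each point to its k-th nearest neighbor
    neigh = []   # indices of all points within that k-distance (ties kept)
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neigh.append([j for d, j in others if d <= kd])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        total = sum(max(k_dist[j], dist[i][j]) for j in neigh[i])
        lrd.append(len(neigh[i]) / total)

    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]

# Toy data (hypothetical): a 3x3 grid cluster plus one distant point.
points = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
scores = lof_scores(points, k=3)
```

Points inside the grid get scores close to 1, because their density matches that of their neighbors, while the distant point gets a score several times larger.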
25. normal: [, 2009]; faulty
26. [Sugiyama & Borgwardt, NIPS 2013] K=4; K (5100); K=20; CR.
https://github.com/mahito-sugiyama/sampling-outlier-detection/
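The sampling idea behind the cited method can be sketched roughly as follows: draw one small random sample, then score every point by its distance to the nearest sampled point. This is an illustration under assumptions, not the code from the repository above: the function name `sampling_outlier_scores`, the `sample_size` and `seed` parameters, and the choice to skip a point's own entry in the sample are all hypothetical.

```python
import math
import random

def sampling_outlier_scores(points, sample_size, seed=0):
    """Score each point by its distance to the nearest sampled point."""
    if not 2 <= sample_size <= len(points):
        raise ValueError("sample_size must be between 2 and len(points)")
    rng = random.Random(seed)
    # One sample, drawn once without replacement, shared by all points.
    sample_idx = rng.sample(range(len(points)), sample_size)
    scores = []
    for i, p in enumerate(points):
        # Assumption: a sampled point is not compared with itself.
        scores.append(min(math.dist(p, points[j])
                          for j in sample_idx if j != i))
    return scores

# Toy data (hypothetical): a 5x5 grid cluster plus one distant point.
points = [(x, y) for x in range(5) for y in range(5)] + [(50, 50)]
scores = sampling_outlier_scores(points, sample_size=5)
```

Each point is compared with only `sample_size` others, so the cost grows linearly with the number of points rather than quadratically as in an all-pairs nearest-neighbor search.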
27. Agenda
28. (1/3) R: S; CRAN. LOF: dprep (lofactor), Rlof (lof).
One-class SVM: kernlab (LIBSVM, one-svc). ADM3: ADM3.
cpm: detectChangePoint. OutlierDC. mvoutlier.
29. (2/3) OSS. Weka: Java. SHOGUN: SVM; One-class SVM.
scikit-learn: Python; One-class SVM, EllipticEnvelope. ELKI: AGPL;
Distance Based & Spatial outlier detection: LOF, DB-outlier, LOCI,
LDOF, OPTICS-OF, EM-Outlier. RapidMiner: YALE, Weka, R; LOF,
DB-outlier, Class Outlier Factor. SAS.
30. (3/3) NEC: Smart Sifter, Change Detector; Malheur; Naive
Bayes; FICO Falcon Fraud Manager; Neural network, Multi-layered
self-calibrating analytics; IBM, SAS, Oracle, NEC
31. Agenda
32. Machine learning that matters. Kiri L. Wagstaff, ICML, 2012.
33. Edge-heavy data: exhaust data. Edge-Heavy Data: CPS GICTF
2012, http://www.gictf.jp/doc/20120709GICTF.pdf
34. Jubatus