Pham Huyen Trang
THE K-MEANS ALGORITHM AND ITS APPLICATIONS
Supervisor: Tran Nam Khanh, B.Sc.
Student: Pham Huyen Trang
Class: K52CA
MAIN CONTENTS
- Clustering
- The K-means algorithm
  - Overview of the algorithm
  - Steps of the algorithm
  - Illustrative example
  - Algorithm demo
  - Evaluation of the algorithm
  - Generalization and variants
- Applications of the K-means algorithm
I. CLUSTERING
What is clustering?
- The process of partitioning an initial data set into clusters such that:
  - objects in the same cluster are similar to one another;
  - objects in different clusters are dissimilar.
- It addresses the problem of finding and discovering clusters and patterns in an initial set of unlabeled data.
Let X be a set of data points and C_i the i-th cluster. Then
  $X = C_1 \cup \dots \cup C_k \cup C_{\text{outliers}}$ and $C_i \cap C_j = \emptyset$ for $i \neq j$.
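Some measures used in clustering
- Minkowski distance: $d(x, y) = \left( \sum_{i} |x_i - y_i|^p \right)^{1/p}$, where the sum runs over the vector components.
- Euclidean distance: the case p = 2, $d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$.
- Similarity (closeness) measure: the cosine of the angle between two vectors, $\cos\theta = \frac{x \cdot y}{\|x\| \, \|y\|}$.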
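The measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names `minkowski`, `euclidean`, and `cosine_similarity` are chosen here for the example, and vectors are assumed to be equal-length sequences of numbers.

```python
import math

def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def euclidean(x, y):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(x, y, 2)

def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Example vectors chosen only for illustration.
print(euclidean((0, 0), (3, 4)))
print(minkowski((0, 0), (3, 4), 1))
print(cosine_similarity((1, 2), (2, 4)))
```

Note that a distance is small for similar objects, while the cosine is large (close to 1) for vectors pointing in the same direction.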
Purpose of clustering
- To identify the nature of the grouping of objects in an unlabeled data set.
- Clustering does not rely on any single universal criterion; it relies on the criteria the user supplies in each particular case.
Some typical clustering methods
- Partitioning clustering
- Hierarchical clustering
- Density-based clustering
- Grid-based clustering
- Model-based clustering
- Constraint-based clustering
II. PARTITIONING CLUSTERING
- Partition a given data set of n elements into k subsets (k ≤ n), each subset representing one cluster.
- The clusters are formed by optimizing the value of a similarity function so that:
  - objects in the same cluster are similar;
  - objects in different clusters are dissimilar.
- Characteristics:
  - each object belongs to exactly one cluster;
  - each cluster contains at least one object.
- Some typical algorithms: K-means, PAM, CLARA, ...
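The two characteristics of a partitioning (each object in exactly one cluster, no empty cluster) can be checked mechanically. The helper below is a hypothetical sketch written for this purpose; it assumes the objects are distinct and hashable.

```python
def is_valid_partition(objects, clusters):
    """Return True if `clusters` is a partition of `objects`:
    no cluster is empty and every object appears in exactly one cluster."""
    if any(len(cluster) == 0 for cluster in clusters):
        return False
    assigned = [x for cluster in clusters for x in cluster]
    no_duplicates = len(assigned) == len(set(assigned))
    covers_everything = set(assigned) == set(objects)
    return no_duplicates and covers_everything

objects = ["A", "B", "C", "D"]
print(is_valid_partition(objects, [["A", "B"], ["C", "D"]]))       # valid
print(is_valid_partition(objects, [["A", "B"], ["B", "C", "D"]]))  # B is in two clusters
print(is_valid_partition(objects, [["A", "B", "C", "D"], []]))     # one cluster is empty
```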
II.2. The K-means algorithm
Problem statement:
- Input: a set of objects X = {x_i | i = 1, 2, ..., N} and the number of clusters K.
- Output: disjoint clusters C_i (i = 1, ..., K) for which the criterion function E attains its minimum value.
II.1. OVERVIEW OF THE ALGORITHM
- The algorithm operates on a set of d-dimensional vectors; the data set X consists of N elements: X = {x_i | i = 1, 2, ..., N}.
- K-means repeats the following two steps many times:
  - assign the data to clusters;
  - update the positions of the centroids.
- The iteration stops when the centroids have converged and every object is a member of a cluster.
- The criterion function uses the Euclidean distance:
  $E = \sum_{j=1}^{K} \sum_{x_i \in C_j} \| x_i - c_j \|^2$, where c_j is the centroid of cluster C_j.
- This function is non-negative and decreases whenever something changes in either of the two steps: assigning the data and relocating the centroids.
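As a small sketch of the criterion function, the code below evaluates E as the sum of squared Euclidean distances from each object to the centroid of its cluster. The function names and the two sample configurations (four 2-D points grouped in two different ways) are assumptions made for this illustration.

```python
def squared_distance(x, c):
    """Squared Euclidean distance between a point and a centroid."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def objective(clusters, centroids):
    """E: sum over clusters C_j and points x in C_j of ||x - c_j||^2."""
    return sum(
        squared_distance(x, c)
        for members, c in zip(clusters, centroids)
        for x in members
    )

# Grouping 1: one singleton cluster and one cluster of three points.
e_before = objective(
    [[(1, 1)], [(2, 1), (4, 3), (5, 4)]],
    [(1.0, 1.0), (11 / 3, 8 / 3)],
)
# Grouping 2: two clusters of two points each, with their means as centroids.
e_after = objective(
    [[(1, 1), (2, 1)], [(4, 3), (5, 4)]],
    [(1.5, 1.0), (4.5, 3.5)],
)
print(e_before, e_after)
```

The second grouping gives the smaller value of E, which is the sense in which the criterion decreases as assignments and centroids improve.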
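II.2. STEPS OF THE ALGORITHM
Step 1 - Initialization: choose K centroids {c_i} (i = 1, ..., K).
Step 2 - Distance computation: assign each object to the cluster of its nearest centroid,
  $S_i^{(t)} = \{ x_j : \| x_j - c_i^{(t)} \|^2 \le \| x_j - c_{i^*}^{(t)} \|^2 \text{ for all } i^* = 1, \dots, K \}$
Step 3 - Centroid update:
  $c_i^{(t+1)} = \frac{1}{|S_i^{(t)}|} \sum_{x_j \in S_i^{(t)}} x_j$
Step 4 - Stopping condition: repeat steps 2 and 3 until the cluster centroids no longer change.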
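The four steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the initial centroids are passed in by the caller, an empty cluster keeps its previous centroid (a choice made here, not taken from the slides), and `max_iter` is a safety cap added for the example.

```python
def squared_distance(x, c):
    return sum((a - b) ** 2 for a, b in zip(x, c))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, initial_centroids, max_iter=100):
    # Step 1: initialization (centroids supplied by the caller).
    centroids = [tuple(float(v) for v in c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(x, centroids[i]),
            )
            clusters[nearest].append(x)
        # Step 3: recompute each centroid as the mean of its members.
        # Assumption: an empty cluster keeps its old centroid.
        new_centroids = [
            mean(members) if members else old
            for members, old in zip(clusters, centroids)
        ]
        # Step 4: stop when no centroid has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Illustrative data: two well-separated pairs of points.
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
final_centroids, final_clusters = kmeans(data, [(0, 0), (10, 10)])
print(final_centroids)
print(final_clusters)
```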
II.3 ILLUSTRATIVE EXAMPLE

Object   Attribute 1 (X)   Attribute 2 (Y)
A        1                 1
B        2                 1
C        4                 3
D        5                 4
[Chart: scatter plot of the four objects, X-Values against Y-Values.]
Step 1: Initialization
Choose 2 initial centroids, c1 = (1, 1) ≡ A and c2 = (2, 1) ≡ B, belonging to clusters 1 and 2 respectively.
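Step 2: Distance computation (the values below are squared Euclidean distances)
- d(C, c1) = (4 - 1)^2 + (3 - 1)^2 = 13 and d(C, c2) = (4 - 2)^2 + (3 - 1)^2 = 8. Since d(C, c1) > d(C, c2), C belongs to cluster 2.
- d(D, c1) = (5 - 1)^2 + (4 - 1)^2 = 25 and d(D, c2) = (5 - 2)^2 + (4 - 1)^2 = 18. Since d(D, c1) > d(D, c2), D belongs to cluster 2.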
Step 3: Update the centroid positions
- Centroid of cluster 1: c1 ≡ A = (1, 1).
- Centroid of cluster 2: c2 = (x, y) = ((2 + 4 + 5)/3, (1 + 3 + 4)/3) = (11/3, 8/3) ≈ (3.67, 2.67).
[Chart: the four objects together with the updated centroid c2 ≈ (3.67, 2.67).]
Step 4-1: Repeat step 2 (distance computation)
- d(A, c1) = 0 < d(A, c2) = 9.89, so A belongs to cluster 1.
- d(B, c1) = 1 < d(B, c2) = 5.56, so B belongs to cluster 1.
- d(C, c1) = 13 > d(C, c2) = 0.22, so C belongs to cluster 2.
- d(D, c1) = 25 > d(D, c2) = 3.56, so D belongs to cluster 2.
Step 4-2: Repeat step 3 (centroid update)
c1 = (3/2, 1) and c2 = (9/2, 7/2).
[Chart: the four objects together with the centroids c1 = (1.5, 1) and c2 = (4.5, 3.5).]
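Step 4-3: Repeat step 2
- d(A, c1) = 0.25 < d(A, c2) = 18.5, so A belongs to cluster 1.
- d(B, c1) = 0.25 < d(B, c2) = 12.5, so B belongs to cluster 1.
- d(C, c1) = 10.25 > d(C, c2) = 0.5, so C belongs to cluster 2.
- d(D, c1) = 21.25 > d(D, c2) = 0.5, so D belongs to cluster 2.
The assignments are the same as in the previous iteration, so the centroids no longer change and the algorithm stops.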
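The worked example can be re-traced with a short script. This is a sketch for checking the numbers, using squared Euclidean distances as in the example; the variable names (`points`, `history`) are chosen here, and the script assumes no cluster ever becomes empty, which holds for this data.

```python
def squared_distance(x, c):
    return sum((a - b) ** 2 for a, b in zip(x, c))

def centroid(members):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(members) for coords in zip(*members))

points = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1.0, 1.0), (2.0, 1.0)]  # initial centroids: A and B
history = []  # one (assignment, updated centroids) pair per iteration

while True:
    # Assign each object to the index (0 or 1) of its nearest centroid.
    assignment = {}
    for name, p in points.items():
        assignment[name] = min(
            range(len(centroids)),
            key=lambda j: squared_distance(p, centroids[j]),
        )
    # Recompute both centroids from their members.
    new_centroids = []
    for j in range(len(centroids)):
        members = [points[n] for n in points if assignment[n] == j]
        new_centroids.append(centroid(members))
    history.append((assignment, new_centroids))
    if new_centroids == centroids:
        break
    centroids = new_centroids

for step, (assignment, cents) in enumerate(history, start=1):
    labels = {name: j + 1 for name, j in assignment.items()}  # 1-based cluster numbers
    print(step, labels, [tuple(round(v, 2) for v in c) for c in cents])
```

The script runs three assignment passes: the first puts B, C, and D in cluster 2, the second moves B to cluster 1, and the third leaves everything unchanged, which is the stopping condition.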
II.4 EVALUATION OF THE ALGORITHM
ADVANTAGES
- Complexity: O(K · N · l) distance computations, where l is the number of iterations.
- Scalable, and easy to adapt to new data.
- Guaranteed to converge after a finite number of iterations.
- There are always K clusters.
- There is always at least one data point in each cluster.
- The clusters are not hierarchical and do not overlap.
- Every member of a cluster is closer to its own cluster than to any other cluster.
DISADVANTAGES
- Unable to find non-convex clusters or clusters with complex shapes.
- Difficulty in determining the initial cluster centroids:
  - the cluster centers are chosen at random at initialization;
  - the convergence of the algorithm depends on how the cluster center vectors are initialized.
- It is hard to choose the optimal number of clusters up front; several trials are needed to find it.
- Very sensitive to noise and to outliers in the data.
- In practice an object does not always belong to just one cluster; K-means is suitable only when the boundaries between clusters are clear.
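The dependence on initialization can be seen on a tiny data set. The sketch below is an illustration built for this point: the four points (corners of a 4 x 1 rectangle) and the two initializations are hypothetical, and the compact `kmeans` helper keeps the old centroid for an empty cluster, which is an assumption of this example.

```python
def squared_distance(x, c):
    return sum((a - b) ** 2 for a, b in zip(x, c))

def kmeans(points, centroids):
    """Plain K-means from the given initial centroids.
    Returns the final centroids and the criterion value E."""
    while True:
        clusters = [[] for _ in centroids]
        for x in points:
            j = min(range(len(centroids)),
                    key=lambda i: squared_distance(x, centroids[i]))
            clusters[j].append(x)
        new = [
            tuple(sum(v) / len(m) for v in zip(*m)) if m else c
            for m, c in zip(clusters, centroids)
        ]
        if new == centroids:
            break
        centroids = new
    error = sum(squared_distance(x, c)
                for m, c in zip(clusters, centroids) for x in m)
    return centroids, error

corners = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Initial centroids on opposite short sides: left pair vs. right pair.
good_centroids, good_error = kmeans(corners, [(0, 0), (4, 0)])
# Initial centroids on the same short side: bottom pair vs. top pair.
bad_centroids, bad_error = kmeans(corners, [(0, 0), (0, 1)])

print(good_centroids, good_error)
print(bad_centroids, bad_error)
```

Both runs converge, but the second one stops at a grouping with a much larger E, which is why several random restarts are commonly tried.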
II.5 GENERALIZATION AND VARIANTS
Variants
The K-medoid algorithm:
- Similar to the K-means algorithm.
- Each cluster is represented by one of the objects in the cluster.
- The object closest to the cluster center is chosen as the representative of that cluster.
- K-medoid copes with noise, but its complexity is higher.
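To illustrate why a medoid is less affected by noise than a mean, the sketch below compares the two on a cluster containing one outlier. It assumes the common definition of the medoid as the member with the smallest total distance to all other members, which is a slightly different formulation from the "closest to the center" rule stated above; the data is made up for the example, and `math.dist` requires Python 3.8 or later.

```python
import math

def mean(points):
    """Component-wise mean; generally not one of the data points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """Member of the cluster with the smallest total distance to the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

# Four nearby points plus one far-away outlier.
cluster = [(1, 1), (2, 1), (1, 2), (2, 2), (100, 100)]

print(mean(cluster))    # dragged towards the outlier
print(medoid(cluster))  # remains one of the nearby points
```

Finding the medoid compares every pair of members, which is one reason the K-medoid approach costs more than K-means.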
The Fuzzy c-means (FCM) algorithm:
- Shares the same clustering strategy as K-means.
- Whereas K-means is hard clustering (a data point belongs to exactly one cluster), FCM is fuzzy clustering (a data point may belong to more than one cluster, each with a certain probability).
- It adds the relationship between the elements and the clusters through weights in a matrix that represents each member's degree of membership in a cluster.
- FCM handles overlapping clusters on larger, high-dimensional, and noisy data sets, but it remains sensitive to noise and outliers.
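The membership weights can be sketched with the standard FCM membership formula, $u_j = 1 / \sum_k (d_j / d_k)^{2/(m-1)}$, where d_j is the distance from the point to centroid j. The fuzzifier m (set to 2 here) is a parameter of the standard formulation that the slides do not mention, and the point and centroids below are chosen only for illustration; `math.dist` requires Python 3.8 or later.

```python
import math

def fcm_memberships(x, centroids, m=2.0):
    """Degrees of membership of point x in each cluster (they sum to 1)."""
    dists = [math.dist(x, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [d == 0.0 for d in dists]
        return [1.0 / sum(hits) if hit else 0.0 for hit in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
        for d_j in dists
    ]

# A point close to the second of two centroids.
weights = fcm_memberships((4, 3), [(1.5, 1.0), (4.5, 3.5)])
print(weights)
```

Hard K-means would simply assign this point to the second cluster; FCM instead gives it a large weight for the second cluster and a small one for the first.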
III. APPLICATIONS OF THE ALGORITHM
- Clustering web documents
  - Document search and retrieval.
  - Document preprocessing: tokenize and vectorize the documents, that is, find each word and replace it with its index in the dictionary.
  - Represent the data as vectors.
  - Apply K-means.
  - The result is a set of document clusters with their corresponding centroids.
- Image segmentation
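The preprocessing step for documents can be sketched as follows: split each document into words, give every distinct word an index in a dictionary, and represent a document as a vector of word counts. The tokenization rule (lowercase runs of letters and digits) and the sample documents are assumptions made for this illustration; real systems usually add stop-word removal and term weighting.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_dictionary(documents):
    """Map every distinct word to an index, in order of first appearance."""
    dictionary = {}
    for doc in documents:
        for word in tokenize(doc):
            dictionary.setdefault(word, len(dictionary))
    return dictionary

def vectorize(doc, dictionary):
    """Represent a document as a vector of word counts."""
    vector = [0] * len(dictionary)
    for word in tokenize(doc):
        index = dictionary.get(word)
        if index is not None:  # words outside the dictionary are ignored
            vector[index] += 1
    return vector

docs = ["web search engine", "web page clustering", "image segmentation"]
dictionary = build_dictionary(docs)
vectors = [vectorize(doc, dictionary) for doc in docs]
print(dictionary)
print(vectors)
```

The resulting vectors all have the same length, so they can be passed to a K-means routine as ordinary data points.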
Note: Unsupervised learning is a machine learning method that aims to find a model that fits the observations. It differs from supervised learning in that the correct output for each input is not known in advance. In unsupervised learning, a set of input data is collected, and the input objects are typically treated as a set of random variables.