Advisor: Prof. Sing Ling Lee    Student: Chao Chih Wang    Date: 2013.01.04

Ambiguous Nodes in Networked Data based on Measuring Reliable Neighboring Probabilities


Outline

Introduction
  Network data
  Traditional vs. networked data classification
  Collective classification
  ICA

Problem

Our Method
  Collective Inference With Ambiguous Node (CIAN)

Experiments

Conclusion

Introduction – Network data

Traditional data: instances are independent of each other.

Network data: instances may be related to each other.

Applications: emails, web pages, paper citations.

[Figure: independent instances next to related (linked) instances.]

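Introduction – Traditional vs. network data classification

[Figure: the same set of nodes A to H shown twice, once as independent instances and once as a linked network, with class labels 1 and 2. Legend: Class 1, Class 2.]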

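Introduction – Collective Classification

Collective classification classifies interrelated instances using both content features and link features.

node   content feature   link feature (class1 class2 class3)   class
D      1 0 0             1    0    0                           1
E      1 1 1             1/2  1/2  0                           2

[Figure: example graph with nodes A to E and class labels 1 and 2.]

A link feature can be encoded in three ways:

             D (class1 class2 class3)   E (class1 class2 class3)
Binary       1  0  0                    1    1    0
Count        2  0  0                    1    1    0
Proportion   1  0  0                    1/2  1/2  0

We use: the proportion encoding, as in the node table above.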
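The three link-feature encodings can be illustrated with a minimal sketch. The function name `link_features` and the neighbor lists for D and E are assumptions chosen to reproduce the Binary, Count and Proportion rows of the slide; they are not taken from the authors' code.

```python
from collections import Counter
from fractions import Fraction


def link_features(neighbor_labels, classes):
    """Return the binary, count and proportion encodings of one node's link feature."""
    counts = Counter(neighbor_labels)
    total = len(neighbor_labels)
    count = [counts[c] for c in classes]
    binary = [1 if n > 0 else 0 for n in count]
    # Proportion: share of the node's neighbors that belong to each class.
    proportion = [Fraction(n, total) if total else Fraction(0) for n in count]
    return {"binary": binary, "count": count, "proportion": proportion}


classes = [1, 2, 3]
d = link_features([1, 1], classes)  # assumed: D has two neighbors, both class 1
e = link_features([1, 2], classes)  # assumed: E has one class-1 and one class-2 neighbor
print(d["count"], e["proportion"])
```

Exact fractions are used so the output can be compared directly with the 1/2 entries on the slide.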

Introduction – ICA

ICA: Iterative Classification Algorithm

Step 1 (initial):
  Train the local classifier.
  Use content features to predict the unlabeled instances.

Step 2 (iterative):
  for each unlabeled instance {
    set the unlabeled instance's link feature
    use the local classifier to predict the unlabeled instance
  }
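The two ICA steps can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the experiments: the nearest-centroid "local classifier", the function names and the toy graph are all assumptions, and the slides' NB or SVM classifier is replaced by the centroid rule to keep the example dependency-free.

```python
def proportions(labels, classes):
    """Proportion link feature: share of neighbors in each class."""
    n = len(labels)
    return [sum(1 for l in labels if l == c) / n if n else 0.0 for c in classes]


def train_centroids(rows):
    """Toy local classifier (assumption): one mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict(centroids, features):
    """Return the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(centroids[c], features))
    return min(sorted(centroids), key=dist)


def ica(content, edges, known, classes, iterations=5):
    """content: node -> content features; edges: node -> neighbors; known: node -> label."""
    unlabeled = [n for n in content if n not in known]
    # Initial step: train on content features only and bootstrap the unlabeled nodes.
    content_model = train_centroids([(content[n], known[n]) for n in known])
    labels = dict(known)
    for n in unlabeled:
        labels[n] = predict(content_model, content[n])

    def full(n):
        nb = [labels[m] for m in edges.get(n, []) if m in labels]
        return list(content[n]) + proportions(nb, classes)

    # Train the content + link classifier on the labeled nodes.
    model = train_centroids([(full(n), known[n]) for n in known])
    # Iterative step: refresh each link feature, then predict the node again.
    for _ in range(iterations):
        for n in unlabeled:
            labels[n] = predict(model, full(n))
    return {n: labels[n] for n in unlabeled}


content = {"a": [1, 0, 1], "b": [1, 0, 1], "c": [0, 1, 1], "d": [0, 1, 1], "u": [1, 0, 0]}
edges = {"a": ["b", "u"], "b": ["a", "u"], "c": ["d"], "d": ["c"], "u": ["a", "b"]}
known = {"a": 1, "b": 1, "c": 2, "d": 2}
result = ica(content, edges, known, classes=[1, 2])
print(result)
```

In the toy graph the unlabeled node `u` is linked only to class-1 nodes, so both its content and its link feature pull it toward class 1.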

Introduction – ICA Example

[Figure: example graph of nodes A to I with class labels 1, 2 and 3; the legend distinguishes unlabeled data from training data.]

Training data:

node   content feature   link feature (class1 class2 class3)   class
C      1 0 0             0    1/2  1/2                         2
D      1 0 0             1/2  1/2  0                           2
E      1 0 1             1/2  1/4  1/4                         1
F      1 0 1             1    0    0                           1
G      1 0 1             1    0    0                           1
H      1 1 1             1/3  1/3  1/3                         3

A:

Iteration   content feature   link feature (class1 class2 class3)   class
1           1 0 1             2/3  0    1/3                         1
2           1 0 1             1/3  1/3  1/3                         3

B:

Iteration   content feature   link feature (class1 class2 class3)   class
1           1 0 1             1/2  1/2  0                           2
2           1 0 1             1/4  1/2  1/4                         2

Problem – Ambiguous Node

An ambiguous node is an instance whose label is hard to judge, so it is easily given the wrong class.

[Figure: example graph with nodes A to G and class labels 1 and 2; one node is marked "1 or 2?".]

node   content feature (a b)   class
C      0 0                     1
D      0 0                     1
E      1 1                     2
F      1 0                     2
G      1 1                     2

Content features:
  a. woman = 0, man = 1
  b. age ≤ 20 = 0, age > 20 = 1

Class: non-smoking = 1, smoking = 2

Problem – use ICA

[Figure: example graph with nodes A to J, their predicted classes and true classes; training node B is marked as ambiguous.]

Training data:

node   content feature   link feature (class1 class2 class3)   class
B      0 0 1             0    1    0                           1
D      0 1 1             1/3  2/3  0                           2
E      0 1 1             1/4  3/4  0                           2
F      1 0 1             2/3  1/3  0                           1
G      1 0 1             2/3  1/3  0                           1
H      0 1 1             0    1    0                           2

A:

Iteration   content feature   link feature (class1 class2 class3)   class
1           1 1 1             2/3  1/3  0                           1
2           1 1 1             2/3  1/3  0                           1

C: content feature 1 0 1

Idea

Make a new prediction for the neighbors of each unlabeled instance.

Use the prediction probabilities to compute the link feature.

Retrain the CC classifier.

Our Method – Method #1

Compute the link feature using probabilities.

[Figure: node A with three neighbors predicted as (class 1, 80%), (class 2, 60%) and (class 3, 70%).]

Our method:
  Class 1: 80/(80+60+70)
  Class 2: 60/(80+60+70)
  Class 3: 70/(80+60+70)

General method:
  Class 1: 1/3
  Class 2: 1/3
  Class 3: 1/3
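The comparison above can be reproduced with a short sketch. The function name `weighted_link_feature` is an assumption; the "general method" is obtained from the same function by giving every neighbor a weight of 1.

```python
def weighted_link_feature(neighbor_predictions, classes):
    """Link feature where each neighbor contributes its prediction probability.

    neighbor_predictions: list of (label, probability) pairs.
    """
    mass = {c: 0.0 for c in classes}
    for label, prob in neighbor_predictions:
        mass[label] += prob
    total = sum(mass.values())
    return [mass[c] / total if total else 0.0 for c in classes]


classes = [1, 2, 3]
neighbors = [(1, 0.80), (2, 0.60), (3, 0.70)]

ours = weighted_link_feature(neighbors, classes)
# General method: every neighbor counts equally, whatever its probability.
general = weighted_link_feature([(label, 1.0) for label, _ in neighbors], classes)
print(ours, general)
```

With these inputs `ours` is 80/210, 60/210, 70/210, while `general` stays at 1/3 for every class.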


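Predict the unlabeled instance's neighbors again.

[Figure, two panels: the example graph (nodes A to H) with true classes, each neighbor annotated with a (class, probability) prediction after predicting again. In the first panel B is marked "Ambiguous" and the annotations are (1, 70%), (2, 80%), (1, 70%), (1, 70%), (2, 60%), (2, 80%). In the second panel B is marked "Noise" and the annotations are (1, 70%), (2, 80%), (1, 70%), (1, 70%), (2, 90%), (2, 80%).]

First panel: B is an ambiguous node.

Second panel: B is a noise node.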

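Predicting the unlabeled instance's neighbors again:

The first iteration always needs to predict again.

If the original label and the predicted label differ:
  ▪ The neighbor is not adopted in this iteration.
  ▪ The next iteration needs to predict it again.

If the original label and the predicted label are the same:
  ▪ Average the two probabilities.
  ▪ The next iteration does not need to predict it again.

Example:

[Figure: node A with neighbors B and C. B: original (1, 80%), new prediction (2, 60%); the labels differ, so B is not adopted. C: original (2, 80%), new prediction (2, 60%); the labels agree, so C is used with the averaged value (2, 70%).]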
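The re-prediction rule can be written as a small state update. This is a sketch under assumptions: the `NeighborState` record and the function `apply_new_prediction` are hypothetical names, and the two flags simply encode "adopt in this iteration" and "predict again next iteration" from the bullets above.

```python
from dataclasses import dataclass


@dataclass
class NeighborState:
    label: int
    prob: float
    adopted: bool = True     # counted in the link feature this iteration?
    repredict: bool = True   # predict again in the next iteration?


def apply_new_prediction(state, new_label, new_prob):
    """Apply the rule for one neighbor after it has been predicted again."""
    if new_label != state.label:
        # Labels differ: do not adopt the neighbor now, predict again next time.
        return NeighborState(state.label, state.prob, adopted=False, repredict=True)
    # Labels agree: average the probabilities and stop re-predicting.
    averaged = (state.prob + new_prob) / 2
    return NeighborState(state.label, averaged, adopted=True, repredict=False)


b = apply_new_prediction(NeighborState(1, 0.80), 2, 0.60)
c = apply_new_prediction(NeighborState(2, 0.80), 2, 0.60)
print(b, c)
```

Here `b` is dropped from the current link feature, while `c` is kept with the averaged probability 0.70.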

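Example: how should a neighbor x be treated when its new prediction differs from its original label?

[Figure: graph with nodes w, x, y and z; x is marked as ambiguous. Original predictions of the three neighbors: (1, 60%) for x, and (2, 70%) and (3, 60%) for the other two. New predictions for the other two: (2, 80%) and (3, 60%), averaged to (2, 75%) and (3, 60%). x's true label: 2.]

Three candidate treatments of x:
  Method A (do not change class): x counts as (1, 50%)
  Method B (change class):        x counts as (2, 60%)
  Method C (do not adopt):        x counts as (1, 0%)

link feature   Class 1   Class 2   Class 3
original       0.315     0.368     0.315
Method A       0.27      0.405     0.324
Method B       0         0.692     0.307
Method C       0         0.555     0.444

If x is an ambiguous (or noise) node: Method B > Method C > Method A.
If x is not an ambiguous (or noise) node: Method A > Method C > Method B.
Methods A and B are too extreme, so we choose Method C.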
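The link-feature table for the three methods can be checked numerically. The sketch below is an illustration only: `link_feature` is a hypothetical helper, and the neighbor values (75% and 60% for the two stable neighbors, and 50%, 60% or 0% for x) are the ones that reproduce the slide's numbers, which appear to be truncated to three decimals.

```python
def link_feature(predictions, classes=(1, 2, 3)):
    """Probability-weighted link feature from (label, probability) pairs."""
    mass = {c: 0.0 for c in classes}
    for label, prob in predictions:
        mass[label] += prob
    total = sum(mass.values())
    return [mass[c] / total if total else 0.0 for c in classes]


# Averaged predictions of the two neighbors whose labels did not change.
others = [(2, 0.75), (3, 0.60)]

original = link_feature([(1, 0.60), (2, 0.70), (3, 0.60)])
method_a = link_feature([(1, 0.50)] + others)  # keep x's original class
method_b = link_feature([(2, 0.60)] + others)  # switch x to the new class
method_c = link_feature([(1, 0.00)] + others)  # do not adopt x at all

for name, row in [("original", original), ("A", method_a), ("B", method_b), ("C", method_c)]:
    print(name, [round(v, 4) for v in row])
```

Method C removes x's influence entirely, so the link feature depends only on the neighbors whose predictions are stable.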


[Figure: accuracy (%) of Method A, Method B and Method C over iterations 1 to 5; the accuracy axis runs from 67 to 75.]

Retrain CC classifier

[Figure: the example graph (nodes A to E) before and after its nodes are annotated with (class, probability) predictions: (1, 90%), (2, 60%), (2, 70%), (1, 80%) and (3, 70%).]

Initial (ICA):

node   content feature   link feature (class1 class2 class3)   class
A      1 0 1             1/2     1/2      0                    1
B      1 1 1             1       0        0                    2
C      1 0 1             1       0        0                    1

Retrain:

node   content feature   link feature (class1 class2 class3)   class
A      1 0 1             90/290  130/290  70/290               1
B      1 1 1             80/140  60/140   0                    2
C      1 0 1             80/150  0        70/150               1

CIAN Example – Ambiguous

[Figure: example graph with nodes A to G, their (class, probability) predictions and true labels. Training node B, labeled class 1, is marked as ambiguous; predicting it again gives (2, 60%). True label of A: 2.]

Training data:

node   content feature   link feature (class1 class2 class3)   class
B      0 0 1             0    1    0                           1
D      0 1 1             1/2  1/2  0                           2
E      0 1 1             1/2  1/2  0                           2
F      1 0 1             1    0    0                           1
G      1 0 1             1    0    0                           1

B, predicted again:

content feature   link feature (class1 class2 class3)   class
0 0 1             0.466  0.533  0                       2

A:

        content feature   link feature (class1 class2 class3)   class
ICA     1 1 1             2/3  1/3  0                           1
Our     1 1 1             1/2  1/2  0                           2

CIAN Example – Noise

[Figure: example graph with nodes A to G, their (class, probability) predictions and true labels. Training node B, labeled class 1, is marked as noise; predicting it again gives (2, 80%). True label of A: 2.]

Training data:

node   content feature   link feature (class1 class2 class3)   class
B      0 1 1             0    1    0                           1
D      0 1 1             1/2  1/2  0                           2
E      0 1 1             1/2  1/2  0                           2
F      1 0 1             1    0    0                           1
G      1 0 1             1    0    0                           1

B, predicted again:

content feature   link feature (class1 class2 class3)   class
0 1 1             0.466  0.533  0                       2

A:

        content feature   link feature (class1 class2 class3)   class
ICA     1 1 1             2/3  1/3  0                           1
Our     1 1 1             1/2  1/2  0                           2

CIAN

CIAN: Collective Inference With Ambiguous Node

Step 1 (initial):
  Train the local classifier.
  Use content features to predict the unlabeled instances.

Iterative {
  for each unlabeled instance A {
    for each neighbor nb of A {                                  (step 2)
      if (nb needs to be predicted again)
        (class label, probability) = local classifier(nb)
    }
    set A's link feature                                         (step 3)
    (class label, probability) = local classifier(A)             (step 4)
  }
  retrain the local classifier                                   (step 5)
}
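The control flow of CIAN can be sketched as a driver loop with the classifier passed in as callables. Everything named here is an assumption for illustration: `repredict`, `classify` and `retrain` stand in for the local classifier, the `state` dictionaries hold each neighbor's label, probability and flags, and the three-neighbor example is a hypothetical graph rather than the one on the slides. In this simplified version a neighbor shared by several unlabeled nodes may be predicted more than once per iteration.

```python
def link_feature(node, neighbors, state, classes):
    """Probability-weighted link feature over the adopted neighbors only."""
    mass = {c: 0.0 for c in classes}
    for nb in neighbors[node]:
        if state[nb]["adopted"]:
            mass[state[nb]["label"]] += state[nb]["prob"]
    total = sum(mass.values())
    return [mass[c] / total if total else 0.0 for c in classes]


def cian(unlabeled, neighbors, state, repredict, classify, retrain, classes, iterations=5):
    """Run the iterative part of CIAN; the initial step is assumed to be done already."""
    predictions = {}
    for _ in range(iterations):
        for node in unlabeled:
            for nb in neighbors[node]:                              # step 2
                s = state[nb]
                if not s["repredict"]:
                    continue
                label, prob = repredict(nb)
                if label != s["label"]:
                    s["adopted"] = False   # leave out now, predict again next iteration
                else:
                    s["prob"] = (s["prob"] + prob) / 2
                    s["adopted"], s["repredict"] = True, False
            link = link_feature(node, neighbors, state, classes)   # step 3
            predictions[node] = classify(node, link)               # step 4
        retrain(predictions)                                       # step 5
    return predictions


def new_state(label, prob):
    return {"label": label, "prob": prob, "adopted": True, "repredict": True}


neighbors = {"A": ["B", "C", "D"]}
state = {"B": new_state(1, 1.0), "C": new_state(1, 0.8), "D": new_state(2, 0.8)}
fixed = {"B": (2, 0.6), "C": (1, 0.8), "D": (2, 0.8)}  # stub re-predictions

predictions = cian(
    unlabeled=["A"],
    neighbors=neighbors,
    state=state,
    repredict=lambda nb: fixed[nb],
    classify=lambda node, link: link,   # stub: just record the link feature
    retrain=lambda preds: None,         # stub: no real classifier to retrain
    classes=[1, 2, 3],
    iterations=1,
)
print(predictions["A"])
```

Because B's new prediction disagrees with its label, B is left out and A's link feature is split evenly between class 1 and class 2.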

Experiments – Data sets

Characteristics     Cora   CiteSeer   WebKB-texas   WebKB-washington
Instances           2708   3312       187           230
Class labels        7      6          5             5
Link number         4732   5429       328           446
Content features    1433   3703       1703          1703
Link features       7      6          5             5

Experiments – Experimental setting

Fixed arguments:

                             Cora   CiteSeer   WebKB-texas   WebKB-washington
Instances                    2708   3312       187           230
Max ambiguous nodes (NB)     429    590        52            50
Max ambiguous nodes (SVM)    356    365        20            31
Training data                1500   2000       100           120
Iteration                    5      5          5             5

‧ Compare CO, ICA and CIAN.

Experiments

1. Misclassified nodes: proportion of misclassified nodes (0%~30%, 80%)

2. Ambiguous nodes: NB vs. SVM

3. Misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%~30%, 80%)

4. Iteration & stable: number of iterations

Experiments – 1. misclassified: Cora

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
0%           76.1   80.6   83.1   4.5        2.5
10%          73.2   76.4   79.7   3.2        3.3
20%          70.7   73.4   77.3   2.7        3.9
30%          68     70.3   74.5   2.3        4.2

Experiments – 1. misclassified: CiteSeer

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
0%           73.5   77.1   77.3   3.6        0.2
10%          71.8   73.7   75     1.9        1.3
20%          70.2   71.5   73.3   1.3        1.8
30%          69.1   70.3   72.4   1.2        2.1

Experiments – 1. misclassified: WebKB-texas

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
0%           73.2   74.6   74.9   1.4        0.3
10%          70.7   72.2   73.1   1.5        0.9
20%          69.4   70.4   71.9   1          1.5
30%          67.8   68.2   70.2   0.4        2

Experiments – 1. misclassified: WebKB-washington

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
0%           75.1   76.6   77.5   1.5        0.9
10%          73.2   74.6   76     1.4        1.4
20%          71.6   72.6   74.5   1          1.9
30%          69.8   70.6   72.8   0.8        2.2

Experiments – 1. misclassified

80% of misclassified nodes:

                    CO     ICA    CIAN
Cora                22.1   20.1   25.6
CiteSeer            20.7   19.1   23.6
WebKB-texas         17.6   16.3   20.5
WebKB-washington    19     17.7   22.2

Experiments – 2. ambiguous

[Figure: Cora (max ambiguous nodes: 429). Accuracy of CO, ICA and CIAN versus proportion of ambiguous nodes (33%, 66%, 99%); one panel for NB and one for SVM.]


[Figure: CiteSeer (max ambiguous nodes: 590). Accuracy of CO, ICA and CIAN versus proportion of ambiguous nodes (33%, 66%, 99%); two panels.]


[Figure: WebKB-texas (max ambiguous nodes: 52). Accuracy of CO, ICA and CIAN versus proportion of ambiguous nodes (33%, 66%, 99%); two panels.]


[Figure: WebKB-washington (max ambiguous nodes: 33). Accuracy of CO, ICA and CIAN versus proportion of ambiguous nodes (33%, 66%, 99%); two panels.]



‧ How many ambiguous nodes do NB and SVM have in common?

                                                   Cora    CiteSeer   WebKB-texas   WebKB-washington
Instances                                          2708    3312       187           230
Max ambiguous nodes (NB)                           429     590        52            50
Max ambiguous nodes (SVM)                          356     365        20            31
The same ambiguous nodes                           157     164        15            17
The proportion of the same ambiguous nodes (NB)    36.5%   27.7%      28.8%         34%
The proportion of the same ambiguous nodes (SVM)   44.1%   44.9%      75%           54.8%
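The two proportion rows follow from dividing the number of shared ambiguous nodes by each classifier's own maximum. The sketch below recomputes them; the helper name `truncated_percent` is an assumption, and truncation to one decimal (rather than rounding) is used because it matches the figures shown on the slide.

```python
def truncated_percent(part, whole):
    """Percentage of part in whole, truncated (not rounded) to one decimal."""
    return (1000 * part) // whole / 10


datasets = ["Cora", "CiteSeer", "WebKB-texas", "WebKB-washington"]
max_nb = [429, 590, 52, 50]
max_svm = [356, 365, 20, 31]
shared = [157, 164, 15, 17]

share_of_nb = [truncated_percent(s, n) for s, n in zip(shared, max_nb)]
share_of_svm = [truncated_percent(s, n) for s, n in zip(shared, max_svm)]

for name, nb, svm in zip(datasets, share_of_nb, share_of_svm):
    print(f"{name}: {nb}% of NB's ambiguous nodes, {svm}% of SVM's")
```

The shared nodes make up a larger share of SVM's ambiguous set than of NB's on every data set, because SVM flags fewer ambiguous nodes overall.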

Experiments – 3. misclassified and ambiguous

Cora

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
10%          71.7   78     79.5   6.3        1.5
20%          69.2   75.9   77.9   6.7        2
30%          67.1   73.8   76.1   6.7        2.3


CiteSeer

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
10%          74.5   76.7   77.6   2.2        0.9
20%          72.5   73.6   75.3   1.1        1.7
30%          69.1   71.3   73.6   2.2        2.3


WebKB-texas

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
10%          70.3   72.1   73.1   1.8        1
20%          68.2   70     71.4   1.8        1.4
30%          66.1   67.5   70.5   1.4        3


WebKB-washington

[Figure: accuracy (%) of CO, ICA and CIAN versus proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA − CO   CIAN − ICA
10%          78.4   78.9   80.7   0.5        1.8
20%          72.4   73.6   76     1.2        2.4
30%          66.2   68     73.2   1.8        5.2


80% of misclassified and ambiguous nodes:

                    CO     ICA    CIAN
Cora                27.3   26.8   30.3
CiteSeer            26.9   24.8   28.6
WebKB-texas         24.9   22.5   26.9
WebKB-washington    25.8   20.9   25.9

Experiments

‧ When is the accuracy of ICA lower than that of CO?

                    Number of training data   Proportion of misclassified nodes   Proportion of misclassified and ambiguous nodes
Cora                1500                      60%                                 65%
CiteSeer            2000                      65%                                 70%
WebKB-texas         100                       45%                                 50%
WebKB-washington    120                       50%                                 55%

Experiments – 4. iteration & stable: Cora

        10%                      20%                      30%
round   CO      ICA     Our      CO      ICA     Our      CO      ICA     Our
1       74.25   78.59   79.24    71.77   75.64   76.33    67.52   71.32   73.59
2       74.25   78.64   80.11    71.77   76.22   77.22    67.52   71.53   74.11
3       74.25   78.59   80.23    71.77   75.85   77.41    67.52   71.53   74.11
4       74.25   78.64   80.11    71.77   75.94   77.22    67.52   71.32   74.25
5       74.25   78.64   80.48    71.77   76.26   78.3     67.52   71.64   74.11
6       74.25   78.82   80.5     71.77   76.29   78.3     67.52   71.73   74.28
7       74.25   78.82   80.67    71.77   76.49   78.44    67.52   71.82   74.32
8       74.25   78.82   80.79    71.77   76.49   78.57    67.52   71.82   74.4
9       74.25   78.82   80.79    71.77   76.49   78.57    67.52   71.82   74.49
10      74.25   78.82   80.79    71.77   76.49   78.57    67.52   71.82   74.49
Avg.    74.25   78.72   80.371   71.77   76.216  77.893   67.52   71.635  74.215

Experiments – 4. iteration & stable: CiteSeer

        10%                      20%                      30%
round   CO      ICA     Our      CO      ICA     Our      CO      ICA     Our
1       72.25   74.74   74.88    71.34   73.32   73.64    69.24   71.03   72.17
2       72.25   74.68   75.96    71.34   72.94   74.32    69.24   71.24   72.78
3       72.25   74.79   75.88    71.34   73.32   74.32    69.24   71.24   72.63
4       72.25   74.83   76.27    71.34   72.94   74.49    69.24   71.03   72.78
5       72.25   75.14   76.27    71.34   73.28   74.7     69.24   71.38   72.86
6       72.25   75.23   76.44    71.34   73.52   74.7     69.24   71.46   73.24
7       72.25   75.42   76.5     71.34   73.52   74.71    69.24   71.76   73.3
8       72.25   75.42   76.81    71.34   73.52   74.86    69.24   71.76   73.39
9       72.25   75.42   76.81    71.34   73.52   74.86    69.24   71.76   73.48
10      72.25   75.42   76.81    71.34   73.52   74.86    69.24   71.76   73.48
Avg.    72.25   75.109  76.263   71.34   73.34   74.546   69.24   71.442  73.011

Experiments – 4. iteration & stable: WebKB-texas

        10%                      20%                      30%
round   CO      ICA     Our      CO      ICA     Our      CO      ICA     Our
1       66.73   66.82   68.34    64.36   65.51   65.66    62.14   62.94   65.86
2       66.73   66.96   68.83    64.36   65.79   66.8     62.14   63.27   66.05
3       66.73   66.82   68.83    64.36   66.08   66.8     62.14   64.38   66.05
4       66.73   67.24   68.54    64.36   66.25   67.36    62.14   64.87   66.19
5       66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.19
6       66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.31
7       66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.31
8       66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.31
9       66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.31
10      66.73   67.24   68.92    64.36   66.25   67.67    62.14   64.87   66.31
Avg.    66.73   67.128  68.806   64.36   66.113  67.264   62.14   64.468  66.189

Experiments – 4. iteration & stable: WebKB-washington

        10%                      20%                      30%
round   CO      ICA     Our      CO      ICA     Our      CO      ICA     Our
1       70.9    71.81   73.34    68.18   69.09   71.67    66.36   67.27   69.37
2       70.9    71.94   74.05    68.18   69.27   72.32    66.36   67.27   69.86
3       70.9    72.16   74.27    68.18   69.67   72.41    66.36   67.46   69.92
4       70.9    72.2    74.27    68.18   69.8    72.53    66.36   67.73   70.18
5       70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.18
6       70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.22
7       70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.22
8       70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.22
9       70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.22
10      70.9    72.2    74.41    68.18   69.8    72.86    66.36   67.73   70.22
Avg.    70.9    72.131  74.239   68.18   69.663  72.609   66.36   67.611  70.061
