Ambiguous Nodes in Networked Data based on Measuring Reliable Neighboring Probabilities
Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2013.01.04
Outline
▪ Introduction
  ▪ Network data
  ▪ Traditional vs. networked data classification
  ▪ Collective classification
  ▪ ICA
▪ Problem
▪ Our Method
  ▪ Collective Inference With Ambiguous Node (CIAN)
▪ Experiments
▪ Conclusion
Introduction – Network data
▪ Traditional data: instances are independent of each other.
▪ Network data: instances may be related to each other.
▪ Applications: emails, web pages, paper citations.
[Figure: independent instances vs. related instances.]
Introduction – Network data
Introduction – Traditional vs. network data classification
[Figure: nodes A–H shown once as independent instances and once as a linked graph; classes 1 and 2.]
Introduction – Collective Classification
To classify interrelated instances using content features and link features.

[Figure: example graph with nodes A–E and class labels 1 and 2.]

node   content feature   link feature (class1 class2 class3)
D      1 0 0             1 0 0
E      1 1 1             1/2 1/2 0

Link feature encodings (class1 class2 class3):
              D         E
Binary        1 0 0     1 1 0
Count         2 0 0     1 1 0
Proportion    1 0 0     1/2 1/2 0

We use: Proportion
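The three link-feature encodings can be sketched in a few lines. This is only an illustration: the function name `link_features` and the neighbor label lists for D and E are assumptions, chosen so that the counts match the Count row of the slide (D has two class-1 neighbors; E has one class-1 and one class-2 neighbor).

```python
from collections import Counter
from fractions import Fraction


def link_features(neighbor_labels, classes, mode="proportion"):
    """Encode the class labels of a node's neighbors as a link feature vector."""
    counts = Counter(neighbor_labels)
    if mode == "binary":
        # 1 if at least one neighbor has the class, else 0.
        return [1 if counts[c] > 0 else 0 for c in classes]
    if mode == "count":
        # Number of neighbors per class.
        return [counts[c] for c in classes]
    if mode == "proportion":
        # Fraction of neighbors per class (all zeros for an isolated node).
        total = len(neighbor_labels)
        return [Fraction(counts[c], total) if total else Fraction(0) for c in classes]
    raise ValueError(f"unknown mode: {mode}")


classes = [1, 2, 3]
d_neighbors = [1, 1]   # assumed from the Count row for D
e_neighbors = [1, 2]   # assumed from the Count row for E
for mode in ("binary", "count", "proportion"):
    print(mode, link_features(d_neighbors, classes, mode),
          link_features(e_neighbors, classes, mode))
```

Exact fractions are used so the Proportion row can be compared directly with the values on the slide.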
Introduction – ICA
ICA: Iterative Classification Algorithm

Step 1 (initial):
  train the local classifier
  use content features to predict the unlabeled instances
Step 2 (iterative):
  repeat {
    for each unlabeled instance {
      set the unlabeled instance's link feature
      use the local classifier to predict the unlabeled instance
    }
  }
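The two ICA steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenters' implementation: `NearestCentroid` is a hypothetical stand-in for the local classifier (the experiments mention NB and SVM), the toy graph is invented, and the bootstrap step uses a separate content-only classifier.

```python
class NearestCentroid:
    """Hypothetical stand-in for the local classifier (not from the slides)."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        # One centroid per class: the column-wise mean of its feature vectors.
        self.centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                          for c, rows in groups.items()}
        return self

    def predict(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist)


def proportion_link_feature(node, graph, labels, classes):
    """Proportion of each class among the node's currently labeled neighbors."""
    nbrs = [labels[n] for n in graph[node] if n in labels]
    if not nbrs:
        return [0.0] * len(classes)
    return [nbrs.count(c) / len(nbrs) for c in classes]


def ica(graph, content, train_labels, unlabeled, classes, n_iter=5):
    train = list(train_labels)
    y = [train_labels[n] for n in train]
    # Step 1: bootstrap the unlabeled instances from content features only.
    boot = NearestCentroid().fit([list(content[n]) for n in train], y)
    labels = dict(train_labels)
    for n in unlabeled:
        labels[n] = boot.predict(list(content[n]))

    def features(n):
        return list(content[n]) + proportion_link_feature(n, graph, labels, classes)

    # Local classifier on content + link features of the training nodes.
    clf = NearestCentroid().fit([features(n) for n in train], y)
    # Step 2: repeatedly reset link features and re-predict each unlabeled node.
    for _ in range(n_iter):
        for n in unlabeled:
            labels[n] = clf.predict(features(n))
    return {n: labels[n] for n in unlabeled}


# Invented toy graph: u is unlabeled and linked to two class-1 nodes.
graph = {"a": ["b", "u"], "b": ["a", "c", "u"], "c": ["b", "d"],
         "d": ["c"], "u": ["a", "b"]}
content = {"a": (1, 0), "b": (1, 0), "c": (0, 1), "d": (0, 1), "u": (1, 0)}
train_labels = {"a": 1, "b": 1, "c": 2, "d": 2}
ica_result = ica(graph, content, train_labels, ["u"], classes=[1, 2])
print(ica_result)
```

The number of iterations is fixed here, matching the fixed-iteration setting used later in the experiments.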
Introduction – ICA Example
Class: 1, 2, 3

[Figure: example graph of nodes A–I with training data and unlabeled data marked.]

Training data:
node   content feature   link feature (class1 class2 class3)   class
C      1 0 0             0 1/2 1/2                              2
D      1 0 0             1/2 1/2 0                              2
E      1 0 1             1/2 1/4 1/4                            1
F      1 0 1             1 0 0                                  1
G      1 0 1             1 0 0                                  1
H      1 1 1             1/3 1/3 1/3                            3

A:
iteration   content feature   link feature (class1 class2 class3)   class
1           1 0 1             2/3 0 1/3                              1
2           1 0 1             1/3 1/3 1/3                            3

B:
iteration   content feature   link feature (class1 class2 class3)   class
1           1 0 1             1/2 1/2 0                              2
2           1 0 1             1/4 1/2 1/4                            2
Problem – Ambiguous Node
▪ The node is labeled with the wrong class.
▪ The label is difficult to judge, so a mistake is made.

Content feature:
  a. woman = 0, man = 1
  b. age ≤ 20 = 0, age > 20 = 1
Class: non-smoking = 1, smoking = 2

node   content feature (a b)   class
C      0 0                     1
D      0 0                     1
E      1 1                     2
F      1 0                     2
G      1 1                     2

[Figure: graph of nodes A–G with class labels; one node is marked "1 or 2?".]
Problem – use ICA
[Figure: example graph with training nodes B and D–H and unlabeled nodes including A and C; B is marked as ambiguous.]

True class: A = 2, B = 2, C = 1

Training data:
node   content feature   link feature (class1 class2 class3)   class
B      0 0 1             0 1 0                                  1
D      0 1 1             1/3 2/3 0                              2
E      0 1 1             1/4 3/4 0                              2
F      1 0 1             2/3 1/3 0                              1
G      1 0 1             2/3 1/3 0                              1
H      0 1 1             0 1 0                                  2

A:
iteration   content feature   link feature (class1 class2 class3)   class
1           1 1 1             2/3 1/3 0                              1
2           1 1 1             2/3 1/3 0                              1
Idea
▪ Make a new prediction for the neighbors of each unlabeled instance.
▪ Use probabilities to compute the link features.
▪ Retrain the CC classifier.
Our Method – Method #1
Compute link features using probabilities

[Figure: unlabeled node A with three neighbors predicted as (1, 80%), (2, 60%), and (3, 70%).]

Our method:
  Class 1: 80/(80+60+70)
  Class 2: 60/(80+60+70)
  Class 3: 70/(80+60+70)

General method:
  Class 1: 1/3
  Class 2: 1/3
  Class 3: 1/3
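The probability-weighted link feature of Method #1 can be sketched like this. The function name `weighted_link_features` is an assumption; the neighbor predictions are the three (class, probability) pairs shown for node A, written as percentages.

```python
def weighted_link_features(neighbor_predictions, classes):
    """Sum the prediction probabilities per class and normalize by their total."""
    totals = {c: 0.0 for c in classes}
    for label, prob in neighbor_predictions:
        totals[label] += prob
    denom = sum(totals.values())
    if denom == 0:
        return [0.0] * len(classes)
    return [totals[c] / denom for c in classes]


neighbors_of_a = [(1, 80), (2, 60), (3, 70)]
ours = weighted_link_features(neighbors_of_a, [1, 2, 3])
# The general (proportion) method treats every neighbor as weight 1.
general = weighted_link_features([(c, 1) for c, _ in neighbors_of_a], [1, 2, 3])
print(ours)     # 80/210, 60/210, 70/210
print(general)  # 1/3 each
```

Because the weights are summed per class, the same function also covers a node with several neighbors of one class.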
Our Method – Method #2
To predict the unlabeled instance's neighbors again.

[Figure: two copies of the example graph (nodes A–H) in which neighbor B, labeled 1, is predicted again. In the first, B is re-predicted as (2, 60%): B is an ambiguous node. In the second, B is re-predicted as (2, 90%): B is a noise node.]
Our Method – Method #2
To predict the unlabeled instance's neighbors again:
▪ The first iteration always needs to predict again.
▪ If the original label and the predicted label differ:
  ▪ this iteration does not adopt the neighbor
  ▪ the next iteration needs to predict again
▪ If the original label and the predicted label are the same:
  ▪ average the probabilities
  ▪ the next iteration does not need to predict again

[Figure: example with unlabeled node A and neighbors B (labeled 1) and C (labeled 2), showing the original predictions, the new predictions, and the averaged probability.]
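The re-prediction rule can be written as a small function. This is a sketch: the name `update_neighbor` and the `(adopted, repredict_next)` return shape are assumptions, and the sample pairs are taken from the Method #2 example with neighbors (1, 60%) and (2, 70%).

```python
def update_neighbor(original, new):
    """Apply the re-prediction rule to one neighbor.

    original, new: (label, probability) pairs.
    Returns (adopted, repredict_next): adopted is None when the neighbor
    is not adopted in this iteration.
    """
    (old_label, old_prob), (new_label, new_prob) = original, new
    if old_label == new_label:
        # Same label: average the probabilities; no need to predict again.
        return (old_label, (old_prob + new_prob) / 2), False
    # Different label: do not adopt now; predict again in the next iteration.
    return None, True


print(update_neighbor((2, 70), (2, 80)))  # labels agree
print(update_neighbor((1, 60), (2, 70)))  # labels differ
```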
Our Method – Method #2
Example: x's true label is 2.

[Figure: example graph with nodes w, x, y, z; x is the ambiguous node.]

neighbor         original    new prediction   adopted
x                (1, 60%)    (2, 70%)         (?, ??%)
labeled 2        (2, 70%)    (2, 80%)         (2, 75%)
labeled 3        (3, 60%)    (3, 60%)         (3, 60%)

x under each method:
  Method A (not change class): (1, 50%)
  Method B (change class): (2, 60%)
  Method C (not adopt): (1, 0%)

link feature   Class 1   Class 2   Class 3
original       0.315     0.368     0.315
Method A       0.27      0.405     0.324
Method B       0         0.692     0.307
Method C       0         0.555     0.444

▪ x is an ambiguous (or noise) node: Method B > Method C > Method A
▪ x is not an ambiguous (or noise) node: Method A > Method C > Method B
▪ Method A and Method B are too extreme, so we choose Method C.
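The link-feature rows for the three methods can be reproduced from the (class, probability) pairs in the example. This is a check under assumptions: the helper `weighted_link_features` is hypothetical, and the other two neighbors are taken as (2, 75%) and (3, 60%) after re-prediction, with x contributing (1, 50%), (2, 60%), or nothing depending on the method.

```python
def weighted_link_features(neighbor_predictions, classes=(1, 2, 3)):
    """Sum probabilities per class and normalize by the total."""
    totals = {c: 0.0 for c in classes}
    for label, prob in neighbor_predictions:
        totals[label] += prob
    denom = sum(totals.values())
    return [totals[c] / denom if denom else 0.0 for c in classes]


others = [(2, 75), (3, 60)]           # neighbors whose labels were confirmed
original = weighted_link_features([(1, 60), (2, 70), (3, 60)])
method_a = weighted_link_features(others + [(1, 50)])  # keep class 1
method_b = weighted_link_features(others + [(2, 60)])  # change to class 2
method_c = weighted_link_features(others)              # do not adopt x

for name, row in [("original", original), ("Method A", method_a),
                  ("Method B", method_b), ("Method C", method_c)]:
    print(name, [round(v, 3) for v in row])
```

The computed values agree with the table to within the truncation used on the slide (for example 60/190 is 0.3158, shown as 0.315).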
Our Method – Method #2
[Figure: accuracy (y-axis, 67 to 75) of Method A, Method B, and Method C over iterations 1 to 5.]
Our Method – Method #3
Retrain CC classifier

[Figure: example graph with nodes A–E, shown with plain class labels and with (class, probability) predictions such as (1, 90%), (2, 60%), (2, 70%), (1, 80%), and (3, 70%).]

Initial (ICA):
node   content feature   link feature (class1 class2 class3)   class
A      1 0 1             1/2 1/2 0                              1
B      1 1 1             1 0 0                                  2
C      1 0 1             1 0 0                                  1

Retrain:
node   content feature   link feature (class1 class2 class3)   class
A      1 0 1             90/290 130/290 70/290                  1
B      1 1 1             80/140 60/140 0                        2
C      1 0 1             80/150 0 70/150                        1
CIAN Example – Ambiguous
[Figure: example graph with training nodes B, D, E, F, G and unlabeled nodes including A; B is the ambiguous node and is predicted again.]

True label: A = 2, B = 2, C = 1

Training data:
node   content feature   link feature (class1 class2 class3)   class
B      0 0 1             0 1 0                                  1
D      0 1 1             1/2 1/2 0                              2
E      0 1 1             1/2 1/2 0                              2
F      1 0 1             1 0 0                                  1
G      1 0 1             1 0 0                                  1

B, predicted again:
content feature   link feature (class1 class2 class3)   class
0 0 1             0.466 0.533 0                          2

A, iteration 1:
method   content feature   link feature (class1 class2 class3)   class
ICA      1 1 1             2/3 1/3 0                              1
Our      1 1 1             1/2 1/2 0                              2
CIAN Example – Noise
[Figure: example graph with training nodes B, D, E, F, G and unlabeled nodes including A; B is the noise node and is predicted again.]

True label: A = 2, B = 2, C = 1

Training data:
node   content feature   link feature (class1 class2 class3)   class
B      0 1 1             0 1 0                                  1
D      0 1 1             1/2 1/2 0                              2
E      0 1 1             1/2 1/2 0                              2
F      1 0 1             1 0 0                                  1
G      1 0 1             1 0 0                                  1

B, predicted again:
content feature   link feature (class1 class2 class3)   class
0 1 1             0.466 0.533 0                          2

A, iteration 1:
method   content feature   link feature (class1 class2 class3)   class
ICA      1 1 1             2/3 1/3 0                              1
Our      1 1 1             1/2 1/2 0                              2
CIAN
CIAN: Collective Inference With Ambiguous Node

Step 1 (initial):
  train the local classifier
  use content features to predict the unlabeled instances
Iterative {
  for each unlabeled instance A {
    Step 2: for each neighbor nb of A {
              if (nb needs to be predicted again)
                (class label, probability) = local classifier(nb)
            }
    Step 3: set A's link feature
    Step 4: (class label, probability) = local classifier(A)
  }
  Step 5: retrain the local classifier
}
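The five steps can be sketched end to end as below. Everything here is an illustrative assumption rather than the presenters' implementation: `CentroidClassifier` is a hypothetical local classifier whose probability is a softmax over negative squared distances, training labels start with probability 1.0, only training nodes are re-predicted, neighbors whose new label disagrees are left out of the link feature for that iteration (the "not adopt" choice), and the toy graph is invented.

```python
import math


class CentroidClassifier:
    """Hypothetical local classifier returning (label, probability)."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        self.centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                          for c, rows in groups.items()}
        return self

    def predict(self, x):
        scores = {c: math.exp(-sum((a - b) ** 2 for a, b in zip(x, m)))
                  for c, m in self.centroids.items()}
        best = max(scores, key=scores.get)
        return best, scores[best] / sum(scores.values())


def link_feature(node, graph, view, classes):
    """Probability-weighted link feature; neighbors set to None are skipped."""
    totals = {c: 0.0 for c in classes}
    for nb in graph[node]:
        if view.get(nb) is not None:
            label, prob = view[nb]
            totals[label] += prob
    denom = sum(totals.values())
    return [totals[c] / denom if denom else 0.0 for c in classes]


def cian(graph, content, train_labels, unlabeled, classes, n_iter=5):
    train = list(train_labels)
    y = [train_labels[n] for n in train]
    # Step 1: bootstrap the unlabeled instances from content features.
    boot = CentroidClassifier().fit([list(content[n]) for n in train], y)
    preds = {n: (train_labels[n], 1.0) for n in train}
    preds.update({n: boot.predict(list(content[n])) for n in unlabeled})

    def feats(n, view):
        return list(content[n]) + link_feature(n, graph, view, classes)

    clf = CentroidClassifier().fit([feats(n, preds) for n in train], y)
    repredict = set(train)  # the first iteration re-predicts every neighbor
    for _ in range(n_iter):
        view, checked = dict(preds), set()
        for a in unlabeled:
            for nb in graph[a]:  # Step 2: predict the neighbors again
                if nb in repredict and nb not in checked:
                    checked.add(nb)
                    label, prob = clf.predict(feats(nb, preds))
                    if label == preds[nb][0]:
                        preds[nb] = view[nb] = (label, (preds[nb][1] + prob) / 2)
                        repredict.discard(nb)
                    else:
                        view[nb] = None  # not adopted in this iteration
            # Steps 3 and 4: set the link feature, then predict A.
            preds[a] = view[a] = clf.predict(feats(a, view))
        # Step 5: retrain the local classifier.
        clf = CentroidClassifier().fit([feats(n, view) for n in train], y)
    return {n: preds[n] for n in unlabeled}


graph = {"a": ["b", "u"], "b": ["a", "c", "u"], "c": ["b", "d"],
         "d": ["c"], "u": ["a", "b"]}
content = {"a": (1, 0), "b": (1, 0), "c": (0, 1), "d": (0, 1), "u": (1, 0)}
cian_result = cian(graph, content, {"a": 1, "b": 1, "c": 2, "d": 2}, ["u"], [1, 2])
print(cian_result)
```

Keeping a per-iteration `view` separate from `preds` is one way to express "not adopted this iteration" without discarding the neighbor's original label.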
Experiments – Data sets

Characteristics     Cora   CiteSeer   WebKB-texas   WebKB-washington
Instances           2708   3312       187           230
Class labels        7      6          5             5
Link number         5429   4732       328           446
Content features    1433   3703       1703          1703
Link features       7      6          5             5
Experiments – Experimental setting

Characteristics               Cora   CiteSeer   WebKB-texas   WebKB-washington
Instances                     2708   3312       187           230
Max ambiguous nodes (NB)      429    590        52            50
Max ambiguous nodes (SVM)     356    365        20            31
Training data                 1500   2000       100           120
Iterations (fixed argument)   5      5          5             5

‧ Compare CO, ICA, and CIAN
Experiments
1. Misclassified nodes: proportion of misclassified nodes (0%~30%, 80%)
2. Ambiguous nodes: NB vs. SVM
3. Misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%~30%, 80%)
4. Iteration & stable: number of iterations
Experiments – 1. misclassified: Cora
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
0%           76.1   80.6   83.1   4.5        2.5
10%          73.2   76.4   79.7   3.2        3.3
20%          70.7   73.4   77.3   2.7        3.9
30%          68     70.3   74.5   2.3        4.2
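The two-column side table on this slide appears to hold the accuracy gains of ICA over CO and of CIAN over ICA; that reading is an inference from the arithmetic, not something the slide states. A short check on the Cora numbers (the dictionary layout and the function name `gains` are illustrative):

```python
# Cora accuracy (%) per proportion of misclassified nodes: (CO, ICA, CIAN).
cora = {
    "0%": (76.1, 80.6, 83.1),
    "10%": (73.2, 76.4, 79.7),
    "20%": (70.7, 73.4, 77.3),
    "30%": (68.0, 70.3, 74.5),
}


def gains(table):
    """Return (ICA - CO, CIAN - ICA) per row, rounded to one decimal."""
    return {p: (round(ica - co, 1), round(cian - ica, 1))
            for p, (co, ica, cian) in table.items()}


cora_gains = gains(cora)
for proportion, (ica_gain, cian_gain) in cora_gains.items():
    print(proportion, ica_gain, cian_gain)
```

On these numbers the gain of ICA over CO shrinks as more nodes are misclassified, while the gain of CIAN over ICA grows.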
Experiments – 1. misclassified: CiteSeer
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
0%           73.5   77.1   77.3   3.6        0.2
10%          71.8   73.7   75     1.9        1.3
20%          70.2   71.5   73.3   1.3        1.8
30%          69.1   70.3   72.4   1.2        2.1
Experiments – 1. misclassified: WebKB-texas
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
0%           73.2   74.6   74.9   1.4        0.3
10%          70.7   72.2   73.1   1.5        0.9
20%          69.4   70.4   71.9   1          1.5
30%          67.8   68.2   70.2   0.4        2
Experiments – 1. misclassified: WebKB-washington
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified nodes (0% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
0%           75.1   76.6   77.5   1.5        0.9
10%          73.2   74.6   76     1.4        1.4
20%          71.6   72.6   74.5   1          1.9
30%          69.8   70.6   72.8   0.8        2.2
Experiments – 1. misclassified
80% of misclassified nodes
CO ICA CIAN
Cora 22.1 20.1 25.6
CiteSeer 20.7 19.1 23.6
WebKB-texas 17.6 16.3 20.5
WebKB-washington 19 17.7 22.2
Experiments – 2. ambiguous: Cora
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of ambiguous nodes (33%, 66%, 99%). Two panels: NB (max ambiguous nodes: 429) and SVM (max ambiguous nodes: 356).]
Experiments – 2. ambiguous: CiteSeer
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of ambiguous nodes (33%, 66%, 99%). Two panels: NB (max ambiguous nodes: 590) and SVM (max ambiguous nodes: 365).]
Experiments – 2. ambiguous: WebKB-texas
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of ambiguous nodes (33%, 66%, 99%). Two panels: NB (max ambiguous nodes: 52) and SVM (max ambiguous nodes: 20).]
Experiments – 2. ambiguous: WebKB-washington
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of ambiguous nodes (33%, 66%, 99%). Two panels: NB (max ambiguous nodes: 50) and SVM (max ambiguous nodes: 31).]
Experiments – 2. ambiguous
‧ How many ambiguous nodes do NB and SVM have in common?

Characteristics                                Cora    CiteSeer   WebKB-texas   WebKB-washington
Instances                                      2708    3312       187           230
Max ambiguous nodes (NB)                       429     590        52            50
Max ambiguous nodes (SVM)                      356     365        20            31
The same ambiguous nodes                       157     164        15            17
Proportion of the same ambiguous nodes (NB)    36.5%   27.7%      28.8%         34%
Proportion of the same ambiguous nodes (SVM)   44.1%   44.9%      75%           54.8%
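The proportion rows follow from dividing the number of shared ambiguous nodes by each classifier's maximum. The sketch below assumes the percentages are truncated (not rounded) to one decimal, which is what reproduces 36.5% for 157/429; the function name `truncated_percent` is illustrative.

```python
def truncated_percent(shared, total):
    """Percentage of shared nodes, truncated to one decimal place."""
    return (shared * 1000 // total) / 10


# (shared, max NB, max SVM) per data set, from the table above.
datasets = {
    "Cora": (157, 429, 356),
    "CiteSeer": (164, 590, 365),
    "WebKB-texas": (15, 52, 20),
    "WebKB-washington": (17, 50, 31),
}

shared_proportions = {
    name: (truncated_percent(shared, nb), truncated_percent(shared, svm))
    for name, (shared, nb, svm) in datasets.items()
}
for name, (nb_pct, svm_pct) in shared_proportions.items():
    print(f"{name}: NB {nb_pct}%, SVM {svm_pct}%")
```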
Experiments – 3. misclassified and ambiguous: Cora
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
10%          71.7   78     79.5   6.3        1.5
20%          69.2   75.9   77.9   6.7        2
30%          67.1   73.8   76.1   6.7        2.3
Experiments – 3. misclassified and ambiguous: CiteSeer
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
10%          74.5   76.7   77.6   2.2        0.9
20%          72.5   73.6   75.3   1.1        1.7
30%          69.1   71.3   73.6   2.2        2.3
Experiments – 3. misclassified and ambiguous: WebKB-texas
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
10%          70.3   72.1   73.1   1.8        1
20%          68.2   70     71.4   1.8        1.4
30%          66.1   67.5   70.5   1.4        3
Experiments – 3. misclassified and ambiguous: WebKB-washington
[Figure: accuracy of CO, ICA, and CIAN vs. proportion of misclassified and ambiguous nodes (10% to 30%).]

Proportion   CO     ICA    CIAN   ICA - CO   CIAN - ICA
10%          78.4   78.9   80.7   0.5        1.8
20%          72.4   73.6   76     1.2        2.4
30%          66.2   68     73.2   1.8        5.2
Experiments – 3. misclassified and ambiguous
80% of misclassified and ambiguous nodes
CO ICA CIAN
Cora 27.3 26.8 30.3
CiteSeer 26.9 24.8 28.6
WebKB-texas 24.9 22.5 26.9
WebKB-washington 25.8 20.9 25.9
Experiments
‧ When is the accuracy of ICA lower than that of CO?

Data set           Number of training data   Proportion of misclassified nodes   Proportion of misclassified and ambiguous nodes
Cora               1500                      60%                                 65%
CiteSeer           2000                      65%                                 70%
WebKB-texas        100                       45%                                 50%
WebKB-washington   120                       50%                                 55%
Experiments – 4. iteration & stable: Cora

        10%                      20%                      30%
round   CO      ICA     Our   |  CO      ICA     Our   |  CO      ICA     Our
1       74.25   78.59   79.24 |  71.77   75.64   76.33 |  67.52   71.32   73.59
2       74.25   78.64   80.11 |  71.77   76.22   77.22 |  67.52   71.53   74.11
3       74.25   78.59   80.23 |  71.77   75.85   77.41 |  67.52   71.53   74.11
4       74.25   78.64   80.11 |  71.77   75.94   77.22 |  67.52   71.32   74.25
5       74.25   78.64   80.48 |  71.77   76.26   78.3  |  67.52   71.64   74.11
6       74.25   78.82   80.5  |  71.77   76.29   78.3  |  67.52   71.73   74.28
7       74.25   78.82   80.67 |  71.77   76.49   78.44 |  67.52   71.82   74.32
8       74.25   78.82   80.79 |  71.77   76.49   78.57 |  67.52   71.82   74.4
9       74.25   78.82   80.79 |  71.77   76.49   78.57 |  67.52   71.82   74.49
10      74.25   78.82   80.79 |  71.77   76.49   78.57 |  67.52   71.82   74.49
Avg.    74.25   78.72   80.371|  71.77   76.216  77.893|  67.52   71.635  74.215
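The Avg. row and the round at which each method becomes stable can be checked directly from the per-round accuracies. The sketch below uses the Cora 10% columns; the definition of "stable" (the first round from which the accuracy no longer changes through round 10) and the function names are assumptions.

```python
# Cora, 10%: accuracy (%) for rounds 1 to 10.
ica_acc = [78.59, 78.64, 78.59, 78.64, 78.64, 78.82, 78.82, 78.82, 78.82, 78.82]
our_acc = [79.24, 80.11, 80.23, 80.11, 80.48, 80.5, 80.67, 80.79, 80.79, 80.79]


def average(values):
    """Mean accuracy over all rounds, rounded to three decimals."""
    return round(sum(values) / len(values), 3)


def first_stable_round(values):
    """Earliest 1-based round from which the value equals the final value."""
    final = values[-1]
    r = len(values)
    while r > 1 and values[r - 2] == final:
        r -= 1
    return r


print("ICA:", average(ica_acc), "stable from round", first_stable_round(ica_acc))
print("Our:", average(our_acc), "stable from round", first_stable_round(our_acc))
```

Under this definition both methods settle within ten rounds on this column.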
Experiments – 4. iteration & stable: CiteSeer

        10%                      20%                      30%
round   CO      ICA     Our   |  CO      ICA     Our   |  CO      ICA     Our
1       72.25   74.74   74.88 |  71.34   73.32   73.64 |  69.24   71.03   72.17
2       72.25   74.68   75.96 |  71.34   72.94   74.32 |  69.24   71.24   72.78
3       72.25   74.79   75.88 |  71.34   73.32   74.32 |  69.24   71.24   72.63
4       72.25   74.83   76.27 |  71.34   72.94   74.49 |  69.24   71.03   72.78
5       72.25   75.14   76.27 |  71.34   73.28   74.7  |  69.24   71.38   72.86
6       72.25   75.23   76.44 |  71.34   73.52   74.7  |  69.24   71.46   73.24
7       72.25   75.42   76.5  |  71.34   73.52   74.71 |  69.24   71.76   73.3
8       72.25   75.42   76.81 |  71.34   73.52   74.86 |  69.24   71.76   73.39
9       72.25   75.42   76.81 |  71.34   73.52   74.86 |  69.24   71.76   73.48
10      72.25   75.42   76.81 |  71.34   73.52   74.86 |  69.24   71.76   73.48
Avg.    72.25   75.109  76.263|  71.34   73.34   74.546|  69.24   71.442  73.011
Experiments – 4. iteration & stable: WebKB-texas

        10%                      20%                      30%
round   CO      ICA     Our   |  CO      ICA     Our   |  CO      ICA     Our
1       66.73   66.82   68.34 |  64.36   65.51   65.66 |  62.14   62.94   65.86
2       66.73   66.96   68.83 |  64.36   65.79   66.8  |  62.14   63.27   66.05
3       66.73   66.82   68.83 |  64.36   66.08   66.8  |  62.14   64.38   66.05
4       66.73   67.24   68.54 |  64.36   66.25   67.36 |  62.14   64.87   66.19
5       66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.19
6       66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.31
7       66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.31
8       66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.31
9       66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.31
10      66.73   67.24   68.92 |  64.36   66.25   67.67 |  62.14   64.87   66.31
Avg.    66.73   67.128  68.806|  64.36   66.113  67.264|  62.14   64.468  66.189
Experiments – 4. iteration & stable: WebKB-washington

        10%                      20%                      30%
round   CO      ICA     Our   |  CO      ICA     Our   |  CO      ICA     Our
1       70.9    71.81   73.34 |  68.18   69.09   71.67 |  66.36   67.27   69.37
2       70.9    71.94   74.05 |  68.18   69.27   72.32 |  66.36   67.27   69.86
3       70.9    72.16   74.27 |  68.18   69.67   72.41 |  66.36   67.46   69.92
4       70.9    72.2    74.27 |  68.18   69.8    72.53 |  66.36   67.73   70.18
5       70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.18
6       70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.22
7       70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.22
8       70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.22
9       70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.22
10      70.9    72.2    74.41 |  68.18   69.8    72.86 |  66.36   67.73   70.22
Avg.    70.9    72.131  74.239|  68.18   69.663  72.609|  66.36   67.611  70.061