. . 2556 16-18 2556
Failure Analysis of a Ball Valve by Using Data Mining
Chantra Nakvachiratrakul1* Kasem Pipatpanyanukul2 Premkamol Preechaporn3 1,2,3Department of Industrial Engineering, Faculty of Engineering, Burapha University, Chonburi
E-mail: [email protected]*
Abstract
Performance records of a parts repair shop were adopted as a case study. They showed that about 40% of repair tasks consumed excessive time in analyzing the failure modes, and that about 30% of the repair jobs had to be repeated one or more times because the failure had been diagnosed incorrectly. This was caused by the employees' inadequate experience (less than four months) in ball valve repair. This study applied the Weka software to classify the failure modes using data mining, based on 214 shop records comprising the incoming-parameter inspection results together with the actual failure modes. These records formed the input to a classification process using the decision tree, a "divide-and-conquer" approach to the problem of learning from a set of independent instances. Despite the employees' lack of repair and analysis experience, the resulting tree enabled them to determine the ball valve repair method by themselves. Overall, the results revealed that the decision tree could classify the ball valve repair jobs with an accuracy of 76%, reducing the average time for ball valve failure-mode analysis from 44.09 minutes per task to 24.62 minutes per task, a reduction of 43.64%.
Keywords: ball valve, data mining, classification
1. Introduction
Performance records of the case-study repair shop indicated that about 40% of repair tasks consumed excessive time in analyzing failure modes, and that about 30% of repair jobs had to be repeated because the failure had been diagnosed incorrectly. The underlying cause was the employees' limited experience in ball valve repair (less than four months). This study therefore applied a data-mining approach to classify ball valve failure modes from the incoming inspection results, so that inexperienced employees could determine the repair method by themselves.
2. Related Theory
2.1 Data mining
Data mining is the process of extracting useful knowledge from large volumes of data by means of algorithms drawn from artificial intelligence (AI) and statistical methods [1], [2]. One of its principal tasks is classification, and the decision tree method is among the most widely used classification techniques. A tree that fits the training data too closely suffers from overfitting, which is commonly remedied by pruning.
2.1.1 Decision trees
The decision tree is a classification model widely used in machine learning. Well-known induction algorithms include Quinlan's ID3 and its successors C4.5, C5, and J48, together with CART. ID3 builds the tree top-down [3]; C4.5 and C5 improve on ID3 [4]. The resulting tree maps the values of the input attributes to an output class. J48 is the open-source Java implementation of C4.5 in the WEKA workbench, inheriting the top-down induction of ID3 while adding the improvements of C4.5 and C5 [5].
2.1.2 Building classification trees
A classification tree is grown by repeatedly splitting the training samples on a chosen test attribute [6]. The central question at each node is which attribute gives the best split; this is decided by an attribute selection measure, also called a goodness-of-split criterion [7]. A common impurity measure for attribute selection is the entropy: the expected information needed to classify a sample is computed for each candidate attribute, and the attribute's information gain is derived from the entropy.
Let S be a set of s training samples whose class label attribute has m distinct classes C_i (i = 1, ..., m), and let s_i be the number of samples of S in class C_i. The entropy, or expected information needed to classify a given sample, is [7]

I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2(p_i)    (1)

where p_i = s_i / s is the probability that a sample belongs to class C_i. If attribute A has v distinct values partitioning S into subsets S_1, ..., S_v, with s_{ij} the number of samples of class C_i in subset S_j, the expected information (entropy) of a split on A is

E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} \, I(s_{1j}, \ldots, s_{mj})    (2)

and the information gain obtained by branching on A is

Gain(A) = I(s_1, s_2, \ldots, s_m) - E(A)    (3)

The attribute with the highest information gain is chosen as the test attribute at the node.
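As a numerical illustration of Eqs. (1)-(3), the functions below compute the entropy, the expected information of a split, and the information gain. The class counts and the three-way partition are hypothetical values chosen only for the example:

```python
import math

def entropy(counts):
    """Eq. (1): I(s1,...,sm) = -sum(p_i * log2(p_i)) over non-empty classes."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def expected_info(partitions):
    """Eq. (2): weighted entropy of the subsets produced by splitting on A."""
    s = sum(sum(p) for p in partitions)
    return sum((sum(p) / s) * entropy(p) for p in partitions)

def gain(counts, partitions):
    """Eq. (3): Gain(A) = I(s1,...,sm) - E(A)."""
    return entropy(counts) - expected_info(partitions)

# Hypothetical example: 14 samples, 9 in class C1 and 5 in class C2.
counts = [9, 5]
# An attribute A with three values splits them into [2,3], [4,0], [3,2].
partitions = [[2, 3], [4, 0], [3, 2]]

print(round(entropy(counts), 3))           # prints 0.94
print(round(gain(counts, partitions), 3))  # prints 0.247
```

An attribute whose split yields purer subsets (such as the [4, 0] branch above) contributes less expected information and therefore a larger gain.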
3. Methodology
The failure-mode classifier was built with the J48 algorithm in the following steps.
3.1 Procedure
3.1.1 Data collection
Repair records of 214 ball valves were collected. The observed failure modes were grouped into seven classes: six failure-mode classes (B1-B6) and one additional class (B7).
3.1.2 Attribute selection
Five attributes were defined for each record: the four incoming inspection results (Testleak, Frontleak, Backseat, Testshell) and the actual failure mode (DefectivePart), which serves as the class attribute. Of the 375 records originally collected, 214 complete records were retained for analysis.
3.1.3 Preparing the input for Weka
Weka accepts input in CSV or ARFF format; here the data were prepared as CSV and converted to ARFF, with all five attributes declared as nominal types.
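A sketch of such an ARFF input with the five nominal attributes is shown below; the value labels (pass/fail) and the data rows are assumptions for illustration, since the paper does not list the attribute values:

```
@relation ballvalve-repair

% Four inspection-result attributes and the failure-mode class.
% The value sets below are hypothetical; only the attribute names
% and the class labels B1-B7 come from the paper.
@attribute Testleak      {pass, fail}
@attribute Frontleak     {pass, fail}
@attribute Backseat      {pass, fail}
@attribute Testshell     {pass, fail}
@attribute DefectivePart {B1, B2, B3, B4, B5, B6, B7}

@data
fail,pass,pass,pass,B1
pass,fail,pass,fail,B3
```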
3.1.4 Separation into training and test sets
The classifier was evaluated by 10-fold cross-validation.
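In 10-fold cross-validation the 214 records are shuffled and partitioned into ten near-equal folds; each fold in turn is held out for testing while the other nine are used for training, and the ten accuracies are averaged. A minimal sketch of the index bookkeeping (not Weka's own implementation) is:

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle record indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

folds = k_fold_indices(214, k=10)
for test_fold in folds:
    # Training set = every record not in the held-out fold.
    train = [j for f in folds if f is not test_fold for j in f]
    # Here one would train the classifier on `train` and score `test_fold`.
    assert len(train) + len(test_fold) == 214
```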
3.1.5 Model building
The classifier was built with the supervised decision-tree algorithm J48. The basic tree-induction procedure is given by the following pseudocode [7]:
(1) create a node N;
(2) if samples are all of the same class C, then
(3)   return N as a leaf node labeled with the class C;
(4) if attribute-list is empty, then
(5)   return N as a leaf node labeled with the most common class in samples; // majority voting
(6) select test-attribute, the attribute among attribute-list with the highest information gain;
(7) label node N with test-attribute;
(8) for each known value ai of test-attribute: // partition the samples
(9)   grow a branch from node N for the condition test-attribute = ai;
(10)  let si be the set of samples in samples for which test-attribute = ai; // a partition
(11)  if si is empty, then
(12)    attach a leaf labeled with the most common class in samples;
(13)  else attach the node returned by Generate_decision_tree(si, attribute-list - test-attribute);
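The pseudocode above translates almost line-for-line into a recursive function. The sketch below is an illustrative ID3-style learner that picks the test attribute by information gain; it is not Weka's J48 (which adds C4.5 refinements such as pruning), and the sample records and class labels are invented for the demonstration:

```python
import math
from collections import Counter

def info(labels):
    """Entropy of a list of class labels, as in Eq. (1)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(samples, labels, attributes):
    """Recursive decision-tree induction following the cited pseudocode."""
    # Steps (2)-(3): all samples in one class -> leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Steps (4)-(5): no attributes left -> majority-vote leaf.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Step (6): choose the attribute with the highest information gain.
    def gain(a):
        e = 0.0
        for v in set(s[a] for s in samples):
            sub = [l for s, l in zip(samples, labels) if s[a] == v]
            e += len(sub) / len(labels) * info(sub)
        return info(labels) - e
    best = max(attributes, key=gain)
    # Steps (7)-(13): branch on each observed value of the chosen attribute
    # (empty partitions cannot arise here, so steps (11)-(12) never trigger).
    rest = [a for a in attributes if a != best]
    tree = {}
    for v in set(s[best] for s in samples):
        sub_s = [s for s in samples if s[best] == v]
        sub_l = [l for s, l in zip(samples, labels) if s[best] == v]
        tree[v] = build_tree(sub_s, sub_l, rest)
    return (best, tree)

# Tiny hypothetical records: two inspection attributes, classes B1/B2.
samples = [{"Testleak": "fail", "Frontleak": "pass"},
           {"Testleak": "pass", "Frontleak": "fail"},
           {"Testleak": "pass", "Frontleak": "pass"}]
labels = ["B1", "B2", "B2"]
tree = build_tree(samples, labels, ["Testleak", "Frontleak"])
print(tree)
```

On these three records Testleak separates the classes perfectly, so it is selected as the root test attribute.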
4. Results
Running J48 on the 214 records produced a tree with the following summary statistics:
- Number of leaves: 49
- Size of the tree: 60
- Correctly classified instances: 164 (76.6355%)
- Incorrectly classified instances: 50 (23.3645%)
- Mean absolute error: 0.0083
- Root mean squared error: 0.0821
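The reported percentages follow directly from the instance counts:

```python
correct, total = 164, 214
print(round(100 * correct / total, 4))            # prints 76.6355
print(round(100 * (total - correct) / total, 4))  # prints 23.3645
```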
5. Conclusion
The decision tree classified the ball valve repair jobs with an accuracy of 76%. Using the tree reduced the average time for failure-mode analysis from 44.09 minutes per task to 24.62 minutes per task, a reduction of 43.64%.
References
[1] Usama M. Fayyad. Data mining and knowledge discovery: Making sense out of data. IEEE Expert: Intelligent Systems and Their Applications, 11(5):20-25, 1996.
[2] G. Della Riccia, R. Kruse, and H. Lenz. Computational Intelligence in Data Mining. Springer, New York, NY, USA, 2000.
[3] Witten, I.H., Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition. Morgan Kaufmann, San Francisco.
[4] Quinlan, J.R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
[5] Aman Kumar Sharma et al. A Comparative Study of Classification Algorithms for Spam Email Data Analysis. International Journal on Computer Science and Engineering (IJCSE), Vol. 3, No. 5, May 2011, pp. 1890-1895.
[6] A. Feelders. Classification trees. http://www.cs.uu.nl/docs/vakken/adm/trees.
[7] J. Han and M. Kamber (2001). Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.