12
TBCA Formulated Six Sigma to Investigate Anti-Diabetic Food Products Pradip Shivajirao Jadhav 1 Prof. Rahul Joshi 2 Dr. Preeti Mulay 3 , Computer Science & Engineering Symbiosis Institute of Technology . Pune, India. [email protected] [email protected] [email protected] May 27, 2018 Abstract Diabetes is day by day growing in India. The affecting factor is that now a days diabetes is with many complica- tions and to be occurring in each age group in the country. Purpose of the work presented in this paper is to hope that the findings of Diabetes free food product analysis with di- abetes complications will reveal the current situation of Di- abetes. There were more papers related to six sigma pub- lished but in this paper, wish to propose completely cov- ering review on Six Sigma. Diabetes analysis through Six Sigma DMAIC process and TBCA algorithm. Basically Six Sigma is process improvement technique. Six Sigma is pro- cess quality tool by identifying causes and removing defects. In TBCA, threshold is set by distance between cluster cen- ter and data point. Data point would be assigned to that cluster otherwise not, if the data point has less distance than the threshold distance. Diabetes complications and diabetes free product contents can be analyzed. Diabetes analysis can be improved using Six Sigma data analysis and then 1 International Journal of Pure and Applied Mathematics Volume 118 No. 24 2018 ISSN: 1314-3395 (on-line version) url: http://www.acadpubl.eu/hub/ Special Issue http://www.acadpubl.eu/hub/

TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

TBCA Formulated Six Sigma toInvestigate Anti-Diabetic Food Products

Pradip Shivajirao Jadhav1

Prof. Rahul Joshi2 Dr. Preeti Mulay3,Computer Science & EngineeringSymbiosis Institute of Technology

. Pune, [email protected]

[email protected]@sitpune.edu.in

May 27, 2018

Abstract

Diabetes is day by day growing in India. The affectingfactor is that now a days diabetes is with many complica-tions and to be occurring in each age group in the country.Purpose of the work presented in this paper is to hope thatthe findings of Diabetes free food product analysis with di-abetes complications will reveal the current situation of Di-abetes. There were more papers related to six sigma pub-lished but in this paper, wish to propose completely cov-ering review on Six Sigma. Diabetes analysis through SixSigma DMAIC process and TBCA algorithm. Basically SixSigma is process improvement technique. Six Sigma is pro-cess quality tool by identifying causes and removing defects.In TBCA, threshold is set by distance between cluster cen-ter and data point. Data point would be assigned to thatcluster otherwise not, if the data point has less distance thanthe threshold distance. Diabetes complications and diabetesfree product contents can be analyzed. Diabetes analysiscan be improved using Six Sigma data analysis and then

1

International Journal of Pure and Applied MathematicsVolume 118 No. 24 2018ISSN: 1314-3395 (on-line version)url: http://www.acadpubl.eu/hub/Special Issue http://www.acadpubl.eu/hub/

Page 2: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

applying TBCA. We are going to use combination of SixSigma process and Threshold Based Clustering Algorithm.Firstly six sigma DMAIC processes on diabetes complica-tions as well as diabetes free product contents; then TBCAalgorithm on outcome from DMAIC process.

Key Words:Clinical Data, Food Data, Correlation, In-cremental Cluster, Anytime TBCA, C4.5 algorithm.

1 Introduction

Now a days, technology is being used for making decisions aboutright food choice. Diabetes is one of the most common diseases inthe world. This proposed exploratory research aims to analyze Dia-betic patients report and food item contents for recommending foodproduct to Diabetic patient. Six Sigma and TBCA algorithm areused to analyze Diabetes. It is hoped that this research will provideproper analysis of Diabetes. The main idea is analysis of diabetescomplications and food item data that automatically recommendbest suitable products to diabetic patients. So in previous studythere were different combinations used for Diabetes treatment andprevention. The result from those systems was diabetes preventive;we can improve it by applying new combination of Six Sigma andTBCA. The proposed study presents a Six Sigma DMAIC approachand Threshold Based Clustering Algorithm (TBCA). This study iscarried out by considering diabetic patients medical reports andfood item data as a primary source of input data set. AHP andTBCA are used to analyze diabetes mellitus [6] [13].

Health concern systems need a huge amount of data in the formof databases for research and analysis for making medical decisions.As an outcome, medical information systems in clinics, hospitalsand institutions become larger and larger so the process of extract-ing a feature becomes more complex and difficult. To avoid this,computerized AI systems have been examined. By using this ma-chine learning based AI systems, currently AI represents significantdata mining tools available. It has been shows reliable, valid, sig-nificant method to discover pattern and features. By research, datamining has been proven concrete method for diagnosis and reducesthe cost and human resource. Machine learning provides robustsolutions for expert systems, recommendation systems and data

2

International Journal of Pure and Applied Mathematics Special Issue

Page 3: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

mining for our future civilization. Machine learning systems solvespecific problem instead of writing code, by creating their own logicand so now a day machine learning systems are most accurate fordata mining process and all such above systems.

Data mining is a technique to extract useful information froma large datasets. In medical field, there are more than thousandsof datasets are available throughout the world and eac h one hasvaluable information. Clinicians and medical institutions use thisdata mining techniques for more accurately diagnoses the diseaseand to assist a curable medicine.

In recent time, Diabetes is rapidly increased disease in all agegroup and Diabetes mellitus or diabetes is a perennial, long-lastingcondition in which body is not able to use the energy taken asfood. In all types of diabetes body breaks down the sugars andcarbohydrates which you take into a special sugar called glucose.Glucose fuels the cells in body, but cells need insulin to take theglucose and use it as energy. So with diabetes mellitus, either yourbody doesn’t make enough insulin; it can’t use the insulin properit does produce, or a combination of both.

According to International Diabetes Federation (IDF) in 2015report, there were an estimated 78.3 million affected with diabetesin South East Asia. In addition to that, by 2040 there will be anestimated 214.8 million affected. 1 in 11 adults has diabetes aroundthe world in 2015& this will be 1 in 10 adults in 2040.

Six Sigma - Six Sigma introduced by Bill Smith in 1986 whileworking in Motorola; Six Sigma is process improvement with havinga set of techniques and tools. Basically it is organizations tool toimprove the capability of business manufacturing process. It helpsto increases the performance and decreases the process time; andhelps to improvements in profits, which exceed towards maintainingquality.

TBCA - It is like closeness factor based clustering problem. It isbased on one simple assumption that data comes in chunks. Thuscloser the data points, higher the probability that they are belongto same chunk [6].

The main contributions and novelty of this paper are summa-rized as follows.

1. For the Initialization process, Random clinical data for dia-betes and food dataset is given as input.

3

International Journal of Pure and Applied Mathematics Special Issue

Page 4: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

2. For Selection process, this study further analyzes the effectof variable selection on PCA monitoring performance through sta-tistical analysis.

3. After it, for Reproduction process, C4.5 algorithm and TBCAalgorithm is used for the cluster analyses.

4. And Incremental cluster is replaced for further studies andit will be the Reproduction based incremental cluster process.

5. By using this incremental cluster datasets, results aims to thecorrelation between clinical data and food data by using AnalyticalNetwork Process (ANP).

2 TECHNIQUES USED

RESEARCH METHODOLOGYDMAIC consists of five stages.1. Define Phase: In order to implement the Six Sigma Method-

ology, it is essential to define:• Principal Component Analysis (PCA) for data selection.• To check quality of anti-diabetic food items• To regulate flow of sugar• To check/verify quality of food item for diabetes2. Measure Phase: It is essential that you measure the perfor-

mance of Core Processes. You will need to:• U-Chart for mapping• Mapping of food item database and clinical database• To check accurate readings3. Analyze Phase: The next step is to analyze the data and

process map to establish causes of defects and where you can im-prove:

• Analysis of mapped data• Cause & Effect Diagram and TBCA algorithm is used for

analysis of mapped data.4. Improve Phase: Using the output data from the analyze

phase it is now possible to fix and prevent problems to improve theprocess by finding solutions.

• Future State Process Map• Steps to specify processes for control inferences

4

International Journal of Pure and Applied Mathematics Special Issue

Page 5: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

5. Control Phase: Control and sustain improvements over timeby:

• Process Control

Figure 1: Six Sigma DMAIC Process

Figure 1: It shows the Six Sigma DMAIC (Define-Measure-Analyze-Improve-Control) process.

2. Data SourcesThe clinical data consists of a 768 diabetic patients. This dataset

is collected from Indian Pima diabetic source. And different 25 dia-betes free food product dataset. The reports are voluntary diabeticfollow ups carried out by the patients on a regular basis at themedical center.

Population being taken into consideration for this research studyis predominantly Indian. The diabetic patients are factored intotype1 and type2.

3. C 4.5 AlgorithmC4.5 builds decision trees from a set of training data in the same

way as ID3, using the concept of information entropy. The trainingdata is a set of already classified samples. Each sample consistsof a p-dimensional vector, where they represent attribute values orfeatures of the sample, as well as the class in which falls.

At each node of the tree, C4.5 chooses the attribute of the datathat most effectively splits its set of samples into subsets enrichedin one class or the other. The splitting criterion is the normalizedinformation gain (difference in entropy). The attribute with thehighest normalized information gain is chosen to make the decision.The C4.5 algorithm then recurs on the smaller subsists.

Steps for C 4.5The general algorithm for building decision trees is:• Check for the above base cases.

5

International Journal of Pure and Applied Mathematics Special Issue

Page 6: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

• For each attribute a, find the normalized information gainratio from splitting on a.

• Let a best be the attribute with the highest normalized infor-mation gain.

• Create a decision node that splits on a best.Recur on the sub lists obtained by splitting on a best, and add

those nodes as children of node.4. Threshold Based Clustering Algorithm(TBCA)TBCA is Threshold Based Clustering Algorithm and enhanced

from CFBA (Closeness factor Based Algorithm) based on simpleassumption data arrives in chunks and closer the data points, higherprobability to relate that data chunks [6].

Three Phase of TBCA:[1] Initial Phase: Data arrives into Chunks.[2] Incremental Phase: It compares closeness value with already

formed clusters and compares matching clusters.[3] Final Phase: Update the set of clusters and new data are

available for the analyst to knowledge augmentation.Formula used for calculation of accuracy-ClusteringvalueofsingleiterationClusteringvalueofmultipleiteration

ClusteringvalueofsingleiterationdatasetsAccuracy = 91.9 %Algorithm

6

International Journal of Pure and Applied Mathematics Special Issue

Page 7: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

3 RESULTS

By performing PCA the impactful and non- Impactful attribute isderived. After it the C4.5 and TBCA is applied. By C4.5 algorithm,Instance based clustering data is achieved and Through TBCA,Threshold based data is achieved.

After it the C4.5 and TBCA is applied. By C4.5 algorithm,Instance based clustering data is achieved. Through C4.5 clustersare generated and validate based on next process TBCA. TBCA,Threshold based clustering algorithm is closeness factor based al-gorithm and clusters are formed by using closeness Value. By usingthese Algorithms, one incremental learning based data set is gen-erated and another is intelligent data set is generated.

The result of this proposed work is analyzed by statistical val-idation techniques and generalized it through real data. The val-idation techniques are sum of square between cluster and sum ofsquare within cluster for external validation and for internal vali-dation, Precision, F-measure and revise is performed.

7

International Journal of Pure and Applied Mathematics Special Issue

Page 8: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

4 WORKFLOW OF APPLICATION

Figure 2: Flow Chart for TBCA formulated Six Sigma

Figure 3: Selecting Dataset for Applying different algorithms

Figure 4: TBCA Output

8

International Journal of Pure and Applied Mathematics Special Issue

Page 9: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

Figure 5: C4.5 Algorithm Output

Figure 6: Before applying PCA on Diabetes dataset

9

International Journal of Pure and Applied Mathematics Special Issue

Page 10: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

Figure 7: After applying PCA on Diabetes dataset

Figure 8: Final Recommendation System

5 CONCLUSION & FUTURE SCOPE

Considering clinical dataset of diabetes patients for and applyingPrinciple Component

Algorithm (PCA) find out impactful and non-impactful attributesfor both dataset.

Then applying C4.5 algorithm generate an instance based datasetand clusters. And on same dataset applying TBCA find out Thresh-old Based clusters. By comparing both clusters generate intelligentdataset for diabetes dataset.

10

International Journal of Pure and Applied Mathematics Special Issue

Page 11: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

By performing this work it is concluded that the clinical datasetand food dataset is correlated to each other and it will be analyzedby cluster based incremental algorithm. The future Scope is toevaluate by using some deep learning algorithm and classificationtechniques.

References

[1] https://www.kaggle.com/ navya21 / d/ uciml/ pima- indians-diabetes- database /diabetes- prediction

[2] https://www.kaggle.com/uciml/pima-indians-diabetes-database

[3] Hsiang-Chin Hung and Ming-Hsien Sung, Applying six sigmato manufacturing processes in the food industry to reduce qual-ity cost, 2011.

[4] Alhuraish, I. Robledo, C. Kobi, Assessment of Lean Manufac-turing and Six Sigma operation with Decision Making Basedon the Analytic Hierarchy Process, 2016.

[5] M.Mounika, S.D.Suganya, B.Vijayashanthi, S.KrishnaAnand,Predictive Analysis of Diabetic Treatment Using ClassificationAlgorithm, 2015.

[6] Mulay, P., Joshi, R. R., Anguria, A. K., Gonsalves, A., Deep-ankar, D., & Ghosh, D. (2017). Threshold Based ClusteringAlgorithm Analyzes Diabetic Mellitus. In Proceedings of the5th International Conference on Frontiers in Intelligent Com-puting: Theory and Applications (pp. 27-33). Springer, Singa-pore.

[7] Md. Asafuddoula, Hemant Kumar Singh and Tapabrata Ray,Transactions on Evolutionary Computation Six-Sigma RobustDesign Optimization using a Many-objective DecompositionBased Evolutionary Algorithm, 10.1109/TEVC.2014.2343791,IEEE.

[8] Kamil Drab, Mariusz Kwiatkowski, Philip Wasalaski, How SixSigma brought saving and improved manufacturing process by98.9%, 2015.

11

International Journal of Pure and Applied Mathematics Special Issue

Page 12: TBCA Formulated Six Sigma to Investigate Anti-Diabetic ... · betic patients report and food item contents for recommending food product to Diabetic patient. Six Sigma and TBCA algorithm

[9] StanislavaSimonova, Hana Kopackova, Utilization of Six Sigmafor Data Improvement,2014.

[10] Haojun Sun, Rongbo Chen, Shulin Jin, Yong Qin, A Hierar-chical Clustering for Categorical Data Based on Holo-entropy,2015.

[11] Evelina Ericsson, Markus Buschle, JoakimLillieskld, MathiasEkstedt, Developing a Design for Six Sigma Framework Forthe Analysis of Product Development Processes, 2015.

[12] Sezaielika, Mehmet TolgaTanerb, GamzeKaanc, Masumimekc,Mehmet Kemal Kaand, brahim ztek, A Retrospective Study ofSix Sigma Methodology to Reduce Inoperability among LungCancer Patients, 2016.

[13] Gaikwad, S. M., Mulay, P., & Joshi, R. R. (2015). Analyticalhierarchy process to recommend an ice cream to a diabetic pa-tient based on sugar content in it . Procedia Computer Science,50, 64-72.

12

International Journal of Pure and Applied Mathematics Special Issue