11
Research Article Securely Outsourcing ID3 Decision Tree in Cloud Computing Ye Li, 1 Zoe L. Jiang , 1 Xuan Wang , 1 Junbin Fang, 2 En Zhang , 3 and Xianmin Wang 4 1 Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China 2 Jinan University, Guangzhou, China 3 Henan Normal University, Henan, China 4 School of Computer Science, Guangzhou University, China Correspondence should be addressed to Zoe L. Jiang; [email protected] Received 2 May 2018; Accepted 2 September 2018; Published 4 October 2018 Academic Editor: Jaime Lloret Copyright © 2018 Ye Li et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. With the wide application of Internet of ings (IoT), a huge number of data are collected from IoT networks and are required to be processed, such as data mining. Although it is popular to outsource storage and computation to cloud, it may invade privacy of participants’ information. Cryptography-based privacy-preserving data mining has been proposed to protect the privacy of participating parties’ data for this process. However, it is still an open problem to handle with multiparticipant’s ciphertext computation and analysis. And these algorithms rely on the semihonest security model which requires all parties to follow the protocol rules. In this paper, we address the challenge of outsourcing ID3 decision tree algorithm in the malicious model. Particularly, to securely store and compute private data, the two-participant symmetric homomorphic encryption supporting addition and multiplication is proposed. To keep from malicious behaviors of cloud computing server, the secure garbled circuits are adopted to propose the privacy-preserving weight average protocol. Security and performance are analyzed. 1. Introduction In the modern Internet of ings (IoT), huge data are collected from sensor-networks and need to be provided for analysis by high-effective techniques, such as data mining. is process requires enormous computation and storage to support; cloud computing technology can provide the corresponding support. However, this process may leak the privacy of participants’ information. e privacy-preserving data mining (PPDM) based on encryption method has emerged as a solution to this problem. Privacy-Preserving Data Mining Framework. Considering different frameworks and theories, PPDM was originated by Lindell et al. [1] and Agrawal et al. [2] in 2002, respectively. Lindell’s framework is essentially a secure cryptography- based two-participant computation protocol without out- sourcing. In other words, two parties can interactively com- pute ( 1 + 2 )ln( 1 + 2 ) on their private input 1 and 2 . Agrawal’s framework is essentially a single-participant disturbance-based data storage and computation outsourcing algorithm. In particular, one party can upload disturbed data to server for private computation. With the development of cloud computation and IoT, a multiparty storage and computation outsourcing framework is preferred. Cryptography-based privacy-preserving data mining supporting one-party outsourcing has been studied [3, 4], with homomorphic encryption. However, multiple-key homomorphic encryption is an open problem when multiple parties are involved in the outsourcing framework. For example, how to execute ciphertext addition and multipli- cation on ciphertexts encrypted by different public keys? Security Models. We usually consider two different security models, including the semihonest and malicious security model. e definition in the semihonest model requires that all the users need to follow the rules of protocol. But we allow the dishonest users to obtain internal states of the other users. In the malicious model, different from the first security model, the corrupted users are allowed to deviate from the specified protocol. e success of the adversary means that the adversary can get the results of these protocols. Hindawi Wireless Communications and Mobile Computing Volume 2018, Article ID 2385150, 10 pages https://doi.org/10.1155/2018/2385150

ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

Research ArticleSecurely Outsourcing ID3 Decision Tree in Cloud Computing

Ye Li1 Zoe L Jiang 1 Xuan Wang 1 Junbin Fang2 En Zhang 3 and Xianmin Wang 4

1Harbin Institute of Technology Shenzhen Shenzhen 518055 China2Jinan University Guangzhou China3Henan Normal University Henan China4School of Computer Science Guangzhou University China

Correspondence should be addressed to Zoe L Jiang zoeljianghiteducn

Received 2 May 2018 Accepted 2 September 2018 Published 4 October 2018

Academic Editor Jaime Lloret

Copyright copy 2018 Ye Li et al This is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

With the wide application of Internet of Things (IoT) a huge number of data are collected from IoT networks and are requiredto be processed such as data mining Although it is popular to outsource storage and computation to cloud it may invadeprivacy of participantsrsquo information Cryptography-based privacy-preserving datamining has been proposed to protect the privacyof participating partiesrsquo data for this process However it is still an open problem to handle with multiparticipantrsquos ciphertextcomputation and analysis And these algorithms rely on the semihonest security model which requires all parties to followthe protocol rules In this paper we address the challenge of outsourcing ID3 decision tree algorithm in the malicious modelParticularly to securely store and compute private data the two-participant symmetric homomorphic encryption supportingaddition and multiplication is proposed To keep from malicious behaviors of cloud computing server the secure garbled circuitsare adopted to propose the privacy-preserving weight average protocol Security and performance are analyzed

1 Introduction

In the modern Internet of Things (IoT) huge data arecollected from sensor-networks and need to be provided foranalysis by high-effective techniques such as data miningThis process requires enormous computation and storageto support cloud computing technology can provide thecorresponding support However this process may leak theprivacy of participantsrsquo information The privacy-preservingdata mining (PPDM) based on encryption method hasemerged as a solution to this problem

Privacy-Preserving Data Mining Framework Consideringdifferent frameworks and theories PPDM was originated byLindell et al [1] and Agrawal et al [2] in 2002 respectivelyLindellrsquos framework is essentially a secure cryptography-based two-participant computation protocol without out-sourcing In other words two parties can interactively com-pute (1199091 + 1199092)ln(1199091 + 1199092) on their private input 1199091 and1199092 Agrawalrsquos framework is essentially a single-participantdisturbance-based data storage and computation outsourcingalgorithm In particular one party can upload disturbed data

to server for private computation With the developmentof cloud computation and IoT a multiparty storage andcomputation outsourcing framework is preferred

Cryptography-based privacy-preserving data miningsupporting one-party outsourcing has been studied [3 4]with homomorphic encryption However multiple-keyhomomorphic encryption is an open problem whenmultipleparties are involved in the outsourcing framework Forexample how to execute ciphertext addition and multipli-cation on ciphertexts encrypted by different public keys

Security Models We usually consider two different securitymodels including the semihonest and malicious securitymodel The definition in the semihonest model requiresthat all the users need to follow the rules of protocolBut we allow the dishonest users to obtain internal statesof the other users In the malicious model different fromthe first security model the corrupted users are allowedto deviate from the specified protocol The success of theadversarymeans that the adversary can get the results of theseprotocols

HindawiWireless Communications and Mobile ComputingVolume 2018 Article ID 2385150 10 pageshttpsdoiorg10115520182385150

2 Wireless Communications and Mobile Computing

Data Distribution Three types of distributed datasets aredefined in related works including the horizontally dis-tributed datasets vertically distributed datasets and arbi-trarily distributed datasets The users in the horizontallydistributed dataset can keep divided parts for the sameattributes However in the vertical datasets users are allowedto keep different attributes In the last one the datasets can bearbitrarily divided and stored by the users

Due to the existence of malicious participants in thereal environment malicious participants may not follow theprotocol For example they can intentionally tamper with thedata suspend the protocol anytime during the execution ofthe protocol and so on To solve this problem this papercombines the noncontact commitment and confusion circuitmechanism studies the average computing protocol basedon confusion circuit and then proposes the framework ofa secure cryptography-based two-participant protocol withdata storage and computation outsourcing The frameworkconsists of two data owners and two cloud servers (cloud stor-age server (CSS) and cloud computing server (CCS)) Eachdata owner has a horizontally distributed private databasethat is encrypted before being outsourced to the cloud forstorage and computation

11 Our Contribution We decompose the key function ofdistributed ID3 decision tree 119866119886119894119899(119878 119860 119894) into counting(1199091 + 1199092)(1198861 + 1198862) sum multiplication and comparison

In counting we propose the Secure Equivalent Testing(SET) protocol to calculate the number of items for eachattribute value based on the encrypted data

To calculate the value of (1199091 + 1199092)(1198861 + 1198862) in maliciousmodel we implement the Outsourcing Secure Circuit (OSC)protocol

To perform the sum and multiplication operations overciphertext we adopt the Paillier encryption system andimplement the Secure Multiplication (SM) protocol

To execute comparison over ciphertext we adopt theSecure Minimum out of 2 Numbers (SMIN2) protocol

12 Related Work

Distributed PPDM without Outsourcing Distributed PPDMwithout outsourcing is mainly for data stored and calculatedlocally by the participant based on distributed data based onvarious data mining methods which can be decomposed todifferent operations such as average calculation calculationand calculation of logarithmic vector inner product Thenthe cryptography-based technology is used to design variousprivacy-preserving computing protocols In 2002 Lindell andPinkas [1] proposed a secure ID3 decision tree algorithmover horizontally partitioned data They decompose thedistributed ID3 algorithm to multilogarithmic calculationpolynomial evaluation calculation and data comparisonand then designed the security log protocol polynomialevaluation protocol and secure comparison protocol so asto achieve privacy-preserving in distributed ID3 algorithmIn 2007 Emekci et al [5] implemented a secure addition com-putational protocol based on the secret sharing algorithm andextended the secure logarithmic computing protocol from

two parties to multiple parties thus realizing the multipartyparticipation of the privacy protection ID3methodHoweverthe complexity of the algorithm increases exponentially whenthe participant data are more numerous In 2012 Lory et al[6] used Chebyshev polynomial expansion to replace Taylorexpansion in [1] thus further improving the computationalefficiency of secure logarithmic computing protocols How-ever their agreements still have limited efficiency in theimplementation of privacy protection protocols

Different from above in 2003 Vaidya et al [7 8] designeda multiparty privacy-preserving ID3 algorithm of verticallydistributed data setsThey vectorized all attribute value infor-mation by constructing constraint sets and then computedit by using the method of secure intersection protocol thusdesigning privacy-preserving ID3 for vertically distributeddata sets

In 2007 Han and Ng [9] proposed a multiparty dis-tributed privacy-preserving ID3 method based on arbitrarydistributed data sets Firstly each participantrsquos data set isvectorized and then the attribute value information is com-puted by using security intersection protocol and so onThen the entropy value of each attribute is computed byusing security logarithm computation protocol and so onThus the ID3 decision tree classification method of privacyprotection based on arbitrary distributed data set is obtainedHowever with the increase of the number of participants thecomputing volume of the client increases exponentially

Li et al [10] and Gao et al [11] addressed the Naive BayesLearning for aggregated arbitrary distributed databases

PPDM with Computation Outsourcing Cryptography-basedprivacy-preserving data mining has a lot of encryption anddecryption operations in the computation processThereforeit is difficult for large-scale data processing As a measurefor solving resource-restricted problems the outsourcingtechnique has been widely used in cloud computing appli-cations such as data sharing [12 13] data storing [14 15]data updating [16 17] and social network analysis [18 19]In this context we need to rely on security outsourcingtechnology to outsource computing or storage tasks of allparticipants to the cloud to process thus greatly reducingthe computingstorage load of each participant In 2014 Liuet al [3] adopted a new encryption scheme that supportsboth addition and multiplication over cipher texts In thisscheme most of the computations are performed on thecloud which reduces the computation workload of the dataowner However the scheme is limited to a single partyrsquos datamining operation Chen et al [20] designed new algorithmsfor secure outsourcing of modular exponentiations In 2015Bost et al [21] proposed the privacy-preserving hyperplanedecision Naive Bayesian and decision tree classificationalgorithms and through the semihonest model secure two-party computation model to prove that the above schemecan satisfy the semantic security (Semantic Security) andthe related protocol makes it possible to design an adaptiveenhancement algorithm (Adaptive Boosting) combine tofurther enhance the accuracy of the algorithm building aclassifier can be used to construct the privacy protectionof the library the further development of the classification

Wireless Communications and Mobile Computing 3

algorithms for privacy-preserving technology in the futurelays a solid foundation

PPDM with Multiparticipant Data Storage and ComputationOutsourcing In 2013 Peter et al [22] proposed a newsolution for the outsourcing of multiparty computationSuch a technique can be used in our setting But as thesecurity analysis in the previous works they can only achievesecurity in the semihonest model In [23] a new protocolwas proposed to achieve data mining for two parties In[24] association rule mining was addressed in the maliciousmodel In [25] the privacy-preservingKNN classificationwasaddressed In [26] the deep learning task was addressedBesides the above related work several fundamental securealgorithms such as dynamic homomorphic encryption [2728] authentication [29 30] and light-weight multipartycomputation [31] which have also been considered in themalicious model have been proposed However to thebest of our knowledge no existing study has considereda method for outsourcing computation in the maliciousmodel

In this study the secure outsourcing of ID3 data min-ing is considered in the malicious model for the cloudenvironment We show how to solve the outsourcingproblem for ID3 protocol over horizontally partitioneddata

2 Preliminaries

In this section we present a brief overview of the prelim-inaries used in this paper including the ID3 decision treealgorithm Paillierrsquos homomorphic encryption scheme andthe other related protocols

21 Distribute ID3 Decision Tree Algorithm The ID3 algo-rithm description is given as follows It builds a decisiontree in a top-down manner with the information of samplesStarting at the root the best object classification will beobtained The best prediction is computed with the informa-tion gain The information gain of an attribute 119860 119905 is definedas

119866119886119894119899 (119878 119860 119905) = 119864119899119905119903119900119901119910 (119878)

minus sum119886119905119895isin119860119905

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816

|119878| 119864119899119905119903119900119901119910 (119878119886119905119895)(1)

Assume that there are 2 parties P = 119875119894 | 119894 = 1 2with 2 databases S = 119878119894 | 119894 = 1 2 Each party 119875119894 hasone database 119878119894 All databases share the same general attribute(column) set A = 119860 119905 | 119905 = 1 2 119901 and each attribute 119860 119905has several general discrete attribute values denoted by 119860 119905 =119886119905119895 | 119895 = 1 2 119898119905 and one class attribute 119862 = 119888119895 | 119895 =1 2 119898

Without considering privacy each party119875119894 shares his own|119878119886119905119895119888119895 |119894 |119878119886119905119895 |119894 and |119878119888119895 |119894 to all other parties As a result anyparty can calculate 119864119899119905119903119900119901119910(119878) and 119864119899119905119903119900119901119910(119878119886119905119895)

119864119899119905119903119900119901119910 (119878) =119898

sum119895=1

minus100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878| log2

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878|

=119898

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878| log2sum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878|

(2)

where 119878119888119895 is the subset of 119878 with tuples that have value 119888119895 forclass attribute 119862 |119878119888119895 |119894 equals the set of transactions with classattribute 119862 set to 119888119895 in database 119878119894

Then the value of 119864119899119905119903119900119901119910(119878119886119895) can be calculated as

119864119899119905119903119900119901119910 (119878119886119905119895) =119898119905

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895119888119895100381610038161003816100381610038161003816119894

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894sdot log2

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895119888119895

100381610038161003816100381610038161003816119894sum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894

(3)

where 119878119886119905119895119888119895 is the subset of 119878 with tuples that have values 119886119895for attribute 119860 119905 and 119888119895 for class attribute 119862

Therefore (3) can be easily computed by party 119875119894 andparties 119875119895 (119895 = 119894) all of the values |119878119886119905119895 |119894 and |119878119886119905119895119888119895 |119894 from itsdatabase Each party 119875119895 then sums these together with thevalues |119878119886119905119895 |119895 and |119878119886119905119895119888119895 |119895 from its database and completes thecomputation

Then each party can calculate 119866119886119894119899(119878 119860 119905) value at its ownside

22 Paillierrsquos Homomorphic Encryption Scheme Homomor-phic encryption is a special type of encryption in whichthe result of applying a special algebraic operation to plaintexts can be obtained by applying another algebraic operation(which may be different or the same) to the correspondingciphertexts Thus even when the user does not know theplain texts heshe can still obtain the results of applying thatalgebraic operation to the plain texts

Let1198981 and1198982 be two plain texts with encryptions 119864(1198981)and 119864(1198982) respectively

The Paillier encryption scheme [32] is described asfollows

119864 (1198981) oplus 119864 (1198982) = 119864 (1198981 + 1198982) (4)

23 Lirsquos Symmetric Homomorphic Encryption Scheme Thedescription of symmetric homomorphic encryption schemeproposed by Li et al [33] is as follows

(i) KeyGen()(119904 119902 119901) larr997888 119870119890119910119866119890119899 (120582) (5)

119870119890119910119866119890119899() is used to generate key for users as 119878119870 =(119904 119902) 119901 and 119902 are primes with the condition that 119901 ≫119902 119904 is chosen from Zlowast119873

(ii) Encsk()

119864119899119888119904119896 (119898 119889) = 119904119889 (119903119902 + 119898) mod119901 (6)

119889 is a small positive integer which is denoted asciphertext degree in this paper

4 Wireless Communications and Mobile Computing

1 The computation at Data ownerCompute 120591 = 119864119899119888119904119896(1) for the cloud2 The computation at cloud119906 V are chosen such that

119906 ≫ VV ≫ max(120572 120573)

(119902 minus 1)2 ≫ 120573 times 119906 + V

minus120572 times 119906 + V ≫ minus(119902 minus 1)23 The cloud compute the following value for the data ownerΦ = 119888119906 + 120591V mod119901 and sendsΦ4 Data Owner computes the following values120593 = 119863119890119888119904119896(119878119870Φ 119889) = (119898 times 119906 + V) mod 119902and compares 120593 with (119902 minus 1)2If 120593 lt (119902 minus 1)2119898 ge 0 Otherwise119898 lt 0

Algorithm 1 Secure outsourcing comparison (OSCP)

(iii) Decsk()

119863119890119888119904119896 (119888 119889) = (119888 times 119904minus119889 mod119901) mod 119902 (7)

231 Properties of the Proposed Homomorphic Encryption

Homomorphic Multiplication Let 1198881 1198882 be the ciphertexts oftwo plaintexts11989811198982Thenwe have 1198881 = 1199041198891(1199031119902+1198981) mod119901and 1198882 = 1199041198892(1199032119902 + 1198982) mod119901 for some random ingredients1199031 and 1199032 And we can obtain that

(1198881 times 1198882) mod119901

= 1199041198891 (1199031119902 + 1198981) mod119901 times 1199041198892 (1199032119902 + 1198982) mod119901

= 1199041198891+1198892 ((11990311199032119902 + 11990311198982 + 11990321198981) 119902 + 1198981 times 1198982) mod119901= 119864119899119888119904119896 (1198981 times 1198982 1198891 + 1198892)

(8)

Homomorphic Addition

(1198881 + 1198882) mod119901 = 119904119889 (1199031119902 + 1198981) mod119901

+ 119904119889 (1199032119902 + 1198982) mod119901

= 119904119889 ((1199031 + 1199032) 119902 + 1198981 + 1198982) mod119901= 119864119899119888119904119896 (1198981 + 1198982 119889)

(9)

Readers may refer to [18] for details on the scheme

24 Garbling Scheme Agarbling scheme [34] consists of fouralgorithms which is denoted by 119866 = (119866119887 119864119899119863119890 119864V) 119891 canbe transformed by Gb into (119865 119890 119889) Note that 119865 is the garbledcircuit The encoding and decoding information algorithmsare denoted by 119890 119889 The output of garbled 119884 can be encryptedand get the result 119910 = 119863119890(119889 119884)

25 Noninteractive Commitment A noninteractive commit-ment scheme [35] is also required in our paper denoted by

(Com119888119903119904Chk119888119903119904) The distribution of Com119888119903119904(119909 119903) is deter-mined by the value of 119903 as Com119888119903119904(119909)

26 Basic Cryptographic Subprotocols In this section wepresent a set of cryptographic subprotocols that will be usedas subroutines when constructing the proposed protocol

261 Outsourcing Secure Comparison Protocol (OSCP) Thevalue of119898 is kept secure from the cloud and users The valueof 119888 = 119864(119898mod 119902) is computed 119878119870 is kept by the data owner(Algorithm 1)

262 Secure Equivalent Testing Protocol (SET) With twociphertexts c1 = Encsk(m1) and c2 = Encsk(m2) SET is tocompute f and decide if the plaintexts are identical (m1 = m2)(Algorithm 2)

263 Secure Multiplication Protocol (SMP) The algorithm isdescribed as in Algorithm 3

264 Secure Minimum out of 2 Numbers Protocol (SMIN2)The algorithm is described as in Algorithm 4

27 SecureCircuit Protocol (SCP) Wedenote the three partiesof the protocol by CSS1 CSS2 and CCS and their respectiveinputs by x1 x2 or x

lowast3 Their goal is to securely compute the

function y = f(x1 x2 xlowast3 ) = x1 sdot x2xlowast3 [34] (Algorithm 5)For simplicity we assume that |xi| = |y| = m All

communication between parties is via private point-to-pointchannels Next we assume that CSS1 and CSS2 can learn thesame output y while CCS can get the garbled values for theportion of the output wires corresponding to its own outputsonly CCS cannot get the output y with these garbled valuesThis protocol uses a garbling scheme a four-tuple algorithm120575 = (GbEnDe Ev) as the underlying algorithm Gb is arandomized garbling algorithm that transforms a function ofa triple En and De are encoding and decoding algorithmsrespectively Ev is an algorithm that produces a garbled output

Wireless Communications and Mobile Computing 5

Two ciphertext are computed by the cloud as 1198881 = 119864119899119888119904119896(1198981) and 1198882 = 119864119899119888119904119896(1198982)1 The cloud computes 11988800 = 119864119899119888119904119896(11989800) = 1198881 minus 1198882 and 11988801 = 119864119899119888119904119896(11989801) = 1198882 minus 11988812 Check if11989800 ge 0 or11989801 ge 0 and computesif11989800 ge 0 and 11989801 lt 0 1198910 = 1 1198911 = 0 else if11989800 lt 0 and 11989801 ge 0 1198910 = 0 1198911 = 13 The value of 119891 is computed as follows

119891 = 1198910 oplus 1198911If 119891 = 0 set1198981 = 1198982

Algorithm 2 Secure equivalent testing protocol (SET)

The values are computed 119864119901119896(119909) and 119864119901119896(119910) 119875 keeps (119901119896 119904119896)1 119862(119886) Choose 119903119909 119903119910 isin 119885119873(119887) 1199091015840 larr997888 119864119901119896(119909)119864119901119896(119903119909) 1199101015840 larr997888 119864119901119896(119910)119864119901119896(119903119910)(119888) computes and sends 1199091015840 1199101015840 to 1198752 119875 computes(119886) ℎ119909 larr997888 119863119904119896(1199091015840) ℎ119910 larr997888 119863119904119896(1199101015840) ℎ larr997888 ℎ119909ℎ119910 mod119873 ℎ1015840 larr997888 119864119901119896(ℎ)(119887)The value of ℎ1015840 is sent to C3 119862 computes(119886) 119904 larr997888 ℎ1015840119864119901119896(119909)119873minus119903119910 1199041015840 larr997888 119904119864119901119896(119910)119873minus119903119909(119887) 119864119901119896(119909119910) larr997888 1199041015840119864119901119896(119903119909119903119910)119873minus1

Algorithm 3 119878119872(119864119901119896(119909) 119864119901119896(119910)) 997888rarr 119864119901119896(119909119910)

1 119862(119886)The function is chosen 119865(119887)for 119894 = 1 to 120582 do119864119901119896(119906119894V119894) larr997888 119878119872(119864119901119896(119906119894) 119864119901119896(V119894))if 119865 119906 gt V then119882119894 larr997888 119864119901119896(119906119894)119864119901119896(119906119894V119894)119873minus1

endelse119882119894 larr997888 119864119901119896(V119894)119864119901119896(119906119894V119894)119873minus1

end119866119894 larr997888 119864119901119896(119906119894 oplus V119894)119867119894 larr997888 119867119903119894119894minus1119866119894 119903119894isin119877119885119873 and1198670 = 119864(0)Φ119894 larr997888 119864119901119896(minus1)119867119894119871 119894 larr997888 119882119894Φ119903

1015840119894

119894 1199031015840119894 isin 119885119873end(119888) Sends 119871 to 1198752 119875(119886) 119872119894 larr997888 119863(119871 119894) for 1 le 119894 le 120582(119887)if exist119895 such that 119872119895 = 1 then120572 larr997888 1 (which means 119906 gt V)

endelse120572 larr997888 0 (which means 119906 lt V)

end

Algorithm 4 Secure Minimum out of 2 Numbers Protocol (SMIN2)

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 2: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

2 Wireless Communications and Mobile Computing

Data Distribution Three types of distributed datasets aredefined in related works including the horizontally dis-tributed datasets vertically distributed datasets and arbi-trarily distributed datasets The users in the horizontallydistributed dataset can keep divided parts for the sameattributes However in the vertical datasets users are allowedto keep different attributes In the last one the datasets can bearbitrarily divided and stored by the users

Due to the existence of malicious participants in thereal environment malicious participants may not follow theprotocol For example they can intentionally tamper with thedata suspend the protocol anytime during the execution ofthe protocol and so on To solve this problem this papercombines the noncontact commitment and confusion circuitmechanism studies the average computing protocol basedon confusion circuit and then proposes the framework ofa secure cryptography-based two-participant protocol withdata storage and computation outsourcing The frameworkconsists of two data owners and two cloud servers (cloud stor-age server (CSS) and cloud computing server (CCS)) Eachdata owner has a horizontally distributed private databasethat is encrypted before being outsourced to the cloud forstorage and computation

11 Our Contribution We decompose the key function ofdistributed ID3 decision tree 119866119886119894119899(119878 119860 119894) into counting(1199091 + 1199092)(1198861 + 1198862) sum multiplication and comparison

In counting we propose the Secure Equivalent Testing(SET) protocol to calculate the number of items for eachattribute value based on the encrypted data

To calculate the value of (1199091 + 1199092)(1198861 + 1198862) in maliciousmodel we implement the Outsourcing Secure Circuit (OSC)protocol

To perform the sum and multiplication operations overciphertext we adopt the Paillier encryption system andimplement the Secure Multiplication (SM) protocol

To execute comparison over ciphertext we adopt theSecure Minimum out of 2 Numbers (SMIN2) protocol

12 Related Work

Distributed PPDM without Outsourcing Distributed PPDMwithout outsourcing is mainly for data stored and calculatedlocally by the participant based on distributed data based onvarious data mining methods which can be decomposed todifferent operations such as average calculation calculationand calculation of logarithmic vector inner product Thenthe cryptography-based technology is used to design variousprivacy-preserving computing protocols In 2002 Lindell andPinkas [1] proposed a secure ID3 decision tree algorithmover horizontally partitioned data They decompose thedistributed ID3 algorithm to multilogarithmic calculationpolynomial evaluation calculation and data comparisonand then designed the security log protocol polynomialevaluation protocol and secure comparison protocol so asto achieve privacy-preserving in distributed ID3 algorithmIn 2007 Emekci et al [5] implemented a secure addition com-putational protocol based on the secret sharing algorithm andextended the secure logarithmic computing protocol from

two parties to multiple parties thus realizing the multipartyparticipation of the privacy protection ID3methodHoweverthe complexity of the algorithm increases exponentially whenthe participant data are more numerous In 2012 Lory et al[6] used Chebyshev polynomial expansion to replace Taylorexpansion in [1] thus further improving the computationalefficiency of secure logarithmic computing protocols How-ever their agreements still have limited efficiency in theimplementation of privacy protection protocols

Different from above in 2003 Vaidya et al [7 8] designeda multiparty privacy-preserving ID3 algorithm of verticallydistributed data setsThey vectorized all attribute value infor-mation by constructing constraint sets and then computedit by using the method of secure intersection protocol thusdesigning privacy-preserving ID3 for vertically distributeddata sets

In 2007 Han and Ng [9] proposed a multiparty dis-tributed privacy-preserving ID3 method based on arbitrarydistributed data sets Firstly each participantrsquos data set isvectorized and then the attribute value information is com-puted by using security intersection protocol and so onThen the entropy value of each attribute is computed byusing security logarithm computation protocol and so onThus the ID3 decision tree classification method of privacyprotection based on arbitrary distributed data set is obtainedHowever with the increase of the number of participants thecomputing volume of the client increases exponentially

Li et al [10] and Gao et al [11] addressed the Naive BayesLearning for aggregated arbitrary distributed databases

PPDM with Computation Outsourcing Cryptography-basedprivacy-preserving data mining has a lot of encryption anddecryption operations in the computation processThereforeit is difficult for large-scale data processing As a measurefor solving resource-restricted problems the outsourcingtechnique has been widely used in cloud computing appli-cations such as data sharing [12 13] data storing [14 15]data updating [16 17] and social network analysis [18 19]In this context we need to rely on security outsourcingtechnology to outsource computing or storage tasks of allparticipants to the cloud to process thus greatly reducingthe computingstorage load of each participant In 2014 Liuet al [3] adopted a new encryption scheme that supportsboth addition and multiplication over cipher texts In thisscheme most of the computations are performed on thecloud which reduces the computation workload of the dataowner However the scheme is limited to a single partyrsquos datamining operation Chen et al [20] designed new algorithmsfor secure outsourcing of modular exponentiations In 2015Bost et al [21] proposed the privacy-preserving hyperplanedecision Naive Bayesian and decision tree classificationalgorithms and through the semihonest model secure two-party computation model to prove that the above schemecan satisfy the semantic security (Semantic Security) andthe related protocol makes it possible to design an adaptiveenhancement algorithm (Adaptive Boosting) combine tofurther enhance the accuracy of the algorithm building aclassifier can be used to construct the privacy protectionof the library the further development of the classification

Wireless Communications and Mobile Computing 3

algorithms for privacy-preserving technology in the futurelays a solid foundation

PPDM with Multiparticipant Data Storage and ComputationOutsourcing In 2013 Peter et al [22] proposed a newsolution for the outsourcing of multiparty computationSuch a technique can be used in our setting But as thesecurity analysis in the previous works they can only achievesecurity in the semihonest model In [23] a new protocolwas proposed to achieve data mining for two parties In[24] association rule mining was addressed in the maliciousmodel In [25] the privacy-preservingKNN classificationwasaddressed In [26] the deep learning task was addressedBesides the above related work several fundamental securealgorithms such as dynamic homomorphic encryption [2728] authentication [29 30] and light-weight multipartycomputation [31] which have also been considered in themalicious model have been proposed However to thebest of our knowledge no existing study has considereda method for outsourcing computation in the maliciousmodel

In this study the secure outsourcing of ID3 data min-ing is considered in the malicious model for the cloudenvironment We show how to solve the outsourcingproblem for ID3 protocol over horizontally partitioneddata

2 Preliminaries

In this section we present a brief overview of the prelim-inaries used in this paper including the ID3 decision treealgorithm Paillierrsquos homomorphic encryption scheme andthe other related protocols

21 Distribute ID3 Decision Tree Algorithm The ID3 algo-rithm description is given as follows It builds a decisiontree in a top-down manner with the information of samplesStarting at the root the best object classification will beobtained The best prediction is computed with the informa-tion gain The information gain of an attribute 119860 119905 is definedas

119866119886119894119899 (119878 119860 119905) = 119864119899119905119903119900119901119910 (119878)

minus sum119886119905119895isin119860119905

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816

|119878| 119864119899119905119903119900119901119910 (119878119886119905119895)(1)

Assume that there are 2 parties P = 119875119894 | 119894 = 1 2with 2 databases S = 119878119894 | 119894 = 1 2 Each party 119875119894 hasone database 119878119894 All databases share the same general attribute(column) set A = 119860 119905 | 119905 = 1 2 119901 and each attribute 119860 119905has several general discrete attribute values denoted by 119860 119905 =119886119905119895 | 119895 = 1 2 119898119905 and one class attribute 119862 = 119888119895 | 119895 =1 2 119898

Without considering privacy each party119875119894 shares his own|119878119886119905119895119888119895 |119894 |119878119886119905119895 |119894 and |119878119888119895 |119894 to all other parties As a result anyparty can calculate 119864119899119905119903119900119901119910(119878) and 119864119899119905119903119900119901119910(119878119886119905119895)

119864119899119905119903119900119901119910 (119878) =119898

sum119895=1

minus100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878| log2

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878|

=119898

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878| log2sum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878|

(2)

where 119878119888119895 is the subset of 119878 with tuples that have value 119888119895 forclass attribute 119862 |119878119888119895 |119894 equals the set of transactions with classattribute 119862 set to 119888119895 in database 119878119894

Then the value of 119864119899119905119903119900119901119910(119878119886119895) can be calculated as

119864119899119905119903119900119901119910 (119878119886119905119895) =119898119905

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895119888119895100381610038161003816100381610038161003816119894

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894sdot log2

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895119888119895

100381610038161003816100381610038161003816119894sum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894

(3)

where 119878119886119905119895119888119895 is the subset of 119878 with tuples that have values 119886119895for attribute 119860 119905 and 119888119895 for class attribute 119862

Therefore (3) can be easily computed by party 119875119894 andparties 119875119895 (119895 = 119894) all of the values |119878119886119905119895 |119894 and |119878119886119905119895119888119895 |119894 from itsdatabase Each party 119875119895 then sums these together with thevalues |119878119886119905119895 |119895 and |119878119886119905119895119888119895 |119895 from its database and completes thecomputation

Then each party can calculate 119866119886119894119899(119878 119860 119905) value at its ownside

22 Paillierrsquos Homomorphic Encryption Scheme Homomor-phic encryption is a special type of encryption in whichthe result of applying a special algebraic operation to plaintexts can be obtained by applying another algebraic operation(which may be different or the same) to the correspondingciphertexts Thus even when the user does not know theplain texts heshe can still obtain the results of applying thatalgebraic operation to the plain texts

Let1198981 and1198982 be two plain texts with encryptions 119864(1198981)and 119864(1198982) respectively

The Paillier encryption scheme [32] is described asfollows

119864 (1198981) oplus 119864 (1198982) = 119864 (1198981 + 1198982) (4)

23 Lirsquos Symmetric Homomorphic Encryption Scheme Thedescription of symmetric homomorphic encryption schemeproposed by Li et al [33] is as follows

(i) KeyGen()(119904 119902 119901) larr997888 119870119890119910119866119890119899 (120582) (5)

119870119890119910119866119890119899() is used to generate key for users as 119878119870 =(119904 119902) 119901 and 119902 are primes with the condition that 119901 ≫119902 119904 is chosen from Zlowast119873

(ii) Encsk()

119864119899119888119904119896 (119898 119889) = 119904119889 (119903119902 + 119898) mod119901 (6)

119889 is a small positive integer which is denoted asciphertext degree in this paper

4 Wireless Communications and Mobile Computing

1 The computation at Data ownerCompute 120591 = 119864119899119888119904119896(1) for the cloud2 The computation at cloud119906 V are chosen such that

119906 ≫ VV ≫ max(120572 120573)

(119902 minus 1)2 ≫ 120573 times 119906 + V

minus120572 times 119906 + V ≫ minus(119902 minus 1)23 The cloud compute the following value for the data ownerΦ = 119888119906 + 120591V mod119901 and sendsΦ4 Data Owner computes the following values120593 = 119863119890119888119904119896(119878119870Φ 119889) = (119898 times 119906 + V) mod 119902and compares 120593 with (119902 minus 1)2If 120593 lt (119902 minus 1)2119898 ge 0 Otherwise119898 lt 0

Algorithm 1 Secure outsourcing comparison (OSCP)

(iii) Decsk()

119863119890119888119904119896 (119888 119889) = (119888 times 119904minus119889 mod119901) mod 119902 (7)

231 Properties of the Proposed Homomorphic Encryption

Homomorphic Multiplication Let 1198881 1198882 be the ciphertexts oftwo plaintexts11989811198982Thenwe have 1198881 = 1199041198891(1199031119902+1198981) mod119901and 1198882 = 1199041198892(1199032119902 + 1198982) mod119901 for some random ingredients1199031 and 1199032 And we can obtain that

(1198881 times 1198882) mod119901

= 1199041198891 (1199031119902 + 1198981) mod119901 times 1199041198892 (1199032119902 + 1198982) mod119901

= 1199041198891+1198892 ((11990311199032119902 + 11990311198982 + 11990321198981) 119902 + 1198981 times 1198982) mod119901= 119864119899119888119904119896 (1198981 times 1198982 1198891 + 1198892)

(8)

Homomorphic Addition

(1198881 + 1198882) mod119901 = 119904119889 (1199031119902 + 1198981) mod119901

+ 119904119889 (1199032119902 + 1198982) mod119901

= 119904119889 ((1199031 + 1199032) 119902 + 1198981 + 1198982) mod119901= 119864119899119888119904119896 (1198981 + 1198982 119889)

(9)

Readers may refer to [18] for details on the scheme

24 Garbling Scheme Agarbling scheme [34] consists of fouralgorithms which is denoted by 119866 = (119866119887 119864119899119863119890 119864V) 119891 canbe transformed by Gb into (119865 119890 119889) Note that 119865 is the garbledcircuit The encoding and decoding information algorithmsare denoted by 119890 119889 The output of garbled 119884 can be encryptedand get the result 119910 = 119863119890(119889 119884)

25 Noninteractive Commitment A noninteractive commit-ment scheme [35] is also required in our paper denoted by

(Com119888119903119904Chk119888119903119904) The distribution of Com119888119903119904(119909 119903) is deter-mined by the value of 119903 as Com119888119903119904(119909)

26 Basic Cryptographic Subprotocols In this section wepresent a set of cryptographic subprotocols that will be usedas subroutines when constructing the proposed protocol

261 Outsourcing Secure Comparison Protocol (OSCP) Thevalue of119898 is kept secure from the cloud and users The valueof 119888 = 119864(119898mod 119902) is computed 119878119870 is kept by the data owner(Algorithm 1)

262 Secure Equivalent Testing Protocol (SET) With twociphertexts c1 = Encsk(m1) and c2 = Encsk(m2) SET is tocompute f and decide if the plaintexts are identical (m1 = m2)(Algorithm 2)

263 Secure Multiplication Protocol (SMP) The algorithm isdescribed as in Algorithm 3

264 Secure Minimum out of 2 Numbers Protocol (SMIN2)The algorithm is described as in Algorithm 4

27 SecureCircuit Protocol (SCP) Wedenote the three partiesof the protocol by CSS1 CSS2 and CCS and their respectiveinputs by x1 x2 or x

lowast3 Their goal is to securely compute the

function y = f(x1 x2 xlowast3 ) = x1 sdot x2xlowast3 [34] (Algorithm 5)For simplicity we assume that |xi| = |y| = m All

communication between parties is via private point-to-pointchannels Next we assume that CSS1 and CSS2 can learn thesame output y while CCS can get the garbled values for theportion of the output wires corresponding to its own outputsonly CCS cannot get the output y with these garbled valuesThis protocol uses a garbling scheme a four-tuple algorithm120575 = (GbEnDe Ev) as the underlying algorithm Gb is arandomized garbling algorithm that transforms a function ofa triple En and De are encoding and decoding algorithmsrespectively Ev is an algorithm that produces a garbled output

Wireless Communications and Mobile Computing 5

Two ciphertext are computed by the cloud as 1198881 = 119864119899119888119904119896(1198981) and 1198882 = 119864119899119888119904119896(1198982)1 The cloud computes 11988800 = 119864119899119888119904119896(11989800) = 1198881 minus 1198882 and 11988801 = 119864119899119888119904119896(11989801) = 1198882 minus 11988812 Check if11989800 ge 0 or11989801 ge 0 and computesif11989800 ge 0 and 11989801 lt 0 1198910 = 1 1198911 = 0 else if11989800 lt 0 and 11989801 ge 0 1198910 = 0 1198911 = 13 The value of 119891 is computed as follows

119891 = 1198910 oplus 1198911If 119891 = 0 set1198981 = 1198982

Algorithm 2 Secure equivalent testing protocol (SET)

The values are computed 119864119901119896(119909) and 119864119901119896(119910) 119875 keeps (119901119896 119904119896)1 119862(119886) Choose 119903119909 119903119910 isin 119885119873(119887) 1199091015840 larr997888 119864119901119896(119909)119864119901119896(119903119909) 1199101015840 larr997888 119864119901119896(119910)119864119901119896(119903119910)(119888) computes and sends 1199091015840 1199101015840 to 1198752 119875 computes(119886) ℎ119909 larr997888 119863119904119896(1199091015840) ℎ119910 larr997888 119863119904119896(1199101015840) ℎ larr997888 ℎ119909ℎ119910 mod119873 ℎ1015840 larr997888 119864119901119896(ℎ)(119887)The value of ℎ1015840 is sent to C3 119862 computes(119886) 119904 larr997888 ℎ1015840119864119901119896(119909)119873minus119903119910 1199041015840 larr997888 119904119864119901119896(119910)119873minus119903119909(119887) 119864119901119896(119909119910) larr997888 1199041015840119864119901119896(119903119909119903119910)119873minus1

Algorithm 3 119878119872(119864119901119896(119909) 119864119901119896(119910)) 997888rarr 119864119901119896(119909119910)

1 119862(119886)The function is chosen 119865(119887)for 119894 = 1 to 120582 do119864119901119896(119906119894V119894) larr997888 119878119872(119864119901119896(119906119894) 119864119901119896(V119894))if 119865 119906 gt V then119882119894 larr997888 119864119901119896(119906119894)119864119901119896(119906119894V119894)119873minus1

endelse119882119894 larr997888 119864119901119896(V119894)119864119901119896(119906119894V119894)119873minus1

end119866119894 larr997888 119864119901119896(119906119894 oplus V119894)119867119894 larr997888 119867119903119894119894minus1119866119894 119903119894isin119877119885119873 and1198670 = 119864(0)Φ119894 larr997888 119864119901119896(minus1)119867119894119871 119894 larr997888 119882119894Φ119903

1015840119894

119894 1199031015840119894 isin 119885119873end(119888) Sends 119871 to 1198752 119875(119886) 119872119894 larr997888 119863(119871 119894) for 1 le 119894 le 120582(119887)if exist119895 such that 119872119895 = 1 then120572 larr997888 1 (which means 119906 gt V)

endelse120572 larr997888 0 (which means 119906 lt V)

end

Algorithm 4 Secure Minimum out of 2 Numbers Protocol (SMIN2)

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 3: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

Wireless Communications and Mobile Computing 3

algorithms for privacy-preserving technology in the futurelays a solid foundation

PPDM with Multiparticipant Data Storage and ComputationOutsourcing In 2013 Peter et al [22] proposed a newsolution for the outsourcing of multiparty computationSuch a technique can be used in our setting But as thesecurity analysis in the previous works they can only achievesecurity in the semihonest model In [23] a new protocolwas proposed to achieve data mining for two parties In[24] association rule mining was addressed in the maliciousmodel In [25] the privacy-preservingKNN classificationwasaddressed In [26] the deep learning task was addressedBesides the above related work several fundamental securealgorithms such as dynamic homomorphic encryption [2728] authentication [29 30] and light-weight multipartycomputation [31] which have also been considered in themalicious model have been proposed However to thebest of our knowledge no existing study has considereda method for outsourcing computation in the maliciousmodel

In this study the secure outsourcing of ID3 data min-ing is considered in the malicious model for the cloudenvironment We show how to solve the outsourcingproblem for ID3 protocol over horizontally partitioneddata

2 Preliminaries

In this section we present a brief overview of the prelim-inaries used in this paper including the ID3 decision treealgorithm Paillierrsquos homomorphic encryption scheme andthe other related protocols

21 Distribute ID3 Decision Tree Algorithm The ID3 algo-rithm description is given as follows It builds a decisiontree in a top-down manner with the information of samplesStarting at the root the best object classification will beobtained The best prediction is computed with the informa-tion gain The information gain of an attribute 119860 119905 is definedas

119866119886119894119899 (119878 119860 119905) = 119864119899119905119903119900119901119910 (119878)

minus sum119886119905119895isin119860119905

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816

|119878| 119864119899119905119903119900119901119910 (119878119886119905119895)(1)

Assume that there are 2 parties P = 119875119894 | 119894 = 1 2with 2 databases S = 119878119894 | 119894 = 1 2 Each party 119875119894 hasone database 119878119894 All databases share the same general attribute(column) set A = 119860 119905 | 119905 = 1 2 119901 and each attribute 119860 119905has several general discrete attribute values denoted by 119860 119905 =119886119905119895 | 119895 = 1 2 119898119905 and one class attribute 119862 = 119888119895 | 119895 =1 2 119898

Without considering privacy each party119875119894 shares his own|119878119886119905119895119888119895 |119894 |119878119886119905119895 |119894 and |119878119888119895 |119894 to all other parties As a result anyparty can calculate 119864119899119905119903119900119901119910(119878) and 119864119899119905119903119900119901119910(119878119886119905119895)

119864119899119905119903119900119901119910 (119878) =119898

sum119895=1

minus100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878| log2

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816

|119878|

=119898

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878| log2sum119899119894=1

100381610038161003816100381610038161003816119878119888119895100381610038161003816100381610038161003816119894

|119878|

(2)

where 119878119888119895 is the subset of 119878 with tuples that have value 119888119895 forclass attribute 119862 |119878119888119895 |119894 equals the set of transactions with classattribute 119862 set to 119888119895 in database 119878119894

Then the value of 119864119899119905119903119900119901119910(119878119886119895) can be calculated as

119864119899119905119903119900119901119910 (119878119886119905119895) =119898119905

sum119895=1

minussum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895119888119895100381610038161003816100381610038161003816119894

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894sdot log2

sum119899119894=1100381610038161003816100381610038161003816119878119886119905119895119888119895

100381610038161003816100381610038161003816119894sum119899119894=1

100381610038161003816100381610038161003816119878119886119905119895100381610038161003816100381610038161003816119894

(3)

where 119878119886119905119895119888119895 is the subset of 119878 with tuples that have values 119886119895for attribute 119860 119905 and 119888119895 for class attribute 119862

Therefore (3) can be easily computed by party 119875119894 andparties 119875119895 (119895 = 119894) all of the values |119878119886119905119895 |119894 and |119878119886119905119895119888119895 |119894 from itsdatabase Each party 119875119895 then sums these together with thevalues |119878119886119905119895 |119895 and |119878119886119905119895119888119895 |119895 from its database and completes thecomputation

Then each party can calculate 119866119886119894119899(119878 119860 119905) value at its ownside

22 Paillierrsquos Homomorphic Encryption Scheme Homomor-phic encryption is a special type of encryption in whichthe result of applying a special algebraic operation to plaintexts can be obtained by applying another algebraic operation(which may be different or the same) to the correspondingciphertexts Thus even when the user does not know theplain texts heshe can still obtain the results of applying thatalgebraic operation to the plain texts

Let1198981 and1198982 be two plain texts with encryptions 119864(1198981)and 119864(1198982) respectively

The Paillier encryption scheme [32] is described asfollows

119864 (1198981) oplus 119864 (1198982) = 119864 (1198981 + 1198982) (4)

23 Lirsquos Symmetric Homomorphic Encryption Scheme Thedescription of symmetric homomorphic encryption schemeproposed by Li et al [33] is as follows

(i) KeyGen()(119904 119902 119901) larr997888 119870119890119910119866119890119899 (120582) (5)

119870119890119910119866119890119899() is used to generate key for users as 119878119870 =(119904 119902) 119901 and 119902 are primes with the condition that 119901 ≫119902 119904 is chosen from Zlowast119873

(ii) Encsk()

119864119899119888119904119896 (119898 119889) = 119904119889 (119903119902 + 119898) mod119901 (6)

119889 is a small positive integer which is denoted asciphertext degree in this paper

4 Wireless Communications and Mobile Computing

1 The computation at Data ownerCompute 120591 = 119864119899119888119904119896(1) for the cloud2 The computation at cloud119906 V are chosen such that

119906 ≫ VV ≫ max(120572 120573)

(119902 minus 1)2 ≫ 120573 times 119906 + V

minus120572 times 119906 + V ≫ minus(119902 minus 1)23 The cloud compute the following value for the data ownerΦ = 119888119906 + 120591V mod119901 and sendsΦ4 Data Owner computes the following values120593 = 119863119890119888119904119896(119878119870Φ 119889) = (119898 times 119906 + V) mod 119902and compares 120593 with (119902 minus 1)2If 120593 lt (119902 minus 1)2119898 ge 0 Otherwise119898 lt 0

Algorithm 1 Secure outsourcing comparison (OSCP)

(iii) Decsk()

119863119890119888119904119896 (119888 119889) = (119888 times 119904minus119889 mod119901) mod 119902 (7)

231 Properties of the Proposed Homomorphic Encryption

Homomorphic Multiplication Let 1198881 1198882 be the ciphertexts oftwo plaintexts11989811198982Thenwe have 1198881 = 1199041198891(1199031119902+1198981) mod119901and 1198882 = 1199041198892(1199032119902 + 1198982) mod119901 for some random ingredients1199031 and 1199032 And we can obtain that

(1198881 times 1198882) mod119901

= 1199041198891 (1199031119902 + 1198981) mod119901 times 1199041198892 (1199032119902 + 1198982) mod119901

= 1199041198891+1198892 ((11990311199032119902 + 11990311198982 + 11990321198981) 119902 + 1198981 times 1198982) mod119901= 119864119899119888119904119896 (1198981 times 1198982 1198891 + 1198892)

(8)

Homomorphic Addition

(1198881 + 1198882) mod119901 = 119904119889 (1199031119902 + 1198981) mod119901

+ 119904119889 (1199032119902 + 1198982) mod119901

= 119904119889 ((1199031 + 1199032) 119902 + 1198981 + 1198982) mod119901= 119864119899119888119904119896 (1198981 + 1198982 119889)

(9)

Readers may refer to [18] for details on the scheme

24 Garbling Scheme Agarbling scheme [34] consists of fouralgorithms which is denoted by 119866 = (119866119887 119864119899119863119890 119864V) 119891 canbe transformed by Gb into (119865 119890 119889) Note that 119865 is the garbledcircuit The encoding and decoding information algorithmsare denoted by 119890 119889 The output of garbled 119884 can be encryptedand get the result 119910 = 119863119890(119889 119884)

25 Noninteractive Commitment A noninteractive commit-ment scheme [35] is also required in our paper denoted by

(Com119888119903119904Chk119888119903119904) The distribution of Com119888119903119904(119909 119903) is deter-mined by the value of 119903 as Com119888119903119904(119909)

26 Basic Cryptographic Subprotocols In this section wepresent a set of cryptographic subprotocols that will be usedas subroutines when constructing the proposed protocol

261 Outsourcing Secure Comparison Protocol (OSCP) Thevalue of119898 is kept secure from the cloud and users The valueof 119888 = 119864(119898mod 119902) is computed 119878119870 is kept by the data owner(Algorithm 1)

262 Secure Equivalent Testing Protocol (SET) With twociphertexts c1 = Encsk(m1) and c2 = Encsk(m2) SET is tocompute f and decide if the plaintexts are identical (m1 = m2)(Algorithm 2)

263 Secure Multiplication Protocol (SMP) The algorithm isdescribed as in Algorithm 3

264 Secure Minimum out of 2 Numbers Protocol (SMIN2)The algorithm is described as in Algorithm 4

27 SecureCircuit Protocol (SCP) Wedenote the three partiesof the protocol by CSS1 CSS2 and CCS and their respectiveinputs by x1 x2 or x

lowast3 Their goal is to securely compute the

function y = f(x1 x2 xlowast3 ) = x1 sdot x2xlowast3 [34] (Algorithm 5)For simplicity we assume that |xi| = |y| = m All

communication between parties is via private point-to-pointchannels Next we assume that CSS1 and CSS2 can learn thesame output y while CCS can get the garbled values for theportion of the output wires corresponding to its own outputsonly CCS cannot get the output y with these garbled valuesThis protocol uses a garbling scheme a four-tuple algorithm120575 = (GbEnDe Ev) as the underlying algorithm Gb is arandomized garbling algorithm that transforms a function ofa triple En and De are encoding and decoding algorithmsrespectively Ev is an algorithm that produces a garbled output

Wireless Communications and Mobile Computing 5

Two ciphertext are computed by the cloud as 1198881 = 119864119899119888119904119896(1198981) and 1198882 = 119864119899119888119904119896(1198982)1 The cloud computes 11988800 = 119864119899119888119904119896(11989800) = 1198881 minus 1198882 and 11988801 = 119864119899119888119904119896(11989801) = 1198882 minus 11988812 Check if11989800 ge 0 or11989801 ge 0 and computesif11989800 ge 0 and 11989801 lt 0 1198910 = 1 1198911 = 0 else if11989800 lt 0 and 11989801 ge 0 1198910 = 0 1198911 = 13 The value of 119891 is computed as follows

119891 = 1198910 oplus 1198911If 119891 = 0 set1198981 = 1198982

Algorithm 2 Secure equivalent testing protocol (SET)

The values are computed 119864119901119896(119909) and 119864119901119896(119910) 119875 keeps (119901119896 119904119896)1 119862(119886) Choose 119903119909 119903119910 isin 119885119873(119887) 1199091015840 larr997888 119864119901119896(119909)119864119901119896(119903119909) 1199101015840 larr997888 119864119901119896(119910)119864119901119896(119903119910)(119888) computes and sends 1199091015840 1199101015840 to 1198752 119875 computes(119886) ℎ119909 larr997888 119863119904119896(1199091015840) ℎ119910 larr997888 119863119904119896(1199101015840) ℎ larr997888 ℎ119909ℎ119910 mod119873 ℎ1015840 larr997888 119864119901119896(ℎ)(119887)The value of ℎ1015840 is sent to C3 119862 computes(119886) 119904 larr997888 ℎ1015840119864119901119896(119909)119873minus119903119910 1199041015840 larr997888 119904119864119901119896(119910)119873minus119903119909(119887) 119864119901119896(119909119910) larr997888 1199041015840119864119901119896(119903119909119903119910)119873minus1

Algorithm 3 119878119872(119864119901119896(119909) 119864119901119896(119910)) 997888rarr 119864119901119896(119909119910)

1 119862(119886)The function is chosen 119865(119887)for 119894 = 1 to 120582 do119864119901119896(119906119894V119894) larr997888 119878119872(119864119901119896(119906119894) 119864119901119896(V119894))if 119865 119906 gt V then119882119894 larr997888 119864119901119896(119906119894)119864119901119896(119906119894V119894)119873minus1

endelse119882119894 larr997888 119864119901119896(V119894)119864119901119896(119906119894V119894)119873minus1

end119866119894 larr997888 119864119901119896(119906119894 oplus V119894)119867119894 larr997888 119867119903119894119894minus1119866119894 119903119894isin119877119885119873 and1198670 = 119864(0)Φ119894 larr997888 119864119901119896(minus1)119867119894119871 119894 larr997888 119882119894Φ119903

1015840119894

119894 1199031015840119894 isin 119885119873end(119888) Sends 119871 to 1198752 119875(119886) 119872119894 larr997888 119863(119871 119894) for 1 le 119894 le 120582(119887)if exist119895 such that 119872119895 = 1 then120572 larr997888 1 (which means 119906 gt V)

endelse120572 larr997888 0 (which means 119906 lt V)

end

Algorithm 4 Secure Minimum out of 2 Numbers Protocol (SMIN2)

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 4: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

4 Wireless Communications and Mobile Computing

1 The computation at Data ownerCompute 120591 = 119864119899119888119904119896(1) for the cloud2 The computation at cloud119906 V are chosen such that

119906 ≫ VV ≫ max(120572 120573)

(119902 minus 1)2 ≫ 120573 times 119906 + V

minus120572 times 119906 + V ≫ minus(119902 minus 1)23 The cloud compute the following value for the data ownerΦ = 119888119906 + 120591V mod119901 and sendsΦ4 Data Owner computes the following values120593 = 119863119890119888119904119896(119878119870Φ 119889) = (119898 times 119906 + V) mod 119902and compares 120593 with (119902 minus 1)2If 120593 lt (119902 minus 1)2119898 ge 0 Otherwise119898 lt 0

Algorithm 1 Secure outsourcing comparison (OSCP)

(iii) Decsk()

119863119890119888119904119896 (119888 119889) = (119888 times 119904minus119889 mod119901) mod 119902 (7)

231 Properties of the Proposed Homomorphic Encryption

Homomorphic Multiplication Let 1198881 1198882 be the ciphertexts oftwo plaintexts11989811198982Thenwe have 1198881 = 1199041198891(1199031119902+1198981) mod119901and 1198882 = 1199041198892(1199032119902 + 1198982) mod119901 for some random ingredients1199031 and 1199032 And we can obtain that

(1198881 times 1198882) mod119901

= 1199041198891 (1199031119902 + 1198981) mod119901 times 1199041198892 (1199032119902 + 1198982) mod119901

= 1199041198891+1198892 ((11990311199032119902 + 11990311198982 + 11990321198981) 119902 + 1198981 times 1198982) mod119901= 119864119899119888119904119896 (1198981 times 1198982 1198891 + 1198892)

(8)

Homomorphic Addition

(1198881 + 1198882) mod119901 = 119904119889 (1199031119902 + 1198981) mod119901

+ 119904119889 (1199032119902 + 1198982) mod119901

= 119904119889 ((1199031 + 1199032) 119902 + 1198981 + 1198982) mod119901= 119864119899119888119904119896 (1198981 + 1198982 119889)

(9)

Readers may refer to [18] for details on the scheme

24 Garbling Scheme Agarbling scheme [34] consists of fouralgorithms which is denoted by 119866 = (119866119887 119864119899119863119890 119864V) 119891 canbe transformed by Gb into (119865 119890 119889) Note that 119865 is the garbledcircuit The encoding and decoding information algorithmsare denoted by 119890 119889 The output of garbled 119884 can be encryptedand get the result 119910 = 119863119890(119889 119884)

25 Noninteractive Commitment A noninteractive commit-ment scheme [35] is also required in our paper denoted by

(Com119888119903119904Chk119888119903119904) The distribution of Com119888119903119904(119909 119903) is deter-mined by the value of 119903 as Com119888119903119904(119909)

26 Basic Cryptographic Subprotocols In this section wepresent a set of cryptographic subprotocols that will be usedas subroutines when constructing the proposed protocol

261 Outsourcing Secure Comparison Protocol (OSCP) Thevalue of119898 is kept secure from the cloud and users The valueof 119888 = 119864(119898mod 119902) is computed 119878119870 is kept by the data owner(Algorithm 1)

262 Secure Equivalent Testing Protocol (SET) With twociphertexts c1 = Encsk(m1) and c2 = Encsk(m2) SET is tocompute f and decide if the plaintexts are identical (m1 = m2)(Algorithm 2)

263 Secure Multiplication Protocol (SMP) The algorithm isdescribed as in Algorithm 3

264 Secure Minimum out of 2 Numbers Protocol (SMIN2)The algorithm is described as in Algorithm 4

27 SecureCircuit Protocol (SCP) Wedenote the three partiesof the protocol by CSS1 CSS2 and CCS and their respectiveinputs by x1 x2 or x

lowast3 Their goal is to securely compute the

function y = f(x1 x2 xlowast3 ) = x1 sdot x2xlowast3 [34] (Algorithm 5)For simplicity we assume that |xi| = |y| = m All

communication between parties is via private point-to-pointchannels Next we assume that CSS1 and CSS2 can learn thesame output y while CCS can get the garbled values for theportion of the output wires corresponding to its own outputsonly CCS cannot get the output y with these garbled valuesThis protocol uses a garbling scheme a four-tuple algorithm120575 = (GbEnDe Ev) as the underlying algorithm Gb is arandomized garbling algorithm that transforms a function ofa triple En and De are encoding and decoding algorithmsrespectively Ev is an algorithm that produces a garbled output

Wireless Communications and Mobile Computing 5

Two ciphertext are computed by the cloud as 1198881 = 119864119899119888119904119896(1198981) and 1198882 = 119864119899119888119904119896(1198982)1 The cloud computes 11988800 = 119864119899119888119904119896(11989800) = 1198881 minus 1198882 and 11988801 = 119864119899119888119904119896(11989801) = 1198882 minus 11988812 Check if11989800 ge 0 or11989801 ge 0 and computesif11989800 ge 0 and 11989801 lt 0 1198910 = 1 1198911 = 0 else if11989800 lt 0 and 11989801 ge 0 1198910 = 0 1198911 = 13 The value of 119891 is computed as follows

119891 = 1198910 oplus 1198911If 119891 = 0 set1198981 = 1198982

Algorithm 2 Secure equivalent testing protocol (SET)

The values are computed 119864119901119896(119909) and 119864119901119896(119910) 119875 keeps (119901119896 119904119896)1 119862(119886) Choose 119903119909 119903119910 isin 119885119873(119887) 1199091015840 larr997888 119864119901119896(119909)119864119901119896(119903119909) 1199101015840 larr997888 119864119901119896(119910)119864119901119896(119903119910)(119888) computes and sends 1199091015840 1199101015840 to 1198752 119875 computes(119886) ℎ119909 larr997888 119863119904119896(1199091015840) ℎ119910 larr997888 119863119904119896(1199101015840) ℎ larr997888 ℎ119909ℎ119910 mod119873 ℎ1015840 larr997888 119864119901119896(ℎ)(119887)The value of ℎ1015840 is sent to C3 119862 computes(119886) 119904 larr997888 ℎ1015840119864119901119896(119909)119873minus119903119910 1199041015840 larr997888 119904119864119901119896(119910)119873minus119903119909(119887) 119864119901119896(119909119910) larr997888 1199041015840119864119901119896(119903119909119903119910)119873minus1

Algorithm 3 119878119872(119864119901119896(119909) 119864119901119896(119910)) 997888rarr 119864119901119896(119909119910)

1 119862(119886)The function is chosen 119865(119887)for 119894 = 1 to 120582 do119864119901119896(119906119894V119894) larr997888 119878119872(119864119901119896(119906119894) 119864119901119896(V119894))if 119865 119906 gt V then119882119894 larr997888 119864119901119896(119906119894)119864119901119896(119906119894V119894)119873minus1

endelse119882119894 larr997888 119864119901119896(V119894)119864119901119896(119906119894V119894)119873minus1

end119866119894 larr997888 119864119901119896(119906119894 oplus V119894)119867119894 larr997888 119867119903119894119894minus1119866119894 119903119894isin119877119885119873 and1198670 = 119864(0)Φ119894 larr997888 119864119901119896(minus1)119867119894119871 119894 larr997888 119882119894Φ119903

1015840119894

119894 1199031015840119894 isin 119885119873end(119888) Sends 119871 to 1198752 119875(119886) 119872119894 larr997888 119863(119871 119894) for 1 le 119894 le 120582(119887)if exist119895 such that 119872119895 = 1 then120572 larr997888 1 (which means 119906 gt V)

endelse120572 larr997888 0 (which means 119906 lt V)

end

Algorithm 4 Secure Minimum out of 2 Numbers Protocol (SMIN2)

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 5: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

Wireless Communications and Mobile Computing 5

Two ciphertext are computed by the cloud as 1198881 = 119864119899119888119904119896(1198981) and 1198882 = 119864119899119888119904119896(1198982)1 The cloud computes 11988800 = 119864119899119888119904119896(11989800) = 1198881 minus 1198882 and 11988801 = 119864119899119888119904119896(11989801) = 1198882 minus 11988812 Check if11989800 ge 0 or11989801 ge 0 and computesif11989800 ge 0 and 11989801 lt 0 1198910 = 1 1198911 = 0 else if11989800 lt 0 and 11989801 ge 0 1198910 = 0 1198911 = 13 The value of 119891 is computed as follows

119891 = 1198910 oplus 1198911If 119891 = 0 set1198981 = 1198982

Algorithm 2 Secure equivalent testing protocol (SET)

The values are computed 119864119901119896(119909) and 119864119901119896(119910) 119875 keeps (119901119896 119904119896)1 119862(119886) Choose 119903119909 119903119910 isin 119885119873(119887) 1199091015840 larr997888 119864119901119896(119909)119864119901119896(119903119909) 1199101015840 larr997888 119864119901119896(119910)119864119901119896(119903119910)(119888) computes and sends 1199091015840 1199101015840 to 1198752 119875 computes(119886) ℎ119909 larr997888 119863119904119896(1199091015840) ℎ119910 larr997888 119863119904119896(1199101015840) ℎ larr997888 ℎ119909ℎ119910 mod119873 ℎ1015840 larr997888 119864119901119896(ℎ)(119887)The value of ℎ1015840 is sent to C3 119862 computes(119886) 119904 larr997888 ℎ1015840119864119901119896(119909)119873minus119903119910 1199041015840 larr997888 119904119864119901119896(119910)119873minus119903119909(119887) 119864119901119896(119909119910) larr997888 1199041015840119864119901119896(119903119909119903119910)119873minus1

Algorithm 3 119878119872(119864119901119896(119909) 119864119901119896(119910)) 997888rarr 119864119901119896(119909119910)

1 119862(119886)The function is chosen 119865(119887)for 119894 = 1 to 120582 do119864119901119896(119906119894V119894) larr997888 119878119872(119864119901119896(119906119894) 119864119901119896(V119894))if 119865 119906 gt V then119882119894 larr997888 119864119901119896(119906119894)119864119901119896(119906119894V119894)119873minus1

endelse119882119894 larr997888 119864119901119896(V119894)119864119901119896(119906119894V119894)119873minus1

end119866119894 larr997888 119864119901119896(119906119894 oplus V119894)119867119894 larr997888 119867119903119894119894minus1119866119894 119903119894isin119877119885119873 and1198670 = 119864(0)Φ119894 larr997888 119864119901119896(minus1)119867119894119871 119894 larr997888 119882119894Φ119903

1015840119894

119894 1199031015840119894 isin 119885119873end(119888) Sends 119871 to 1198752 119875(119886) 119872119894 larr997888 119863(119871 119894) for 1 le 119894 le 120582(119887)if exist119895 such that 119872119895 = 1 then120572 larr997888 1 (which means 119906 gt V)

endelse120572 larr997888 0 (which means 119906 lt V)

end

Algorithm 4 Secure Minimum out of 2 Numbers Protocol (SMIN2)

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 6: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

6 Wireless Communications and Mobile Computing

Input CSS1 has 1199091 CSS2 has 1199092Output (1199091 + 1199092)(1198861 + 1198862)1 CCS(119886) Randomly selects 119888119903119904 for commitment and sends 119888119903119904 to 1198621198781198781 and 1198621198781198782(119887) Generates the random number 120582 and share it secretly as 120582 = 1205821 oplus 1205822 sends 1198971198861198981198871198891198861 to 1198621198781198781 and sends 1198971198861198981198871198891198862 to 11986211987811987822 1198621198781198781 Select seed 119903 larr997888 0 1119896 for pseudo random function 119875119877119865 and send 119903 to 11986211987811987823 CSS1 and CSS2 (119886) Generate corresponding circuit 119866119887(1120581 119891) 997888rarr (119865 119890 119889) based on function 119891 = (1199091 + 1199092) lowast 120582 (119887) Randomselection of 1198871 1198872 larr997888 0 14119898 and generate the following commitments for all 119895 isin [4119898] and 119905 isin 0 1(1198621199051119895 1205901199051119895) larr997888 Com119888119903119904(119890[119895 1198871[119895] oplus 119905])(1198621199052119895 1205901199052119895) larr997888 Com119888119903119904(119890[119895 1198872[119895] oplus 119905])

(119888) 1198621198781198781 and 1198621198781198782 send the following information to 119862119862119878(1198871 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199051119895119895119905)(1198872 [2119898 + 1 sdot sdot sdot 4119898] 119865 1198621199052119895119895119905)

4 CCS Abort if CSS1 and CSS2 report different values for these items5 CSS1 and CSS2(119886) CSS1 sends decommitment1205901199091[119895]oplus1198871[119895]1119895 1205901205821[119895]oplus1198871[2119898+119895]12119898+119895 1205901199091[119895]oplus1198872[119895]2119895 and 1205901205821 [119895]oplus1198872[2119898+119895]22119898+119895 to CCS(119887) CSS2 sends decommitment 1205901199092[119895]oplus1198871[119898+119895]1119898+119895 1205901205822[119895]oplus1198871[3119898+119895]13119898+119895 1205901199092[119895]oplus1198872[119898+119895]2119898+119895 and 1205901205822[119895]oplus1198872[3119898+119895]23119898+119895 to CCS6 CCS (119886) For 119895 isin [4119898] compute119883[119895] = Chk119888119903119904(119862119900[119895]1119895 120590119900[119895]1119895 )1198831015840[119895] = Chk119888119903119904(119862119900[119895]2119895 120590119900[119895]2119895 ) for the appropriate 119900[119895] If any call toChk returns perp then abort Similarly CCS knows the values 1198871[2119898 + 1 sdot sdot sdot 4119898] and 1198872[2119898 + 1 sdot sdot sdot 4119898] and aborts if CSS1 or CSS2can not open the corresponding commitments of 1205821 and 1205822 1198621205821[119895]oplus1198871[2119898+119895]12119898+119895 1198621205822[119895]oplus1198871[3119898+119895]13119898+119895 1198621205821[119895]oplus1198872[2119898+119895]22119898+119895 and 1198621205822[119895]oplus1198872[3119898+119895]23119898+119895 (119887) Run 119884 larr997888 Ev(119865119883) and 1198841015840 larr997888 Ev(1198651198831015840) then broadcasts 119884 and 1198841015840 to CSS1 and CSS27 CSS1 and CSS2 Compute (1199091 + 1199092)(1198861 + 1198862) = 119863119890(119889 119884)119863119890(119889 1198841015840)

Algorithm 5 Secure circuit protocol (SCP)

for a garbled input and garbled circuit Further Chk is analgorithm that can verify commitments

3 Outsourcing Privacy-Preserving ID3Decision Tree Algorithm in Malicious Model

In this section we present our secure outsourcing ID3decision tree in cloud computing using the homomorphicencryption scheme and subprotocols proposed in Section 2as building blocks

31 Main Concept The aim is to privately compute ID3 overencrypted databases and the key is to find privately theattribute A for which Gain is maximum From the abovedescription the key value which needs to be calculated withother parties is Entropy(Sai)

Since all the data was encrypted and sent to the cloud thecloud server can count the number of |S(ck)|t |S|it using theSETprotocol described in Section 2Now (3) can be executedas (x1+x2)(a1+a2)log2(x1+x2)(a1+a2) and the calculationof the logarithmic operation can be performed in CSS Thevalue to be calculated is the value of c1 = (x1 + x2)(a1 + a2)which can be easily determined using our SCP protocol asexplained in Section 2 Then all the parties can calculate thevalue of Entropy(S) independently

32 System Model The system model is shown in Figure 1which includes two data owners and cloud servers (cloudstorage server CSS1 CSS2 and cloud computing serverCCS) Each data owner owns a private data set that isencrypted and outsourced to cloud server storage Data

owners can request cloud server to process ID3 data onencrypted data At the same time CSS and CCS serversparticipate in supporting the outsourcing privacy protec-tion ID3 data mining algorithm steps after the imple-mentation of the algorithm the final results are sent tothe data owner Assuming that the data owner and theCSS server are semihonest participants CCS is a maliciousparticipant

33 Details of the Proposed Algorithm Our securely out-sourcing ID3 decision tree (SOID3) algorithm is detailed asfollows(1) P1 and P2 run KeyGen(120582) to generate the secret key

SKi i = 1 2 and a public parameter p of Lirsquos homomorphicencryption scheme Further each party shares p with theother party and the cloud but shares SKi only with itself(2) Each party uses its key SKi to encrypt every attribute

value of its database and then outsources the encrypteddatabase to the CSS (CSS1 and CSS2)(3)The CSS1 and CSS2 use the SET protocol to calculate

the value of |Saj |i and |Saj(ck)|i for each attribute with eachparty Pi(4)Eachparty generates its Paillier public and private keys

(pki ski) i = 1 2 and sends the public keys to the CSS1 andCSS2(5) CCS CSS1 and CSS2 jointly use the SCP protocol to

compute (x1 + x2)(a1 + a2) Here CSS1 has (x1 a1) and CSS2has (x2 a2)(6) Each party decrypts the received information cal-

culates it with the logarithmic operation of (x1 + x2a1 +

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 7: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

Wireless Communications and Mobile Computing 7

Sensors

Sensors

Database Database

Result Result

P1 P2

Cloud Storage Servers

Cloud Computing Servers

Figure 1 System model under consideration

a2)log2(x1 + x2a1 + a2) and then encrypts it with its publickey Then it sends it back to the cloud(7)After getting the result CSS1 and CSS2 use the SMIN2

protocol to select the ciphertext data with theminimum valueand then further select the attribute label with the maximuminformation gain and return it to each participant(8) The participants divide the data sets and build tree

nodes Then go to Step (3) until termination

4 Security Analysis

In this section we prove that the secure outsourcing ID3decision tree (SOID3) algorithm can offer protection againstthe malicious cloud server

Theorem 1 The SOID3 algorithm can achieve privacy for eachparty and the semihonest cloud storage server

Proof We mainly consider the security model under thenoncollusive semihonest model and the semihonest cloudserver Suppose there are two parties P1 and P2 and cloudstorage server CSS

Let P = (P1 P2CSS) be the participants of all protocolsConsider three types of attackers ( AP1 AP2 and ACSS) thatcan invade P1 P2 and CSS In the real model P1 and P2 havedata sets Dx and Dy respectively and CSS has encrypted datasets Enc(119863119909) and Enc(119863119910)MakeH sub P a collection of honestparticipants For all Pi120598H outPi indicates the output of Pi IfPi is invaded outPi represents all views of participant Pi inrunning protocol Π

For each Plowast isin P the attacker A = (AP1 AP2 ACSS) viewin the runtime protocol Π can be defined as

REALPlowast

ΠAH (DxDy) = outPi Pi120598H cup outPlowast (10)

In the ideal model there exists an ideal model F forfunction f and all participants can interact with the modelF That is Challenger DPa and participant Pi can send data xand y to F If Dx or Dy is perp F returns perp Finally F can return

f(DxDy) to challenger DPa As mentioned earlier H sub P isa collection of honest participants For each participant Pi120598Hin the collection return the outPi as F output to Pi If Pi isintruded on by a semihonest attacker outPi is still consistentwith the output of Pi in previous realistic models

For all Plowast isin P in the ideal model in the presence ofindependent simulators Sim = (SimP1 SimP2 SimCSS) the Plowastview is

IDEALPlowast

FSimH (DxDy) = outPi Pi120598H cup outPlowast (11)

Therefore it is considered that the protocolΠ is secure inthe presence of noncolluded semitruthful attackersDefinition 2 Let f be a deterministic functionality amongparties in P Let H sub P be the subset of honest partiesin P We say that Π securely realizes f if there exists a setSim = SimP1 SimP2 SimS of PPT transformations (whereSimDa

= SimP1(AP1 ) and so on) such that for all semihonestPPT adversaries A = AP1 AP2 AS for all inputs DxDy andauxiliary inputs z and for all parties P isin P the followingholds

REALPlowast

ΠAHz (120582 x y)120582isinN equiv

IDEALPlowast

FSimHz (120582 x y)120582isinN (12)

where equiv denotes computational indistinguishability

Theorem 2 The SOID3 algorithm is secure with the semi-honest cloud storage server and the malicious cloud computingserver

Proof First consider the case where CSS1 or CSS2 is cor-rupted It is necessary to prove that in the SCP protocol theideal model and the realistic model are not distinguishableThat is in the following interactions it is impossible to dis-tinguish between the various types of interaction informationand outputs of the participants in the ideal model and the realmodel

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 8: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

8 Wireless Communications and Mobile Computing

(1) In the realmodel assume that there is an emulator thatcan simulate various behaviors of a semihonest participantCSS1 (or CSS2) and receive inputs (x1 a2) and (x2 a2) fromthe execution environment of the protocol At the same timethe simulator can simulate the function Ff which sends allinputs (x1 a1) and (x2 a2) to the simulated Ff Since thesimulator does not do anything computed by Ff there is nodifference between the real Ff and the simulated Ff from theexecution environment point of view(2) Because in Step 2 CSS1 and CSS2 uniformly select the

seed r of Pseudo-Random Function (PRF) the PRF securityshows that the real model in Step 2 is indistinguishable fromthe ideal model(3) In Step 3 we modify the simulator which knows in

advance what promises will be opened when the simulatorgenerates commitment C First the simulator selects therandom numbers o1 o2 that can be marked which promisesto be opened and calculates the values of b1 = o1 oplus x1 x2b2 = o2 oplus a1 a2 At this point the simulator has obtainedthe values of x1 x2 and a1 a2Then the simulator can submitthe markings that promise not to be opened In this processdue to the concealment of commitment the realistic modeland the ideal model are equally indistinguishable(4) In Step 6a the simulated CSS1 and CSS2 stop exe-

cuting when De(D Y) = perp Change the emulator to makeY = Ev(FX) By obfuscating the authenticity of the circuitCCS has only negligible probability to obtain Y = Ev(FX) inDe(d Y) = perp Therefore in this step the realistic model andthe ideal model are equally indistinguishable(5) In Step 6b the correctness of the obfuscation circuit

guarantees that both CSS1 and CSS2 of the analog can beoutput Therefore if there is no pause in 6a we can modifythe simulator to an analog obfuscation circuit that generates(FX d) We can simulate the output of CSS1 and CSS2 bysimulating the instructions of Ff According to the security ofthe confusing circuit the real model is also indistinguishablefrom the ideal model in this step

Therefore in this protocol the execution environmentcan not distinguish between the realistic model and the idealmodel And the protocol is secure when CCS is a maliciousparticipant

5 Performance Analysis

In this paper we consider that CSS has a strong calculationability and we ignore its computation time Each data ownerdoes not need to store the ciphertext but can just use thepublic key to encrypt the message and the private key todecrypt the ciphertext

In each iteration first each data owner will execute theSBD protocol and SMIN2 protocol with the cloud There aretwo interactions in the SBD protocol and 2k interactions inthe SMIN2 protocolThen CSS1 CSS2 and CCS will execute6 interactions in the SCP protocol Finally each data ownerwill execute 1 interaction when it goes to the new iterationWe assume that t is the iteration time so the communicationtraffic is at most O(k lowast t)

In this paper a secure average computing protocolbased on SCP is implemented The server selected in the

0

50

100

150

Tim

e (s)

C1C2Client

100 200 300 400 500 6000Number of records in data set

Figure 2 Performance measurements for our SOID3 with 2 partic-ipants

experimental environment is CPU Intel (R) Xeon (R) CPUE5-2620 v3240GHzlowast2 memory 32G operating systemUbuntu 16044 LTS version In the experiment AES-128is chosen as the basic encryption method of the confusioncircuit and the open source code of JustGarble is changedand the commitment protocol is implemented based on SHA-256 Finally the average values obtained from experimentsare as follows

In our secure outsourcing ID3 decision tree (SOID3)algorithm experiment two participants were tested withdifferent numbers of records The experimental results areshown in Figure 2

From Figure 2 since the client is only responsible forencrypting uploaded data the time consumption is very lowIn the cloud CCS and CSS servers need to run SCP protocolresulting in a lot of time consumption (Table 1) The mainreason is that a large number of bit commitment processingis needed in the obfuscation circuit and the performanceimprovement will be focused on this issue in the follow-upwork

6 Conclusion

In this paper we proposed a secure outsourcing ID3 decisiontree algorithm for two parties of the malicious model Ouralgorithm can preserve the privacy of the usersrsquo data as wellas that of the data mining scheme for the cloud servers Theparties can get only the result trees and have no knowledgeabout the data mining scheme Moreover the cloud serverscannot get any private information about the parties Insummary our protocol offers protection against maliciouscloud servers

In the future we intend to extend our algorithm tovertical and arbitrary partitioning in the malicious modelIn addition we plan to extend our algorithm to a generalmultiparty privacy-preserving framework suitable for otheruseful schemes such as random decision tree Bayes SVM

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 9: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

Wireless Communications and Mobile Computing 9

Table 1 Time cost of the SCP protocol

AND XOR OR Input size(bits)

Output size(bits)

CSS1CSS2(ms)

119862119862119878(ms)

5090 4034 2016 129 65 26 47

and other data mining methods and can be extended for usein the wireless sensor-networks [36 37]

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work is supported by National Key Research andDevelopment Program of China (no 2017YFB0803002)Basic Research Project of Shenzhen China (noJCYJ20160318094015947) National Natural ScienceFoundation of China (nos 61872109 and 61771222)and Key Technology Program of Shenzhen China (noJSGG20160427185010977)

References

[1] Y Lindell and B Pinkas ldquoPrivacy preserving data miningrdquoJournal of Cryptology vol 15 no 3 pp 177ndash206 2002

[2] R Agrawal and R Srikant ldquoPrivacy-preserving data miningrdquoACM SIGMOD Record vol 29 no 2 pp 439ndash450 2000

[3] D Liu E Bertino and X Yi ldquoPrivacy of outsourced k-means clusteringrdquo in Proceedings of the 9th ACM Symposiumon Information Computer and Communications Security ASIACCS 2014 pp 123ndash133 June 2014

[4] P Li J Li Z Huang C-Z Gao W-B Chen and K ChenldquoPrivacy-preserving outsourced classification in cloud comput-ingrdquo Cluster Computing pp 1ndash10 2017

[5] F Emekci O D Sahin D Agrawal and A El Abbadi ldquoPrivacypreserving decision tree learning over multiple partiesrdquoData ampKnowledge Engineering vol 63 no 2 pp 348ndash361 2007

[6] P Lory ldquoEnhancing the Efficiency in Privacy Preserving Learn-ing of Decision Trees in Partitioned Databasesrdquo in Privacy inStatistical Databases vol 7556 of Lecture Notes in ComputerScience pp 322ndash335 Springer Berlin Heidelberg Berlin Hei-delberg 2012

[7] J Vaidya and C Clifton ldquoPrivacy-preserving K-means cluster-ing over vertically partitioned datardquo in Proceedings of the 9thACM SIGKDD International Conference on Knowledge Discov-ery and Data Mining (KDD rsquo03) pp 206ndash215 Washington DC USA August 2003

[8] J Zhan S Matwin and LW Chang ldquoPrivacy-PreservingNaiveBayesian Classification over Horizontally Partitioned Datardquoin Data Mining Foundations and Practice vol 118 of Studiesin Computational Intelligence pp 529ndash538 Springer BerlinHeidelberg Berlin Heidelberg 2008

[9] S Han andW K Ng ldquoMulti-party privacy-preserving decisiontrees for arbitrarily partitioned datardquo International Journal ofIntelligent Control and Systems vol 12 no 4 2007

[10] T Li J Li Z Liu P Li and C Jia ldquoDifferentially private NaiveBayes learning overmultiple data sourcesrdquo Information Sciencesvol 444 pp 89ndash104 2018

[11] CGaoQ Cheng PHeW Susilo and J Li ldquoPrivacy-preservingNaive Bayes classifiers secure against the substitution-then-comparison attackrdquo Information Sciences vol 444 pp 72ndash882018

[12] J Shen T Zhou X Chen J Li and W Susilo ldquoAnonymousand Traceable Group Data Sharing in Cloud Computingrdquo IEEETransactions on Information Forensics and Security vol 13 no4 pp 912ndash925 2018

[13] Z Cai H Yan P Li Z-A Huang and C Gao ldquoTowards secureand flexible EHR sharing in mobile health cloud under staticassumptionsrdquo Cluster Computing vol 20 no 3 pp 2415ndash24222017

[14] J Li Y K Li X Chen P P C Lee and W Lou ldquoA HybridCloud Approach for Secure Authorized Deduplicationrdquo IEEETransactions on Parallel and Distributed Systems vol 26 no 5pp 1206ndash1216 2015

[15] Z Liu Y Huang J Li X Cheng and C Shen ldquoDivORAMTowards a practical oblivious RAM with variable block sizerdquoInformation Sciences vol 447 pp 1ndash11 2018

[16] X Chen J Li J Weng J Ma and W Lou ldquoVerifiable computa-tion over large database with incremental updatesrdquo Institute ofElectrical and Electronics Engineers Transactions on Computersvol 65 no 10 pp 3184ndash3195 2016

[17] J Shen C Wang T Li X Chen X Huang and Z-H ZhanldquoSecure data uploading scheme for a smart home systemrdquoInformation Sciences vol 453 pp 186ndash197 2018

[18] S ChenGWangG Yan andDXie ldquoMulti-dimensional fuzzytrust evaluation for mobile social networks based on dynamiccommunity structuresrdquoConcurrencyand Computation Practiceand Experience vol 29 no 7 p e3901 2017

[19] E Luo Q Liu J H Abawajy andGWang ldquoPrivacy-preservingmulti-hop profile-matching protocol for proximity mobilesocial networksrdquo Future Generation Computer Systems vol 68pp 222ndash233 2017

[20] X F Chen J Li J Ma Q Tang and W Lou ldquoNew algorithmsfor secure outsourcing of modular exponentiationsrdquo IEEETransactions on Parallel and Distributed Systems vol 25 no 9pp 2386ndash2396 2014

[21] R Bost R A Popa S Tu and S Goldwasser ldquoMachineLearning Classification over Encrypted Datardquo in Proceedings ofthe Network and Distributed System Security Symposium SanDiego CA

[22] A Peter E Tews and S Katzenbeisser ldquoEfficiently outsourcingmultiparty computation under multiple keysrdquo IEEE Transac-tions on Information Forensics and Security vol 8 no 12 pp2046ndash2058 2013

[23] G Jagannathan K Pillaipakkamnatt andRNWright ldquoA Prac-tical Differentially Private Random Decision Tree Classifierrdquo in

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 10: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

10 Wireless Communications and Mobile Computing

Proceedings of the 2009 IEEE International Conference on DataMining Workshops (ICDMW) pp 114ndash121 Miami FL USADecember 2009

[24] B Gilburd A Schuster and R Wolff ldquoPrivacy-preserving datamining on data grids in the presence of malicious participantsrdquoin Proceedings of the Proceedings 13th IEEE International Sym-posium on High performance Distributed Computing 2004 pp225ndash234 Honolulu HI USA

[25] D Shah and S Zhong ldquoTwo methods for privacy preservingdata mining with malicious participantsrdquo Information Sciencesvol 177 no 23 pp 5468ndash5483 2007

[26] P Li J Li Z Huang et al ldquoMulti-key privacy-preserving deeplearning in cloud computingrdquo Future Generation ComputerSystems vol 74 pp 76ndash85 2017

[27] Q Lin H Yan Z Huang W Chen J Shen and Y TangldquoAn ID-based linearly homomorphic signature scheme and itsapplication in blockchainrdquo IEEE Access vol PP no 99 pp 1-12018

[28] J Xu L Wei Y Zhang A Wang F Zhou and C Gao ldquoDy-namic Fully Homomorphic encryption-based Merkle Tree forlightweight streaming authenticated data structuresrdquo Journal ofNetwork and Computer Applications vol 107 pp 113ndash124 2018

[29] J Shen Z Gui S Ji J Shen H Tan and Y Tang ldquoCloud-aided lightweight certificateless authentication protocol withanonymity for wireless body area networksrdquo Journal of Networkand Computer Applications vol 106 pp 117ndash123 2018

[30] X Zhang Y Tan C Liang Y Li and J Li ldquoA Covert ChannelOver VoLTE via Adjusting Silence Periodsrdquo IEEE Access vol 6pp 9292ndash9302 2018

[31] Y Wang T Li H Qin et al ldquoA brief survey on secure multi-party computing in the presence of rational partiesrdquo Journal ofAmbient Intelligence and Humanized Computing vol 6 no 6pp 807ndash824 2015

[32] P Paillier ldquoPublic-key cryptosystems based on compositedegree residuosity classesrdquo in Advances in CryptologymdashEURO-CRYPT rsquo99 vol 1592 pp 223ndash238 Springer 1999

[33] L Li R Lu K-K R Choo A Datta and J Shao ldquoPrivacy-Preserving-Outsourced Association Rule Mining on Verti-cally Partitioned Databasesrdquo IEEE Transactions on InformationForensics and Security vol 11 no 8 pp 1547ndash1861 2016

[34] P Mohassel M Rosulek and Y Zhang ldquoFast and SecureThree-party Computationrdquo in Proceedings of the the 22ndACMSIGSACConference pp 591ndash602Denver Colorado USAOctober 2015

[35] M Naor ldquoBit commitment using pseudorandomnessrdquo Journalof Cryptology vol 4 no 2 pp 151ndash158 1991

[36] H Cheng Z Su N Xiong and Y Xiao ldquoEnergy-efficientnode scheduling algorithms for wireless sensor networks usingMarkov Random Field modelrdquo Information Sciences vol 329pp 461ndash477 2016

[37] H Cheng N Xiong A V Vasilakos L Tianruo Yang G Chenand X Zhuang ldquoNodes organization for channel assignmentwith topology preservation in multi-radio wireless mesh net-worksrdquo Ad Hoc Networks vol 10 no 5 pp 760ndash773 2012

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 11: ResearchArticle - Hindawi Publishing Corporationdownloads.hindawi.com/journals/wcmc/2018/2385150.pdf · ResearchArticle Securely Outsourcing ID3 Decision Tree in Cloud Computing YeLi,1

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom