Engineering Privacy-Preserving Machine Learning Protocols


Thomas Schneider

Cryptography and Privacy Engineering Group (ENCRYPTO), TU Darmstadt, [email protected]

ABSTRACT
Privacy-preserving machine learning (PPML) protocols make it possible to privately evaluate or even train machine learning (ML) models on sensitive data while simultaneously protecting the data and the model. So far, most of these protocols were built and optimized by hand, which requires expert knowledge in cryptography as well as a thorough understanding of the ML models. Moreover, the design space is very large: there are many technologies, they can even be combined, and each combination comes with its own trade-offs. Examples of the underlying cryptographic building blocks include homomorphic encryption (HE), where computation typically is the bottleneck, and secure multi-party computation (MPC) protocols, which rely mostly on symmetric-key cryptography and where communication is often the bottleneck.
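
To make this trade-off concrete, the following minimal Python sketch contrasts rough cost estimates for evaluating one linear layer under HE (dominated by local ciphertext computation) and under a secret-sharing-based MPC protocol (dominated by network traffic and round trips). All constants, per-ciphertext-operation time, bytes per multiplication, bandwidth, and round-trip time, are assumptions chosen purely for illustration and are not measurements of any of the cited systems.

    # Illustrative cost model: HE is compute-bound, MPC is communication-bound.
    # All constants are assumed for illustration only, not taken from the cited works.

    def he_cost_seconds(num_mults, time_per_ct_mult=1e-4):
        """Homomorphic evaluation: local ciphertext operations dominate the cost."""
        return num_mults * time_per_ct_mult

    def mpc_cost_seconds(num_mults, rounds, bytes_per_mult=32,
                         bandwidth_bytes_per_s=12.5e6, rtt_s=0.05):
        """Secret-sharing-based MPC: network traffic and round trips dominate the cost."""
        traffic_time = num_mults * bytes_per_mult / bandwidth_bytes_per_s
        latency_time = rounds * rtt_s
        return traffic_time + latency_time

    mults = 1_000_000  # multiplications in one fully connected layer (assumed)
    print(f"HE  estimate: {he_cost_seconds(mults):.2f} s  (dominated by computation)")
    print(f"MPC estimate: {mpc_cost_seconds(mults, rounds=10):.2f} s  (dominated by communication)")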

In this keynote, I will describe our research towards engineering practical PPML protocols that protect models and data. First of all, there is no point in designing PPML protocols for overly simple models such as Support Vector Machines (SVMs) or Support Vector Regression Machines (SVRs), because they can be stolen easily [10] and hence do not benefit from protection. Complex models can be protected and evaluated in real time using Trusted Execution Environments (TEEs), which we demonstrated for speech recognition using Intel SGX [5] and for keyword recognition using ARM TrustZone [3] as the respective commercial TEE technologies. Our goal is to build tools that let non-experts in cryptography automatically generate highly optimized mixed PPML protocols from a high-level specification in an ML framework like TensorFlow. Towards this, we have built tools that automatically generate optimized mixed protocols combining HE and different MPC protocols [6–8]. Such mixed protocols can, for example, be used for the efficient privacy-preserving evaluation of decision trees [1, 2, 9, 13] and neural networks [2, 11, 12]. The first PPML protocols for these ML classifiers were proposed long before the current hype around PPML started [1, 2, 12]. We already have first results for compiling high-level ML specifications via our tools into mixed protocols for neural networks (from TensorFlow) [4] and sum-product networks (from SPFlow) [14], and I will conclude with major open challenges.
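
As a rough illustration of what such a mixed-protocol assignment involves (a simplified sketch under assumed costs, not the actual algorithms of ABY [7], HyCC [6], or MP2ML [4]), the following Python fragment greedily maps each operation of a tiny neural-network graph either to an arithmetic backend (HE or arithmetic sharing, efficient for linear layers) or to a Boolean/Yao backend (efficient for comparisons such as ReLU and argmax), charging a fixed cost whenever the sharing type is converted.

    # Simplified sketch of assigning the operations of a small NN graph to backends in a
    # mixed protocol. The cost table, conversion cost, and greedy strategy are
    # illustrative assumptions, not the algorithms used by ABY, HyCC, or MP2ML.

    COSTS = {
        "matmul": {"arith": 1, "bool": 50},   # linear algebra: cheap with HE / arithmetic sharing
        "add":    {"arith": 1, "bool": 5},
        "relu":   {"arith": None, "bool": 2}, # comparisons: cheap in Boolean / Yao circuits
        "argmax": {"arith": None, "bool": 3},
    }
    CONVERSION_COST = 4  # switching the sharing type between consecutive operations

    def assign_backends(ops):
        """Greedily pick the cheaper backend per operation, charging for conversions."""
        plan, total, prev = [], 0, None
        for op in ops:
            candidates = {b: c for b, c in COSTS[op].items() if c is not None}
            backend = min(candidates, key=candidates.get)
            total += candidates[backend]
            if prev is not None and backend != prev:
                total += CONVERSION_COST
            plan.append((op, backend))
            prev = backend
        return plan, total

    plan, total = assign_backends(["matmul", "add", "relu", "matmul", "add", "argmax"])
    print(plan, "estimated cost:", total)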

CCS CONCEPTS
• Security and privacy → Privacy-preserving protocols; • Computing methodologies → Classification and regression trees; Neural networks.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
PPMLP'20, November 9, 2020, Virtual Event, USA
© 2020 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-8088-1/20/11.
https://doi.org/10.1145/3411501.3418607

KEYWORDS
privacy-preserving machine learning; tools for secure computation; optimization

ACM Reference Format:
Thomas Schneider. 2020. Engineering Privacy-Preserving Machine Learning Protocols. In 2020 Workshop on Privacy-Preserving Machine Learning in Practice (PPMLP'20), November 9, 2020, Virtual Event, USA. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3411501.3418607

BIOGRAPHY
Thomas Schneider is a full professor in the Department of Computer Science at the Technical University of Darmstadt. Before that, he was an independent research group leader at TU Darmstadt (2012-2018), earned a PhD in IT Security from Ruhr-University Bochum (2008-2011), and wrote his Master's thesis [13] during a research internship at Alcatel-Lucent Bell Labs, NJ, USA (2007).

His main expertise is in cryptography and privacy engineering, for which he was awarded an ERC Starting Grant in 2019, the highest-profile excellence project in Europe for researchers at his career stage. He heads the Cryptography and Privacy Engineering Group (ENCRYPTO), whose mission is to demonstrate that privacy can be efficiently protected in real-world applications. For this, his group combines applied cryptography and algorithm engineering to develop protocols and tools for protecting sensitive data and algorithms. This keynote summarizes some of the research works that are most relevant to privacy-preserving machine learning (see https://encrypto.de/topics/PPML), for which the ENCRYPTO group received a research award from Intel Corporation in 2019.


ACKNOWLEDGMENTS
The research described in this keynote has received co-funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 850990 PSOTI), from the Deutsche Forschungsgemeinschaft (DFG) under SFB 1119 CROSSING/236615297 and GRK 2050 Privacy & Trust/251805230, and from the German Federal Ministry of Education and Research and the Hessen State Ministry for Higher Education, Research and the Arts within ATHENE.

REFERENCES
[1] Mauro Barni, Pierluigi Failla, Vladimir Kolesnikov, Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and Thomas Schneider. 2009. Secure Evaluation of Private Linear Branching Programs with Medical Applications. In 14. European Symposium on Research in Computer Security (ESORICS'09) (LNCS, Vol. 5789). Springer.
[2] Mauro Barni, Pierluigi Failla, Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and Thomas Schneider. 2011. Privacy-Preserving ECG Classification with Branching Programs and Neural Networks. IEEE Transactions on Information Forensics and Security (TIFS) 6, 2 (2011).
[3] Sebastian P. Bayerl, Tommaso Frassetto, Patrick Jauernig, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, Emmanuel Stapf, and Christian Weinert. 2020. Offline Model Guard: Secure and Private ML on Mobile Devices. In 23. Design, Automation & Test in Europe Conference & Exhibition (DATE'20). IEEE.
[4] Fabian Boemer, Rosario Cammarota, Daniel Demmler, Thomas Schneider, and Hossein Yalame. 2020. MP2ML: A Mixed-Protocol Machine Learning Framework for Private Inference. In 15. International Conference on Availability, Reliability and Security (ARES'20). ACM. Code: https://ngra.ph/he.
[5] Ferdinand Brasser, Tommaso Frassetto, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, and Christian Weinert. 2018. VoiceGuard: Secure and Private Speech Processing. In 19. Annual Conference of the International Speech Communication Association (INTERSPEECH'18). International Speech Communication Association (ISCA).
[6] Niklas Büscher, Daniel Demmler, Stefan Katzenbeisser, David Kretzmer, and Thomas Schneider. 2018. HyCC: Compilation of Hybrid Protocols for Practical Secure Computation. In 25. ACM Conference on Computer and Communications Security (CCS'18). ACM. Code: https://gitlab.com/securityengineering/HyCC.
[7] Daniel Demmler, Thomas Schneider, and Michael Zohner. 2015. ABY – A Framework for Efficient Mixed-Protocol Secure Two-Party Computation. In 22. Annual Network and Distributed System Security Symposium (NDSS'15). Internet Society. Code: https://encrypto.de/code/ABY.
[8] Wilko Henecka, Stefan Kögl, Ahmad-Reza Sadeghi, Thomas Schneider, and Immo Wehrenberg. 2010. TASTY: Tool for Automating Secure Two-partY computations. In 17. ACM Conference on Computer and Communications Security (CCS'10). ACM. Code: https://encrypto.de/code/TASTY.
[9] Ágnes Kiss, Masoud Naderpour, Jian Liu, N. Asokan, and Thomas Schneider. 2019. SoK: Modular and Efficient Private Decision Tree Evaluation. Proceedings on Privacy Enhancing Technologies (PoPETs) 2019, 2 (2019). Code: https://encrypto.de/code/PDTE.
[10] Robert Nikolai Reith, Thomas Schneider, and Oleksandr Tkachenko. 2019. Efficiently Stealing your Machine Learning Models. In 18. Workshop on Privacy in the Electronic Society (WPES'19). ACM.
[11] M. Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko, Ebrahim M. Songhori, Thomas Schneider, and Farinaz Koushanfar. 2018. Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications. In 13. ACM Asia Conference on Information, Computer and Communications Security (ASIACCS'18). ACM.
[12] Ahmad-Reza Sadeghi and Thomas Schneider. 2008. Generalized Universal Circuits for Secure Evaluation of Private Functions with Application to Data Classification. In 11. International Conference on Information Security and Cryptology (ICISC'08) (LNCS, Vol. 5461). Springer.
[13] Thomas Schneider. 2008. Practical Secure Function Evaluation. Master's thesis. Friedrich-Alexander University Erlangen-Nürnberg, Germany. https://encrypto.de/papers/S08Thesis.pdf
[14] Amos Treiber, Alejandro Molina, Christian Weinert, Thomas Schneider, and Kristian Kersting. 2020. CryptoSPN: Privacy-preserving Sum-Product Network Inference. In 24. European Conference on Artificial Intelligence (ECAI'20). IOS Press.
