Differential Privacy Preservation for Deep Auto-Encoders


30th AAAI Conference on Artificial Intelligence, Phoenix, Arizona - 2016

Differential Privacy Preservation for Deep Auto-Encoders: an Application of Human Behavior Prediction

NhatHai Phan (1), Yue Wang (2), Xintao Wu (3), and Dejing Dou (1)

(1) University of Oregon, (2) University of North Carolina at Charlotte, (3) University of Arkansas

{haiphan,dou}@cs.uoregon.edu, [email protected], [email protected]


Outline

• Deep Learning and Deep Auto-Encoders
• Differential Privacy Preservation for Deep Auto-Encoders
  – Deep Private Auto-encoders (dPA)
• Application
  – YesiWell Health Social Network
  – Human Behavior Prediction
• Conclusions and Future Work

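Deep Learning

[Figure: feature hierarchy learned by a deep network, from pixels through a 1st layer ("edges") and a 2nd layer ("object parts") to a 3rd layer ("objects") [Andrew Ng]. The network diagram shows a visible layer v, hidden layers h_1, h_2, h_3, an output y, and weights W_1 to W_4.]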

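Deep Auto-Encoders

• Data reconstruction

$$R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]$$

• Softmax layer

$$C(Y_T, \theta) = -\sum_{i=1}^{|Y_T|} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

[Figure: an auto-encoder (visible layer v, hidden layer h_1, weights W_0 and W_0^T, reconstruction \tilde{v}) next to a deep auto-encoder that stacks hidden layers h_1, h_2, ..., h_(k) with weights W_0, W_1, ..., W_(k) and their transposes, topped by an output y.]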
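As a rough illustration of the data reconstruction error R(D, W), the sketch below evaluates the cross-entropy for a single auto-encoder with tied weights. It assumes sigmoid activations and omits bias terms; the names `reconstruct`, `reconstruction_error`, and the toy data are hypothetical and not taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reconstruct(v, W):
    """Encode v with W, then decode with the transpose of W (tied weights, no biases)."""
    n_hidden = len(W[0])
    h = [sigmoid(sum(v[j] * W[j][k] for j in range(len(v)))) for k in range(n_hidden)]
    return [sigmoid(sum(h[k] * W[j][k] for k in range(n_hidden))) for j in range(len(v))]

def reconstruction_error(D, W):
    """Cross-entropy error R(D, W), summed over all tuples and input units."""
    total = 0.0
    for v in D:
        v_tilde = reconstruct(v, W)
        for v_ij, vt_ij in zip(v, v_tilde):
            total -= v_ij * math.log(vt_ij) + (1.0 - v_ij) * math.log(1.0 - vt_ij)
    return total

# Toy data (assumption): |D| = 2 tuples, d = 3 binary input units, 2 hidden units.
D = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
W_zero = [[0.0, 0.0] for _ in range(3)]
error_at_zero = reconstruction_error(D, W_zero)
```

With all weights at zero every reconstructed unit equals 0.5, so each of the six terms contributes log 2; training would lower the error from that starting point.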

Motivation

• Deep learning
  – Social media, social network analysis, bioinformatics, medicine and healthcare.
• Privacy issues?
  – Users' personal and highly sensitive data, such as clinical records, user profiles, photos, etc.
• Differential privacy
  – Deep Private Auto-Encoders

ε-Differential Privacy Definition

• The goal of a privacy-preserving statistical database is to
  – learn properties of the population as a whole,
  – while protecting the privacy of the individuals in the sample
• Differential privacy (preserving algorithm)
  – maximize the accuracy of queries from statistical databases
  – minimize the chances of identifying its records
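For reference, the standard formal statement of ε-differential privacy, which the bullets above summarize informally, can be written as follows. The symbols A (a randomized algorithm), D and D' (neighboring databases differing in at most one record), and S (any set of outputs) are notation introduced here and do not appear on the slide.

```latex
\Pr\left[ A(D) \in S \right] \;\le\; e^{\epsilon} \, \Pr\left[ A(D') \in S \right]
\quad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(A)
```

A smaller ε forces the two output distributions closer together, which gives stronger privacy at the cost of noisier answers.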

Challenges

• Unprecedented work
• A non-trivial task
  – R(D,W) is complicated
  – The algorithm must be efficient on large datasets
• Guarantee the potential to use unlabeled data in a dPA model

[Figure: performance versus amount of data, comparing most learning algorithms with new AI methods (deep learning) [Andrew Ng, 2015].]

Deep Private Auto-Encoders

• Functional Mechanism
  – injecting Laplace noise Lap(Δ/ε) into the coefficients of polynomial functions

[Figure: deep private auto-encoder, stacking layers v, h_1, h_2, ... with weights W_0, W_1, ..., W_(k) and their transposes, topped by an output y.]
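The idea of perturbing polynomial coefficients can be sketched on a much simpler objective than the auto-encoder. The example below uses a one-dimensional least-squares objective, which is already a polynomial in w, as a stand-in. The function names, the toy data, and the sensitivity value `8.0` are illustrative assumptions, not values derived in the slides.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb_coefficients(coeffs, sensitivity, epsilon, rng):
    """Add Lap(sensitivity / epsilon) noise to every polynomial coefficient."""
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in coeffs]

def quadratic_objective_coeffs(xs, ys):
    """Coefficients (c0, c1, c2) of sum_i (y_i - w * x_i)^2 as a polynomial in w."""
    c0 = sum(y * y for y in ys)
    c1 = -2.0 * sum(x * y for x, y in zip(xs, ys))
    c2 = sum(x * x for x in xs)
    return [c0, c1, c2]

def minimize_quadratic(coeffs):
    """Minimizer of c0 + c1*w + c2*w^2; only defined when c2 > 0."""
    _, c1, c2 = coeffs
    if c2 <= 0:
        raise ValueError("noisy objective is not convex; resample or regularize")
    return -c1 / (2.0 * c2)

rng = random.Random(0)
xs = [0.1 * i for i in range(1, 11)]   # toy inputs in [0, 1] (assumption)
ys = [0.5 * x for x in xs]             # toy targets with true slope 0.5
coeffs = quadratic_objective_coeffs(xs, ys)
# sensitivity=8.0 is a placeholder; a real analysis must derive it for the objective.
noisy = perturb_coefficients(coeffs, sensitivity=8.0, epsilon=1.0, rng=rng)
w_exact = minimize_quadratic(coeffs)
```

The private model would be obtained by minimizing the noisy objective instead of the exact one. Because the noise is added once to the coefficients, its scale does not depend on how many optimization steps follow.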

Polynomial Approximation

Taylor Expansion [Arfken 1985]

$$R(D, W) = -\sum_{i=1}^{|D|} \sum_{j=1}^{d} \left[ v_{ij} \log \tilde{v}_{ij} + (1 - v_{ij}) \log(1 - \tilde{v}_{ij}) \right]$$

$$\hat{R}(D, W) = \sum_{i=1}^{|D|} \sum_{j=1}^{d} \sum_{l=1}^{2} \left( f_{lj}^{(0)}(0) + f_{lj}^{(1)}(0) \, (W_j h_i) + \frac{f_{lj}^{(2)}(0)}{2!} \, (W_j h_i)^2 \right)$$

• Apply Functional Mechanism to inject Laplace noise Lap(Δ/ε)

Taylor expansion error?

Arfken, G. 1985. Mathematical Methods for Physicists (Third Edition). Academic Press.
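To get a feel for the size of the Taylor expansion error, the sketch below compares log(1 + e^z) with its second-order expansion around z = 0. It assumes a sigmoid reconstruction, under which the two cross-entropy terms reduce to log(1 + e^{-z}) and log(1 + e^{z}) with z = W_j h_i. The function names are hypothetical.

```python
import math

def softplus(z):
    """log(1 + e^z), the form a cross-entropy term takes under a sigmoid."""
    return math.log1p(math.exp(z))

def softplus_taylor2(z):
    """Second-order Taylor expansion of log(1 + e^z) around z = 0."""
    # f(0) = log 2, f'(0) = 1/2, f''(0) = 1/4, so the quadratic term is z^2 / 8.
    return math.log(2.0) + z / 2.0 + z * z / 8.0

for z in (-1.0, -0.5, 0.0, 0.5, 1.0):
    gap = abs(softplus(z) - softplus_taylor2(z))
    print(f"z={z:+.1f}  exact={softplus(z):.4f}  taylor={softplus_taylor2(z):.4f}  gap={gap:.4f}")
```

The approximation is exact at z = 0 and degrades as |z| grows, which is why a bound on the resulting error matters.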

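Approximation Error Bounds

• Approximation error bounds

$$\hat{W} = \arg\min_{W} \hat{R}(D, W); \qquad \tilde{W} = \arg\min_{W} \tilde{R}(D, W)$$

$$\tilde{R}(D, \hat{W}) - \tilde{R}(D, \tilde{W}) \le \frac{e^2 + 2e - 1}{e(1 + e)^2} \times d$$

$$\tilde{C}(Y_T, \hat{\theta}) - \tilde{C}(Y_T, \tilde{\theta}) \le \frac{e^2 + 2e - 1}{e(1 + e)^2}$$

where d is the number of input units.

• Our algorithm can be applied to large datasets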
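The constant in the error bound is easy to evaluate numerically. The short sketch below computes (e^2 + 2e - 1) / (e(1 + e)^2) and scales it by the number of input units d; the function name `approximation_error_bound` is hypothetical.

```python
import math

def approximation_error_bound(d=1):
    """Upper bound (e^2 + 2e - 1) / (e * (1 + e)^2) * d on the approximation error."""
    e = math.e
    return (e ** 2 + 2 * e - 1) / (e * (1 + e) ** 2) * d

per_unit = approximation_error_bound(1)    # constant per input unit
for_100 = approximation_error_bound(100)   # bound for d = 100 input units
```

The bound grows linearly with d and does not depend on the number of tuples |D|, which is consistent with the claim that the approach scales to large datasets.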


Semantic Mining of Activity, Social, and Health Data (NIH/NIGMS Funded in 2013, R01 GM103309) (PI: Dou)

Human Behavior Prediction

[Figure: a social network of seven users (nodes 1 to 7) shown at time steps t-1, t, and t+1; the legend marks each user as "Decrease exercise" or "Increase exercise".]

Dataset, Features, and Task

• YesiWell dataset
  – 254 users
  – Oct 2010 – Aug 2011
• BMI

$$\mathrm{BMI} = \frac{\mathrm{mass\,(kg)}}{(\mathrm{height\,(m)})^2}$$

• Wellness Score y: computed from U(BMI), U(TG/HDL), U(LDL), and U(HbA1c)
• Prediction Task: Try to predict whether a YesiWell user will increase or decrease exercises in the next week compared with the current week.
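The BMI feature and the binary prediction target can be sketched as below. The function names, the use of a single weekly exercise total, and the choice to label an unchanged week as "decrease" are assumptions made for illustration; the slides do not specify them.

```python
def bmi(mass_kg, height_m):
    """Body mass index: mass in kilograms divided by squared height in meters."""
    return mass_kg / (height_m ** 2)

def exercise_label(current_week, next_week):
    """Return 1 if exercise increases next week, else 0.

    Ties are labeled 0 here; that tie-breaking rule is an assumption.
    """
    return 1 if next_week > current_week else 0

# Hypothetical user: 70 kg, 1.75 m, weekly exercise totals in arbitrary units.
user_bmi = bmi(70.0, 1.75)
label = exercise_label(current_week=100.0, next_week=120.0)
```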

dPA-based Human Behavior Prediction (dPAH)

[Figure: dPAH model whose inputs are individual features, individual past features, and social correlations, feeding the hidden layer h_1.]

Human Behavior Prediction: Experimental Results

• Do not enforce differential privacy
  – CRBM, SctRBM
  – Deep Auto-Encoder (dA)
  – Truncated Deep Auto-Encoder (TdA)
• Do enforce differential privacy
  – Functional Mechanism (FM)
  – DPME, Filter-Priority (FP)
• dPAH: 83.39%
  – (ε, sampling rate) = (1, 0.4)

Conclusions

• Deep private auto-encoders
  – Human behavior prediction: 83.392%
• The proposed algorithm can work for
  – Deep Belief Networks
  – Convolutional Neural Networks
• Extract sensitive information from a deep private auto-encoder


SMASH Project: http://aimlab.cs.uoregon.edu/smash/

YesiWell Health Social Network

Thank you!
