91
Machine Learning for Data Science (CS4786) Lecture 26 Differential Privacy and Machine Learning Machine Learning for Data Science (CS4786) Lecture 23

Machine Learning for Data Science (CS4786) Lecture 26

  • Upload
    others

  • View
    2

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Machine Learning for Data Science (CS4786) Lecture 26

Machine Learning for Data Science (CS4786)

Lecture 26

Differential Privacy and Machine Learning

Course Webpage :

http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Machine Learning for Data Science (CS4786)Lecture 23

Approximate Inference

Course Webpage :http://www.cs.cornell.edu/Courses/cs4786/2017fa/

Page 2: Machine Learning for Data Science (CS4786) Lecture 26

Competition• Clustering Wikipedia articles: (11039 articles)

• Data provided:

• Text of articles: Both raw data and feature extracted data provided

• Graph linking articles to one another

• Competition is hosted on Vocareum

• Vocareum submission + 5 page report due at end of exam week

Page 3: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data

Page 4: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

Page 5: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

Page 6: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

Page 7: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

• User search logs (Eg. learning personalized query retrieval for searches)

Page 8: Machine Learning for Data Science (CS4786) Lecture 26

ML Requires Data• By definition, ML is the task of automatically learning from

examples or instances (Data)

• Often privacy concerns about Data used:

• Medical records of patients (Eg. learn how much smoking affects chances of getting cancer)

• User search logs (Eg. learning personalized query retrieval for searches)

• Genetic information (Eg. to learn genetic predispositions)

Page 9: Machine Learning for Data Science (CS4786) Lecture 26

AOL Data Release• AOL released internet query dataset (after anonymizing)

Page 10: Machine Learning for Data Science (CS4786) Lecture 26

AOL Data Release• AOL released internet query dataset (after anonymizing)

Page 11: Machine Learning for Data Science (CS4786) Lecture 26
Page 12: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

Page 13: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

Page 14: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1

Page 15: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 16: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

• Though users were anonymized, some users on dataset were identified

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 17: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Given ratings by users for some movies

• Predict remaining ratings

• Though users were anonymized, some users on dataset were identified

•How?!!!

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 18: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 19: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 20: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

• Only a very small overlap with IMDB was required

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 21: Machine Learning for Data Science (CS4786) Lecture 26

Netflix Challenge [NS’08]

• Some of the users posted reviews (for few movies) on IMDB

• Only a very small overlap with IMDB was required

• You pretty much get the persons viewing record from /Netflix without consent

Netflix Violates Privacy [NS08]

User%1%User%2%User%3%

Movies%

2-8 movie-ratings and dates for Alice reveals:

Whether Alice is in the dataset or notAlice’s other movie ratings

5

4

1

11 1? ?? ??

?

?

??

Page 22: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML

Page 23: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

Page 24: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

Page 25: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

Page 26: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

Page 27: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

• Only releases general statistics?

Page 28: Machine Learning for Data Science (CS4786) Lecture 26

Privacy Concerns in ML• Clearly just anonymizing didn’t seem to do the trick here

• Of course in these cases data set was released.

• What if we didn’t release data set:

• Trusted party uses data to learn classifiers or general statistics

• Only releases general statistics?

• Or classifier learnt from data?

Page 29: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment

Page 30: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

Page 31: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

Page 32: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

Page 33: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

Page 34: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Page 35: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

What is the problem?

Page 36: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Page 37: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment• Say we release general statistics from a study

• Eg. Smokers Vs Non-smokers (per state or county…)

• We release mean salary in the two groups

• Likelihood of Cancer in the two groups

• Average number of jobs held by people in the two groups

Say “Fill Nates’’ from WA was in the dataset, and is very very rich.

Page 38: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

Page 39: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

Page 40: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

Page 41: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

Page 42: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

• Say we use regression for learning the classifier

Page 43: Machine Learning for Data Science (CS4786) Lecture 26

Thought Experiment: subtler

• Building classifier and releasing only the classifier

• “Assume” chain smoking has some correlation with lower income

• Say we have classifier from two or more counties/hospital, one of them has “Fill Nates”

• Say we use regression for learning the classifier

• By looking at weight put on income column of dataset, we can infer if “Fill Nates” was part of study and which hospital

Page 44: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Page 45: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset +

Page 46: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Page 47: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Distribution of outcome

Page 48: Machine Learning for Data Science (CS4786) Lecture 26

Defining Privacy

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Dataset + Learning Algorithm

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Distribution of outcome

Distribution of outcome

Differential Privacy [DMNS06]

“similar”

RandomizedSanitizer

Randomized Sanitizer

Data +

Data +

Participation of single person does not change outcome

Similar

Page 49: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 50: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

Page 51: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

Page 52: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

Page 53: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

• Algorithm A is (𝜖,ẟ)- differentially private if for all samples S and S’ that only differ by one data point and any set C

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

Page 54: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy• A deterministic algorithm cannot preserve privacy

• Say S = (Data1,…,Datan) is the data provided to learning algorithm (be it clustering, supervised learning etc).

• Say (randomized) learning algorithm A takes this training data and returns solution as A(S)

• Algorithm A is (𝜖,ẟ)- differentially private if for all samples S and S’ that only differ by one data point and any set C

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

• ẟ=0 is called pure differential privacy

Page 55: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 56: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

Page 57: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

Page 58: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Formally defining privacy

S

CDataset + Dataset +

P (A(S) 2 C) e✏P (A(S0) 2 C) + �

1 (for small 𝜖)

Page 59: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy

Page 60: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

Differential Privacy

Page 61: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

Differential Privacy

Page 62: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

Differential Privacy

Page 63: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

• But under S probability is 1

Differential Privacy

Page 64: Machine Learning for Data Science (CS4786) Lecture 26

• Any deterministic algorithm either has to produce constant outcome (making it useless)

• If it doesn’t, define C to be the singleton set of outcome under say S.

• Then probability of this set under S’ is 0

• But under S probability is 1

• Hence cannot be differentially private

Differential Privacy

Page 65: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

Page 66: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

• Typical mechanism: Add noise to outcome or inside algorithm

Page 67: Machine Learning for Data Science (CS4786) Lecture 26

Obtaining Differential Privacy

• Typical mechanism: Add noise to outcome or inside algorithm

• More privacy we want the more noise we add

Page 68: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

Page 69: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

Page 70: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

Page 71: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

• First compute mean M =1

n

nX

t=1

It

Page 72: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example I

• First lets begin with the example of releasing mean incomes (smokers Vs non-smokers)

• Say incomes I1,…,In are the income of subjects in the sample

• First compute mean

• Add noise to it M + max_income Laplace(0,1)/ n 𝜖

M =1

n

nX

t=1

It

Page 73: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

Page 74: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

Page 75: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 76: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

• A(S) = f(S) + B Laplace(0,1)/𝜖

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 77: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

• Take any arbitrary (possibly deterministic) function f(S).

• Say where S and S’ differ on one data point

• A(S) = f(S) + B Laplace(0,1)/𝜖

• A is (𝜖,0)-differentially private

B = maxS,S0

|f(S)� f(S0)|<latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit><latexit sha1_base64="64q5fqjJ9RKHqYkoJ7QojhjzeCI=">AAACBnicbVDLSgMxFM3UV62vUZciBIu0BS0zIuhGKHXjslL7gHYYMmmmDU1mhiQjlmlXbvwVNy4Uces3uPNvTB8LbT1wuYdz7iW5x4sYlcqyvo3U0vLK6lp6PbOxubW9Y+7u1WUYC0xqOGShaHpIEkYDUlNUMdKMBEHcY6Th9a/HfuOeCEnD4E4NIuJw1A2oTzFSWnLNwzK8gm2OHtykelLNjYZ+vlqAp1C3XGHomlmraE0AF4k9I1kwQ8U1v9qdEMecBAozJGXLtiLlJEgoihkZZdqxJBHCfdQlLU0DxIl0kskZI3islQ70Q6ErUHCi/t5IEJdywD09yZHqyXlvLP7ntWLlXzoJDaJYkQBPH/JjBlUIx5nADhUEKzbQBGFB9V8h7iGBsNLJZXQI9vzJi6R+VrSton17ni2VZ3GkwQE4AnlggwtQAjegAmoAg0fwDF7Bm/FkvBjvxsd0NGXMdvbBHxifP8oMlh4=</latexit>

Page 78: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 79: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 80: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 81: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 82: Machine Learning for Data Science (CS4786) Lecture 26

Why it works?

P (A(S) = x)

P (A(S0) = x)=

e�✏|f(S)�x|/B

e�✏|f(S0)�x|/B

= e✏|f(S0)�x|�|f(S)�x|

B

e✏|f(S0)�f(S)|

B

e✏

p(x; f(S))

p(x; f(S0))<latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit><latexit sha1_base64="WnLL5R0l7S6CtNK7OdgkOt6x4+A=">AAACBXicdZDLSgMxFIYzXmu9jbrURbCI7aYkRWyLm6IblxXtBdpSMmmmDc1cSDJiGbpx46u4caGIW9/BnW9jphdQ0QOBj/8/h5PzO6HgSiP0aS0sLi2vrKbW0usbm1vb9s5uXQWRpKxGAxHIpkMUE9xnNc21YM1QMuI5gjWc4UXiN26ZVDzwb/QoZB2P9H3uckq0kbr2QduVhMZh9u7MzV7ncuM5Hhvu2hmURwhhjGECuHiKDJTLpQIuQZxYpjJgVtWu/dHuBTTymK+pIEq1MAp1JyZScyrYON2OFAsJHZI+axn0icdUJ55cMYZHRulBN5Dm+RpO1O8TMfGUGnmO6fSIHqjfXiL+5bUi7ZY6MffDSDOfThe5kYA6gEkksMclo1qMDBAqufkrpANiYtEmuLQJYX4p/B/qhTxGeXx1kqmcz+JIgX1wCLIAgyKogEtQBTVAwT14BM/gxXqwnqxX623aumDNZvbAj7LevwAWcZcE</latexit>

Page 83: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

Page 84: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

• For the classification/regression problem we can of course use the Laplace mechanism

Page 85: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

• For the classification/regression problem we can of course use the Laplace mechanism

• Can we do better?

Page 86: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II

Page 87: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

Page 88: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

Page 89: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

• It can be shown that if ||x||’s < 1:

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

kf(S)� f(S0)k 1

�n

Page 90: Machine Learning for Data Science (CS4786) Lecture 26

Back to Example II• Yes! For instance for linear classifiers.

• Say we use SVM or logistic regression as follows:

• It can be shown that if ||x||’s < 1:

• Add vector version of noise to f, only scale now is of order O(1/𝜖ƛn)

f(S) = argminw1

n

nX

t=1

`(w>xt, yt) + �kwk2

kf(S)� f(S0)k 1

�n

Page 91: Machine Learning for Data Science (CS4786) Lecture 26

Differential Privacy in ML

• Differential private versions of PCA, clustering algorithms, deep learning etc. have been explored

• Nice properties of Differential Privacy

• post processing is ok

• compostability lemma

• Recently Differential Privacy was used as tool to allow statistically safe reuse of data