Differential Privacy: a short tutorial
Presenter: WANG Yuxiang
Some slides/materials extracted from
Aaron Roth's Lecture at CMU
The Algorithmic Foundations of Data Privacy [Website]
Cynthia Dwork's FOCS'11 Tutorial
The Promise of Differential Privacy. A Tutorial on Algorithmic
Techniques. [Website]
Christine Taskβs seminar
A Practical Beginners' Guide to Differential Privacy
[YouTube][Slides]
In the presentation
1. Intuitions
Anonymity means privacy?
A running example: Justin Bieber
What exactly does DP protect? The Smoking Mary example.
2. What and how
ε-Differential Privacy
Global sensitivity and Laplace Mechanism
(ε, δ)-Differential Privacy and Composition Theorem
3. Many queries
A disappointing lower bound (Dinur-Nissim Attack 03)
Sparse-vector technique
In the presentation
4. Advanced techniques
Local sensitivity
Sample and Aggregate
Exponential mechanism and Net-mechanism
5. Differential Privacy in Machine Learning
Diff-Private logistic regression (Perturb objective)
Diff-Private low-rank approximation
Diff-Private PCA (use Exponential Mechanism)
Diff-Private SVM
Privacy in information age
Governments, companies, and research centers collect and
analyze personal information.
Social networks: Facebook, LinkedIn
YouTube & Amazon use viewing/buying records for
recommendations.
Emails in Gmail are used for targeted ads.
Privacy by information control
Conventional measures for privacy:
Control access to information
Control the flow of information
Control the purposes for which information is used
Typical approaches for private release:
Anonymization (removing identifiers, or k-anonymity)
(Conventional) sanitization (release a sampled subset)
These approaches do not actually guarantee privacy.
An example of privacy leak
De-anonymize Netflix data
"Sparsity" of the data: with large probability, no two profiles are
similar up to ε. In the Netflix data, no two records are more
than 50% similar.
If a profile can be matched, up to 50% similarity, to a profile in
IMDb, then the adversary knows with good chance the true
identity of the profile.
The paper proposes an efficient randomized algorithm to break
privacy.
A. Narayanan and V. Shmatikov, "Robust de-anonymization of large sparse
datasets (how to break anonymity of the Netflix prize dataset)," in Proc. 29th
IEEE Symposium on Security and Privacy, 2008.
Other cases
Medical records of the MA Governor in an "anonymized"
medical database
Search history of Thelma Arnold in "anonymized" AOL
query records
DNA of an individual in a genome-wide association study
(getting scary…)
Chinese "human-flesh search engines" (crowdsourced doxxing)… Various ways to gather background
information.
A running example: Justin Bieber
To understand the guarantee and what it protects against.
Suppose you are handed a survey:
If your music taste is sensitive information, what would make
you feel safe? Anonymity?
Notations
What do we want?
Why canβt we have it?
Why canβt we have it?
Disappointing fact
We can't promise my data won't affect the results.
We can't promise that an attacker won't be able to learn
new information about me, given appropriate background
information.
What can we do?
One more try
Differential Privacy
Proposed by Cynthia Dwork in 2006.
The chance that the noisy released result will be C is
nearly the same, whether or not you submit your info.
The harm to you is "almost" the same regardless of your
participation.
Definition (ε-Differential Privacy):
Pr[M(D) = C] / Pr[M(D ± i) = C] < e^ε
for any D ± i with ‖(D ± i) − D‖ ≤ 1 and any C ∈ Range(M).
Differential Privacy
Popular over-claims
"DP protects individuals against ALL harms, regardless of
prior knowledge." Fun paper: "Is Terry Gross protected?"
Harm from the result itself cannot be eliminated.
"DP makes it impossible to guess with large probability whether
one participated in a database."
Only true under the assumption that there is no group structure:
each participant gives information only about him/herself.
A short example: Smoking Mary
Mary is a smoker. She is harmed by the outcome of a study showing that "smoking causes cancer":
Her insurance premium rises.
Her premium will rise regardless of whether she participates in the study. (No way to avoid it: this finding is the whole point of the study.)
There are benefits too:
Mary decides to quit smoking.
Differential privacy: limit harms to the teachings, not participation.
The outcome of any analysis is essentially equally likely, independent of whether any individual joins, or refrains from joining, the dataset.
Automatically immune to linkage attacks.
Summary of Differential Privacy idea
DP can:
Deconstructs harm, and limits it to the harm coming from the results
Ensures the released results give minimal evidence of whether
any individual contributed to the dataset
When individuals provide info only about themselves, DP protects
personally identifiable information at the strictest possible level.
In the presentation
1. Intuitions
Anonymity means privacy?
A running example: Justin Bieber
What exactly does DP protect? The Smoking Mary example.
2. What and how
ε-Differential Privacy
Global sensitivity and Laplace Mechanism
(ε, δ)-Differential Privacy and Composition Theorem
3. Many queries
A disappointing lower bound (Dinur-Nissim Attack 03)
Sparse-vector technique
Differential Privacy [D., McSherry, Nissim,
Smith 06]
[Figure: Pr[response] curves for adjacent databases; the ratio is bounded everywhere, including on bad responses Z.]
M gives ε-differential privacy if for all adjacent x and x′, and all
C ⊆ Range(M): Pr[M(x) ∈ C] ≤ e^ε · Pr[M(x′) ∈ C]
Neutralizes all linkage attacks.
Composes unconditionally and automatically: Σᵢ εᵢ
Sensitivity of a Function
Adjacent databases differ in at most one row.
Counting queries have sensitivity 1.
Sensitivity captures how much one person's data can affect the output.
Δf = max over adjacent x, x′ of |f(x) − f(x′)|
Example
How many survey takers are female?
Sensitivity = 1
In total, how many Justin Bieber albums are bought by
survey takers?
Sensitivity = 4? (He has released only 4 different albums so far.)
Laplace Distribution Lap(b)
p(z) = exp(−|z|/b) / (2b)
variance = 2b²
σ = √2 · b
Increasing b flattens the curve.
Laplace Mechanism
Δf = max over adjacent x, x′ of |f(x) − f(x′)|
Theorem [DMNS06]: On query f, to achieve ε-differential privacy, use
scaled symmetric noise Lap(b) with b = Δf/ε.
Noise depends on Δf and ε, not on the database.
Smaller sensitivity (Δf) means less distortion.
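A minimal sketch of the Laplace mechanism in Python (the helper names and toy data are mine, not from the slides):

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=np.random.default_rng()):
    """Release an answer with Lap(b) noise, where b = sensitivity / epsilon [DMNS06]."""
    return true_answer + rng.laplace(scale=sensitivity / epsilon)

# A counting query ("how many survey takers are female?") has sensitivity 1,
# so Lap(1/epsilon) noise suffices.
genders = ["F", "F", "M", "F", "M"]
noisy_count = laplace_mechanism(genders.count("F"), sensitivity=1.0, epsilon=0.5)
print(noisy_count)
```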
Proof of Laplace Mechanism
Pr[f(x) + Lap(Δf/ε) = y] / Pr[f(x′) + Lap(Δf/ε) = y]
= exp(−|y − f(x)| · ε/Δf) / exp(−|y − f(x′)| · ε/Δf)
= exp( (ε/Δf) · (|y − f(x′)| − |y − f(x)|) )
≤ exp( (ε/Δf) · |f(x) − f(x′)| )
≤ e^ε
Thatβs it!
Example: Counting Queries
How many people in the database are female?
Sensitivity = 1
Sufficient to add noise ∼ Lap(1/ε)
What about multiple counting queries?
It depends.
Vector-Valued Queries
Δf = max over adjacent x, x′ of ‖f(x) − f(x′)‖₁
Theorem [DMNS06]: On query f, to achieve ε-differential privacy, use
scaled symmetric noise [Lap(Δf/ε)]^d in each coordinate.
Noise depends on Δf and ε, not on the database.
Smaller sensitivity (Δf) means less distortion.
(ε, δ)-Differential Privacy
[Figure: Pr[response] curves for adjacent databases; the ratio is bounded on bad responses Z.]
M gives (ε, δ)-differential privacy if for all adjacent D and D′, and
all C ⊆ Range(M): Pr[M(D) ∈ C] ≤ e^ε · Pr[M(D′) ∈ C] + δ
Neutralizes all linkage attacks.
Composes unconditionally and automatically: (Σᵢ εᵢ, Σᵢ δᵢ)
This talk: δ negligible
From worst case to average case
Trade off a lot of ε with only a little δ
How?
Useful Lemma [D., Rothblum, Vadhan'10]:
Privacy loss bounded by ε ⟹ expected loss bounded by 2ε².
∀C ∈ Range(M): Pr[M(x) ∈ C] / Pr[M(x′) ∈ C] ≤ e^ε
Equivalently,
ln( Pr[M(x) ∈ C] / Pr[M(x′) ∈ C] ) ≤ ε
(the "privacy loss")
Max Divergence and KL-Divergence
Max divergence (exactly the definition of ε!)
KL divergence (average divergence)
The Useful Lemma gives a bound on the KL divergence:
D(Y ‖ Z) ≤ ε(e^ε − 1)
ε(e^ε − 1) ≤ 2ε² when ε < 1
"Simple" Composition
k-fold composition of (ε, δ)-differentially private
mechanisms is (kε, kδ)-differentially private.
If we want to keep the original guarantee, we must inject k times the noise.
When k is large, this destroys the utility of the output.
Can we do better by again leveraging the tradeoff,
trading a little δ for a lot of ε?
Composition [D., Rothblum,Vadhanβ10]
Qualitatively: formalize composition
Multiple, adaptively and adversarially generated databases and
mechanisms
What is Bob's lifetime exposure risk?
E.g., for a 1-dp lifetime over 10,000 ε-dp or (ε, δ)-dp databases,
what should the value of ε be?
Quantitatively:
∀ε, δ, δ′: the k-fold composition of (ε, δ)-dp mechanisms is
( √(2k ln(1/δ′)) · ε + kε(e^ε − 1), kδ + δ′ )-dp
√k · ε rather than kε
Flavor of Privacy Proof
Recall the "Useful Lemma":
Privacy loss bounded by ε ⟹ expected loss bounded by 2ε².
Model cumulative privacy loss as a martingale [Dinur, D., Nissim'03]:
A = bound on the max loss per step (ε)
B = bound on the expected loss per step (2ε²)
Pr over M₁,…,M_k [ |Σᵢ loss from Mᵢ| > z·√k·A + kB ] < exp(−z²/2)
Extension to (ε, δ)-dp mechanisms
Reduce to the previous case via the "dense model theorem" [MPRV09]:
the output Y of an (ε, δ)-dp mechanism is δ-close to the output Y′
of an (ε, 0)-dp mechanism.
Composition Theorem
∀ε, δ, δ′: the k-fold composition of (ε, δ)-dp mechanisms is
( √(2k ln(1/δ′)) · ε + kε(e^ε − 1), kδ + δ′ )-dp
What is Bob's lifetime exposure risk?
E.g., 10,000 ε-dp or (ε, δ)-dp databases, for a lifetime cost of
(1, δ′)-dp
What should the value of ε be?
1/801
OMG, that is small! Can we do better?
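A rough numeric check of the composition theorem (the value δ′ = e⁻³² is my assumption; the bisection simply inverts the formula above):

```python
import math

def advanced_composition_eps(eps, k, delta_prime):
    """Total privacy cost of k-fold composition of eps-dp mechanisms [DRV'10]."""
    return math.sqrt(2 * k * math.log(1 / delta_prime)) * eps + k * eps * (math.exp(eps) - 1)

# Bob's lifetime budget is (1, delta'): solve for the per-database eps over
# k = 10,000 databases by bisection. delta' = e^-32 is an assumed value;
# the slides quote eps ~ 1/801, which this rough solve approximately matches.
k, delta_prime = 10_000, math.exp(-32)
lo, hi = 0.0, 1.0
for _ in range(100):
    mid = (lo + hi) / 2
    if advanced_composition_eps(mid, k, delta_prime) > 1.0:
        hi = mid
    else:
        lo = mid
print(f"per-database eps = {lo:.6f} (about 1/{1 / lo:.0f})")
```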
In the presentation
1. Intuitions
Anonymity means privacy?
A running example: Justin Bieber
What exactly does DP protect? The Smoking Mary example.
2. What and how
ε-Differential Privacy
Global sensitivity and Laplace Mechanism
(ε, δ)-Differential Privacy and Composition Theorem
3. Many queries
A disappointing lower bound (Dinur-Nissim Attack 03)
Sparse-vector technique
Dinur-Nissim03 Attack
Dinur and Nissim show the following negative results:
If the adversary has exponential computational power (can ask exp(n)
questions), perturbation of magnitude Ω(n) is needed for privacy.
If the adversary has poly(n) computational power, at least Ω(√n)
perturbation is needed.
They also give the following positive news:
If the adversary can ask fewer than T(n) questions, then a
perturbation error of roughly √T(n) is sufficient to guarantee
privacy.
Sparse Vector Technique
Database size n
# Queries m ≫ n, e.g., m super-polynomial in n
# "Significant" queries k ∈ O(n)
For now: counting queries only
Significant: count exceeds a publicly known threshold T
Goal: find, and optionally release, counts for significant
queries, paying only for significant queries
Algorithm and Privacy Analysis
Algorithm (first attempt):
When given query f_t:
  If f_t(x) ≤ T: [insignificant] Output ⊥
  Otherwise: [significant] Output f_t(x) + Lap(σ)
First attempt: it's obvious, right?
  Number of significant queries ≤ k ⟹ ≤ k invocations of the Laplace mechanism
  Can choose σ so as to get error ∼ k^{1/2}
Caution: the conditional branch leaks private information!
Need a noisy threshold T + Lap(σ) [Hardt-Rothblum]
Algorithm and Privacy Analysis (corrected)
When given query f_t:
  If f_t(x) ≤ T + Lap(σ): [insignificant] Output ⊥
  Otherwise: [significant] Output f_t(x) + Lap(σ)
Intuition: counts far below T leak nothing.
  Only charge for noisy counts in the borderline range around T;
  the noisy threshold fixes the leak from the conditional branch.
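A sketch of the sparse-vector idea in Python. The noise scales follow the textbook AboveThreshold variant, which halts after the first significant query; the slides' version continues for up to k significant queries with rescaled noise:

```python
import numpy as np

def above_threshold(queries, x, T, epsilon, rng=np.random.default_rng()):
    """Sketch of sparse vector for sensitivity-1 counting queries: compare each
    noisy count to a *noisy* threshold, output "bottom" for insignificant
    queries, and pay only when a significant one is found."""
    noisy_T = T + rng.laplace(scale=2.0 / epsilon)
    answers = []
    for q in queries:                        # each q maps a database to a count
        if q(x) + rng.laplace(scale=4.0 / epsilon) >= noisy_T:
            # Significant: release a noisy count, then halt (budget spent).
            answers.append(q(x) + rng.laplace(scale=2.0 / epsilon))
            break
        answers.append(None)                 # insignificant: output "bottom"
    return answers
```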
Sparse Vector Technique: analysis
The probability of (significantly) exceeding the expected number of
borderline events is negligible (Chernoff).
Assuming it is not exceeded: use Azuma to argue that w.h.p. the actual
total privacy loss does not significantly exceed the expected total loss.
Utility: with probability at least 1 − β, all errors are bounded
by O(σ · (ln m + ln(1/β))).
Choose σ = 8√2 · ln(2/δ) · (4k + ln(2/δ)) / ε
Linear in k, and only logarithmic in m!
Expected total privacy loss E = O(k/σ²)
In the presentation
4. Advanced Techniques
Local sensitivity
Sample and Aggregate
Exponential mechanism and Net-mechanism
5. Differential Privacy in Machine Learning
Diff-Private logistic regression (Perturb objective)
Diff-Private low-rank approximation
Diff-Private PCA (use Exponential Mechanism)
Diff-Private SVM
Large sensitivity queries
Thus far, we've been considering counting queries or
similarly low-sensitivity queries.
What if the query itself has high sensitivity?
What is the age of the oldest person who likes Justin Bieber?
Sensitivity is 50? 100? It's not even well defined.
Can use a histogram: <10, 10-20, 20-30, 30-45, >45
If the data is unbounded, what is the sensitivity of PCA?
The 1st principal component can turn 90 degrees!
Local sensitivity/smooth sensitivity
Idea: let the sensitivity depend on the actual data D, not on the
worst case over the whole data range.
Consider f = median income.
If incomes range over [0, k], the global sensitivity is k! Perturbing by Lap(k/ε) gives no utility at all.
For typical data, however, we may do better:
Local sensitivity/smooth sensitivity
Local sensitivity is defined for a particular database D, but adding
Lap(LS(D)/ε) doesn't guarantee ε-dp,
because LS(D) itself is sensitive to the values in D!
Solution: smooth the upper bound of LS.
Smooth sensitivity
Simple proof, similar to the original Laplace mechanism proof.
Theorem: Let S_f be an ε-smooth upper bound on the local sensitivity of f. Then
an algorithm that outputs
A(D) = f(D) + Lap(S_f(D)/ε)
is 2ε-differentially private.
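As a concrete instance, a sketch of the smooth sensitivity of the median (following [Nissim, Raskhodnikova, Smith'07] and the median-income example above); the public data range lam and the example values are my assumptions:

```python
import numpy as np

def smooth_sensitivity_median(data, eps, lam):
    """eps-smooth upper bound on the local sensitivity of the median,
    for data clipped to the public range [0, lam]."""
    x = np.sort(np.clip(np.asarray(data, dtype=float), 0.0, lam))
    n = len(x)
    m = (n - 1) // 2                              # median index (0-based)

    def elem(i):                                  # x_i with boundary convention
        return 0.0 if i < 0 else (lam if i >= n else x[i])

    s = 0.0
    for k in range(n + 1):                        # local sensitivity at distance k
        ls_k = max(elem(m + t) - elem(m + t - k - 1) for t in range(k + 2))
        s = max(s, np.exp(-k * eps) * ls_k)
    return s

# Release the median with noise calibrated as in the theorem above:
rng = np.random.default_rng()
data, eps, lam = [21, 34, 36, 40, 41, 45, 80], 0.5, 100.0
noisy_median = np.median(data) + rng.laplace(scale=smooth_sensitivity_median(data, eps, lam) / eps)
```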
Subsample-and-Aggregate mechanism
Subsample-and-Aggregate [Nissim, Raskhodnikova, Smithβ07]
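A sketch of the Subsample-and-Aggregate idea. For simplicity the aggregator below is a noisy clipped mean over an assumed public output range, rather than the smooth-sensitivity-based aggregation of the original paper:

```python
import numpy as np

def subsample_and_aggregate(data, f, num_blocks, epsilon, output_range,
                            rng=np.random.default_rng()):
    """Run an arbitrary estimator f on disjoint blocks of the data, then
    aggregate the per-block answers with a differentially private step."""
    lo, hi = output_range
    blocks = np.array_split(rng.permutation(data), num_blocks)
    answers = np.clip([f(b) for b in blocks], lo, hi)
    # One person affects one block, hence one of the num_blocks answers,
    # so the mean of the clipped answers has sensitivity (hi - lo) / num_blocks.
    sensitivity = (hi - lo) / num_blocks
    return answers.mean() + rng.laplace(scale=sensitivity / epsilon)
```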
Beyond perturbation
Discrete-valued functions: f(x) ∈ R = {y₁, y₂, …, y_k}
Strings, experts, small databases, β¦
Auction:
Exponential Mechanism
Define a utility function:
each y ∈ R has a utility for x, denoted q(x, y)
Exponential Mechanism [McSherry-Talwar'07]:
Output y with probability ∝ exp( ε·q(x, y) / (2Δq) )
Idea: make high-utility outputs exponentially more likely, at
a rate that depends on the sensitivity of q(x, y).
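A sketch of the exponential mechanism for a finite candidate set; the toy majority-gender query mirrors the example a few slides ahead:

```python
import numpy as np

def exponential_mechanism(x, candidates, utility, sensitivity, epsilon,
                          rng=np.random.default_rng()):
    """Sample y with probability proportional to exp(eps * q(x, y) / (2 * Dq))."""
    scores = np.array([utility(x, y) for y in candidates], dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()                 # stabilize before exponentiating
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

# Toy usage: "what is the majority gender among survey takers?"
# Utility = count of that gender; counts change by at most 1 per person.
x = ["F", "F", "M", "F", "M"]
winner = exponential_mechanism(x, ["F", "M"], lambda db, y: db.count(y),
                               sensitivity=1.0, epsilon=1.0)
```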
Exponential Mechanism
Exponential Mechanism
πΌ, π½ -usefulness of a private algorithm
A mechanism M is (α, β)-useful with respect to queries in
class C if for every database D ∈ X^n, with probability at least 1 − β, the output satisfies
max over Q_i ∈ C of |Q_i(D) − M(Q_i, D)| ≤ α
This lets us compare the private algorithm with the non-private algorithm in a PAC setting.
A remark here: the privacy tradeoff may be absorbed into the inherent measurement noise! Ideally, there can be no impact scale-wise!
Usefulness of Exponential Mechanism
How good is the output?
The result depends ONLY on Δq (and logarithmically on |R|).
Example: counting query. What is the majority gender among those
who like Justin Bieber? |R| = 2
Error is (2/ε)(log 2 + 5) with probability 1 − e⁻⁵!
The percentage error → 0 as the amount of data grows.
Net Mechanism
Many (fractional) counting queries [Blum, Ligett, Roth'08]:
Given an n-row database x and a set Q of properties, produce a synthetic
database y giving a good approximation to "what fraction of rows of x
satisfy property P?" for all P ∈ Q.
R is the set of all databases of size m ∈ O( log|Q| / α² ) ≪ n
u(x, y) = − max over P ∈ Q of |P(x) − P(y)|
The size of R is the α-net cover number of the database space with
respect to the query class Q.
Net Mechanism
Usefulness
For counting queries
Logarithmic in the number of queries! Private for an exponential
number of queries!
This well exceeds the fundamental limit of Dinur-Nissim'03 for
perturbation-based privacy guarantees. (Why?)
Other mechanisms
Transform to the Fourier domain, then add Laplace noise:
contingency table release.
The SuLQ mechanism works for any sublinear Statistical Query
Model algorithm.
Examples include PCA, k-means, and Gaussian mixture models.
Private PAC learning with the Exponential Mechanism,
for all classes with polynomial VC-dimension:
Blum, A., Ligett, K., Roth, A.: A Learning Theory Approach to Non-Interactive
Database Privacy (2008)
In the presentation
4. Advanced Techniques
Local sensitivity
Sample and Aggregate
Exponential mechanism and Net-mechanism
5. Differential Privacy in Machine Learning
Diff-Private PCA (use Exponential Mechanism)
Diff-Private low-rank approximation
Diff-Private logistic regression (Perturb objective)
Diff-Private SVM
Structure of a private machine learning
paper
Introduction:
Why the data needs to be learned privately.
Main contribution:
Propose a randomized algorithm.
Show this randomization indeed guarantees (ε, δ)-dp.
Show the sample complexity/usefulness under randomization.
Evaluation:
Compare to the standard Laplace mechanism (usually the Laplace
mechanism does quite badly).
Compare to the non-private algorithm and argue that the deterioration in
performance is not significant. (The price of privacy.)
Differential Private-PCA
K. Chaudhuri, A. D. Sarwate, K. Sinha, Near-optimal
Differentially Private PCA (NIPS'12):
http://arxiv.org/abs/1207.2812
An instance of the Exponential Mechanism:
the utility function is defined such that outputs close to the ordinary
PCA output are exponentially more likely.
Sample from the Bingham distribution using a Markov chain Monte
Carlo procedure.
Adding privacy as a trait to RPCA?
Differential Private Low-Rank
Approximation
Moritz Hardt, Aaron Roth, Beating Randomized Response
on Incoherent Matrices (STOC'12)
http://arxiv.org/abs/1111.0623
Motivated by the Netflix challenge, yet they don't assume the
missing-data/matrix-completion setting, but study general low-rank
approximation.
Privatizes Tropp's two-step low-rank approximation by adding
noise. Very dense analysis, but not difficult.
Assumes the sparse matrix itself is incoherent.
Differentially private logistic regression
K. Chaudhuri, C. Monteleoni, Privacy-preserving logistic
regression (NIPS'08)
http://www1.ccls.columbia.edu/~cmontel/cmNIPS2008.pdf
Journal version: Privacy-preserving Empirical Risk
Minimization (JMLR 2011)
http://jmlr.csail.mit.edu/papers/volume12/chaudhuri11a/ch
audhuri11a.pdf
I'd like to talk a bit more about this paper as a typical
example of a DP machine learning paper.
Its structure is exactly what I described a few slides back.
Differentially private logistic regression
Refresh on logistic regression
Input: {x₁, …, xₙ}, each xᵢ ∈ ℝ^d with ‖xᵢ‖ ≤ 1;
y₁, …, yₙ, with yᵢ ∈ {−1, 1} the class label assigned to each xᵢ.
Output:
a vector w ∈ ℝ^d; sgn(wᵀx) gives the predicted classification of a point x.
Algorithm for (regularized) logistic regression:
w* = argmin_w (λ/2) wᵀw + (1/n) Σᵢ₌₁ⁿ log(1 + exp(−yᵢ wᵀxᵢ))
i.e., minimize f_λ(w) = (λ/2)‖w‖² + L(w), regularizer plus average loss.
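For reference in the next two slides, a non-private baseline implementing exactly this objective (a sketch; scipy's generic minimizer stands in for whatever solver the paper uses):

```python
import numpy as np
from scipy.optimize import minimize

def logreg_objective(w, X, y, lam):
    """The slide's objective: (lam/2)||w||^2 + (1/n) sum_i log(1 + exp(-y_i w.x_i))."""
    margins = y * (X @ w)
    return 0.5 * lam * (w @ w) + np.mean(np.logaddexp(0.0, -margins))

def train_logreg(X, y, lam):
    """Non-private baseline: minimize the regularized objective."""
    res = minimize(logreg_objective, np.zeros(X.shape[1]), args=(X, y, lam))
    return res.x
```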
Differentially private logistic regression
Results (output) perturbation approach
Sensitivity:
max over adjacent D, D′ of ‖w*(D) − w*(D′)‖ ≤ 2/(nλ)
Algorithm (Laplace mechanism):
1. Train w* on the data.
2. Sample a noise vector b with density ∝ exp(−(nλε/2)‖b‖).
3. Output w* + b.
Usefulness: with probability 1 − δ:
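A sketch of output perturbation under the stated sensitivity 2/(nλ). Drawing a vector with a uniform direction and Gamma-distributed norm is the standard way to get density proportional to exp(−‖b‖/scale); train_logreg is the baseline sketched above:

```python
import numpy as np

def sample_l2_laplace(d, scale, rng=np.random.default_rng()):
    """Noise vector with density proportional to exp(-||b||/scale):
    uniform random direction, Gamma(d, scale)-distributed norm."""
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    return rng.gamma(shape=d, scale=scale) * direction

def private_logreg_output_perturbation(X, y, lam, eps, rng=np.random.default_rng()):
    """Train w*, then add noise calibrated to the sensitivity 2/(n*lam),
    i.e. density proportional to exp(-(n*lam*eps/2)||b||)."""
    n, d = X.shape
    return train_logreg(X, y, lam) + sample_l2_laplace(d, scale=2.0 / (n * lam * eps), rng=rng)
```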
Differentially private logistic regression
Objective perturbation approach
Algorithm:
1. Sample a noise vector b with density ∝ exp(−(ε/2)‖b‖).
2. Output w* = argmin_w f_λ(w) + bᵀw/n.
Observe:
The size of the perturbation is independent of the sensitivity! And
independent of λ.
When n → ∞, this optimization is consistent.
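A corresponding sketch of objective perturbation, reusing the helpers above; the noise density exp(−(ε/2)‖b‖) matches step 1 as reconstructed:

```python
import numpy as np
from scipy.optimize import minimize

def private_logreg_objective_perturbation(X, y, lam, eps, rng=np.random.default_rng()):
    """Minimize the objective plus a random linear term (b.w)/n, with b drawn
    with density proportional to exp(-(eps/2)||b||). Reuses logreg_objective
    and sample_l2_laplace from the sketches above."""
    n, d = X.shape
    b = sample_l2_laplace(d, scale=2.0 / eps, rng=rng)
    res = minimize(lambda w: logreg_objective(w, X, y, lam) + (b @ w) / n, np.zeros(d))
    return res.x
```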
Differentially private logistic regression
Theorem: The objective perturbation approach preserves
ε-differential privacy.
Proof:
Because both the regularizer and the loss function are differentiable
everywhere, there is a unique b for any output w*.
Consider two adjacent databases differing in only one point;
let b₁ and b₂ be the noise vectors that yield the same w*.
Because the perturbed objectives with b₁ and b₂ both have zero derivative
at w*, we get an equation relating them; with the triangle inequality, ‖b₁ − b₂‖ ≤ 2.
Lastly, by the definition of the noise density, the ratio of densities is bounded by e^ε.
Differentially private logistic regression
Generalization to a class of convex objective functions!
The proof is very similar to the special case of logistic regression.
It was, however, wrong; corrected in their JMLR version…
Differentially private logistic regression
Learning guarantee:
a generalization bound (assuming data drawn i.i.d. from a
distribution).
With probability 1 − δ, the classification error is at most:
The proof is standard, following Nati Srebro's NIPS'08 paper on
regularized objective functions.
Differentially private logistic regression
A similar generalization bound is given for the results
perturbation approach:
A:
To compare with the bound for the proposed objective
perturbation:
B:
A is always greater than or equal to B.
In low-loss (high-accuracy) cases where ‖w₀‖ > 1, A is much
worse than B.
Differentially private logistic regression
Simulation results:
Separable: random data on a hypersphere, with a small 0.03 gap
separating the two labels.
Unseparable: random data on a hypersphere with a 0.1 gap,
then labels flipped with probability 0.2.
Differential Private-ERM in high dimension
Follow-up:
D. Sarwate, K. Chaudhuri, C. Monteleoni, Differentially Private
Support Vector Machines:
extends "objective perturbation" to a larger class of convex methods;
private non-linear kernel SVM.
D. Kifer, A. Smith, A. Thakurta, Private Convex Empirical Risk
Minimization and High-dimensional Regression (COLT'12):
extends "objective perturbation" to smaller added noise, and applies
to problems with non-differentiable regularizers.
Best algorithm for private linear regression in the low-dimensional setting.
First DP sparse regression in the high-dimensional setting.
Reiterate the key points
What does Differential Privacy protect against?
Deconstructs harm; minimizes the risk of joining a database.
Protects all personally identifiable information.
Elements of (ε, δ)-dp
Global sensitivity (a function of f) and smooth (local) sensitivity (a function of f and D)
Composition theorem (roughly √k · ε for k queries)
Algorithms
Laplace mechanism (perturbation)
Exponential mechanism (utility function, net mechanism)
Sample and aggregate (for unknown sensitivity)
Objective perturbation (for many convex-optimization-based learning algorithms)
Take-away from this Tutorial
Differential privacy as a new design parameter for algorithms
Provable (in fact, I've no idea how DP could be evaluated by simulation).
No complicated math. (Well, it can be complicated…)
Relevant to the key strengths of our group (noise, corruption robustness).
This is a relatively new field (at least in machine learning).
Criticisms:
The bound is a bit paranoid (assumes a very strong adversary).
Hard/impossible to get practitioners to use it, as accuracy is sacrificed. (Unless there's a legal requirement.)
Questions and Answers