From Stability to Differential Privacy
Abhradeep Guha Thakurta, Yahoo! Labs, Sunnyvale


Page 1: From Stability to Differential Privacy

From Stability to Differential Privacy

Abhradeep Guha ThakurtaYahoo! Labs, Sunnyvale

Page 2: From Stability to Differential Privacy

Thesis: Stable algorithms yield differentially private algorithms

Page 3: From Stability to Differential Privacy

Differential privacy: A short tutorial

Page 4: From Stability to Differential Privacy

Privacy in Machine Learning Systems

Individuals: d₁, d₂, …, dₙ₋₁, dₙ

Page 5: From Stability to Differential Privacy

Privacy in Machine Learning Systems

Individuals: d₁, d₂, …, dₙ₋₁, dₙ

Trusted learning Algorithm

Page 6: From Stability to Differential Privacy

Privacy in Machine Learning Systems

Individuals: d₁, d₂, …, dₙ₋₁, dₙ

Trusted learning Algorithm

Users

Summary statistics:
1. Classifiers
2. Clusters
3. Regression coefficients

Page 7: From Stability to Differential Privacy

Privacy in Machine Learning Systems

Individuals: d₁, d₂, …, dₙ₋₁, dₙ

Trusted learning Algorithm

Users

Summary statistics:
1. Classifiers
2. Clusters
3. Regression coefficients

Attacker

Page 8: From Stability to Differential Privacy

Privacy in Machine Learning Systems

Learning Algorithm

d₁, d₂, …, dₙ₋₁, dₙ

Two conflicting goals:

1. Utility: Release accurate information

2. Privacy: Protect privacy of individual entries

Balancing the tradeoff is a difficult problem:

1. Netflix prize database attack [NS08]

2. Facebook advertisement system attack [Korolova11]

3. Amazon recommendation system attack [CKNFS11]

Data privacy is an active area of research:

• Computer science, economics, statistics, biology, social sciences …

Users

Page 9: From Stability to Differential Privacy

Differential Privacy [DMNS06, DKMMN06]

Intuition:

• Adversary learns essentially the same thing irrespective of your presence or absence in the data set

• D and D′ are called neighboring data sets

• Require: Neighboring data sets induce close distribution on outputs

[Figure: the mechanism M, with its random coins, applied to data set D = (d₁, …, dₙ) and to a neighboring data set D′ with one entry changed, producing outputs M(D) and M(D′)]

Page 10: From Stability to Differential Privacy

Differential Privacy [DMNS06, DKMMN06]

Definition:

A randomized algorithm M is (ε, δ)-differentially private if

Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] + δ

• for all data sets D and D′ that differ in one element
• for all sets of answers S

Page 11: From Stability to Differential Privacy

Semantics of Differential Privacy

• Differential privacy is a condition on the algorithm

• The guarantee is meaningful in the presence of any auxiliary information

• Typically, think of the privacy parameters as ε a small constant and δ ≪ 1/n, where n = # of data samples

• Composition: the ε's and δ's add up over multiple executions
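Basic composition is mechanical enough to state in code; a minimal Python sketch (the function name is mine):

```python
def basic_composition(params):
    """Basic composition: running k mechanisms that are (eps_i, delta_i)-DP
    on the same data yields a combined (sum eps_i, sum delta_i)-DP release."""
    eps = sum(e for e, _ in params)
    delta = sum(d for _, d in params)
    return eps, delta

# Two (0.5, 1e-6)-DP releases compose to a (1.0, 2e-6)-DP release.
total = basic_composition([(0.5, 1e-6), (0.5, 1e-6)])
```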

Page 12: From Stability to Differential Privacy

Laplace Mechanism [DMNS06]

Data set D ∈ Uⁿ and f: Uⁿ → ℝ a function on D

Sensitivity: S(f) = max over neighboring D, D′ of |f(D) − f(D′)|

1. Sample random variable Z from Lap(S(f)/ε)
2. Output f(D) + Z

Theorem (Privacy): The algorithm is ε-differentially private
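A minimal sketch of the Laplace mechanism in Python/NumPy (names mine; shown for a sensitivity-1 counting query):

```python
import numpy as np

def laplace_mechanism(dataset, f, sensitivity, epsilon, rng=None):
    """Release f(dataset) + Lap(sensitivity / epsilon)."""
    rng = rng or np.random.default_rng()
    return f(dataset) + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: privately release a count (a sensitivity-1 query).
data = [1, 0, 1, 1, 0, 1]
noisy_count = laplace_mechanism(data, sum, sensitivity=1.0, epsilon=0.5)
```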

Page 13: From Stability to Differential Privacy

This Talk

1. Differential privacy via stability arguments: A meta-algorithm

2. Sample and aggregate framework and private model selection

3. Non-private sparse linear regression in high-dimensions

4. Private sparse linear regression with (nearly) optimal rate

Page 14: From Stability to Differential Privacy

Perturbation stability (a.k.a. zero local sensitivity)

Page 15: From Stability to Differential Privacy

Perturbation Stability

Function f, data set D, output f(D)

Page 16: From Stability to Differential Privacy

Perturbation Stability

Function f, data set D, output f(D)

Stability of f at D: the output does not change on changing any one entry.
Equivalently, the local sensitivity of f at D is zero.

Page 17: From Stability to Differential Privacy

Distance to Instability Property

• Definition: A function f is k-stable at a data set D if for any data set D′ with |D △ D′| ≤ k, f(D′) = f(D)

• Distance to instability: dist_f(D) = the largest k such that f is k-stable at D

• Objective: Output f(D) while preserving differential privacy

[Figure: the space of all data sets, split into stable and unstable data sets, with the distance from D to the unstable region]

Page 18: From Stability to Differential Privacy

Propose-Test-Release (PTR) framework [DL09, KRSY11, Smith T.’13]

Page 19: From Stability to Differential Privacy

A Meta-algorithm: Propose-Test-Release (PTR)

Basic tool: Laplace mechanism

1. d̂ ← dist_f(D) + Lap(1/ε)
2. If d̂ > log(1/δ)/ε, then return f(D), else return ⊥

Theorem: The algorithm is (ε, δ)-differentially private

Theorem: If f is (2·log(1/δ)/ε)-stable at D, then w.p. at least 1 − δ the algorithm outputs f(D)
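The PTR meta-algorithm can be sketched as follows (my naming; the caller is assumed to supply a `dist_to_instability` function with global sensitivity one):

```python
import numpy as np

def propose_test_release(dataset, f, dist_to_instability, epsilon, delta, rng=None):
    """Release f(dataset) only if the noisy distance-to-instability clears the
    threshold; otherwise refuse (None plays the role of "bottom").
    Privacy relies on dist_to_instability having global sensitivity one."""
    rng = rng or np.random.default_rng()
    noisy_dist = dist_to_instability(dataset) + rng.laplace(scale=1.0 / epsilon)
    if noisy_dist > np.log(1.0 / delta) / epsilon:
        return f(dataset)
    return None
```

On a very stable instance the test passes with overwhelming probability; on an unstable one the algorithm almost always refuses.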

Page 20: From Stability to Differential Privacy

This Talk

1. Differential privacy via stability arguments: A meta-algorithm

2. Sample and aggregate framework and private model selection

3. Non-private sparse linear regression in high-dimensions

4. Private sparse linear regression with (nearly) optimal rate

Page 21: From Stability to Differential Privacy

Sample and aggregate framework [NRS07, Smith11, Smith T.’13]

Page 22: From Stability to Differential Privacy

Sample and Aggregate Framework

[Figure: data set D is subsampled into blocks D₁, …, D_m; the learning algorithm runs on each block, and an aggregator combines the m outputs into one]

Page 23: From Stability to Differential Privacy

Sample and Aggregate Framework

Theorem: If the aggregator is differentially private, then the overall framework is differentially private

Assumption: Each entry appears in only one data block

Proof: Each data entry affects only one data block

Page 24: From Stability to Differential Privacy

A differentially private aggregator using PTR framework [Smith T.’13]

Page 25: From Stability to Differential Privacy

A Differentially Private Aggregator

Assumption: r discrete candidate outputs S₁, S₂, …, S_r

[Figure: each block D₁, …, D_m votes for one candidate; a histogram of vote counts over S₁, S₂, …, S*, …, S_r]

Page 26: From Stability to Differential Privacy

PTR+Report-Noisy-Max Aggregator

Function f: candidate output with the maximum number of votes

1. d̂ ← dist_f(D) + Lap(1/ε)
2. If d̂ > log(1/δ)/ε, then return f(D), else return ⊥

Observation: dist_f(D) is (up to a constant) the gap between the counts of the highest and the second-highest scoring model.
Observation: The algorithm is always computationally efficient.
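A simplified sketch of this aggregator (construction details are mine: the data is split into disjoint blocks so one entry changes at most one vote, and half the vote-count gap serves as the distance to instability):

```python
import numpy as np
from collections import Counter

def ptr_noisy_max_aggregator(dataset, f, m, epsilon, delta, rng=None):
    """Split the data into m disjoint blocks, let each block vote f(block),
    and release the top-voted candidate only if the (noisy) gap between the
    top two counts clears the PTR threshold."""
    rng = rng or np.random.default_rng()
    blocks = [dataset[i::m] for i in range(m)]   # disjoint: one entry -> one block
    votes = Counter(f(b) for b in blocks).most_common()
    top, top_count = votes[0]
    second_count = votes[1][1] if len(votes) > 1 else 0
    # One entry flips at most one vote, so half the gap has sensitivity one.
    dist = (top_count - second_count) / 2.0
    if dist + rng.laplace(scale=1.0 / epsilon) > np.log(1.0 / delta) / epsilon:
        return top
    return None
```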

Page 27: From Stability to Differential Privacy

Analysis of the aggregator under subsampling stability [Smith T.’13]

Page 28: From Stability to Differential Privacy

Subsampling Stability

Data set D → m random subsamples D₁, …, D_m (with replacement)

Function f

Stability: f(Dᵢ) = f(D) w.p. ≥ 3/4 over the subsampling
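Subsampling stability can be checked empirically; a small sketch (names mine; the 3/4 threshold is the stability constant used in [Smith T.’13]):

```python
import numpy as np
from collections import Counter

def empirical_subsampling_stability(dataset, f, m, q, rng=None):
    """Draw m random subsamples (each entry kept independently w.p. q),
    apply f to each, and return the modal output together with the
    fraction of subsamples that produced it."""
    rng = rng or np.random.default_rng()
    outputs = []
    for _ in range(m):
        keep = rng.random(len(dataset)) < q
        outputs.append(f([d for d, k in zip(dataset, keep) if k]))
    mode, count = Counter(outputs).most_common(1)[0]
    return mode, count / m

# Example: the majority sign of a lopsided data set is highly subsampling stable.
data = [1] * 30 + [-1] * 5
sign_of_sum = lambda d: 1 if sum(d) >= 0 else -1
mode, frac = empirical_subsampling_stability(data, sign_of_sum, m=50, q=0.5)
```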

Page 29: From Stability to Differential Privacy

A Private Aggregator using Subsampling Stability

[Figure: voting histogram (in expectation) over candidates S₁, S₂, S*, …, S_r, with counts on the order of m/4, m/2, and 3m/4; the true answer S* receives the most votes]

• D₁, …, D_m: sample each entry from D w.p. q

• Each entry of D appears in about qm data blocks in expectation

Page 30: From Stability to Differential Privacy

PTR+Report-Noisy-Max Aggregator

• D₁, …, D_m: sample each entry from D w.p. q

• Each entry of D appears in at most about qm data blocks w.h.p.

1. If the noisy vote-count gap clears the PTR threshold, then return the top-voted candidate S*, else return ⊥

Page 31: From Stability to Differential Privacy

A Private Aggregator using Subsampling Stability

Theorem: The above algorithm is (ε, δ)-differentially private

Theorem: If f is q-subsampling stable on D (for a suitable choice of q and m), then with high probability the true answer f(D) is output

Notice: The utility guarantee does not depend on the number of candidate models

Page 32: From Stability to Differential Privacy

This Talk

1. Differential privacy via stability arguments: A meta-algorithm

2. Sample and aggregate framework and private model selection

3. Non-private sparse linear regression in high-dimensions

4. Private sparse linear regression with (nearly) optimal rate

Page 33: From Stability to Differential Privacy

Sparse linear regression in high-dimensions and the LASSO

Page 34: From Stability to Differential Privacy

Sparse Linear Regression in High-dimensions (p ≫ n)

• Data set: {(x₁, y₁), …, (xₙ, yₙ)} where xᵢ ∈ ℝᵖ and yᵢ ∈ ℝ

• Assumption: Data generated by a noisy linear system

yᵢ = ⟨xᵢ, θ*⟩ + wᵢ   (feature vector xᵢ, parameter vector θ* ∈ ℝᵖ, field noise wᵢ)

• Data normalization: the (normalized) data entries are sub-Gaussian

Page 35: From Stability to Differential Privacy

Sparse Linear Regression in High-dimensions (p ≫ n)

• Data set: (X, y) where X ∈ ℝⁿˣᵖ and y ∈ ℝⁿ

• Assumption: Data generated by a noisy linear system

y_{n×1} = X_{n×p} · θ*_{p×1} + w_{n×1}   (response vector = design matrix × parameter vector + field noise)

Page 36: From Stability to Differential Privacy

Sparse Linear Regression in High-dimensions (p ≫ n)

• Sparsity: θ* has s ≪ p non-zero entries

• Bounded norm: the norm of θ* is bounded (up to an arbitrarily small constant)

Model selection problem: Find the non-zero coordinates of θ*

y_{n×1} = X_{n×p} · θ*_{p×1} + w_{n×1}   (response vector = design matrix × parameter vector + field noise)

Page 37: From Stability to Differential Privacy

Sparse Linear Regression in High-dimensions (p ≫ n)

Model selection: Non-zero coordinates (or the support) of θ*

Solution: the LASSO estimator [Tibshirani94, EFJT03, Wainwright06, CT07, ZY07, …]:

θ̂ ∈ argmin_θ (1/(2n))·‖y − Xθ‖₂² + λ‖θ‖₁

y_{n×1} = X_{n×p} · θ*_{p×1} + w_{n×1}
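For concreteness, here is a minimal coordinate-descent LASSO in Python/NumPy used for model selection; this is my own illustrative implementation, not the private algorithm of the talk:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (1/(2n)) * ||y - X theta||_2^2 + lam * ||theta||_1."""
    n, p = X.shape
    theta = np.zeros(p)
    col_norm = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual with coordinate j removed.
            r = y - X @ theta + X[:, j] * theta[j]
            rho = X[:, j] @ r / n
            theta[j] = soft_threshold(rho, lam) / col_norm[j] if col_norm[j] > 0 else 0.0
    return theta

# Model selection: report the support of the LASSO estimate.
rng = np.random.default_rng(0)
n, p, s = 200, 50, 3
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[:s] = 1.0
y = X @ theta_star + 0.1 * rng.standard_normal(n)
support = set(np.flatnonzero(np.abs(lasso_coordinate_descent(X, y, lam=0.05)) > 1e-6))
```

In this well-conditioned synthetic setting the recovered support contains the true coordinates {0, 1, 2}.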

Page 38: From Stability to Differential Privacy

Consistency of the LASSO Estimator

Consistency conditions* [Wainwright06, ZY07]: Incoherence; Restricted Strong Convexity

• Γ: support of the underlying parameter vector θ*

[Figure: the design matrix X split into columns X_Γ (on the support) and X_{Γᶜ} (off the support)]

Page 39: From Stability to Differential Privacy

Consistency of the LASSO Estimator

Consistency conditions* [Wainwright06, ZY07]: Incoherence; Restricted Strong Convexity

• Γ: support of the underlying parameter vector θ*

Theorem*: Under proper choice of λ and n, the support of the LASSO estimator equals the support of θ*

Page 40: From Stability to Differential Privacy

Stochastic Consistency of the LASSO

Consistency conditions* [Wainwright06, ZY07]: Incoherence; Restricted Strong Convexity

• Γ: support of the underlying parameter vector θ*

Theorem [Wainwright06, ZY07]: If each data entry is drawn i.i.d. from a suitably well-behaved (e.g., sub-Gaussian) distribution, then the assumptions above are satisfied w.h.p.

Page 41: From Stability to Differential Privacy

We show [Smith,T.’13]:

Consistency conditions ⇒ Proxy conditions (efficiently testable with privacy) ⇒ Perturbation stability

Page 42: From Stability to Differential Privacy

This Talk

1. Differential privacy via stability arguments: A meta-algorithm

2. Sample and aggregate framework and private model selection

3. Non-private sparse linear regression in high-dimensions

4. Private sparse linear regression with (nearly) optimal rate

Page 43: From Stability to Differential Privacy

Interlude: A simple subsampling-based private LASSO algorithm [Smith,T.’13]

Page 44: From Stability to Differential Privacy

Notion of Neighboring Data sets

Data set D = (X, y): an n × p design matrix X with rows xᵢ and an n × 1 response vector y with entries yᵢ

Page 45: From Stability to Differential Privacy

Notion of Neighboring Data sets

Data set D′: identical to D except that one row (xᵢ, yᵢ) is replaced by (xᵢ′, yᵢ′)

D and D′ are neighboring data sets

Page 46: From Stability to Differential Privacy

Recap: Subsampling Stability

Data set D → m random subsamples D₁, …, D_m (with replacement)

Function f

Stability: f(Dᵢ) = f(D) w.p. ≥ 3/4 over the subsampling

Page 47: From Stability to Differential Privacy

Recap: PTR+Report-Noisy-Max Aggregator

Assumption: All candidate models come from a discrete set S₁, S₂, …, S_k

[Figure: the function f runs on each block D₁, …, D_m; each block votes for one candidate, giving a histogram of counts over S₁, S₂, …, S*, …, S_k]

Page 48: From Stability to Differential Privacy

Recap: PTR+Report-Noisy-Max Aggregator

• D₁, …, D_m: sample each entry from D w.p. q

• Each entry of D appears in at most about qm data blocks w.h.p.

• Fix the number of blocks m

1. If the noisy vote-count gap clears the PTR threshold, then return the top-voted candidate S*, else return ⊥

Page 49: From Stability to Differential Privacy

Subsampling Stability of the LASSO

Stochastic assumptions: Each data entry is drawn i.i.d.; the noise w is sub-Gaussian

y_{n×1} = X_{n×p} · θ*_{p×1} + w_{n×1}   (response vector = design matrix × parameter vector + field noise)

Page 50: From Stability to Differential Privacy

Subsampling Stability of the LASSO

Stochastic assumptions: Each data entry is drawn i.i.d.; the noise w is sub-Gaussian

Theorem [Wainwright06, ZY07]: Under proper choice of λ and n, the support of the LASSO estimator equals the support of θ*

Theorem: Under proper choice of λ, m, and n, the output of the aggregator equals the support of θ*

Notice the gap between the two sample-size requirements: the scale of n must grow in the private case.

Page 51: From Stability to Differential Privacy

Perturbation stability based private LASSO and optimal sample complexity [Smith,T.’13]

Page 52: From Stability to Differential Privacy

Recap: Distance to Instability Property

• Definition: A function f is k-stable at a data set D if for any data set D′ with |D △ D′| ≤ k, f(D′) = f(D)

• Distance to instability: dist_f(D) = the largest k such that f is k-stable at D

• Objective: Output f(D) while preserving differential privacy

[Figure: the space of all data sets, split into stable and unstable data sets, with the distance from D to the unstable region]

Page 53: From Stability to Differential Privacy

Recap: Propose-Test-Release Framework (PTR)

1. d̂ ← dist_f(D) + Lap(1/ε)
2. If d̂ > log(1/δ)/ε, then return f(D), else return ⊥

Theorem: The algorithm is (ε, δ)-differentially private

Theorem: If f is (2·log(1/δ)/ε)-stable at D, then w.p. at least 1 − δ the algorithm outputs f(D)

TBD: Some global-sensitivity-one query that stands in for the distance to instability

Page 54: From Stability to Differential Privacy

Instantiation of PTR for the LASSO

LASSO: θ̂ ∈ argmin_θ (1/(2n))·‖y − Xθ‖₂² + λ‖θ‖₁

• Set the function f(D) = support of θ̂

• Issue: For this f, the distance to instability might not be efficiently computable

Page 55: From Stability to Differential Privacy

From [Smith,T.’13]:

Consistency conditions ⇒ Proxy conditions (efficiently testable with privacy) ⇒ Perturbation stability

Page 56: From Stability to Differential Privacy

This talk:

Consistency conditions ⇒ Proxy conditions (efficiently testable with privacy) ⇒ Perturbation stability

Page 57: From Stability to Differential Privacy

Perturbation Stability of the LASSO

LASSO: θ̂ ∈ argmin_θ (1/(2n))·‖y − Xθ‖₂² + λ‖θ‖₁

Theorem: The consistency conditions on the LASSO are sufficient for perturbation stability

Proof Sketch:
1. Analyze the Karush-Kuhn-Tucker (KKT) optimality conditions at θ̂
2. Show that support(θ̂) is stable via a ‘‘dual certificate’’ on stable instances

Page 58: From Stability to Differential Privacy

Perturbation Stability of the LASSO

Proof Sketch: At a minimizer, zero lies in the sub-gradient of the LASSO objective:

0 ∈ ∂J_D(θ̂)   (LASSO objective J_D on data set D)

0 ∈ ∂J_{D′}(θ̂′)   (LASSO objective J_{D′} on neighboring data set D′)

Page 59: From Stability to Differential Privacy

Perturbation Stability of the LASSO

Proof Sketch: Argue using the optimality conditions of θ̂ and θ̂′:

1. No zero coordinate of θ̂ becomes non-zero in θ̂′ (use the mutual incoherence condition)

2. No non-zero coordinate of θ̂ becomes zero in θ̂′ (use the restricted strong convexity condition)

Page 60: From Stability to Differential Privacy

Perturbation Stability Test for the LASSO

Γ: support of θ̂; Γᶜ: complement of the support of θ̂

Test for the following (the real test is more complex):

• Restricted Strong Convexity (RSC): the minimum eigenvalue of (1/n)·X_Γᵀ X_Γ is large

• Strong stability: the (absolute) coordinates of the gradient of the least-squared loss in Γᶜ are bounded well below λ
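These two checks can be written down directly; a simplified sketch (as the slide warns, the real test is more complex; the thresholds, names, and exact margin form here are mine):

```python
import numpy as np

def lasso_stability_checks(X, y, theta_hat, lam, kappa_min):
    """Simplified proxy checks for perturbation stability of the LASSO support.
    RSC: smallest eigenvalue of (1/n) X_G^T X_G on the support G is >= kappa_min.
    Strong stability: off-support coordinates of the least-squares gradient sit
    strictly inside the [-lam, lam] KKT band."""
    n, p = X.shape
    support = np.flatnonzero(np.abs(theta_hat) > 1e-8)
    off = np.setdiff1d(np.arange(p), support)
    XS = X[:, support]
    rsc = np.linalg.eigvalsh(XS.T @ XS / n).min() if support.size else np.inf
    grad = -X.T @ (y - X @ theta_hat) / n          # gradient of least-squares loss
    margin = lam - (np.abs(grad[off]).max() if off.size else 0.0)
    return rsc >= kappa_min and margin > 0
```

On a tiny orthogonal design the checks behave as expected: the RSC eigenvalue and the gradient margin can both be computed by hand.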

Page 61: From Stability to Differential Privacy

Geometry of the Stability of LASSO

Intuition: Strong convexity ensures supp(θ̂) ⊆ supp(θ̂′)

1. Strong convexity ensures ‖θ̂ − θ̂′‖ is small

2. If the smallest non-zero coordinate of θ̂ is large, then no non-zero coordinate can vanish in θ̂′

3. The consistency conditions imply the smallest non-zero coordinate of θ̂ is large

[Figure: the LASSO objective along two dimensions of θ around the minimizer θ̂]

Page 62: From Stability to Differential Privacy

Geometry of the Stability of LASSO

Intuition: Strong stability ensures no zero coordinate in θ̂ becomes non-zero in θ̂′

• For the minimizer to move along a coordinate in Γᶜ, the perturbation to the gradient of the least-squared loss has to be large

[Figure: the LASSO objective along a coordinate in Γᶜ; the ℓ₁ term puts a kink at zero with slopes +λ and −λ, pinning the minimizer θ̂ at zero]

Page 63: From Stability to Differential Privacy

Geometry of the Stability of LASSO

Gradient of the least-squared loss: −Xᵀ(y − Xθ̂) = (a₁, …, a_p), with coordinates split across Γ and Γᶜ

• Strong stability: if |aᵢ| is well below λ for all i ∈ Γᶜ, then θ̂ has a sub-gradient of zero for the LASSO objective on a perturbed data set

[Figure: the LASSO objective along a coordinate in Γᶜ, with slopes +λ and −λ at the kink]

Page 64: From Stability to Differential Privacy

Making the Stability Test Private (Simplified)

Test for Restricted Strong Convexity: a margin g₁ (how far the minimum eigenvalue sits above the required threshold)

Test for strong stability: a margin g₂ (how far the off-support gradient coordinates sit below λ)

Issue: Computed directly on the data, g₁ and g₂ can have large sensitivities

Our solution: a proxy distance d̂ built from g₁ and g₂
• d̂ has global sensitivity of one
• On stable instances, g₁ and g₂ are both large and insensitive
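One generic way to build a sensitivity-one proxy from two high-sensitivity margins (a simplification of the construction hinted at on the slide; the per-margin sensitivity bounds s₁, s₂ are hypothetical inputs):

```python
def proxy_distance(g1, g2, s1, s2):
    """If g1 and g2 have global sensitivities at most s1 and s2, then each
    rescaled margin g_i / s_i changes by at most one between neighboring data
    sets, and so does their minimum: a global-sensitivity-one proxy distance."""
    return min(g1 / s1, g2 / s2)

# Large, insensitive margins -> large proxy distance, so the PTR test passes.
d_hat = proxy_distance(g1=12.0, g2=6.0, s1=2.0, s2=3.0)
```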

Page 65: From Stability to Differential Privacy

Private Model Selection with Optimal Sample Complexity

1. Compute d̂ = a function of g₁ and g₂
2. If d̂ + Lap(1/ε) > log(1/δ)/ε, then return support(θ̂), else return ⊥

Theorem: The algorithm is (ε, δ)-differentially private

Theorem: Under the consistency conditions and proper choice of λ and n, w.h.p. the support of θ* is output, with nearly optimal sample complexity

Page 66: From Stability to Differential Privacy

Thesis: Stable algorithms yield differentially private algorithms

Two notions of stability:

1. Perturbation stability

2. Subsampling stability

Page 67: From Stability to Differential Privacy

This Talk

1. Differential privacy via stability arguments: A meta-algorithm

2. Sample and aggregate framework and private model selection

3. Non-private sparse linear regression in high-dimensions

4. Private sparse linear regression with (nearly) optimal rate

Page 68: From Stability to Differential Privacy

Concluding Remarks

1. Sample and aggregate framework with the PTR+report-noisy-max aggregator is a generic tool for designing learning algorithms

• Example: learning with non-convex models [Bilenko,Dwork,Rothblum,T.]

2. The propose-test-release framework is an interesting tool whenever one can compute the distance to instability efficiently

3. Open problem: Private high-dimensional learning without assumptions like incoherence and restricted strong convexity