Semidefinite relaxations for certifying robustness to adversarial examples
Aditi Raghunathan
Jacob Steinhardt, Percy Liang
ML: Powerful But Fragile
• ML is successful on several tasks: object recognition, game playing, face recognition
• ML systems fail catastrophically in the presence of adversaries
• Different kinds of adversarial manipulation: data poisoning, manipulation of test inputs, model theft, membership inference, etc.
• Focus on adversarial examples: manipulation of test inputs
Adversarial Examples
Glasses → Impersonation [Sharif et al. 2016]
Banana + patch → Toaster [Brown et al. 2017]
Stop sign + sticker → Yield [Evtimov et al. 2017]
Adversarial Examples
3D Turtle → Rifle [Athalye et al. 2017]
Noise → “Ok Google” [Carlini et al. 2017]
Malware → Benign [Grosse et al. 2017]
What is an adversarial example?
Panda → Gibbon [Szegedy et al. 2014]
The definition of the attack model is usually application specific and complex.
We consider the well studied ℓ∞ attack model:
|x_adv − x|_i ≤ ε for i = 1, 2, …, d, i.e. x_adv ∈ B_ε(x)
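In code, the ℓ∞ attack model is just a coordinate-wise box around x. A minimal numpy sketch (the helper names `in_linf_ball` and `project_linf` are ours, for illustration only):

```python
import numpy as np

def in_linf_ball(x_adv, x, eps):
    """Check the attack-model constraint |x_adv - x|_i <= eps for all i."""
    return bool(np.all(np.abs(x_adv - x) <= eps + 1e-12))

def project_linf(x_adv, x, eps):
    """Project an arbitrary point back into B_eps(x) (coordinate-wise clip)."""
    return np.clip(x_adv, x - eps, x + eps)

x = np.array([0.2, 0.5, 0.9])
eps = 0.1
x_adv = project_linf(x + np.array([0.3, -0.05, 0.0]), x, eps)
print(in_linf_ball(x_adv, x, eps))  # True
```

Iterative attacks such as PGD alternate a gradient step with exactly this projection.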
History
• [Szegedy+ 2014]: First to discover adversarial examples
• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM
• [Papernot+ 2015]: Defensive distillation
• [Carlini & Wagner 2016]: Distillation is not secure
• [Papernot+ 2017]: Better distillation
• [Carlini & Wagner 2017]: Ten detection strategies fail
• [Madry+ 2017]: AT against PGD, informal argument about optimality
• [Lu+, July 12 2017]: “NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles”
• [Athalye & Sutskever, July 17 2017]: Break the above defense
• [Athalye, Carlini & Wagner 2018]: Break 6 out of 7 ICLR defenses

Hard to defend, even in this well defined model…
Provable robustness
Can we get robustness to all attacks?
Let f(x) be the scoring function; the adversary wants to maximize f(x).
Attacks generate points in B_ε(x):
A_fgsm(x) = x + ε · sign(∇f(x))
A_opt(x) = argmax_{x̃ ∈ B_ε(x)} f(x̃)
The network is provably robust if the optimal attack fails:
f⋆ ≡ f(A_opt(x)) < 0
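The two attacks are easy to sketch for a toy linear score f(x) = w · x, where the FGSM step is in fact exactly optimal over B_ε(x) (for general networks it is only a heuristic; the names below are illustrative):

```python
import numpy as np

def f(x, w):
    # Toy differentiable scoring function: f(x) = w . x
    return float(w @ x)

def grad_f(x, w):
    return w  # gradient of a linear score

def fgsm(x, w, eps):
    """One-step FGSM attack: move eps in the sign of the gradient.
    Exactly optimal for linear f; a heuristic for general networks."""
    return x + eps * np.sign(grad_f(x, w))

x = np.zeros(4)
w = np.array([1.0, -2.0, 0.5, 0.0])
x_adv = fgsm(x, w, 0.1)
# For linear f, A_fgsm attains the optimum: f increases by eps * ||w||_1 = 0.35
print(round(f(x_adv, w) - f(x, w), 6))
```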
Provable robustness
The network is provably robust if f⋆ ≡ f(A_opt(x)) < 0, but computing f⋆ is intractable in general.
• Combinatorial approaches to compute f⋆:
  • SMT-based Reluplex [Katz+ 2018]
  • MILP-based with specialized preprocessing [Tjeng+ 2018]
• Convex relaxations to compute an upper bound f_upper on f⋆:
  • Upper bound negative ⟹ optimal attack fails
  • Computationally efficient
Three outcomes:
  f⋆ ≤ f_upper < 0: robust and certified
  f⋆ < 0 ≤ f_upper: robust but not certified
  0 ≤ f⋆ ≤ f_upper: not robust
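As one deliberately crude example of an efficient upper bound, interval arithmetic propagates the box B_ε(x) through a one-hidden-layer ReLU network. This is a stand-in to illustrate the certification logic, not a method from the talk, and it is generally looser than the LP or SDP certificates discussed here:

```python
import numpy as np

def interval_bound(W1, W2, x, eps):
    """Cheap upper bound on f* = max_{x~ in B_eps(x)} W2 . ReLU(W1 x~)
    via interval arithmetic (looser than LP/SDP relaxations)."""
    lo, hi = x - eps, x + eps
    # Propagate the box through the linear layer: center +/- radius.
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    c, r = W1 @ mid, np.abs(W1) @ rad
    z_lo, z_hi = np.maximum(c - r, 0), np.maximum(c + r, 0)  # ReLU is monotone
    # Pick the favorable end of each interval for the output weights.
    return float(np.maximum(W2, 0) @ z_hi + np.minimum(W2, 0) @ z_lo)

W1 = np.array([[1.0, -1.0], [2.0, 1.0]])
W2 = np.array([1.0, -1.0])
x = np.array([0.5, 0.5])
upper = interval_bound(W1, W2, x, 0.1)
# Any concrete attack point gives a lower bound, so f* <= upper must hold:
x_adv = np.array([0.6, 0.4])
f_adv = W2 @ np.maximum(W1 @ x_adv, 0)
print(f_adv <= upper)  # True
```

Here `upper` is negative, so this tiny network would be certified robust at x; when `upper` is positive the certificate is simply inconclusive.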
Two layer networks
Key idea: uniformly bound the gradient:
f(x̃) ≤ f(x) + ε · max_{x̃ ∈ B_ε(x)} ‖∇f(x̃)‖₁
Bounding the gradient requires optimizing over the ReLU activations and over the signs of the perturbation.
Final step: an SDP relaxation (similar to MAXCUT) leads to Grad-cert.
Relaxation Training
Training a neural network against the certificate.
Objective: the relaxation bound is differentiable, but its gradients are expensive to compute.
Duality to the rescue! Regularizer:
D · λ⁺_max(M(v, W) − diag(c)) + 1ᵀ max(c, 0)
Just one maximum-eigenvalue computation per gradient step.
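That one maximum-eigenvalue computation can be done with plain power iteration. A generic sketch (not the authors' implementation; λ⁺_max in the regularizer denotes max(λ_max, 0)):

```python
import numpy as np

def lambda_max(M, iters=500):
    """Largest eigenvalue of a symmetric matrix via power iteration.
    Shifting by a bound on the spectral radius makes all eigenvalues
    positive, so the iteration converges to the top eigenvector of M."""
    n = M.shape[0]
    shift = np.abs(M).sum()          # crude bound on the spectral radius
    A = M + shift * np.eye(n)
    v = np.ones(n) / np.sqrt(n)
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return float(v @ M @ v)          # Rayleigh quotient at convergence

M = np.array([[2.0, 1.0], [1.0, -3.0]])
print(round(lambda_max(M), 4))  # (-1 + sqrt(29)) / 2 = 2.1926
```

Because only a matrix-vector product is needed per step, this costs far less than solving the full SDP at every training iteration.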
Results on MNIST
Attack: the Projected Gradient Descent (PGD) attack of Madry et al. 2018, which gives a lower bound on f⋆.
Adversarial training minimizes this lower bound on the training set; on such networks the gradient based upper bound (Grad-cert) is quite loose.
Training with Grad-cert instead minimizes the gradient based upper bound, and the bound becomes much tighter.
Results on MNIST
Training a network to minimize the gradient upper bound finds networks where the bound is tight.
Comparison with Wong and Kolter 2018 (LP-cert):

Network  | PGD-attack | LP-cert | Grad-cert
Grad-NN  | 15%        | 97%     | 35%
LP-NN    | 22%        | 26%     | 93%

Bounds are tight when you train against them, but tight only when you train against them.
Some networks are empirically robust but not certified (e.g. adversarial training of Madry et al. 2018).
Can we certify such “foreign” networks?
Summary so far…
• Certified robustness: relaxed optimization to bound the worst-case attack
• Grad-cert: upper bound on the worst-case attack using a uniform bound on the gradient
• Training against the bound makes it tight
• LP-cert and Grad-cert are tight only on networks trained against them
• Goal: efficiently certify foreign multi-layer networks
New SDP-cert relaxation
Network: x → x¹ → x² → x³ ≡ x^L, with weights W₀, W₁, W₂.
Attack model constraints: |x̃ − x|_i ≤ ε for i = 1, 2, …, d
Neural net constraints: xⁱ = ReLU(W_{i−1} x^{i−1}) for i = 1, 2, …, L
Objective: f⋆ = max_x̃ (c_ȳ − c_y)ᵀ x^L, the margin of an incorrect class ȳ over the true class y.
The source of non-convexity is the ReLU constraints.
Handling ReLU constraints
Consider a single ReLU constraint z = max(0, x).
Key insight: it can be replaced by linear + quadratic constraints:
  z ≥ x (linear)
  z ≥ 0 (linear)
  → z is greater than both x and 0
  z(z − x) = 0 (quadratic)
  → z is equal to one of x, 0
We can relax the quadratic constraints to get a semidefinite program.
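A quick numerical check that the three constraints exactly characterize the ReLU graph (the helper name is ours):

```python
import numpy as np

def satisfies_relu_constraints(x, z, tol=1e-9):
    """The linear + quadratic system replacing z = max(0, x):
    z >= x and z >= 0 (z at least both), z*(z - x) = 0 (z equals one)."""
    return z >= x - tol and z >= -tol and abs(z * (z - x)) <= tol

rng = np.random.default_rng(0)
for x in rng.normal(size=100):
    z = max(0.0, x)
    assert satisfies_relu_constraints(x, z)            # the ReLU point passes
    assert not satisfies_relu_constraints(x, z + 1.0)  # a perturbed z fails
print("equivalence holds on samples")
```

The converse also holds: z(z − x) = 0 forces z ∈ {0, x}, and the two linear inequalities then pick out z = max(0, x).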
SDP relaxation
Single ReLU constraint z = max(0, x) ≡ linear + quadratic constraints.
Collect the monomials into the matrix

M = [ 1   x    z  ]
    [ x   x²   xz ]
    [ z   xz   z² ]

The ReLU constraints z ≥ x, z ≥ 0, and z(z − x) = 0 (i.e. z² = xz) become linear constraints on the entries of M.
Constraint on M:
  M = vvᵀ: exact but a non-convex set
  M ⪰ 0 (equivalently M = VVᵀ): relaxed, convex set
Generalizes to multiple layers: one large matrix M with all activations.
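The moment matrix and its constraints, made concrete for one feasible ReLU point (an illustrative sketch; an actual certificate would optimize over all M ⪰ 0 with an SDP solver, which is not shown here):

```python
import numpy as np

def moment_matrix(x, z):
    """Rank-one moment matrix M = v v^T for v = (1, x, z).
    The ReLU constraints become *linear* constraints on entries of M:
    M[2,0] >= M[1,0]  (z >= x),  M[2,0] >= 0  (z >= 0),
    M[2,2] == M[2,1]  (z^2 = xz, i.e. z(z - x) = 0)."""
    v = np.array([1.0, x, z])
    return np.outer(v, v)

x = -0.7
z = max(0.0, x)
M = moment_matrix(x, z)
linear_ok = M[2, 0] >= M[1, 0] and M[2, 0] >= 0 and abs(M[2, 2] - M[2, 1]) < 1e-9
psd_ok = np.all(np.linalg.eigvalsh(M) >= -1e-9)  # the relaxed constraint M >= 0
print(bool(linear_ok and psd_ok))  # True
```

Every true network execution yields such a rank-one, PSD moment matrix, which is why the relaxation M ⪰ 0 can only enlarge the feasible set (giving an upper bound on f⋆).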
SDP relaxation: interaction between different hidden units
Example: x₁, x₂ ∈ [−ε, ε], with
  z₁ = ReLU(x₁ + x₂)
  z₂ = ReLU(x₁ − x₂)
(e.g. the unrelaxed value at x₁ = x₂ = 0.5ε is z₁ = ε, z₂ = 0)
The LP treats units independently; the SDP reasons about them jointly.
Theorem: For a random two-layer network with m hidden nodes and input dimension d, opt(LP) = Θ(md) and opt(SDP) = Θ(m√d + d√m).
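A tiny numeric illustration of the gap (ε = 1 for concreteness; grid search stands in for the exact maximum of z₁ + z₂):

```python
import numpy as np

eps = 1.0
z = lambda x1, x2: max(x1 + x2, 0.0) + max(x1 - x2, 0.0)

# True joint maximum of z1 + z2 over the box, by grid search:
grid = np.linspace(-eps, eps, 201)
true_max = max(z(a, b) for a in grid for b in grid)

# A bound in the style of a per-unit LP: each unit alone can reach 2*eps
# (at different corners of the box), so treating them independently
# bounds the sum by 4*eps, even though no single point achieves it.
independent_bound = 2 * eps + 2 * eps

print(round(true_max, 9), independent_bound)  # 2.0 4.0
```

Jointly, z₁ + z₂ can never exceed 2ε (when both pre-activations are positive the sum collapses to 2x₁ ≤ 2ε), which is exactly the kind of correlation the SDP captures and a per-unit LP misses.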
Results on MNIST
Three different robust networks:
  Grad-NN [Raghunathan et al. 2018]
  LP-NN [Wong and Kolter 2018]
  PGD-NN [Madry et al. 2018]

           | Grad-NN | LP-NN | PGD-NN
Grad-cert  | 35%     | 93%   | N/A
LP-cert    | 97%     | 22%   | 100%
SDP-cert   | 20%     | 20%   | 18%
PGD-attack | 15%     | 18%   | 9%

SDP provides good certificates on all three different networks.
Results on MNIST
PGD-NN [Madry et al. 2018]: uncertified points are more vulnerable to attack.
Scaling up…
In general, CNNs are more robust than fully connected networks, but off-the-shelf SDP solvers do not exploit the CNN structure.
Ongoing work:
• First-order SDP solvers based on matrix-vector products
• Exploit efficient CNN implementations in TensorFlow
Concurrent work:
• MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]
• Scaling up LP based methods [Dvijotham+ 2018, Wong and Kolter 2018]
Summary
• Robustness for the ℓ∞ attack model
• Certified evaluation to avoid an arms race
• Presented two different relaxations for certification
• Adversarial examples more broadly…
  • Does there exist a mathematically well defined attack model?
  • Would current techniques (deep learning + appropriate regularization) transfer to such an attack model?
• Secure vs. better models?
  • Adversarial examples expose limitations of current systems
  • How do we get models to learn “the right thing”?
Thank you!
Joint work with Jacob Steinhardt and Percy Liang
“Certified Defenses against Adversarial Examples”, https://arxiv.org/abs/1801.09344 [ICLR 2018]
“Semidefinite Relaxations for Certifying Robustness to Adversarial Examples”, https://arxiv.org/abs/1811.01057 [NeurIPS 2018]
Open Philanthropy Project