192
Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt Percy Liang Aditi Raghunathan

Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Semidefinite relaxations for certifying robustness to

adversarial examples

Jacob Steinhardt Percy Liang

Aditi Raghunathan

Page 2: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

2

Page 3: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile

2

Page 4: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile• ML is successful on several tasks: object recognition, game playing, face

recognition

2

Page 5: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile• ML is successful on several tasks: object recognition, game playing, face

recognition

• ML systems fail catastrophically in presence of adversaries

2

Page 6: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile• ML is successful on several tasks: object recognition, game playing, face

recognition

• ML systems fail catastrophically in presence of adversaries

2

Page 7: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile• ML is successful on several tasks: object recognition, game playing, face

recognition

• ML systems fail catastrophically in presence of adversaries

2

• Different kinds of adversarial manipulations — data poisoning, manipulation of test inputs, model theft, membership inference etc.

Page 8: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

ML: Powerful But Fragile• ML is successful on several tasks: object recognition, game playing, face

recognition

• ML systems fail catastrophically in presence of adversaries

2

• Different kinds of adversarial manipulations — data poisoning, manipulation of test inputs, model theft, membership inference etc.

• Focus on adversarial examples — manipulation of test inputs

Page 9: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

3

Page 10: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

[Sharif et al. 2016]

Glasses ! Impersonation

3

Page 11: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

[Sharif et al. 2016]

Glasses ! Impersonation Banana + patch !Toaster

[Brown et al. 2017]

3

Page 12: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

[Sharif et al. 2016]

Glasses ! Impersonation Banana + patch !Toaster Stop + sticker !Yield

[Brown et al. 2017] [Evtimov et al. 2017]

3

Page 13: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

4

Page 14: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

3D Turtle ! Rifle

[Athalye et al. 2017]

4

Page 15: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

3D Turtle ! Rifle

[Athalye et al. 2017]

Noise ! “Ok Google”

[Carlini et al. 2017]

4

Page 16: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Adversarial Examples

3D Turtle ! Rifle

[Athalye et al. 2017]

Noise ! “Ok Google”

[Carlini et al. 2017]

Malware ! Benign

[Grosse et al. 2017]

4

Page 17: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

5

What is an adversarial example?

Page 18: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Definition of attack model usually application specific and complex

5

What is an adversarial example?

Page 19: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Definition of attack model usually application specific and complex

We consider the well studied attack model `1

5

What is an adversarial example?

Page 20: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Definition of attack model usually application specific and complex

We consider the well studied attack model `1

5

What is an adversarial example?

Panda Gibbon

Szegedy et al. 2014

Page 21: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Definition of attack model usually application specific and complex

We consider the well studied attack model `1

5

What is an adversarial example?

|xadv � x|i ✏ for i = 1, 2, . . . d

Panda Gibbon

Szegedy et al. 2014

Page 22: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Definition of attack model usually application specific and complex

We consider the well studied attack model `1

5

What is an adversarial example?

|xadv � x|i ✏ for i = 1, 2, . . . d xadv 2 B✏(x)

Panda Gibbon

Szegedy et al. 2014

Page 23: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History

6

Page 24: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History

6

Hard to defend even in this well defined model…

Page 25: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

6

Hard to defend even in this well defined model…

Page 26: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

6

Hard to defend even in this well defined model…

Page 27: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

6

Hard to defend even in this well defined model…

Page 28: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

6

Hard to defend even in this well defined model…

Page 29: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

6

Hard to defend even in this well defined model…

Page 30: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

6

Hard to defend even in this well defined model…

Page 31: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

• [Madry+ 2017]: AT against PGD, informal argument about optimality

6

Hard to defend even in this well defined model…

Page 32: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

• [Madry+ 2017]: AT against PGD, informal argument about optimality

• [Lu + July 12 2017]: ”NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles”

6

Hard to defend even in this well defined model…

Page 33: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

• [Madry+ 2017]: AT against PGD, informal argument about optimality

• [Lu + July 12 2017]: ”NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles”

• [Athalye and Sutskever July 17 2017]: Break above defense

6

Hard to defend even in this well defined model…

Page 34: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

• [Madry+ 2017]: AT against PGD, informal argument about optimality

• [Lu + July 12 2017]: ”NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles”

• [Athalye and Sutskever July 17 2017]: Break above defense

• [Athalye, Carlini, Wagner]: Break 6 out of 7 ICLR defenses 6

Hard to defend even in this well defined model…

Page 35: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

History• [Szegedy+ 2014]: First discover adversarial examples

• [Goodfellow+ 2015]: Adversarial training (AT) against FGSM

• [Papernot+ 2015]: Defensive Distillation

• [Carlini & Wagner 2016]: Distillation is not secure

• [Papernot + 2017]: Better distillation

• [Carlini & Wagner 2017]: Ten detection strategies fail

• [Madry+ 2017]: AT against PGD, informal argument about optimality

• [Lu + July 12 2017]: ”NO Need to Worry about Adversarial Examples in Object Detection in Autonomous Vehicles”

• [Athalye and Sutskever July 17 2017]: Break above defense

• [Athalye, Carlini, Wagner]: Break 6 out of 7 ICLR defenses 6

Hard to defend even in this well defined model…

Vs.

Page 36: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

7

Page 37: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Page 38: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Page 39: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

f(x)Let be the scoring function and adversary wants to maximizef(x)

Page 40: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

f(x)Let be the scoring function and adversary wants to maximizef(x)

Page 41: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Attacks: Generate points in B✏(x)

f(x)Let be the scoring function and adversary wants to maximizef(x)

Page 42: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Attacks: Generate points in B✏(x)

Afgsm(x) = x+ ✏ sign�rf(x)

f(x)Let be the scoring function and adversary wants to maximizef(x)

Page 43: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Attacks: Generate points in B✏(x)

Afgsm(x) = x+ ✏ sign�rf(x)

f(x)Let be the scoring function and adversary wants to maximizef(x)

Aopt(x) = argmaxx

f(x)

Page 44: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Attacks: Generate points in B✏(x)

Afgsm(x) = x+ ✏ sign�rf(x)

Network is provably robust if optimal attack fails

f(x)Let be the scoring function and adversary wants to maximizef(x)

Aopt(x) = argmaxx

f(x)

Page 45: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Provable robustness

7

Can we get robustness to all attacks?

Attacks: Generate points in B✏(x)

Afgsm(x) = x+ ✏ sign�rf(x)

Network is provably robust if optimal attack fails

f(x)Let be the scoring function and adversary wants to maximizef(x)

Aopt(x) = argmaxx

f(x)

f? ⌘ f(Aopt(x)) < 0

Page 46: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Page 47: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

Provable robustnessf? ⌘ f(Aopt(x)) < 0

Page 48: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

Provable robustnessf? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 49: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

Provable robustness

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 50: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

Provable robustness

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 51: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

Provable robustness

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 52: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

Provable robustness

f?

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 53: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

• Upper bound is negative optimal attack fails =)

Provable robustness

f?

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 54: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

• Upper bound is negative optimal attack fails

• Computationally efficient upper bound =)

Provable robustness

f?

f?

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 55: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

• Upper bound is negative optimal attack fails

• Computationally efficient upper bound =)

Provable robustness

f?

f?

0 f?

Not robust

fupper

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 56: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

• Upper bound is negative optimal attack fails

• Computationally efficient upper bound =)

Provable robustness

f?

f?

0 f?

Not robust

fupper 0f? fupper

Robust and certified

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 57: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

8

Network is provably robust if

• Combinatorial approaches to compute

• SMT based Reluplex [Katz+ 2018]

• MILP based with specialized preprocessing [Tjeng+ 2018]

• Convex relaxations to compute upper bound on

• Upper bound is negative optimal attack fails

• Computationally efficient upper bound =)

Provable robustness

f?

f?

0 f?

Not robust

fupper 0f? fupper

Robust and certified

0f? fupper

Robust and not certified

f? ⌘ f(Aopt(x)) < 0

Computing is intractable in generalf?

Page 58: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

9

Page 59: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

9

Page 60: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

9

Page 61: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

9

Page 62: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

9

Key idea: Uniformly bound gradients

Page 63: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

f(x) f(x) + ✏maxx

krf(x)k19

Key idea: Uniformly bound gradients

Page 64: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

10

Page 65: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

10

Page 66: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 67: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 68: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 69: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 70: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

optimize over activations

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 71: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

optimize over activations

optimize over signs of perturbation

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 72: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Two layer networks

Bound on gradient:

optimize over activations

optimize over signs of perturbation

Final step: SDP relaxation (similar to MAXCUT) leads to Grad-cert

f(x) f(x) + ✏maxx

krf(x)k1

10

Key idea: Uniformly bound gradients

Page 73: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation Training

11

Page 74: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

11

Page 75: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

11

Page 76: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

11

Page 77: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

Differentiable objective but expensive gradients

11

Page 78: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

Differentiable objective but expensive gradients

Duality to the rescue!

11

Page 79: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

Differentiable objective but expensive gradients

Duality to the rescue!

Regularizer:

11

D · �+max

�(M(v,W )� diag(c)

�+ 1> max(c, 0)

Page 80: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Relaxation TrainingTraining a neural network

Objective:

Differentiable objective but expensive gradients

Duality to the rescue!

Regularizer:

Just one max eigenvalue computation for gradients

11

D · �+max

�(M(v,W )� diag(c)

�+ 1> max(c, 0)

Page 81: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

12

Page 82: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Adversarial training: Minimizes this lower bound on training set

12

Page 83: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Adversarial training: Minimizes this lower bound on training set

12

Page 84: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

13

Page 85: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Adversarial training: Minimizes this lower bound on training set

13

Page 86: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Adversarial training: Minimizes this lower bound on training set

13

Gradient based bound is quite loose

Page 87: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

14

Train with Grad-cert(attack)

Page 88: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Our method: Minimize gradient based upper bound 14

Train with Grad-cert(attack)

Page 89: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

15

Train with Grad-cert(attack)Train with Grad-cert(Certified)

Page 90: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Our method: Minimize gradient based upper bound 15

Train with Grad-cert(attack)Train with Grad-cert(Certified)

Page 91: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

Attack: Projected Gradient Descent attack of Madry et al. 2018 Our method: Minimize gradient based upper bound 15

Train with Grad-cert(attack)Train with Grad-cert(Certified)

Gradient based bound is much better!

Page 92: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

16

Page 93: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

16

Training a network to minimize gradient upper bound finds networks where the bound is tight

Page 94: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

16

Training a network to minimize gradient upper bound finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Page 95: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

16

Training a network to minimize gradient upper bound finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

Page 96: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

16

Training a network to minimize gradient upper bound finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Bounds are tight when you train

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

Page 97: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

17

Page 98: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNISTTraining a network to minimize gradient upper bound

finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Bounds are tight when you train

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

17

Page 99: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNISTTraining a network to minimize gradient upper bound

finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Bounds are tight when you train Bounds are tight only when you train

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

17

Page 100: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNISTTraining a network to minimize gradient upper bound

finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Bounds are tight when you train

Some networks are empirically robust but not certified (e.g. Adversarial Training of Madry et al. 2018)

Bounds are tight only when you train

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

17

Page 101: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNISTTraining a network to minimize gradient upper bound

finds networks where the bound is tight

Comparison with Wong and Kolter 2018 (LP-cert)

Bounds are tight when you train

Some networks are empirically robust but not certified (e.g. Adversarial Training of Madry et al. 2018)

Can we certify such “foreign” networks?

Bounds are tight only when you train

Network PGD-attack LP-cert Grad-cert

LP-NN 22% 26% 93%Grad-NN 15% 97% 35%

17

Page 102: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

18

Page 103: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

18

Page 104: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

• Certified robustness: relaxed optimization to bound worst-case attack

18

Page 105: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

• Certified robustness: relaxed optimization to bound worst-case attack

• Grad-cert: Upper bound on worst case attack using uniform bound on

gradient

18

Page 106: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

• Certified robustness: relaxed optimization to bound worst-case attack

• Grad-cert: Upper bound on worst case attack using uniform bound on

gradient

• Training against the bound makes it tight

18

Page 107: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

• Certified robustness: relaxed optimization to bound worst-case attack

• Grad-cert: Upper bound on worst case attack using uniform bound on

gradient

• Training against the bound makes it tight

• LP-cert and Grad-cert are tight only on training

18

Page 108: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary so far…

• Certified robustness: relaxed optimization to bound worst-case attack

• Grad-cert: Upper bound on worst case attack using uniform bound on

gradient

• Training against the bound makes it tight

• LP-cert and Grad-cert are tight only on training

• Goal: Efficiently certify foreign multi-layer networks

18

Page 109: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

Page 110: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Page 111: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Page 112: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Page 113: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxationx x1 x2 x3 ⌘ xL

Page 114: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Attack model constraints:

x x1 x2 x3 ⌘ xL

Page 115: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2 x3 ⌘ xL

Page 116: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2

for i = 1, 2, . . . Lxi = ReLU(Wi�1xi�1)

x3 ⌘ xL

Neural net constraintsW2W1W0

Page 117: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2

for i = 1, 2, . . . Lxi = ReLU(Wi�1xi�1)

Objective

x3 ⌘ xL

Neural net constraintsW2W1W0

Page 118: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

cy

cy

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2

for i = 1, 2, . . . Lxi = ReLU(Wi�1xi�1)

Objective

x3 ⌘ xL

Neural net constraintsW2W1W0

Page 119: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

New SDP-cert relaxation

cy

cy

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2

for i = 1, 2, . . . Lxi = ReLU(Wi�1xi�1)

Objective

x3 ⌘ xL

Neural net constraints

f? = maxx

(cy � cy)>xL

W2W1W0

Page 120: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

19

Source of non-convexity is the ReLU constraints

New SDP-cert relaxation

cy

cy

Attack model constraints:|x� x|i ✏

for i = 1, 2, . . . d

x x1 x2

for i = 1, 2, . . . Lxi = ReLU(Wi�1xi�1)

Objective

x3 ⌘ xL

Neural net constraints

f? = maxx

(cy � cy)>xL

W2W1W0

Page 121: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

20

Page 122: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraints

20

Page 123: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint z = max(0, x)

20

Page 124: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

20

Page 125: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � x Linear

20

Page 126: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

Linear

Linear

20

Page 127: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

Linear

Linear

20

{is greater than z x, 0

Page 128: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

z(z � x) = 0 Quadratic

Linear

Linear

20

{is greater than z x, 0

Page 129: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

z(z � x) = 0 Quadratic

Linear

Linear

20

{is greater than z x, 0

z equal to one of x, 0

Page 130: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

z(z � x) = 0 Quadratic

Linear

Linear

20

{is greater than z x, 0

z equal to one of x, 0

Page 131: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Handling ReLU constraintsConsider single ReLU constraint

Key insight: Can be replaced by linear + quadratic constraints

z = max(0, x)

z � 0

z � x

z(z � x) = 0 Quadratic

Linear

Linear

20

{is greater than z x, 0

z equal to one of x, 0

Can relax quadratic constraints to get a semidefinite program

Page 132: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

21

Page 133: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Page 134: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

Page 135: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Page 136: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5 z � x

Page 137: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5 z � x

Page 138: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5 z � 0

Page 139: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5 z(z � x) = 0

z2 = xz

Page 140: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Page 141: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5 ReLU constraints as linear constraints on matrix entries

Page 142: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Constraint on M

ReLU constraints as linear constraints on matrix entries

Page 143: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Constraint on M

M = vv> Exact but non-convex set

ReLU constraints as linear constraints on matrix entries

Page 144: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Constraint on M

M = vv> Exact but non-convex set

M = V V > Relaxed and convex set

ReLU constraints as linear constraints on matrix entries

Page 145: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Constraint on M

M = vv> Exact but non-convex set

M = V V > Relaxed and convex set

ReLU constraints as linear constraints on matrix entries

M ⌫ 0

Page 146: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

21

Single ReLU constraint ⌘ Linear + Quadratic constraintsz = max(0, x)

M =

2

41 x zx x2 xzz xz z2

3

5

Constraint on M

M = vv> Exact but non-convex set

M = V V > Relaxed and convex set

ReLU constraints as linear constraints on matrix entries

M ⌫ 0

Generalizes to multiple layers: large matrix M with all activations

Page 147: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxation

22

Page 148: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

22

Page 149: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

22

Page 150: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

22

z1 = ReLU(x1 + x2)

z2 = ReLU(x1 � x2)

Page 151: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

x1 = x2 = 0.5✏

22

z1 = ReLU(x1 + x2)

z2 = ReLU(x1 � x2)

Page 152: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

x1 = x2 = 0.5✏

Unrelaxed value

22

z1 = ReLU(x1 + x2)

z2 = ReLU(x1 � x2)

Page 153: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

x1 = x2 = 0.5✏

Unrelaxed value

LP treats units independently SDP reasons jointly

22

z1 = ReLU(x1 + x2)

z2 = ReLU(x1 � x2)

Page 154: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

SDP relaxationInteraction between different hidden units

x1, x2 2 [�✏, ✏]

x1 = x2 = 0.5✏

Unrelaxed value

LP treats units independently SDP reasons jointly

Theorem: For a random two layer network with hidden nodes and input dimension , opt(LP) = and opt(SDP) =

md ⇥(md) ⇥(m

pd+ d

pm)

22

z1 = ReLU(x1 + x2)

z2 = ReLU(x1 � x2)

Page 155: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

23

Page 156: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Page 157: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Page 158: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Grad-NN [Raghunathan et al. 2018]

Page 159: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Grad-NN [Raghunathan et al. 2018]

LP-NN [Wong and Kolter 2018]

Page 160: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Grad-NN [Raghunathan et al. 2018]

LP-NN [Wong and Kolter 2018]

PGD-NN [Madry et al. 2018]

Page 161: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Grad-NN [Raghunathan et al. 2018]

LP-NN [Wong and Kolter 2018]

PGD-NN [Madry et al. 2018]

Grad-NN LP-NN PGD-NN

Grad-cert 35% 93% N/A

LP-cert 97% 22% 100%

SDP-cert 20% 20% 18%

PGD-attack 15% 18% 9%

Page 162: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

23

Three different robust networks

Grad-NN [Raghunathan et al. 2018]

LP-NN [Wong and Kolter 2018]

PGD-NN [Madry et al. 2018]

Grad-NN LP-NN PGD-NN

Grad-cert 35% 93% N/A

LP-cert 97% 22% 100%

SDP-cert 20% 20% 18%

PGD-attack 15% 18% 9%

SDP provides good certificates on all three different networks

Page 163: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

24

Page 164: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

24

PGD-NN [Madry et al. 2018]

Page 165: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

24

PGD-NN [Madry et al. 2018]

Page 166: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Results on MNIST

24

PGD-NN [Madry et al. 2018]

Uncertified points are more vulnerable to attack

Page 167: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…

25

Page 168: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

25

Page 169: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

25

Page 170: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

25

Page 171: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

25

Page 172: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

25

Page 173: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

25

Page 174: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

25

Page 175: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

25

Page 176: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

25

Page 177: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]

25

Page 178: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]

25

Page 179: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]

Scaling up LP based methods [Dvijotham+ 2018, Wong and Kolter 2018]25

Page 180: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Scaling up…In general, CNNs are more robust than fully connected networks

Off-the-shelf SDP solvers do not exploit the CNN structure

Ongoing work:

First order matrix-vector product based SDP solvers

Exploit efficient CNN implementations in Tensorflow

Concurrent work:

MILP solving with efficient preprocessing [Tjeng+ 2018, Xiao+ 2018]

Scaling up LP based methods [Dvijotham+ 2018, Wong and Kolter 2018]25

Page 181: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

26

Page 182: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary

26

Page 183: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

26

`1

Page 184: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

26

`1

Page 185: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

26

`1

Page 186: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

26

`1

Page 187: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

• Does there exist a mathematically well defined attack model?

26

`1

Page 188: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

• Does there exist a mathematically well defined attack model?

• Would the current techniques (deep learning + appropriate

regularization) transfer to this attack model?

26

`1

Page 189: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

• Does there exist a mathematically well defined attack model?

• Would the current techniques (deep learning + appropriate

regularization) transfer to this attack model?

• Secure vs. better models?

26

`1

Page 190: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

• Does there exist a mathematically well defined attack model?

• Would the current techniques (deep learning + appropriate

regularization) transfer to this attack model?

• Secure vs. better models?

• Adversarial examples expose limitations of current systems

26

`1

Page 191: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Summary • Robustness for attack model

• Certified evaluation to avoid arms race

• Presented two different relaxations for certification

• Adversarial examples more broadly..

• Does there exist a mathematically well defined attack model?

• Would the current techniques (deep learning + appropriate

regularization) transfer to this attack model?

• Secure vs. better models?

• Adversarial examples expose limitations of current systems

• How do we get models to learn “the right thing”?

26

`1

Page 192: Semidefinite relaxations for certifying robustness to adversarial … · 2019. 1. 13. · Semidefinite relaxations for certifying robustness to adversarial examples Jacob Steinhardt

Thank you!

Jacob Steinhardt Percy Liang

“Certified Defenses against Adversarial Examples” https://arxiv.org/abs/1801.09344 [ICLR 2018]

“Semidefinite Relaxations for Certifying Robustness to Adversarial Examples” https://arxiv.org/abs/1811.01057 [NeurIPS 2018]

Open Philanthropy

Project

Google