A geometric perspective on the robustness of deep networks
Seyed-Mohsen Moosavi-Dezfooli
Amirkabir Artificial Intelligence Summer Summit, July 2019
smoosavi.me/slides/talk_aut_july2019.pdf


Page 1

Seyed-Mohsen Moosavi-Dezfooli

Amirkabir Artificial Intelligence Summer Summit July 2019

A geometric perspective on the robustness of deep networks

Page 2

Tehran Polytechnic Iran

Page 3

EPFL Lausanne, Switzerland

Page 4

Omar Fawzi ENS-Lyon

Stefano Soatto UCLA

Pascal Frossard EPFL

Alhussein Fawzi Google DeepMind

Jonathan Uesato Google DeepMind

Can Kanbak Bilkent

Apostolos Modas EPFL

Collaborators

Page 5

Rachel Jones—Biomedical Computation Review

David Paul Morris—Bloomberg/Getty Images

Google Research

Are we ready?

Page 6

[Screenshot from Szegedy et al. (2014), Figures 5 and 6: adversarial examples for AlexNet, all predicted as “ostrich, Struthio camelus”, and adversarial examples for a binary car classifier trained on QuocNet features.]

School Bus (x) + adversarial perturbation (r) = Ostrich, as predicted by the classifier k̂(·).

▪ Intriguing properties of neural networks, Szegedy et al., ICLR 2014.

Adversarial perturbations

Page 7

▪ Universal adversarial perturbations, Moosavi et al., CVPR 2017.

[Figure: natural images and their predictions after adding a single universal perturbation; labels include Joystick, Balloon, Flag pole, Face powder, Labrador, and Chihuahua.]

Universal (adversarial) perturbations

Page 8

“Geometry is not true, it is advantageous.”

Henri Poincaré

Page 9

Geometry of …

Adversarial perturbations: How large is the “space” of adversarial examples?

Universal perturbations: What causes the vulnerability of deep networks to universal perturbations?

Adversarial training: What geometric features contribute to better robustness properties?

Page 10

Geometry of adversarial perturbations

Page 11

$r^* = \arg\min_r \|r\|_2 \quad \text{s.t.} \quad \hat{k}(x + r) \neq \hat{k}(x), \qquad x \in \mathbb{R}^d$

[Figure: a datapoint x and the minimal perturbation r* taking it to the decision boundary at x + r*.]

Geometric interpretation of adversarial perturbations
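To make the definition above concrete, here is a minimal sketch (assuming a PyTorch model; the names `f`, `minimal_perturbation_step`, and `minimal_perturbation` are illustrative, not from the talk). For an affine binary classifier $f(x) = w^\top x + b$, the minimal $\ell_2$ perturbation has the closed form $r^* = -f(x)\, w / \|w\|_2^2$; iterating this linearized step approximates $r^*$ for a deep net, in the spirit of DeepFool-style attacks.

```python
import torch

def minimal_perturbation_step(f, x):
    """One linearized step toward the decision boundary (binary case).
    For an affine classifier f(x) = w.x + b this single step is exactly
    the minimal l2 perturbation r* = -f(x) * w / ||w||^2."""
    x = x.clone().detach().requires_grad_(True)
    out = f(x)                               # scalar decision value; its sign is the class
    grad, = torch.autograd.grad(out, x)
    return -out.detach() * grad / grad.norm() ** 2

def minimal_perturbation(f, x, max_iter=50, overshoot=0.02):
    """Iterate the linearized step until the sign of f flips (illustrative sketch)."""
    r = torch.zeros_like(x)
    sign0 = torch.sign(f(x)).item()
    for _ in range(max_iter):
        r = r + minimal_perturbation_step(f, x + r)
        if torch.sign(f(x + (1 + overshoot) * r)).item() != sign0:
            break
    return (1 + overshoot) * r
```

For multi-class networks, the same linearization is applied to each pairwise decision function and the nearest boundary is taken.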

Page 12

Adversarial examples are “blind spots”.

Deep classifiers are “too linear”.

Two hypotheses

▪ Intriguing properties of neural networks, Szegedy et al., ICLR 2014.

▪ Explaining and harnessing adversarial examples, Goodfellow et al., ICLR 2015.
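The “too linear” hypothesis is what motivates the fast gradient sign method of Goodfellow et al.: if the loss is locally almost linear in the input, a single step in the sign of the input gradient should already be adversarial. A minimal sketch (assuming a PyTorch classifier `model` with inputs in [0, 1]; names are illustrative):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Fast gradient sign method: a single l_inf step of size eps in the
    direction that a locally linear model of the loss says is worst."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()
```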

Page 13

▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

[Figure: a two-dimensional normal cross-section U of the decision boundary B at a datapoint x, spanned by the perturbation direction r and a direction v.]

Normal cross-sections of decision boundary

Page 14

▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

Curvature of decision boundary of deep nets

[Plot: normal cross-sections of the decision boundaries B1 and B2 around a datapoint x; over a wide range, the boundaries appear as almost straight lines.]

Decision boundary of CNNs is almost flat along random directions.
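Cross-section plots like this can be reproduced by sampling the classifier on the 2-D plane through x spanned by the adversarial (normal) direction and a random direction. A minimal sketch (PyTorch; names are illustrative):

```python
import torch

def boundary_cross_section(model, x, v_normal, v_random, extent=150.0, steps=101):
    """Predicted labels on the plane {x + a*v1 + b*v2}; label changes trace the
    decision boundary in this slice (it looks almost flat along random directions)."""
    v1 = v_normal / v_normal.norm()
    v2 = v_random / v_random.norm()
    coords = torch.linspace(-extent, extent, steps)
    labels = torch.empty(steps, steps, dtype=torch.long)
    with torch.no_grad():
        for i, a in enumerate(coords):
            for j, b in enumerate(coords):
                logits = model((x + a * v1 + b * v2).unsqueeze(0))
                labels[i, j] = logits.argmax(dim=1).item()
    return labels  # visualize e.g. with matplotlib's imshow
```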

Page 15

Adversarial perturbations constrained to a random subspace $S$ of dimension $m$:

$r_S(x) = \arg\min_{r \in S} \|r\| \quad \text{s.t.} \quad \hat{k}(x + r) \neq \hat{k}(x)$

For low-curvature classifiers, w.h.p., we have

$\|r_S(x)\| = \Theta\!\left(\sqrt{d/m}\; \|r(x)\|\right)$

[Figure: a datapoint x, the minimal perturbation r*, the random subspace S, and the subspace-constrained perturbation r*_S reaching the boundary at x*.]

Space of adversarial perturbations
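A minimal sketch of the subspace-constrained search: a simple projected-gradient scheme restricted to a random m-dimensional subspace (the experiments in the NIPS 2016 paper use their own procedure, so treat this as illustrative):

```python
import torch
import torch.nn.functional as F

def random_subspace_attack(model, x, y, m, step=0.5, max_iter=200):
    """Grow a perturbation inside a random m-dimensional subspace S of R^d
    until the predicted label differs from y (illustrative sketch)."""
    d = x.numel()
    basis, _ = torch.linalg.qr(torch.randn(d, m))      # orthonormal basis of S
    r = torch.zeros(d)
    for _ in range(max_iter):
        x_adv = (x + r.view_as(x)).unsqueeze(0).clone().requires_grad_(True)
        logits = model(x_adv)
        if logits.argmax(dim=1).item() != y:
            break                                      # label flipped within S
        loss = F.cross_entropy(logits, torch.tensor([y]))
        grad, = torch.autograd.grad(loss, x_adv)
        coeffs = basis.T @ grad.flatten()              # project the gradient onto S
        direction = basis @ coeffs
        r = r + step * direction / (direction.norm() + 1e-12)
    return r.view_as(x)
```

As the subspace dimension m shrinks, the norm of the returned perturbation should grow roughly like the √(d/m) factor above.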

Page 16

[Figure: Flowerpot + structured perturbation = Pineapple.]

Structured additive perturbations

▪ Robustness of classifiers: from adversarial to random noise, Fawzi, Moosavi, Frossard, NIPS 2016.

The “space” of adversarial examples is quite vast.

Page 17

Summary

Geometry of adversarial examples: The decision boundary is “locally” almost flat, and datapoints lie close to it.

Flatness can be used to construct a diverse set of perturbations and to design efficient attacks.

Page 18

Geometry of universal perturbations

Page 19

▪ Universal adversarial perturbations, Moosavi et al., CVPR 2017.

[Figure: natural images and their predictions after adding a single universal perturbation; labels include Joystick, Balloon, Flag pole, Face powder, Labrador, and Chihuahua.]

Universal adversarial perturbations (UAP)

Fooling rate: 85%.
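A minimal sketch of the greedy idea behind universal perturbations: sweep over images, and whenever the current perturbation v does not yet fool an image, add a per-image perturbation and project v back onto the norm ball. The per-image attack, the norm, and the stopping rule differ in the CVPR 2017 paper; `minimal_perturbation` below stands for any per-sample attack and the names are illustrative.

```python
import torch

def universal_perturbation(model, images, minimal_perturbation, eps, n_epochs=5):
    """Greedy accumulation of a single image-agnostic perturbation v,
    kept inside the l_inf ball of radius eps (illustrative sketch)."""
    v = torch.zeros_like(images[0])
    for _ in range(n_epochs):
        for x in images:
            with torch.no_grad():
                clean = model(x.unsqueeze(0)).argmax(dim=1).item()
                fooled = model((x + v).unsqueeze(0)).argmax(dim=1).item() != clean
            if not fooled:
                dv = minimal_perturbation(model, x + v)   # per-image attack
                v = torch.clamp(v + dv, -eps, eps)        # project back onto the ball
    return v
```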

Page 20

Diversity of UAPs

[Figure: universal perturbations computed for CaffeNet, VGG-F, VGG-16, VGG-19, GoogLeNet, and ResNet-152.]

Diversity of perturbations

Page 21

Why do universal perturbations exist?

Flat model

Curved model

Page 22

Flat model

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Page 23

Flat model (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

[Plot: singular values of the matrix of normals to the decision boundary, computed at 50,000 datapoints, compared with those of a matrix of random vectors; the singular values of the normals decay much faster.]

Normals to the decision boundary are “globally” correlated.
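This comparison can be reproduced by stacking approximate boundary normals, here taken as normalized input gradients of the loss, into a matrix and comparing its singular values with those of equally many random unit vectors. A minimal sketch (PyTorch; names are illustrative):

```python
import torch
import torch.nn.functional as F

def boundary_normal(model, x, y):
    """Approximate the normal to the decision boundary near x by the
    normalized gradient of the loss with respect to the input."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x.unsqueeze(0)), torch.tensor([y]))
    grad, = torch.autograd.grad(loss, x)
    return (grad / grad.norm()).flatten()

def singular_value_profiles(model, dataset):
    """Singular values of the matrix of normals vs. a random matrix; a much
    faster decay for the normals indicates they are globally correlated."""
    normals = torch.stack([boundary_normal(model, x, y) for x, y in dataset])
    rand = torch.randn_like(normals)
    rand = rand / rand.norm(dim=1, keepdim=True)
    return torch.linalg.svdvals(normals), torch.linalg.svdvals(rand)
```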

Page 24

Flat model (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

The flat model only partially explains the universality.

Fooling rates: random perturbation 13%, flat model (greedy algorithm) 38%, UAP 85%.

Page 25

Curved model

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

The principal curvatures of the decision boundary:

[Plot of the sorted principal curvatures.]
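One way to probe such a curvature profile is through the Hessian of a decision function f whose zero level set is the boundary (e.g. the difference of two logits): the principal curvatures are governed by the eigenvalues of this Hessian, up to a projection onto the tangent space and a normalization by the gradient norm. The sketch below builds the full Hessian, so it is only feasible for small input dimensions (the papers rely on Hessian-vector products instead); names are illustrative.

```python
import torch

def decision_hessian_spectrum(model, x, cls_a, cls_b):
    """Eigenvalues of the Hessian of f(x) = logit_a(x) - logit_b(x), whose zero
    level set is the decision boundary between classes a and b.  This is a
    rough curvature proxy: the exact principal curvatures also require
    projecting out the normal direction and dividing by ||grad f||."""
    def f(inp):
        logits = model(inp.unsqueeze(0)).squeeze(0)
        return logits[cls_a] - logits[cls_b]

    H = torch.autograd.functional.hessian(f, x.clone().detach())
    H = H.reshape(x.numel(), x.numel())
    return torch.linalg.eigvalsh(H)        # sorted spectrum ~ curvature profile
```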

Page 26

Curved model (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

The principal curvatures of the decision boundary:

[Plot of the sorted principal curvatures, with an illustration of a datapoint x, a direction v, and the boundary normal n.]

Page 27

Curved model (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

[Same curvature-profile plot, with the datapoint x, direction v, and normal n illustrated.]

Page 28

Curved model (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

[Same curvature-profile plot, with the datapoint x, direction v, and normal n illustrated.]

Page 29

Curved directions are shared

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

Normal sections of the decision boundary (for different datapoints) along a single direction:

[Plots: normal sections along a UAP direction vs. along a random direction.]

Page 30

Curved directions are shared (cont’d)

▪ Robustness of classifiers to universal perturbations, Moosavi et al., ICLR 2018.

The curved model better explains the existence of universal perturbations.

Fooling rates: random perturbation 13%, flat model 38%, curved model 67%, UAP 85%.

Page 31

Summary

Universality of perturbations: Shared curved directions explain this vulnerability.

A possible solution: Regularizing the geometry to combat universal perturbations.

Why are deep nets curved?

▪ With friends like these, who needs adversaries?, Jetley et al., NeurIPS 2018.

Page 32

Geometry of adversarial training

Page 33

In a nutshell

[Diagram: an image batch x is turned into adversarially perturbed images x + r, which are fed into training; the two approaches compared are adversarial training and curvature regularization.]

Page 34

Adversarial training

[Diagram: an image batch x is turned into adversarially perturbed images x + r, which are fed into training.]

One of the most effective methods to improve adversarial robustness…

▪ Obfuscated gradients give a false sense of security, Athalye et al., ICML 2018. (Best paper)
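A minimal sketch of the training loop, with a PGD inner maximization in the l_inf ball (Madry-style adversarial training; the hyperparameters and names are illustrative):

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, n_steps):
    """Projected gradient ascent on the loss inside the l_inf ball of radius eps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(n_steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255, alpha=2 / 255):
    """One optimizer step on adversarial examples instead of the clean batch."""
    x_adv = pgd_attack(model, x, y, eps, alpha, n_steps=7)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```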

Page 35

Geometry of adversarial training

Curvature profiles of normally and adversarially trained networks:

[Plot: the curvature profile of the adversarially trained network is substantially flatter than that of the normally trained one.]

▪Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.

Page 36

Curvature Regularization (CURE)

▪ Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.

Accuracy on clean inputs and under PGD with $\|r\|_\infty = 8$:

Normal training: 94.9% clean, 0.0% under PGD
CURE: 81.2% clean, 36.3% under PGD
Adversarial training: 79.4% clean, 43.7% under PGD
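A minimal sketch of the curvature-regularization idea: penalize how much the input gradient of the loss changes over a finite step along a direction where curvature is expected to be large (the paper takes that direction along the normalized sign of the gradient; the step size h and weight lam below are illustrative, not the paper's settings):

```python
import torch
import torch.nn.functional as F

def cure_penalty(model, x, y, h=1.5, lam=4.0):
    """Finite-difference curvature penalty  lam * ||grad L(x + h*z) - grad L(x)||^2,
    with z the (normalized) sign of the input gradient.  Assumes NCHW image batches."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x, create_graph=True)
    z = grad.sign().detach()
    z = z / (z.view(z.size(0), -1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
    loss_h = F.cross_entropy(model(x + h * z), y)
    grad_h, = torch.autograd.grad(loss_h, x, create_graph=True)
    diff = (grad_h - grad).view(x.size(0), -1)
    return lam * (diff.norm(dim=1) ** 2).mean()

# During training (sketch): total_loss = F.cross_entropy(model(x), y) + cure_penalty(model, x, y)
```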

Page 37

AT vs CURE

AT: implicit regularization; time-consuming; SOTA robustness.

CURE: explicit regularization; 3x to 5x faster; on par with SOTA.

▪ Robustness via curvature regularisation, and vice versa, Moosavi et al., CVPR 2019.

Page 38

Summary

Inherently more robust classifiers: Curvature regularization can significantly improve robustness properties.

A counter-intuitive observation: Due to its more linear nature, an adversarially trained net is “easier” to fool.

A better trade-off?

▪ Adversarial Robustness through Local Linearization, Qin et al., arXiv.

Page 39

Future challenges

Page 40

Disentangling different factors

Architectures: batch norm, dropout, depth, width, etc.

Data: number of modes, convexity, distinguishability, etc.

Training: batch size, solver, learning rate, etc.

Page 41

Beyond additive perturbations

[Figure: a geometrically transformed image; labels “Bear” and “Fox”.]

▪ Geometric robustness of deep networks, Kanbak, Moosavi, Frossard, CVPR 2018.

▪ Spatially transformed adversarial examples, Xiao et al., ICLR 2018.

[Figure: a spatially transformed handwritten digit; labels “0” and “2”.]
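For a quick feel of this kind of non-additive vulnerability, one can simply scan small rotations and check when the prediction changes; this brute-force check is only a stand-in for the manifold-based analysis in the CVPR 2018 paper (names are illustrative):

```python
import torch
import torchvision.transforms.functional as TF

def smallest_fooling_rotation(model, x, y, max_angle=30.0, step=1.0):
    """Return the smallest rotation angle (in degrees) that changes the
    prediction away from y, scanning +/- angles up to max_angle."""
    angle = step
    with torch.no_grad():
        while angle <= max_angle:
            for a in (angle, -angle):
                x_rot = TF.rotate(x, a)                    # x: (C, H, W) tensor image
                if model(x_rot.unsqueeze(0)).argmax(dim=1).item() != y:
                    return a
            angle += step
    return None                                            # no label change in this range
```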

Page 42

“Interpretability” and robustness

[Figure: input-gradient (saliency) visualizations for the original image under standard training and adversarial training.]

▪ Robustness may be at odds with accuracy, Tsipras et al., NeurIPS 2018.
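Visualizations like these can be produced from the loss gradient with respect to the input pixels; adversarially trained models tend to yield far more image-like gradients (Tsipras et al., 2018). A minimal sketch (names are illustrative):

```python
import torch
import torch.nn.functional as F

def input_gradient_map(model, x, y):
    """|dL/dx| rescaled to [0, 1] for display; compare a standardly trained and
    an adversarially trained model on the same image."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x.unsqueeze(0)), torch.tensor([y]))
    grad, = torch.autograd.grad(loss, x)
    g = grad.abs()
    return (g - g.min()) / (g.max() - g.min() + 1e-12)
```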

Page 43

ETHZ Zürich, Switzerland

Google Zürich

Page 44

Interested in my research?

[email protected]

smoosavi.me