
Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks under Hardware Fault Attacks

Sanghyun Hong1, Pietro Frigo2, Yiğitcan Kaya1, Cristiano Giuffrida2, Tudor Dumitraș1

1University of Maryland, College Park, 2Vrije Universiteit Amsterdam


1990: Optimal Brain Damage – Graceful Degradation: we can remove 60% of model parameters without an accuracy drop

DNN’s Resilience – False Sense of Security

• Techniques that rely on graceful degradation
  – Parameter pruning1: reduce the inference cost
  – Parameter quantization2: compress the network size
  – Blending noise into parameters3: improve robustness

• Prior work showed it is difficult to cause an accuracy drop
  – Indiscriminate poisoning4: blending in many poison samples ≈ 11% drop
  – Storage media errors5: many random bit errors ≈ 5% drop
  – Hardware fault attacks6,7: many random faults ≈ 7% drop


They focus on the best-case or the average-case perturbations


What is the WORST-CASE perturbation (a bit-flip) that inflicts a SIGNIFICANT accuracy drop exceeding 10%?

Illustration: How a DNN Computes

• Accuracy: 98.53%

[Figure: a four-layer MNIST model — Conv [1, 10, 5x5] → Conv [10, 20, 5x5] → FC [320, 50] → FC [50, 10] — correctly predicting the digits 1, 2, and 0]

Prior Work: Optimal Brain Damage

• Accuracy: 98.53% (0% drop)

[Figure: the same network — Conv [1, 10, 5x5] → Conv [10, 20, 5x5] → FC [320, 50] → FC [50, 10] — with the unimportant parameters removed]

Prior Work: Hardware Fault Attacks

• Accuracy: 98.53%

[Figure: the network's parameters (weight + bias of each layer) resident in memory (RAM)]

Prior Work: Hardware Fault Attacks

• Accuracy: 93.53% (5% drop)

[Figure: a random single-bit fault in one weight stored in memory (RAM)]

A random bit-flip in the exponent of a float32 weight (Sign | Exponent | Mantissa):

0.3504 = 1.401 x 2^-2 : 0 | 0111 1101 | 011 0011 0110 1111 1101 0001
0.0219 = 1.401 x 2^-6 : 0 | 0111 1001 | 011 0011 0110 1111 1101 0001

Can We Find a Worst-case Bit-flip?

• Accuracy: 57.52% (41.01% drop)

[Figure: a targeted bit-flip in one weight in memory (RAM); the model's predictions change from 1, 2, 0 to 0, 3, 0]

Flipping the most significant exponent bit (0→1) of the same weight (Sign | Exponent | Mantissa):

0.3504  = 1.401 x 2^-2  : 0 | 0111 1101 | 011 0011 0110 1111 1101 0001
1.2E+38 = 1.401 x 2^126 : 0 | 1111 1101 | 011 0011 0110 1111 1101 0001
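This arithmetic can be checked directly. A minimal Python sketch (not from the talk; flip_bit is an illustrative helper):

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    """Flip one bit (0..31) of a float32 value and return the result."""
    (u,) = struct.unpack("<I", struct.pack("<f", x))   # float32 -> uint32
    return struct.unpack("<f", struct.pack("<I", u ^ (1 << bit)))[0]

w = 0.3504
print(flip_bit(w, 30))  # exponent MSB, 0->1: ~1.2e+38 (the worst-case flip)
print(flip_bit(w, 25))  # a lower exponent bit, 1->0: 0.0219 (the random fault above)
```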

Research Questions

• RQ-1: How vulnerable are DNNs to a single bit-flip?

• RQ-2: What properties influence this vulnerability?

• RQ-3: Can an attacker exploit this vulnerability?

• RQ-4: Can we utilize DNN-level mechanisms for mitigation?



RQ-1: How Vulnerable are DNNs to a Bit-flip?

• Metric
  – Relative Accuracy Drop: RAD = (Acc_pristine − Acc_corrupted) / Acc_pristine

• Methodology
  – Flip (0→1 and 1→0) each bit in all parameters of a model
  – Measure the RAD over the entire validation set, each time
  – Achilles bit: a bit whose flip inflicts RAD > 10%

• Vulnerability
  – Max RAD: the maximum RAD that an Achilles bit can inflict
  – Ratio: the percentage of vulnerable parameters in a model
  (a sketch of this exhaustive search follows below)
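A minimal sketch of this methodology (not the authors' code; evaluate is an assumed callback that returns the model's validation accuracy after the fault is injected):

```python
import struct
import numpy as np

def rad(acc_pristine: float, acc_corrupted: float) -> float:
    """Relative Accuracy Drop, per the definition above."""
    return (acc_pristine - acc_corrupted) / acc_pristine

def find_achilles_bits(params: np.ndarray, evaluate, acc_pristine: float,
                       threshold: float = 0.10):
    """Flip every bit of every float32 parameter in place, measure the RAD
    each time, and collect the (parameter, bit) pairs exceeding the threshold.
    Assumes `params` is a contiguous float32 view of the model's parameters."""
    achilles = []
    flat = params.reshape(-1)
    for i in range(flat.size):
        original = flat[i]
        (u,) = struct.unpack("<I", struct.pack("<f", float(original)))
        for b in range(32):                   # covers both 0->1 and 1->0 flips
            corrupted = struct.unpack("<f", struct.pack("<I", u ^ (1 << b)))[0]
            flat[i] = corrupted               # inject the single-bit fault
            if rad(acc_pristine, evaluate()) > threshold:
                achilles.append((i, b))       # an Achilles bit
        flat[i] = original                    # restore the pristine parameter
    return achilles
```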

RQ-1: Vulnerability Analysis in MNIST

Network     Acc.   # Params  Max RAD  Ratio
B(ase)      95.71  21,840    98%      50%
B-Wide      98.46  85,670    99%      50%
B-PReLU     98.13  21,843    99%      99%
B-Dropout   96.86  21,840    99%      49%
B-DP-Norm   97.97  21,962    99%      51%
L(eNet)5    98.81  61,706    99%      47%
L5-Dropout  98.72  61,706    99%      45%
L5-D-Norm   99.05  62,598    98%      49%

• Maximum RAD ≈ 98% in all models
• > 45% of params are vulnerable in all the MNIST models

RQ-1: How Vulnerable Are Larger Models?

• Metric
  – Relative Accuracy Drop: RAD = (Acc_pristine − Acc_corrupted) / Acc_pristine

• Methodology
  – Flip (0→1 and 1→0) each bit in all parameters of a model
  – Measure the RAD over the entire validation set, each time
  [e.g. VGG16-ImageNet: examining all 138M parameters ≈ 942 days]

• Speed-up heuristics (sketched below)
  – Sampled validation set (SV): use 10% of the validation set
  – Inspect only specific bits (SB): the exponent bits, or only their MSBs
  – Sampled parameters (SP): uniformly sample 20K parameters
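A sketch of how these heuristics narrow the search space (illustrative constants and helpers, not the paper's code):

```python
import numpy as np

EXPONENT_BITS = range(23, 31)   # SB for CIFAR-10 models: the 8 exponent bits
EXPONENT_MSB  = [30]            # SB for ImageNet models: the exponent MSB only

rng = np.random.default_rng(seed=0)

def sampled_parameters(n_params: int, k: int = 20_000) -> np.ndarray:
    """SP: uniformly sample k parameter indices out of n_params."""
    return rng.choice(n_params, size=k, replace=False)

def sampled_validation_indices(n_val: int, frac: float = 0.10) -> np.ndarray:
    """SV: evaluate each flip on a fixed 10% subset of the validation set."""
    return rng.choice(n_val, size=int(n_val * frac), replace=False)
```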

RQ-1: Vulnerability Analysis in Large Models

Dataset   Network      Acc.   # Params  SV  SB       SP       Max RAD  Ratio
CIFAR-10  B(ase)       83.74  776K      ✓   ✓ (exp)  ✗        94%      46.8%
CIFAR-10  B-Slim       82.19  197K      ✓   ✓ (exp)  ✗        93%      46.7%
CIFAR-10  B-Dropout    81.18  776K      ✓   ✓ (exp)  ✗        94%      40.5%
CIFAR-10  B-D-Norm     80.17  778K      ✓   ✓ (exp)  ✗        97%      45.9%
CIFAR-10  AlexNet      83.96  2.5M      ✓   ✓ (exp)  ✗        96%      47.3%
CIFAR-10  VGG16        91.34  14.7M     ✓   ✓ (exp)  ✗        99%      46.2%
ImageNet  AlexNet      79.07  61.1M     ✓   ✓ (MSB)  ✓ (20K)  100%     47.3%
ImageNet  VGG16        90.38  138.4M    ✓   ✓ (MSB)  ✓ (20K)  99%      42.1%
ImageNet  ResNet50     92.86  25.6M     ✓   ✓ (MSB)  ✓ (20K)  100%     47.8%
ImageNet  DenseNet161  93.56  28.9M     ✓   ✓ (MSB)  ✓ (20K)  100%     49.0%
ImageNet  InceptionV3  88.65  27.2M     ✓   ✓ (MSB)  ✓ (20K)  100%     40.8%

(SB: for CIFAR-10 models, only the exponent bits were inspected; for ImageNet models, only the exponent MSBs.)

Research Questions

• RQ-1: How vulnerable are DNNs to a single bit-flip?

• RQ-2: What properties influence this vulnerability?

• RQ-3: Can an attacker exploit this vulnerability?

• RQ-4: Can we utilize DNN-level mechanisms for mitigation?


RQ-2: Properties that Influence the Vulnerability

• (Network-level) DNN properties
• (Parameter-level) Bitwise representation


RQ-2: Impact of the Common Techniques

• (Network-level) DNN properties
  – Dropout and batch normalization do not affect the vulnerability

Dataset   Network     Base acc.  # Params  SV  SB  SP  Max RAD  Ratio
MNIST     L(eNet)5    98.81      61,706    ✗   ✗   ✗   99%      47%
MNIST     L5-Dropout  98.72      61,706    ✗   ✗   ✗   99%      45%
MNIST     L5-D-Norm   99.05      62,598    ✗   ✗   ✗   98%      49%
CIFAR-10  B(ase)      83.74      776K      ✓   ✓   ✗   94%      47%
CIFAR-10  B-Dropout   81.18      776K      ✓   ✓   ✗   94%      41%
CIFAR-10  B-D-Norm    80.17      778K      ✓   ✓   ✗   97%      46%

RQ-2: Impact of the Other DNN Properties

• (Network-level) DNN properties
  – Dropout and batch-norm cannot reduce the vulnerability
  – The vulnerability increases proportionally with the network width
  – Activation functions that propagate negative values double the vulnerability
  – The vulnerability is consistent across 19 DNN architectures
    [8 MNIST, 6 CIFAR-10, and 5 ImageNet architectures]

RQ-2: Impact of the Parameter Sign

• (Parameter-level) Bitwise representation
  – Flipping the MSB of the exponent mostly leads to RAD > 10%
  – Only the (0→1) flip direction leads to RAD > 10%
  – Positive parameters are more likely to be vulnerable to bit-flips than negative parameters
  (see the sketch below)
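These observations follow from the IEEE 754 encoding. A small self-contained sketch (illustrative, not from the talk) that enumerates all exponent-bit flips of a positive weight:

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.unpack("<f", struct.pack("<I", u ^ (1 << bit)))[0]

w = 0.3504                                    # exponent field: 0111 1101
(u,) = struct.unpack("<I", struct.pack("<f", w))
for b in reversed(range(23, 31)):             # the eight exponent bits
    direction = "1->0" if (u >> b) & 1 else "0->1"
    print(f"bit {b}: {direction} -> {flip_bit(w, b):.4g}")

# Only 0->1 flips increase the value, and only the exponent-MSB flip
# (bit 30) inflates it catastrophically, to ~1.2e+38; every 1->0 flip
# shrinks the value toward zero, which the network absorbs gracefully.
# A negative parameter blown up the same way is often zeroed by a
# downstream ReLU, which is why positive parameters are the more
# vulnerable ones.
```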

Research Questions

• RQ-1: How vulnerable are DNNs to a single bit-flip?

• RQ-2: What properties influence this vulnerability?

• RQ-3: Can an attacker exploit this vulnerability?

• RQ-4: Can we utilize DNN-level mechanisms for mitigation?


RQ-3: Threat Model – Attacker's Capability and Knowledge

• Capability
  – Surgical: can cause a bit-flip at an intended location
  – Blind: cannot control the location of a bit-flip

• Knowledge
  – White-box: knows the victim model internals
  – Black-box: has no knowledge of the victim model

RQ-3: Threat Model – Single Bit Adversary

[Quadrant: capability (Surgical / Blind) × knowledge (White-box / Black-box)]
• Surgical + White-box (strongest attacker): P(hit an Achilles bit) ≈ 100%
• Blind + Black-box (weakest attacker): P(hit an Achilles bit) ≈ p
  [VGG16: 42.1% / 32 bits ≈ 1.32%, worked out below]
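The weakest attacker's hit probability is a back-of-the-envelope calculation (a sketch using the VGG16 numbers above, assuming one Achilles bit per vulnerable parameter):

```python
ratio_vulnerable = 0.421        # VGG16-ImageNet: params with an Achilles bit
bits_per_param = 32             # float32 parameters
p_hit = ratio_vulnerable / bits_per_param
print(f"P(hit an Achilles bit) ~ {p_hit:.2%}")   # ~1.32% per random flip
```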

RQ-3: Practical Weapon – Rowhammer

• Rowhammer attacks
  – Single-bit corruption primitives at the DRAM level
  – Software-induced hardware fault attacks
  [The attacker only requires user-level access to memory]

[Figure: double-sided Rowhammer attack: repeatedly activating the two aggressor rows around a victim row induces charge leakage in the victim row and flips a bit (0 0 1 0 1 → 0 1 1 0 1)]

RQ-3: Threat Model (Re-visited)

• Surgical + White-box (strongest attacker): P(hit an Achilles bit) ≈ 100%
• Blind + Black-box (weakest attacker): P(hit an Achilles bit) ≈ p
  [VGG16: 42.1% / 32 bits ≈ 1.32%]

RQ-3: If Our Adversary Can Flip Multiple Bits

• Surgical + White-box (strongest attacker): P(hit an Achilles bit) ≈ 100%
• Blind + Black-box (weakest, now stronger): Σ P(hit an Achilles bit) ≫ 1.3%

RQ-3: The Weakest Attacker with Rowhammer

• Evaluation
  – MLaaS scenario: a VM runs under Rowhammer pressure
    • A Python process constantly queries the VGG16 ImageNet model
    • Bit-flips hit the process memory: both the code and the data
    [Consequences: RAD > 10%, process crash, or RAD ≤ 10%]
  – Method: Hammertime1 DB
    • Systematically explores Rowhammer effects on 12 different DRAM chips
    [DRAM vulnerability ranked by the number of bits subject to flipping]
  – Experiments
    • 25 experiments for each of the 12 DRAM chips
    • 300 cumulative bit-flip attempts for each experiment

1Tatar et al., Defeating Software Mitigations against Rowhammer: a Surgical Precision Hammer, RAID'18

RQ-3: The Weakest Attacker with Rowhammer

• Blind attack results
  – The attacker can effectively inflict Terminal Brain Damage (RAD > 10%) on the victim model
    • On average, 62% (15.6/25) of the experiments were successful
    • With the most vulnerable DRAM chip: 96% (24/25) successes
    • With the least vulnerable DRAM chip: 4% (1/25) successes
  – It is challenging to detect the blind attacker
    • Only 6 crashes observed over the entire 7.5k bit-flip attempts

Blind Rowhammer attack is practical against DNN models

Research Questions

• RQ-1: How vulnerable are DNNs to single bit-flips?

• RQ-2: What properties influence this vulnerability?

• RQ-3: Can an attacker exploit this vulnerability?

• RQ-4: Can we utilize DNN-level mechanisms for mitigation?


RQ-4: Rowhammer Defenses

• Hardware-supported defenses to fault attacks
  – ECC: error-correcting code in memory1
  – Detection based on hardware performance counters2

• System-level defenses to fault attacks
  – CATT: memory isolation of the kernel and user space3
  – ZebRAM: software-based isolation of every DRAM row4

They require infrastructure-wide changes, or they are not effective against other hardware faults

1Kim et al., Flipping Bits in Memory without Accessing Them: An Experimental Study of DRAM …, ACM SIGARCH'14
2Aweke et al., ANVIL: Software-based Protection against Next-generation Rowhammer Attacks, ACM SIGPLAN'16
3Brasser et al., Can't Touch This: Software-only Mitigation against Rowhammer Attacks …, USENIX'17
4Konoth et al., ZebRAM: Comprehensive and Compatible Software Protection against Rowhammer Attacks, OSDI'18

RQ-4: Can We Mitigate this Vulnerability?

• Investigate DNN-level defenses:
  – Restrict activation magnitudes: Tanh or ReLU6 (see the sketch below)
  – Use low-precision numbers: quantization or binarization
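For the activation-magnitude direction, a minimal PyTorch sketch (an illustration, not the authors' implementation) that swaps every ReLU for ReLU6 in a trained model:

```python
import torch.nn as nn

def substitute_relu6(module: nn.Module) -> nn.Module:
    """Replace each ReLU with ReLU6, capping activations at 6 so that a
    blown-up parameter cannot propagate an unbounded value downstream."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.ReLU6(inplace=child.inplace))
        else:
            substitute_relu6(child)           # recurse into submodules
    return module

# Usage (hypothetical): model = substitute_relu6(my_pretrained_vgg16)
```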

RQ-4: Pros and Cons of Our Defenses

• Pros
  – Both directions reduce the number of vulnerable parameters
  – Activation functions can be substituted without re-training

• Cons
  – Quantization and binarization require re-training the whole model from scratch
  – Substituting activations without re-training costs some accuracy

Summary of Our Results

• RQ-1: How vulnerable are DNNs to single bit-flips?
  All DNNs have a bit whose flip causes a RAD of up to 100%
  40-50% of all parameters in a model are vulnerable

• RQ-2: What properties influence this vulnerability?
  The vulnerability is consistent across multiple DNNs

• RQ-3: Can an attacker exploit this vulnerability?
  A blind Rowhammer attacker can exploit it in practice

• RQ-4: Can we utilize DNN-level mechanisms for mitigation?
  We reduce the number of vulnerable parameters in a model, but
  our defenses degrade performance or require re-training

Conclusions and Implications

• DNNs are not resilient to worst-case parameter perturbations
  – Re-examine techniques that rely on graceful degradation through a security lens

• The vulnerability of DNNs to µ-architectural attacks is under-studied
  – Explore and evaluate new attacks, particularly those previously thought hard to mount
  – These attacks may be inflicted by weak attackers, e.g. a blind Rowhammer attacker

• For AI systems, system-level defenses are not sufficient
  – Consider additional model-level defenses that account for DNN properties

Thank you!

Sanghyun Hong
shhong@cs.umd.edu
http://hardwarefail.ml
