Undefined Behavior in 2017 John Regehr University of Utah, USA

Undefined Behavior in 2017 - School of Computingregehr/ub-2017-qualcomm.pdf•Undefined behavior (UB) is a design choice • UB is the most efficient alternave because it imposes



Undefined Behavior in 2017

John Regehr
University of Utah, USA


Today:
• What is undefined behavior (UB)?
• Why does it exist?
• What are the consequences of UB in C and C++?
• Modern UB detection and mitigation


sqrt(-1) = ?
• i
• NaN
• An arbitrary value
• Throw an exception
• Abort the program
• Undefined behavior


• Undefined behavior (UB) is a design choice
• UB is the most efficient alternative because it imposes the fewest requirements
  – “…behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements”


C and C++ have lots of UB
• To avoid overhead
• To avoid compiler complexity
• To provide maximal compatibility across implementations and targets


According to Appendix J of the standard, C11 has 199 undefined behaviors
• But this list isn’t complete
• And there’s no comparable list for C++
• And new UBs are being added

From: http://en.cppreference.com/w/cpp/language/static_cast


What happens when you execute undefined behavior?
• Case 1: Program breaks immediately (segfault, math exception)
• Case 2: Program continues, but will fail later (corrupted RAM, etc.)
• Case 3: Program works as expected
  – However, the UB is a “time bomb”: a latent problem waiting to go off when the compiler, compiler version, or compiler flags are changed
• Case 4: You don’t know how to trigger the UB but someone else does


Important trends over the last 25 years:
• UB detection tools have been getting better
  – Starting perhaps with Purify in the early 1990s
  – More about these later
• Compilers have been getting cleverer at exploiting UB to improve code generation
  – Making the time bombs go off
  – More about this soon
• Security has become a primary consideration in software development


• But of course legacy C and C++ contain plenty of UB
• What can we do about that?
  – Sane option 1: Go back and fix the old code
    • Expensive...
  – Sane option 2: Stop setting off time bombs by making optimizers more aggressive
  – Not as sane: Keep making optimizers more aggressive while also not investing in maintenance of older codes


#include <limits.h>
#include <stdio.h>

int foo(int x) {
  return (x + 1) > x;  /* signed overflow is UB when x == INT_MAX */
}

int main() {
  printf("%d\n", (INT_MAX + 1) > INT_MAX);
  printf("%d\n", foo(INT_MAX));
  return 0;
}

$ gcc -O2 foo.c ; ./a.out
0
1
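A minimal sketch of one way to ask whether an addition would overflow without ever executing the overflowing operation: compare against the limits first. The helper name checked_add is hypothetical and does not appear in the slides.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true,
   or returns false (leaving *out untouched) if the sum would overflow. */
bool checked_add(int a, int b, int *out) {
    if (b > 0 && a > INT_MAX - b) return false;  /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return false;  /* would go below INT_MIN */
    *out = a + b;                                /* safe: the result fits in int */
    return true;
}
```

Because the comparisons themselves never overflow, the optimizer has no undefined operation to reason from, so the result should not depend on the optimization level.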


#include <stdio.h>
#include <stdlib.h>

int main() {
  int *p = (int *)malloc(sizeof(int));
  int *q = (int *)realloc(p, sizeof(int));
  *p = 1;  /* UB: p is no longer valid after realloc */
  *q = 2;
  if (p == q)
    printf("%d %d\n", *p, *q);
}

$ clang -O foo.c ; ./a.out
1 2
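A hedged sketch of the usual way to avoid this trap: after a successful realloc, use only the returned pointer and overwrite the old one. The helper name resize_ints and its return convention are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to hold n ints (n must be nonzero).
   Returns 0 on success, -1 on failure; on failure *buf is still valid. */
int resize_ints(int **buf, size_t n) {
    if (n == 0 || n > SIZE_MAX / sizeof **buf) return -1;  /* reject empty or overflowing sizes */
    int *tmp = realloc(*buf, n * sizeof **buf);
    if (tmp == NULL) return -1;  /* realloc failed; the old block is untouched */
    *buf = tmp;                  /* the old pointer value must not be used again */
    return 0;
}
```

Routing every resize through one function that overwrites the caller's pointer leaves no stale copy such as p lying around to be dereferenced.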


void foo(char *p) {
#ifdef DEBUG
  printf("%s\n", p);
#endif
  if (p)
    bar(p);
}

_foo:
  testq %rdi, %rdi
  je L1
  jmp _bar
L1:
  ret

Without -DDEBUG


void foo(char *p) {
#ifdef DEBUG
  printf("%s\n", p);
#endif
  if (p)
    bar(p);
}

_foo:
  pushq %rbx
  movq %rdi, %rbx
  call _puts
  movq %rbx, %rdi
  popq %rbx
  jmp _bar

With -DDEBUG
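With -DDEBUG the null test disappears because the earlier printf already uses p, which lets the compiler assume p is non-null. One possible fix, sketched here under assumptions, is to test p before any use. The counter bar_calls and the body of bar are hypothetical stand-ins for the slide's undefined bar().

```c
#include <stdio.h>

static int bar_calls = 0;  /* hypothetical counter, for demonstration only */

/* Stand-in for the slide's bar(); it only records that it was called. */
static void bar(char *p) { (void)p; bar_calls++; }

void foo(char *p) {
    if (p == NULL) return;  /* test first, so no earlier use can justify deleting it */
#ifdef DEBUG
    printf("%s\n", p);      /* p is known to be non-null here */
#endif
    bar(p);
}
```

With the check placed first, both the debug and non-debug builds should skip bar() for a null argument.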


void foo(int *p, int *q, size_t n) {
  memcpy(p, q, n);
  if (!q)
    abort();
}

_foo:
  jmp memcpy

Optimization is valid even when n == 0
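A sketch of the same idea with the check moved ahead of the call, so that memcpy never receives a null pointer, even for n == 0. The wrapper name copy_checked and its error-code convention are assumptions; the slide's version calls abort() instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: returns 0 after copying, -1 if either pointer is null. */
int copy_checked(void *dst, const void *src, size_t n) {
    if (dst == NULL || src == NULL) return -1;  /* check before memcpy sees the pointers */
    memcpy(dst, src, n);                        /* the two regions must not overlap */
    return 0;
}
```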


There are a lot more UBs!
• Strict aliasing
  – If two pointers refer to different types, compiler may assume they don’t refer to the same object
  – Vast majority of programs violate strict aliasing
• Infinite loops that don’t contain side-effecting operations are UB
  – Common case: compiler causes the loop to exit
  – In C11 the compiler cannot terminate an infinite loop whose controlling expression is a constant expression
    • So then we can at least rely on while (1) …
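For the strict aliasing bullet, a commonly recommended alternative to casting between pointer types is to copy the bytes with memcpy. The sketch below assumes a 32-bit float (checked at compile time); the function name float_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns the object representation of f as a 32-bit unsigned integer. */
uint32_t float_bits(float f) {
    uint32_t u;
    _Static_assert(sizeof u == sizeof f, "float must be 32 bits");
    memcpy(&u, &f, sizeof u);  /* copies bytes; no wrongly typed pointer is dereferenced */
    return u;
    /* By contrast, *(uint32_t *)&f would violate strict aliasing. */
}
```

Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, so the defined version need not cost anything.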


• Effects of UB can precede the first undefined operation
  – Potentially undefined operations are not seen as side-effecting
  – Compiler can move a crashing operation in front of a debug printout!
  – This is explicit in the C++ standard:


So what can we do about UB in C and C++?
• Static detection
• Dynamic detection
• Mitigation


Static analysis
• Enable and heed compiler warnings
  – Use -Werror
• Code reviewers should be thinking about UB
• Run unsound static analysis tools
  – Coverity
  – Clang static analyzer
  – Lots more
• Run sound static analysis tools
  – Polyspace Analyzer
  – Frama-C / TIS Analyzer


Dynamic Analysis using Clang (and GCC)
• Address Sanitizer (ASan)
  – Memory safety errors
• Undefined Behavior Sanitizer (UBSan)
  – Shift errors, signed integer overflow, alignment issues, missing return statements, etc.
• Memory Sanitizer (MSan)
  – Use of uninitialized storage
• Thread Sanitizer (TSan)
  – Data races, deadlocks
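As an illustration of the "shift errors" that UBSan reports (enabled in Clang and GCC with -fsanitize=undefined), the sketch below guards a shift so that the undefined case is never executed. The helper name shl32_checked is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: computes x << s only when the shift is defined.
   Shifting a 32-bit value by 32 or more is undefined behavior in C. */
bool shl32_checked(uint32_t x, unsigned s, uint32_t *out) {
    if (s >= 32) return false;  /* the case UBSan would report at run time */
    *out = x << s;              /* unsigned shift: high bits are discarded */
    return true;
}
```

An unguarded x << s with s == 32 is the kind of operation an instrumented build would flag, but only if some test input actually reaches it.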


• Dynamic analysis is great, since you get a concrete execution trace
• Dynamic analysis is terrible, since you need test cases to drive concrete execution traces
• Where do we get concrete inputs?
  – Test suites
  – Fuzzers
  – …
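One way fuzzers supply concrete inputs is through a small harness. The sketch below uses libFuzzer's entry point LLVMFuzzerTestOneInput; the function under test, sum_payload, is a hypothetical example and not something from the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code under test: sums the payload bytes that follow a
   one-byte length prefix, never reading past the end of the input. */
unsigned sum_payload(const uint8_t *data, size_t size) {
    if (size == 0) return 0;
    size_t len = data[0];
    if (len > size - 1) len = size - 1;  /* clamp to the bytes actually present */
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++) sum += data[1 + i];
    return sum;
}

/* libFuzzer entry point: called repeatedly with generated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    (void)sum_payload(data, size);
    return 0;  /* libFuzzer expects 0 */
}
```

Built with something like clang -g -fsanitize=fuzzer,address,undefined, the sanitizers act as the oracle while the fuzzer generates the inputs; the exact flag set to use is a project decision.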


Missing dynamic analysis tools
• Strict aliasing
• Non-terminating loops
• Unsequenced side effects
  – In a function argument list
  – In an expression
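To make "unsequenced side effects" concrete, the comment in the sketch below shows two forms that are undefined in C, followed by a sequenced rewrite. The helper name take_two is hypothetical.

```c
/* Undefined in C:  i = i++ + 1;   and   f(i++, i++);
   Each modifies i twice with no sequencing between the modifications. */

/* Hypothetical helper: takes two consecutive values from *counter,
   with every modification in its own statement. */
void take_two(int *counter, int *first, int *second) {
    *first = *counter;
    *counter += 1;   /* the end of each statement sequences the update */
    *second = *counter;
    *counter += 1;
}
```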


UB Mitigation
• Linux compiles with -fno-delete-null-pointer-checks
• MySQL compiles with -fwrapv
• Many programs compile with -fno-strict-aliasing
• Parts of Android use UBSan in production
• Chrome is built with control flow integrity (CFI) enabled in x86-64
  – Provided by recent LLVMs
  – Overhead < 1%
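Flags such as -fwrapv change the language dialect for a whole program. Where wrapping is wanted only in a few places, a possible alternative is to route the arithmetic through unsigned types, as sketched below. The helper names are hypothetical, and the conversion back to int is implementation-defined rather than undefined.

```c
#include <limits.h>

/* Hypothetical helper: two's-complement wrapping addition.
   Unsigned arithmetic wraps by definition; the conversion back to int is
   implementation-defined (GCC and Clang reduce it modulo 2^N). */
int wrapping_add(int a, int b) {
    return (int)((unsigned)a + (unsigned)b);
}

/* Returns nonzero if INT_MAX + 1 wraps to INT_MIN on this implementation. */
int wraps_at_max(void) {
    return wrapping_add(INT_MAX, 1) == INT_MIN;
}
```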


Issues with UB mitigation
• Solutions aren’t standardized or portable
• There’s no mitigation for concurrency errors
• Memory safety error mitigation tends to be expensive and may break code
  – And ASan is not a hardening tool
• UBSan can be configured as a hardening tool
  – Software developers need to decide if they want to turn a potential exploit into a crash


Summary

• UB is still a serious problem
  – It’s probably too late to fix the C or C++ standard
  – There’s hope for safer dialects
• Tools for managing UB are steadily improving
• All of static detection, dynamic detection, and mitigation should be used
  – Software testing remains extremely difficult


Thanks!