Slide presentation for Certifying (RISC) Machine Code Safe from Aliasing, presented at OpenCert 2013, Madrid. See http://www.academia.edu/3244313/Certifying_Machine_Code_Safe_from_Hardware_Aliasing_RISC_is_not_necessarily_risky.
Certifying (RISC) Machine Code Safe from Aliasing
Peter T. Breuer, University of Birmingham, UK
Jonathan P. Bowen, London South Bank University, UK
Little and Large Problem
● Small arithmetic unit, embedded processor
  – 40 bit arithmetic
● Large memory unit
  – 64 bit addressing
● What do we do with the extra wires?
Hardware Aliasing
● What happens to the extra wires?
  – Depends on the hardware
● 4 + 0xfffffffffffffffc = 0x0000000000000000 or 0xfffff00000000000 ?
● Both mean 0
  – If you use arithmetic to calculate address 0
    ● Sometimes you get the 0 you want
    ● Sometimes not!
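
The ambiguity above can be sketched in a few lines of Python. This is a toy model, not a description of any particular processor: it assumes an ALU that computes only the low 40 bits of a 64-bit address, and it compares two hypothetical policies for the remaining wires (drive them to zero, or pass the second operand's upper bits through unchanged). The names `add_zero_upper` and `add_pass_upper` are made up for illustration.

```python
ALU_BITS = 40                      # arithmetic width from the slide
ADDR_BITS = 64                     # address width from the slide
LOW_MASK = (1 << ALU_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1
HIGH_MASK = ADDR_MASK & ~LOW_MASK  # the "extra wires"


def add_zero_upper(a, b):
    """Hypothetical policy 1: the extra wires are driven to zero."""
    return (a + b) & LOW_MASK


def add_pass_upper(a, b):
    """Hypothetical policy 2: the extra wires carry b's upper bits unchanged."""
    return ((a + b) & LOW_MASK) | (b & HIGH_MASK)


minus_four = 0xFFFFFFFFFFFFFFFC
r1 = add_zero_upper(4, minus_four)
r2 = add_pass_upper(4, minus_four)

# Both results are 0 as far as the ALU's low bits are concerned ...
print(r1 & LOW_MASK == 0, r2 & LOW_MASK == 0)
# ... but the 64-bit memory unit sees two different bit patterns.
print(r1 == r2)
```

Which policy a real machine follows, if either, depends on the hardware; the point of the sketch is only that the arithmetic unit and the memory unit can disagree about whether two values are the same address.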
Also happens in KPU
● A KPU is an encrypted processor
  – Instead of 4 - 4 = 0
  – Does 99900 - 99900 = 78763298
    ● Homomorphism conditions on encrypted arithmetic guarantee correct behaviour
  – Real encryption is always 1-many
    ● The encoding of 0 is 9896861
    ● 99900 - 99900 = 78763298, not 9896861
    ● Another encoding of 0 is 78763298
  – Encrypted arithmetic gives different results
    ● Depending on how you do the calculation
Problem
● How to check a program is safe from hardware aliasing
● Where `hardware aliasing' means that arithmetic on addresses does not always give the same result.
  – Trust only exactly the same calculation
  – Because 4 - 4 != 0
  – It's `equivalent' to 0, not identical!
Can imagine in both cases ...
● Values have invisible extra bits
    ● 42.1101101
    ● Represent different encodings of '42'
● Arithmetic ignores but mutates the extra bits
    ● 42.1101101 + 42.1100001 = 84.0110110
● Memory unit is sensitive to invisible extra bits
    ● Can't see just '42'.
● Needs loving care from programmer
How to deal with hardware aliasing
● Left program returns a different alias of SP to the caller

Bad (left):
Subroutine foo:
  SP -= 32    # 8 local vars
  … code ...
  SP += 32    # destroy frame
  return

Good (right):
Subroutine foo:
  GP = SP
  SP -= 32
  … code ...
  SP = GP
  return
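
A minimal sketch of why the routine that undoes `SP -= 32` with `SP += 32` is unsafe, using the "invisible extra bits" picture. The model is an assumption made for illustration: a value is a pair (visible, hidden), every arithmetic operation changes the hidden part in a way the programmer cannot see (here it simply counts up), a register copy preserves both parts, and memory is keyed on the whole pair. `Word`, `Memory`, `foo_bad` and `foo_good` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    visible: int      # what the programmer sees
    hidden: int = 0   # extra bits the memory unit also sees

    def add(self, n):
        # Assumed behaviour: arithmetic gives the right visible value
        # but perturbs the hidden bits (modelled here as a counter).
        return Word(self.visible + n, self.hidden + 1)


class Memory:
    """Memory keyed on the full word, so aliases of one address differ."""

    def __init__(self):
        self.cells = {}

    def store(self, addr, value):
        self.cells[addr] = value

    def load(self, addr):
        return self.cells.get(addr)   # None models "wrong cell"


def foo_bad(sp):
    sp = sp.add(-32)   # SP -= 32
    sp = sp.add(32)    # SP += 32: same visible value, new alias
    return sp


def foo_good(sp):
    gp = sp            # GP = SP (exact copy)
    sp = sp.add(-32)   # SP -= 32
    sp = gp            # SP = GP: exactly the caller's alias
    return sp


mem = Memory()
caller_sp = Word(0x1000)
mem.store(caller_sp, "caller data")

print(mem.load(foo_good(caller_sp)))  # finds the caller's cell
print(mem.load(foo_bad(caller_sp)))   # misses: different alias
```

Copying the saved pointer back is the only operation in this model guaranteed to reproduce the caller's alias, which is why the safe routine never recomputes SP.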
Regard machine code as compiled from Stack Machine control language
● Good code:
  cspt GP    # copy stack pointer to GP
  push 32    # make 32B space on stack
  …
  rspf GP    # restore stack pointer from GP
  return
What makes that SM code safe?
● No access outside the current frame
  – The stack access commands are
    ● get 10 gp    # 10th stack cell contents ..
    ● put 10 gp    # .. transfer to/from reg gp
  – If all access offsets are in the current frame range
    ● Only one way to access stack content ..
    ● .. by offset from current stack pointer
  – Can only make a new frame, not shift sp
    ● push 32
  – Can only return sp to a value saved earlier
    ● cspt gp … rspf gp
Heap access
● Deal with that later!
  – Look for array and string treatment in text
Verifying SM code
● Means verifying that all stack accesses are within the current frame boundary
● That's so easy! Check n in 'get n r'.
● But we have machine code, not SM code!
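
The check on SM code can be sketched as follows. This is an illustrative checker, not the authors' tool: it assumes SM programs are given as tuples such as `("push", 32)` or `("get", 10, "gp")`, that offsets are in bytes, that `push` replaces the accessible frame with a new one, and that an access at offset n is in range when 0 <= n < frame size. The function name `check_sm` is hypothetical.

```python
def check_sm(program):
    """Return a list of error strings; empty means every access is in frame."""
    errors = []
    frame = 0          # size of the current frame in bytes
    saved = {}         # register -> frame size at the time of cspt
    for i, instr in enumerate(program):
        op = instr[0]
        if op == "cspt":                 # copy stack pointer to register
            saved[instr[1]] = frame
        elif op == "push":               # make a new frame of n bytes
            frame = instr[1]
        elif op in ("get", "put"):       # access stack cell n
            n, reg = instr[1], instr[2]
            if not 0 <= n < frame:
                errors.append(f"{i}: {op} {n} outside frame of {frame}B")
            if op == "get":
                saved.pop(reg, None)     # register no longer holds a saved sp
        elif op == "rspf":               # restore stack pointer from register
            if instr[1] not in saved:
                errors.append(f"{i}: rspf from {instr[1]} with no saved sp")
            else:
                frame = saved.pop(instr[1])
        elif op == "return":
            pass
        else:
            errors.append(f"{i}: unknown instruction {op}")
    return errors


good = [("cspt", "gp"), ("push", 32), ("put", 10, "r1"),
        ("get", 10, "r1"), ("rspf", "gp"), ("return",)]
bad = [("cspt", "gp"), ("push", 32), ("get", 40, "r1"),
       ("rspf", "gp"), ("return",)]

print(check_sm(good))
print(check_sm(bad))
```

The checker needs only one pass and a frame size, which is what makes the SM level so much easier to verify than raw machine code.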
Machine code looks like this
● mov gp sp         # cspt gp
  addi sp sp -32    # push 32
  …
  mov sp gp         # rspf gp
  jr ra             # return
● Is it compiled from safe SM code?
To prove m/c safe
● Apply Hoare-like rules of reasoning
  – Whose names are the SM code that the m/c is supposed to be compiled from
● Requires a human being to choose the rule
  – Or an automaton to search the solution space
  – Either way, it's deduction-guided disassembly
Example
● Think about a 32B current frame
{ sp=c32!10; (10)=x } ld gp 10(sp) [get 10 gp] { sp=c32!10; (10)=gp=x }
● 'c32!10' means pointer to a 32B frame
  – Already written at offset 10
● (10)=x means stack cell 10 has an x-thing
● Machine code is 'ld gp 10(sp)'
  – Load reg gp from offset 10 from stack ptr
● Name of the rule is 'get 10 gp'
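
To make the rule concrete, here is a small sketch of applying it mechanically. It is a toy under stated assumptions, not the authors' logic: the state records the frame size, which offsets hold which "things", and what each register holds; the function `rule_get` (a hypothetical name) checks the precondition of 'get n r' against a textual `ld r n(sp)` instruction and returns the rule name together with the postcondition.

```python
import re


def rule_get(state, instr):
    """Try to read instr as SM 'get n r'; return (rule_name, new_state).

    state = {"frame": size in bytes,
             "stack": {offset: thing},     # offsets already written
             "regs":  {register: thing}}
    """
    m = re.fullmatch(r"ld (\w+) (\d+)\(sp\)", instr)
    if m is None:
        raise ValueError("not a load relative to sp")
    reg, n = m.group(1), int(m.group(2))
    if not 0 <= n < state["frame"]:
        raise ValueError(f"offset {n} outside {state['frame']}B frame")
    if n not in state["stack"]:
        raise ValueError(f"stack cell {n} not yet written")
    new_state = {
        "frame": state["frame"],
        "stack": dict(state["stack"]),
        "regs": {**state["regs"], reg: state["stack"][n]},
    }
    return f"get {n} {reg}", new_state


# Precondition { sp=c32!10; (10)=x }
pre = {"frame": 32, "stack": {10: "x"}, "regs": {}}
name, post = rule_get(pre, "ld gp 10(sp)")
print(name)   # the SM instruction the machine code is read as
print(post)   # postcondition: cell 10 and gp both hold x
```

Choosing which rule to try for a given instruction is the disassembly step; checking its precondition is the deduction step.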
Types
● Logic is based on stack machine model
  – Manipulates types in register/stack/heap
● c32 – pointer to stack frame of size 32
  – Only access by bounded offset from ptr
● u10 – array of size 10 on heap
  – Can only access by offset from fixed base
● c1 – string accessed in increments of 1
  – String is like a stack of frames size 1
  – Stepping up `pops one off the stack'
  – Access within `current frame' only
Typing
● Milner typing
  – Assign type variables to every register and stack position within current frame
  – Calculate effect of instructions
  – Ambiguous modulo assignment of rule
    ● Equals disassembly of instruction
● Proved – soundness
  – Assigned types say what really happens
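
As a reminder of what Milner-style inference does, the sketch below runs textbook unification over type variables attached to registers. It is generic and deliberately tiny, not the algorithm from the paper: type variables are strings starting with `?`, concrete types are other strings such as `c32` or `int`, there are no compound types, and all function names are hypothetical.

```python
def resolve(t, subst):
    """Follow variable bindings until a concrete type or free variable."""
    while t.startswith("?") and t in subst:
        t = subst[t]
    return t


def unify(a, b, subst):
    """Make a and b equal by extending subst; raise TypeError on a clash."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if a.startswith("?"):
        return {**subst, a: b}
    if b.startswith("?"):
        return {**subst, b: a}
    raise TypeError(f"cannot unify {a} with {b}")


# One fresh type variable per register.
types = {"sp": "?sp", "gp": "?gp", "r1": "?r1"}
subst = {}

# 'mov gp sp' read as cspt: gp gets the same type as sp.
subst = unify(types["gp"], types["sp"], subst)
# A later instruction requires sp to point to a 32B frame.
subst = unify(types["sp"], "c32", subst)

print({r: resolve(t, subst) for r, t in types.items()})
```

Registers that no instruction constrains keep their type variable, which is the sense in which the inferred typing is most general.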
Other Proved Things
● Termination
  – Milner algorithm terminates
  – With a typing if one exists, with an error if not
● Uniqueness
  – The type found is the unique most general one
    ● For a given annotation
● There are at most 32 valid annotations
  – Differ in position of stack pointer register
Conclusion
1. Disassemble machine code
  • Human activity
2. Apply Milner typing
  • Includes stack machine bounds verification
  • Automated activity
3. Certify m/c as hardware alias safe
● Steps 1 & 2 can be mixed/simultaneous
● Inference-guided disassembly
4. Apply to assembler in Linux kernel