Transcript
Page 1: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Certifying (RISC) Machine Code Safe from Aliasing

Peter T. Breuer, University of Birmingham, UK

Jonathan P. Bowen, London South Bank University, UK

Page 2: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Little and Large Problem

● Small arithmetic unit, embedded processor
  – 40 bit arithmetic

● Large memory unit
  – 64 bit addressing

● What do we do with the extra wires?

Page 3: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Hardware Aliasing

● What happens to the extra wires?
  – depends on the hardware

● 4 + 0xfffffffffffffffc = 0x0000000000000000 or 0xfffff00000000000 ?

● Both mean 0
  – If you use arithmetic to calculate address 0
    ● Sometimes you get the 0 you want
    ● Sometimes not!
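
A minimal sketch of the ambiguity, assuming a 40-bit adder feeding a 64-bit address bus. The two functions are hypothetical hardware designs, not taken from the slides: one drives the upper 24 wires to zero, the other lets the upper bits of the second operand pass through.

```python
MASK40 = (1 << 40) - 1          # bits the 40-bit arithmetic unit computes
MASK64 = (1 << 64) - 1          # bits the 64-bit memory unit looks at
UPPER24 = MASK64 & ~MASK40      # the "extra wires"

def add_zero_upper(a, b):
    """Hypothetical design A: the extra wires are driven to zero."""
    return (a + b) & MASK40

def add_pass_upper(a, b):
    """Hypothetical design B: the extra wires carry operand b's upper bits."""
    return ((a + b) & MASK40) | (b & UPPER24)

x = add_zero_upper(4, 0xfffffffffffffffc)
y = add_pass_upper(4, 0xfffffffffffffffc)

# Both are 0 as far as the 40-bit arithmetic is concerned ...
print(hex(x & MASK40), hex(y & MASK40))
# ... but the memory unit sees two different 64-bit addresses.
print(hex(x), hex(y))
```

The particular upper-bit pattern produced by design B is an artifact of this model; real hardware may leave something else on those wires.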

Page 4: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Also happens in KPU

● A KPU is an encrypted processor
  – Instead of 4 - 4 = 0
  – Does 99900 - 99900 = 78763298

● Homomorphism conditions on encrypted arithmetic guarantee correct behaviour
  – Real encryption is always 1-many
    ● The encoding of 0 is 9896861
    ● 99900 - 99900 = 78763298 != 9896861
    ● Another encoding of 0 is 78763298
  – Encrypted arithmetic gives a different result
    ● Depending on how you do the calculation
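
The 1-many effect can be illustrated with a toy encoding. This is only a sketch: it is not encryption and not how a KPU works, and the modulus `N` and the functions `encode`, `decode` and `sub` are hypothetical names chosen for the illustration. Subtraction on encodings is correct up to decoding, yet two calculations of 0 give different encoded values.

```python
N = 1000  # hypothetical modulus: the plaintext is the encoding modulo N

def encode(x, r):
    """Encode plaintext x; each choice of r gives a different encoding."""
    return x % N + N * r

def decode(c):
    """Recover the plaintext from any of its encodings."""
    return c % N

def sub(c1, c2):
    """Subtract directly on encodings (homomorphic up to decoding)."""
    return c1 - c2

zero_a = sub(encode(4, 7), encode(4, 3))   # 7004 - 3004
zero_b = sub(encode(4, 5), encode(4, 5))   # 5004 - 5004

print(decode(zero_a), decode(zero_b))      # both decode to 0
print(zero_a == zero_b)                    # but the encodings differ
```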

Page 5: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Problem

● How to check a program is safe from hardware aliasing

● Where `hardware aliasing' means that arithmetic on addresses does not always give the same result.

– Trust only exactly the same calculation

– Because 4 - 4 != 0
– It's `equivalent' to 0, not identical!

Page 6: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Can imagine in both cases ...

● Values have invisible extra bits
  ● 42.1101101
  ● Represent different encodings of '42'

● Arithmetic ignores but mutates the extra bits
  ● 42.1101101 + 42.1100001 = 84.0110110

● Memory unit is sensitive to invisible extra bits
  ● Can't see just '42'.

● Needs loving care from programmer

Page 7: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

How to deal with hardware aliasing

● Left program returns different alias of SP to caller

Bad (left):

  Subroutine foo:
    SP -= 32    # 8 local vars
    … code ...
    SP += 32    # destroy frame
    return

Good (right):

  Subroutine foo:
    GP = SP
    SP -= 32
    … code ...
    SP = GP
    return
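
The difference between the two subroutines can be sketched in a small model. The `Addr` class is a hypothetical stand-in for an address with invisible extra bits; the rule that every arithmetic operation bumps the hidden bits by one is an assumption made purely for the illustration.

```python
class Addr:
    """An address with a visible value and hidden alias bits (toy model)."""
    def __init__(self, value, hidden=0):
        self.value, self.hidden = value, hidden

    def add(self, n):
        # Arithmetic gets the visible value right but, in this model,
        # changes the hidden bits on every operation.
        return Addr(self.value + n, (self.hidden + 1) % 128)

    def same_alias(self, other):
        # The memory unit distinguishes addresses by both parts.
        return (self.value, self.hidden) == (other.value, other.hidden)

def foo_bad(sp):
    sp = sp.add(-32)      # SP -= 32
    sp = sp.add(32)       # SP += 32: same visible value, new hidden bits
    return sp

def foo_good(sp):
    gp = sp               # GP = SP: keep the exact original
    sp = sp.add(-32)      # SP -= 32
    sp = gp               # SP = GP: restore the identical alias
    return sp

caller_sp = Addr(0x1000, hidden=5)
print(foo_bad(caller_sp).same_alias(caller_sp))
print(foo_good(caller_sp).same_alias(caller_sp))
```

In this model the bad version hands the caller a stack pointer with the right visible value but a different alias, while the good version returns exactly the value it was given.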

Page 8: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Regard machine code as compiled from Stack Machine control language

● Good code:

  cspt GP    # copy stack pointer to GP
  push 32    # make 32B space on stack
  …
  rspf GP    # restore stack pointer from GP
  return

Page 9: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

What makes that SM code safe?

● No access outside the current frame
  – The stack access commands are
    ● get 10 gp    # 10th stack cell contents..
    ● put 10 gp    # .. transfer to/from reg gp
  – If all access offsets are in the current frame range
    ● Only one way to access stack content..
    ● By offset from current stack pointer
  – Can only make new frame, not shift sp
    ● push 32
  – Can only return sp to value saved earlier
    ● cspt gp … rspf gp

Page 10: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Heap access

● Deal with that later!
  – Look for array and string treatment in text

Page 11: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Verifying SM code

● Means verifying that all stack accesses are within the current frame boundary

● That's so easy! Check n in 'get n r'.

● But we have machine code, not SM code!
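
The check on SM code might look like the following sketch. The tuple representation of instructions, the function name `check_sm`, the registers `r1` and `r2`, and the bound `0 <= n < frame size` are all assumptions made for the illustration; the slides give only the instruction names.

```python
def check_sm(program):
    """Check that every get/put offset lies inside the current frame.

    program: list of tuples such as ("push", 32) or ("get", 10, "gp").
    Returns True if the program passes, False otherwise.
    """
    frame = 0      # size in bytes of the current frame (0 = no frame yet)
    saved = {}     # register -> frame size in force when sp was copied to it
    for op, *args in program:
        if op == "cspt":                 # copy stack pointer to register
            saved[args[0]] = frame
        elif op == "push":               # make a new frame of n bytes
            frame = args[0]
        elif op == "rspf":               # restore stack pointer from register
            if args[0] not in saved:
                return False             # register does not hold a saved sp
            frame = saved[args[0]]
        elif op in ("get", "put"):       # access stack cell n via register r
            n, r = args
            if not 0 <= n < frame:
                return False             # offset outside the current frame
            if op == "get":
                saved.pop(r, None)       # r overwritten: no longer a saved sp
        elif op != "return":
            return False                 # unknown instruction
    return True

good = [("cspt", "gp"), ("push", 32), ("put", 10, "r1"),
        ("get", 10, "r2"), ("rspf", "gp"), ("return",)]
bad = [("cspt", "gp"), ("push", 32), ("get", 40, "r1"),
       ("rspf", "gp"), ("return",)]
print(check_sm(good), check_sm(bad))
```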

Page 12: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Machine code looks like this

  mov gp sp         # cspt gp
  addi sp sp -32    # push 32
  …
  mov sp gp         # rspf gp
  jr ra             # return

● Is it compiled from safe SM code?
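
The correspondence in the comments above can be sketched as a naive pattern-matching disassembler. This is a simplification, not the authors' method: it assumes the stack pointer is always the register named `sp` and that each instruction matches at most one intended rule, whereas the talk describes choosing the rule by deduction. The `RULES` table and the `disassemble` function are hypothetical.

```python
import re

# Hypothetical pattern table: machine instruction -> stack machine instruction.
RULES = [
    (r"mov (\w+) sp$",          lambda m: f"cspt {m[1]}"),
    (r"addi sp sp -(\d+)$",     lambda m: f"push {m[1]}"),
    (r"mov sp (\w+)$",          lambda m: f"rspf {m[1]}"),
    (r"ld (\w+) (\d+)\(sp\)$",  lambda m: f"get {m[2]} {m[1]}"),
    (r"jr ra$",                 lambda m: "return"),
]

def disassemble(lines):
    """Translate each machine instruction using the first matching pattern."""
    out = []
    for line in lines:
        for pattern, build in RULES:
            m = re.match(pattern, line.strip())
            if m:
                out.append(build(m))
                break
        else:
            raise ValueError(f"no rule for: {line}")
    return out

machine_code = ["mov gp sp", "addi sp sp -32", "ld gp 10(sp)",
                "mov sp gp", "jr ra"]
sm_code = disassemble(machine_code)
print(sm_code)
```

Note that the output of such a translation still has to be checked: in this sample the `ld gp 10(sp)` overwrites the saved stack pointer in `gp`, so the recovered SM code is not safe.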

Page 13: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

To prove m/c safe

● Apply Hoare-like rules of reasoning
  – Whose names are the SM code that the m/c is supposed to be compiled from

● Requires a human being to choose the rule
  – Or an automaton to search the solution space
  – Either way, it's deduction-guided disassembly

Page 14: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Example

● Think about a 32B current frame

  { sp=c32!10; (10)=x }  ld gp 10(sp)  [get 10 gp]  { sp=c32!10; (10)=gp=x }

● 'c32!10' means pointer to 32B
  – Already written at offset 10

● (10)=x means stack cell 10 has an x-thing

● Machine code is 'ld gp 10(sp)'
  – Load reg gp from offset 10 from stack ptr

● Name of the rule is 'get 10 gp'
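
The rule in this example can be read as a function from pre-state to post-state. The sketch below is one possible encoding, with a hypothetical `State` record and `rule_get` function; the side conditions (offset within the frame, cell already written) are taken from the slide's reading of 'c32!10'.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    frame_size: int                              # sp = c<frame_size>!...
    written: set = field(default_factory=set)    # offsets already written
    cells: dict = field(default_factory=dict)    # offset -> type of stack cell
    regs: dict = field(default_factory=dict)     # register -> type

def rule_get(pre, n, r):
    """Apply the rule named 'get n r' to the machine instruction 'ld r n(sp)'."""
    if not 0 <= n < pre.frame_size:
        raise ValueError("offset outside current frame")
    if n not in pre.written:
        raise ValueError("stack cell not yet written")
    post = State(pre.frame_size, set(pre.written),
                 dict(pre.cells), dict(pre.regs))
    post.regs[r] = pre.cells[n]      # register r now holds the cell's type
    return post

pre = State(32, {10}, {10: "x"})     # { sp=c32!10; (10)=x }
post = rule_get(pre, 10, "gp")       # { sp=c32!10; (10)=gp=x }
print(post.regs, post.cells)
```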

Page 15: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Types

● Logic is based on stack machine model
  – manipulates types in register/stack/heap

● c32 – pointer to stack frame of size 32
  – Only access by bounded offset from ptr

● u10 – array of size 10 on heap
  – Can only access by offset from fixed base

● c1 – string accessed in increments of 1
  – String is like a stack of frames of size 1
  – Stepping up `pops one off the stack'
  – Access within `current frame' only

Page 16: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Typing

● Milner typing
  – Assign type variables to every register and stack position within current frame
  – Calculate effect of instructions
  – Ambiguous modulo assignment of rule
    ● Equals dis-assembly of instruction

● Proved – soundness
  – Assigned types say what really happens

Page 17: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Other Proved Things

● Termination
  – Milner algorithm terminates
  – With a typing, if one exists, errors if not

● Uniqueness
  – The type found is unique most general
    ● For a given annotation

● There are at most 32 valid annotations
  – Differ in position of stack pointer register

Page 18: Certifying (RISC) Machine Code Safe from Aliasing  (OpenCert 2013)

Conclusion

1. Disassemble machine code
   • Human activity

2. Apply Milner typing
   • Includes stack machine bounds verification
   • Automated activity

3. Certify m/c as hardware alias safe
   ● Steps 1 & 2 can be mixed/simultaneous
   ● Inference-guided disassembly

4. Apply to assembler in Linux kernel