Facilities for x86 debugging

Introduction to x86 CPU features that can assist programmers in the debugging of their software

Any project ‘bugs’?

• As you work on designing your solution for the programming assignment in Project #2 it is possible (likely?) that you may run into some program failures

• What can you do if your program doesn’t behave as you had expected it would?

• How can you diagnose the causes?

• Where does your problem first appear?

Single-stepping

• An ability to trace through your program’s code, one instruction at a time, often can be extremely helpful in identifying where a program flaw is occurring – and also why

• Intel’s x86 processor provides hardware assistance in implementing a ‘debugging’ capability such as ‘single-stepping’.

The EFLAGS register

[Diagram: the EFLAGS register, with the TF flag at bit 8 and the RF flag at bit 16]

RF = RESUME flag (bit 16): Setting this flag-bit in the EFLAGS register-image that gets saved on the handler’s stack lets the ‘iret’ instruction resume the interrupted instruction without its breakpoint immediately generating yet another debug exception

TF = TRAP flag (bit 8): Setting this flag-bit in the EFLAGS image that ‘pushfl’ saves on the stack, and then executing ‘popfl’, makes the CPU begin triggering a ‘single-step’ exception after each instruction executes

TF-bit in EFLAGS

• Our ‘usedebug.s’ demo shows how to use the TF-bit to perform ‘single-stepping’ of a Linux application (e.g., our ‘linuxapp.o’)

• The ‘popfw’ instruction is used to set TF

• The exception-handler for INT-1 displays information about the state of the program

• But single-stepping starts only AFTER the immediately following instruction executes

How to do it

• Here’s a code-fragment that we could use to initiate single-stepping from the start of our ‘ring3’ application-program:

    pushw  $userSS          # selector for ring3 stack-segment
    pushw  $userTOS         # offset for ring3 ‘top-of-stack’
    pushw  $userCS          # selector for ring3 code-segment
    pushw  $0               # offset for the ring3 entry-point

    pushfw                  # push current FLAGS
    btsw   $8, (%esp)       # set image of the TF-bit
    popfw                   # modify FLAGS to set TF

    lret                    # transfer to ring3 application

Using assembler listings

• You can generate an assembler ‘listing’ of the instructions in our ‘linuxapp.o’ file, then use that listing to follow along while you’re ‘single-stepping’ through that file’s code

• Here’s how to do it:

$ as -al linuxapp.s > linuxapp.lst

• (The ‘-al’ option is for ‘assembly listing’)

A slight ‘flaw’

• We cannot single-step the execution of an ‘int-0x80’ instruction (Linux’s system-calls)

• Our exception-handler’s ‘iret’ instruction will restore the TF-bit to EFLAGS, but the single-step ‘trap’ doesn’t take effect until after the immediately following instruction

• This means we ‘skip’ seeing a display of the registers immediately after ‘int-0x80’

Fixing that ‘flaw’

• The x86 offers us a way to overcome the delayed effect of TF when ‘iret’ executes

• We can use the Debug Registers to set an instruction ‘breakpoint’ which will interrupt the CPU at a specific instruction-address

• There are six Debug Registers:
    DR0, DR1, DR2, DR3 (breakpoints)
    DR6 (the Debug Status register)
    DR7 (the Debug Control register)

Breakpoint Address Registers

DR0, DR1, DR2, DR3 (one breakpoint address in each register)

Special ‘MOV’ instructions

• Use ‘mov %reg, %DRn’ to write into DRn

• Use ‘mov %DRn, %reg’ to read from DRn

• Here ‘reg’ stands for any one of the CPU’s general-purpose registers (e.g., EAX, etc.)

• These special instructions are ‘privileged’ (i.e., they can only be executed by code that is running in ring0)

Debug Control Register (DR7)

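Least significant word (bits 15..0), listed from bit 15 down to bit 0:

0  0  GD  0  0  1  GE  LE  G3  L3  G2  L2  G1  L1  G0  L0

Most significant word (bits 31..16), two bits per field, listed from bit 31 down to bit 16:

LEN3  R/W3  LEN2  R/W2  LEN1  R/W1  LEN0  R/W0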

What kinds of breakpoints?

LEN
    00 = one byte
    01 = two bytes
    10 = undefined
    11 = four bytes

R/W
    00 = break on instruction fetch only
    01 = break on data writes only
    10 = undefined (unless DE set in CR4)
    11 = break on data reads or writes (but not on instruction fetches)
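The DR7 layout and the LEN and R/W encodings above can be combined into a single value. Here is a minimal C sketch of that bit arithmetic, assuming the field positions shown on the DR7 slide (Ln at bit 2n, R/Wn at bits 16+4n, LENn just above it). The names `dr7_enable_local`, `bp_rw` and `bp_len` are hypothetical and are not part of our demo code. Note that an instruction breakpoint needs both R/W = 00 and LEN = 00.

```c
#include <assert.h>
#include <stdint.h>

/* R/W field encodings from the slide (hypothetical names). */
enum bp_rw  { BP_EXEC = 0x0, BP_WRITE = 0x1, BP_READWRITE = 0x3 };

/* LEN field encodings from the slide (hypothetical names). */
enum bp_len { BP_LEN1 = 0x0, BP_LEN2 = 0x1, BP_LEN4 = 0x3 };

/* Return a copy of dr7 in which breakpoint n (0..3) is locally
 * enabled and its LENn and R/Wn fields are replaced.
 * All other bits are left as they were. */
uint32_t dr7_enable_local(uint32_t dr7, unsigned n,
                          enum bp_rw rw, enum bp_len len)
{
    assert(n < 4);

    unsigned shift = 16 + 4 * n;            /* R/Wn starts at bit 16+4n */
    uint32_t field = ((uint32_t)len << 2) | (uint32_t)rw;

    dr7 &= ~(UINT32_C(0xF) << shift);       /* clear old LENn and R/Wn  */
    dr7 |= field << shift;                  /* install the new fields   */
    dr7 |= UINT32_C(1) << (2 * n);          /* set Ln (bit 2n)          */
    return dr7;
}
```

For example, `dr7_enable_local(0x400, 0, BP_EXEC, BP_LEN1)` yields 0x401, which is the always-one bit 10 plus L0. Actually loading the value into DR7 still takes a privileged ‘mov’ executed in ring0.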

Control Register CR4

• The x86 CPU uses Control Register CR4 to activate certain extended features of the processor, while still allowing for backward compatibility of software written for earlier Intel x86 processors

• An example: Debug Extensions (DE-bit)

[Diagram: CR4 (bits 31..0), showing the DE flag at bit 3 among the other feature bits]

Debug Status Register (DR6)

Least significant word (bits 15..0), listed from bit 15 down to bit 0:

BT  BS  BD  0  1  1  1  1  1  1  1  1  B3  B2  B1  B0

Most significant word (bits 31..16): unused ( all bits here are set to 1 )

LEGEND:
    B0 (Breakpoint by DR0)
    B1 (Breakpoint by DR1)
    B2 (Breakpoint by DR2)
    B3 (Breakpoint by DR3)
    BD (Break on Debug-register access)
    BS (Break on Single-step trap)
    BT (Break on Task-switch trap)
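To make the DR6 legend concrete, here is a small C sketch that decodes a DR6 value, assuming the bit positions shown above (B0..B3 at bits 0..3, BS at bit 14). The function and macro names are hypothetical; a real handler would first read DR6 with a privileged ‘mov’.

```c
#include <stdbool.h>
#include <stdint.h>

#define DR6_BD  (UINT32_C(1) << 13)   /* debug-register access */
#define DR6_BS  (UINT32_C(1) << 14)   /* single-step trap      */
#define DR6_BT  (UINT32_C(1) << 15)   /* task-switch trap      */

/* Return the number (0..3) of the lowest-numbered breakpoint whose
 * Bn bit is set in dr6, or -1 if no Bn bit is set. */
int dr6_breakpoint_hit(uint32_t dr6)
{
    for (int n = 0; n < 4; n++) {
        if (dr6 & (UINT32_C(1) << n))
            return n;
    }
    return -1;
}

/* True if the debug exception was reported as a single-step trap. */
bool dr6_single_step(uint32_t dr6)
{
    return (dr6 & DR6_BS) != 0;
}
```

More than one cause can be reported in the same DR6 value, so a handler may want to test each bit rather than stop at the first match.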

Where to set a breakpoint

• Suppose you want to trigger a ‘debug’ trap at the instruction immediately following the Linux software ‘int $0x80’ system-call

• Your debug exception-handler can use the saved CS:EIP values on its stack to check whether the next instruction to be executed is ‘int $0x80’

• Machine-code is: 0xCD, 0x80 (2 bytes)

• So set a ‘breakpoint’ at address EIP+2
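The check described above can be sketched in C, assuming the handler already has a pointer to the bytes at the saved CS:EIP. The names `next_is_int` and `offset_after_int` are hypothetical. This sketch compares both machine-code bytes, so it matches only the requested interrupt number rather than every software interrupt.

```c
#include <stdbool.h>
#include <stdint.h>

#define OPCODE_INT  0xCD    /* first byte of 'int $imm8'        */
#define INT_LENGTH  2       /* opcode byte plus interrupt-number */

/* True if the two bytes at insn encode 'int $vector'. */
bool next_is_int(const uint8_t *insn, uint8_t vector)
{
    return insn[0] == OPCODE_INT && insn[1] == vector;
}

/* Segment-offset of the instruction that follows a 2-byte 'int'. */
uint32_t offset_after_int(uint32_t eip)
{
    return eip + INT_LENGTH;
}
```

The result of `offset_after_int` is still a segment-offset; it has to be converted to a linear address before it can be placed in a breakpoint register.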

Computing a code-breakpoint

isrDBG:
    pushal                      # preserve general registers
    pushl  %ds                  # preserve DS register
    pushl  %es                  # preserve ES register

    lds    40(%esp), %esi       # point DS:ESI to the next instruction
    cmpb   $0xCD, (%esi)        # a software interrupt instruction?
    jne    notINT               # if not, don’t set a breakpoint
    add    $2, %esi             # else point past 2-byte instruction

    # now we want to compute the ‘linear address’ represented by
    # the logical-address (i.e., segment:offset values) in DS:ESI

    # NOTE
    # It’s easy for operating systems like Linux, where segments
    # for code and data have a base-address that’s equal to zero
    # but our current program-examples use memory-segments
    # that don’t begin at address 0x00000000

Segment-selector format

    bits 15..3   array-index for descriptor-table entry
    bit 2        TI (Table Indicator): 0 = GDT, 1 = LDT
    bits 1..0    RPL
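As an illustration of the selector format, here is a short C sketch that splits a 16-bit selector into its three fields. The struct and function names are hypothetical. Because each descriptor occupies 8 bytes, masking off the low three bits gives the byte-offset of the descriptor within its table directly.

```c
#include <stdint.h>

/* The three fields of a segment-selector (hypothetical names). */
struct selector_fields {
    unsigned index;   /* bits 15..3: descriptor-table array-index */
    unsigned ti;      /* bit 2: 0 = GDT, 1 = LDT                  */
    unsigned rpl;     /* bits 1..0: requested privilege level     */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 1u;
    f.rpl   = sel & 3u;
    return f;
}

/* Byte-offset of the descriptor inside its table (index times 8). */
unsigned descriptor_offset(uint16_t sel)
{
    return sel & 0xFFF8u;
}
```

For example, selector 0x0023 decodes to index 4 in the GDT with RPL 3, and its descriptor starts at byte-offset 0x20.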

Segment-Descriptor Format

Bits 63..32:  Base[31..24] | G | D | RSV | AVL | Limit[19..16] | P | DPL | S | X | C/D | R/W | A | Base[23..16]

Bits 31..0:   Base[15..0] | Limit[15..0]

Several instances of this basic ‘segment-descriptor’ data-structure will occur in the Global Descriptor Table (and maybe also in some Local Descriptor Tables)

Getting the base-address

    # The base-address for the memory-segment whose segment-selector is
    # in register DS will need to be extracted from its segment-descriptor

    mov    %ds, %ecx                    # segment-selector to ECX
    lea    theGDT, %ebx                 # setup GDT’s offset in EBX
    bt     $2, %ecx                     # is the selector’s TI-bit set?
    jnc    useEBX                       # no, do table-lookup in GDT
    lea    theLDT, %ebx                 # else do the lookup in LDT

useEBX:
    and    $0xFFF8, %ecx                # isolate selector’s index-field
    mov    %cs:0(%ebx, %ecx), %eax      # descriptor [31..0]
    mov    %cs:4(%ebx, %ecx), %al       # descriptor [39..32]
    mov    %cs:7(%ebx, %ecx), %ah       # descriptor [63..56]
    rol    $16, %eax                    # rotate these bits into position
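The three loads and the rotate above gather the scattered base-address bits into one 32-bit value. The same extraction can be sketched in C, treating the 8-byte descriptor as one 64-bit integer; the function names are hypothetical, and the bit positions are those given in the segment-descriptor format.

```c
#include <stdint.h>

/* Assemble the 32-bit base-address from its three pieces. */
uint32_t descriptor_base(uint64_t desc)
{
    uint32_t base_15_0  = (uint32_t)(desc >> 16) & 0xFFFFu;  /* bits 31..16 */
    uint32_t base_23_16 = (uint32_t)(desc >> 32) & 0xFFu;    /* bits 39..32 */
    uint32_t base_31_24 = (uint32_t)(desc >> 56) & 0xFFu;    /* bits 63..56 */

    return (base_31_24 << 24) | (base_23_16 << 16) | base_15_0;
}

/* Assemble the 20-bit segment-limit from its two pieces. */
uint32_t descriptor_limit(uint64_t desc)
{
    uint32_t limit_15_0  = (uint32_t)desc & 0xFFFFu;          /* bits 15..0  */
    uint32_t limit_19_16 = (uint32_t)(desc >> 48) & 0xFu;     /* bits 51..48 */

    return (limit_19_16 << 16) | limit_15_0;
}
```

A ‘flat’ descriptor such as 0x00CF9A000000FFFF gives a base-address of zero, which is why systems like Linux can skip this step; a descriptor such as 0x00009A012340FFFF gives base-address 0x00012340.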

Enabling the breakpoint

    # instruction linear-address is base-address plus segment-offset
    add    %eax, %esi           # add base-address to offset

    # setup this breakpoint-address in Debug Register DR0
    mov    %esi, %dr0           # breakpoint-address in DR0

    # now activate a ‘local’ code-breakpoint for the address in DR0
    mov    %dr7, %eax
    bts    $0, %eax             # set L0 (local enable for breakpoint 0)
    mov    %eax, %dr7
    …
    popl   %es
    popl   %ds
    popal
    iret

Detecting a ‘breakpoint’

• Your debug exception-handler can read DR6 to check for an occurrence of breakpoint0

    mov    %dr6, %eax           # get debug status
    bt     $0, %eax             # breakpoint #0?
    jnc    notBP0               # no, another cause
    btsl   $16, 12(%ebp)        # set the RF-bit
    # or disable breakpoint0 in register DR7
notBP0:

Detecting a ‘breakpoint’

• Your debug exception-handler reads DR6 to learn why a debug-exception occurred

    # EXAMPLE
    # was this exception triggered by a breakpoint defined in DR0…DR3?
    mov    %dr6, %eax           # read debug status-register
    test   $0xF, %eax           # any breakpoint matches?
    jz     notBP                # no, leave RF-bit unchanged

    # OK, we need to set the RF-bit (Resume Flag) before we execute ‘iret’
    # (so as not to immediately encounter the very same breakpoint again)
    btsl   $16, 48(%esp)        # set RF-bit in EFLAGS image

notBP:

In-class exercise #1

• Our ‘usedebug.s’ demo illustrates the idea of single-stepping through a program, but after several steps it encounters a General Protection Exception (i.e., interrupt $0x0D)

• You will recognize a display of information from the registers that get saved on the stack

• Can you determine why this fault occurs, and then modify our code to eliminate it?

The GP Fault’s stack-layout

    EFLAGS
    ----- CS
    EIP
    error-code
    EAX
    ECX
    EDX
    EBX
    ESP
    EBP
    ESI
    EDI
    ----- DS
    ----- ES
    ----- FS
    ----- GS

Intel x86 instruction-format

• Intel’s instructions vary in length from 1 to 15 bytes, and are comprised of five fields:

    instruction prefixes      0, 1, 2 or 3 bytes
    opcode field              1 or 2 bytes
    addressing-mode field     0, 1 or 2 bytes
    address displacement      0, 1, 2 or 4 bytes
    immediate data            0, 1, 2 or 4 bytes

Maximum number of bytes = 15

NOTE: When the processor’s IA32e mode is activated, some of these field-sizes may be larger, to accommodate additional addressing-modes and operand-sizes

A few examples

• 1-byte instruction: in %dx, %al

• 2-byte instruction: int $0x16

• A prefixed instruction: rep movsb

• And here’s a 12-byte instruction:

    cmpl $0, %fs:0x400(%ebx, %edi, 2)
    – 1 prefix byte
    – 1 opcode byte
    – 2 address-mode bytes
    – 4 address-displacement bytes
    – 4 immediate-data bytes

In-class exercise #2

• Modify the debug exception-handler in our ‘usedebug.s’ demo-program so it will use a different Debug Register (i.e., DR1, DR2, or DR3) to set an instruction-breakpoint at the entry-point to your ‘int $0x80’ system-service interrupt-routine (i.e., at ‘isrDBG’)

• This can allow you to do single-stepping of your system-call handlers (e.g., ‘do_write’)
