failing like never before


Smashing the Stack for Extra Credit

(So this one is a little old... I have a habit of writing up drafts, stashing them away to be uploaded later, and then completely forgetting about them.)

A quick intro to buffer overflow attacks for the unlearned (feel free to skip this bit).

I would highly recommend reading AlephOne's Smashing the Stack for Fun and Profit if you really want to learn about buffer-overflow attacks, but you can read my bit instead if you just want a quick idea of whats its all about.

Programs written in non-type safe languages like C/C++, do not perform bounds checking on memory when doing reads and writes, and are therefore often to susceptible to what is known as a buffer-overflow attack. Basically, when a program allocates some array on the program stack, the possibility exists that if the programmer is not careful, her program may accidentally overwrite the bounds of the array. One could imagine a situation where a program allocates N amounts of bytes on the stack, reads in input from stdin using scanf (which terminates reading input when it hits a newline) and stores the read bytes in the allocated array. However, the user sitting at the terminal might decide to enter in more then N bytes of data, causing the program to overwrite the bounds of its array. Other values on the stack could then be unintentionally altered which could cause the program to execute erratically and even crash.

But a would-be attacker can do more then just crash a program with a buffer-overflow attack, they could potentially gain control of the program and even cause it to execute arbitrary code. (By "executing arbitrary code" I mean that the attacker could make the program do anything.) An attacker can take control of a program by writing down the program stack past the array's bounds and changing the return address stored on the stack, so that when the currently executing function returns control to the calling function, it actually ends up executing some completely different segment of code. At first glance, this seems rather useless. But an attacker can set the instruction pointer to anything, she could even make it point back to the start of the array that was just overwritten. A carefully crafted attack message, could cause the array to be filled with some bits of arbitrary assembly code (perhaps to fork a shell and then connect that shell to a remote machine) and then overwrite the return address to point to the top of the overwritten array.

A problem with the generic buffer overflow attack is that the starting location of the stack is determined at runtime and therefore can change slightly. This makes it difficult for an attacker to know exactly where the top of the array is every single time. A solution to this, is to use a "NOP slide," where the attack message doesn't immediately begin with the assembly code but rather begins with a stream of NOPs. A NOP is an assembly language instruction that basically does nothing (I believe that it was originally included in the x86 ISA to deal with hazards) so as long as the instruction pointer is reset to point into somewhere the NOP slide, the computer will "slide" (weeeeee!!!!) into the rest of the injected assembly code.

Sounds simple so far? Just you wait....

The trials and tribulations of my feeble attempt to "smash the stack."

The professor for my security class threw in a nice little extra credit problem on a homework assignment last quarter. One of the problems in the homework asked us to crash a flawed web-server using a buffer overflow attack, but we could get extra credit if we managed to root the server with the buffer overflow. Buffer overflow attacks on real software are nontrivial, something my professor made sure to emphasize when he told us that in his whole experience of teaching the class, only one student had ever successfully rooted the server. Now Scott Adams, author of Dilbert, mentioned in one of his books that the best way to motivate an engineer, is to tell them that a task is nearly impossible and that if they're not up for the challenge, then its no big deal because so-and-so could probably do it better. This must be true, because after discussion section, I went straight to a computer and started reading Smashing the Stack for Fun and Profit, intent on being the second person to "smash the stack" in my school.

I was able to crash the server software within less then one minute of swapping the machine's disk image in, as it was a ridiculously simple task to guess where an unbounded buffer was being used in the server code and force a segmentation fault. It took me a few more minutes to trace down the exact location of where the overflow was occurring (there was a location in the code where the program would use strcat() to copy a message), but as soon as I did, I booted up GDB and started gearing up for some hard work.

I created a Python script to generate the message that would crash the server, and ran it a few times against the server, stepping through instructions in GDB to get a better look at what was happening. The first thing I noticed was that there were local pointers and variables on the stack that would get over written when I overflowed the unbounded buffer. Since these values were used after the server called strcat() (which would copy my message into an unbounded buffer on the stack), corrupting these values could very well result in the program crashing prematurely, before it even reached the end of the function. This is generally a bad thing, since my intent is not to crash the software, but rather to make it do whatever I want it to. So the first thing I had to do, was figure out how to generate a message that would overwrite the local stack variables with values that would allow the function to think that everything was functioning properly. Since the size of the local stack is known, I was able to adjust my message length so I could modify stack variables. I had to modify pointers on the stack (since they were above the stored instruction pointer on the stack) and changed them (which were supposed to point to dynamically allocated memory on the heap) to point to someplace much further down on the stack; this corrupted the values near the base of the stack but prevented segmentation faults from occurring. It took me a while to remember that x86 is a little-endian machine, and that I would have to be careful how I arranged the bytes in my message. It wasn't long however, before I hit a brick wall.

I had already copied the shell code from Aleph One's Smashing the Stack for Fun and Profit into my message and filled the space before it with a nice little no-op slide, so at this point I figured I was ready for overwriting the function's return address. I tried for over an hour to get the return address to point to somewhere in my overflowed buffer but no matter how I set it up, the function always crashed when it returned. On a whim, I set the return address on the stack to point to a location not on the stack, but rather in the data segment where program code is stored. This worked beautifully. Unfortunately, this was quite useless since nowhere in the program code, did it execvp() a bash shell and connect it to my machine. Sure I could make the program do some really crazy things, but this still wasn't the same as rooting the machine.

By this point in time, I was spending a lot of time checking out register values, and was making extensive use of the "disassemble" instruction in GDB. After spending a few minutes searching the inter-webs, I learned about an instruction in GDB called "si" which is basically a "step instruction" kind of instruction that steps through each assembly level instruction in the code. Using "si" I found that the server was crashing on the "ret" assembly instruction, and that more specifically, it was crashing right as it tried to load the $EIP register. I guessed that what was happening was that the stack pointer had been reset to the former base pointer and so my overflowed buffer which resided outside of the program stack was being viewed by the machine as unallocated memory so attempts to access it were resulting in a segmentation fault, but this turned out to be wrong.

It took a few minutes of pondering, but then I realized, that clearly, the machine was operating with executable bit protection (where a control bit on each page of memory allows a page to be either executable or writable but not both) and there was no way it was going to run an instruction that wasn't sitting in a page of memory that had the executable bit set. (Note that executable bit protection requires not only a processor that has the necessary technology, but also an operating system that can recognize.) So there was no way I could get the machine to run a single instruction on my carefully crafted no-op slide.

I had already invested a great deal of time into this attack by this point. I've explained things pretty simply up until now, but make no mistake, every bit of progress that I made was bought with hours of work (for the most part, I was learning completely new things). I was faced with two choices at this point, give up on trying to root the system, thereby wasting many hours of work, and commit myself to finishing the rest of the homework, or else continue with my attack. My obsession won out and I chose the hard route; continue with the attack.

Executable bit protection, which is a sort of protected memory, is not the end all method to protecting against buffer overflow attacks as one might first naively think; there are still ways to finagle root access to a machine with a buffer overflow. One in particular, the return-to-libc attack, makes use of the knowledge that on Unix systems, libc is always linked to the program and is accessed at a well known location in memory. Libc contains many useful functions for an intrepid hacker, including system(). In a libc attack, instead of filling the stack with some assembly code and then getting the return address to point to that assembly code, the intent instead is to point the return address to a specific function in libc and then pass as arguments, some value on the stack. Calling a function in libc is perfect legitimate behavior for a program so this kind of buffer-overflow based attack is much harder to protect against. However, it does complicate an attacker's life...

Now heres a rough ASCII picture of the stack for the web-server program that I was trying to crash (note that the stack grows up).
======top of stack=====
|                 |
|                 |
|                 |
|                 |
++local variables++
++frame pointer++++
++return address+++
++socket id++++++++
=====bottom of stack====

Note that below the return address, is a socket identifier that is passed as an argument to the callee function. In the web-server program, the value of the socket identifier is used after a message is copied into the array. What this means, is that if I wanted to write down the stack past the return address, I had to make sure to maintain the value of the socket identifier in order to prevent the machine from crashing on a bad socket id. This was an important fact to know, because in order to be able to make a useful call to a function in libc, I had to be able to pass arguments to the function. Arguments are of coursed passed to other functions by pushing values onto the top of the stack, which in this case would mean overwriting the socket id.

And at this point, I hit a problem that I simply couldn't shift.

I thought at first, that all I had to do was figure out how to put a value in the socket id that would allow the program to execute normally, and could also double as an argument to excevp (which is by no means a trivial task). However, the big problem was that the socket id number was generally a small number, in this case, typically ranging from 4 to 7. The socket id was a 4 byte signed integer, and 7 represented in 4 bytes in hex is:

0x0000 0007

Now recall that the way I was achieving a buffer overflow, was to send a really long message, and then expect the flawed web-server to use strcat() to copy that message into a buffer that was too small to properly accommodate the data. The function strcat() copies bytes from a source pointer to a destination pointer until it hits a null byte. Now look at the hex value above, and notice all the leading zeros! Those leading zeros would be interpreted by strcat() as a null byte which would cause it to immediately stop copying bytes! When I realized this, I was a little stunned, but quickly recovered when I figured that I could simply open a bunch of different connections to the server and hold them open, thereby increasing the socket number used with each successive connection. Lucky me, the server software was quite stupid, and it didn't use SYN cookies and nor did it make any attempt to close connections that were left open and inactive for too long. In order to remove all traces of null bytes from the 32 bit integer, I would need at the very least, a socket number value of:

0x0101 0101

This is a value slightly larger then 16 million, and it gives rise to yet another problem. So lets say that for whatever reason, the server's OS doesn't grant socket numbers in perfect successive values, and instead gives socket ids as multiples of 10 (10, 20 30, 40...), then I would only have to open up 1.6 million connections instead of 16 million! Each full connection that I opened up to the server would cause the server to fork a process to handle the new data connection, so if I opened up 1.6 million connections, the server would have to fork 1.6 million children.

Question: How many processes can a vanilla Red Hat Linux installation handle?

Answer: A lot less then 1.6 million...

Likewise, there was no way the server had the ability to handle 1.6 million half-open connections. So there was no way I was going to be able to pump the socket id up to a high enough value without DOSing the hell out of the server. As my OS professor would say, I was hosed. At this point in time, I was only about a third done with my homework and it was due in about two hours. For the sake of my grade, I was forced to forget the extra credit challenge and pour my efforts into trying to finish the rest of the assignment. (I ended up getting about a 40% on the assignment.) For all the hours and hours of effort that I put into smashing the stack, the only things I succeeded in doing were to make the server crash in several ways and to DOS it. Over the weekend, I put a few more hours of thinking into the problem, but was unable to come up with a satisfactory solution.

So what did I learn from this 12 hour ordeal?

A good buffer overflow attack on real software is a non-trivial problem. AlephOne and every other buffer overflow explanation on the Internet use highly contrived attack scenarios that make it seem like a simple thing, but the fact remains that the devil is in the details. All the little things, like how to get around needed values on the stack, and how to copy a null byte into a buffer are completely ignored in introductory tutorials.

Buffer overflows are hard!

Comments (0) Trackbacks (0)

No comments yet.

Leave a comment

Security Code:

Trackbacks are disabled.