Buffer Overflow - Easy to understand primer

Every week, security vulnerabilities are reported in widely deployed software. Many of these are buffer overflows, which a malicious user can exploit to gain control of a computer system by supplying specially crafted input data. Buffer overflows are found in web browsers, web servers and all other types of programs and services. No doubt, buffer overflow is a serious threat to system and data integrity.

Could someone please describe in simple terms what exactly a buffer overflow is, with some examples such as the recent heap spray vulnerability and other recent exploits?

Thanks in Advance.

Basic Process & Memory Information

Most modern systems can run multiple programs and perform multi-tasking. The operating system provides access points (known as system calls) that these programs need in order to execute properly. The processes running on the system generally consist of:

  • PCB - Information about the state of the process. To manage and control processes, the operating system keeps specific information about each one in its Process Control Block.
  • Text - The executable program code itself.
  • Data - The data on which the program operates.
  • Resources - Whatever else execution requires, such as memory (heap and stack) and file access.

An executing program is made up of three main memory areas:

  • Instruction Memory - contains the program code in machine language, which is executed by the CPU
  • Stack Area - This is composed of activation records (stack frames), which hold information related to function calls: the arguments passed, the local variables, the return address and so forth. Both caller and callee must know where this data is located, which is achieved with the stack pointer and the frame (base) pointer. Every time a function is called, the process reserves a portion of stack memory for the parameters passed to the function, for the local variables used within it and, where needed, for the result it returns. The sizes of local variables and structures are normally fixed at compile time, but the space itself is reserved on the stack when the function is called.
  • Heap Area - This area of memory contains dynamic length data. The heap is a portion of memory allocated dynamically (as needed, at runtime) for the use of the process. Whenever the malloc or calloc functions are used in C for instance, they reserve space in heap memory.

Buffer overflows generally occur on the heap or on the stack. But since data on the heap does not directly control the flow of execution the way a return address on the stack does, heap overflows are used less often in simple exploits.

Understanding Function Call & Memory Stack

I will try to explain what happens when a function is called. Consider the code below:

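void function(int a, int b, int c) {
    char buffer1[6];    /* locals: space for both buffers is reserved  */
    char buffer2[20];   /* in this function's stack frame on each call */
}

int main(void) {
    function(1, 2, 3);  /* the three arguments are pushed onto the stack */
    return 0;
}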

To make this easier to follow, I have broken the process into 11 steps:

  1. Push the parameters (a, b, c) onto the stack.
  2. Call function by pushing EIP onto the stack. EIP contains the address of the next CPU instruction, so the pushed value is the return address.
  3. Now we are in the new function, so create a new stack frame: push the caller's EBP, then point EBP at the base of the new frame. EBP is the Base Pointer, also known as the Frame Pointer, and is used to calculate offsets to function parameters and local variables.
  4. Allocate local variables between EBP and ESP. ESP (the stack pointer) points to the last element pushed onto the stack.
  5. Save the old CPU register values, so that the caller's register contents are not lost when the function uses those registers.
  6. ...Function runs and executes its code...
  7. Restore the CPU registers (they were pushed last, so they come off first).
  8. Release the local variables.
  9. Restore the value of EBP.
  10. Return from the function: pop the old EIP and jump to that address.
  11. Clean up the passed parameter values from the stack.

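So when you are at step 6, the stack looks like this, from higher to lower addresses: the parameters, the return address (the saved EIP), the saved EBP, the local variables (buffer1 and buffer2) and finally the saved registers.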

At last, What is a Buffer Overflow?

Now consider a case where the function expects a string of at most 100 characters but you pass it a string that is longer (let's say 140 characters). If this is not properly handled, the program will try to copy the 140-character string into a buffer allocated for 100 characters. The extra characters will run past the end of the buffer and overwrite the space holding the saved EBP, the saved EIP and so on. (Not good.)

This, in turn, will corrupt the process stack. So a properly crafted buffer overflow can overwrite a function's return address (the saved EIP), which in turn can alter the program's execution path. The saved EIP is the address of the instruction that is executed immediately after the function returns.

Once an attacker can overwrite a function's return address, he will typically want to spawn a shell (with root permissions, if the vulnerable program runs as root) by redirecting the execution path to code that does so. But suppose the buggy program contains no such code. Then what?

If that's the case, the attacker places the code he wants to execute in the buffer itself, as part of the overflowing input. He then overwrites the return address so that it points back into the buffer, and the injected code runs when the function returns. Such code can be supplied to the program through environment variables or program input parameters.

Hope this helps...
