Is my understanding of buffer overflows correct?

Question

I am new to pentesting and wondering if my current understanding of buffer overflow exploits is correct. Assuming an operating address space of 3 addresses, an instruction space of 2 addresses, and an adjacent 1 address instruction pointer:

(The arrow is the instruction pointer's current position.)

==>000x243  
001xFFF

002x26B  
003x4FF  
004x14C  

005x000 (points to address 000 as the next instruction)

Would the buffer overflow replace the operating space with 4 addresses, overwriting the instruction pointer call to the operating space with the address of one of the operating space, thus making the next executed machine code be one of the exploiter's choosing? Shown here:

000x243  
001xFFF

==>002x456 (arbitrary machine code to be executed)  
003xB3B  
004x00F

005x002 (points to address 002 as next instruction)

[this answer](http://security.stackexchange.com/questions/82750/why-are-buffer-overflows-executed-in-the-direction-they-are/82846#82846) might be useful to you. — RoraΖ, Feb 25 '16 at 15:51

score 2 · Answer 1 · answered Feb 25 '16 at 16:07

Usually, no: buffer overflow exploits are not about overwriting code. But there are definitional issues.

A buffer overflow exploit (of the "write" kind) leverages a situation where the target system can be made to write more data than fits in the area that it writes to, thereby overwriting whatever was in RAM next to the buffer with data that the attacker can more or less control. The classical exploit technique is to find, within the overwritten area, a pointer to code that the target application will follow at some point. The attacker can thus derail the application and make it execute instructions that are elsewhere in RAM.

While code is in RAM, most if not all of it is in read-only sections; the operating system uses the platform's MMU to mark these sections as such, and any attempt at writing over code will trigger an exception from the OS kernel. Also, in a typical system, code sections and data sections are rather far apart in the address space, making the overrun hard to attempt.

A very classical situation is when the overflowed buffer is on the stack (in C parlance, it is a "local variable"), because the stack also includes pointers to code, namely the saved instruction pointer. When a function is entered, the instruction pointer is saved on the stack, so that when the function returns, execution can jump back to the call site. This saved instruction pointer is thus a code pointer that will be followed at some point in the execution, so it is a prime target for overflows.
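To make the stack case concrete, here is a toy simulation in Python. It is only a sketch: the frame layout (an 8-byte buffer immediately followed by a 4-byte saved return address), the addresses, and the helper names `make_frame`, `unchecked_copy` and `saved_return_address` are all assumptions chosen for illustration, not the layout of any real platform.

```python
import struct

# Hypothetical frame layout: 8-byte local buffer, then a 4-byte saved return address.
BUF_SIZE = 8
RET_OFFSET = BUF_SIZE

def make_frame(return_address):
    """Build a fake stack frame holding an empty buffer and a saved return address."""
    frame = bytearray(BUF_SIZE + 4)
    struct.pack_into("<I", frame, RET_OFFSET, return_address)
    return frame

def unchecked_copy(frame, data):
    """Copy data into the buffer without checking it against BUF_SIZE."""
    for i, byte in enumerate(data):
        frame[i] = byte

def saved_return_address(frame):
    """Read back the address the function would 'return' to."""
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]

# Input that fits: the saved return address is untouched.
intact_frame = make_frame(0x08048000)
unchecked_copy(intact_frame, b"hi")
intact = saved_return_address(intact_frame)

# Input that is 4 bytes too long: the extra bytes land on the saved address.
frame = make_frame(0x08048000)
unchecked_copy(frame, b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF))
hijacked = saved_return_address(frame)

print(hex(intact), hex(hijacked))
```

Note that no code is overwritten here: only a pointer to code changes, which is the point made above.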

Another rather common case is related to the use of vtables in languages that support an object-oriented development style (typically C++, but "object-oriented C" would also qualify). Vtables are tables in RAM, full of pointers to functions, that are followed when methods are invoked. Such pointers are good targets for overflow exploits (the PS3 Jailbreak worked that way).
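The same idea can be sketched for function pointers stored next to data. This is a deliberately simplified model: a real C++ object holds a pointer to a vtable, which in turn holds the function pointers, while here the object holds a single method pointer directly. The "code addresses" `0x1000` and `0x2000`, the `CODE` map, and the function names are hypothetical.

```python
import struct

def greet():
    return "greet"

def admin_shell():
    return "admin_shell"

# Hypothetical map from "code addresses" to functions, standing in for code in RAM.
CODE = {0x1000: greet, 0x2000: admin_shell}

# Hypothetical object layout: 8-byte name field, then a 4-byte method pointer.
NAME_SIZE = 8
obj = bytearray(NAME_SIZE + 4)
struct.pack_into("<I", obj, NAME_SIZE, 0x1000)

def set_name(obj, name):
    """Write the name field without checking the length against NAME_SIZE."""
    for i, byte in enumerate(name):
        obj[i] = byte

def call_method(obj):
    """Follow the stored method pointer, as a virtual call would."""
    (target,) = struct.unpack_from("<I", obj, NAME_SIZE)
    return CODE[target]()

before = call_method(obj)
set_name(obj, b"B" * NAME_SIZE + struct.pack("<I", 0x2000))
after = call_method(obj)

print(before, after)
```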

Now there are languages that are not compiled to raw instructions, but to a special format called threaded code. Threaded code is basically a long succession of pointers to code (to other threaded code, or to sequences of native instructions). This is typical of languages that accept building code at runtime, i.e. things which are often described as "interpreters". A buffer overflow in that context could lead to an overwrite of the (threaded) code, because, from the point of view of the kernel and the CPU, threaded code is data. This is the part where definition details can bite you.
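A rough way to picture the threaded-code case is a tiny interpreter whose "code" is just a run of cells in the same memory as the data buffer. Everything here is an assumed toy design: the primitive numbers, the `HALT` marker, the four-cell buffer placed right before the code, and the names `fresh_memory`, `run` and `unchecked_write`.

```python
log = []

# Hypothetical primitives; each threaded-code cell names one of them.
PRIMITIVES = {
    1: lambda: log.append("load"),
    2: lambda: log.append("add"),
    3: lambda: log.append("print"),
    9: lambda: log.append("ATTACKER"),
}
HALT = 0

BUF_START, BUF_SIZE = 0, 4
CODE_START = BUF_START + BUF_SIZE  # threaded code sits right after the buffer

def fresh_memory():
    """Four data cells followed by the threaded program: load, add, print, halt."""
    return [0] * BUF_SIZE + [1, 2, 3, HALT]

def run(memory):
    """Walk the threaded code, calling the primitive each cell points to."""
    log.clear()
    ip = CODE_START
    while memory[ip] != HALT:
        PRIMITIVES[memory[ip]]()
        ip += 1
    return list(log)

def unchecked_write(memory, values):
    """Store values into the buffer without checking against BUF_SIZE."""
    for i, value in enumerate(values):
        memory[BUF_START + i] = value

mem = fresh_memory()
normal = run(mem)

# Six values into a four-cell buffer: the last two overwrite threaded code.
unchecked_write(mem, [7, 7, 7, 7, 9, 9])
hijacked = run(mem)

print(normal, hijacked)
```

From the interpreter's point of view the program itself was rewritten, yet at the level of the kernel and CPU only data cells changed.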

Note that all of the above is about an overflow where data is written into a buffer, and past that buffer. There are also "read" overflows (maybe the term "overrun" would be more appropriate), in which data is read past a source buffer (and presumably written into another buffer). This can lead to leakage of secrets, and also to the application working on inconsistent data. Consequences vary greatly, depending on what the application does with the data. This can be very inconvenient, that is, very convenient for an attacker, if the overflow is such that the attacker obtains parts of cryptographic keys or other sensitive data.
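A "read" overflow can be sketched the same way. In this toy model, a 5-byte buffer is followed in memory by a secret, and the reader trusts a length supplied by the caller. The memory contents, the secret value, and the function names `echo_unchecked` and `echo_checked` are made up for illustration.

```python
# Hypothetical memory: a 5-byte buffer immediately followed by a secret.
SECRET = b"KEY=hunter2"
memory = bytearray(b"hello" + SECRET)
BUF_START, BUF_SIZE = 0, 5

def echo_unchecked(memory, claimed_length):
    """Return claimed_length bytes starting at the buffer, trusting the caller."""
    return bytes(memory[BUF_START:BUF_START + claimed_length])

def echo_checked(memory, claimed_length):
    """Same, but never read past the end of the buffer."""
    length = min(claimed_length, BUF_SIZE)
    return bytes(memory[BUF_START:BUF_START + length])

leak = echo_unchecked(memory, 16)
safe = echo_checked(memory, 16)

print(leak, safe)
```

Nothing is overwritten and no control flow changes; the damage is purely that adjacent data leaves the process.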

But, when it overwrites the saved instruction pointer, couldn't it use the filler as code as well? eg, the space is 20 addresses, and it writes 20 bytes of machine code and then 1 byte that specifies the start of the previously written machine code? Wouldn't that streamline it a bit? — ThePracticalCryptographer, Feb 25 '16 at 16:20
Wouldn't that be a good idea to streamline a buffer overflow? — ThePracticalCryptographer, Feb 25 '16 at 16:21
Ah, well, while the OS traps attempts at writing data over code, it traditionally did not prevent using data as code, so the classical exploit would include the attacker's chosen alternate code right in the data it sends for the overflow. Modern desktop/server operating systems mark the stack as "non-executable", so this kind of thing is harder. — Tom Leek, Feb 25 '16 at 16:53
