In this article, we will talk about buffer overflows, one of the most widely known classes of vulnerability and one that has been exploited with great success in the wild. We’ll see what makes them so exciting for the security research community and what the most common buffer overflow techniques are, and then we will take a deep dive into an application that is known to be vulnerable to this type of attack: prepare for a step-by-step guide that shows you how it can be exploited. After the JAD Java Decompiler 1.5.8e case study, you’ll find a comprehensive list of resources for you to learn more about this topic.
So, what exactly is it that makes the buffer overflow vulnerability so interesting?
The answer is simple: a malicious actor can use it to change what a piece of software does, up to taking over its entire functionality.
To understand what happens when someone exploits this type of vulnerability, think about a glass of water. What happens if you pour 1 liter of water into a 250 ml glass? Obviously, the water will overflow the glass and spread across the table. This is how the vulnerability works: someone writes more data to a memory location than it was allocated to hold. The excess overwrites adjacent areas of memory, and if those areas hold data the program relies on, an attacker can use them to manipulate the software.
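To make the analogy concrete, here is a minimal Python sketch that models memory as a flat byte array. The layout (an 8-byte buffer followed directly by a 4-byte saved return address), the address 0x08041000 and the helper names are all assumptions made for illustration; real stack frames are laid out by the compiler and look different.

```python
import struct

BUF_SIZE = 8                               # capacity of the "glass" (hypothetical)
RET_SLOT = slice(BUF_SIZE, BUF_SIZE + 4)   # the 4 bytes right after the buffer

def make_memory(return_address: int) -> bytearray:
    """Model memory as: [8-byte buffer][4-byte saved return address]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_address))

def unsafe_copy(memory: bytearray, data: bytes) -> None:
    """Copy data to the start of memory with no bounds check, like strcpy."""
    memory[0:len(data)] = data

def saved_return_address(memory: bytearray) -> int:
    """Read the 4-byte little-endian value stored after the buffer."""
    return struct.unpack("<I", memory[RET_SLOT])[0]

memory = make_memory(0x08041000)           # hypothetical return address
unsafe_copy(memory, b"A" * 8)              # fits exactly: nothing spills
print(hex(saved_return_address(memory)))   # 0x8041000
unsafe_copy(memory, b"A" * 8 + b"BBBB")    # 4 extra bytes spill over
print(hex(saved_return_address(memory)))   # 0x42424242
```

The second copy is the liter of water: the first 8 bytes fill the buffer and the remaining 4 land on the value stored next to it.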
Examples of popular attacks
- RCE on Steam Client via buffer overflow in Server Info: In 2018, HackerOne user @vinnievan reported that he had written a custom Python server for CS:GO, Half-Life and TF2. After successfully implementing the protocol, he fuzzed all the parameters and found a stack-based buffer overflow. The full report can be found here: HackerOne (Bounty reward $18,000).
- 1-byte heap buffer overflow in DNS resolver nginx: In 2021, HackerOne user @luismerino reported that a specially crafted DNS response could overwrite 1 byte of heap memory in nginx, resulting in a crash or potential remote code execution. The full report can be found here: HackerOne (Bounty reward $500).
- VLC 4.0.0 – Stack Buffer Overflow (SEH): In 2019, HackerOne user @qrayn reported a buffer overflow in the rist.c module of VLC that caused an application crash and an SEH record overwrite. Successful exploitation could compromise the full system. The full report can be found here: HackerOne (Bounty reward $2,817).
What are the most common buffer overflow techniques?
Did you know that the first widely known exploitation of a buffer overflow dates back to 1988? The Morris worm exploited this type of vulnerability, and the purpose of the software was reportedly to gauge the size of the Internet. It did not go as planned: the worm caused a Denial of Service. Since that incident, researchers have developed all sorts of buffer overflow exploitation techniques, as well as protection methods to stop this kind of vulnerability.
Among the most common buffer overflow techniques we can mention:
- Buffer overflow Smash the Stack: In short, someone overflows a buffer on the stack to overwrite the saved return address, so that when the function returns, execution jumps to malicious code placed on the stack. Thus, an attacker can manipulate the program itself.
- Buffer overflow Structured Exception Handling (SEH): Structured exception handling (SEH) is simply code in a program that is meant to handle situations where the program throws an exception due to a hardware or software issue. This means identifying those situations and trying to resolve them. Someone can overwrite an SEH record (its pointer to the next SEH record and the current record’s handler) and then trigger an exception, so that the corrupted handler pointer redirects execution to injected malicious code.
- Buffer overflow Unicode: An attacker can trigger a buffer overflow in a program that converts its ASCII input to Unicode. The payload is “translated” to Unicode in memory, which makes this method a bit harder to carry out because it has 2 stages: one in which the stack is aligned and the null bytes introduced by the conversion are worked around (Venetian shellcode), then one in which the exact offset of the NOPs is calculated so that the malicious shellcode can be executed.
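- Buffer overflow Return Oriented Programming (ROP) Chains: Instead of injecting new code, the attacker chains together short instruction sequences already present in the program (gadgets), each ending in a return instruction. This technique is used when the program leaves only a very small buffer to work with or when bypassing system protections such as DEP (Data Execution Prevention).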
Note: Of course, the list above covers only a small part of the techniques, namely the most popular ones. Other well-known buffer overflow techniques include: Heap Buffer Overflow, ret2libc, ret2dlresolve, ret2csu, lazy binding techniques, the ASCII Armour bypass technique, JOP (Jump Oriented Programming) and more.
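The Unicode case from the list above is easier to picture with the bytes in hand. The sketch below assumes the vulnerable program widens plain ASCII input to UTF-16LE before copying it; the helper name `widen` is hypothetical and only handles bytes in the ASCII range.

```python
def widen(ascii_input: bytes) -> bytes:
    """Expand ASCII bytes to UTF-16LE, as an ASCII-to-wide conversion would."""
    return ascii_input.decode("ascii").encode("utf-16-le")

payload = b"AAAA"
wide = widen(payload)
print(wide.hex(" "))   # 41 00 41 00 41 00 41 00

# Every original byte is now followed by a null byte, so a 4-byte value
# built from this input always has the shape 0x00??00??. This is why the
# return address and the shellcode both need special treatment.
print(len(wide) == 2 * len(payload))   # True
```

The doubling in size also matters when counting offsets: the overflow is reached with half as many input characters as the buffer has bytes.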
More about the exploit: the code that takes advantage of a software vulnerability
Exploits are one of the most interesting topics in cybersecurity because someone can modify the functionality of the main application to make it do something else. In this respect, I would like to showcase exploit development on Linux by replicating JAD Java Decompiler 1.5.8e – Local Buffer Overflow by Juan Sacco. Although this kind of vulnerability is quite old, it is still found in some applications. In addition, if you want to test such vulnerabilities, I encourage you to solve some exercises from the pwn section available on CyberEDU.
Before I start, let’s come back to what exactly a buffer overflow is. According to Wikipedia, a buffer overflow is an anomaly where a program writes data to a buffer beyond the buffer’s allocated memory. In many cases, the buffer has a size limit imposed by the programmers. What happens if you exceed this limit? The excess data overwrites adjacent memory locations, which means it may be possible to control that memory and execute malicious code (shellcode, in other words).
Note: Shellcode is a small piece of machine code, usually written in Assembly Language, that is used as the payload when exploiting software.
You can install Java Decompiler 1.5.8e directly from here.
The next step is to analyze the vulnerable function: open the binary in IDA Pro and go to sub_8048ADC. The application tries to read a file, as we can see from “mov edi, offset aPrematureEof”. We know the EDI register has special uses with the string instructions, which means it moves all the strings onto the stack using “push offset aJavaclassfiler”.
But what happens when the program reads more data than it has room for?
Now let’s send some input. Open the application in gdb.
The top of the stack, pointed to by the ESP register, is at 0xffffa660, and we can manipulate the value stored there. In this case, we overwrite the memory ESP points to with AAAAA:
Before calculating the offset, we know the buffer variable is located at “0xffffa660”, but we still need to know where our data starts and ends relative to ESP.
We can calculate the offset with a simple subtraction: 0xffffa660 - 0xffff8686 = 0x1fda, which is 8154 in decimal.
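The subtraction is easy to double-check in Python. The variable names are only labels chosen for this sketch; both addresses come from the debugging session described above and will differ on other systems.

```python
buffer_addr = 0xFFFFA660   # stack address observed in gdb (from the text)
other_addr = 0xFFFF8686    # second address used in the subtraction (from the text)

offset = buffer_addr - other_addr
print(offset, hex(offset))   # 8154 0x1fda
```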
Repeat the step: send the junk again and check whether we have control of the EIP register.
In other words, if we find a “call esp” gadget, we can store our shellcode on the stack and execute malicious commands.
Send the junk followed by the address of the “call esp” gadget and a few “\xcc” bytes (int 3), which act as a breakpoint.
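A probe for this check could be built as in the sketch below. It assumes that the 8154 bytes calculated in the text are exactly the padding needed to reach the saved return address; the function name `build_probe` and the `BBBB` marker are choices made for this example, not part of the original exploit.

```python
OFFSET = 8154   # bytes of padding, taken from the calculation in the text

def build_probe(offset: int = OFFSET, marker: bytes = b"BBBB") -> bytes:
    """Padding up to the saved return address, then a recognizable marker."""
    if len(marker) != 4:
        raise ValueError("marker must be exactly 4 bytes on 32-bit x86")
    return b"A" * offset + marker

probe = build_probe()
# If EIP is under our control, gdb reports it as the marker read
# little-endian: b"BBBB" becomes 0x42424242.
expected_eip = int.from_bytes(probe[-4:], "little")
print(len(probe), hex(expected_eip))   # 8158 0x42424242
```

Seeing 0x42424242 in EIP after the crash confirms the offset; any other value means the padding length needs adjusting.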
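The layout of that test payload can be sketched as follows. The gadget address 0x08041234 is a placeholder invented for this example, not the real address of “call esp” in the binary, and the padding length assumes the 8154-byte offset from the text.

```python
import struct

OFFSET = 8154                 # padding length from the text
CALL_ESP_ADDR = 0x08041234    # HYPOTHETICAL gadget address, not from the article

def build_payload(offset: int, gadget_addr: int, body: bytes) -> bytes:
    """Layout: [padding][address of 'call esp'][bytes that ESP will point at]."""
    return b"A" * offset + struct.pack("<I", gadget_addr) + body

# Four int3 instructions (0xCC) stand in for the shellcode; if execution
# reaches them, the debugger stops with SIGTRAP.
payload = build_payload(OFFSET, CALL_ESP_ADDR, b"\xcc" * 4)
print(payload[OFFSET:OFFSET + 4].hex())   # 34120408 (address in little-endian order)
```

`struct.pack("<I", ...)` writes the address in little-endian byte order, which is how x86 expects to find it on the stack.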
Now let’s create our own shellcode (piece of malicious code).
Use an online assembler to convert the ASM into opcodes: https://defuse.ca/online-x86-assembler.htm#disassembly2.
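Once the assembler has produced the opcodes, it is worth checking them for bytes the target would mangle. The sketch below assumes the payload travels through C string handling, where a null byte ends the copy early; the helper names are hypothetical, and the two instructions are only an illustration, not the shellcode used in this exploit.

```python
def to_escaped(opcodes: bytes) -> str:
    """Format raw opcode bytes as a \\xNN string for use in a payload."""
    return "".join(f"\\x{b:02x}" for b in opcodes)

def has_null(opcodes: bytes) -> bool:
    """True if the bytes contain 0x00, which ends C strings early."""
    return 0 in opcodes

mov_eax_0 = bytes.fromhex("b800000000")   # mov eax, 0
xor_eax_eax = bytes.fromhex("31c0")       # xor eax, eax (also sets eax to 0)

print(to_escaped(mov_eax_0), has_null(mov_eax_0))       # \xb8\x00\x00\x00\x00 True
print(to_escaped(xor_eax_eax), has_null(xor_eax_eax))   # \x31\xc0 False
```

Both instructions zero EAX, but only the second one is free of null bytes, which is why shellcode authors tend to prefer it.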
Final Proof of Concept:
Useful references for future reading and practical exercises
How can you deepen your knowledge of buffer overflows and other similar techniques?
Always remember that Google is our friend: try to look for online resources that would help you better understand the topic. Look for blogs, articles and e-books. Many are free, and they are a gold mine in terms of what you’ll get out of them. Below are just a few examples:
- Corelan Coders blog
- Hacking: The Art of Exploitation, 2nd Edition
- The Shellcoder’s Handbook: Discovering and Exploiting Security Holes, 2nd Edition
How can you deepen your knowledge of Assembly Language?
Before you go deep into the Exploit Development field, you should start with Assembly Language. Maybe you wonder “why?”. The first step in understanding a vulnerability like buffer overflow is learning how the Intel/AMD x86 architecture works. After you do that, you need reverse engineering techniques to examine the opcodes (machine code) that make up a piece of software. These opcodes can be translated back into Assembly Language with tools like GNU Debugger, IDA Pro and Ghidra. Below are just a few examples:
- Introduction to x86 Assembly Language videos
- The Art of Assembly Language, 2nd Edition
- The Ghidra Book: The Definitive Guide
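To get a feel for the translation from opcodes to Assembly Language mentioned above, here is a toy decoder for a handful of single-byte 32-bit x86 opcodes. It is only a sketch written for this article: the function name is hypothetical, and real disassemblers such as those in gdb, IDA Pro and Ghidra handle the full, variable-length instruction set.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_one_byte(opcode: int) -> str:
    """Decode a few single-byte 32-bit x86 opcodes; not a full disassembler."""
    if opcode == 0x90:
        return "nop"
    if opcode == 0xCC:
        return "int3"
    if opcode == 0xC3:
        return "ret"
    if 0x50 <= opcode <= 0x57:
        return f"push {REGS[opcode - 0x50]}"
    if 0x58 <= opcode <= 0x5F:
        return f"pop {REGS[opcode - 0x58]}"
    return f"db 0x{opcode:02x}"   # unknown here: show the raw byte

# Prints: nop, push eax, pop ebx, ret, int3 (one per line, with the byte)
for byte in bytes.fromhex("90505bc3cc"):
    print(f"{byte:02x}  {decode_one_byte(byte)}")
```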
How can you deepen your Python skills for exploitation?
You might ask “why do I need Python”, right? Well, in cybersecurity and especially in exploitation, it helps you craft a reliable proof of concept. Sometimes you can’t do all the tasks by hand, so Python is your friend. This programming language is easy to understand and use.
It offers you a mindset and all the tools to do your job better. There is also a library created especially for exploit development, called pwntools. Below are just a few examples of books and tutorials:
- Pwntools tutorial
- Black Hat Python: Python Programming for Hackers and Pentesters, 1st Edition
- Gray Hat Python: Python Programming for Hackers and Reverse Engineers
Where can I practice what I learned?
CyberEDU offers exercises that will help you practice bypassing protections such as ASLR and NX, using techniques such as ROP, format string attacks and more. Explore the exercises below to develop a more in-depth approach and apply the skills you acquired.
- cookies (UNbreakable individual competition). In this challenge you will have the opportunity to bypass the stack canary protection on a Linux system by leaking the canary value.
- can-you-jump (ROCSC 2021 competitions). In this challenge you will experiment with Jump Oriented Programming (JOP), a technique derived from ROP (Return Oriented Programming).
- baby-pwn (UNbreakable teams competition). In this challenge you will have the opportunity to perform a basic stack-smashing buffer overflow through an obsolete, unsafe function.
- function-check (UNbreakable teams competition). In this challenge you will experiment with the format string attack. You will need to figure out how to change the value of a variable in order to alter the execution flow.
- bazooka (DefCamp 2020 competitions). In this challenge you will experiment with Return Oriented Programming and Address Space Layout Randomization (ASLR): you will need to leak the libc address and then execute a bash command via the address of system.
- darkmagic (DefCamp 2020 competitions). In this challenge you will have the opportunity to bypass the stack canary protection on a Linux system by leaking the canary value. But to do that you need to figure out how to manipulate the loop inside the program.
Where to and what’s next?
If you want to learn more about the topic and improve your cybersecurity skills, you can follow CyberEDU on Facebook, Twitter and LinkedIn. Moreover, if you are a student interested in pursuing a cybersecurity career, make sure you explore the opportunities that UNbreakable Romania has to offer.
UNbreakable Romania is an end-to-end cybersecurity educational program for high school and university students in Romania. Through its activities, UNbreakable provides an X-ray and visualisation of the level of cybersecurity skills nationally. UNbreakable’s mission is to bring together students who are passionate about cybersecurity, so that they have all the resources required to develop the necessary skills to become good cybersecurity specialists. It also provides a competitive environment that encourages collaboration and experience exchange. This way, UNbreakable is actively participating in bridging the cybersecurity workforce gap locally and internationally.
About the Author
Darius Moldovan (T3jv1l)
Passionate about application security, Darius currently works as a Penetration Tester at Bit Sentinel. He is also Cyber Security Tournaments Manager & Security Labs Author at CyberEDU. In his spare time, he participates in the Synack Red Team in private bug bounty programs or explores challenges in the area of reverse engineering and exploit development.
In the past, Darius has been one of the authors for UNbreakable Romania 2020 & 2021, DefCamp Capture the Flag 2020 and Romanian Cyber Security Challenge (RoCSC) 2021.