MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
It is important to know as much about your target as humanly possible before you start reverse engineering anything. By doing this we have a better understanding of how things "probably" work. Getting insight through available documentation on the net or included with the product is our first stop.
We often scour the vendor's site, focusing on any technical documents available. This can give us solid ideas about what we are attempting to reverse. Especially when doing vulnerability analysis, it is imperative to dig into the setup and use of a product. We also keep support forums in mind, along with the general problems people run into. If bugs exist in normal day-to-day operation, exploitable security bugs may not be too far behind.
We must also first understand how all the components work together before we can break down a single binary. This again can be gleaned from installation or operation documentation. More often than not a vendor will have an "Administration Guide" or "Installation Guide" that will help us get a feeling for the larger picture.
Once we have established a good understanding of how the binary works in its respective environment we open the binary in IDA and take a deep breath. After letting IDA do its auto-analysis we begin to navigate the binary. During this time we are not trying to read any of the assembly, but instead make sure IDA did its job well by looking for unidentified functions, ambiguous blocks of unidentified data, and various other common analysis mistakes.
Spending some time making sure IDA has done its job gives us a solid foundation to work from. If we are trying to actually reverse a binary and have to stop every ten seconds to fix a function or a cross reference, it slows us down far more than investing the time in an initial fix-up stage would.
Once we feel comfortable with the disassembly we check out each section, looking to see how much code exists, how much data exists, what the read only section looks like, and most importantly the import section.
The import section is often the first time we really pay attention to the information in the binary, and not just the disassembly. By looking at the library calls that occur in the binary we can get a great idea of what the binary does. For instance, seeing the library call InternetOpenUrl tells us in an instant that at some point this binary will access an outside resource (most likely an HTTP URL). Some of the interesting families of imports we tend to look out for are sockets, files, windowing, debugging, and the often misused string libraries.
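As a rough illustration of sorting imports into families, here is a minimal Python sketch. The family names and the prefix lists are our own hypothetical, deliberately incomplete choices; the input is assumed to be a plain list of import names copied out of IDA's imports window.

```python
from collections import defaultdict

# Hypothetical, incomplete mapping of import families to name prefixes.
# Extend it with whatever APIs matter for the target at hand.
IMPORT_FAMILIES = {
    "network": ("InternetOpen", "WSAStartup", "socket", "connect",
                "recv", "send", "bind", "listen", "accept"),
    "files": ("CreateFile", "ReadFile", "WriteFile", "fopen"),
    "windowing": ("CreateWindow", "SendMessage", "DialogBox"),
    "debugging": ("IsDebuggerPresent", "OutputDebugString"),
    "strings": ("strcpy", "strcat", "sprintf", "vsprintf", "lstrcpy", "wcscpy"),
}


def group_imports(names):
    """Group import names by the first family whose prefix matches."""
    groups = defaultdict(list)
    for name in names:
        family = next(
            (fam for fam, prefixes in IMPORT_FAMILIES.items()
             if name.startswith(prefixes)),  # startswith accepts a tuple
            "other",
        )
        groups[family].append(name)
    return dict(groups)


if __name__ == "__main__":
    imports = ["InternetOpenUrlA", "strcpy", "CreateFileW", "GetTickCount"]
    for family, members in group_imports(imports).items():
        print(family, members)
```

Prefix matching is crude, but it is enough for a first pass: the point is to see at a glance which families are present, not to classify every import.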
After we get a feel for the libraries being used we'll typically jump over to IDA's strings window. Like the imported libraries, the strings being referenced can tell a story about the binary's inner workings. It's prudent to always keep an eye out for descriptive strings like debugging messages or verbose logging options. Finding strings like this, as we have discussed in previous MindshaRE articles, can be a boon for a reverse engineer. The idea is to constantly be getting a feel for the binary as a whole, and how it works on a level just above the actual assembly code.
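Outside of IDA, the same kind of string triage can be sketched in a few lines of Python. This is only an approximation of what the strings window does: it looks at printable ASCII runs only, and the HINTS keyword list is a hypothetical starting point, not a definitive set.

```python
import re

# Hypothetical keywords that tend to show up in debug or logging strings.
HINTS = ("error", "fail", "debug", "warn", "%s", "%d")


def ascii_strings(data: bytes, min_len: int = 5):
    """Return printable ASCII runs of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def interesting(strings):
    """Keep only strings containing one of the hint keywords."""
    return [s for s in strings if any(h in s.lower() for h in HINTS)]


if __name__ == "__main__":
    blob = b"\x00\x01DEBUG: recv failed (%d)\x00\xffabc\x00kernel32.dll\x00"
    print(interesting(ascii_strings(blob)))
```

Wide-character (UTF-16) strings are ignored here, so a real pass over a Windows binary would want a second pattern for those.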
Next, we typically move quickly through the data section looking for vtables or other important data structures that may catch our eye. For instance, if we see a vtable with a large number of code cross references, we make a mental note of it for when we encounter its use later while reversing. We also spend some time fixing up the data types. IDA tries to correctly identify type information in the data section, but it always errs on the side of caution. We will create dwords where we see fit and define any other obvious structures, always making mental notes.
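One way to picture the hunt for vtables is as a scan for runs of consecutive pointers that land inside the code section. The sketch below assumes a 32-bit little-endian target, 4-byte aligned tables, and known section addresses; the function name and parameters are ours, for illustration only. A run of code pointers may just as well be a jump table or a callback array, so treat every hit as a candidate to inspect, not a confirmed vtable.

```python
import struct


def find_pointer_tables(data, data_base, code_start, code_end, min_entries=3):
    """Find runs of aligned dwords that all point into [code_start, code_end).

    Returns a list of (table_address, [pointers]) tuples.
    """
    tables = []
    run_start, run = None, []
    for off in range(0, len(data) - 3, 4):
        (value,) = struct.unpack_from("<I", data, off)
        if code_start <= value < code_end:
            if not run:
                run_start = data_base + off  # address of the first entry
            run.append(value)
        else:
            if len(run) >= min_entries:
                tables.append((run_start, run))
            run = []
    if len(run) >= min_entries:  # flush a run that reaches the end of data
        tables.append((run_start, run))
    return tables


if __name__ == "__main__":
    data = struct.pack("<6I", 0x401000, 0x401020, 0x401040,
                       0, 0x401060, 0xDEADBEEF)
    for addr, ptrs in find_pointer_tables(data, 0x403000, 0x401000, 0x402000):
        print(hex(addr), [hex(p) for p in ptrs])
```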
Now that we have gained a little understanding from a higher level we will dig into the actual assembly. It is of the utmost importance that we set a goal, or a set of questions we want to answer, before really diving into the assembly. Our goal must be well defined and outlined. Sticking to this goal keeps us moving forward and keeps us from getting distracted. Here are some of the common goals our team is specifically interested in when performing binary audits:
- How does this process receive network data?
- How does this process parse network data?
- How does this process interact with the user?
- Does this process use a database?
- Does this process use RPC or any other remoting method?
- Does this process contain encryption routines?
With that said, one of the first things we do is look at the most cross referenced functions. By doing so, our efforts will always help future endeavors. Let's say we have a function that is called 4000 times throughout the binary. If we identify that it is a memory allocation routine, we have just made those other 4000 functions that much easier to understand. These common functions are the building blocks for more complex code we may encounter later.
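To make the idea concrete, here is a minimal sketch of ranking functions by how often they are called. It assumes the call sites have already been exported as (caller, callee) pairs, for example from a script run inside the disassembler; the pair format and the function names in the example are hypothetical.

```python
from collections import Counter


def most_referenced(call_edges, top=5):
    """Rank callees by number of call sites.

    call_edges is an iterable of (caller, callee) pairs, one per call site.
    """
    counts = Counter(callee for _caller, callee in call_edges)
    return counts.most_common(top)


if __name__ == "__main__":
    edges = [
        ("sub_401000", "sub_402000"),
        ("sub_401100", "sub_402000"),
        ("sub_401200", "sub_402000"),
        ("sub_401000", "sub_403000"),
    ]
    for name, hits in most_referenced(edges):
        print(name, hits)
```

The functions at the top of this list are the ones worth naming first, since every name we assign there shows up in all of their callers.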
So we are finally at our "starting" point. This is where a generic approach can no longer be described. Each goal, or question we try to answer, will need special attention and may rely on varying techniques. For things like auditing network protocols we like to start at the reception of a packet. For RPC functions we begin by identifying the registration of client and server interfaces. It really depends on what we are trying to achieve, but the idea is to use the information we spent time gathering in the beginning. Hopefully by now you have a good understanding of the playing field.
As you may have noticed, we tend to work from the top down: from high-level documentation all the way down to the actual assembly code being executed by the processor. In our experience this is the best way for the human mind to actually understand what we are looking at. It has also proved the easiest and most rewarding for us. As reverse engineers we take all the information we can get. This lessens our need to extract each and every clue from the assembly level, which is often very time consuming.
To summarize it all, learn everything you can about the process before you start. Make sure you have a solid base to work from. And finally, outline a goal so that you can stay focused and make progress.
Leave a comment with some additional ideas. We would love to add to our repertoire.
-Cody
