MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
When analyzing a binary, looking for patterns can help quickly identify what purpose a function may serve, which in turn gives us insight into how the binary works. There are plenty of patterns you can identify. In this case we will be discussing functions that handle encryption or compression.
There are hundreds of instructions in Intel assembly language, but most rarely show up in compiled code. In fact, counting the instructions in a typical binary shows that fewer than 100 distinct ones are used in most cases. We can use this to our advantage when identifying encryption/compression routines. In almost every case these functions shift and flip bits, which requires a few key instructions such as xor, shl, shr, and ror.
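A quick way to check the instruction count claim on your own binaries is to build a mnemonic histogram from a disassembly dump. The sketch below is a minimal example and assumes text lines in the "address mnemonic operands" layout used by the listing in this post; the function name `mnemonic_histogram` is my own, not part of any tool.

```python
from collections import Counter

def mnemonic_histogram(lines):
    """Count mnemonics in 'ADDRESS MNEMONIC OPERANDS' text lines.

    Assumption: the second whitespace-separated field is the mnemonic.
    Prefixes such as 'rep' would be counted as their own mnemonic.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[1].lower()] += 1
    return counts

# Sample input: the Kraken lines shown in this post.
sample = [
    "001AF08F shl eax, 4",
    "001AF092 add eax, [ebp+var_8]",
    "001AF095 mov edi, edx",
    "001AF097 shr edi, 5",
    "001AF09A add edi, [ebp+var_C]",
    "001AF09D xor eax, edi",
    "001AF09F lea edi, [esi+edx]",
    "001AF0A2 xor eax, edi",
]

hist = mnemonic_histogram(sample)
for mnemonic, count in hist.most_common():
    print(mnemonic, count)
print("distinct mnemonics:", len(hist))
```

Run over a whole binary's disassembly, the last line gives the number of distinct instructions actually in use.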
Obviously these instructions can be used for many things. However, in encryption/compression functions they occur in an easily identifiable pattern. Let's look at a sample from the Kraken bot.
001AF08F shl eax, 4
001AF092 add eax, [ebp+var_8]
001AF095 mov edi, edx
001AF097 shr edi, 5
001AF09A add edi, [ebp+var_C]
001AF09D xor eax, edi
001AF09F lea edi, [esi+edx]
001AF0A2 xor eax, edi
One of our hints is the xor. The xor of two different registers is a tell-tale sign of encryption or compression. If we can identify a few of these we might be able to automate the identification of such routines.
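The "different registers" part matters because xor of a register with itself (xor eax, eax) is the standard way to zero a register and shows up everywhere. Here is a rough sketch of that check on a single line of disassembly text. It assumes 32-bit code and the same text layout as the listing above; the `REGS` set and the function name are my own choices, and a fuller version would also handle sub-registers such as ax or al.

```python
import re

# 32-bit general-purpose registers only (an assumption of this sketch).
REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

def is_xor_of_different_regs(line):
    """Return True for 'xor reg1, reg2' where reg1 and reg2 differ."""
    m = re.search(r"\bxor\s+(\w+)\s*,\s*(\w+)\s*$", line.strip().lower())
    if not m:
        return False  # not a xor, or an operand is memory/immediate syntax
    dst, src = m.groups()
    return dst in REGS and src in REGS and dst != src

listing = [
    "001AF09D xor eax, edi",
    "001AF09F lea edi, [esi+edx]",
    "00401000 xor eax, eax",  # hypothetical line: the usual zeroing idiom
]

hits = [line for line in listing if is_xor_of_different_regs(line)]
print(hits)
```

Only the first line is reported; the lea is not a xor and the zeroing idiom is filtered out.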
I have come up with a few metrics to do this. I give each rule a weight. My script runs through each function in a binary and calculates a score. If a function scores high enough, the script prints its location. This has proved fairly effective at quickly identifying interesting functions. Here are my rules.
- xor of different registers is weighted the highest
- shl, shr, ror, rol, and cdq are counted as well, all with a lower score than xor since they also occur naturally in ordinary code
- If any of these instructions occur in a loop, the score increases
- If several of these instructions occur in the same basic block, the score increases
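The rules above can be sketched in a few lines of Python. This is not my actual script, just a minimal illustration: the weights, the threshold, and the data layout (a function as a list of basic blocks, each with an `in_loop` flag and a list of instructions) are all made up for the example. In practice the blocks and loop information would come from your disassembler.

```python
# Hypothetical weights; the rules only fix their ordering (xor highest).
XOR_WEIGHT = 5
SHIFT_WEIGHT = 1
LOOP_BONUS = 2      # per interesting instruction inside a loop
CLUSTER_BONUS = 1   # per additional interesting instruction in one block
THRESHOLD = 10      # hypothetical cut-off for reporting a function

REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
SHIFT_OPS = {"shl", "shr", "ror", "rol", "cdq"}

def instr_weight(mnemonic, operands):
    """Weight of one instruction, or 0 if it is not interesting."""
    if mnemonic == "xor" and len(operands) == 2:
        a, b = operands
        if a in REGS and b in REGS and a != b:
            return XOR_WEIGHT
    if mnemonic in SHIFT_OPS:
        return SHIFT_WEIGHT
    return 0

def score_block(block):
    weights = [w for w in (instr_weight(m, ops) for m, ops in block["instrs"]) if w]
    if not weights:
        return 0
    score = sum(weights)
    if block["in_loop"]:
        score += LOOP_BONUS * len(weights)
    score += CLUSTER_BONUS * (len(weights) - 1)
    return score

def score_function(blocks):
    return sum(score_block(b) for b in blocks)

# The Kraken listing as one basic block; in_loop=True is an assumption.
kraken_like = [{"in_loop": True, "instrs": [
    ("shl", ["eax", "4"]), ("add", ["eax", "[ebp+var_8]"]),
    ("mov", ["edi", "edx"]), ("shr", ["edi", "5"]),
    ("add", ["edi", "[ebp+var_C]"]), ("xor", ["eax", "edi"]),
    ("lea", ["edi", "[esi+edx]"]), ("xor", ["eax", "edi"]),
]}]
# A hypothetical function that only zeroes eax and returns.
plain = [{"in_loop": False, "instrs": [
    ("push", ["ebp"]), ("mov", ["ebp", "esp"]),
    ("xor", ["eax", "eax"]), ("ret", []),
]}]

for name, func in (("kraken_like", kraken_like), ("plain", plain)):
    s = score_function(func)
    if s >= THRESHOLD:
        print(name, "looks interesting, score", s)
```

With these made-up weights the Kraken block is reported and the plain function scores zero. Expect to tune the weights and the threshold against binaries you already understand.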
We are always looking for ways to better understand functions in a binary. Using patterns is a good way to do this quickly. Try putting this in a script and running it on various binaries.
-Cody
