TippingPoint Digital Vaccine Laboratories

MindshaRE: Looping in Assembly


MindshaRE is our weekly look at some simple reverse engineering tips and tricks.  The goal is to keep things small and discuss everyday aspects of reversing.  You can view previous entries by going through our blog history.

After the entry last week comparing source to disassembly, I thought it might be a good idea to cover some basics.  Often when learning how to read assembly it helps to take source code, compile it, and then look at it in your disassembler of choice to get an understanding of how a language looks in its final form.  By doing this you can quickly learn to pick out common patterns in assembly.

So for today we are going to look at loops in assembly.  In particular, we will cover the 3 looping constructs available in C: for, while, and do while.  For each one I will give a brief explanation and a comment about the loop being used.  I have included the source, the disassembly, and a screenshot of the disassembly in the IDA graph view.  I know a lot of people detest the IDA graph view, but for loops it is very handy and I use it religiously to quickly see code flow in loops.

All of these examples have been compiled with the Microsoft compiler version 15.00.21022.08.  No optimization or debug flags have been used.  For the curious, try compiling your own with various optimization and debug flags and compare the results.

Source: for_loop.c

printf("I am executing loop\n");
for (i=0; i<256; i++)
{
    printf("I am executing %d\n", i);
}
printf("I am done executing loop\n");

Binary: for_loop.exe

00401018  mov     [ebp+var_4], 0
0040101F  jmp     short loc_40102A
00401021  mov     eax, [ebp+var_4]
00401024  add     eax, 1
00401027  mov     [ebp+var_4], eax
0040102A  cmp     [ebp+var_4], 100h
00401031  jge     short loc_401046
00401033  mov     ecx, [ebp+var_4]
00401036  push    ecx
00401037  push    offset aIAmExecutingD ; "I am executing %d\n"
0040103C  call    printf
00401041  add     esp, 8
00401044  jmp     short loc_401021
00401046  push    offset aIAmDoneExecuti ; "I am done executing loop\n"
0040104B  call    printf

Screenshot: for_loop.jpg





Anyone familiar with programming has surely written a few thousand for loops.  Our telltale sign is the initialization of the counter variable used in the for loop before the actual loop test.  In our case we are setting a local variable "i" to 0.  This can be seen at .text:00401018.  Looking at the graph view allows us to quickly see our comparison to 256 (100h) and the branch to either continue execution or terminate.  It also allows us to see the "add eax, 1" (AKA i++) before our next iteration of the loop.
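To make the block layout concrete, here is a rough sketch of the same for loop written with gotos so that each line lines up with one piece of the listing above.  The function name and the body_runs counter are my own stand-ins (the counter replaces the printf call so the result is easy to check); this is an illustration of the control flow, not what the compiler literally works from.

```c
/* Hypothetical helper: the for loop laid out the way the unoptimized
 * listing lays it out.  Returns how many times the body executed. */
int for_loop_lowered(int limit)
{
    int i;
    int body_runs = 0;      /* stand-in for the printf call in the body */

    i = 0;                  /* 00401018  mov [ebp+var_4], 0             */
    goto test;              /* 0040101F  jmp short loc_40102A           */
increment:
    i = i + 1;              /* 00401021  mov / add eax, 1 / mov         */
test:
    if (i >= limit)         /* 0040102A  cmp [ebp+var_4], 100h          */
        goto done;          /* 00401031  jge short loc_401046           */
    body_runs++;            /* 00401033  loop body                      */
    goto increment;         /* 00401044  jmp short loc_401021           */
done:
    return body_runs;       /* 00401046  code after the loop            */
}
```

Calling for_loop_lowered(256) walks the same path as the original loop: one jump over the increment on entry, then test, body, increment on every pass.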

Source: while_loop.c

printf("I am executing loop\n");
while (i < 256)
{
    printf("I am executing %d\n", i);
    i++;
}
printf("I am done executing loop\n");

Binary: while_loop.exe

00401015 add     esp, 4
00401018 cmp     [ebp+var_4], 100h
0040101F jge     short loc_40103D
00401021 mov     eax, [ebp+var_4]
00401024 push    eax
00401025 push    offset aIAmExecutingD ; "I am executing %d\n"
0040102A call    printf
0040102F add     esp, 8
00401032 mov     ecx, [ebp+var_4]
00401035 add     ecx, 1
00401038 mov     [ebp+var_4], ecx
0040103B jmp     short loc_401018
0040103D push    offset aIAmDoneExecuti ; "I am done executing loop\n"
00401042 call    printf

Screenshot: while_loop.jpg



The while loop is a much simpler loop to look at because it does not have the intrinsic ability to initialize the data being tested.  In our case we are again checking to make sure our counter "i" is less than 256.  As previously mentioned, in a while loop we do not see the initialization of the counter right before the loop begins, because it is up to the programmer to prepare whatever is being tested in the loop.  As you can see in the graph view we also have fewer basic blocks.  This is because the compiler is not incrementing our counter for us in a separate block.  Instead, the increment we wrote is compiled into the same basic block as the rest of the loop body.  An astute reader will notice that by using a while loop we save a branch instruction: the unconditional jmp the for loop needs to skip over its increment block on entry.
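As a rough sketch, the while loop can be written with gotos so that each line matches one piece of the while_loop.exe listing.  The function name, the parameters, and the body_runs counter are my own additions for illustration (the counter stands in for the printf call, and "i" is passed in because the loop itself does not initialize it).

```c
/* Hypothetical helper: the while loop laid out like the unoptimized
 * listing.  Returns how many times the body executed. */
int while_loop_lowered(int i, int limit)
{
    int body_runs = 0;      /* stand-in for the printf call in the body */

test:
    if (i >= limit)         /* 00401018 cmp [ebp+var_4], 100h           */
        goto done;          /* 0040101F jge short loc_40103D            */
    body_runs++;            /* 00401021 loop body                       */
    i = i + 1;              /* 00401032 mov / add ecx, 1 / mov          */
    goto test;              /* 0040103B jmp short loc_401018            */
done:
    return body_runs;       /* 0040103D code after the loop             */
}
```

Note that the body and the increment sit between the same two labels, which is the single basic block visible in the graph view, and that starting with i already at the limit runs the body zero times.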

Source: do_while_loop.c

printf("I am executing loop\n");
do
{
    printf("I am executing %d\n", i);
    i++;
} while (i < 256);
printf("I am done executing loop\n");

Binary: do_while_loop.exe

00401018 mov     eax, [ebp+var_4]
0040101B push    eax
0040101C push    offset aIAmExecutingD ; "I am executing %d\n"
00401021 call    printf
00401026 add     esp, 8
00401029 mov     ecx, [ebp+var_4]
0040102C add     ecx, 1
0040102F mov     [ebp+var_4], ecx
00401032 cmp     [ebp+var_4], 100h
00401039 jl      short loc_401018
0040103B push    offset aIAmDoneExecuti ; "I am done executing loop\n"
00401040 call    printf

Screenshot: do_while_loop.jpg



The do while loop is obviously similar to the while loop, except for one very important distinction: the lack of a check at the top of the loop.  This means we will always execute the body at least once, then check our condition.  Once again the graph view shows us the loop is happening in a single basic block.  Our code is executed, our counter is incremented, and then our check against 256 happens.  Again, those paying attention to potential optimization will notice the do while in this case only has a single branch instruction, the conditional jl at the bottom of the loop.
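Here is a rough goto sketch of the do while loop, with each line matched to the do_while_loop.exe listing.  The function name, the parameters, and the body_runs counter are my own additions for illustration (the counter stands in for the printf call).

```c
/* Hypothetical helper: the do while loop laid out like the unoptimized
 * listing.  Returns how many times the body executed. */
int do_while_loop_lowered(int i, int limit)
{
    int body_runs = 0;      /* stand-in for the printf call in the body */

body:
    body_runs++;            /* 00401018 loop body                       */
    i = i + 1;              /* 00401029 mov / add ecx, 1 / mov          */
    if (i < limit)          /* 00401032 cmp [ebp+var_4], 100h           */
        goto body;          /* 00401039 jl short loc_401018             */
    return body_runs;       /* 0040103B code after the loop             */
}
```

Because the only test sits at the bottom, the body runs once even when i starts at or above the limit, which is the behavioral difference from the while loop.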

I hope this has been a handy example of loops in assembly.  Obviously, loops in the real world are much more complex.  However, they all share the same test and branch logic as these examples.  Try to spot some loops in other binaries you may have.  Maybe in future weeks we can revisit this and see how other language features compile into assembly.
 
Tags: reverse engineering,assembly,MindshaRE
Published On: 2008-06-13 17:33:35
