MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
After last week's entry comparing source to disassembly, I thought it might be a good idea to cover some basics. Often when learning how to read assembly, it helps to take source code, compile it, and then look at it in your disassembler of choice to get an understanding of how a language looks in its final form. By doing this you can quickly pick out common patterns in assembly.
So for today we are going to look at loops in assembly. In particular, these are the three looping constructs available in C: for, while, and do while. For each one I will give a brief explanation and a comment about the loop being used. I have included the source, the disassembly, and a screenshot of the disassembly in the IDA graph view. I know a lot of people detest the IDA graph view, but for loops it is very handy and I use it religiously to quickly see code flow in loops.
All of these examples have been compiled with the Microsoft compiler version 15.00.21022.08. No optimization or debug flags have been used. For the curious, try compiling your own with various optimization and debug flags.
Source: for_loop.c
printf("I am executing loop\n");
for (i=0; i<256; i++)
{
    printf("I am executing %d\n", i);
}
printf("I am done executing loop\n");
Binary: for_loop.exe
00401018 mov [ebp+var_4], 0
0040101F jmp short loc_40102A
00401021 mov eax, [ebp+var_4]
00401024 add eax, 1
00401027 mov [ebp+var_4], eax
0040102A cmp [ebp+var_4], 100h
00401031 jge short loc_401046
00401033 mov ecx, [ebp+var_4]
00401036 push ecx
00401037 push offset aIAmExecutingD ; "I am executing %d\n"
0040103C call printf
00401041 add esp, 8
00401044 jmp short loc_401021
00401046 push offset aIAmDoneExecuti ; "I am done executing loop\n"
0040104B call printf
Screenshot: for_loop.jpg

Anyone familiar with programming has surely written a few thousand for loops. Our telltale sign is the initialization of the counter variable used in the for loop before the actual loop test. In our case we are setting a local variable "i" to 0. This can be seen at .text:00401018. Looking at the graph view allows us to quickly see our comparison to 256 (100h) and the branch to either continue execution or terminate. It also allows us to see the "add eax, 1" (AKA i++) before our next iteration of the loop.
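To make the block layout in the listing above easier to follow, here is a rough sketch of the same for loop rewritten with gotos so that each statement lines up with a piece of the disassembly. The function name for_loop_lowered and the limit parameter are made up for this illustration (the original simply uses the constant 256), and it returns the final counter value only so the result is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper, not part of for_loop.c: the for loop written
   in the same shape as the unoptimized disassembly. */
int for_loop_lowered(int limit)
{
    int i;

    i = 0;                              /* 00401018: mov [ebp+var_4], 0   */
    goto test;                          /* 0040101F: jmp short loc_40102A */
increment:
    i = i + 1;                          /* 00401021-00401027: i++         */
test:
    if (i >= limit)                     /* 0040102A-00401031: cmp, jge    */
        goto done;
    printf("I am executing %d\n", i);   /* 00401033-00401041: loop body   */
    goto increment;                     /* 00401044: jmp short loc_401021 */
done:
    return i;                           /* falls through to 00401046      */
}
```

Calling for_loop_lowered(256) should print the same lines as the original loop. Note the initial jump over the increment block: that is the extra branch that only the for loop needs.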
Source: while_loop.c
printf("I am executing loop\n");
while (i < 256)
{
    printf("I am executing %d\n", i);
    i++;
}
printf("I am done executing loop\n");
Binary: while_loop.exe
00401015 add esp, 4
00401018 cmp [ebp+var_4], 100h
0040101F jge short loc_40103D
00401021 mov eax, [ebp+var_4]
00401024 push eax
00401025 push offset aIAmExecutingD ; "I am executing %d\n"
0040102A call printf
0040102F add esp, 8
00401032 mov ecx, [ebp+var_4]
00401035 add ecx, 1
00401038 mov [ebp+var_4], ecx
0040103B jmp short loc_401018
0040103D push offset aIAmDoneExecuti ; "I am done executing loop\n"
00401042 call printf
Screenshot: while_loop.jpg

The while loop is a much simpler loop to look at because it does not have the intrinsic ability to initialize the data being tested. In our case we are again checking to make sure our counter "i" is less than 256. We do not see the initialization of the counter as part of the loop because it is up to the programmer to prepare any values being tested before the loop begins. As you can see in the graph view we also have fewer basic blocks. This is because there is no separate increment step for the compiler to emit. The i++ is part of our loop body, so the body and the increment compile into a single basic block. An astute reader will notice that by using a while loop we save a branch instruction: the for loop needed an extra jmp to skip over its increment block on the first pass.
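The same kind of sketch for the while loop shows where the saved branch goes. The function name while_loop_lowered and its parameters are again made up for this illustration. The counter is passed in because, as in while_loop.c, initializing it is the caller's job.

```c
#include <stdio.h>

/* Hypothetical helper, not part of while_loop.c: the while loop written
   in the same shape as the unoptimized disassembly. */
int while_loop_lowered(int i, int limit)
{
test:
    if (i >= limit)                     /* 00401018-0040101F: cmp, jge    */
        goto done;
    printf("I am executing %d\n", i);   /* 00401021-0040102F: loop body   */
    i = i + 1;                          /* 00401032-00401038: i++         */
    goto test;                          /* 0040103B: jmp short loc_401018 */
done:
    return i;                           /* falls through to 0040103D      */
}
```

With a starting value that already fails the test, for example while_loop_lowered(5, 3), the body never runs and the counter comes back unchanged.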
Source: do_while_loop.c
printf("I am executing loop\n");
do
{
    printf("I am executing %d\n", i);
    i++;
} while (i < 256);
printf("I am done executing loop\n");
Binary: do_while_loop.exe
00401018 mov eax, [ebp+var_4]
0040101B push eax
0040101C push offset aIAmExecutingD ; "I am executing %d\n"
00401021 call printf
00401026 add esp, 8
00401029 mov ecx, [ebp+var_4]
0040102C add ecx, 1
0040102F mov [ebp+var_4], ecx
00401032 cmp [ebp+var_4], 100h
00401039 jl short loc_401018
0040103B push offset aIAmDoneExecuti ; "I am done executing loop\n"
00401040 call printf
Screenshot: do_while_loop.jpg

The do while loop is obviously similar to the while loop, except for one very important distinction: the lack of a check at the top of the loop. This means we will always execute the code at least once, then check our condition. Once again, going to the graph view shows us the loop is happening in a single basic block. Our code is executed, our counter is incremented, and then our check against 256 happens. Again, those paying attention to potential optimization will notice the do while in this case has only a single branch instruction.
I hope this has been a handy example of loops in assembly. Obviously in the real world looping in general is much more complex. However, real-world loops all share the same test and branch logic as these examples. Try to spot some loops in other binaries you may have. Maybe in future weeks we can revisit this and see how other language features compile into assembly.
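For completeness, here is the do while loop in the same goto style. The function name do_while_loop_lowered and its parameters are invented for this illustration. Notice that the only branch is the conditional jump at the bottom, and that the test is inverted compared to the other two loops (jl to keep looping rather than jge to leave).

```c
#include <stdio.h>

/* Hypothetical helper, not part of do_while_loop.c: the do while loop
   written in the same shape as the unoptimized disassembly. */
int do_while_loop_lowered(int i, int limit)
{
body:
    printf("I am executing %d\n", i);   /* 00401018-00401026: loop body */
    i = i + 1;                          /* 00401029-0040102F: i++       */
    if (i < limit)                      /* 00401032-00401039: cmp, jl   */
        goto body;
    return i;                           /* falls through to 0040103B    */
}
```

Because there is no check at the top, do_while_loop_lowered(5, 3) still runs the body once and returns 6, where the while version would have done nothing.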
