MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
Everyone knows what a structure is: a defined container for structured data that will be accessed programmatically. In a higher-level language, these elements are typically accessed by name, for instance sk_buff->len. In assembly, however, we have to use an offset from the start of the structure. This may be where new reverse engineers go cross-eyed. It's easy to understand that accessing sk_buff->len gets the length of the packet data in our structure, but when you encounter "mov eax, dword ptr [ebx+30h]" things may get a little confusing (note: I didn't look up the actual offset for sk_buff->len). No need to fret, though; assembly can be much easier to understand if we spend time defining structures and their members in a more readable form.
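To make the name-versus-offset idea concrete, here is a minimal sketch in Python using ctypes. The Packet layout is invented for illustration (it is not the real sk_buff), and the 0x30 offset simply mirrors the made-up one in the instruction above.

```python
import ctypes
import struct


class Packet(ctypes.LittleEndianStructure):
    # Hypothetical layout, NOT the real sk_buff: 0x30 bytes of other
    # members followed by a 32-bit length.
    _fields_ = [
        ("other", ctypes.c_uint8 * 0x30),
        ("len", ctypes.c_uint32),
    ]


# Raw memory as the CPU sees it: 0x30 filler bytes, then the length.
raw = bytes(0x30) + struct.pack("<I", 1500)

# High-level view: access the member by name.
pkt = Packet.from_buffer_copy(raw)
by_name = pkt.len

# Assembly view: read a dword at [base + 30h].
by_offset = struct.unpack_from("<I", raw, 0x30)[0]

print(hex(Packet.len.offset), by_name, by_offset)
```

Both reads touch the same four bytes; the structure definition is what turns the bare offset into a name.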
First, let's look at the structures window in IDA (Shift+F9). Opening it up doesn't look too inviting, sans some help text. Here is what you probably see.
; Ins/Del : create/delete structure
; D/A/* : create structure member (data/ascii/array)
; N : rename structure or structure member
; U : delete structure member
If you have loaded symbols you may have some additional structures listed, but in general this window is empty when disassembling a new binary. The commands should be straightforward. When we want a new structure we press Ins (sorry, Apple laptops!) to create it. Doing so will ask us for a name. There are also some extra options like "Create before current structure" and "Don't include in the list" which are useful, but in most cases will not be needed.
Before we finish with this window by hitting "OK", click the "Add standard structure" button. A slew of important data structures should populate the window. The almost 10k entries listed are common structures that occur in various SDKs, such as the Windows Platform SDK. Choosing one of them will automatically add it, along with all of its associated members, which can be extremely helpful. You can experiment with these later; for now hit "Cancel" and create your new structure. You should get the following.
00000000 example struc ; (sizeof=0x0)
00000000 example ends
This is our empty structure. Not very exciting, so let's add a member. Clicking the top of the example structure and hitting "D" gives us a new field, or member. The default size of a newly created member is one byte. We can easily change this by selecting the field and hitting "D" again. Just like working with data in the disassembly window, repeated "D" keystrokes cycle through the supported data types (byte, word, dword). Also notice that the size of the structure updates accordingly. Let's add a few more just for fun. Here's mine.
00000000 example struc ; (sizeof=0x10)
00000000 field_0 dd ?
00000004 field_4 dd ?
00000008 field_8 dd ?
0000000C field_C db ?
0000000D field_D db ?
0000000E field_E db ?
0000000F field_F db ?
00000010 example ends
The automatic naming of members is handy. As you can see, they are named according to their offset. For instance, field_4 will be "example+4" in assembly. Let's say that through our reversing efforts we know that example+4 is a dword containing a type. We can change this name and get that much closer to a readable structure we can use in our disassembly. To do this, highlight field_4 and hit "N". This brings up a name window. Let's put in "type" for the name.
00000000 example struc ; (sizeof=0x10)
00000000 field_0 dd ?
00000004 type dd ?
00000008 field_8 dd ?
0000000C field_C db ?
0000000D field_D db ?
0000000E field_E db ?
0000000F field_F db ?
00000010 example ends
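For readers who think more easily in code, the finished layout can be mirrored with Python's ctypes as a rough cross-check. This is only a sketch outside of IDA: the class name Example and the layout() helper are mine, and the field types assume that dd is an unsigned 32-bit value and db an unsigned byte.

```python
import ctypes


class Example(ctypes.LittleEndianStructure):
    # Mirrors the structure listing above: three dwords, then four bytes.
    _fields_ = [
        ("field_0", ctypes.c_uint32),
        ("type", ctypes.c_uint32),
        ("field_8", ctypes.c_uint32),
        ("field_C", ctypes.c_uint8),
        ("field_D", ctypes.c_uint8),
        ("field_E", ctypes.c_uint8),
        ("field_F", ctypes.c_uint8),
    ]


def layout(struct_cls):
    """Return (offset, name, size) for every member of a ctypes structure."""
    rows = []
    for name, _ctype in struct_cls._fields_:
        field = getattr(struct_cls, name)
        rows.append((field.offset, name, field.size))
    return rows


for offset, name, size in layout(Example):
    print(f"{offset:08X} {name} ({size} bytes)")
print(f"{ctypes.sizeof(Example):08X} end of structure")
```

The renamed member still sits at offset 4 and the total size is still 0x10; renaming changes only how the offset is displayed.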
Fine. We have a structure represented in our structures window. Now we must use it. One of the most important things to keep in mind when we start to use these structures is to be certain we are applying them correctly. It does you zero good to apply this example struct to something that is actually an exception handler structure. Let's pretend this assembly snippet is accessing our newly created structure.
.text:01004130 push dword ptr [eax+4]
.text:01004133 call _createnum
This is typical structure access. Without a structure applied, the argument is ambiguous. Let's fix that by highlighting the offset "4" in "eax+4" and hitting "T". This brings up our defined structures. You should see the following.

Selecting our example.type member will convert the meaningless "eax+4" into the easily readable assembly below.
.text:01004130 push [eax+example.type]
.text:01004133 call _createnum
Creating and applying structures may seem tedious, but I promise it will make your life much easier when you start applying your newly created structures to your binary. Creating structures by hand can indeed get overwhelming when they are large; a structure with over 30 members is a nightmare. In this case we can automate the task.
#include <idc.idc>

static main()
{
    auto id, rc;
    auto i, count;
    auto sname, oname;

    // The value entered is divided by 4, so it is treated as a size in bytes.
    sname = AskStr("user_struct", "Structure name");
    count = AskLong(64, "Structure size in bytes") / 4;
    // -1 appends the structure to the list; 0 means it is not a union.
    id = AddStrucEx(-1, sname, 0);
    // Add one dword member per 4-byte slot, named after its offset.
    for (i = 0; i < count; i++)
    {
        oname = "field_" + ltoa(i * 4, 16);
        // 0x20000400 = FF_DWRD | FF_DATA
        rc = AddStrucMember(id, oname, i * 4, 0x20000400, -1, 4);
    }
}
Running this will create a structure with the name you supply, filled with dword members up to the size you enter. Writing IDC scripts to define structures can be very powerful. Let's take a look at a more complex example combining all of these techniques.
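As a quick aside before the larger example, the names and offsets the loop is meant to produce can be previewed outside IDA. The sketch below is plain Python, not IDC; dword_members() is a hypothetical helper, and the uppercase hexadecimal suffix is an assumption chosen to match IDA's automatic field names.

```python
def dword_members(count):
    """Return (name, offset) pairs for `count` consecutive dword members."""
    return [(f"field_{index * 4:X}", index * 4) for index in range(count)]


members = dword_members(16)

# Show the first few members in the style of the structures window.
for name, offset in members[:3]:
    print(f"{offset:08X} {name} dd ?")

# The last member starts 4 bytes before the end of the structure.
last_name, last_offset = members[-1]
print(f"{last_offset:08X} {last_name} dd ?")
```

Note that the range stops before count, so 16 members end with the one at offset 0x3C and the structure is 0x40 bytes long.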
Adobe Acrobat's plugin architecture makes extensive use of structures in the form of classes. Taking a look at the assembly will give you nightmares if you do not define and apply structure labels. Here is a small example.
.text:23834076 mov esi, dword_239345BC
.text:2383407C mov eax, dword_2393459C
.text:23834081 add esi, 18h
.text:23834084 call dword ptr [eax+600h]
...
.text:238340A7 movzx eax, ax
.text:238340AA mov [ebp+var_4], eax
.text:238340AD mov eax, dword_239345BC
.text:238340B2 push esi
.text:238340B3 call dword ptr [eax+30h]
.text:238340B6 add esp, 24h
.text:238340B9
.text:238340B9 loc_238340B9:
.text:238340B9 mov eax, dword_23934560
.text:238340BE call dword ptr [eax+0Ch]
With nothing labeled this is nonsense. Fixing up the names and adding structures gives us the following.
.text:23834076 mov esi, pASExtraHFT
.text:2383407C mov eax, pAcroViewHFT
.text:23834081 add esi, 18h
.text:23834084 call [eax+s_acroviewHFT.AVAppGetLanguageEncoding] ; AVProcs.h
...
.text:238340A7 movzx eax, ax
.text:238340AA mov [ebp+var_4], eax
.text:238340AD mov eax, pASExtraHFT
.text:238340B2 push esi
.text:238340B3 call [eax+s_asextraHFT.ASTextDestroy] ; ASExtraProcs.h
.text:238340B6 add esp, 24h
.text:238340B9
.text:238340B9 loc_238340B9:
.text:238340B9 mov eax, pCoreHFT
.text:238340BE call [eax+s_coreHFT.ACPopExceptionFrame] ; AcroRd32.ACPopExceptionFrame
Much better. We can now focus on what this function is doing, instead of puzzling over which methods it is invoking. Also notice that when we apply a name we get a comment inserted. You can do this by adding comments to members in your defined structure. All of these names were automatically added to the IDB via a script. A little research and work before reversing has saved countless hours.
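Conceptually, labeling each of these operands boils down to a lookup from offset to member name. A minimal sketch of that lookup in Python follows. The three offset/name pairs come from the listings above; the STRUCTS table, ida_hex(), and label_operand() are hypothetical helpers for illustration and are not part of IDA's API.

```python
# Offset -> member name, using only the pairs visible in the listings above.
STRUCTS = {
    "s_acroviewHFT": {0x600: "AVAppGetLanguageEncoding"},
    "s_asextraHFT": {0x30: "ASTextDestroy"},
    "s_coreHFT": {0x0C: "ACPopExceptionFrame"},
}


def ida_hex(value):
    """Format a non-negative offset in the style of the listings (4, 30h, 0Ch)."""
    if value < 10:
        return str(value)
    text = f"{value:X}"
    if text[0] in "ABCDEF":
        text = "0" + text  # a leading letter gets a 0 prefix
    return text + "h"


def label_operand(reg, offset, struct_name):
    """Return the operand with the offset replaced by a member name.

    Falls back to the raw offset when the structure or member is unknown.
    """
    member = STRUCTS.get(struct_name, {}).get(offset)
    if member is None:
        return f"[{reg}+{ida_hex(offset)}]"
    return f"[{reg}+{struct_name}.{member}]"


print(label_operand("eax", 0x600, "s_acroviewHFT"))
print(label_operand("eax", 0x30, "s_asextraHFT"))
print(label_operand("eax", 0x0C, "s_coreHFT"))
```

An offset with no defined member is left as a bare number, which is exactly the case where more reversing (or a bigger table) is needed.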
There are many other facets to adding and using structures; I have only touched on their basic usage. Try playing around with creating structures and applying them to your IDB. I can't stress enough how important this is when getting into larger projects. Hope this gave you a good starting point.
-Cody
[UPDATE] I uploaded the vt2st.idc IDC script that Ali mentioned below in the comments.
