TippingPoint Digital Vaccine Laboratories

MindshaRE: Using Structures

This week on MindshaRE we take a quick look at structures. I often see new reverse engineers skip creating the structures they encounter when disassembling a binary. While structures can be slightly time-consuming to create, the payoff far outweighs the minimal time investment. The biggest benefit comes when following OO method invocation, file format parsing, or packet tracing. Hopefully the examples I have will convince you to spend those extra 20 minutes defining clean structures the next time you run across them in a binary.

MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.

Everyone knows what a structure is: a defined container for structured data that will be accessed programmatically. In a higher-level language, access to these elements is typically by name, for instance sk_buff->len. In assembly, however, we have to use an offset from the start of the structure. This may be where new reverse engineers go cross-eyed. It's easy to understand that accessing sk_buff->len gets the length of the packet data in our structure. But when you encounter "mov eax, dword ptr [ebx+30h]" things may get a little confusing (Note: I didn't look up the actual offset for sk_buff->len). No need to fret though; assembly becomes much easier to understand if we spend time defining structures and their members in a more readable form.
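To make the name-versus-offset idea concrete, here is a minimal C sketch. The struct packet type, its 0x30 bytes of padding, and both helper functions are made up for illustration; this is not the real sk_buff layout, just something that puts a len member at the offset used in the instruction above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet header; NOT the real Linux sk_buff layout. */
struct packet {
    uint8_t  reserved[0x30]; /* members we have not identified yet */
    uint32_t len;            /* lands at offset 0x30 */
};

/* Compile-time check that len really sits where the assembly looks. */
static_assert(offsetof(struct packet, len) == 0x30, "len should sit at +30h");

/* What the source code says: access by name. */
static uint32_t len_by_name(const struct packet *p)
{
    return p->len;
}

/* What the disassembly shows: a base pointer plus a raw offset,
   roughly "mov eax, dword ptr [ebx+30h]". */
static uint32_t len_by_offset(const void *base)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + 0x30, sizeof value);
    return value;
}
```

Both helpers return the same value for the same object. Defining a structure in IDA is what lets you read the second form as if it were the first.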

First, let's look at the structures window in IDA (Shift+F9). Opening it up doesn't look too inviting, sans some help text. Here is what you probably see.
 ; Ins/Del : create/delete structure
 ; D/A/*   : create structure member (data/ascii/array)
 ; N       : rename structure or structure member
 ; U       : delete structure member
If you have loaded symbols you may have some additional structures listed, but in general this window is empty when disassembling a new binary. The commands should be straightforward. When we want a new structure we press Ins to create it (sorry, Apple laptops!). Doing so will ask us for a name. There are also some extra options like "Create before current structure" and "Don't include in the list" which are useful, but in most cases will not be needed.

Before we finish with this window by hitting "OK", click the "Add standard structure" button. A slew of important data structures should populate the window. The almost 10k entries listed are common structures that occur in various SDKs like the Windows Platform SDK. Choosing one of them will automatically add it along with all of its associated members, which can be extremely helpful. You can experiment with these later; for now hit "Cancel" and create your new structure. You should get the following.
00000000 example         struc ; (sizeof=0x0)
00000000 example         ends
This is our empty structure. Not very exciting, so let's add a member. Clicking the top of the example structure and hitting "D" gives us a new field, or member. The default size of newly created members/fields is one byte. We can easily change this by selecting the field and hitting "D" again. Just like working with data in the disassembly window, repeated "D" keystrokes will cycle the field between the supported data types (Byte, Word, Dword). Also notice that the size of the structure updates accordingly. Let's add a few more just for fun. Here's mine.
00000000 example         struc ; (sizeof=0x10)
00000000 field_0         dd ?
00000004 field_4         dd ?
00000008 field_8         dd ?
0000000C field_C         db ?
0000000D field_D         db ?
0000000E field_E         db ?
0000000F field_F         db ?
00000010 example         ends
The automatic naming of members is handy. As you can see, they are named according to their offset. For instance, field_4 will be "example+4" in assembly. Let's say that through our reversing efforts we know that example+4 is a dword containing a type. We can change this name and get that much closer to a readable structure we can use in our disassembly. To do this, highlight field_4 and hit "N". This brings up a name window. Let's put in "type" for the name.

00000000 example         struc ; (sizeof=0x10)
00000000 field_0         dd ?
00000004 type            dd ?
00000008 field_8         dd ?
0000000C field_C         db ?
0000000D field_D         db ?
0000000E field_E         db ?
0000000F field_F         db ?
00000010 example         ends
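If it helps to think in source terms, the listing above corresponds roughly to the C declaration below. This is only a sketch: the fixed-width types are my assumption about what "dd" and "db" stand for here (4-byte and 1-byte members), and the static_assert lines simply check that the compiler lays things out the way IDA reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C view of the IDA listing. Only "type" has a real name so far;
   the rest keep IDA's offset-based placeholders. */
struct example {
    uint32_t field_0;  /* +0x00, dd */
    uint32_t type;     /* +0x04, dd (renamed from field_4) */
    uint32_t field_8;  /* +0x08, dd */
    uint8_t  field_C;  /* +0x0C, db */
    uint8_t  field_D;  /* +0x0D, db */
    uint8_t  field_E;  /* +0x0E, db */
    uint8_t  field_F;  /* +0x0F, db */
};

/* The offsets and total size should match the structures window. */
static_assert(offsetof(struct example, type) == 0x04, "type at +4");
static_assert(offsetof(struct example, field_C) == 0x0C, "field_C at +0Ch");
static_assert(sizeof(struct example) == 0x10, "sizeof=0x10");
```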

Fine. We have a structure represented in our structures window.  Now we must use it. One of the most important things to keep in mind when we start to use these structures is to be certain we are applying them correctly.  It does you zero good to apply this example struct to something that is actually an exception handler structure.  Let's pretend this assembly snippet is accessing our newly created structure.

.text:01004130     push    dword ptr [eax+4]
.text:01004133     call    _createnum

This is typical structure access.  Without applying a type to it the argument seems ambiguous.  Let's fix that by highlighting the offset "4" in "eax+4" and hitting "T".  This brings up our defined structures.  You should see the following.



Selecting our example.type member will convert the meaningless "eax+4" into the easily readable assembly below.
.text:01004130     push    [eax+example.type]
.text:01004133     call    _createnum
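In C terms, the labeled pair of instructions reads like an ordinary member access passed to a function. The sketch below is hypothetical: createnum here is a stand-in that just returns its argument (the real _createnum is unknown), and the four trailing bytes of the structure are collapsed into an array for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as the example structure: "type" lives at offset 4. */
struct example {
    uint32_t field_0;
    uint32_t type;        /* +0x04 */
    uint32_t field_8;
    uint8_t  tail[4];     /* field_C..field_F */
};

/* Stand-in for the callee; what the real _createnum does is unknown. */
static uint32_t createnum(uint32_t type)
{
    return type;
}

/* push [eax+example.type] / call _createnum, written as C. */
static uint32_t call_site(const struct example *obj)
{
    return createnum(obj->type);
}
```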
Creating and applying structures may seem tedious.  But I promise it will make your life much easier when you start applying your newly created structures to your binary. Creating structures can indeed get overwhelming when dealing with large structures.  For instance, creating a structure with over 30 members by hand is a nightmare.  In this case we can automate the task.
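#include <idc.idc>

// Create a structure filled with dword members named field_<offset>.
static main()
{
    auto id, rc;
    auto i, count;
    auto sname, oname;

    sname = AskStr("user_struct", "Structure name");
    // The value entered is a size in bytes; divide by 4 to get the dword count.
    count = AskLong(64, "Structure size in bytes") / 4;

    id = AddStrucEx(-1, sname, 0);
    if (id == -1)
    {
        Message("Could not create structure %s\n", sname);
        return;
    }

    // Use "<" here: "<=" would add one member past the requested size.
    for (i = 0; i < count; i++)
    {
        oname = "field_" + ltoa(i * 4, 16);
        // 0x20000400 is FF_DWRD | FF_DATA; each member is 4 bytes.
        rc = AddStrucMember(id, oname, i * 4, 0x20000400, -1, 4);
    }
}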
Running this creates a structure with the name you chose, filled with dword members. Writing IDC scripts to define structures can be very powerful. Let's take a look at a more complex example combining all of these techniques.
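The only real logic in a script like this is the bookkeeping: how many dwords fit in a given size, and what each member is called. Here is that arithmetic as a small C sketch you can test outside of IDA. Both helper names are mine, and the uppercase hex simply mirrors IDA's automatic names such as field_C.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of whole dword members that fit in size_bytes. */
static unsigned dword_count(unsigned size_bytes)
{
    return size_bytes / 4;
}

/* Write the placeholder name for the member at byte offset off,
   for example 12 -> "field_C". Returns snprintf's result. */
static int member_name(char *buf, size_t buflen, unsigned off)
{
    return snprintf(buf, buflen, "field_%X", off);
}
```

For a 64-byte structure this gives 16 members, the last of which starts at offset 0x3C.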

Adobe Acrobat's plugin architecture makes extensive use of structures in the form of classes. Looking at the assembly will give you nightmares if you do not define and apply structure labels. Take a look at a small example.
.text:23834076     mov     esi, dword_239345BC
.text:2383407C     mov     eax, dword_2393459C
.text:23834081     add     esi, 18h
.text:23834084     call    dword ptr [eax+600h]
...                
.text:238340A7     movzx   eax, ax
.text:238340AA     mov     [ebp+var_4], eax
.text:238340AD     mov     eax, dword_239345BC
.text:238340B2     push    esi
.text:238340B3     call    dword ptr [eax+30h]
.text:238340B6     add     esp, 24h
.text:238340B9
.text:238340B9 loc_238340B9:
.text:238340B9     mov     eax, dword_23934560
.text:238340BE     call    dword ptr [eax+0Ch]
With nothing labeled this is nonsense.  Fixing up the names and adding structures gives us the following.
.text:23834076     mov     esi, pASExtraHFT
.text:2383407C     mov     eax, pAcroViewHFT
.text:23834081     add     esi, 18h
.text:23834084     call    [eax+s_acroviewHFT.AVAppGetLanguageEncoding] ; AVProcs.h
...                
.text:238340A7     movzx   eax, ax
.text:238340AA     mov     [ebp+var_4], eax
.text:238340AD     mov     eax, pASExtraHFT
.text:238340B2     push    esi
.text:238340B3     call    [eax+s_asextraHFT.ASTextDestroy] ; ASExtraProcs.h
.text:238340B6     add     esp, 24h
.text:238340B9
.text:238340B9 loc_238340B9:
.text:238340B9     mov     eax, pCoreHFT
.text:238340BE     call    [eax+s_coreHFT.ACPopExceptionFrame] ; AcroRd32.ACPopExceptionFrame
Much better.  We can now focus on what this function is doing, instead of the methods it is invoking.  Also notice when we apply a name we get a comment inserted.  You can do this by adding comments to members in your defined structure. All of these names were automatically added to the IDB via a script.  A little research and work before reversing has saved countless hours.
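The pattern in this listing, loading a table pointer and calling through a fixed offset, is what a structure of function pointers looks like once compiled. Below is a toy C sketch of the idea. The three-slot demo_hft table and its functions are invented for illustration and have nothing to do with the real Acrobat HFT layouts, which are far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical procedure type and table; slot names are made up. */
typedef int (*hft_proc)(int);

struct demo_hft {
    hft_proc slot_0;
    hft_proc slot_1;
    hft_proc get_answer;
};

static int add_one(int x)   { return x + 1; }
static int times_two(int x) { return x * 2; }
static int answer(int x)    { (void)x; return 42; }

static const struct demo_hft g_table = { add_one, times_two, answer };

/* Named call: what the labeled disassembly lets you read. */
static int call_by_name(const struct demo_hft *t, int arg)
{
    return t->get_answer(arg);
}

/* Offset call: what the raw disassembly shows, "call dword ptr [eax+N]". */
static int call_by_offset(const void *table, size_t off, int arg)
{
    hft_proc fn;
    memcpy(&fn, (const unsigned char *)table + off, sizeof fn);
    return fn(arg);
}
```

Once each slot has a name, the offset form stops being a puzzle, which is all the structure labels in the listing are doing.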

There are many other facets to adding and using structures; I have only touched on their basic usage. Try playing around with creating structures and applying them to your IDB. I can't stress enough how important this is when getting into larger projects. Hope this gave you a good starting point.

-Cody

[UPDATE] I uploaded the vt2st.idc IDC script that Ali mentioned below in the comments.
Published On: 2008-09-04 13:36:25

Comments

  1. Ali Rizvi-Santiago commented on 2008-09-04 @ 17:02

    If anybody digs through a lot of code with c++ vtables.
    I wrote the following .idc script for automatically building structures as one incrementally discovers the methods for a particular object. Might be useful to other people who like to enumerate all methods of an object, as it can save a lot of typing. ;)

  2. Cody Pierce commented on 2008-09-04 @ 18:01

    @Ali: We appended the script to the end of the post. You rock.

  3. Max commented on 2008-09-07 @ 17:43

    Great article, as usual. FWIW, Fn-M sends Insert to a Fusion guest on a mac laptop. Creating IDA structures is the only reason why this is useful at all :)

  4. Cody Pierce commented on 2008-09-09 @ 11:14

    @Max: Thanks for the keyboard help. I am still bitter at my macbook because I spent 10 minutes figuring out how to break into WinDbg. Fn-Ctrl-Shift-Esc, is what I finally used :)

  5. Tey' commented on 2008-09-09 @ 12:35

    I wish I read your article a few months ago, it would have avoid me a few headaches :)

    However, as you said above, it's really tedious to convert every operand to structure field pointer manually. Is there any way to tell IDA that a specific register holds the address to a structure variable, so that IDA automatically converts the next references to that register to structure fields accesses ? Some plugins claim to be able to do that, but my (freeware) version of IDA can't use them -_-

  6. Cody Pierce commented on 2008-09-10 @ 11:20

    @Tey': I am not aware of how to declare a register as a structure for IDA so that it will automatically apply a name. You can highlight a group of instructions, press T, and it should pop up the structure window so you can apply labels to all highlighted register+offsets. I bet you could script this to a degree, although you'd have to make sure you are dealing with the right register/structure.

  7. Tey' commented on 2008-09-11 @ 15:34

    Oh great ! I didn't know IDA was able to handle it when selecting multiple lines. Thanks for the tip ... and thanks for your interesting articles also ;)

