TippingPoint Digital Vaccine Laboratories

MindshaRE: Cross References in IDA


Besides the navigation keys (Esc, Enter, Ctrl-Enter, Arrows), I would say the key sequence I use most often is X / Ctrl-X.  That's right, cross references.  Okay, maybe I use others just as much, but I wanted to add some impact to the topic, because today's MindshaRE is about cross references in IDA.  I will briefly cover what they are, describe the different types of references, and share some scripts utilizing xrefs that will hopefully make your day easier.

MindshaRE is our weekly look at some simple reverse engineering tips and tricks.  The goal is to keep things small and discuss everyday aspects of reversing.  You can view previous entries by going through our blog history.

Cross references in IDA are invaluable.  They show any code or data that references, or is referenced from, your current position within the binary.  They come in the form of function references, local variable cross references, or data xrefs.  When navigating a binary, one of the most common uses is function cross references: we need to see what other pieces of code may reach the one we are interested in.  This is exactly what xrefs are for.

Let's say we want to see all functions that call the following routine:
.text:76F2CA90 Dns_GetRandomXid proc near  ; CODE XREF: Dns_BuildPacket+79
.text:76F2CA90                             ; Dns_NegotiateTkeyWithServer+20C
.text:76F2CA90
.text:76F2CA90 call    Dns_RandGenerateWord
.text:76F2CA95 jmp     loc_76F23D8A
.text:76F2CA95 Dns_GetRandomXid endp
Pressing Ctrl-X (JumpXref) while our cursor is on the function name brings up the following dialog listing the cross references.



Note that references are also visible as automatic comments under the function name.  This is useful for enumerating xrefs at a glance. The number of references shown is configurable through the SHOW_XREFS variable in IDA.CFG.

Looking at the dialog box we see four columns: Direction, Type, Address, and Text.  Direction denotes where in the binary the caller is located relative to the current function: Up means before it, at a lower address, and Down means after it, at a higher address.  Type indicates what kind of cross reference we are looking at.  In our example all the references are of type "p", meaning procedure; later we will see that others also exist, such as write, read, and offset.  Address is the location of the cross reference, in this instance the location of the call to our current function.  Because we have symbols it is shown as an offset from a symbol, but without symbols it would be a plain hex address.  Text is the disassembly that appears at the referencing address.  For this example we see a typical call; other references may show the instruction that reads or writes our target.
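To make the Direction column concrete, here is a minimal sketch in plain Python that does not touch the IDA API.  The XrefRow type, the make_row helper, and the addresses in the usage lines are all hypothetical and only illustrate the Up/Down rule described above; inside IDA the same information could be gathered from something like idautils.XrefsTo.

```python
from typing import NamedTuple


class XrefRow(NamedTuple):
    """One row of the dialog: the four columns described above."""
    direction: str  # "Up" or "Down"
    type: str       # e.g. "p", "r", "w", "o"
    address: int    # where the reference comes from
    text: str       # disassembly at that address


def make_row(target_ea: int, from_ea: int, xref_type: str, text: str) -> XrefRow:
    # Up: the reference sits at a lower address than the target.
    # Down: it sits at a higher address. Equal addresses are treated
    # as Down here, which is an arbitrary choice for this sketch.
    direction = "Up" if from_ea < target_ea else "Down"
    return XrefRow(direction, xref_type, from_ea, text)


# Hypothetical addresses, not taken from the binary above.
rows = [
    make_row(0x2000, 0x1000, "p", "call target"),
    make_row(0x2000, 0x3000, "p", "call target"),
]
for row in rows:
    print(f"{row.direction:<5} {row.type} {row.address:08X} {row.text}")
```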

Having symbols helps us easily identify the purpose of a function.  Cross references help us identify the path code takes to reach a function.  To take matters one step further, I created an IDAPython script called get_recursive_xrefs.py.  The script takes a function and recursively grabs the cross references to it, giving us a calling tree that goes as far back as possible from the current function.  Running the script produces the following sample output:
Getting xrefs to 76f39c9f (Dns_RandGenerateWord)
========================================================================
Dns_RandGenerateWord
 Dns_GetRandomXid
  Dns_BuildPacket
   Query_SingleName
    Query_Main
     Query_InProcess
      DnsQuery_W
       privateNarrowToWideQuery
        DnsQuery_UTF8
         DnsFindAuthoritativeZone
          DoQuickFAZ
          CompareTwoAdaptersForSameNameSpace
         DnsQuery_A
        ShimDnsQueryEx
         CombinedQueryEx
          DnsQueryExW
    QueryDirectEx
     Dns_FindAuthoritativeZoneLib
      Dns_PingAdapterServers
       DnsModifyRRSet_Ex
        DnsRegisterRRSet_Ex
         DnsModifyRRSet_Ex
          DnsRegisterRRSet_Ex
     Dns_UpdateLib
      Dns_UpdateLibEx
       DnsUpdate
        DoQuickUpdateEx
         DoMultiMasterUpdate
          DoQuickUpdate
   Dns_NegotiateTkeyWithServer
    Dns_DoSecureUpdate
     sub_76F3BB68
      Dns_UpdateLib
       Dns_UpdateLibEx
        DnsUpdate
         DoQuickUpdateEx
          DoMultiMasterUpdate
As you can see we get a nice list of calling functions.  In an instant I can find what might create transaction IDs in Windows.
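The idea behind such a recursive walk can be sketched in a few lines of plain Python.  This is not the get_recursive_xrefs.py script itself, just an illustration under stated assumptions: caller_tree and the CALLERS dictionary are hypothetical names, and the dictionary holds only a small subset of the tree shown above.  Inside IDA the callers_of callback could instead be built on idautils.CodeRefsTo.

```python
def caller_tree(func, callers_of, depth=0, path=()):
    """Yield (depth, name) for func and, recursively, everything that calls it."""
    yield depth, func
    if func in path:
        # This function is already an ancestor on the current branch,
        # so stop here instead of recursing forever on a call cycle.
        return
    for caller in callers_of(func):
        yield from caller_tree(caller, callers_of, depth + 1, path + (func,))


# Hypothetical stand-in for IDA's xref database: callee -> list of callers.
CALLERS = {
    "Dns_RandGenerateWord": ["Dns_GetRandomXid"],
    "Dns_GetRandomXid": ["Dns_BuildPacket", "Dns_NegotiateTkeyWithServer"],
    "Dns_BuildPacket": ["Query_SingleName"],
    "Dns_NegotiateTkeyWithServer": ["Dns_DoSecureUpdate"],
}


def lookup(name):
    return CALLERS.get(name, [])


for depth, name in caller_tree("Dns_RandGenerateWord", lookup):
    print(" " * depth + name)
```

Tracking the current branch in path, rather than keeping one global "seen" set, means a function that is reachable along several routes is still listed under each of them, which matches the repeated names in the sample output.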

I mentioned earlier that functions are certainly not the only thing you can cross reference.  Inside a function we can also look up references to a local or global variable used in an operand (JumpOpXref).  Let's look at the next section of assembly.
.text:76F39CCC inc     edi
.text:76F39CCD push    edi
.text:76F39CCE push    offset aMicrosoftStron ; "Microsoft Strong Crypto"...
.text:76F39CD3 push    ebx
.text:76F39CD4 lea     eax, [ebp+var_4]
.text:76F39CD7 push    eax
.text:76F39CD8 call    ds:_imp__CryptAcquireContextA
.text:76F39CDE test    eax, eax
.text:76F39CE0 jnz     short loc_76F39CE5
Putting our cursor on the local name "var_4" at address 76F39CD4 and pressing "X" gives us a familiar dialog box.



We discussed all of the columns already; the only new information is the "w" and "r" types.  I alluded to these earlier: they simply mean the instruction either writes to or reads from the target variable.  This can be extremely helpful in identifying where a variable is initialized.
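As a rough illustration of using the "w" type to hunt for an initialization, the plain Python sketch below picks the lowest-addressed write out of a list of (address, type) pairs.  The first_write helper and the sample addresses are hypothetical, and the lowest address is only a heuristic: it is not guaranteed to be the first write that actually executes.

```python
def first_write(xrefs):
    """Return the lowest address among "w" references, or None if there are none."""
    writes = [ea for ea, kind in xrefs if kind == "w"]
    return min(writes) if writes else None


# Hypothetical references to one local variable: (address, type letter).
sample = [
    (0x1030, "r"),
    (0x1010, "w"),
    (0x1050, "w"),
    (0x1004, "r"),
]

candidate = first_write(sample)
if candidate is not None:
    print(f"first write candidate: {candidate:08X}")
```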

Let's move to a different section in the binary.  Looking through the .data section of most binaries can be interesting.  We obviously see lots of global addresses that are used to hold values, and either of the xref shortcuts (Ctrl-X, X) will get us the references to those locations.  But let's look at another common occurrence in the data section.  Vtables are accessed via the data section in most cases.  The problem is that the only xref to the table will be at its beginning, so the individual entries show no references of their own.  However, if we check the xrefs to each function in the binary, we can determine which functions are referenced from the data section.  For instance, the xrefs to the function MxWireRead look like this.



Following those shows us an obvious table of handlers.
.data:0104E300 RRWireReadTable ; DATA XREF: Wire_CreateRecordFromWire+43
.data:0104E300   dd offset CopyWireRead 
.data:0104E304   dd offset AWireRead
.data:0104E308   dd offset PtrWireRead
.data:0104E30C   dd offset PtrWireRead
.data:0104E310   dd offset PtrWireRead
.data:0104E314   dd offset PtrWireRead
.data:0104E318   dd offset SoaWireRead
.data:0104E31C   dd offset PtrWireRead
.data:0104E320   dd offset PtrWireRead
.data:0104E324   dd offset PtrWireRead
.data:0104E328   dd offset CopyWireRead
.data:0104E32C   dd offset CopyWireRead
.data:0104E330   dd offset PtrWireRead
.data:0104E334   dd offset CopyWireRead
.data:0104E338   dd offset MinfoWireRead
.data:0104E33C   dd offset MxWireRead...
...
Doing this by hand is clearly a tedious process, perfect for automating with a script.  So I wrote an IDAPython script called find_data_section_functions.py, which runs through every function's xrefs and reports those that originate from the data section, producing the following example output:
0x0104e164: AFileRead
...
0x0104e234: AWireWrite
0x0104e1d0: AaaaFileRead
0x0104e030: AaaaFlatRead
0x0104df60: AaaaValidate
0x0104e1e8: AtmaFileRead
0x0104e048: AtmaFlatRead
0x010135a8: ControlCallback
...
0x0104e0f4: KeyFlatWrite
0x01013acf: Log_PrintRoutine
0x01006f9c: MIDL_user_allocate
0x01006fa0: MIDL_user_free
...
0x0104edb4: MxFileWrite
0x0104edc0: MxFileWrite
0x0104edcc: MxFileWrite
0x0104dffc: MxFlatRead
0x0104e008: MxFlatRead
0x0104e014: MxFlatRead

...
0x01006ff0: R_DnssrvComplexOperation
0x01007004: R_DnssrvComplexOperation2
0x01006ff4: R_DnssrvEnumRecords
0x01007008: R_DnssrvEnumRecords2
0x01006fe8: R_DnssrvOperation
0x01006ffc: R_DnssrvOperation2
0x01006fec: R_DnssrvQuery
0x01007000: R_DnssrvQuery2
...
0x01017044: freeDpInfo
0x01018b41: freeServerObject
0x0101736d: freeStringArray
0x0101aa4f: pluginAllocator
0x0101aa4a: pluginFree
0x0104ed00: processPrimaryLine
0x0101b67d: recurseConnectCallback
0x01041e90: respondToServiceControlMessage
0x0104eea4: startDnsServer
0x0104d20c: sub_1020330
0x0104d214: sub_1020330
0x0102f8c7: updateForwardConnectCallback
0x0103606d: zoneTransferSendThread
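The filtering step can be sketched in plain Python as follows.  This is not find_data_section_functions.py itself: functions_referenced_from, the toy tables, the function addresses, and the segment boundary are all assumptions made for illustration, while the two table slots are taken from the RRWireReadTable listing above.  On recent IDA versions something like idautils.Functions(), idautils.XrefsTo() and idc.get_segm_name() could supply the three inputs, though the exact names differ between IDAPython releases.

```python
def functions_referenced_from(functions, xrefs_to, segment_of, wanted=".data"):
    """Return sorted (from_ea, name) pairs for references that come from `wanted`."""
    hits = []
    for func_ea, name in functions:
        for from_ea in xrefs_to(func_ea):
            if segment_of(from_ea) == wanted:
                hits.append((from_ea, name))
    # Sort by function name, then by the referencing address.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Toy data. The function addresses and the code reference are hypothetical;
# the two table slots come from the RRWireReadTable listing above.
FUNCTIONS = [(0x01001000, "AWireRead"), (0x01002000, "MxWireRead")]
XREFS = {
    0x01001000: [0x0104E304],
    0x01002000: [0x0104E33C, 0x01003000],  # second entry: a call from code
}


def toy_segment_of(ea):
    # Assumption for this sketch: everything at or above 0x0104D000 is .data.
    return ".data" if ea >= 0x0104D000 else ".text"


def toy_xrefs_to(ea):
    return XREFS.get(ea, [])


for from_ea, name in functions_referenced_from(FUNCTIONS, toy_xrefs_to, toy_segment_of):
    print(f"0x{from_ea:08x}: {name}")
```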
Finally, IDA provides three additional graphs for viewing cross references.  These graphically display the same cross references we covered, and can even display a down graph showing all the functions your current function may call.  They are located under View->Graphs: Xrefs to, Xrefs from, and User xrefs chart.  While these may be handy they suffer from two problems.  First and foremost, you can't navigate them.  That means you can see an interesting function but have to switch back to IDA and manually type in the address to jump there.  Secondly, they can be unwieldy, often showing thousands of functions.  This can be reduced by using the User xrefs chart and limiting the recursion depth, but I would rather just run a script I can interact with.  Play around with the graphs; you may find them very helpful.

There are many, many other uses for cross references, and I can't possibly cover everything in this little post.  I hope this has been a good intro to them, or maybe sparked some ideas of your own.  Leave a comment if you have any novel uses or other useful hints.  I hope you enjoyed this week's MindshaRE.

-Cody
Published On: 2008-07-24 16:29:30

Comments

  1. Dima commented on 2008-07-24 @ 22:27

    When working with MSVC compiled binaries I also find it useful to have a script which gives you the class names, extracted from runtime type identification information, that reference a particular function.

  2. haj commented on 2008-07-25 @ 05:34

    The MindshaRE series is brilliant!
    Little chunks of really useful information accessible to those of us still learning the art of reversing.
    Thanks for sharing them and keep 'em coming!

  3. Cody Pierce commented on 2008-07-25 @ 10:35

    @Dima: Great idea! I have some similar scripts that I will hopefully get around to talking about.

    @haj: Thanks for the kind words. I am glad you enjoy it.

  4. Rolf Rolles commented on 2008-07-28 @ 10:42

    Agreed: it's a shame that the user xrefs feature still uses WinGraph instead of IDA's graph viewer. I've spent a few hours investigating this and it seems like it shouldn't be too difficult to whip up a plugin that does this. IDA's graphing interface is IMO one of the nicer parts of the IDA SDK; it's a shame that so few plugins have surfaced which take advantage of it.

  5. djinus commented on 2008-08-12 @ 08:24

    get_recursive_xrefs.py does not work when I try it; it says "NameError: name 'log' is not defined".

  6. Cody Pierce commented on 2008-08-13 @ 14:04

    @djinus: Sorry about that. I use a helper library for IDAPython scripts I write. You can replace the calls to "log" with the standard python "print" and it should work just fine.

