I would say that, besides the navigation keys (Esc, Enter, Ctrl-Enter, Arrows), the key sequence I use most often is X / Ctrl-X. That's right, cross references. Okay, maybe I use others just as much, but for today's MindshaRE we will be discussing cross references in IDA (I wanted to add some impact to the topic). I will briefly cover what they are and the different types of references, and share some scripts utilizing xrefs that will hopefully make your day easier.
MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
Cross references in IDA are invaluable. They show any code or data that references, or is referenced from, your current position within the binary. These can take the form of function references, local variable cross references, or data xrefs. When navigating a binary, one of the most common uses is function cross references. We need the ability to see what other pieces of code may reach the one we are interested in. This is exactly what xrefs are for.
Let's say we want to see all functions that call the following routine:
.text:76F2CA90 Dns_GetRandomXid proc near ; CODE XREF: Dns_BuildPacket+79
.text:76F2CA90 ; Dns_NegotiateTkeyWithServer+20C
.text:76F2CA90
.text:76F2CA90 call Dns_RandGenerateWord
.text:76F2CA95 jmp loc_76F23D8A
.text:76F2CA95 Dns_GetRandomXid endp
Pressing Ctrl-X (JumpXref) when our cursor is on the function name, we get the following dialog listing the cross references.

Note that references are also visible as automatic comments under the function name. This is useful for enumerating xrefs at a glance. The number of references shown is configurable through the SHOW_XREFS variable in IDA.CFG.
Looking at the dialog box we see four columns: Direction, Type, Address, and Text. The first column denotes in which direction in the binary the caller is located: Up means before the current function, at a lower address, and Down means after the function, at a higher address. Type indicates what type of cross reference we are looking at. In our example all our references are of type "p", meaning procedure; later we will see that others also exist, such as write, read, and offset. Address is the location of the cross reference. In this instance the address is the location of the call to our current function. In our example we have symbols, so it is shown as an offset from that symbol, but if we didn't have symbols it would be a typical hex address. Text is the text that appears at our reference's address. For this example we see a typical call; others may show the instructions reading or writing to our cross reference.
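To make the columns concrete, here is a minimal sketch in plain Python (no IDA required) that models one dialog row and derives the Direction column from the addresses. The `Xref` tuple, the helper names, and the two caller addresses are assumptions made up for illustration; only the target address comes from the listing above.

```python
from collections import namedtuple

# One row of the dialog: referencing address, type code, and the text at that address.
Xref = namedtuple("Xref", ["frm", "type", "text"])

# Type codes mentioned in the text.
TYPE_NAMES = {"p": "procedure", "r": "read", "w": "write", "o": "offset"}

def direction(frm, target):
    """'Up' when the reference sits at a lower address than the target, else 'Down'."""
    return "Up" if frm < target else "Down"

def describe(xref, target):
    """Format one row roughly the way the dialog lays it out."""
    return "%-4s %s %08X %s" % (direction(xref.frm, target), xref.type, xref.frm, xref.text)

TARGET = 0x76F2CA90  # Dns_GetRandomXid, from the listing above

# Hypothetical caller addresses, made up for illustration only.
ROWS = [
    Xref(0x76F2A000, "p", "call Dns_GetRandomXid"),
    Xref(0x76F3B000, "p", "call Dns_GetRandomXid"),
]

for row in ROWS:
    print(describe(row, TARGET), "(%s)" % TYPE_NAMES[row.type])
```

The first row lands before the function and is reported as Up; the second lands after it and is reported as Down.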
Having symbols helps us easily identify the purpose of a function. Cross references help us identify the path code takes to a function. As an example, I created an IDAPython script called get_recursive_xrefs.py that takes matters one step further. This script takes a function and recursively grabs the cross references to it. This gives us a calling tree as far back as possible from the current function. Running the script produces the following sample output:
Getting xrefs to 76f39c9f (Dns_RandGenerateWord)
========================================================================
Dns_RandGenerateWord
Dns_GetRandomXid
Dns_BuildPacket
Query_SingleName
Query_Main
Query_InProcess
DnsQuery_W
privateNarrowToWideQuery
DnsQuery_UTF8
DnsFindAuthoritativeZone
DoQuickFAZ
CompareTwoAdaptersForSameNameSpace
DnsQuery_A
ShimDnsQueryEx
CombinedQueryEx
DnsQueryExW
QueryDirectEx
Dns_FindAuthoritativeZoneLib
Dns_PingAdapterServers
DnsModifyRRSet_Ex
DnsRegisterRRSet_Ex
DnsModifyRRSet_Ex
DnsRegisterRRSet_Ex
Dns_UpdateLib
Dns_UpdateLibEx
DnsUpdate
DoQuickUpdateEx
DoMultiMasterUpdate
DoQuickUpdate
Dns_NegotiateTkeyWithServer
Dns_DoSecureUpdate
sub_76F3BB68
Dns_UpdateLib
Dns_UpdateLibEx
DnsUpdate
DoQuickUpdateEx
DoMultiMasterUpdate
As you can see, we get a nice list of calling functions. In an instant I can find what might create transaction IDs in Windows.
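The idea behind such a recursive walk can be sketched as follows. This is not get_recursive_xrefs.py itself, just a minimal stand-in that runs outside IDA: `caller_tree` and the `callers_of` callback are hypothetical names, and the small `CALLERS` table only contains the three call edges visible in the listing earlier in this post. Inside IDA, `callers_of` could presumably be built on `idautils.CodeRefsTo` and `idc.get_func_name`, but that wiring is an assumption and is left out here.

```python
def caller_tree(func, callers_of, depth=0, path=frozenset()):
    """Yield (depth, name) for func and, recursively, for everything that calls it."""
    yield depth, func
    seen = path | {func}
    for caller in callers_of(func):
        if caller in seen:
            continue  # already on this chain: skip to avoid looping on recursion
        yield from caller_tree(caller, callers_of, depth + 1, seen)

# Only the call edges shown in the disassembly above; everything else is omitted.
CALLERS = {
    "Dns_RandGenerateWord": ["Dns_GetRandomXid"],
    "Dns_GetRandomXid": ["Dns_BuildPacket", "Dns_NegotiateTkeyWithServer"],
}

def callers_of(name):
    return CALLERS.get(name, [])

for depth, name in caller_tree("Dns_RandGenerateWord", callers_of):
    print("  " * depth + name)
```

Tracking only the current chain, rather than every function seen so far, means a function reached through two different paths is listed twice, which matches the repeated entries in the sample output.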
I mentioned earlier that functions are certainly not the only thing you can cross reference. When in a function, we can also cross reference a local or global variable in an operand (JumpOpXref). Let's look at the next snippet of assembly.
.text:76F39CCC inc edi
.text:76F39CCD push edi
.text:76F39CCE push offset aMicrosoftStron ; "Microsoft Strong Crypto"...
.text:76F39CD3 push ebx
.text:76F39CD4 lea eax, [ebp+var_4]
.text:76F39CD7 push eax
.text:76F39CD8 call ds:_imp__CryptAcquireContextA
.text:76F39CDE test eax, eax
.text:76F39CE0 jnz short loc_76F39CE5
Putting our cursor on the local name "var_4" at address 76F39CD4 and pressing "X" gives us a familiar dialog box.

We discussed all of the columns already; the only new information is the "w" and "r" types. I alluded to this earlier, but it simply means the instruction either writes to or reads from the target variable. This can be extremely helpful in identifying where a variable is initialized.
Let's move to a different section in the binary. Looking through the .data section of most binaries can be interesting. We obviously see lots of global addresses that are used to hold values. Either of the xref shortcuts (Ctrl-X, X) will get us the references to those locations. But let's look at another common occurrence in the data section. Vtables are accessed via the data section in most cases. The problem is that the only xref shown in the data section will be at the beginning of the vtable. However, if we do an xref on each function in the binary, we can determine which functions are referenced from the data section. For instance, the xrefs to the function MxWireRead look like this.
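As a rough illustration of using the "r" and "w" types to hunt for an initialization, the sketch below splits a list of variable xrefs into reads and writes. The rows are entirely made up (hypothetical addresses and instructions, not taken from the dialog above), and `split_by_access` is a name invented for this example.

```python
def split_by_access(xrefs):
    """Group (address, type, text) rows into reads and writes, each sorted by address."""
    reads = sorted(x for x in xrefs if x[1] == "r")
    writes = sorted(x for x in xrefs if x[1] == "w")
    return reads, writes

# Hypothetical xref rows for a local variable, for illustration only.
ROWS = [
    (0x401030, "w", "mov [ebp+var_4], ecx"),
    (0x401020, "r", "mov eax, [ebp+var_4]"),
    (0x401010, "w", "mov [ebp+var_4], 0"),
]

reads, writes = split_by_access(ROWS)
if writes:
    addr, _, text = writes[0]
    print("candidate initialization at %08X: %s" % (addr, text))
```

The lowest-addressed write is only a candidate: address order is not execution order, so it still needs to be confirmed against the control flow.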

Following those shows us an obvious table of handlers.
.data:0104E300 RRWireReadTable ; DATA XREF: Wire_CreateRecordFromWire+43
.data:0104E300 dd offset CopyWireRead
.data:0104E304 dd offset AWireRead
.data:0104E308 dd offset PtrWireRead
.data:0104E30C dd offset PtrWireRead
.data:0104E310 dd offset PtrWireRead
.data:0104E314 dd offset PtrWireRead
.data:0104E318 dd offset SoaWireRead
.data:0104E31C dd offset PtrWireRead
.data:0104E320 dd offset PtrWireRead
.data:0104E324 dd offset PtrWireRead
.data:0104E328 dd offset CopyWireRead
.data:0104E32C dd offset CopyWireRead
.data:0104E330 dd offset PtrWireRead
.data:0104E334 dd offset CopyWireRead
.data:0104E338 dd offset MinfoWireRead
.data:0104E33C dd offset MxWireRead...
...
This is clearly a tedious process, perfect for automating with a script. So I wrote an IDAPython script called find_data_section_functions.py, which runs through every function's xrefs and reports those that originate from the data section, producing the following example output:
0x0104e164: AFileRead
...
0x0104e234: AWireWrite
0x0104e1d0: AaaaFileRead
0x0104e030: AaaaFlatRead
0x0104df60: AaaaValidate
0x0104e1e8: AtmaFileRead
0x0104e048: AtmaFlatRead
0x010135a8: ControlCallback
...
0x0104e0f4: KeyFlatWrite
0x01013acf: Log_PrintRoutine
0x01006f9c: MIDL_user_allocate
0x01006fa0: MIDL_user_free
...
0x0104edb4: MxFileWrite
0x0104edc0: MxFileWrite
0x0104edcc: MxFileWrite
0x0104dffc: MxFlatRead
0x0104e008: MxFlatRead
0x0104e014: MxFlatRead
...
0x01006ff0: R_DnssrvComplexOperation
0x01007004: R_DnssrvComplexOperation2
0x01006ff4: R_DnssrvEnumRecords
0x01007008: R_DnssrvEnumRecords2
0x01006fe8: R_DnssrvOperation
0x01006ffc: R_DnssrvOperation2
0x01006fec: R_DnssrvQuery
0x01007000: R_DnssrvQuery2
...
0x01017044: freeDpInfo
0x01018b41: freeServerObject
0x0101736d: freeStringArray
0x0101aa4f: pluginAllocator
0x0101aa4a: pluginFree
0x0104ed00: processPrimaryLine
0x0101b67d: recurseConnectCallback
0x01041e90: respondToServiceControlMessage
0x0104eea4: startDnsServer
0x0104d20c: sub_1020330
0x0104d214: sub_1020330
0x0102f8c7: updateForwardConnectCallback
0x0103606d: zoneTransferSendThread
Finally, there are three additional graphs IDA provides for viewing cross references. These graphically display the same cross references we covered, and can even display a down graph showing all the functions your current function may call. They are located under View->Graphs->Xrefs to/Xrefs from/User xrefs chart. While these may be handy, they suffer from two problems. First and foremost, you can't navigate them. That means you can see an interesting function but have to switch back to IDA and manually type in the address to jump there. Secondly, they can be unwieldy, often showing thousands of functions. This can be limited by using the User xrefs chart and restricting the recursion depth, but I would rather just run a script I can interact with. Play around with the graphs; you may find them very helpful.
There are many, many other uses for cross references. I can't possibly cover everything in this little post. I hope this has been a good intro to them, or maybe sparked some ideas of your own. Leave a comment if you have any novel uses or other useful hints. I hope you enjoyed this week's MindshaRE.
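For the scripted route, the data-section scan described earlier boils down to a small filter. The sketch below is not find_data_section_functions.py; it is a stand-alone approximation with hypothetical names (`data_section_refs`, `xrefs_to`, `segment_of`), made-up function addresses, and an assumed address range for .data. The table addresses are the ones from RRWireReadTable above. In IDA itself the three callbacks could presumably be backed by `idautils.Functions`, `idautils.DataRefsTo`, and `idc.get_segm_name`, which is an assumption about how one might wire it up.

```python
def data_section_refs(functions, xrefs_to, segment_of, segment=".data"):
    """Yield (ref_address, name) for each xref to a function that originates in `segment`."""
    for ea, name in functions:
        for ref in xrefs_to(ea):
            if segment_of(ref) == segment:
                yield ref, name

# Hypothetical function start addresses, for illustration only.
FUNCTIONS = [(0x01010000, "CopyWireRead"), (0x01010100, "MxWireRead")]

# References to each function: table slots from RRWireReadTable plus one made-up code caller.
XREFS = {
    0x01010000: [0x0104E300, 0x0104E328, 0x01012345],
    0x01010100: [0x0104E33C],
}

def xrefs_to(ea):
    return XREFS.get(ea, [])

def segment_of(ea):
    # Assumed .data range; a real script would ask IDA for the segment name instead.
    return ".data" if 0x0104D000 <= ea < 0x0104F000 else ".text"

for ref, name in data_section_refs(FUNCTIONS, xrefs_to, segment_of):
    print("0x%08x: %s" % (ref, name))
```

The made-up code caller is filtered out, leaving only the references that come from table slots in the data section.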
-Cody
