TippingPoint Digital Vaccine Laboratories

MindshaRE: Using Symbols

I've mentioned in a previous posting that cross references are the crux of reverse engineering. Exploring the connections between blocks of code and from function to function will reveal large quantities of information about your target. Those cross references, however, are useless without symbolic information, which can include names generated by the reverse engineer as well as names applied through a symbol file. Symbol files are easy to use, yet I still see people who are unaware of them or do not know how to use them properly. So in today's MindshaRE we will cover the topic of symbols. To keep things short, we are only going to deal with Microsoft's PDB symbol files for PE executables.

MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries here by going through our blog history.

Symbols are a mapping of names to addresses in a compiled binary.  Developers use them to ease debugging of their applications.  Obviously if we have access to such information it will also ease the reverse engineering process.  Luckily Microsoft provides a clean way to handle this information.
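To make the "names to addresses" idea concrete, here is a minimal Python sketch of what a debugger does when it prints an address as a name plus an offset. The three entries are borrowed from the calc.exe listing later in this post; the resolve helper and its output format are my own illustration, not how dbghelp.dll is implemented.

```python
import bisect

# A few address/name pairs borrowed from the calc.exe listing later in this post.
SYMBOLS = {
    0x01008356: "putnum",
    0x01008642: "putrat",
    0x010086C8: "ratpowlong",
}

def resolve(address, symbols=SYMBOLS):
    """Return 'name' or 'name+0xoffset' for the closest symbol at or below address."""
    starts = sorted(symbols)
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return hex(address)  # below the first known symbol, nothing to name it after
    base = starts[index]
    offset = address - base
    name = symbols[base]
    return name if offset == 0 else f"{name}+{offset:#x}"

print(resolve(0x01008356))  # putnum
print(resolve(0x01008366))  # putnum+0x10
```

Without the table, both of those addresses are just numbers. With it, a cross reference into the middle of a function immediately tells you where you are.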

When a developer wants this mapping they produce a PDB file (.pdb extension). This file is then loaded by a debugger like WinDbg using the Microsoft-provided library dbghelp.dll. Dbghelp.dll exports functions that parse the file and map its contents to the target binary. One great thing about Microsoft symbols is that they can be found not only on disk but also via a symbol server. The symbol server is responsible for handling requests for the debugging symbols of a particular binary. Let's say we want the symbols for calc.exe but don't have the .pdb on disk. We can use a debugger like WinDbg to query the Microsoft symbol server for the correct pdb, which then gets downloaded and loaded into our application of choice. What is invaluable about this to us, as reverse engineers, is that we can do the same thing in IDA.
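For the curious, a symbol server request is little more than an HTTP fetch of a predictable path. The Python sketch below builds that path under my understanding of how symbol stores are keyed: the pdb name, then the PDB's GUID (32 uppercase hex digits) followed by its age in hex, then the pdb name again. None of that layout comes from this post, and the GUID and age used here are made-up placeholders; the real values live in the binary's debug directory.

```python
import uuid

def symbol_server_url(server, pdb_name, guid, age):
    """Build the download URL for a pdb, assuming the usual symbol store layout.

    guid is the PDB GUID in standard textual form, age is an integer.
    Both are recorded in the PE's debug directory (not parsed here).
    """
    signature = f"{uuid.UUID(guid).hex.upper()}{age:X}"
    return f"{server.rstrip('/')}/{pdb_name}/{signature}/{pdb_name}"

# Placeholder GUID and age, for illustration only.
print(symbol_server_url(
    "http://msdl.microsoft.com/download/symbols",
    "calc.pdb",
    "3b7d8a4c-1e2f-4a5b-9c6d-0123456789ab",
    2,
))
```

The point of the GUID and age is that the server can hand back the pdb for exactly the build you have, not just any file called calc.pdb.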

IDA has built-in functionality to load symbols. This is provided as a plugin named pdb.plw that runs automatically when a new binary is loaded. When run, the plugin will first check for the pdb in the same directory as the binary. If it cannot find the symbols there, it will query the symbol server you have configured in either the _NT_SYMBOL_PATH or _NT_ALTERNATE_SYMBOL_PATH system environment variable. Once the symbol file is located, it is mapped into your idb via dbghelp.dll. This updates all the function and variable names to their loaded symbol names. For instance, take a look at a snippet of functions in calc.exe before we have loaded symbols.

...
sub_1007C26        .text 01007C26 00000026 R . . . . . . 
sub_1007C4C        .text 01007C4C 00000042 R . . . . . . 
sub_1007C8E        .text 01007C8E 00000045 R . . . . . . 
sub_1007CD3        .text 01007CD3 0000004F R . . . . . . 
sub_1007D22        .text 01007D22 0000009C R . . . B . . 
sub_1007DBE        .text 01007DBE 000000BE R . . . B . . 
sub_1007E7C        .text 01007E7C 000000DB R . . . B . . 
sub_1007F57        .text 01007F57 00000275 R . . . . T . 
sub_10081CC        .text 010081CC 0000002C R . . . . . . 
sub_10081F8        .text 010081F8 0000015E R . . . B . . 
sub_1008356        .text 01008356 000002EC R . . . B . . 
sub_1008642        .text 01008642 00000086 R . . . B . . 
sub_10086C8        .text 010086C8 0000008C R . . . B . . 
sub_1008754        .text 01008754 0000008D R . . . . . . 
sub_10087E1        .text 010087E1 00000273 R . . . B T . 
sub_1008A54        .text 01008A54 00000138 R . . . B . . 
sub_1008B8C        .text 01008B8C 000000D0 R . . . B . . 
sub_1008C5C        .text 01008C5C 000000E8 R . . . B . . 
sub_1008D44        .text 01008D44 000000E8 R . . . B . . 
sub_1008E2C        .text 01008E2C 000000EB R . . . B . . 
sub_1008F17        .text 01008F17 000000EB R . . . B . . 
sub_1009002        .text 01009002 000000D5 R . . . B . . 
sub_10090D7        .text 010090D7 00000115 R . . . B . . 
sub_10091EC        .text 010091EC 0000029D R . . . B . . 
sub_1009489        .text 01009489 0000015B R . . . B . . 
sub_10095E4        .text 010095E4 00001525 R . . . . . . 
sub_100AB09        .text 0100AB09 000000D4 R . . . . . . 
sub_100ABDD        .text 0100ABDD 00000FAB R . . . . . . 
sub_100BB88        .text 0100BB88 0000004C R . . . . . . 
sub_100BBD4        .text 0100BBD4 00000080 R . . . . . . 
sub_100BC54        .text 0100BC54 0000010D R . . . B . . 
sub_100BD61        .text 0100BD61 0000000E R . . . . . . 
sub_100BD6F        .text 0100BD6F 000000BA R . . . B . . 
...
Not very welcoming, is it? Take a look after we apply the pdb symbols to the idb in IDA.
...
_createrat         .text 01007C26 00000026 R . . . . . . 
longtonum          .text 01007C4C 00000042 R . . . . . . 
numtolong          .text 01007C8E 00000045 R . . . . . . 
stripzeroesnum     .text 01007CD3 0000004F R . . . . . . 
numpowlong         .text 01007D22 0000009C R . . . B . . 
nRadixxtonum       .text 01007DBE 000000BE R . . . B . . 
numtonRadixx       .text 01007E7C 000000DB R . . . B . . 
innum              .text 01007F57 00000275 R . . . . T . 
longtorat          .text 010081CC 0000002C R . . . . . . 
rattolong          .text 010081F8 0000015E R . . . B . . 
putnum             .text 01008356 000002EC R . . . B . . 
putrat             .text 01008642 00000086 R . . . B . . 
ratpowlong         .text 010086C8 0000008C R . . . B . . 
numtorat           .text 01008754 0000008D R . . . . . . 
inrat              .text 010087E1 00000273 R . . . B T . 
intrat             .text 01008A54 00000138 R . . . B . . 
rat_equ            .text 01008B8C 000000D0 R . . . B . . 
rat_ge             .text 01008C5C 000000E8 R . . . B . . 
rat_gt             .text 01008D44 000000E8 R . . . B . . 
rat_le             .text 01008E2C 000000EB R . . . B . . 
rat_lt             .text 01008F17 000000EB R . . . B . . 
rat_neq            .text 01009002 000000D5 R . . . B . . 
scale              .text 010090D7 00000115 R . . . B . . 
scale2pi           .text 010091EC 0000029D R . . . B . . 
inbetween          .text 01009489 0000015B R . . . B . . 
_readconstants     .text 010095E4 00001525 R . . . . . . 
trimit             .text 0100AB09 000000D4 R . . . . . . 
ChangeConstants    .text 0100ABDD 00000FAB R . . . . . . 
fracrat            .text 0100BB88 0000004C R . . . . . . 
mulrat             .text 0100BBD4 00000080 R . . . . . . 
addrat             .text 0100BC54 0000010D R . . . B . . 
zerrat             .text 0100BD61 0000000E R . . . . . . 
divrat             .text 0100BD6F 000000BA R . . . B . . 
...
Much better, right? This is the power of symbols. It's such an easy thing to accomplish with IDA.
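If you ever want to see exactly what a symbol load changed, the two listings can be paired up by start address. Here is a small Python sketch that does so for the first three rows of each listing above. The column layout it assumes (name, segment, start, length, flags) is simply what the pasted text looks like, and parse_listing and renames are hypothetical helper names of mine.

```python
# First three rows of each listing from this post.
BEFORE = """\
sub_1007C26        .text 01007C26 00000026 R . . . . . .
sub_1007C4C        .text 01007C4C 00000042 R . . . . . .
sub_1007C8E        .text 01007C8E 00000045 R . . . . . .
"""
AFTER = """\
_createrat         .text 01007C26 00000026 R . . . . . .
longtonum          .text 01007C4C 00000042 R . . . . . .
numtolong          .text 01007C8E 00000045 R . . . . . .
"""

def parse_listing(text):
    """Map start address -> function name for each row of a pasted listing."""
    functions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank lines and "..." separators
        name, start = fields[0], fields[2]
        functions[int(start, 16)] = name
    return functions

def renames(before, after):
    """Return {old_name: new_name} for addresses whose name changed."""
    old, new = parse_listing(before), parse_listing(after)
    return {old[a]: new[a] for a in sorted(old) if a in new and old[a] != new[a]}

for old_name, new_name in renames(BEFORE, AFTER).items():
    print(f"{old_name} -> {new_name}")
```

Keying on the address rather than the row order means the comparison still works if one listing has extra or missing functions.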

The main problem people have with symbols is setting up their symbol server properly. In some cases you can find enterprise applications with the symbol files on disk, and those are easy to copy over and load with IDA. However, with things such as Microsoft binaries, using a symbol server is essential.

There really is no black magic to setting things up.  All you need is to set the two environment variables mentioned above.  The only tricky part is the path to your symbols.  In most cases this would be the following.
SRV*c:\windows\symbols*http://msdl.microsoft.com/download/symbols
This instructs the application loading the symbols to use the symbol store you specify. The "*" character separates the entries in our search path. After looking in the same directory as your binary, the loader first checks "C:\windows\symbols" for a pdb file sharing the same name as the binary needing it (i.e. calc.pdb). If nothing can be found, it moves on down the list. In this case we have also specified a URL to the Microsoft symbol server, which then gets queried and, if the symbols are found, the proper pdb for the application is downloaded. It's as simple as that.
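To see what a loader gets out of that string, here is a rough Python sketch that splits a symbol path into an ordered list of places to look, and reads the two environment variables mentioned earlier. It only understands plain directories and the SRV* form shown above, with ";" assumed as the separator between multiple entries; parse_symbol_path and configured_locations are names I made up, and the real parsing inside dbghelp.dll handles more cases than this.

```python
import os

def parse_symbol_path(value):
    """Split a symbol path into a flat, ordered list of locations to search."""
    locations = []
    for element in value.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        if parts[0].lower() == "srv":
            # SRV*local_store*upstream_server: each piece is searched in order.
            locations.extend(p for p in parts[1:] if p)
        else:
            locations.append(element)  # a plain directory
    return locations

def configured_locations(environ=os.environ):
    """Collect locations from both symbol path environment variables."""
    found = []
    for variable in ("_NT_SYMBOL_PATH", "_NT_ALTERNATE_SYMBOL_PATH"):
        found.extend(parse_symbol_path(environ.get(variable, "")))
    return found

path = r"SRV*c:\windows\symbols*http://msdl.microsoft.com/download/symbols"
print(parse_symbol_path(path))
print(configured_locations())  # empty list if neither variable is set
```

If configured_locations() comes back empty on your machine, that is a good hint as to why IDA is not finding anything.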

But it can be even simpler. I personally use WinDbg to set up my symbol store, because I can easily set the variables and check that they work. Load up WinDbg and go to File->Symbol Search Path (Ctrl-S). Type in your symbol store path as shown above. If you are already attached to a process (maybe calc.exe?), check the "reload" box. Clicking OK will save your information in the environment variables. Back in the main WinDbg window I type !peb because it's the shortest way to verify. If you don't get a big warning about not having symbols, you are good to go. Here is a good step-by-step covering this from the awesome guys at Uninformed.

We have covered how IDA loads symbols. But sometimes IDA's own loader isn't the best method. IDA versions below 5.3 will not load all of the symbol information stored in a pdb. Luckily though, we have two great plugins that do a better job of loading all the contents of a pdb.

PDB Plus was one of the first plugins that did a better job of loading pdb files. It has a small GUI that allows you to specify which types of symbols you want to load, from local variables to global structures. I find that it works very well and does a better job than the default pdb.plw IDA comes with.

Determina PDB Plugin from the venerable Alexander Sotirov is a relatively new plugin that again does a more complete job of loading.  One of the cool features is it allows you to see all of the symbol information being applied to your binary before it happens.

To use these plugins, simply make a backup copy of the old pdb.plw in your IDA install's "plugins" directory, then copy the new pdb.plw there. The new plugin will then get called when IDA loads a new binary, and any additional information from the plugin will be displayed.
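If you swap plugins often, the backup-and-copy step is easy to script. The Python sketch below is one way to do it; swap_pdb_plugin and the ".orig" backup suffix are my own choices, and the demo runs against a throwaway temporary directory rather than a real IDA install, so point it at your own paths at your own risk.

```python
import shutil
from pathlib import Path

def swap_pdb_plugin(plugins_dir, new_plugin, backup_suffix=".orig"):
    """Back up the stock pdb.plw (once), then copy the replacement over it."""
    plugins_dir = Path(plugins_dir)
    current = plugins_dir / "pdb.plw"
    backup = plugins_dir / ("pdb.plw" + backup_suffix)
    # Never overwrite an existing backup, so the stock plugin is always recoverable.
    if current.exists() and not backup.exists():
        shutil.copy2(current, backup)
    shutil.copy2(new_plugin, current)
    return backup

if __name__ == "__main__":
    import tempfile
    # Demo in a scratch directory standing in for an IDA plugins folder.
    with tempfile.TemporaryDirectory() as scratch:
        plugins = Path(scratch) / "plugins"
        plugins.mkdir()
        (plugins / "pdb.plw").write_bytes(b"stock plugin")
        replacement = Path(scratch) / "new_pdb.plw"
        replacement.write_bytes(b"replacement plugin")
        print(swap_pdb_plugin(plugins, replacement).name)  # pdb.plw.orig
```

The backup name deliberately does not end in .plw, so it sits next to the active plugin without looking like a second plugin.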

Symbols are invaluable. Unfortunately, in most situations you don't get to use them, so when you do, please take advantage. I know this is basic information, but setting up your environment before you even dive into your first RE endeavor can save you lots of time and headaches in the future. Hope you enjoyed it.

-Cody
Tags: MindshaRE
Published On: 2008-07-31 13:12:05
