In this week's MindshaRE we will take a look at strings. We will cover some of the obvious uses for strings as well as some helpful applications of strings in a binary.
MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
String examination is a frequent starting point for many reverse engineers, whether it is how they begin learning reverse engineering or how they dive into a new application. That is not to say some people don't jump straight into the code, but it's generally a good starting point. I personally examine strings before any other analysis, primarily because I can quickly pick up an idea of the binary's purpose, verbosity, history, and other tidbits. The combination of interesting strings and their cross references to associated library calls allows us to label many functions in a very short period of time.
Here are some interesting examples of strings from a single binary.
004EF594 db '$Workfile: SDIBase.cpp $ Copyright (c) 1998 Selsius Systems Inc.,'
004EF594 db ' all rights reserved.',0
...
004BC8D4 db '<CiscoIPPhoneExecute><ExecuteItem URL="Play:%s"/></CiscoIPPhoneExecute>',0
...
004D34B0 db 'Password',0
...
004E3434 db 'SELECT D.Name,D.tkModel,D.tkClass, N.DNOrPattern, C.Name AS Expr1'
004E3434 db ' FROM Device D INNER JOIN DeviceNumPlanMap M ON M.fkDevice = D.p'
004E3434 db 'kid INNER JOIN NumPlan N ON M.fkNumPlan = N.pkid LEFT OUTER JOIN '
004E3434 db 'RoutePartition C ON N.fkRoutePartition = C.pkid WHERE(N.DNOrPatte'
004E3434 db 'rn = ',27h,0
Just looking around the strings quickly reveals that the binary in question has portions licensed from another firm, does some form of XML parsing, and in some way communicates with a SQL server.
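If you want the same kind of quick look outside of a disassembler, the idea can be sketched in a few lines of Python. This is only a minimal sketch: it assumes plain ASCII strings, a minimum length of four characters, and a made-up byte blob standing in for a real file. The names `extract_strings` and `blob` are mine, not part of any tool.

```python
import re

def extract_strings(data, min_len=4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    # 0x20-0x7e covers the printable ASCII range; wide (UTF-16) strings are
    # deliberately ignored in this sketch.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Hypothetical stand-in for the contents of a binary.
blob = b"\x00\x01Password\x00\xff\xfeSELECT D.Name FROM Device D\x00ab\x00"

for offset, text in extract_strings(blob):
    print("%08X %s" % (offset, text))
```

The short run `ab` is skipped because it falls under the minimum length, which is the usual way to cut down on noise from random bytes that happen to be printable.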
In many cases we can rely on string patterns to partially recover symbolic information. Analyzing a binary with symbols is much easier than parsing through call graphs of anonymous subroutines. Developers will often have a layer for logging, debugging, or tracing a process, in many cases for customer support reasons: if a customer has a problem, the developers or support personnel can quickly look at a call trace and determine the cause. We can use this to our advantage. In our example we can see a group of strings that appear to be class methods.
.data:004EE93C db 'ServiceParamInfo::SetServParamEventForCTI...',0
.data:004EE96C db 'ServiceParamInfo::SetServParamEventForCCM...',0
.data:004EE99C db 'ServiceParamInfo::SetServParamEventForCEF...',0
.data:004EE9CC db 'ServiceParamInfo::SetRISDCEvent...',0
.data:004EE9F0 db 'ServiceParamInfo::ResetRISDCEvent...',0
.data:004EEA18 db 'ServiceParamInfo::SetSNMPEvent...',0
You can view these by going through the "Strings" view in IDA, or by looking through the data sections in most binaries (be aware that strings can be in other segments as well). Now, if we follow the cross references of these strings, we see they are pushed to a function.
004957FE push offset `string' ; "ServiceParamInfo::SetServParamEventForC"...
00495803 lea ecx, [esp+420h+var_418]
00495807 call sub_496600
Following that call, we can see a pretty telltale sign of a logging function. I have cut out some assembly and branches for brevity.
...
0041E3B5 push offset tmpbuf
0041E3BA push offset `string' ; "%s:%s\n"
0041E3BF push ecx
0041E3C0 call ds:_imp__fprintf
So we know that the function sub_496600 handles some sort of trace logging. If we find all the cross references to this function we may be able to build a list of all the function names that are logged when tracing.

As you can see the functions have names. This is because I have gone through each cross reference and renamed the function based on the string contents. You will also notice there are only 12 cross references. The reason is that in this particular binary there are several functions responsible for logging, so you may need to hunt down many different routines and repeat this process. If you are interested in a sample script to automate these actions, take a look at resolve_symbols.py. The script takes any cross reference from the current cursor, walks the pushed arguments for any strings, and applies those as function names to the caller.
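To make the walk-the-pushes idea concrete, here is a rough sketch of the logic on a toy model. It is not the contents of resolve_symbols.py and it does not use the IDA API: the `Insn` record, the instruction list, and the string table are stand-ins for what a disassembler would hand you. The function start address 0x4957C0 is invented for the example, and replacing `::` with `__` is just my assumption about what makes an acceptable symbol name.

```python
import re
from collections import namedtuple

# Toy instruction record: address, mnemonic, operand value (or None), and the
# start address of the containing function. A real script would pull these
# from the disassembler instead.
Insn = namedtuple("Insn", "addr mnem operand func")

_METHOD = re.compile(r"^([A-Za-z_]\w*)::([A-Za-z_]\w*)")

def clean_name(text):
    """Turn 'Class::Method...' into 'Class__Method'; return None for other strings."""
    match = _METHOD.match(text)
    return "__".join(match.groups()) if match else None

def names_from_logger_calls(insns, strings, logger, max_back=5):
    """Map each caller's function start to a name taken from the pushed string."""
    names = {}
    for i, insn in enumerate(insns):
        if insn.mnem != "call" or insn.operand != logger:
            continue
        # Walk backwards over a few instructions, staying inside the caller.
        for prev in reversed(insns[max(0, i - max_back):i]):
            if prev.func != insn.func:
                break
            name = None
            if prev.mnem == "push":
                name = clean_name(strings.get(prev.operand, ""))
            if name:
                names.setdefault(insn.func, name)  # keep the first name found
                break
    return names

# Addresses from the listing above; 0x4957C0 is a hypothetical function start.
demo_insns = [
    Insn(0x4957FE, "push", 0x4EE93C, 0x4957C0),
    Insn(0x495803, "lea", None, 0x4957C0),
    Insn(0x495807, "call", 0x496600, 0x4957C0),
]
demo_strings = {0x4EE93C: "ServiceParamInfo::SetServParamEventForCTI..."}

for func, name in names_from_logger_calls(demo_insns, demo_strings, 0x496600).items():
    print("%08X -> %s" % (func, name))
```

Limiting the backwards walk to a handful of instructions keeps the sketch from picking up a string that belongs to an unrelated earlier call. Real code with branches between the push and the call would need something smarter.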
Another good use of strings is combining them with imported library calls to create a listing of all external functions being called from a binary, along with their arguments. A good example of this is calls to the registry functions. By reconstructing these we can generate the following example output:
RegOpenKeyEx( 0x80000002,
'SYSTEM\CurrentControlSet\Services\Eventlog\Application\Cisco \
Extended Function',
0x0,
0x20019,
*var_10 );
I hope this installment of MindshaRE has given you some ideas on helpful ways to use strings in a binary. There are many uses for them when reverse engineering. Don't be shy when it comes to digging into the strings first; there is nothing wrong with having more knowledge about a binary. If you have any ideas, please leave a comment. I am always interested in the way other people reverse engineer. See you next week.
-Cody
