TippingPoint Digital Vaccine Laboratories

MindshaRE: Live Analysis Markup

I have mentioned before that I am always trying to bridge the gap between static analysis and live analysis. I try to always reverse statically, but let's face it: sometimes, due to time constraints, complexity, or dynamic resolution of functions, we need a little help from our favorite debugger. So today I'll demonstrate a little tool I use to pull the information I need from a debugger while staying focused in IDA. My simple live analysis markup utility might help you in these situations as well.

MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.

Let's set up a scenario.  You are reverse engineering calc.exe so you can patch its addition functionality.  Maybe you want to make 2 + 2 equal 5. When you encounter the following snippet, you feel the need to just take the easy route and use a debugger.
.text:01011605    push    edi
.text:01011606    mov     edi, [eax]
.text:01011608    mov     ecx, [edi+4]
.text:0101160B    mov     eax, [edi+8]
Normally, you would switch gears and open your debugger.  Then you'd have to set a breakpoint at each address you are interested in.  Once a breakpoint hits, you would inspect the register or memory address you need. Not exactly a quick and painless process.

That's why I wrote this simple markup script for IDA.  It works like this.  You add a comment on the address you are interested in.  The comment contains the information you'd like the debugger to report.  It supports three different types of information: Register, Operand, and Memory.  Registers are self-explanatory.  Operand lets the script automatically resolve the type of data at that operand.  For instance, at 0x01011606 the script would understand that a pointer is being dereferenced.  Memory allows you to specify a memory address you want to read.  Here is an example using the snippet from above.
.text:01011605    push    edi             ; **LA R:eax
.text:01011606    mov     edi, [eax]      ; **LA O:1
.text:01011608    mov     ecx, [edi+4]    ; **LA O:1
.text:0101160B    mov     eax, [edi+8]
The comments tell our debugger that we want the contents of eax at 0x01011605, and the second operand at both 0x01011606 and 0x01011608. Operand indexes in the markup start at zero, so O:1 shows up as operand 2 in the output.  To do this, we run a script that writes these requests to a list we can feed to our debugger.  The output of this script follows.
1011605,r,4,EAX
1011606,p,4,2
1011608,o,4,2
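
The markup itself is easy to pull apart. Below is a minimal sketch of that step in Python; it is not the code from gen_la_config.py, and the function names `parse_la_comment` and `register_entries` are hypothetical. It assumes the comment body is a comma-separated list of `KIND:ARG` pairs after the `**LA` marker, as in the examples above. Only register entries are turned into config lines here, because resolving an operand to a pointer or offset needs IDA's view of the instruction. Any other kind (including whatever syntax Memory uses, which the post does not show) is passed through untouched.

```python
LA_MARKER = "**LA"  # marker used in the IDA comments shown above


def parse_la_comment(comment):
    """Return a list of (kind, argument) pairs from a '**LA ...' comment.

    Returns an empty list when the comment carries no markup.
    """
    idx = comment.find(LA_MARKER)
    if idx == -1:
        return []
    body = comment[idx + len(LA_MARKER):]
    directives = []
    for item in body.split(","):
        item = item.strip()
        if not item:
            continue
        kind, sep, arg = item.partition(":")
        if not sep or not arg.strip():
            raise ValueError("malformed directive: %r" % item)
        directives.append((kind.strip().upper(), arg.strip()))
    return directives


def register_entries(address, directives, size=4):
    """Build 'address,r,size,REG' lines for the register directives only.

    The 4-byte default size is an assumption taken from the sample output.
    """
    return ["%x,r,%d,%s" % (address, size, arg.upper())
            for kind, arg in directives if kind == "R"]


if __name__ == "__main__":
    parsed = parse_la_comment("; **LA R:eax,O:1")
    print(parsed)
    print(register_entries(0x01011605, parsed))
```
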
Simple enough.  But what kind of debugger can actually read this in and give us what we want? I hope by now you've checked out PyDbg, a fully scriptable debugger implemented in Python. It is a perfect fit for this little utility. My PyDbg script reads in this list and sets all of our breakpoints. When a breakpoint is hit, it prints the requested info.  Here is our run using the generated list of breakpoints.
C:\Code\Python\live_analysis>live_analysis.py calc.exe la.conf
[*] Trying to attach to existing calc.exe
[*] Attaching to calc.exe (2932)
[*] Setting bp @ 0x01011605
[*] Setting bp @ 0x01011606
[*] Setting bp @ 0x01011608
[*] 0x01011605      EAX [Reg    ] is 0x7f7b8    [4]
[*] 0x01011606        2 [Pointer] is 0xa8038    [4]
[*] 0x01011608        2 [Offset ] is 0x1        [4]

C:\Code\Python\live_analysis>
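
For readers who have not used PyDbg, the debugger side can be sketched roughly as follows. This is not live_analysis.py; it is a minimal illustration under stated assumptions, and `load_config`, `format_hit`, and `run_watches` are hypothetical names. It handles register entries only and skips the pointer and offset entries, which need operand resolution. The PyDbg import is kept inside `run_watches` so the two helper functions can be used without PyDbg installed.

```python
from collections import OrderedDict


def load_config(lines):
    """Group 'address,type,size,value' lines by breakpoint address."""
    watches = OrderedDict()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        addr, kind, size, value = line.split(",")
        watches.setdefault(int(addr, 16), []).append((kind, int(size), value))
    return watches


def format_hit(address, label, kind_name, value, size):
    """Format one logged value in the style of the run shown above."""
    return "[*] 0x%08x %8s [%-7s] is %-10s [%d]" % (
        address, label, kind_name, "0x%x" % value, size)


def run_watches(process_name, watches):
    """Attach to a running process and log register watches (sketch)."""
    # Imported lazily: PyDbg is only needed when actually debugging.
    from pydbg import pydbg
    from pydbg.defines import DBG_CONTINUE

    dbg = pydbg()

    def on_breakpoint(dbg):
        address = dbg.exception_address
        for kind, size, value in watches.get(address, []):
            if kind == "r":  # 'p' and 'o' entries are skipped in this sketch
                print(format_hit(address, value, "Reg",
                                 dbg.get_register(value), size))
        return DBG_CONTINUE

    for pid, name in dbg.enumerate_processes():
        if name.lower() == process_name.lower():
            dbg.attach(pid)
            break
    else:
        raise RuntimeError("process not found: %s" % process_name)

    for address in watches:
        dbg.bp_set(address, handler=on_breakpoint)
    dbg.run()
```

Calling `run_watches("calc.exe", load_config(open("la.conf")))` would be the rough equivalent of the command line above, under the same assumptions.
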
Not bad. Certainly this can help. We can also ask for several values at each address. Check this out.

.text:01011605    push    edi             ; **LA R:eax,R:ebx,R:ecx,O:0
.text:01011606    mov     edi, [eax]      ; **LA O:1,R:eax,R:edi
.text:01011608    mov     ecx, [edi+4]    ; **LA O:1,O:0,R:ECX

1011605,r,4,EAX
1011605,r,4,EBX
1011605,r,4,ECX
1011605,r,4,EDI
1011606,p,4,2
1011606,r,4,EAX
1011606,r,4,EDI
1011608,o,4,2
1011608,r,4,ECX
1011608,r,4,ECX
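
Note that `1011608,r,4,ECX` appears twice in this list. That appears to be because both `O:0` (which resolves to the register ecx) and `R:ECX` were requested at 0x01011608. If the repetition is unwanted, the list could be filtered before it reaches the debugger. A minimal sketch, with the hypothetical helper name `dedupe_config`:

```python
def dedupe_config(lines):
    """Drop repeated config lines while keeping first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        key = line.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    for entry in dedupe_config(["1011608,o,4,2",
                                "1011608,r,4,ECX",
                                "1011608,r,4,ECX"]):
        print(entry)
```
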

C:\Code\Python\live_analysis>live_analysis.py calc.exe la.conf
[*] Trying to attach to existing calc.exe
[*] Attaching to calc.exe (2188)
[*] Setting bp @ 0x01011605
[*] Setting bp @ 0x01011606
[*] Setting bp @ 0x01011608
[*] 0x01011605      EAX [Reg    ] is 0x7f7b8    [4]
[*] 0x01011605      EBX [Reg    ] is 0xa8038    [4]
[*] 0x01011605      ECX [Reg    ] is 0x7c8099fd [4]
[*] 0x01011605      EDI [Reg    ] is 0x0        [4]
[*] 0x01011606        2 [Pointer] is 0xa8038    [4]
[*] 0x01011606      EAX [Reg    ] is 0x7f7b8    [4]
[*] 0x01011606      EDI [Reg    ] is 0x0        [4]
[*] 0x01011608        2 [Offset ] is 0x1        [4]
[*] 0x01011608      ECX [Reg    ] is 0x7c8099fd [4]
[*] 0x01011608      ECX [Reg    ] is 0x7c8099fd [4]
[*] 0x01011605      EAX [Reg    ] is 0x7f7b8    [4]
[*] 0x01011605      EBX [Reg    ] is 0xb4410    [4]
[*] 0x01011605      ECX [Reg    ] is 0x7c8099fd [4]
[*] 0x01011605      EDI [Reg    ] is 0x0        [4]
[*] 0x01011606        2 [Pointer] is 0xb4410    [4]
[*] 0x01011606      EAX [Reg    ] is 0x7f7b8    [4]
[*] 0x01011606      EDI [Reg    ] is 0x0        [4]
[*] 0x01011608        2 [Offset ] is 0x1        [4]
[*] 0x01011608      ECX [Reg    ] is 0x7c8099fd [4]
[*] 0x01011608      ECX [Reg    ] is 0x7c8099fd [4]

C:\Code\Python\live_analysis>
With this we can easily pull out interesting information and stay centered in IDA. In the future I will call PyDbg from within IDA, making it even simpler. There is also a method for exporting this data to an IDC script for loading back into IDA, but it is not on by default, because breakpoints can get hit multiple times and you may not want the results to get convoluted.
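
One way to keep repeated hits from cluttering such an export is to collapse them to the distinct values seen at each address before writing anything out. The sketch below illustrates that idea only; it is not the export method bundled with the scripts. The names `collect_hits` and `to_idc` are hypothetical, and the generated script assumes the classic IDC function `MakeComm(ea, comment)`, which replaces any existing comment at that address (including the `**LA` markup).

```python
from collections import OrderedDict


def collect_hits(hits):
    """Collapse (address, label, value) hits to unique values per label."""
    seen = OrderedDict()
    for address, label, value in hits:
        values = seen.setdefault(address, OrderedDict()).setdefault(label, [])
        if value not in values:
            values.append(value)
    return seen


def to_idc(seen):
    """Render the collected values as an IDC script that sets comments."""
    lines = ["#include <idc.idc>", "static main() {"]
    for address, labels in seen.items():
        parts = ["%s=%s" % (label, "|".join("0x%x" % v for v in values))
                 for label, values in labels.items()]
        lines.append('    MakeComm(0x%08x, "LA: %s");'
                     % (address, ", ".join(parts)))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Values taken from the two passes in the run above.
    hits = [(0x01011605, "EAX", 0x7f7b8), (0x01011605, "EBX", 0xa8038),
            (0x01011605, "EAX", 0x7f7b8), (0x01011605, "EBX", 0xb4410)]
    print(to_idc(collect_hits(hits)))
```

With the values from the run above, EBX at 0x01011605 would be recorded once with both observed values, rather than once per hit.
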

I always want to keep my concentration on IDA. For me it's always difficult to "switch gears" and go into debugger mode. I use this script for quick access to information without having to lose track of what I am currently doing.

I hope this can be of some use to you when reverse engineering. I would love to hear how you personally bridge the static/live analysis gap. I know you can achieve some of this in IDA's debugger; if you do it that way, hook us up with some scripts or info.  Like I have previously stated, one day I'll get used to IDA's debugger.

The two scripts in this post have been bundled into live_analysis.zip.
  • gen_la_config.py - Generates the configuration from your comments in IDA.
  • live_analysis.py - PyDbg script that sets breakpoints and logs hits.
Enjoy!

-Cody
Published On: 2008-09-18 13:57:20

Comments

  1. Dima commented on 2008-09-18 @ 23:09

    To avoid switching to the debugger each time I see an indirect branch in IDA listing I usually use a utility based on the binary instrumentation tool pin. The utility records all the addresses where the indirect calls occur and corresponding callee while the program is running. Then the gathered information is exported to IDA.

  2. Anonymous commented on 2008-09-19 @ 00:58

    good one. !! instead of commenting and then executing py script after that ..i prefer olly !! ;)

  3. Cody commented on 2008-09-19 @ 09:06

    @Dima: That is very similar to the script in this posting. One of the reasons I prefer this script over instrumentation and process stalking is the control you have over the information being logged.

    @Anonymous: Hah. Most people do it that way. That's why I wanted to shake things up and show an alternative to the Olly/WinDbg route :).

