MindshaRE is our weekly look at some simple reverse engineering tips and tricks. The goal is to keep things small and discuss everyday aspects of reversing. You can view previous entries by going through our blog history.
Let's set up a scenario. You are reverse engineering calc.exe so you can patch its addition functionality. Maybe you want to make 2 + 2 equal 5. When you encounter the following snippet, you feel the need to just take the easy route and use a debugger.
.text:01011605 push edi
.text:01011606 mov edi, [eax]
.text:01011608 mov ecx, [edi+4]
.text:0101160B mov eax, [edi+8]
Normally, you would switch gears and open your debugger. Then you'd have to set a breakpoint at each address you are interested in. Once a breakpoint hits, you would inspect the register or memory address you need. Not exactly a quick and painless process.
That's why I wrote this simple markup script for IDA. It works like this: you add a comment at the address you are interested in, and the comment contains the information you'd like the debugger to report. It supports three types of information: Register, Operand, and Memory. Registers are self-explanatory. Operand automatically resolves the type of data at that operand. For instance, at 0x01011606 the script would understand that a pointer is being dereferenced. Memory allows you to specify a memory address you want to read. Here is an example using the snippet from above.
.text:01011605 push edi ; **LA R:eax
.text:01011606 mov edi, [eax] ; **LA O:1
.text:01011608 mov ecx, [edi+4] ; **LA O:1
.text:0101160B mov eax, [edi+8]
The comments tell our debugger that we want the contents of eax at 0x01011605, and the second operands at 0x01011606 and 0x01011608 (operand indexes in the comment are zero-based, so O:1 names the second operand). To do this we run a script that writes these requests out as a list we can feed to our debugger. The output of this script follows.
1011605,r,4,EAX
1011606,p,4,2
1011608,o,4,2
Simple enough. But what kind of debugger can actually read this in and give us what we want? I hope by now you've checked out PyDbg, a fully scriptable debugger implemented in Python. It is a perfect fit for this little utility. My PyDbg script reads in this list and sets all of our breakpoints. When a breakpoint is hit, it prints the requested info. Here is a run using the generated list of breakpoints.
C:\Code\Python\live_analysis>live_analysis.py calc.exe la.conf
[*] Trying to attach to existing calc.exe
[*] Attaching to calc.exe (2932)
[*] Setting bp @ 0x01011605
[*] Setting bp @ 0x01011606
[*] Setting bp @ 0x01011608
[*] 0x01011605 EAX [Reg ] is 0x7f7b8 [4]
[*] 0x01011606 2 [Pointer] is 0xa8038 [4]
[*] 0x01011608 2 [Offset ] is 0x1 [4]
Not bad. Certainly this can help. We can also get more information by listing several requests in one comment. Check this out: the annotated listing, the generated configuration, and the resulting run.
.text:01011605 push edi ; **LA R:eax,R:ebx,R:ecx,O:0
.text:01011606 mov edi, [eax] ; **LA O:1,R:eax,R:edi
.text:01011608 mov ecx, [edi+4] ; **LA O:1,O:0,R:ECX
1011605,r,4,EAX
1011605,r,4,EBX
1011605,r,4,ECX
1011605,r,4,EDI
1011606,p,4,2
1011606,r,4,EAX
1011606,r,4,EDI
1011608,o,4,2
1011608,r,4,ECX
1011608,r,4,ECX
C:\Code\Python\live_analysis>live_analysis.py calc.exe la.conf
[*] Trying to attach to existing calc.exe
[*] Attaching to calc.exe (2188)
[*] Setting bp @ 0x01011605
[*] Setting bp @ 0x01011606
[*] Setting bp @ 0x01011608
[*] 0x01011605 EAX [Reg ] is 0x7f7b8 [4]
[*] 0x01011605 EBX [Reg ] is 0xa8038 [4]
[*] 0x01011605 ECX [Reg ] is 0x7c8099fd [4]
[*] 0x01011605 EDI [Reg ] is 0x0 [4]
[*] 0x01011606 2 [Pointer] is 0xa8038 [4]
[*] 0x01011606 EAX [Reg ] is 0x7f7b8 [4]
[*] 0x01011606 EDI [Reg ] is 0x0 [4]
[*] 0x01011608 2 [Offset ] is 0x1 [4]
[*] 0x01011608 ECX [Reg ] is 0x7c8099fd [4]
[*] 0x01011608 ECX [Reg ] is 0x7c8099fd [4]
[*] 0x01011605 EAX [Reg ] is 0x7f7b8 [4]
[*] 0x01011605 EBX [Reg ] is 0xb4410 [4]
[*] 0x01011605 ECX [Reg ] is 0x7c8099fd [4]
[*] 0x01011605 EDI [Reg ] is 0x0 [4]
[*] 0x01011606 2 [Pointer] is 0xb4410 [4]
[*] 0x01011606 EAX [Reg ] is 0x7f7b8 [4]
[*] 0x01011606 EDI [Reg ] is 0x0 [4]
[*] 0x01011608 2 [Offset ] is 0x1 [4]
[*] 0x01011608 ECX [Reg ] is 0x7c8099fd [4]
[*] 0x01011608 ECX [Reg ] is 0x7c8099fd [4]
With this we can easily pull out interesting information and stay centered in IDA. In the future I plan to call PyDbg from within IDA, making it even simpler. There is also a method for exporting this data to an IDC script for loading back into IDA, but it is not on by default. This is because breakpoints can get hit multiple times, and you may not want the results to get convoluted.
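For readers who want to consume the generated list in their own tooling, here is a minimal sketch of reading it. This is not the code from live_analysis.py. It assumes each line has the form address,type,size,target as seen in the output above, that the address is hexadecimal without a prefix, and that the size is a decimal byte count. The names Request and parse_config are made up for this illustration.

```python
from collections import OrderedDict, namedtuple

# One requested value: where to break, what kind of value to report,
# how many bytes, and which register or operand it refers to.
# (Hypothetical record type, not taken from the original scripts.)
Request = namedtuple("Request", "address kind size target")


def parse_config(text):
    """Group config lines (address,type,size,target) by breakpoint address."""
    by_address = OrderedDict()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        addr, kind, size, target = [field.strip() for field in line.split(",")]
        # Assumption: addresses are hex without a 0x prefix, sizes are decimal.
        req = Request(int(addr, 16), kind, int(size), target)
        by_address.setdefault(req.address, []).append(req)
    return by_address


sample = """\
1011605,r,4,EAX
1011606,p,4,2
1011606,r,4,EAX
1011608,o,4,2
"""

for address, requests in parse_config(sample).items():
    print("bp @ 0x%08x reports %d value(s)" % (address, len(requests)))
```

Grouping by address mirrors the run above, where each address gets a single breakpoint even when several values are requested there.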
I always want to keep my concentration on IDA. For me it's always difficult to "switch gears" and go into debugger mode. I use this script for quick access to information without having to lose track of what I am currently doing.
I hope this can be of some use to you when reverse engineering. I would love to hear how you personally bridge the static/live analysis gap. I know you can achieve some of this in IDA's debugger; if you do this, hook us up with some scripts or info. Like I have previously stated, one day I'll get used to IDA's debugger.
The two scripts in this post have been bundled into live_analysis.zip.
- gen_la_config.py - Generates the configuration from your comments in IDA.
- live_analysis.py - PyDbg script that sets breakpoints and logs hits.
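To give a feel for the comment markup without opening the bundle, here is a rough sketch of how a single **LA comment could be split into requests. It is not the code from gen_la_config.py, and it does none of the operand type resolution that the real script performs inside IDA. The function name parse_la_comment is made up, and the M:<hex address> form for Memory requests is an assumption, since the post shows no Memory example.

```python
# Marker that introduces the markup inside an IDA comment,
# as it appears in the listings in this post.
MARKER = "**LA"


def parse_la_comment(comment):
    """Return a list of (kind, value) pairs, e.g. [("R", "EAX"), ("O", 1)]."""
    idx = comment.find(MARKER)
    if idx < 0:
        return []  # ordinary comment, nothing requested
    requests = []
    for item in comment[idx + len(MARKER):].split(","):
        item = item.strip()
        if not item:
            continue
        kind, _, value = item.partition(":")
        kind = kind.strip().upper()
        value = value.strip()
        if kind == "R":
            # Register names are normalized to upper case, as in the output list.
            requests.append(("R", value.upper()))
        elif kind == "O":
            # Operand index as written in the comment.
            requests.append(("O", int(value)))
        elif kind == "M":
            # Assumed syntax: M:<hex address>. Not shown in the post.
            requests.append(("M", int(value, 16)))
        else:
            raise ValueError("unknown request type: %r" % item)
    return requests


print(parse_la_comment("**LA O:1,R:eax,R:edi"))
```

Turning an O request into a pointer or offset entry still requires looking at the instruction's operand in IDA, which this sketch leaves out.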
-Cody
