Update: Peter was kind enough to whip up some legit web 2.0-ish graphing with some IDAPython to visualize the read() function referenced in this blog post. Check it out here (it's draggable, and stuff).
Quite often at the ZDI we receive submissions that go something like this:
"When this fuzzed file is loaded into process X it causes an access violation. Here is the assembly at the point of the crash and a call stack:
0:030> g
(530.758): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=020108d8 ebx=015d0178 ecx=02012bf0 edx=41414141 esi=020108d0 edi=015d0000
eip=7c82a99f esp=015afd20 ebp=015afdf0 iopl=0 nv up ei ng nz na pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010287
ntdll!RtlFreeHeap+0x4bd:
7c82a99f 8902 mov dword ptr [edx],eax ds:0023:41414141=????????
0:030> kv
ChildEBP RetAddr Args to Child
015afdf0 78134c39 015d0000 00000000 020108d8 ntdll!RtlFreeHeap+0x4bd
015afe3c 0042f18c 020108d8 00000000 00000000 MSVCR90!free+0xcd
015aff58 009737e6 01c663c0 01bb2380 020108d8 x+0x2f18c
015aff78 781329bb 01c663c0 8113a173 00000000 x+0x846fa
015affb0 78132a47 00000000 77e64829 015d4c88 MSVCR90!_callthreadstartex+0x1b
015affb8 77e64829 015d4c88 00000000 00000000 MSVCR90!_threadstartex+0x66
015affec 00000000 781329e1 015d4c88 00000000 kernel32!BaseThreadStart+0x34
Please make an offer on this information, !exploitable says its exploitable. I do not have the original non-fuzzed file anymore due to disk space problems."
Now, this is obviously heap corruption as the backtrace shows us that a heap chunk's metadata was probably corrupted due to some prior operation and is being used in a free or coalesce. What we are seeing are the effects of the corruption... and unfortunately this doesn't give us much information that will help us locate the root cause of the bug. Trying to debug from this point is what I usually refer to as "bottom-up" reversing and it's usually a bit trickier than reversing "top-down"--that is, reversing from the point of user input to the problematic code.
The process of reversing from user input can be a tedious one and it can be sped up by making use of hardware breakpoints. The idea is that if we find where our user data is first read into the process, we can set a memory breakpoint on it and find the first operation that acts upon it. Now, after you've done this on hundreds of bugs it might occur to you that this can be automated. Indeed, we can automate a lot of this using a debugger; however, we'd have to use software breakpoints (int 3), and each one costs a context switch into the debugger, which can be inefficient.
Summary
In this blog post I will walk through an alternate way to perform this debugging process that can be abstracted to solve a bunch of different problems. Additionally, the example code we'll use here requires only Python and its built-in ctypes module.
First off, let's start with an overview of the technique we're going to code the functionality for. The following example will be automating the discovery of the point at which a program reads in user data using MSVCR90.dll!read. This just so happens to be the way that Adobe Shockwave reads in DIR files via Internet Explorer, so we'll use that as an example.
So, first off... we need to disassemble MSVCR90.dll and check out its read() function. Here's a screenshot of it:

Now, we'd like to hook this function at two points. One to catch the arguments that were passed to the function, and the other to search the data that was read in for the value we want to find.
Hook Point #1
When we hook, we are going to need to clobber 5 bytes worth of instructions. This is because what we'll effectively be doing is patching in a jump (5 opcode bytes) that points to our hook code that we want to run.
Here is an ideal location to hook:
.text:78586EB5 8B C8 mov ecx, eax
.text:78586EB7 C1 F9 05 sar ecx, 5
.text:78586EBA 8D 1C 8D A0 D6 5B 78 lea ebx, ___pioinfo[ecx*4]
We choose this location for several reasons. Firstly, it's near the function prologue and the arguments that were pushed to read should be easily retrieved from the stack. Secondly, within 5 bytes from 0x78586EB5 there are no jump instructions. If the code we lift out contained a jump, it would be a relative branch whose displacement is only valid at its original address. Relocating it would mean re-encoding that branch (promoting it to a full jmp) in order for it to still reach its target, and that's out of scope for this post.
Hook Point #2
The second ideal location to hook is the following:
.text:78586F35 C7 45 FC FE FF FF FF mov [ebp+ms_exc.disabled], 0FFFFFFFEh
This location is ideal as it is near the end of the function and thus the user data would have been read into the destination buffer that was pushed to read() by this point. So, what we'll end up doing is patching in another jump here that will jump to our code that will search that buffer for the value we care about.
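To make the "5 bytes" rule concrete, here is a minimal sketch (the helper name `bytes_to_lift` is my own, not something from the original script) that works out how many bytes have to be lifted so that the patch never splits an instruction in half. The instruction lengths are simply read off the two listings above.

```python
HOOK_JUMP_SIZE = 5  # one 0xE9 opcode byte plus a 4-byte displacement

def bytes_to_lift(instruction_lengths, needed=HOOK_JUMP_SIZE):
    """Return the smallest whole-instruction byte count that covers `needed`."""
    total = 0
    for length in instruction_lengths:
        total += length
        if total >= needed:
            return total
    raise ValueError("not enough instructions to cover the patch")

# Instruction lengths read off the two disassembly listings above.
hook1_lengths = [2, 3, 7]   # mov ecx,eax / sar ecx,5 / lea ebx,___pioinfo[ecx*4]
hook2_lengths = [7]         # mov [ebp+ms_exc.disabled],0FFFFFFFEh

minimum_hook1 = bytes_to_lift(hook1_lengths)
minimum_hook2 = bytes_to_lift(hook2_lengths)
```

For the first hook the minimum works out to 5 bytes, although this post lifts all three instructions (12 bytes), which works just as well. The second hook has to take the whole 7-byte mov.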
Side-note: We don't actually need 2 hook points to accomplish this task, but I'm showing how to do 2 so the ideas can be abstracted for other purposes...
Overview of Injection Plan
Here's the plan... we are going to patch those code locations with jumps. But, jumps to what?
Basically, we're going to set up an "arena" in the heap in which to stuff our custom code to run as well as the code we "lifted" from MSVCR90!read (we want our code to run, then we want to run the original code, and then return execution back to MSVCR90). So, here's what the memory arena will look like:
Offset  Contains
0x00    our code to grab read()'s arguments off the stack followed by a jump to offset 0x60
0x20    our code to search the destination buffer for our value followed by a jump to offset 0x80
0x60    the original code from MSVCR90!read (0x78586EB5) that we are "lifting" to here
0x80    the original code from MSVCR90!read (0x78586F35) that we are "lifting" to here
The offsets were chosen arbitrarily (I just ensured they were large enough to contain the code we're going to put there).
So, the first thing we need to do is write the code that we want to run at the first hook point. In other words, the code that will grab the arguments off the stack (specifically the destination buffer pointer and the size passed to read). Here's what I cobbled together:
[BITS 32]
; save registers we'll use
push ecx
push esi
mov ecx, [esp+0x10] ; grab the size passed to read
mov esi, [esp+0x0c] ; grab the destination pointer
mov [0x41414141], ecx ; write it to memory at 0x41414141
mov [0x42424242], esi ; write it to memory at 0x42424242
pop esi
pop ecx
Now, you'll notice I used two static addresses there, 0x41414141 and 0x42424242. I will replace those addresses with Python when we inject this code.
At this point we need to assemble that into its opcodes. We'll use nasm for that task:
[deft@host v90]$ nasm -f bin -o grab_args grab_args.asm
[deft@host v90]$ xxd grab_args
0000000: 5156 8b4c 2410 8b74 240c 890d 4141 4141  QV.L$..t$...AAAA
0000010: 8935 4242 4242 5e59                      .5BBBB^Y
[deft@host v90]$
In Python-speak, this will be:
args_hook = "\x51\x56\x8b\x4c\x24\x10\x8b\x74\x24\x0c\x89\x0d"
args_hook += saved_size + "\x89\x35" + saved_dst + "\x5e\x59"
We obviously don't want to write those values to those static addresses (0x41414141 and 0x42424242), so what we'll end up doing is allocating some memory to hold them. We'll get to that later but you'll note we have two variables, saved_dst and saved_size.
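As a sketch of how that substitution plays out, the function below builds the first hook from two slot addresses. The name `build_args_hook` and the two example addresses are assumptions for illustration only, and the snippet uses bytes literals so it also runs on Python 3, unlike the Python 2 strings used in the rest of this post.

```python
import struct

def build_args_hook(size_slot, dst_slot):
    """Return the first hook's opcodes with both storage addresses patched in."""
    code = b"\x51"                                        # push ecx
    code += b"\x56"                                       # push esi
    code += b"\x8b\x4c\x24\x10"                           # mov ecx, [esp+0x10]
    code += b"\x8b\x74\x24\x0c"                           # mov esi, [esp+0x0c]
    code += b"\x89\x0d" + struct.pack("<L", size_slot)    # mov [size_slot], ecx
    code += b"\x89\x35" + struct.pack("<L", dst_slot)     # mov [dst_slot], esi
    code += b"\x5e"                                       # pop esi
    code += b"\x59"                                       # pop ecx
    return code

# Hypothetical slot addresses, used here only for illustration.
example = build_args_hook(0x05c20000, 0x05c20004)
```

The addresses are packed little-endian because that is how x86 stores a 32-bit absolute address inside the instruction.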
The second hook is a bit more complicated because it will actually be searching for a 4-byte value in the memory that read() wrote user data to. Here it is:
[BITS 32]
; save registers+flags
pushad
pushfd
; ecx = the size passed to read (retrieved from our first hook)
mov ecx, 0x61616161
; esi = the address of the buffer that
; read wrote to (as retrieved by our first hook)
mov esi, 0x62626262
; divide for 4-byte search
shr ecx, 2
; eax = the value we want to find
; this will be changed in our python
mov eax, 0x41414141
.loop
cmp [esi], eax
jnz .increment
jz .success
.increment
lea esi, [esi+4]
dec ecx
jz .fail
jmp .loop
.success ; on success, throw an int3 to the debugger
int3 ; and esi will point to the value in memory
jmp .exit
.fail ; explicitness ;)
jmp .exit
.exit
popfd
popad
And here's what it looks like assembled:
[deft@host v90]$ nasm -f bin -o search search.asm
[deft@host v90]$ xxd search
0000000: 609c b961 6161 61be 6262 6262 c1e9 02b8  `..aaaa.bbbb....
0000010: 4141 4141 3906 7502 7408 8d76 0449 7408  AAAA9.u.t..v.It.
0000020: ebf2 cce9 0500 0000 e900 0000 009d 61    ..............a
[deft@host v90]$
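...and in Python. Note that the two move-immediate opcodes from the assembled output (b9 and be) are swapped for their load-from-memory forms (8b 0d and 8b 35), because what we patch in are the addresses where the first hook saved the size and the pointer, not the values themselves:
search_hook = "\x60\x9c\x8b\x0d" + "A"*4 + "\x8b\x35" + "B"*4 + "\xc1\xe9\x02\xb8" + needle
search_hook += "\x39\x06\x75\x02\x74\x08\x8d\x76\x04\x49\x74\x08\xeb\xf2\xcc\xe9\x05\x00"
search_hook += "\x00\x00\xe9\x00\x00\x00\x00\x9d\x61"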
where needle is the 4 byte value we are looking for.
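If the assembly is hard to follow, this is roughly what the loop does, modeled in plain Python. It is only a sketch of the logic (the function name `find_needle` is mine) and it ignores the corner case of a read shorter than 4 bytes.

```python
def find_needle(buf, needle):
    """Mimic the hook's loop: compare 4 bytes, advance 4 bytes, repeat."""
    assert len(needle) == 4
    count = len(buf) >> 2              # shr ecx, 2
    for i in range(count):
        offset = i * 4                 # lea esi, [esi+4] on every pass
        if buf[offset:offset + 4] == needle:
            return offset              # where the hook would hit the int3
    return None                        # the .fail path

aligned = find_needle(b"RIFX\x00\x00\xdc\x50MV93", b"RIFX")
shifted = find_needle(b"\x00RIFX\x00\x00\xdc", b"RIFX")
```

One consequence of stepping 4 bytes at a time is that the needle is only found when it sits at a 4-byte offset from the start of the buffer. That is fine for a file magic at offset 0, but worth remembering if you reuse the hook for something else.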
Injection with Python
In order to accomplish this code injection fu, we're going to need the ability to allocate memory in a remote process as well as write to its memory space (and change memory permissions where needed).
We are going to use Python's ctypes module, which allows us to load DLLs and call their functions. The module we're going to need is kernel32.dll. Specifically, we're going to use the following functions:
OpenProcess
VirtualAllocEx
VirtualProtectEx
WriteProcessMemory
To utilize the allocation routines and WriteProcessMemory in a remote process, we're going to need a handle to it. So, we need to start off with a call to OpenProcess. Here's the relevant Python:
import ctypes
import struct  # used further down to pack addresses into our opcodes
kernel32 = ctypes.WinDLL('kernel32.dll')
def get_handle(pid):
    PROCESS_VM_OPERATION = 0x0008
    PROCESS_VM_READ = 0x0010
    PROCESS_VM_WRITE = 0x0020
    PROCESS_SET_INFORMATION = 0x0200
    PROCESS_QUERY_INFORMATION = 0x0400
    PROCESS_INFO_ALL = PROCESS_QUERY_INFORMATION|PROCESS_SET_INFORMATION
    PROCESS_VM_ALL = PROCESS_VM_OPERATION|PROCESS_VM_READ|PROCESS_VM_WRITE
    res = kernel32.OpenProcess(PROCESS_INFO_ALL | PROCESS_VM_ALL, False, pid)
    print "Returning handle %d" % res
    return res
Once we have a handle, we can allocate memory and change memory permissions. Our first step at this point is to allocate our heap arena to hold MSVCR90's lifted code as well as our hooks. To do this we'll need to use VirtualAllocEx:
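def allocate(handle, size):
    # note: size is a count of 0x1000-byte pages, not a byte count
    MEM_COMMIT = 0x1000
    MEM_RESERVE = 0x2000
    PAGE_EXECUTE_READWRITE = 0x40
    count = size
    alloc_type = MEM_COMMIT | MEM_RESERVE
    # VirtualAllocEx takes the allocation type and the protection as two separate arguments
    res = kernel32.VirtualAllocEx(handle, 0x0, count * 0x1000, alloc_type, PAGE_EXECUTE_READWRITE)
    print "Allocated memory for handle %d at 0x%08x" % (handle, res)
    return res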
Then, we'll want to write our hooks and lifted code to the address returned by allocate. But first, we want to make two more allocations to hold the saved destination pointer and the saved size:
saved_size = struct.pack("L", allocate(handle, 4))
saved_dst = struct.pack("L", allocate(handle, 4))
Now, when we inject our hooks we can use those addresses and thus dynamically change our injected opcodes. Also, let's go ahead and define the "needle", or the four bytes we are going to look for. In this case, we're going to search for the first 4 bytes of any DIR file (Shockwave Director), which is "RIFX", as can be seen in this partial hexdump of a DIR file:
0000h: 52 49 46 58 00 00 DC 50 4D 56 39 33 69 6D 61 70 RIFX..ÜPMV93imap
0010h: 00 00 00 18 00 00 00 01 00 00 00 2C 00 00 07 3A ...........,...:
0020h: 00 00 00 00 00 00 00 00 00 00 00 00 6D 6D 61 70 ............mmap
0030h: 00 00 0F E0 00 18 00 14 00 00 00 CA 00 00 00 97 ...à.......Ê...,
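A quick aside on byte order, as a small sketch: because the hook compares a 32-bit register against memory, the needle can be dropped into the opcodes in plain file order, and x86 will see it as a little-endian dword.

```python
import struct

needle = b"RIFX"
# The 4 bytes as they would appear in a 32-bit register after a little-endian load.
as_dword = struct.unpack("<L", needle)[0]
# Packing the dword back gives the original file-order bytes.
round_trip = struct.pack("<L", as_dword)
```

So if you see 0x58464952 in a register while debugging, that is "RIFX" as it appears in the file.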
Here is the dynamic replacement we perform on our hook's opcodes:
needle = "RIFX"
args_hook = "\x51\x56\x8b\x4c\x24\x10\x8b\x74\x24\x0c\x89\x0d" + saved_size
args_hook += "\x89\x35" + saved_dst + "\x5e\x59"
search_hook = "\x60\x9c\x8b\x0d" + saved_size + "\x8b\x35" + saved_dst
search_hook += "\xc1\xe9\x02\xb8" + needle
search_hook += "\x39\x06\x75\x02\x74\x08\x8d\x76\x04\x49\x74\x08"
search_hook += "\xeb\xf2\xcc\xe9\x05\x00\x00\x00\xe9\x00\x00\x00\x00\x9d\x61"
Also, let's go ahead and define the original opcode bytes for the code we are lifting out:
# .text:78586EB5 8B C8 mov ecx, eax
# .text:78586EB7 C1 F9 05 sar ecx, 5
# .text:78586EBA 8D 1C 8D A0 D6 5B 78 lea ebx, ___pioinfo[ecx*4]
hook1_orig = "\x8b\xc8\xc1\xf9\x05\x8d\x1c\x8d\xa0\xd6\x5b\x78"
# .text:78586F35 C7 45 FC FE FF FF FF mov [ebp+ms_exc.disabled], 0FFFFFFFEh
hook2_orig = "\xc7\x45\xfc\xfe\xff\xff\xff"
Now that we have all the opcodes ready for injection, let's review the offsets we're going to use. Remember:
Offset  Contains
0x00    our code to grab read()'s arguments off the stack followed by a jump to offset 0x60
0x20    our code to search the destination buffer for our value followed by a jump to offset 0x80
0x60    the original code from MSVCR90!read (0x78586EB5) that we are "lifting" to here
0x80    the original code from MSVCR90!read (0x78586F35) that we are "lifting" to here
And after each of those chunks of code we're going to want some jumps. The order of execution should be something like this:
MSVCR90!read is called.
At 0x78586EB5 the execution will hit a jump we will patch in. This will jump to our heap arena at offset 0x00.
After our hook executes, it will jump to our heap arena at offset 0x60 to execute the original code we lifted from 0x78586EB5.
After that executes, it will jump back to read() at 0x78586EC1 (right after what we lifted).
Execution will continue normally until our hook at 0x78586F35 is hit.
Then, the process will hit our patched jump that will go to our heap arena at offset 0x20.
Our hook will search the buffer and throw an INT 3 if it finds our needle. Either way, once execution continues it will jump to the original lifted code at our heap arena offset 0x80.
After that's done, it will jump back to read() at 0x78586F3C (right after what we lifted).
Make sense?
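The whole plan boils down to six jumps, and it can help to see them as plain (source, target) pairs. The sketch below computes them and also double-checks that each chunk plus its trailing 5-byte jump fits inside its slot. The function names and the arena base address are assumptions for illustration; the sizes are the byte counts of the opcode strings defined above.

```python
# Slot offsets from the table above.
ARGS_HOOK, SEARCH_HOOK, LIFTED1, LIFTED2 = 0x00, 0x20, 0x60, 0x80
JMP_LEN = 5

def jump_plan(arena, hook1, hook2, sizes):
    """Return (source, target) pairs for all six jumps, in execution order."""
    return [
        (hook1, arena + ARGS_HOOK),
        (arena + ARGS_HOOK + sizes["args_hook"], arena + LIFTED1),
        (arena + LIFTED1 + sizes["lifted1"], hook1 + sizes["lifted1"]),
        (hook2, arena + SEARCH_HOOK),
        (arena + SEARCH_HOOK + sizes["search_hook"], arena + LIFTED2),
        (arena + LIFTED2 + sizes["lifted2"], hook2 + sizes["lifted2"]),
    ]

def check_fit(sizes):
    """Make sure every chunk plus its trailing jump stays inside its slot."""
    slots = [
        (ARGS_HOOK, SEARCH_HOOK, sizes["args_hook"]),
        (SEARCH_HOOK, LIFTED1, sizes["search_hook"]),
        (LIFTED1, LIFTED2, sizes["lifted1"]),
    ]
    return all(start + size + JMP_LEN <= end for start, end, size in slots)

# Byte counts of the opcode strings in this post; the arena base is hypothetical.
sizes = {"args_hook": 24, "search_hook": 49, "lifted1": 12, "lifted2": 7}
plan = jump_plan(0x0de50000, 0x78586EB5, 0x78586F35, sizes)
fits = check_fit(sizes)
```

Printing `plan` with `hex()` gives a handy checklist to compare against the debugger once everything is injected.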
So let's make this easier on ourselves by writing some functions to make these jumps (opcode 0xE9 is a jump instruction):
def makejump(start, target, length):
    print "Asked to make a jump from 0x%08x to 0x%08x" % (start, target)
    # the displacement is relative to the end of the 5-byte jmp; the mask keeps
    # backward (negative) displacements packable as an unsigned dword
    rel32 = (target - start - 5) & 0xffffffff
    buf = "\xe9" + struct.pack("<L", rel32)
    # pad whatever is left of the clobbered instructions with NOPs
    buf += "\x90"*(length-len(buf))
    return buf
def patchjump(handle, x, y, length):
    opcodes = makejump(x, y, length)
    dst = ctypes.cast(x, ctypes.c_char_p)
    src = ctypes.c_char_p(opcodes)
    print "Patching jump from 0x%08x to 0x%08x" % (x, y)
    res = ctypes.windll.kernel32.WriteProcessMemory(handle, dst, src, length, 0x0)
    print "WriteProcessMemory returned 0x%08x" % res
    return res
At this point we should have the ability to allocate memory and to write our opcodes to it. Now we can actually start "doing stuff".
To begin, we should change the permissions on MSVCR90.dll's .text segment to ensure we can actually write our jump there. Its default permissions are the following (from WinDBG):
0:021> !address 0x78586EB5
78520000 : 78521000 - 00096000
Type 01000000 MEM_IMAGE
Protect 00000020 PAGE_EXECUTE_READ
State 00001000 MEM_COMMIT
Usage RegionUsageImage
FullPath C:\WINDOWS\system32\MSVCR90.dll
We need to change the Protect flags to 0x40 (PAGE_EXECUTE_READWRITE).
This can be accomplished with a call to VirtualProtectEx:
def vprotect(handle, address):
    PAGE_EXECUTE_READWRITE = 0x40
    # receives the previous protection flags, which we don't care about
    crap = ctypes.byref(ctypes.create_string_buffer("\x00"*4))
    res = kernel32.VirtualProtectEx(handle, address, 0x1000, PAGE_EXECUTE_READWRITE, crap)
    print "VirtualProtectEx returned 0x%08x" % res
    return res
Once we've run this on the process, we can verify with WinDBG that the permissions were changed:
0:021> !address 0x78586EB5
78520000 : 78586000 - 00001000
Type 01000000 MEM_IMAGE
Protect 00000040 PAGE_EXECUTE_READWRITE
State 00001000 MEM_COMMIT
Usage RegionUsageImage
FullPath C:\WINDOWS\system32\MSVCR90.dll
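Protection changes happen a page at a time, which is why the region WinDBG reports now starts at 78586000 and is 0x1000 bytes long. As a small sketch (the helper `page_base` is my own name), you can work out which page each hook lives in:

```python
PAGE_SIZE = 0x1000

def page_base(address):
    """Round an address down to the start of its 4 KB page."""
    return address & ~(PAGE_SIZE - 1)

hook1, hook2 = 0x78586EB5, 0x78586F35
same_page = page_base(hook1) == page_base(hook2)
```

Both hook points land in the same page, so a single call covers them. VirtualProtectEx changes every page that contains at least one byte of the requested range, so passing the page base together with a size of 0x1000 keeps the change confined to that one page.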
Now, let's allocate our memory and then write our jumps into read()'s code:
addr = allocate(handle, 1024)
# the two hook points inside MSVCR90!read
hook1 = 0x78586EB5
hook2 = 0x78586F35
# .text:78586EB5 8B C8 mov ecx, eax
# .text:78586EB7 C1 F9 05 sar ecx, 5
# .text:78586EBA 8D 1C 8D A0 D6 5B 78 lea ebx, ___pioinfo[ecx*4]
hook1_orig = "\x8b\xc8\xc1\xf9\x05\x8d\x1c\x8d\xa0\xd6\x5b\x78"
# patch a jump from hook1 to addr
patchjump(handle, hook1, addr+0x00, 12)
# .text:78586F35 C7 45 FC FE FF FF FF mov [ebp+ms_exc.disabled], 0FFFFFFFEh
hook2_orig = "\xc7\x45\xfc\xfe\xff\xff\xff"
# patch a jump from hook2 to addr+0x20
patchjump(handle, hook2, addr+0x20, 7)
At this point we can verify that our jumps were patched in to the process properly:
0:001> u 0x78586EB5
MSVCR90!_read+0x5e:
78586eb5 e946918c95 jmp +0xde4ffff (0de50000)
78586eba 90 nop
78586ebb 90 nop
78586ebc 90 nop
0:001> u 78586F35
MSVCR90!_read+0xde:
78586f35 e9e6908c95 jmp +0xde5001f (0de50020)
78586f3a 90 nop
78586f3b 90 nop
You should notice that the first hook jumps to offset 0x00 and the second jumps to offset 0x20, which makes sense.
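If you want to convince yourself the bytes WinDBG shows are right, the encoding is easy to reproduce by hand. This is a minimal sketch (the name `encode_jump` is mine, and it uses bytes literals so it also runs on Python 3): the displacement is measured from the end of the 5-byte jump, and a backward jump simply wraps around as a two's-complement dword.

```python
import struct

def encode_jump(start, target, length=5):
    """Encode `jmp target` placed at `start`, NOP-padded out to `length` bytes."""
    # rel32 is measured from the end of the 5-byte instruction; the mask turns a
    # negative (backward) displacement into its two's-complement form.
    rel32 = (target - start - 5) & 0xFFFFFFFF
    code = b"\xe9" + struct.pack("<L", rel32)
    return code + b"\x90" * (length - len(code))

# Hook addresses from this post; the arena address is the one seen in the output above.
patch1 = encode_jump(0x78586EB5, 0x0de50000, 12)
patch2 = encode_jump(0x78586F35, 0x0de50020, 7)
```

The first five bytes of each patch match the jmp opcodes in the disassembly, and the rest is the NOP padding that fills out the instructions we clobbered.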
Now, let's implement a quick function to write to memory so that we can copy our opcodes to our heap arena:
def writemem(handle, mem, data):
    src = ctypes.c_char_p(data)
    dst = ctypes.cast(mem, ctypes.c_char_p)
    length = ctypes.c_int(len(data))
    res = ctypes.windll.kernel32.WriteProcessMemory(handle, dst, src, length, 0x0)
    return res
And to use it:
# write the lifted code to our heap arena
writemem(handle, addr+0x60, hook1_orig)
writemem(handle, addr+0x80, hook2_orig)
And now we can write our jumps from the end of our hooks to the lifted code as well as the jumps back from the lifted code back to read():
# write jumps after our hooks that goes to the lifted code
jmp_hook1 = patchjump(handle, addr+0x00+len(args_hook), addr+0x60, 5)
jmp_hook2 = patchjump(handle, addr+0x20+len(search_hook), addr+0x80, 5)
# write our hooks to our arena
writemem(handle, addr, args_hook)
writemem(handle, addr+0x20, search_hook)
# write in some patches from the lifted code back to read()
patchjump(handle, addr+0x60+len(hook1_orig), hook1+12, 5)
patchjump(handle, addr+0x80+len(hook2_orig), hook2+7, 5)
At this point we should be done and we can verify with WinDBG:
0:001> u 0de50000 L9
+0xde4ffff:
0de50000 51 push ecx
0de50001 56 push esi
0de50002 8b4c2410 mov ecx,dword ptr [esp+10h]
0de50006 8b74240c mov esi,dword ptr [esp+0Ch]
0de5000a 890d0000c205 mov dword ptr [ +0x5c1ffff (05c20000)],ecx
0de50010 89350400c205 mov dword ptr [ +0x5c20003 (05c20004)],esi
0de50016 5e pop esi
0de50017 59 pop ecx
0de50018 e943000000 jmp +0xde5005f (0de50060)
0:001> u 0de50000+0x20 L13
+0xde5001f:
0de50020 60 pushad
0de50021 9c pushfd
0de50022 8b0d0000c205 mov ecx,dword ptr [ +0x5c1ffff (05c20000)]
0de50028 8b350400c205 mov esi,dword ptr [ +0x5c20003 (05c20004)]
0de5002e c1e902 shr ecx,2
0de50031 b852494658 mov eax,58464952h
0de50036 3906 cmp dword ptr [esi],eax
0de50038 7502 jne +0xde5003b (0de5003c)
0de5003a 7408 je +0xde50043 (0de50044)
0de5003c 8d7604 lea esi,[esi+4]
0de5003f 49 dec ecx
0de50040 7408 je +0xde50049 (0de5004a)
0de50042 ebf2 jmp +0xde50035 (0de50036)
0de50044 cc int 3
0de50045 e905000000 jmp +0xde5004e (0de5004f)
0de5004a e900000000 jmp +0xde5004e (0de5004f)
0de5004f 9d popfd
0de50050 61 popad
0de50051 e92a000000 jmp +0xde5007f (0de50080)
0:001> u 0de50000+0x60 L4
+0xde5005f:
0de50060 8bc8 mov ecx,eax
0de50062 c1f905 sar ecx,5
0de50065 8d1c8da0d65b78 lea ebx,MSVCR90!__pioinfo (785bd6a0)[ecx*4]
0de5006c e9506e736a jmp MSVCR90!_read+0x6a (78586ec1)
0:001> u 0de50000+0x80 L2
+0xde5007f:
0de50080 c745fcfeffffff mov dword ptr [ebp-4],0FFFFFFFEh
0de50087 e9b06e736a jmp MSVCR90!_read+0xe5 (78586f3c)
Now, let's test it out by loading a Director file into Internet Explorer with a debugger attached:
0:001> g
ModLoad: 01a00000 01a0d000 C:\WINDOWS\system32\Adobe\Shockwave 11\xtras\Speech.x32
ModLoad: 01a20000 01a4d000 C:\WINDOWS\system32\Adobe\Shockwave 11\xtras\Multiusr.x32
ModLoad: 01a10000 01a16000 C:\WINDOWS\system32\Adobe\Shockwave 11\DYNAPLAYER.DLL
ModLoad: 69000000 69108000 C:\WINDOWS\system32\Adobe\Shockwave 11\IML32.dll
ModLoad: 68000000 681ad000 C:\WINDOWS\system32\Adobe\Shockwave 11\DIRAPI.dll
ModLoad: 6c100000 6c119000 C:\WINDOWS\system32\Adobe\Shockwave 11\SwMenu.dll
ModLoad: 01ab0000 01ad7000 C:\WINDOWS\system32\Adobe\Shockwave 11\xtras\Netfile.x32
ModLoad: 71ad0000 71ad9000 C:\WINDOWS\system32\WSOCK32.dll
ModLoad: 03b10000 03b17000 C:\WINDOWS\system32\Adobe\Shockwave 11\xtras\CBrowser.x32
(11ec.1910): Break instruction exception - code 80000003 (first chance)
eax=58464952 ebx=785bd6a0 ecx=00002000 edx=7c90e514 esi=039e2a94 edi=000000c0
eip=0de50044 esp=0160ba3c ebp=0160ba90 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00200246
Missing image name, possible paged-out or corrupt data.
Missing image name, possible paged-out or corrupt data.
+0xde50033:
0de50044 cc int 3
As can be seen above, an int3 was thrown at 0x0de50044 (which is inside our heap arena). At this point, we can verify that ESI points to our needle value:
0:008> dc esi
039e2a94 58464952 50dc0000 3339564d 70616d69 RIFX...PMV93imap
039e2aa4 18000000 01000000 2c000000 3a070000 ...........,...:
039e2ab4 00000000 00000000 00000000 70616d6d ............mmap
039e2ac4 e00f0000 14001800 ca000000 97000000 ................
039e2ad4 90000000 ffffffff 68000000 58464952 ...........hRIFX
039e2ae4 50dc0000 00000000 00000100 00000000 ...P............
039e2af4 70616d69 18000000 0c000000 00000100 imap............
039e2b04 24d9af0a 70616d6d e00f0000 2c000000 ...$mmap.......,
At this point we can set a memory breakpoint on that location and it should tell us where Shockwave first decides to parse our data:
0:008> ba r1 039e2a94
0:008> g
Breakpoint 0 hit
eax=00000052 ebx=0000000c ecx=0000000c edx=0160bb6c esi=039e2a94 edi=0160bb6c
eip=69007295 esp=0160ba94 ebp=0160baa0 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00200202
IML32!Ordinal9231+0x7295:
69007295 83c601 add esi,1
0:008> ub
IML32!Ordinal9231+0x727b:
6900727b 3bc1 cmp eax,ecx
6900727d 7567 jne IML32!Ordinal9231+0x72e6 (690072e6)
6900727f 8b0dfcbc0b69 mov ecx,dword ptr [IML32!Ordinal2344+0x32fac (690bbcfc)]
69007285 8b7d08 mov edi,dword ptr [ebp+8]
69007288 8b750c mov esi,dword ptr [ebp+0Ch]
6900728b f7c707000000 test edi,7
69007291 740f je IML32!Ordinal9231+0x72a2 (690072a2)
69007293 8a06 mov al,byte ptr [esi]
0:008> t
eax=00000052 ebx=0000000c ecx=0000000c edx=0160bb6c esi=039e2a95 edi=0160bb6c
eip=69007298 esp=0160ba94 ebp=0160baa0 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00200206
IML32!Ordinal9231+0x7298:
69007298 8807 mov byte ptr [edi],al ds:0023:0160bb6c=a8
0:008> t
eax=00000052 ebx=0000000c ecx=0000000c edx=0160bb6c esi=039e2a95 edi=0160bb6c
eip=6900729a esp=0160ba94 ebp=0160baa0 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00200206
IML32!Ordinal9231+0x729a:
6900729a 83c701 add edi,1
0:008> ba r1 @edi
0:008> g
Breakpoint 1 hit
eax=58464952 ebx=00000000 ecx=00000000 edx=058273a8 esi=03af32c0 edi=00000000
eip=68002ffc esp=0160bb5c ebp=0160bb98 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00200246
DIRAPI+0x2ffc:
68002ffc 3d58464952 cmp eax,52494658h
0:008> .formats @eax
Evaluate expression:
Hex: 58464952
Decimal: 1481001298
Octal: 13021444522
Binary: 01011000 01000110 01001001 01010010
Chars: XFIR
Time: Mon Dec 05 23:14:58 2016
Float: low 8.72073e+014 high 0
Double: 7.31712e-315
So, we can see it first copies it to some other buffer. We set a breakpoint on the new buffer they copied to, and when it's next hit we can see they are comparing it to RIFX. This is the place we should begin reversing to discover more about their file format and corresponding parsing.
This was how Logan and I accomplished a lot of the Shockwave reversing that we talked about during our CanSecWest presentation (PPTX).
The above code can be snagged here in one .py file: thePublicHooker.py.
Expect some future posts on other tricks that can be accomplished using injection in this manner.
--
Aaron (@aaronportnoy)
