TippingPoint Digital Vaccine Laboratories

MindshaRE: Another Approach To Tracking ReadFile


I. Introduction



We often receive fuzzed file submissions, which at times can be agonizing to analyze. Tools help a lot here, as we have shown in previous posts, such as Peter's awesome write-up on hooking ReadFile and MapViewOfFile. This post approaches the same idea of hooking ReadFile for fuzz file analysis, but uses programmatic debugging rather than hot patching to intercept ReadFile and inspect its input ("hooking" is not really the right term here, but we will call it that for simplicity). The goal is simple: attach to the application you are analyzing, set a breakpoint on ReadFile, inspect the input as it is read in, and act accordingly when specific input is found.


II. VDB and vtrace



For this task we will be using components from VDB. VDB is a debugger written in Python by Invisigoth (http://visi.kenshoto.com, @invisig0th), and is based on vtrace, a programmatic debugging library by the same author. The code is officially released and hosted on visi.kenshoto.com, and community-hosted patches and add-ons are available on code.google.com. This post covers just a few small examples of what you can do with vtrace; it is by no means in-depth coverage.

III. Simple debugging example


First we start off with a simple Python script to demonstrate the basics of setting up and using vtrace.


import sys
import vtrace

# main
if __name__ == "__main__":
    # error checking is for cowards, assume they provided input
    # pid to attach to
    pid = int(sys.argv[1])

    # get trace
    apptrace = vtrace.getTrace()

    # attach to process
    apptrace.attach(pid)

    # tell vtrace to automatically resume execution next time
    apptrace.setMode("RunForever", True)

    # run
    print "+ continuing execution"
    apptrace.run()

    sys.exit(0)


As you can see, attaching to a process to debug it is straightforward. Getting a trace object is as easy as calling getTrace(), and attaching to a program is as simple as calling attach(). After attaching to the process, vtrace returns immediately, similar to an initial attach breakpoint. Here we explicitly instruct vtrace to resume program execution by calling apptrace.run().

Much of the fundamental debugging functionality in vtrace is provided as an event-driven notification system. Events in this case are things like a breakpoint being hit or the application loading a library. When an event fires, vtrace returns to Python to execute a user-provided callback. To instruct vtrace to automatically resume execution after a callback has finished, we set the RunForever flag to True using setMode().
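
To make the callback flow concrete, here is a toy model of an event-driven debugger loop. It is only a sketch of the control flow described above: ToyTrace and ToyBreakpoint are hypothetical stand-ins written for this illustration, they do not debug anything, and they are not how vtrace is implemented internally.

```python
# Toy model of an event-driven debugger loop. None of these classes are
# part of vtrace; they only mimic the control flow described in the text.

class ToyBreakpoint(object):
    def __init__(self, address):
        self.address = address
        self.hits = 0

    def notify(self, event, trace):
        # user callback: runs each time the breakpoint fires
        self.hits += 1


class ToyTrace(object):
    def __init__(self, pending_events):
        # addresses the pretend debuggee will "hit", in order
        self.pending = list(pending_events)
        self.breakpoints = {}
        self.modes = {"RunForever": False}

    def setMode(self, name, value):
        self.modes[name] = value

    def addBreakpoint(self, bp):
        self.breakpoints[bp.address] = bp

    def run(self):
        # deliver events until none remain, or until a callback has
        # finished while RunForever is off
        while self.pending:
            address = self.pending.pop(0)
            bp = self.breakpoints.get(address)
            if bp is None:
                continue
            bp.notify("breakpoint", self)
            if not self.modes["RunForever"]:
                return


bp = ToyBreakpoint(0x1000)

once = ToyTrace([0x1000, 0x1000, 0x1000])
once.addBreakpoint(bp)
once.run()                      # hands control back after the first callback
hits_without_runforever = bp.hits

bp.hits = 0
forever = ToyTrace([0x1000, 0x1000, 0x1000])
forever.addBreakpoint(bp)
forever.setMode("RunForever", True)
forever.run()                   # resumes after every callback
hits_with_runforever = bp.hits
```

Without the flag the loop hands control back after the first callback; with it, every pending event is delivered before run() returns.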

Now that we can execute and debug a program of our choosing, let's set up a breakpoint for kernel32's ReadFile. Defining your own breakpoint starts with extending the vtrace Breakpoint class and then overriding the notify() method, as shown below:

class readFileBreakpoint(vtrace.Breakpoint):
    def notify(self, event, trace):
        print "+ breakpoint hit!"
        return


The notify function is called by vtrace internals upon the breakpoint being hit, and the arguments passed in are 'event' (information about the event which fired) and 'trace', a copy of the trace object. The simplest form of instantiating a Breakpoint class is done by passing it an address to break on. For example, we could instantiate the above breakpoint as:

    bp1 = readFileBreakpoint(0xAABBCCDD)


Of course, 0xAABBCCDD here is just an example of an address to break on. To resolve a symbol name instead, pass None as the first argument, with a second argument of a string in the form of DLLName.SymbolName, such as:


    bp1 = readFileBreakpoint(None, "kernel32.ReadFile")



Combining this with the initial code we wrote, we only need to call addBreakpoint() to register our breakpoint instance with our current trace object:


import vtrace
import sys

class readFileBreakpoint(vtrace.Breakpoint):
    def notify(self, event, trace):
        print "+ breakpoint hit!"
        return

# main
if __name__ == "__main__":
    # error checking is for cowards, assume they provided input
    # pid to attach to
    pid = int(sys.argv[1])

    # get trace
    apptrace = vtrace.getTrace()

    # attach to process
    apptrace.attach(pid)

    # tell vtrace to automatically resume execution next time
    apptrace.setMode("RunForever", True)

    # get an instance of our breakpoint class
    bp1 = readFileBreakpoint(None, "kernel32.ReadFile")

    # add the breakpoint to this trace object
    apptrace.addBreakpoint(bp1)

    # run
    print "+ continuing execution"
    apptrace.run()

    
    sys.exit(0)



IV. Data inspection



Now that we have a basic breakpoint, let's look at writing something to inspect the data read in by ReadFile. We will not know the result of ReadFile until it has finished, so what we really need is a breakpoint to alert us of ReadFile's completion. From there, we can check whether the call was successful and, if so, what data was read. If we wanted to be clever, we could break on the only retn instruction inside of ReadFile and begin our inspection there. However, in favor of being dynamic and not hardcoding addresses, we will instead write code to extract the return address and break on that; the downside is that this method incurs more overhead (context switching, etc.) than the former.

Since we will need the return address, and other stack data such as ReadFile arguments, we will start with writing something to extract stack data:

import struct
import vtrace
import envi.archs.i386.regs as x86

class vstack():
    def __init__(self, trace):
        self.trace = trace
        self.stacktop = trace.getRegister(x86.REG_ESP)

    def pop32(self):
        x = self.trace.readMemory(self.stacktop, 4)
        dword, = struct.unpack("I", x)
        self.stacktop += 4
        return dword


Shown above is a class written to abstract some of the process of reading from the stack pointer. This class takes a trace argument and provides a pop32() function to read the next value up the stack. For register constants used by vtrace, we import from envi (another VDB component). We set our "stacktop" to ESP on initialization, and the pop32 function reads a 4-byte value and advances the saved pointer. From this example it is easy to see how to use the functions readMemory and getRegister.
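
To try the idea without a live process, the sketch below runs the same pop32() logic against a fake trace. FakeTrace, the ESP constant, and every address are made up for this illustration; a real trace comes from vtrace.getTrace() and the register index from envi. The unpack format is also spelled "<I" here to make the little-endian assumption explicit.

```python
import struct

ESP = 0  # hypothetical register index for the fake trace below


class FakeTrace(object):
    """Stand-in for a trace object: one flat buffer plus one register."""

    def __init__(self, base, memory, esp):
        self.base = base          # address of the first byte in memory
        self.memory = memory      # bytes backing the pretend stack
        self.registers = {ESP: esp}

    def getRegister(self, index):
        return self.registers[index]

    def readMemory(self, addr, size):
        offset = addr - self.base
        return self.memory[offset:offset + size]


class vstack(object):
    def __init__(self, trace):
        self.trace = trace
        self.stacktop = trace.getRegister(ESP)

    def pop32(self):
        raw = self.trace.readMemory(self.stacktop, 4)
        dword, = struct.unpack("<I", raw)   # explicit little-endian dword
        self.stacktop += 4                  # only our saved copy moves
        return dword


# pretend stack at ReadFile entry: return address, then the five arguments
frame = struct.pack("<6I", 0x00401234, 0x7C, 0x0012F000, 512, 0x0012EFF0, 0)
trace = FakeTrace(0x0012EF00, frame, 0x0012EF00)

stack = vstack(trace)
caller = stack.pop32()
hFile = stack.pop32()
lpBuffer = stack.pop32()
```

Note that pop32() never writes ESP back to the debuggee; it only walks a private copy, so the target's real stack is left untouched.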

Applying this class inside of our ReadFile breakpoint, we can read the return address and function arguments, which are ordered as:
  • hFile
  • lpBuffer
  • nNumberOfBytesToRead
  • lpNumberOfBytesRead
  • lpOverlapped

Here's what the code looks like:

class readFileBreakpoint(vtrace.Breakpoint):
    def notify(self, event, trace):
        print "+ breakpoint hit!"

        # setup to read values from stack
        stack = vstack(trace)
        
        # unlike WinDBG, vtrace breakpoints hit before the call to push
        # ebp this means the first value on the stack is the return
        # address
        caller = stack.pop32()

        # first argument
        hFile = stack.pop32()

        # second arg, and so on
        lpBuffer = stack.pop32()
        numToRead = stack.pop32()
        numberRead = stack.pop32()

        return
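
As an alternative to popping one value at a time, the return address and all five arguments can be decoded from a single 24-byte read. This is a sketch under the assumption of a 32-bit stdcall frame at function entry; ReadFileFrame and parse_readfile_frame are names invented for this example, not part of vtrace.

```python
import struct
from collections import namedtuple

# Layout of the stack at ReadFile entry on 32-bit x86 (assumed):
# return address followed by the five arguments, 4 bytes each.
ReadFileFrame = namedtuple(
    "ReadFileFrame",
    "caller hFile lpBuffer nNumberOfBytesToRead lpNumberOfBytesRead lpOverlapped",
)

FRAME_FORMAT = "<6I"
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)


def parse_readfile_frame(raw):
    """Decode the first FRAME_SIZE bytes found at ESP into named fields."""
    return ReadFileFrame(*struct.unpack(FRAME_FORMAT, raw[:FRAME_SIZE]))


# made-up bytes standing in for what readMemory would return
sample = struct.pack("<6I", 0x00401234, 0x7C, 0x0012F000, 512, 0x0012EFF0, 0)
frame = parse_readfile_frame(sample)
```

Inside notify() the raw bytes would come from trace.readMemory(trace.getRegister(x86.REG_ESP), FRAME_SIZE). The trade-off is one memory read instead of five, at the cost of hardcoding the argument count.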


Now that we know the return address of the caller, and the arguments for where ReadFile stores its results, inspecting the data read is as easy as breaking on the return address, checking the return value, and (upon success) reading from lpBuffer. With the examples provided already, this should not be too hard for the reader to write on their own. Shown below is an example breakpoint illustrating what code might look like to achieve this, along with some other useful vtrace functions.

First we set up a Breakpoint as before, only we override its __init__ function so we can store additional data. Here, we use a vtrace.OneTimeBreak instead of a normal breakpoint; as the name suggests, this Breakpoint will only hit once, and then automatically remove itself. This is done to avoid adding more than one breakpoint at the same location, something vtrace does not like. In addition to the extracted ReadFile arguments, we also pass the constructor our "needle" to find in the "haystack" of data in lpBuffer.
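
A OneTimeBreak cleans up after itself once it fires, but ReadFile could in principle be entered again from the same call site before the first return breakpoint is hit (another thread, for instance). One possible guard, sketched here as an assumption rather than anything vtrace provides, is a small registry of return addresses that are already armed. PendingReturns is a hypothetical helper.

```python
class PendingReturns(object):
    """Remembers which return addresses already have a breakpoint armed."""

    def __init__(self):
        self._armed = set()

    def claim(self, addr):
        # True only for the first caller to claim this address
        if addr in self._armed:
            return False
        self._armed.add(addr)
        return True

    def release(self, addr):
        # call this once the return breakpoint has fired
        self._armed.discard(addr)


pending = PendingReturns()
first = pending.claim(0x00401234)    # would add the breakpoint
second = pending.claim(0x00401234)   # would skip: already armed
pending.release(0x00401234)
third = pending.claim(0x00401234)    # can be armed again after release
```

The ReadFile breakpoint would call claim() before addBreakpoint(), and the return breakpoint's notify() would call release().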


# breakpoint class to set on ReadFile return address
# we override the __init__ function to allow storing 
# the arguments initially read by our ReadFile Breakpoint
class readFileRet(vtrace.OneTimeBreak):
    def __init__(self, addr, needle, lpbuf, ptrLengthRead):
        vtrace.OneTimeBreak.__init__(self, addr)

        # store the address we're breaking on 
        self.addr = addr

        # needle 
        self.needle = needle

        # buffer of stored data
        self.lpbuf = lpbuf

        # pointer to count of bytes read value
        self.ptrLengthRead = ptrLengthRead




In this breakpoint's notify function, we will want to do a few things. First we want to ensure the read was successful, by checking the return value in EAX. Assuming success, the next thing we want to check is that the amount of data read is big enough to contain the needle value. If that is also true, we will then read from the lpBuffer pointer, and search that data for our needle value. If it is found, for demonstration purposes, we show printing the call stack leading up to this ReadFile call. All of this put together might look like:

    def notify(self, event, trace):
        #
        print "+ Returned from ReadFile()"

        # not sure what to do about errors 
        # for now give up if one was seen
        retval = trace.getRegister(x86.REG_EAX)

        # return of success is non-zero
        if retval == 0:
            return

        # how much was read?         
        # get the value at the length pointer
        length = get32bit(trace, self.ptrLengthRead)

        # was there even enough data read?
        if length < len(self.needle):
            return

        # get that data!
        print "+ reading data from buffer"
        data = trace.readMemory(self.lpbuf, length)

        # did we get it?
        if self.needle in data:

            # boom goes the dynamite
            x = self.addr
            print "+ DATA FOUND; returned by ReadFile() at %08x" % x

            # get a stack trace
            stacktrace = trace.getStackTrace()
            print "+ Stack Trace:"
            print "+ Code Address    Stack Frame"
            for frame in stacktrace:
                print "+ %08x        %08x" % (frame[0], frame[1])

        return
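
The membership test above only says whether the needle is present. If you also want to know where, a small helper can report every offset, which can then be added to lpBuffer to get addresses in the target. This is a minimal sketch; find_all is a name made up for this example, and the sample data is invented.

```python
def find_all(haystack, needle):
    """Return the offset of every occurrence of needle in haystack."""
    if not needle:
        return []
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        # step by one so overlapping matches are reported too
        start = haystack.find(needle, start + 1)
    return offsets


data = b"....meowmix....meowmix"
hits = find_all(data, b"meowmix")

lpbuf = 0x0012F000                       # made-up buffer address
addresses = [lpbuf + offset for offset in hits]
```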



The astute reader will notice we called an additional function we have not described or defined anywhere: get32bit(). This is a stupid simple helper function we wrote for dereferencing and reading from pointers, and the code for it is:


# return pointer from a pointer
# depth = how many pointers to follow (linked list, etc)
def get32bit(trace, addr, depth=1):
    ptr = addr
    while (depth):
        depth -= 1
        ptr, = struct.unpack("I", trace.readMemory(ptr, 4))

    return ptr
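
To see what the depth argument does, the sketch below runs the same loop against a fake trace whose memory is a dictionary of address to 32-bit value. FakeTrace and all addresses are invented for illustration, and the unpack format is written as "<I" to make the little-endian assumption explicit.

```python
import struct


class FakeTrace(object):
    """Stand-in trace: memory is a dict mapping address -> 32-bit value."""

    def __init__(self, memory):
        self.memory = memory

    def readMemory(self, addr, size):
        # size is ignored: every entry in this fake is exactly one dword
        return struct.pack("<I", self.memory[addr])


def get32bit(trace, addr, depth=1):
    ptr = addr
    while depth:
        depth -= 1
        ptr, = struct.unpack("<I", trace.readMemory(ptr, 4))
    return ptr


# 0x1000 -> 0x2000 -> 0x3000 -> the value 7
trace = FakeTrace({0x1000: 0x2000, 0x2000: 0x3000, 0x3000: 7})

one_hop = get32bit(trace, 0x1000)             # value stored at 0x1000
three_hops = get32bit(trace, 0x1000, depth=3) # follow the chain to the end
```

With depth=1 the helper is a plain dereference, which is all the ReadFile return breakpoint needs for lpNumberOfBytesRead.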



V. Put it all together



Piecing all of these components together, the code below shows an example Python script to attach to a program, break on ReadFile (and its callers), and search for a specific value:


import sys
import struct
import vtrace
import envi.archs.i386.regs as x86

# used to abstract reading from stack top
# also saves having to type out stack+=4 all over
class vstack():
    def __init__(self, trace):
        self.trace = trace
        self.stacktop = trace.getRegister(x86.REG_ESP)

    def pop32(self):
        x = self.trace.readMemory(self.stacktop, 4)
        dword, = struct.unpack("I", x)
        self.stacktop += 4
        return dword

# return pointer from a pointer
# depth = how many pointers to follow (linked list, etc)
def get32bit(trace, addr, depth=1):
    ptr = addr
    while (depth):
        depth -= 1
        ptr, = struct.unpack("I", trace.readMemory(ptr, 4))

    return ptr

# this BP is set dynamically by readFileBP
# it checks the buffer used to store data
# using the length of bytes read 
# This is a OneTimeBreak so vtrace auto removes it after it hits
# Probably bad from a performance perspective - fuck it
class readFileRet(vtrace.OneTimeBreak):
    def __init__(self, addr, needle, lpbuf, ptrLengthRead):
        vtrace.OneTimeBreak.__init__(self, addr)
        # addr
        self.addr = addr
        # needle 
        self.needle = needle
        # buffer
        self.lpbuf = lpbuf
        # pointer to count of bytes read value
        self.ptrLengthRead = ptrLengthRead


    def notify(self, event, trace):
        #
        print "+ Returned from ReadFile()"

        # not sure what to do about errors 
        # for now give up if one was seen
        retval = trace.getRegister(x86.REG_EAX)

        # return of success is non-zero
        if retval == 0:
            return

        # how much was read?         
        # get the value at the length pointer
        length = get32bit(trace, self.ptrLengthRead)

        # was there even enough data read?
        if length < len(self.needle):
            return

        # get that data!
        print "+ reading data from buffer"
        data = trace.readMemory(self.lpbuf, length)

        # did we get it?
        if self.needle in data:

            # boom goes the dynamite
            x = self.addr
            print "+ DATA FOUND; returned by ReadFile() at %08x" % x

            # get a stack trace
            stacktrace = trace.getStackTrace()
            print "+ Stack Trace:"
            print "+ Code Address    Stack Frame"
            for frame in stacktrace:
                print "+ %08x        %08x" % (frame[0], frame[1])


        return


# only argument this class takes for init is string to hunt for
# everything else is assumed, such as kernel32.ReadFile as the function
class readFileBP(vtrace.Breakpoint):

    def __init__(self, needle):
        vtrace.Breakpoint.__init__(self, None, "kernel32.ReadFile")
        self.needle = needle


    # notify: this is called when BP hits
    def notify(self, event, trace):
        # readfile hit bitches
        print "+ ReadFile BP hit"
        stack = vstack(trace)

        # get EIP of caller
        caller = stack.pop32()

        # skip hfile 
        blah = stack.pop32()

        # read lpBuffer pointer
        lpBuffer = stack.pop32()

        # number of bytes to read
        numToRead = stack.pop32()

        # number of bytes read pointer
        numberRead = stack.pop32()

        # was a real pointer passed for storing count read?
        if numberRead == 0:
            # shit your pants
            print "+ NULL ptr passed in for lpNumberOfBytesRead"
            return

        # first check: is the requested read amount even as long 
        # as our needle?
        if numToRead < len(self.needle):
            # too $hort 
            return

        # now set a breakpoint at EIP of caller to track data
        bp = readFileRet(caller, self.needle, lpBuffer, numberRead)
        trace.addBreakpoint(bp)

        return 


# main
if __name__ == "__main__":

    # error checking: FOR COWARDS
    # pid to attach to
    pid = int(sys.argv[1])

    # get trace
    apptrace = vtrace.getTrace()

    # attach to process
    print "+ attaching to pid %i" % pid
    apptrace.attach(pid)

    # readfile bp
    readfile = readFileBP("meowmix")

    # add breakpoint
    print "+ setting ReadFile breakpoint"
    apptrace.addBreakpoint(readfile)

    # set RunForever, continued execution
    apptrace.setMode("RunForever", True)

    # run
    print "+ continuing execution"
    apptrace.run()

    # bye
    sys.exit(0)
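
For the less brave, a little input checking costs only a few lines. The sketch below uses argparse to validate the pid and to make the needle configurable instead of hardcoded; the argument names and the "meowmix" default are choices made for this example, not something the script above requires.

```python
import argparse


def parse_args(argv):
    """Parse the pid to attach to and an optional needle to search for."""
    parser = argparse.ArgumentParser(
        description="Break on ReadFile and search the data it returns.")
    parser.add_argument("pid", type=int,
                        help="process id to attach to")
    parser.add_argument("needle", nargs="?", default="meowmix",
                        help="value to look for in data read by ReadFile")
    return parser.parse_args(argv)


args = parse_args(["1234", "PK"])
defaults = parse_args(["42"])
```

In the main block, parse_args(sys.argv[1:]) would replace the bare int(sys.argv[1]), and args.needle would be passed to readFileBP. A non-numeric pid then produces a usage message instead of a traceback.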





That's it!



Published On: 2012-04-02 08:36:21

Comments

  1. Anonymous commented on 2012-04-03 @ 00:51

    any reason why you wouldn't use pydbg to perform many / most / all of the same work?

  2. raid commented on 2012-04-03 @ 09:28

    No reason at all, I'm sure you could do the same with pydbg. I'm familiar with vtrace so that's what I used.

  3. Some Summercon Attendee commented on 2012-04-09 @ 20:52

    Also, pydbg doesn't begin with "v," so you don't have to drink every time you use it. Which, for me, is a major drawback.

