TippingPoint Digital Vaccine Laboratories

MindshaRE: Searching in IDA

MindshaRE is our weekly look at some simple reverse engineering tips and tricks.  The goal is to keep things small and discuss everyday aspects of reversing.  You can view previous entries by going through our blog history. In this week's installment of MindshaRE we will take a look at some fun uses for searching in IDA, including using IDC/IDAPython to automate it.

IDA provides several different search options, ranging from immediate values to undefined functions.  Right now we are only going to touch on the byte/text searching options, which include:
  1. Immediate value
  2. Text
  3. Sequence of bytes
  4. Regular expressions
As a security researcher, I use the built-in search methods primarily to locate common potential problems in assembly, parsing functions, and structure information.  Some simple things I look for are improper sign extension, protocol switch statements, and packet structure offsets.  Each of these can potentially be discovered using the search functionality.  Let's take a look at each one.

Finding Improper Sign Extension Bugs

What I'm referring to here is the promotion of a signed integer to a larger size.  In a nutshell, this promotion can result in a security vulnerability when the sign extension occurs on user-controlled data: a value with its high bit set becomes negative, which can slip past a signed bounds check and then act as a very large number once it is treated as unsigned.  More detailed information regarding sign extension bugs can be found elsewhere, such as the excellent book The Art of Software Security Assessment by Mark Dowd et al. At the assembly level, a typical instance of this bug stems from the instruction "movsx". Using the "Text" search (Alt+T) and entering "movsx" with "Find all occurrences" checked provides us with a neat little window of all sign-extended move operations.
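
To make the problem concrete, here is a small sketch in plain Python (no IDA required) that emulates what "movsx" and its zero-extending cousin "movzx" do to a single byte. The function names, the 64-byte limit, and the flawed check are all made up for illustration; the point is only that a byte of 0x80 or above passes a signed comparison after sign extension, yet is enormous when later used as an unsigned size.

```python
def movsx_byte(value):
    """Emulate movsx r32, r/m8: sign-extend the low 8 bits to 32 bits."""
    value &= 0xFF
    if value & 0x80:                  # high bit set: negative as a signed char
        return value | 0xFFFFFF00     # upper 24 bits are filled with ones
    return value


def movzx_byte(value):
    """Emulate movzx r32, r/m8: zero-extend the low 8 bits to 32 bits."""
    return value & 0xFF


def as_signed32(value):
    """Reinterpret a 32-bit pattern as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


MAX_LEN = 64  # hypothetical buffer size, for illustration only


def passes_signed_check(length_byte):
    """Hypothetical flawed check: signed compare after sign extension."""
    return as_signed32(movsx_byte(length_byte)) <= MAX_LEN


for b in (0x10, 0x7F, 0x80, 0xFF):
    print("byte 0x%02x: movsx 0x%08x, movzx 0x%08x, passes check: %s"
          % (b, movsx_byte(b), movzx_byte(b), passes_signed_check(b)))
```

Had the compiler emitted "movzx" instead, 0x80 would stay 128 and the same check would reject it, which is why each "movsx" hit on user data is worth a look.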



Locating Parsing Functions

If I'm analyzing an application that handles complex data, user input, or configuration information, I will be sure to track down and audit the various routines responsible for parsing that inbound data into data structures that program logic can then be applied to.  Many parsers are implemented with the help of "switch" statements.  At the assembly level, switch statements are commonly implemented as jump tables. IDA does a good job of automatically identifying switch statements.  When a switch is identified it will be commented in a form such as:
2B0016DF    jmp    ds:off_2B001716[eax*4] ; switch jump
Using the same search dialog as before (Alt+T), we can plug in "switch jump" with "Find all occurrences" enabled to produce a list of all switches within the binary.  Taking it one step further, here is a simple IDAPython script that enumerates all switches and, additionally, each switch's number of cases and case addresses:
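# Legacy IDAPython (idc) names, as used when this was written; IDA 7+ renames
# them (e.g. Comment/RptCmt became get_cmt, NextHead became next_head).
from idc import *
from idautils import Functions

switches = []
for funcea in Functions(MinEA(), MaxEA()):
    function_name = GetFunctionName(funcea)
    curea = funcea
    end = GetFunctionAttr(funcea, FUNCATTR_END)
    count, table, cases = 0, BADADDR, []
    while curea < end and curea != BADADDR:
        comment = Comment(curea)
        if comment:
            if 'switch' in comment and 'cases' in comment:
                # e.g. "switch 10 cases": remember the count for the jump that follows
                count = int(comment.split()[1])
            elif 'switch jump' in comment:
                # the indirect jmp through the jump table
                table = curea
                cases = []
                switches.append({'name'  : function_name,
                                 'table' : table,
                                 'count' : count,
                                 'cases' : cases})
        else:
            # case targets carry a repeatable comment: "jumptable <addr> case <n>"
            comment = RptCmt(curea)
            if comment and 'jumptable' in comment:
                fields = comment.split()
                if int(fields[1], 16) == table:
                    cases.append({'loc' : curea,
                                  'tag' : "cases " + fields[3]})
        curea = NextHead(curea, end)

# Report each switch, followed by the case targets found for it
for sw in switches:
    print("%s: %x: %d cases" % (sw['name'], sw['table'], sw['count']))
    for case in sw['cases']:
        print("%s: %x: %s" % (sw['name'], case['loc'], case['tag']))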
Running this IDAPython script produces the following sample output:
  ICMPv6Receive: 251e7: 10 cases
  ICMPv6Receive: 251ee: cases 128
Following the addresses takes you to the individual switch case:

  000251E7    jmp     ds:off_2523B[eax*4] ; switch jump
  000251EE
  000251EE loc_251EE:
  000251EE    push    esi                 ; jumptable 000251E7 case 128
  000251EF    call    ICMPv6SendEchoReply

This approach is especially effective when symbol information is present.
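
If you would rather work from a saved listing than from inside IDA, the same comment matching can be sketched in plain Python. This is only an illustration: it assumes a listing whose lines look like the ones shown above (address, instruction, then a "; switch jump" or "; jumptable ADDR case N" comment), and the helper name find_switches is made up for this example, not part of IDA or IDAPython.

```python
import re

# Assumed line shapes, taken from the listing above:
#   000251E7    jmp     ds:off_2523B[eax*4] ; switch jump
#   000251EE    push    esi                 ; jumptable 000251E7 case 128
SWITCH_RE = re.compile(r'^\s*([0-9A-Fa-f]+)\s.*;\s*switch jump')
CASE_RE = re.compile(
    r'^\s*([0-9A-Fa-f]+)\s.*;\s*jumptable\s+([0-9A-Fa-f]+)\s+(.*)$')


def find_switches(lines):
    """Map each switch jump address to a list of (address, label) case targets."""
    switches = {}
    for line in lines:
        m = SWITCH_RE.match(line)
        if m:
            # Record the switch even if no case targets are found for it
            switches.setdefault(int(m.group(1), 16), [])
            continue
        m = CASE_RE.match(line)
        if m:
            # The comment names the jmp this case belongs to
            table = int(m.group(2), 16)
            switches.setdefault(table, []).append(
                (int(m.group(1), 16), m.group(3).strip()))
    return switches


listing = """\
  000251E7    jmp     ds:off_2523B[eax*4] ; switch jump
  000251EE
  000251EE loc_251EE:
  000251EE    push    esi                 ; jumptable 000251E7 case 128
  000251EF    call    ICMPv6SendEchoReply
""".splitlines()

for table, cases in sorted(find_switches(listing).items()):
    print("%x: %d case target(s) found" % (table, len(cases)))
    for addr, label in cases:
        print("  %x: %s" % (addr, label))
```

Keying the result on the address inside the "jumptable" comment means case targets are attached to the right switch even when several switches are interleaved in the listing.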

Searching for Structure References

This example is definitely gimmicky but worthwhile nonetheless.  IDA does not provide structure cross references.  Without proper symbolic information it's nearly impossible to tell whether an [ecx+4] in one function refers to the same structure as an [ecx+4] in another.  Such is the nature of static reverse engineering. Sometimes, though, you have a pretty good idea of a structure's use in a binary.  For me this occurs most often when looking at network code.  Programmers usually create a structure for storing information about a request.  This typically includes a socket descriptor, buffer size, buffer pointer, and any other information associated with that session.

Once again we can use the search functionality to find accesses to a structure.  I know what you're thinking: "this will never work". But it will in lots of cases.  Searching for the text "+28h" will likely show you any reads or writes at that offset, some of which may well belong to the structure you are concerned with.  Try it, it might be handy.
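
The same idea can be tried outside of IDA on a saved listing. The sketch below is plain Python with a made-up helper name (find_offset_refs) and a made-up sample listing; it assumes 32-bit operands written IDA-style, such as [ecx+28h]. Skipping esp and ebp as base registers is my own guess at a useful filter, since those hits are usually stack variables rather than structure fields.

```python
import re

# Matches IDA-style memory operands such as [ecx+28h] or [ebx+4] and
# captures the base register and the displacement digits.
OPERAND_RE = re.compile(r'\[(e[a-z]{2})\+([0-9A-Fa-f]+)h?\]')


def find_offset_refs(lines, offset, skip_bases=('esp', 'ebp')):
    """Return (base register, line) pairs for accesses at the given offset."""
    hits = []
    for line in lines:
        for base, disp in OPERAND_RE.findall(line):
            # Displacements without an 'h' suffix are below 10, so parsing
            # everything as hex gives the same value either way.
            if int(disp, 16) == offset and base not in skip_bases:
                hits.append((base, line.strip()))
    return hits


# Made-up sample lines, for illustration only
listing = [
    "mov     eax, [ecx+28h]",
    "mov     [esi+28h], edx",
    "mov     eax, [esp+28h]",
    "mov     eax, [ecx+128h]",
    "add     eax, 28h",
    "push    dword ptr [ebx+4]",
]

for base, line in find_offset_refs(listing, 0x28):
    print("%s: %s" % (base, line))
```

As with the text search itself, every hit is only a candidate: two unrelated structures can easily both have a field at offset 28h.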

So there we have it.  We have really only covered one of the search mechanisms IDA provides to its users.  Immediate value searching and Sequence of Bytes searching can be used in very similar cases to the Text search (the movsx instruction can be searched for by its byte values as well).  I hope this can come in handy at some point for someone out there.  Feel free to leave a comment with some other fun ways of using IDA's searching capabilities.

See you next week,

Cody
Tags: MindshaRE
Published On: 2008-06-19 14:22:51

Comments

  1. Anonymous commented on 2008-06-20 @ 01:00

    Good one ;)

    if you could explain about "+28" for structure finding ,it will be more useful to us. what is that value ? opcode ???
    please explain it !!

  2. Cody Pierce commented on 2008-06-20 @ 10:37

    Anonymous,

    Thanks! What I am doing is searching for any use of the structure offset 28h. For instance if a program defines a structure and stores a pointer to user controlled data at struct+28, you can use the search to potentially find all access to that specific offset.

  3. Rolf Rolles commented on 2008-06-23 @ 17:27

    For more information on the structure-offset searching trick, see http://www.openrce.org/blog/view/1030/Binary_Search_in_Large-Scale_Structure_Recovery. My entry's about byte-searching, not text-searching.

