TippingPoint Digital Vaccine Laboratories

MSRPC NDR Types Technical Overview

Aaron Portnoy and I have finished a presentation at the first annual DeepSec security conference. Our talk, titled "RPC Auditing Tools and Techniques", focused on some new tools and existing methodologies for auditing RPC interfaces.

The main focus of this research was to provide the tools and techniques we use so that others may also be able to audit RPC services. The three components we mentioned were pulling all binaries that include RPC interfaces, dumping their IDL information, and communicating with the endpoint via our PyMSRPC toolkit. This toolkit allows the researcher to focus on auditing instead of the tedious intricacies of the NDR wire representation.

The toolkit works by parsing a provided IDL pulled from the binary and creating the proper Python objects for each data type needed. This includes any structures, unions, and arguments needed to call an RPC function on the remote system. Once the IDL is parsed, one can access any element programmatically and call the requested function.

PyMSRPC abstracts the difficult task of taking all of this information and properly marshalling it for transmission. It does this by implementing several of the rules that the unmarshalling routines in RPCRT4.dll enforce on incoming data. This blog post focuses on those rules and walks through some of the ways different data types are transported by PyMSRPC. This gives some insight into how data is packaged and sent over the wire, and may prove useful if you ever need to build an RPC request manually.
      
Each section is broken down by the data type we are representing via NDR.

Simple Types

Simple types are the basic data types we can send. They are all packed according to the endianness of our wire format. Nothing special to note. The following lists all the simple types.
  • ndr_byte()
  • ndr_char()
  • ndr_small()
  • ndr_usmall()
  • ndr_wchar()
  • ndr_short()
  • ndr_ushort()
  • ndr_long()
  • ndr_ulong()
  • ndr_float()
  • ndr_hyper()
  • ndr_double()
  • ndr_enum16()
  • ndr_enum32()
  • ndr_error_status_t()
  • ndr_int3264()*
  • ndr_uint3264()*
* Unsupported in PyMSRPC
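
To make the packing concrete, here is a minimal sketch using Python's standard struct module. The table and the pack_simple helper are hypothetical names for illustration and are not PyMSRPC's actual API; little-endian byte order is assumed because that is what every wire dump in this post uses.

```python
import struct

# Illustrative table (not PyMSRPC's API): NDR simple type -> struct format.
# "<" selects little-endian with standard sizes, matching the dumps in this post.
SIMPLE_FORMATS = {
    "small": "<b", "usmall": "<B",   # 8-bit
    "short": "<h", "ushort": "<H",   # 16-bit
    "long": "<l", "ulong": "<L",     # 32-bit
    "hyper": "<q",                   # 64-bit
    "float": "<f", "double": "<d",
}

def pack_simple(type_name, value):
    """Pack one simple value into its little-endian wire bytes."""
    return struct.pack(SIMPLE_FORMATS[type_name], value)
```

For example, pack_simple("long", 1) gives the four bytes "\x01\x00\x00\x00".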

Arrays

As expected arrays are a collection of data. However, NDR has four ways to represent an array. The following is a little info about each.

Fixed

The fixed array is the simplest of arrays. It knows how many elements it contains and is packed as you would a simple type. The IDL for a fixed array appears like so.

    byte elem_4[8];

This would pack into a string of 8 bytes like below.

    "\x41\x41\x41\x41\x41\x41\x41\x41"

Conformant

The conformant array is the most common array encountered in an IDL. This array has its size known so it can be block copied by the NDR engine. To do this we have to supply additional information to the NDR representation. The IDL for a conformant array can be any of the following*.

    [size_is(arg_4)] char * arg_3
    [size_is(20)] char * elem_5
    [size_is(elem_2/2)] wchar_t * elem_3
    [size_is(arg_5)] long arg_6[]

As you will notice, the size of the array can either be a static value or taken from another element in the IDL. When we pack this into NDR we have to add this value to the beginning of our request. Example 2, with its 20 (0x14) elements, would appear like so on the wire.

    "\x14\x00\x00\x00" <-- Conformance size
    "\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41" <-- Array data

As can be seen in example 3, simple modifier tokens can also exist in the size specification. These include:

    / 2
    * 2
    - 1
    + 1

* The attribute "max_is" can also be used to denote a conformant array; however, in our experience it is rarely encountered.
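
The conformant layout described above can be sketched in a few lines. This is only an illustration for arrays of one-byte elements; pack_conformant_bytes is a made-up helper name, and alignment of whatever follows the array is left out.

```python
import struct

def pack_conformant_bytes(data):
    """Prefix an array of one-byte elements with its conformance size.

    The conformance size is a 32-bit little-endian element count.
    """
    return struct.pack("<L", len(data)) + data
```

Calling pack_conformant_bytes(b"AAAA") yields the count 4 followed by the four array bytes.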

Varying

The varying array is used to specify how many elements are being transmitted over the wire. This allows only part of an array to be sent, saving some memory. The IDL for a varying array would appear as follows*:

    [length_is(elem_1)] char elem_2[80]

This can be packed on the wire in a similar way as the conformant array, however we also must add an "offset" to the representation so the NDR engine can find the variance values.

    "\x00\x00\x00\x00" <-- Offset to variance size (Almost always 0x0)
    "\x04\x00\x00\x00" <-- Variance size
    "\x41\x41\x41\x41" <-- Array data

* No examples exist of a plain varying array being used in any Microsoft service.
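
A sketch of the varying layout shown above, again for one-byte elements only. The helper name pack_varying_bytes is hypothetical, and the offset defaults to zero since that is the value we almost always see.

```python
import struct

def pack_varying_bytes(data, offset=0):
    """Pack the offset and variance size, then the transmitted elements."""
    return struct.pack("<LL", offset, len(data)) + data
```

With four bytes of "A" this reproduces the three lines of the wire dump above.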

Conformant Varying

Just as expected this array contains both conformant and varying information. This is specified using the "size_is" and "length_is" attributes to an array. This would appear like so in the IDL.
      
    [size_is(elem_10), length_is(elem_9)] wchar_t * elem_11
    [size_is(elem_2/2), length_is(elem_1/2)] short * elem_3
    [size_is(20), length_is(20)] char ** elem_5
      
As previously explained, these values represent the size of the memory allocated (size_is) and the number of elements transmitted (length_is). The wire representation simply specifies both the conformance and varying information. Example 3, with both values set to 20 (0x14) and shown here with one-byte elements, would appear on the wire as follows.
      
    "\x14\x00\x00\x00" <-- Conformant size
    "\x00\x00\x00\x00" <-- Offset to variance data
    "\x14\x00\x00\x00" <-- Varying size
    "\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41" <-- Array data

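Combining the two headers gives the conformant varying form. The sketch below assumes one-byte elements; pack_conformant_varying_bytes and its max_count parameter are names invented for this example. It rejects input where more elements would be transmitted than the declared size allows, which is the kind of mismatch we would expect the unmarshalling code to refuse.

```python
import struct

def pack_conformant_varying_bytes(data, max_count=None, offset=0):
    """Pack conformant size, offset, varying size, then the array data."""
    if max_count is None:
        max_count = len(data)          # size_is equal to length_is
    if offset + len(data) > max_count:
        raise ValueError("transmitted elements exceed the conformant size")
    return struct.pack("<LLL", max_count, offset, len(data)) + data
```

For instance, sending 4 bytes into a buffer declared with a size of 8 packs the values 8, 0, and 4 ahead of the data.
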
Complex

Complex arrays are any arrays that contain elements preventing the NDR engine from block copying the whole array. These have to be parsed element by element. The following elements make an array complex[1].
      
  • simple types: ENUM16, __INT3264 (on 64-bit platforms only), an integral with [range]
  • reference and interface pointers (all pointers on 64-bit platforms)
  • unions
  • complex structures (see the Structure section)
  • elements defined with [transmit_as], [user_marshal]
  • All multidimensional arrays with at least one conformant and/or varying dimension are complex regardless of the underlying element type.

The NDR representation of a complex array is mostly linear to the data it contains. Each array has its own alignment directive. However, in almost all cases this is four byte aligned. This means we have to pad any array we transmit to the nearest dword. For instance
  
    [size_is(2)] char elem_1[]
  
Would have to be two byte padded to pass NDR inspection.
  
    "\x41\x41\xaa\xaa"
  
Where "\xaa" is our padding byte*.

* The padding byte can be anything and is simply skipped during NDR parsing.
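
The padding rule is easy to express in code. The helper below is a hypothetical sketch: it pads a byte string up to the next multiple of the alignment, using "\xaa" as the filler to match the dumps in this post.

```python
def pad_to(data, alignment=4, pad_byte=b"\xaa"):
    """Pad data with pad_byte up to the next multiple of alignment."""
    remainder = len(data) % alignment
    if remainder:
        data += pad_byte * (alignment - remainder)
    return data
```

Data that is already aligned is returned unchanged, so the helper can be applied after every array without checking first.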

Strings

The MIDL string is represented by a special attribute in an IDL. The "[string]" attribute denotes an array of type "char", "wchar_t", or "byte" that is counted at runtime. This is not to say that an array cannot contain certain characters, or inversely, that a string has to be ASCII, but that we are dealing with an array of data that is null terminated. When packing a string we must count the length including the null and supply it to the NDR engine on the wire. Since this is done at runtime we use the same size representation as a conformant varying array. The following are some examples.
  
    [string] wchar_t * elem_1
    [string] char * elem_28
  
Let's say example two holds the string "test". On the wire we would need to add a null character, count the string, and pack the conformant varying information before the actual data.
  
    "\x05\x00\x00\x00"     <-- Conformant size
    "\x00\x00\x00\x00"     <-- Offset to Varying size
    "\x05\x00\x00\x00"     <-- Varying size
    "test\x00\xaa\xaa\xaa" <-- String + Null + Alignment
  
Just as with arrays, we must align to the next dword. There is also a less commonly used string representation in which the size of the array is already known.
  
    [string] char elem_3[64]
    [string] wchar_t elem_1[3]

This is considered a NonConformant string according to rpcrt4 and thus only contains the varying information. Example 2 would appear on the wire as such.
  
    "\x00\x00\x00\x00" <-- Varying size offset
    "\x03\x00\x00\x00" <-- Varying size
    "t\x00e\x00\x00\x00\xaa\xaa" <-- String + Null + Alignment 

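Both string layouts can be produced by one small routine. This is a sketch under a few assumptions: pack_string and its flags are hypothetical names, narrow strings are treated as ASCII, wide strings are encoded as little-endian UTF-16, and the padding byte is the "\xaa" used throughout this post.

```python
import struct

def pack_string(text, wide=False, conformant=True, pad_byte=b"\xaa"):
    """Pack a null-terminated string with its size header and padding."""
    raw = (text + "\x00").encode("utf-16-le" if wide else "ascii")
    count = len(raw) // (2 if wide else 1)      # elements, including the null
    header = struct.pack("<LL", 0, count)       # offset, varying size
    if conformant:
        header = struct.pack("<L", count) + header  # conformant size goes first
    body = header + raw
    return body + pad_byte * (-len(body) % 4)   # align to the next dword
```

pack_string("test") reproduces the first dump, and pack_string("te", wide=True, conformant=False) reproduces the NonConformant one.
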
Unions

A union contains a list of data indexed by a simple data type. There technically exist two types of unions in MIDL, encapsulated and nonencapsulated; however, the former is not used in newer implementations of RPC. The nonencapsulated union is denoted by the declaration of a union type including its switch, case numbers, and resulting data. An example of a union declaration is.
  
    typedef [switch_type( unsigned long )] union union_2 {
     [case(0)]  struct struct_1B * elem_1;
     [case(1)]  struct struct_1C * elem_2;
     [default] ;
    } union_2;
  
This states that a union has been created, keyed off an unsigned long simple type. The union contains two cases, and both are pointers to structures. In practice this union is then used like so.
  
    [switch_is(elem_1)] union union_2 elem_3
  
Here we see that the value in elem_1 selects the case we would like sent out. Packing this on the wire is pretty straightforward. We must first pack the discriminant value, and then the data it points to. For example the following union.
  
    typedef [switch_type( unsigned long )] union union_1 {
     [case(0)]  long elem_1;
     [case(1)]  short elem_2;
     [default] ;
    } union_1;
  
With the following use.
  
    long elem_1
    [switch_is(elem_1)] union union_1 elem_3
  
Would result in the NDR wire representation where elem_1 is 0x0 and elem_1 of the union is 0x52.
  
    "\x00\x00\x00\x00"
    "\x52\x00\x00\x00"
  
Since the union is basically a container for other data, alignment is not done by the union itself but is left up to the type being rendered.
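
As a rough sketch of the example above, the union can be packed as the discriminant followed by whichever arm it selects. The arm table and pack_union are hypothetical names for this illustration; the empty "[default]" arm is modelled as a discriminant with no data after it, and any alignment is left to the caller, as described.

```python
import struct

# Hypothetical arm table for the example union: case number -> struct format.
UNION_ARMS = {0: "<l", 1: "<h"}   # case 0 is a long, case 1 is a short

def pack_union(case, value=None):
    """Pack the discriminant, then the data for the selected arm."""
    packed = struct.pack("<L", case)          # switch_type(unsigned long)
    if case in UNION_ARMS:
        packed += struct.pack(UNION_ARMS[case], value)
    return packed                             # empty default arm adds nothing
```

pack_union(0, 0x52) produces the two dwords shown in the dump above.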

Structures

Structures are the most complex of all the data types in IDL/NDR, since they can contain any of the previously mentioned types as well as other structures. When dealing with structures we have to keep several aspects in mind: alignment, pointers, and arrays. If we satisfy all of those we can represent the structure properly in NDR. The following are the different structures*.

Simple

A simple structure only contains simple types, fixed arrays and any other simple structures. The reason this is considered simple is it can be block copied because the size is already known. An example of a simple structure would be.
      
        typedef struct struct_2D {
         long elem_1;
         long elem_2;
         long elem_3;
         long elem_4;
        } struct_2D ;
      
This would render down like four longs.
      
        "\x01\x00\x00\x00"
        "\x02\x00\x00\x00"
        "\x03\x00\x00\x00"
        "\x04\x00\x00\x00"
      
Alignment is handled at the topmost structure level and is done to the size of the largest data type (usually a dword), so in a case where we had a data type smaller than four bytes we would need to align it.

Simple with Pointers

A simple structure with pointers contains simple types, fixed arrays, pointers, and other simple structures with pointers. It is handled differently from a simple structure because we also have to walk the pointers. This is done by first walking all elements in the structure (and any embedded structures) and rendering each data type or pointer value, then rendering any data that was being pointed to (the deferred data). An example would be.
      
    typedef struct struct_2D {
     long elem_1;
     long elem_2;
     long elem_3;
     long * elem_4;
    } struct_2D ;
      
This would render down like four longs with a pointer to the last long.
      
    "\x01\x00\x00\x00"
    "\x02\x00\x00\x00"
    "\x03\x00\x00\x00"
    "\x41\x42\x43\x44" <-- Pointer value
    "\x04\x00\x00\x00" <-- Deferred data
      
Alignment is handled in the same manner as Simple Structures.
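
The two-pass walk can be sketched as follows for the example structure. The function name and the placeholder constant are made up for this illustration; the placeholder simply reuses the pointer bytes from the dump above. Treating a None value as a NULL pointer (four zero bytes and no deferred data) is an assumption that holds for pointers that are allowed to be NULL.

```python
import struct

POINTER_PLACEHOLDER = b"\x41\x42\x43\x44"   # non-NULL pointer value from the dump

def pack_struct_2d(elem_1, elem_2, elem_3, elem_4):
    """Pack three longs, a pointer value, then the deferred long.

    elem_4 is the long being pointed to, or None for a NULL pointer.
    """
    body = struct.pack("<lll", elem_1, elem_2, elem_3)
    deferred = b""
    if elem_4 is None:
        body += b"\x00\x00\x00\x00"              # NULL pointer, nothing deferred
    else:
        body += POINTER_PLACEHOLDER              # first pass: pointer value
        deferred += struct.pack("<l", elem_4)    # second pass: pointed-to data
    return body + deferred
```

pack_struct_2d(1, 2, 3, 4) gives the five dwords listed above.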

Conformant

A conformant structure contains only base types, fixed arrays, and simple structures, and must contain either a conformant string or a conformant array. This array could actually be contained in another conformant structure or conformant structure with pointers which is embedded in this structure. Because we have conformant information we must also include that in the structure's NDR representation. An example of this would be.
      
    typedef struct struct_19 {
     long elem_1;
     [size_is(elem_1)] long elem_2[];
    } struct_19 ;
      
Since we are dealing with a conformant array we must specify the conformant information at the beginning of the structure. If elem_1 was 0x1 this would be represented as such.
      
    "\x01\x00\x00\x00" <-- Conformant size
    "\x01\x00\x00\x00" <-- elem_1
    "\x02\x00\x00\x00" <-- elem_2 array data
      
If elem_1 was 0x2 we would simply add the array data at the end.
      
    "\x02\x00\x00\x00" <-- Conformant size
    "\x02\x00\x00\x00" <-- elem_1
    "\x03\x00\x00\x00" <-- elem_2 array data[0]
    "\x03\x00\x00\x00" <-- elem_2 array data[1]     

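A sketch of the struct_19 layout: the conformant size is hoisted to the front of the structure, followed by the fields in order. pack_struct_19 is a hypothetical helper, and elem_1 is derived from the list length so the two values cannot disagree.

```python
import struct

def pack_struct_19(elem_2):
    """Pack conformant size, elem_1, then the elem_2 array of longs."""
    count = len(elem_2)
    return (struct.pack("<L", count)                  # conformant size, hoisted
            + struct.pack("<l", count)                # elem_1
            + struct.pack("<%dl" % count, *elem_2))   # elem_2 array data
```

pack_struct_19([3, 3]) matches the second dump above.
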
Conformant with Pointers

A conformant structure with pointers contains only base types, pointers, fixed arrays, simple structures, and simple structures with pointers; a conformant structure must contain a conformant array. This array could actually be contained in another conformant structure or conformant structure with pointers that is embedded in this structure. The difference here is the location of the conformance information. If we have a pointer to a conformant array we put the conformant size in front of the array data. An example is below.
      
    typedef struct struct_36 {
     long elem_1;
     [size_is(elem_1)] long * elem_2;
    } struct_36 ;
      
In this case instead of prepending the conformant size to the structure we will simply include it next to the actual array data as such.
      
    "\x02\x00\x00\x00" <-- elem_1
    "\x41\x42\x43\x44" <-- Pointer
    "\x02\x00\x00\x00" <-- Conformant size
    "\x03\x00\x00\x00" <-- elem_2 array data[0]
    "\x03\x00\x00\x00" <-- elem_2 array data[1]
      
This can get complicated when the structure embeds a conformant array with pointers. All data must first be rendered including the pointer values. After all of that has been done the deferred data is then added including its conformance information.     
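
For comparison with the plain conformant structure, here is the struct_36 layout as a sketch. The names are hypothetical and the pointer placeholder reuses the bytes from the dump; the point to notice is that the conformant size travels with the deferred array data rather than at the front of the structure.

```python
import struct

POINTER_PLACEHOLDER = b"\x41\x42\x43\x44"   # non-NULL pointer value from the dump

def pack_struct_36(elem_2):
    """Pack elem_1 and the pointer, then the deferred conformant array."""
    count = len(elem_2)
    body = struct.pack("<l", count) + POINTER_PLACEHOLDER
    deferred = (struct.pack("<L", count)                  # conformant size
                + struct.pack("<%dl" % count, *elem_2))   # array data
    return body + deferred
```

pack_struct_36([3, 3]) gives the five dwords shown above.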

Conformant Varying

A conformant varying structure contains only simple types, pointers, fixed arrays, simple structures, and simple structures with pointers; a conformant varying structure must contain either a conformant string or a conformant-varying array. The conformant string or array can actually be contained in another conformant structure or conformant structure with pointers that is embedded in this structure. This is the same as the Conformant (and Conformant with Pointers) structures. An example would be.
      
    typedef struct struct_47 {
     long elem_1;
     long elem_2;
     long elem_3;
     long elem_4;
     long elem_5;
     [string] wchar_t * elem_6;
     [string] wchar_t * elem_7;
    } struct_47 ;
      
As previously stated, this would pack all the longs and the pointers to the strings. Then it would include the conformant varying information and the string values. The behavior is duplicated for all pointers and conformant varying data.
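
Here is how that ordering might look for struct_47. Everything named below is hypothetical; both pointers reuse the same placeholder bytes for simplicity, which is an assumption of this sketch, and the strings are encoded as little-endian UTF-16 with "\xaa" padding.

```python
import struct

POINTER_PLACEHOLDER = b"\x41\x42\x43\x44"   # assumed non-NULL pointer value

def pack_wide_string(text):
    """Conformant varying wide string: sizes, data, null, dword padding."""
    raw = (text + "\x00").encode("utf-16-le")
    count = len(raw) // 2
    body = struct.pack("<LLL", count, 0, count) + raw
    return body + b"\xaa" * (-len(body) % 4)

def pack_struct_47(longs, elem_6, elem_7):
    """Pack five longs and two pointers, then the two deferred strings."""
    body = struct.pack("<5l", *longs)
    body += POINTER_PLACEHOLDER * 2                   # elem_6 and elem_7 pointers
    return body + pack_wide_string(elem_6) + pack_wide_string(elem_7)
```

The fixed part of the structure is always 28 bytes here; only the deferred strings change the total length.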

Complex

A complex structure is any structure containing one or more fields that either prevent the structure from being block-copied, or for which additional checking must be performed during marshaling or unmarshaling (for example, bound checks on an enumeration). The following NDR types fall in this category:
  • simple types: ENUM16, __INT3264 (on 64-bit platforms only), an integral with [range]
  • alignment padding at the end of the structure
  • interface pointers (they go using an embedded complex)
  • ignored pointers (that is related to [ignore] attribute and FC_IGNORE token)
  • complex arrays, varying arrays, string arrays
  • multidimensional conformant arrays with at least one nonfixed dimension
  • unions
  • elements defined with [transmit_as], [represent_as], [wire_marshal], [user_marshal]
  • embedded complex structures
  • padding at the end of the structure

Just like complex arrays, these must be processed element by element. The NDR representation is the same as for a complex array and relies on each element to properly align and pack its data.

* Most descriptions were lifted from [2]

Pointers

Various pointer types exist in the MIDL specification, and they are important for RPC communication. For NDR, however, they behave almost identically. The three pointer types are Unique, Full, and Reference pointers[3]. The difference is that Unique and Full pointers can be NULL*.

We have already discussed how pointers behave differently in structures and how they appear on the wire. It is not necessary to go in depth on the intricacies of them. An example of a Unique pointer is below.

    [in][unique][string] wchar_t * arg_1,

When packed, this string is prepended with a pointer value (ndr.py defaults to 0x41424344).

* Other differences exist, but are outside the scope of this document.

Opcodes

Opcodes are simply the actual RPC function. When requesting a particular function on a remote server via RPC, you specify it by its opcode (param number). Each opcode contains any parameters it is expecting via the MIDL "[in]", "[out]", or "[in,out]" declarations. Each parameter can consist of any of the aforementioned types. An example is below.

    /* opcode: 0x00, address: 0x767EB3CE */

    long  NetrCharDevQSetInfo (
     [in][unique][string] wchar_t * arg_1,
     [in, out] struct struct_1 * arg_2,
     [in] long arg_3,
     [out] long * arg_4,
     [in, out][unique] long * arg_5
    );

When we represent this on the wire we are not interested in the "[out]" parameters, as those are what the client will receive. Another difference is the handling of pointer attributes. In structures and arrays we would embed a value to represent the pointer; in an opcode, however, we simply ignore it unless it has the "[unique]" attribute. In those cases we proceed as with any other unique pointer and add a value.
  
Since these elements are passed to a function, we must align each one on the stack boundary. In all cases this would be to a dword size. This ensures the function taking the arguments receives properly aligned data.
  
The method of packing this particular opcode would go as follows.
  1. Pack a value to represent the "[unique]" attribute
  2. Pack the wide character conformant varying string
  3. Pack the contents of struct_1 adhering to the rules of the structure type
  4. Pack a long simple type
  5. Pack a value to represent the "[unique]" attribute
  6. Pack a long simple type

You will notice we skipped "arg_4" yet included "arg_5". This is because an "[out]" parameter is not transmitted, but an "[in, out]" parameter is.
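
The six steps can be sketched as a single function. This is an illustration only: the function name, the placeholder constant, and the string helper are hypothetical, and since the layout of struct_1 is not given here, its already packed bytes are passed in by the caller.

```python
import struct

POINTER_PLACEHOLDER = b"\x41\x42\x43\x44"   # assumed non-NULL pointer value

def pack_wide_string(text):
    """Conformant varying wide string: sizes, data, null, dword padding."""
    raw = (text + "\x00").encode("utf-16-le")
    count = len(raw) // 2
    body = struct.pack("<LLL", count, 0, count) + raw
    return body + b"\xaa" * (-len(body) % 4)

def pack_netr_char_dev_q_set_info(arg_1, packed_struct_1, arg_3, arg_5):
    """Build the request data for opcode 0x00; arg_4 is [out] and is skipped."""
    stub = POINTER_PLACEHOLDER            # 1. value for the [unique] attribute
    stub += pack_wide_string(arg_1)       # 2. wide conformant varying string
    stub += packed_struct_1               # 3. struct_1, packed by the caller
    stub += struct.pack("<l", arg_3)      # 4. long simple type
    stub += POINTER_PLACEHOLDER           # 5. value for the [unique] attribute
    stub += struct.pack("<l", arg_5)      # 6. long simple type
    return stub
```

Every piece appended here is already a multiple of four bytes as long as packed_struct_1 is, so the dword alignment between arguments is preserved.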

There you have it: most of the dirty things that PyMSRPC will handle for you. The source code for all of this will be posted as soon as I get back from touring Europe...

Take care,

-Cody

References:

[1] http://msdn2.microsoft.com/en-us/library/aa373542.aspx
[2] http://msdn2.microsoft.com/en-us/library/aa378695.aspx
[3] http://msdn2.microsoft.com/en-us/library/aa373964.aspx
Published On: 2007-11-24 13:50:18
