Note: If you want to test this dissection out for yourself, please note that I take no responsibility for any accounts that get banned for Terms of Service violations. And no, I don't have an active account anymore; I '/wowquit' on my own terms long ago.
To understand the gaming protocol, one must understand the overall process that the game goes through. Most MMO games will first do a file and version check, then download a patch if necessary. After the client is up to date, the game starts and the user can enter their account information to authenticate with the login server. If successful, they are presented with a list of game servers to choose from, and choosing one starts a new network connection to that game server. "World of Warcraft" runs this process slightly differently: the authentication process comes first, then the client version check, and then the hand-off to the game server. Some games will use standard protocols, such as HTTP for patch updates or HTTPS for authentication, but the game communication itself can be gibberish to the untrained eye. "World of Warcraft" uses its own protocol on TCP port 3724 for the authentication and game communication. Patch updates are transferred over the BitTorrent protocol.
A basic understanding of computer science is required for this type of research. When writing code or stepping through a debugger, one has to keep in mind that data typically comes in 4-byte segments, but may also come as a single byte, 2 bytes, or larger sizes. In most file formats, a string is either terminated with a null (00) byte or preceded by a length value, so that the program reading the file knows how far to look. Multi-byte values can also be written in different byte orders, which is referred to as endianness. If these concepts are new to you, you may want to read more about them.
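As a rough illustration of those three ideas, here is a minimal Python sketch. The byte values and the helper names read_cstring and read_pstring are made up for this example and are not taken from the game client.

```python
import struct

# The same two bytes read in both byte orders.
raw = bytes.fromhex("3412")
little = struct.unpack("<H", raw)[0]  # low byte first:  0x1234
big = struct.unpack(">H", raw)[0]     # high byte first: 0x3412
print(little, big)


def read_cstring(buf, offset):
    """Read a null-terminated string; return (data, offset after the 00 byte)."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end], end + 1


def read_pstring(buf, offset):
    """Read a string prefixed by a 1-byte length; return (data, next offset)."""
    length = buf[offset]
    start = offset + 1
    return buf[start:start + length], start + length


print(read_cstring(b"abc\x00rest", 0))
print(read_pstring(b"\x03abcrest", 0))
```

Both string helpers return the offset of the next unread byte, which makes it easy to walk through a buffer field by field.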
Here is an example of the first packet sent by the Warcraft client in the authentication session. The TCP/IP headers have been omitted.
0030 02 22 00 57 6f 57 00 01 07 .".WoW...
0040 00 3f 12 36 38 78 00 6e 69 57 00 53 55 6e 65 98 .?.68x.niW.SUne.
0050 fe ff ff c0 a8 01 aa 04 54 45 53 54 ........TEST
You already know that this is a Warcraft game packet, so the "WoW" string should stand out. How about the other bytes? Looking toward the beginning, we see that 3 bytes precede the "WoW" string. We might assume that 2 of those bytes belong together and the third stands alone, possibly using its bits as flags. But which, and to what purpose? "02 22" can be 8706 decimal or 546 decimal, depending on the endianness. Neither value makes much sense for a packet that is only 37 bytes long. How about the next byte pair, "22 00"? That's a nice small number in little endian: only 34 decimal. Could it be a length? Counting 34 bytes into the packet after those bytes, we arrive at the very end of the packet. So perhaps these two bytes are the length of the rest of the packet. By gathering more packet traces using different account names, we can see the same pattern emerge, and we can deduce that the 2nd and 3rd bytes are length bytes. Observing how small changes affect the data is essential in this method of research. One might try a different account name each time, a different client version, a different IP address, or even a different desktop system entirely to induce changes in the data.
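The length guess above can be checked mechanically. The following Python sketch takes the packet bytes from the dump, reads each candidate byte pair in both byte orders, and compares the result with the number of bytes that follow the pair. The helper name candidate_lengths is my own invention for this illustration.

```python
import struct

# The first authentication packet from the dump, TCP/IP headers omitted.
packet = bytes.fromhex(
    "02 22 00 57 6f 57 00 01 07 "
    "00 3f 12 36 38 78 00 6e 69 57 00 53 55 6e 65 98 "
    "fe ff ff c0 a8 01 aa 04 54 45 53 54"
)


def candidate_lengths(buf, offset):
    """Return the 2 bytes at offset read as (little endian, big endian)."""
    little = struct.unpack_from("<H", buf, offset)[0]
    big = struct.unpack_from(">H", buf, offset)[0]
    return little, big


for offset in (0, 1):
    little, big = candidate_lengths(packet, offset)
    remaining = len(packet) - (offset + 2)  # bytes left after this pair
    print(f"offset {offset}: little={little} big={big} remaining={remaining}")
```

Only the pair at offset 1, read as little endian, matches the number of remaining bytes. A single match could be a coincidence, which is why repeating the check across several captures matters.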
After enough data gathering and analysis, we can easily deduce the following packet schema from the game client (version 1.7.0.4671):
0030 02 22 00 57 6f 57 00 01 07 .".WoW...
0040 00 3f 12 36 38 78 00 6e 69 57 00 53 55 6e 65 98 .?.68x.niW.SUne.
0050 fe ff ff c0 a8 01 aa 04 54 45 53 54 ........TEST
In little endian:
02             Packet Type
22 00          Length of rest of packet
WoW 00         Packet Identifier: "WoW"
01             Major Version number of client program
07             Minor Version number of client program
00             Patch Version of client program
3f 12          Build Number of client ("4671")
68x 00         Client Processor (either "x86" or "PPC")
niW 00         Client Operating system (either "Win" or "OSX")
SUne           Client Language setting ("enUS" here)
98 fe ff ff    Unknown
c0 a8 01 aa    Client's real IP address
04             Length of the following Account Name
TEST           Account Name
The 4 unknown bytes are of little importance to the identification of the protocol. We could use a debugger if we really wanted to know what the bytes are used for, but it is not necessary as we have plenty of labeled data to work with. You can see that the password for the account is not sent here, but is rather sent later after a response from the authentication server. For those wondering, the password is not sent "in the clear" for all to see like the account name.
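To make the schema concrete, here is a minimal Python sketch that decodes the captured packet field by field. The function name, the dictionary keys, and the helper unreverse are my own labels, not names used by the game. The sketch also assumes that the reversed strings only need flipping and that the IP bytes read left to right as a dotted quad. It reflects only what was observed for client version 1.7.0.4671.

```python
import struct

PACKET = bytes.fromhex(
    "02 22 00 57 6f 57 00 01 07 "
    "00 3f 12 36 38 78 00 6e 69 57 00 53 55 6e 65 98 "
    "fe ff ff c0 a8 01 aa 04 54 45 53 54"
)


def unreverse(raw):
    """Drop any trailing null and flip a reversed string such as b'68x\\x00'."""
    return raw.rstrip(b"\x00")[::-1].decode("ascii")


def parse_first_auth_packet(buf):
    # 1-byte packet type, then a 2-byte little-endian length.
    packet_type, length = struct.unpack_from("<BH", buf, 0)
    if length != len(buf) - 3:
        raise ValueError("length field does not match packet size")
    # Fixed-size fields in schema order ("<" means little endian, no padding).
    (ident, major, minor, patch, build, cpu, os_name, lang,
     unknown, ip, name_len) = struct.unpack_from("<4s3BH4s4s4s4s4sB", buf, 3)
    name_start = 3 + struct.calcsize("<4s3BH4s4s4s4s4sB")
    name = buf[name_start:name_start + name_len]
    return {
        "type": packet_type,
        "identifier": ident.rstrip(b"\x00").decode("ascii"),
        "version": f"{major}.{minor}.{patch}",
        "build": build,
        "cpu": unreverse(cpu),
        "os": unreverse(os_name),
        "language": unreverse(lang),
        "unknown": unknown.hex(),  # purpose not identified, kept raw
        "ip": ".".join(str(b) for b in ip),
        "account": name.decode("ascii"),
    }


fields = parse_first_auth_packet(PACKET)
for key, value in fields.items():
    print(f"{key}: {value}")
```

Keeping the unknown bytes as raw hex avoids guessing at their meaning while still making them easy to compare across captures.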
I hope this gives some insight into how to read this and other "unknown" protocols. As George would say: "I have no end for this, so I will take a bow."
