TippingPoint Digital Vaccine Laboratories

New Leopard Security Features - Part II: Code Signing

Last week we talked about Address Space Layout Randomization, one of the new security features in Leopard. This week, we’re going to talk about code signing.

Once again, I’m going to attempt to differentiate this blog posting from every other blog posting about the security features of Leopard by actually going into the history of code signing and the science behind it.

So, without further ado, I hereby give you:

Mac OS X 10.5 “Leopard” New Security Features - Part II: Code Signing


Code signing is sort of like taking echinacea or running a non-TippingPoint IPS. Sure, it makes you feel like you’re doing something constructive, but nobody’s really sure it actually helps in any practical way at all.

(It could also potentially cause severe autoimmune reactions and subsequent liver failure and death. The echinacea could probably have bad side effects too.)

All that being said, it’s in vogue to support code signing now, and it is interesting from a math geek’s perspective, so we’re going to go over it. First off, though, we need to cover two important fields of applied cryptography: cryptographic hashes and cryptographic digital signatures.

A hash algorithm is a mathematical procedure that maps a given value to another value, usually of a fixed size. For example, I could have a hash function that maps each letter of the Latin alphabet to a number between 1 and 26. When each possible input results in a unique output, it’s called a one-to-one hash function or a perfect hash function.

In general, however, it’s impossible to construct a perfect hash function when the range of inputs is unknown. If all of the possible hash values are just the numbers between 1 and 26, but the function accepts 52 inputs, there’s going to be a hash collision where two input values map to the same hash value. Hash functions that can result in collisions are called, appropriately enough, imperfect hash functions.
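
Here's a rough Python sketch of that letter example. The function name `letter_hash` is my own invention for illustration; it maps letters to 1 through 26, which is perfect over the 26 lowercase letters and collides as soon as you feed it all 52 upper- and lowercase letters.

```python
import string

def letter_hash(ch: str) -> int:
    """Map a Latin letter to a number between 1 and 26, ignoring case."""
    return string.ascii_lowercase.index(ch.lower()) + 1

# 26 lowercase inputs -> 26 distinct outputs: perfect over this input set.
lower_hashes = {c: letter_hash(c) for c in string.ascii_lowercase}

# 52 inputs (both cases) -> still only 26 possible outputs, so collisions are forced.
all_hashes = {c: letter_hash(c) for c in string.ascii_letters}

print(len(set(lower_hashes.values())), "distinct hashes for 26 inputs")
print(len(set(all_hashes.values())), "distinct hashes for 52 inputs")
```

With 52 inputs and 26 outputs, every hash value is shared by exactly two letters, such as 'a' and 'A'.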

If you think that imperfect hash functions are, well, imperfect, then you'd be, well, wrong. They have a useful property that perfect hash functions don't have: they are one-way functions. That is, they are not reversible - given a hash value I cannot tell you with certainty which input value produced that hash.

For example, take the trivial hash function odd?. The odd? hash function produces only two possible hash values: 0 or 1. If an input number is odd, the hash value is 1, if the input number is even, the hash value is 0. Given the hash value, I can definitely tell you whether or not the number input is even or odd, but I can’t tell you which even or odd number was input.
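
In Python, a minimal sketch of odd? might look like the following. The names `odd_hash` and `preimages` are mine, not anything standard. The second function shows why the hash can't be reversed: many inputs share each hash value.

```python
def odd_hash(n: int) -> int:
    """Return 1 for odd inputs and 0 for even inputs."""
    return n % 2

def preimages(hash_value: int, candidates) -> list:
    """List every candidate input that produces the given hash value."""
    return [n for n in candidates if odd_hash(n) == hash_value]

# Knowing the hash is 1 tells us the input was odd, but not which odd number.
print(preimages(1, range(10)))
```

Over just the numbers 0 through 9 there are already five candidates for each hash value, and over all integers there are infinitely many.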

(A quick aside: one-way functions are not actually proven to exist, but a very large portion of modern cryptography rests on the assumption that they do, so I’m going to assume they do too.)

There is a class of imperfect hash functions called cryptographically secure hash functions. These are hash functions that map a large range of inputs to a large range of hash values, and do so in a way that a small change in the input value results in a wildly different hash value. Good examples of these are SHA-1, MD5, and WHIRLPOOL. MD5 produces 128-bit hash values, SHA-1 produces 160-bit hash values, and WHIRLPOOL produces 512-bit hashes. These example hashes are great, in that changing one bit in the input stream will, on average, change half of the bits in the output hash value.
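
You can watch that one-bit-in, half-the-bits-out behavior with Python's standard hashlib module. This is only a sketch: the input string and the helper name `bits_changed` are my own choices, and it uses SHA-1 because hashlib is guaranteed to ship it.

```python
import hashlib

def bits_changed(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"batmanrules"
# Flip the lowest bit of the first byte; everything else stays the same.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha1(original).digest()
d2 = hashlib.sha1(flipped).digest()

print("digest size in bits:", len(d1) * 8)
print("bits that differ:", bits_changed(d1, d2))
```

The exact count varies from input to input, but it should land somewhere near half of the digest size.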

These cryptographically secure hash functions are not immune to collisions, of course - any time a larger number of inputs is mapped to a smaller number of outputs, collisions are inevitable - but the output sizes of these hash functions are so large that the chances of any two reasonable inputs causing a collision are minuscule.

What’s the point of all this? Well, say I want to download a piece of software. The server gives me the software to download and the MD5 hash of the software. After I’ve downloaded the software, I can compute its MD5 hash. If the hash that I compute matches the hash given by the server, I can be extremely confident that what I got is what was sent. The data was almost certainly not changed - by mistake or intention - in transmission.

This is great, of course, because I can verify that what I downloaded is what they said I downloaded. It would be extremely difficult to find another piece of software that would actually run and produce the same hash value.
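
The download check boils down to a few lines of Python. This is a sketch under some assumptions: the function names are mine, and the downloaded bytes and the published hash are passed in directly rather than fetched from a real server.

```python
import hashlib
import hmac

def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()

def download_is_intact(data: bytes, published_hex: str) -> bool:
    """Compare the hash we compute against the one the server published."""
    return hmac.compare_digest(md5_hex(data), published_hex.strip().lower())

software = b"pretend this is an installer"
published = md5_hex(software)          # what the server would list next to the download

print(download_is_intact(software, published))                 # untouched copy
print(download_is_intact(software + b" plus extras", published))  # altered copy
```

Any change to the bytes, even a single appended space, produces a different hash and fails the check.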

This leads to an example of how to use this sort of thing in a practical security context. A lot of operating systems support code hashing - all of the programs on the system have a hash value computed for them and stored in a database. After that, every time they are run, the MD5 hash is validated. If the value calculated doesn’t match the stored value, then either the program or the hash database has changed. This could be a sign that an attacker broke in and replaced legitimate programs with malicious ones, or just that something got corrupted, but either way, it’s something that merits attention.
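
A bare-bones version of such a hash database might look like this in Python. It's an illustration only: the function names are hypothetical, the "database" is just a dictionary in memory, and a real system would also have to protect the database itself from tampering.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Return the MD5 hash of a file's contents as hex."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def build_baseline(paths) -> dict:
    """Record the current hash of every program we want to watch."""
    return {str(p): file_hash(Path(p)) for p in paths}

def changed_files(baseline: dict) -> list:
    """Return the watched files that are missing or no longer match their stored hash."""
    changed = []
    for name, stored in baseline.items():
        path = Path(name)
        if not path.is_file() or file_hash(path) != stored:
            changed.append(name)
    return changed
```

You would call `build_baseline` once on a system you trust, then call `changed_files` before running anything and investigate whatever it returns.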

Another application of these hash functions is password storage. For example, my computer never stores my password (which is "batmanrules", for the record), anywhere. If it did, someone who broke into my computer could read my password (batmanrules) and know what it is. Instead, my computer stores the hash value of my password (batmanrules). When I log in, it calculates the hash of what I type and compares that; it immediately discards the actual password (batmanrules) that I typed in.
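
Sketched in Python, the login check looks something like this. I'm assuming SHA-256 and made-up function names here, and I'm leaving out the salting and deliberately slow hashing that real password stores typically add, so treat it as the bare idea and nothing more.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the hex hash of a password; the password itself is never stored."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Only the hash is kept on disk.
stored_hash = hash_password("batmanrules")

def check_login(typed: str) -> bool:
    """Hash what was typed and compare it to the stored hash."""
    return hmac.compare_digest(hash_password(typed), stored_hash)

print(check_login("batmanrules"))
print(check_login("jokerrules"))
```

Someone who steals `stored_hash` gets a string of hex digits, not the password (batmanrules).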

So, how does all this relate to code signing? Well, we already saw a primitive form of code signing in the example above. Hash values of program code are stored, and if they change, we know. If we trust that they were non-malicious when we first calculated the hashes, then we know they’re still non-malicious.

Note the big if in the paragraph above. If we trust that the files were non-malicious when we first calculated the hashes, then we know they’re still non-malicious. The question is, how do we know that the code at the very beginning was non-malicious? Well, it all comes down to that weakest of links: human trust. We trust that the code on the system was non-malicious when we first installed it, because we installed it from CDs sent by the manufacturer, which presumably were untampered with in transit.

While trusting code loaded directly from physical media may be a reasonable leap to make for most people, what about code being downloaded from the internet? Well, if you trust the owner of the website where you downloaded the code, then, sure, it’s trustworthy...right? Right???

Well, what if the code was tampered with in transit? What if you accidentally typed in “Wicrosoft.com” instead of “Microsoft.com”? I don’t know about you, but I don’t trust those guys any farther than I can throw them. Also, the Wicrosoft guys are probably pretty bad too.

(See what I did there? Man, that joke’s never going to get old.)

That’s where code signing comes into play. Code signing can tell me that, not only has the code not changed in transit, but it really does come from the person it says it does.

How does it do that? By a little bit of applied cryptography known as digital signatures. (Technically, most signatures are digital, unless you somehow don’t write using your fingers. But I digress.) Digital signatures rely heavily on the cryptographic hash functions discussed above, but they also toss something new into the mix: public key cryptography.

In your normal, everyday cryptography, things are encrypted using a single key (password, passphrase, call it what you will). Anyone who knows the password can decrypt the data. Usually, this sort of encryption works by taking the bits of the key and combining them in creative ways with the bits of the data to be encrypted. This type of cryptography is called symmetric because one key can be used for both encryption and decryption.

Public key cryptography is different - it’s asymmetric. There are two keys. Usually, it works like this: something encrypted with key A can only be decrypted with the key B, and something encrypted with key B can only be decrypted with key A.

Think about this for a second. This opens up all sorts of interesting possibilities. Say, for example, that I generate two keys. I designate one of them my public key and one of them my private key. The public key I spread far and wide, like so many phone numbers on so many bathroom walls. I want everyone to have my public key. Put it on billboards for all I care.

Why? Because I still have the private key in my possession. That key I keep closer to me than the One True Ring was kept to Gollum. It’s my precious. I talk to it. (I also fear the sun, but that has more to do with my inherent nerdiness than any effects of magic rings.) Now that there are these two keys, an interesting thing becomes possible: If I encrypt a message with my private key and send it somewhere, people with my public key can decrypt it. However, since they used my public key to decrypt it, they know that it could only have been encrypted with my private key. That means that it could have only come from me.

Therein lies the basis of code signing. You have the public keys of software vendors that you trust. The code comes to you encrypted, and tells you who it’s from. If you can use their public key to successfully decrypt it, you can be sure it really did come from them. Cryptographic hash functions are then employed to ensure that the data wasn’t changed since it was first encrypted.
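
To make the private-key-signs, public-key-verifies idea concrete, here is a toy sketch in Python using textbook RSA. Everything in it is an assumption on my part: the primes, the exponent, and the function names are picked for illustration, the key is far too small for real use, and it signs a hash of the code rather than the code itself, which is how signatures are commonly built in practice. It is not how Leopard's codesign is implemented.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # modulus, part of both keys
E = 65537                              # public exponent: spread it far and wide
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent: my precious (needs Python 3.8+)

def digest_as_int(data: bytes) -> int:
    """Hash the data and squeeze the hash into the range the toy key can handle."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Transform the hash with the private key; only the key owner can do this."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Undo the transform with the public key and compare against a fresh hash."""
    return pow(signature, E, N) == digest_as_int(data)

code = b"#!/bin/sh\necho hello\n"
signature = sign(code)
print(verify(code, signature))
print(verify(code + b"rm -rf /\n", signature))
```

The first check passes because the code and signature match; the second fails because the code changed after it was signed.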

(As a quick digression, usually digital certificates are used to ensure the identity of the issuer of the public keys. All these certificates do is say that the person issuing the certificate believes that the public key came from the people it says it does. Of course, this depends on your trusting the issuer of the certificate. Eventually, you have to trust someone.)

That trust is the biggest problem, though. If you choose to trust “Evil Malicious Company, Inc (a Division of General Ethical Alignments, Inc)” then all the code signing in the world won’t save you from their dastardly deeds. One saving grace, however, is that your operating system vendor may pick up some of the slack for you here. They might give you a bunch of public keys from vendors they trust. So long as you trust your OS vendor’s judgment in issuing signing keys to software vendors, you can trust that nobody malicious is going to run signed code on your machine.

Now, after all that: How well does code signing work in Leopard? That’s a simple question with a complicated answer. On the one hand, the code signing itself works just fine: the algorithms and methods to implement it are well understood and seem to be applied appropriately.

Apple seems to have signed most of the code that came with the operating system. You can examine signed code using Apple’s “codesign” command from the command prompt. Let’s take a look:

Last login: Wed Nov 14 13:20:03 on ttys000
jking@kremvax:~
$ codesign
Usage: codesign -s identity [-fv*] [-o flags] [-r reqs] [-i ident] path ... # sign
       codesign -v [-v*] [-R testreq] path|pid ... # verify
       codesign -d [options] path ... # display contents
       codesign -h pid ... # display hosting paths


So far so good. Now, let’s look at a signed binary:

jking@kremvax:~
$ codesign --display -v -v -v  /bin/ls
Executable=/bin/ls
Identifier=com.apple.ls
Format=Mach-O universal (i386 ppc7400)
CodeDirectory v=20001 size=257 flags=0x0(none) hashes=8+2 location=embedded
Signature size=4064
Authority=Software Signing
Authority=Apple Code Signing Certification Authority
Authority=Apple Root CA
Info.plist=not bound
Sealed Resources=none
Internal requirements count=0 size=12


Okay, once again, so far so good. The ‘/bin/ls’ command was signed by someone called the Apple Code Signing Certification Authority. They sound legit.

Now, let’s verify the signed code:

jking@kremvax:~
$ codesign --verify -v -v -v /bin/ls
/bin/ls: valid on disk
/bin/ls: satisfies its Designated Requirement

Yep, it’s “valid on disk”. Looks like it’s signed and hasn’t been changed since the bits were hand-crafted at Apple. Now let’s look at a version of ‘/bin/ls’ that I modified (called, of course, 'ls.evil'):

jking@kremvax:~
$ codesign --verify -v -v -v ./ls.evil 
./ls.evil: code or signature modified

“Code or signature modified?!?!?” Looks like they caught my modifications. Good for them!



However, here’s the kicker:

jking@kremvax:~
$ ./ls.evil -l
total 144
drwx------+  7 jking  staff    238 Nov 14 13:30 Desktop
drwx------+  8 jking  staff    272 Nov  2 18:32 Documents
drwx------+  7 jking  staff    238 Nov 13 23:50 Downloads
drwx------+ 37 jking  staff   1258 Nov  6 13:24 Library
drwx------+  3 jking  staff    102 Oct 28 22:29 Movies
drwx------+  4 jking  staff    136 Oct 28 20:25 Music
drwx------+  7 jking  staff    238 Oct 28 13:54 Pictures
drwxr-xr-x+  5 jking  staff    170 Oct 28 12:32 Public
drwxr-xr-x+  5 jking  staff    170 Oct 28 12:32 Sites
drwxr-xr-x  23 jking  staff    782 Nov 13 12:02 Source
drwxr-xr-x   5 jking  staff    170 Nov 14 10:47 Temp
-rwxr-xr-x   1 jking  staff  73696 Nov 14 13:22 ls.evil



Oops. My modified code executed without a hitch. Now, I can understand running unsigned code without complaint, since most people haven’t started signing their code yet. However, to execute code with an invalid signature, well, that’s a horse of a different color.

So, they don't stop code with invalid signatures from running. Where does the code signing come into play?

Code signing in Leopard comes into play in ways that make me scratch my head. I understand what they're doing, but I don't like how they're doing it. Signed code gets two privileges that unsigned code doesn't get: automatic firewall bypass, and automatic updates. In other words, signed code can change the firewall configuration and install updates without prompting the user.

Thus far, all the updates I've gotten via Software Update have prompted me, but that may not always be the case in the future. Third party updates may happen completely without my knowledge. That scares me.

Maybe code signing really will make vendor updates easier, but it certainly isn’t lighting my fire from a security standpoint. Sure, if I wanted to I could go to the command line and examine each piece of code I download and verify its pedigree, but Mac OS X doesn’t do it for me when it launches the program. I modified several binaries - both traditional Unix commands and Cocoa applications, and both launched without complaint but with invalid signatures.

Apple could make the code signing feature a hundred times more useful by complaining before running code with invalid signatures. Apple could make the code signing feature a thousand times more useful by not running unsigned code without prompting the user first. Obviously, the second feature will have to wait until most vendors are signing their code, but the first feature...there’s no excuse to have excluded that this time.
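
For the curious, a do-it-yourself version of that check can be sketched as a small Python wrapper around the same codesign command used above. The function names are hypothetical, it only makes sense on a Mac, and it relies on `codesign --verify` exiting with status 0 when verification succeeds. Note that it is stricter than my first suggestion: unsigned code also fails verification, so this wrapper refuses to run it too.

```python
import subprocess

def signature_ok(path: str, run=subprocess.run) -> bool:
    """Return True if `codesign --verify` exits with status 0 for path."""
    result = run(["codesign", "--verify", path], capture_output=True)
    return result.returncode == 0

def launch_if_valid(path: str, args=(), run=subprocess.run) -> bool:
    """Run the program only if its signature verifies; otherwise complain."""
    if not signature_ok(path, run=run):
        print(f"refusing to run {path}: signature check failed")
        return False
    run([path, *args])
    return True
```

Something like `launch_if_valid("./ls.evil", ["-l"])` would refuse to run my modified binary, while `launch_if_valid("/bin/ls", ["-l"])` would go ahead. The `run` parameter exists only so the logic can be exercised without a Mac.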

The ultimate bit of usefulness that could come from code signing is actually integrating it with the sandboxing system (to be discussed next week). Having the ability to assign different levels of privilege dynamically to applications based on their signatures would be awesome. I would also like a unicorn.

So, there you have it. My conclusion: Code signing in Leopard is a no-op. It doesn’t do much good right now, and given the surreptitious nature of the potential privileges granted to signed code, it might actually do some harm. I suppose I’m happy that the foundations have been laid - in the future it could turn into something very helpful.

Code signing in Leopard: interesting in theory, lacking in practice. Here’s hoping it evolves into something awesome.

Okay, kids, next week we’ll be discussing Apple’s sandboxing technology. Be sure to bring your pail and spade.
Published On: 2007-11-21 15:43:16

Comments

  1. Rob Keniger commented on 2008-06-29 @ 05:41

    You can prevent an app from launching if it has been modified since signing by setting the kill option when you sign it:

    codesign -o kill -s "Signing Identity" /path/to/app

    If you try launching a modified app that was signed in such a way, launchd will just kill it immediately.

