Made of Bugs

Some Notes on CVE-2010-3081 Exploitability

Most of you reading this blog probably remember CVE-2010-3081. The bug got an awful lot of publicity when it was discovered and announced, because it allowed local privilege escalation against virtually all 64-bit Linux kernels in common use at the time.

While investigating CVE-2010-3081, I discovered that several of the commonly-believed facts about the CVE were wrong, and that it was even more broadly exploitable than was publicly documented. I'd like to share those observations here.

A brief review of the bug

The bug arose from the compat_alloc_user_space function in Linux's 32-bit compatibility support on 64-bit systems. compat_alloc_user_space allocates and returns space on the userspace stack for the kernel to use:

static inline void __user *compat_alloc_user_space(long len)
{
    struct pt_regs *regs = task_pt_regs(current);
    return (void __user *)regs->sp - len;
}

This function is only called by compat-mode syscalls, so current is assumed to be a 32-bit process, in which case regs->sp, the user stack pointer, will be a 32-bit quantity. Thus, if we subtract a small len, the result should still fit in 32 bits, which, on a 64-bit system, means it is guaranteed to fall within the user address space.

Because of this, some callers of compat_alloc_user_space were lazy, and did not call access_ok (or a function which called access_ok) to check that the result of compat_alloc_user_space fell within the user address space.

However, it turned out that some call sites in the kernel called compat_alloc_user_space with a user-controlled len value, allowing the subtraction to wrap around. On a 64-bit system, the kernel lives in the top four gigabytes of memory, and so this wraparound is enough for a user to cause compat_alloc_user_space to return a pointer into the kernel's address space.
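To make the wraparound concrete, here is a small userspace model of the arithmetic. It is only a sketch: model_alloc_32bit_task and in_top_4gib are names I made up, addresses are plain integers, and the stack pointer and length values in the usage note are hypothetical.

```c
#include <stdint.h>

/* Userspace model of the subtraction in compat_alloc_user_space for a
 * 32-bit task. The stack pointer fits in 32 bits, but the arithmetic is
 * done at 64-bit width, as it would be for a pointer on a 64-bit kernel.
 * Nothing here touches real kernel memory. */
static uint64_t model_alloc_32bit_task(uint32_t user_sp, uint64_t len)
{
    /* Unsigned subtraction wraps modulo 2^64 when len > user_sp. */
    return (uint64_t)user_sp - len;
}

/* True if addr lies in the top 4 GiB of the 64-bit address space. */
static int in_top_4gib(uint64_t addr)
{
    return addr >= 0xffffffff00000000ULL;
}
```

With a stack pointer of 0xffffd000, a small len such as 0x100 yields 0xffffcf00, comfortably inside the 32-bit range. A len of 0xffffe000, slightly larger than the stack pointer, wraps around to 0xfffffffffffff000, which in_top_4gib reports as lying in the top four gigabytes.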

Moreover, it turned out that the functions that used a user-controlled len also did not check access_ok on the result of the allocation. In particular, Linux 2.6.26 introduced the compat_mc_getsockopt function, which called compat_alloc_user_space with a user-controlled length and then copied user-controlled data to this pointer. It is this function which the public exploit targeted.

Disabling 32-bit binaries doesn't help

When an exploit was released for this bug, many sources circulated a mitigation: Disable 32-bit binaries on a system. Prevent compat-mode processes from running, the logic goes, and you prevent anyone from making a compat-mode syscall that triggers the vulnerable path.

This mitigation indeed prevented the public exploit from working (it included 32-bit inline assembly, and so couldn't even easily be recompiled as a 64-bit binary), and many observers seemed to believe it closed the bug entirely.

However, this was not the case! It turns out, on an amd64 system, a 64-bit process can still make a compat-mode system call using the int $0x80 instruction, which is the traditional 32-bit syscall mechanism! Even though the process is running in 64-bit mode, int $0x80 redirects to the compat-mode syscall table.
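The key point is that the syscall table is chosen by the entry mechanism, not by the mode the process is running in. The toy model below illustrates that with two syscall numbers whose meanings differ between the i386 and x86-64 tables (20 and 39). The enum names and lookup_syscall are hypothetical, and the real kernel dispatch is of course far more involved than a switch statement.

```c
/* How the process entered the kernel. */
enum entry_path { ENTRY_SYSCALL64, ENTRY_INT80 };

/* A tiny subset of handlers, just enough to show the difference. */
enum handler { H_UNKNOWN, H_GETPID, H_WRITEV, H_MKDIR };

/* Toy dispatcher. Note that it takes no "is this a 32-bit process?"
 * argument: only the entry path selects the table. */
static enum handler lookup_syscall(enum entry_path path, int nr)
{
    if (path == ENTRY_INT80) {
        /* compat (i386) numbering */
        switch (nr) {
        case 20: return H_GETPID;
        case 39: return H_MKDIR;
        }
    } else {
        /* native x86-64 numbering */
        switch (nr) {
        case 20: return H_WRITEV;
        case 39: return H_GETPID;
        }
    }
    return H_UNKNOWN;
}
```

In this model, syscall number 20 reaches getpid when it arrives via int $0x80 and writev when it arrives via the native 64-bit path, regardless of what kind of binary issued it.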

After realizing this, modifying the public exploit to work when compiled in 64-bit mode was a simple matter of porting the inline assembly, and changing a small handful of types. I've posted the modified exploit and the diff against the original for the curious.

The integer overflow is totally irrelevant

Once you've realized that you can make compat-mode system calls from a 64-bit process, a little bit of thought reveals something else interesting. compat_alloc_user_space subtracts the len value off of the userspace stack pointer. Previously, we relied on subtracting a large value from a 32-bit stack pointer in order to end up with a kernel pointer. However, while a 32-bit process is limited to a 32-bit stack pointer, a 64-bit process can write a full 64-bit value into %rsp, and thus regs->sp! There's no need for underflow at all – you can just write a 64-bit value into %rsp and do an int $0x80, and make compat_alloc_user_space return any value you please!

The condition for exploitability thus drops from "user-controlled len and no access_ok" to simply "no access_ok".
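A short model shows why the missing access_ok is the whole story once the stack pointer can be a full 64-bit value. This is a sketch under assumptions: MODEL_USER_LIMIT is my stand-in for the top of the user address range (roughly what x86-64 uses with 4-level paging), model_access_ok is a simplified range check rather than the kernel's macro, and the target address in the usage note is hypothetical.

```c
#include <stdint.h>

/* Assumed upper bound of the user address range, for illustration only. */
#define MODEL_USER_LIMIT 0x00007ffffffff000ULL

/* Same subtraction as compat_alloc_user_space, but with a stack pointer
 * that may hold any 64-bit value. */
static uint64_t model_compat_alloc(uint64_t user_sp, uint64_t len)
{
    return user_sp - len;
}

/* Simplified stand-in for access_ok: is [addr, addr + size) entirely
 * below the user limit? Written to avoid overflow in addr + size. */
static int model_access_ok(uint64_t addr, uint64_t size)
{
    return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}

/* With a fixed, small len, the returned pointer is simply sp - len, so
 * any desired result corresponds to a stack pointer of result + len. */
static uint64_t sp_for_result(uint64_t result, uint64_t len)
{
    return result + len;
}
```

For a fixed len of 40 and a desired result of 0xffffffff81000000, the model returns exactly that address with no wraparound involved, and model_access_ok rejects it. A caller that performs the check is safe; a caller that skips it is not, whatever len it passes.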

This is interesting, because it turns out that some very old kernels, before 2.6.11, including RHEL 4, have the following function:

int siocdevprivate_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
        struct ifreq __user *u_ifreq64;

        u_ifreq64 = compat_alloc_user_space(sizeof(*u_ifreq64));

        /* Don't check these user accesses, just let that get trapped
         * in the ioctl handler instead.
         */
        copy_to_user(&u_ifreq64->ifr_ifrn.ifrn_name[0], &tmp_buf[0], IFNAMSIZ);
        __put_user(data64, &u_ifreq64->ifr_ifru.ifru_data);

        return sys_ioctl(fd, cmd, (unsigned long) u_ifreq64);
}

Remember, we can make compat_alloc_user_space return an arbitrary value. The copy_to_user will call access_ok and fail, but that return value will be discarded, and the __put_user will scribble 32 bits of user-controlled data at a user-controlled address. Bingo, local root.
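The pattern of a checked copy whose result is ignored, followed by an unchecked store, can be modeled entirely in userspace. The sketch below uses a toy 8 KiB "address space" whose lower half plays the role of user memory; every model_ name is hypothetical, and the offsets assume the usual struct ifreq layout with a 16-byte name followed by the data field.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_USER_LIMIT 0x1000u   /* toy: below this is "user" memory */
#define MODEL_NAME_LEN   16        /* IFNAMSIZ */

static unsigned char model_mem[0x2000];   /* toy address space */

static int model_access_ok(size_t addr, size_t size)
{
    return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}

/* Like copy_to_user: range-checks first and returns the number of
 * bytes NOT copied (0 on success). */
static size_t model_copy_to_user(size_t dst, const void *src, size_t n)
{
    if (!model_access_ok(dst, n))
        return n;
    memcpy(&model_mem[dst], src, n);
    return 0;
}

/* Like __put_user: stores without any range check. */
static void model_put_user_unchecked(size_t dst, uint64_t val)
{
    memcpy(&model_mem[dst], &val, sizeof val);
}

/* Mirrors the flow of siocdevprivate_ioctl. The copy result is returned
 * here only so a caller can inspect it; the flow itself ignores it, as
 * the kernel function does. */
static size_t model_siocdevprivate(size_t u_ifreq, const char *name,
                                   uint32_t data32)
{
    uint64_t data64 = data32;   /* 32 user-controlled bits, zero-extended */
    size_t not_copied = model_copy_to_user(u_ifreq, name, MODEL_NAME_LEN);

    model_put_user_unchecked(u_ifreq + MODEL_NAME_LEN, data64);
    return not_copied;
}
```

Calling model_siocdevprivate with an address in the upper half of model_mem makes the name copy fail and report all 16 bytes as uncopied, yet the data word is still written at offset 16 from that address. With an address in the lower half, both writes succeed.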

It turns out this function was present in Linux 2.4.x, too, meaning that this exploit even affected RHEL3 and anyone else still running a 2.4-based system!

Using this observation, I've produced a working proof-of-concept exploit for RHEL4, derived from the released exploit for RHEL5. Contact me if you're interested, but it's pretty straightforward.

Closing notes

As far as I know, neither of these facts has been publicly documented prior to this post. I shared this information with Red Hat, and they requested I keep it private until they released fixes for RHEL 3, which happened last week. I would not be at all surprised to learn that someone else has private exploits that incorporate either or both of these observations, though.

One important moral here is that you must be very careful when declaring a system unaffected by a vulnerability, or declaring a mitigation to be complete. Software systems have gotten tremendously complex, and it's often impossible to be totally confident you understand every last way an attacker could tickle a vulnerability.