Most of you reading this blog probably remember CVE-2010-3081. The bug got an awful lot of publicity when it was discovered and announced, because it allowed local privilege escalation against virtually all 64-bit Linux kernels in common use at the time.
While investigating CVE-2010-3081, I discovered that several of the commonly-believed facts about the CVE were wrong, and it was even more broadly exploitable than was publicly documented. I’d like to share those observations here.
A brief review of the bug
The bug arose from the `compat_alloc_user_space` function in Linux’s 32-bit compatibility support on 64-bit systems. `compat_alloc_user_space` allocates and returns space on the userspace stack for the kernel to use:
```c
static inline void __user *compat_alloc_user_space(long len)
{
	struct pt_regs *regs = task_pt_regs(current);
	return (void __user *)regs->sp - len;
}
```
This function is only called by compat-mode syscalls, so `current` is assumed to be a 32-bit process, in which case `regs->sp`, the user stack pointer, will be a 32-bit quantity. Thus, if we subtract a small `len`, the result should still fit in 32 bits, which, on a 64-bit system, means it is guaranteed to fall within the user address space.
Because of this, some callers of `compat_alloc_user_space` were lazy, and did not call `access_ok` (or a function which called `access_ok`) to check that the result of `compat_alloc_user_space` fell within the user address space.
However, it turned out that some call sites in the kernel called `compat_alloc_user_space` with a user-controlled `len` value, allowing the subtraction to wrap around. On a 64-bit system, the kernel lives in the top four gigabytes of memory, and so this wraparound is enough for a user to cause `compat_alloc_user_space` to return a pointer into the kernel’s address space.
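To make the wraparound concrete, here is a minimal userspace sketch of the arithmetic, not kernel code. The names `model_compat_alloc` and `is_32bit_address` are hypothetical, and the stack pointer and length values used with them are illustrative assumptions rather than values taken from any real exploit.

```c
#include <stdint.h>

/* Hypothetical model of the subtraction in compat_alloc_user_space.
 * Unsigned 64-bit arithmetic is used so the wraparound is well defined
 * (it wraps modulo 2^64). */
uint64_t model_compat_alloc(uint64_t sp, uint64_t len)
{
    return sp - len;
}

/* True when the address still fits in 32 bits, i.e. the case the
 * "lazy" callers were implicitly relying on. */
int is_32bit_address(uint64_t addr)
{
    return addr <= UINT32_MAX;
}
```

With an assumed 32-bit stack pointer of `0xffffd000`, a small `len` of `0x100` gives `0xffffcf00`, which still fits in 32 bits. A `len` of `0xffffe000`, which is larger than the stack pointer, gives `0xfffffffffffff000`, an address at the very top of the 64-bit address space.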
Moreover, it turned out that the functions that used a user-controlled `len` also did not check `access_ok` on the result of the allocation. In particular, Linux 2.6.26 introduced the `compat_mc_getsockopt` function, which called `compat_alloc_user_space` with a user-controlled length and then copied user-controlled data to this pointer. It is this function which the public exploit targeted.
Disabling 32-bit binaries doesn’t help
When an exploit was released for this bug, many sources circulated a mitigation: Disable 32-bit binaries on a system. Prevent compat-mode processes from running, the logic goes, and you prevent anyone from making a compat-mode syscall that triggers the vulnerable path.
This mitigation indeed prevented the public exploit from working (it included 32-bit inline assembly, and so couldn’t even easily be recompiled as a 64-bit binary), and many observers seemed to believe it closed the bug entirely.
However, this was not the case! It turns out that, on an amd64 system, a 64-bit process can still make a compat-mode system call using the `int $0x80` instruction, which is the traditional 32-bit syscall mechanism! Even though the process is running in 64-bit mode, `int $0x80` redirects to the compat-mode syscall table.
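As a harmless illustration of that entry path, the sketch below issues `getpid` through `int $0x80` from a program compiled in 64-bit mode. It assumes GCC or Clang inline assembly on x86-64 Linux; the function name `getpid_via_int80` and the macro `I386_NR_GETPID` are hypothetical names chosen for this example. The syscall number is looked up in the 32-bit table, where `getpid` is 20, rather than the native 64-bit table, where it is 39.

```c
/* getpid's number in the 32-bit (i386) syscall table. */
#define I386_NR_GETPID 20

/* Calls getpid through the legacy int $0x80 gate.
 * Returns -1 when not built for x86-64 Linux. */
long getpid_via_int80(void)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile(
        "int $0x80"
        : "=a"(ret)                    /* result comes back in %rax */
        : "0"((long)I386_NR_GETPID)    /* syscall number goes in %eax */
        /* r8-r11 are listed as clobbered because some kernel
         * versions did not preserve them across int $0x80. */
        : "memory", "r8", "r9", "r10", "r11");
    return ret;
#else
    return -1;
#endif
}
```

On a kernel built with 32-bit emulation, calling this from a 64-bit program should return the same value as `getpid()`. On a kernel without it, the instruction is expected to fault instead, so treat this as a sketch to experiment with rather than portable code.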
After realizing this, modifying the public exploit to work when compiled in 64-bit mode was a simple matter of porting the inline assembly, and changing a small handful of types. I’ve posted the modified exploit and the diff against the original for the curious.
The integer overflow is totally irrelevant
Once you’ve realized that you can make compat-mode system calls from a 64-bit process, a little bit of thought reveals something else. `compat_alloc_user_space` subtracts the `len` value off of the userspace stack pointer. Previously, we relied on subtracting a large value from a 32-bit stack pointer in order to end up with a kernel pointer. However, while a 32-bit process is limited to a 32-bit stack pointer, a 64-bit process can write a full 64-bit value into `%rsp`, and thus `regs->sp`! There’s no need for underflow at all: you can just write a 64-bit value into `%rsp` and do an `int $0x80`, and `compat_alloc_user_space` will return any value you please!
The condition for exploitability thus drops from “user-controlled `len` and no `access_ok`” to simply “no `access_ok`”.
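The following userspace sketch models why a range check on the result is the part that matters. It is not the kernel's `access_ok`: `model_range_ok` and `model_alloc_from_sp` are hypothetical names, and `MODEL_USER_LIMIT` is an assumed, simplified upper bound for user addresses on x86-64.

```c
#include <stdint.h>

/* Assumed upper bound of the user address range; a simplification. */
#define MODEL_USER_LIMIT 0x00007ffffffff000ULL

/* Same subtraction as compat_alloc_user_space, modelled on plain integers. */
uint64_t model_alloc_from_sp(uint64_t sp, uint64_t len)
{
    return sp - len;
}

/* Simplified stand-in for an access_ok-style check: the whole range
 * [addr, addr + size) must lie below the assumed user limit.
 * Written to avoid overflow in addr + size. */
int model_range_ok(uint64_t addr, uint64_t size)
{
    return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}
```

With a small fixed `len` of 40, an assumed 32-bit stack pointer of `0xffffd000` yields `0xffffcfd8`, which passes the check. An assumed 64-bit stack pointer of `0xffffffff81000000` yields `0xffffffff80ffffd8`, which fails it. No wraparound was involved in the second case, so only the range check stands between the caller and a kernel-range pointer.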
This is interesting, because it turns out that some very old kernels, before 2.6.11, including RHEL 4, have the following function:
```c
int siocdevprivate_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
	struct ifreq __user *u_ifreq64;
	/* ... */
	u_ifreq64 = compat_alloc_user_space(sizeof(*u_ifreq64));
	/* Don't check these user accesses, just let that get trapped
	 * in the ioctl handler instead.
	 */
	copy_to_user(&u_ifreq64->ifr_ifrn.ifrn_name, &tmp_buf, IFNAMSIZ);
	__put_user(data64, &u_ifreq64->ifr_ifru.ifru_data);
	return sys_ioctl(fd, cmd, (unsigned long) u_ifreq64);
}
```
Remember, we can make `compat_alloc_user_space` return an arbitrary pointer. `copy_to_user` will call `access_ok` and fail, but that return value will be discarded, and the `__put_user` will scribble 32 bits of user-controlled data at a user-controlled address. Bingo.
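The pattern of a checked copy whose result is ignored, followed by an unchecked store, can be modelled in a toy address space. Everything here is hypothetical: a 128-byte array stands in for memory, with the first 64 bytes playing the role of user space and the rest playing the role of kernel space, and the `toy_*` functions only imitate the general behaviour of the kernel helpers they are named after.

```c
#include <stdint.h>
#include <string.h>

enum { TOY_USER_SIZE = 64, TOY_TOTAL_SIZE = 128 };

unsigned char toy_memory[TOY_TOTAL_SIZE];

/* Like copy_to_user: range-checks first, returns the number of bytes NOT copied. */
size_t toy_copy_to_user(size_t off, const void *src, size_t n)
{
    if (n > TOY_USER_SIZE || off > TOY_USER_SIZE - n)
        return n;                       /* rejected: nothing written */
    memcpy(&toy_memory[off], src, n);
    return 0;
}

/* Like __put_user: no user/kernel range check at all. */
void toy_put_user_unchecked(size_t off, uint32_t value)
{
    if (off > TOY_TOTAL_SIZE - sizeof value)
        return;                         /* only keeps the toy itself memory-safe */
    memcpy(&toy_memory[off], &value, sizeof value);
}

/* Mirrors the shape of the quoted ioctl path: the checked copy's result
 * is discarded, then the unchecked store runs anyway. */
void toy_ioctl_path(size_t off, const char name[16], uint32_t data)
{
    (void)toy_copy_to_user(off, name, 16);
    toy_put_user_unchecked(off + 16, data);
}

/* Helper for inspecting the toy memory. */
uint32_t toy_read32(size_t off)
{
    uint32_t v;
    memcpy(&v, &toy_memory[off], sizeof v);
    return v;
}
```

Calling `toy_ioctl_path(80, name, 0xdeadbeef)` leaves the name bytes at offset 80 untouched, because the checked copy refuses the "kernel" offset, yet the 32-bit value still lands at offset 96. Checking the return value of the first call and bailing out would have stopped the second write.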
It turns out this function was present in Linux 2.4.x, too, meaning that this exploit even affected RHEL3 and anyone else still running a 2.4-based system!
Building on this, I’ve produced a working proof-of-concept exploit for RHEL4, derived from the released exploit for RHEL5. Contact me if you’re interested, but it’s pretty straightforward.
Closing notes
As far as I know, neither of these facts has been publicly documented prior to this post. I shared this information with Red Hat, and they requested I keep it private until they released fixes for RHEL 3, which happened last week. I would not be at all surprised to learn that someone else has private exploits that incorporate either or both of these observations, though.
One important moral here is you must be very careful when declaring a system unaffected by a vulnerability, or declaring a mitigation to be complete. Software systems have gotten tremendously complex, and it’s often impossible to be totally confident you understand every last way an attacker could tickle a vulnerability.