About two months ago, Linux saw CVE-2010-0307, a trivial denial-of-service attack that could crash essentially any 64-bit Linux machine with 32-bit compatibility enabled. LWN has an excellent writeup of the bug, which turns out to be a subtle error related to the details of the execve system call and 32-bit compatibility mode.
While dealing with this patch for Ksplice, I ended up reading an awful lot of the code in Linux that deals with handling 32-bit processes on 64-bit machines. In the process, I discovered a number of alternately terrifying and clever hacks, the highlights of which I wanted to share here.
Compatibility mode
On x86_64 Linux kernels, the config option CONFIG_IA32_EMULATION controls whether the kernel supports i386 compatibility mode and the execution of "compatibility-mode" 32-bit processes.
For the most part, this support is very simple. As long as the OS sets up a few bits in the appropriate places, the hardware will switch modes and happily execute these compatibility processes in 32-bit mode. The kernel needs a compatibility entry point to handle system calls and the like from 32-bit processes. That entry point does a small amount of marshalling to convert 32-bit arguments to 64-bit arguments, and handles the fact that i386 has different syscall numbers than amd64, but that's about it. Most kernel interfaces are fairly word-width agnostic: if you map the compatibility process into the first 4G of the 64-bit address space, you can mostly just zero-extend all arguments, and almost everything works fine.
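To make the zero-extension point concrete, here is a minimal user-space sketch, not kernel code. The helper names `widen_compat_arg` and `sign_extend_arg` are hypothetical and exist only for illustration; the real compatibility entry point is architecture-specific kernel code.

```c
#include <stdint.h>

/* Hypothetical helper: widen a 32-bit syscall argument the way a
 * compatibility entry point conceptually does, by zero-extension. */
static uint64_t widen_compat_arg(uint32_t arg)
{
    return (uint64_t)arg;   /* the upper 32 bits become zero */
}

/* For contrast: sign-extension, which would be wrong for user pointers.
 * The uint32_t -> int32_t conversion is implementation-defined in C;
 * GCC and Clang wrap it as two's complement. */
static uint64_t sign_extend_arg(uint32_t arg)
{
    return (uint64_t)(int64_t)(int32_t)arg;
}
```

A 32-bit user pointer such as 0xfffff000 stays within the low 4G when zero-extended, whereas sign-extension would turn it into 0xfffffffffffff000, which is nowhere near the process's mappings.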
But there are always details…
There are, however, a few devilish details that remain: specifically, places where the kernel cares about either the pointer size inside a process or the memory layout of a process. The primary culprit is the ELF loader, the code responsible for loading an executable out of a filesystem and into a process's address space so that it can start executing. This code is very much architecture-specific. While 64- and 32-bit ELF files are structured almost identically, many of their fields are different sizes, as they need to hold a pointer or offset of the appropriate size for the architecture.
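The size difference is easy to see from user space. The following is a small sketch using the glibc `<elf.h>` header on Linux (not the kernel's own header); the helper names `elf_class_of` and `entry_field_size` are made up for this example.

```c
#include <elf.h>     /* Elf32_Ehdr, Elf64_Ehdr, ELFMAG, EI_CLASS (Linux/glibc) */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the ELF class byte from the identification
 * bytes at the start of a file, or ELFCLASSNONE if the magic is wrong. */
static int elf_class_of(const unsigned char *ident, size_t len)
{
    if (len < EI_NIDENT || memcmp(ident, ELFMAG, SELFMAG) != 0)
        return ELFCLASSNONE;
    return ident[EI_CLASS];
}

/* Hypothetical helper: width in bytes of the entry-point field
 * for a given ELF class. */
static size_t entry_field_size(int elf_class)
{
    switch (elf_class) {
    case ELFCLASS32: return sizeof(((Elf32_Ehdr *)0)->e_entry);
    case ELFCLASS64: return sizeof(((Elf64_Ehdr *)0)->e_entry);
    default:         return 0;
    }
}
```

The two header structs have the same fields in the same order, but every address and offset field is 4 bytes wide in one and 8 bytes wide in the other, so the same parsing code cannot simply be pointed at both.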
Similarly, a running process on Linux stores most relevant information about its address space layout inside a struct mm_struct referenced from its struct task_struct, or inside a struct thread_info. When first constructing a new process, however, the ELF loader and the exec system call need to figure out how to set up an initial memory layout, which depends heavily on the bitness of the new process.
The ELF loader
Linux's ELF loader lives mostly in fs/binfmt_elf.c, which takes its definition of an ELF header from include/linux/elf.h. The latter file defines the structs for both 32- and 64-bit ELF files (e.g. Elf32_Ehdr and Elf64_Ehdr), and then uses an #ifdef near the bottom to select the appropriate definition.
In order to support loading both 32- and 64-bit ELF files in the same kernel, Linux uses a cute hack in the fs/compat_binfmt_elf.c file. This file uses #define to set the ELF class to ELFCLASS32, indicating that elf.h should use the 32-bit definitions, #defines a few more things, and then just #includes binfmt_elf.c, causing the ELF loader to get compiled a second time:
/*
* Rename the basic ELF layout types to refer to the 32-bit class of files.
*/
#undef ELF_CLASS
#define ELF_CLASS ELFCLASS32
#undef elfhdr
#undef elf_phdr
#undef elf_shdr
#undef elf_note
#undef elf_addr_t
#define elfhdr elf32_hdr
#define elf_phdr elf32_phdr
#define elf_shdr elf32_shdr
#define elf_note elf32_note
#define elf_addr_t Elf32_Addr
/* Some more #defines elided */
/*
* We share all the actual code with the native (64-bit) version.
*/
#include "binfmt_elf.c"
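The effect of compiling one body of code twice against different type names can be imitated in a single file. The sketch below swaps in a different technique, a function-generating macro rather than a second #include, because a self-contained example cannot include a separate source file. All names in it (`DEFINE_TOY_LOADER`, `toy32`, `toy64`) are invented for illustration and do not appear in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the "loader body". In the kernel the shared body is the
 * whole of binfmt_elf.c; here it is two tiny functions stamped out once per
 * address width. */
#define DEFINE_TOY_LOADER(prefix, addr_t)                               \
    struct prefix##_hdr {                                               \
        addr_t entry;   /* entry point address */                       \
        addr_t phoff;   /* program header table offset */               \
    };                                                                  \
    static size_t prefix##_hdr_size(void)                               \
    {                                                                   \
        return sizeof(struct prefix##_hdr);                             \
    }                                                                   \
    static uint64_t prefix##_entry(const struct prefix##_hdr *h)        \
    {                                                                   \
        return h->entry;                                                \
    }

/* The same source text, instantiated for each word size. */
DEFINE_TOY_LOADER(toy64, uint64_t)
DEFINE_TOY_LOADER(toy32, uint32_t)
```

Both instantiations share every line of logic; only the types substituted in by the preprocessor differ, which is the property the kernel gets from including binfmt_elf.c a second time.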
The ELF structs themselves, however, aren't the only thing that depends on the architecture. The details of initializing a new process depend on the architecture as well. So, throughout binfmt_elf.c, there are a number of calls to macros that handle various platform-specific elements of ELF loading. compat_binfmt_elf.c then just goes through and uses #define to replace all of these with appropriate COMPAT_ versions, defined by the architecture:
#undef ELF_ARCH
#undef elf_check_arch
#define elf_check_arch compat_elf_check_arch

#ifdef COMPAT_ELF_PLATFORM
#undef ELF_PLATFORM
#define ELF_PLATFORM COMPAT_ELF_PLATFORM
#endif

/* ... */
The Linux developers do love their preprocessor.
TASK_SIZE
In the Linux kernel, the TASK_SIZE macro defines the highest address available to a user process. Once a process is running, this information (along with a whole host of other information about the memory layout) is kept in the process's struct mm_struct. However, in various places, including the ELF loader, the TASK_SIZE macro (along with a few others, like STACK_TOP) is needed.
Of course, TASK_SIZE must differ between 32- and 64-bit processes. Conveniently, almost all code that uses TASK_SIZE cares about the current process (such as the ELF loader), and so the introduction of compatibility mode just changed the macro as follows (arch/x86/include/asm/processor.h):
#define TASK_SIZE (test_thread_flag(TIF_IA32) ? \
                   IA32_PAGE_OFFSET : TASK_SIZE_MAX)
test_thread_flag
reads a bit out of the flags field on the current
process's thread_info
struct. And so the TASK_SIZE
macro
pseudo-magically changes value depending on whether the process
calling it is running in 32-bit compatibility mode or not!
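The trick is easy to reproduce outside the kernel. Here is a rough user-space sketch of a "constant" that is really a conditional on per-thread state. Everything prefixed `TOY_` or `toy_` is hypothetical, the flag is tested as a bit mask for simplicity, a single global stands in for the current thread's flags, and the two limits are illustrative values rather than something to rely on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; the real values live in the kernel headers. */
#define TOY_TIF_IA32      (1u << 0)
#define TOY_IA32_LIMIT    UINT64_C(0xffffe000)
#define TOY_NATIVE_LIMIT  ((UINT64_C(1) << 47) - 4096)

/* Stand-in for the flags field of the current thread's thread_info. */
static unsigned int toy_current_flags;

/* Stand-in for test_thread_flag(): is the flag set on the current thread? */
static bool toy_test_thread_flag(unsigned int flag)
{
    return (toy_current_flags & flag) != 0;
}

/* Looks like a constant at every use site, but is re-evaluated each time. */
#define TOY_TASK_SIZE \
    (toy_test_thread_flag(TOY_TIF_IA32) ? TOY_IA32_LIMIT : TOY_NATIVE_LIMIT)
```

Code that writes `addr < TOY_TASK_SIZE` needs no changes to become compatibility-aware, which is the convenience the kernel macro buys. The cost is that the value silently depends on which thread happens to be running the code.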