<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Made of Bugs &#187; followup</title>
	<atom:link href="http://blog.nelhage.com/tag/followup/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.nelhage.com</link>
	<description>It's software. It's made of bugs.</description>
	<lastBuildDate>Thu, 18 Aug 2011 21:57:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Followup to &#8220;A Very Subtle Bug&#8221;</title>
		<link>http://blog.nelhage.com/2010/03/followup-to-a-very-subtle-bug/</link>
		<comments>http://blog.nelhage.com/2010/03/followup-to-a-very-subtle-bug/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 17:45:11 +0000</pubDate>
		<dc:creator>nelhage</dc:creator>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[followup]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tar]]></category>
		<category><![CDATA[unix]]></category>

		<guid isPermaLink="false">http://blog.nelhage.com/?p=160</guid>
		<description><![CDATA[After my previous post got posted to reddit, there was a bunch of interesting discussion there about some details I&#8217;d handwaved over. This is a quick followup on some of the investigation that various people carried out, and the conclusions they reached. In the reddit thread, lacos/lbzip2 objected that in his experiments, he didn&#8217;t see tar [...]]]></description>
			<content:encoded><![CDATA[<p>After my <a href="http://blog.nelhage.com/archives/150">previous post</a> got posted to <a href="http://www.reddit.com/r/programming/comments/b7djd/stuff_like_this_makes_me_hate_python_subtle_bugs/">reddit</a>, there was a bunch of
interesting discussion there about some details I&#8217;d handwaved
over. This is a quick followup on some of the investigation that various
people carried out, and the conclusions they reached.</p>

<p>In the reddit thread, <a href="http://lacos.hu/">lacos/lbzip2</a> <a href="http://www.reddit.com/r/programming/comments/b7djd/stuff_like_this_makes_me_hate_python_subtle_bugs/c0lc0dy">objected</a> that in his
experiments, he didn&#8217;t see <code>tar</code> closing the input pipe before it was
done reading the file, and so questioned where the <code>SIGPIPE</code>/<code>EPIPE</code>
was coming from in the first place. I had actually done similar
experiments with similar results, but I was still seeing the <code>EPIPE</code>,
so I knew it could happen, but I couldn&#8217;t totally explain why.</p>

<p>A friend of mine, David Benjamin, was curious enough to source-dive
<code>tar</code>, and <a href="http://davidben.scripts.mit.edu/blog/2010/02/28/tar-filled-pipes/">posted his results</a> on his own blog. He discovered that
by default, <code>tar</code> does not close the pipe after finding all the files
it needs, because the <code>tar</code> archive format allows for later copies of
the same file, which would supersede the previous ones. This explains
why <code>lacos</code> and I saw <code>tar</code> reading to the end of a <code>linux-2.6</code>
tarball, even if we only asked for the first file.</p>
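
<p>The &#8220;later copies win&#8221; rule can be seen without GNU <code>tar</code> at all. What follows is a minimal sketch using Python&#8217;s standard <code>tarfile</code> module, not anything taken from <code>tar</code>&#8217;s source; the helper names <code>build_archive_with_duplicate</code> and <code>read_member</code> and the member name <code>README</code> are made up for illustration. It writes two members with the same name into one archive and then asks for that name back.</p>

```python
import io
import tarfile


def build_archive_with_duplicate(name, first, second):
    """Return tar bytes holding two members that share the same name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for payload in (first, second):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_member(data, name):
    """Look a member up by name; tarfile resolves a name to its last occurrence."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as archive:
        return archive.extractfile(name).read()


data = build_archive_with_duplicate("README", b"old contents\n", b"new contents\n")
print(read_member(data, "README"))  # prints b'new contents\n'
```

<p>Because the second copy is the one that counts, a reader that stopped at the first match could hand back stale data, which is consistent with <code>tar</code> reading on past the first file.</p>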

<p>He also discovered, however, that a typical tar file ends with a
number of <code>NUL</code> blocks, which <code>tar</code> treats as end-of-file. And so
<code>tar</code> will close the pipe after reading the first of these, which
opens a narrow race condition whereby tar can potentially do so before
<code>gzip</code> has written the remaining <code>NUL</code> blocks, resulting in a
<code>SIGPIPE</code>.</p>
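
<p>The trailing <code>NUL</code> blocks are easy to check for. The sketch below again assumes Python&#8217;s <code>tarfile</code> module as the archive writer rather than GNU <code>tar</code>; <code>make_small_archive</code>, <code>trailing_nul_blocks</code> and the file name <code>hello.txt</code> are hypothetical names. It builds a one-file archive in memory and counts how many all-zero 512-byte blocks sit at the end.</p>

```python
import io
import tarfile

BLOCK = 512  # tar archives are organised in 512-byte blocks


def make_small_archive():
    """Build a one-file tar archive in memory and return its bytes."""
    payload = b"hello\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def trailing_nul_blocks(data):
    """Count the all-zero blocks at the end of the archive."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    count = 0
    for block in reversed(blocks):
        if block != bytes(BLOCK):
            break
        count += 1
    return count


archive_bytes = make_small_archive()
print(len(archive_bytes), trailing_nul_blocks(archive_bytes))
```

<p>Any reader that treats the first zero block as end-of-archive leaves the rest of that padding unread, and that leftover data is what the writer on the other end of the pipe may still be trying to send.</p>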

<p>Finally, the discussion inspired <code>lacos</code> to post a <a href="http://lists.gnu.org/archive/html/help-tar/2010-03/msg00000.html">query</a> to the <code>help-tar</code>
mailing list asking about <code>tar</code>&#8217;s behavior with respect to <code>SIGPIPE</code> and closing the pipe early. The resulting brief thread revealed, among other
things, that the bug I posted about has been <a href="http://lists.gnu.org/archive/html/bug-tar/2009-06/msg00009.html">fixed</a> in
GNU tar as of last summer, by having <code>tar</code> reset the disposition of
<code>SIGPIPE</code> to <code>SIG_DFL</code> before spawning a child. It was also pointed out that <code>tar</code> checks whether a filter subprocess is killed by <code>SIGPIPE</code>, and treats that as a success, so it&#8217;s not actually necessary for a <code>tar</code> filter to handle <code>SIGPIPE</code> and exit cleanly, like <code>gzip</code> does.</p>
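
<p>Both halves of that fix can be imitated in a few lines of Python on a POSIX system. This is only a sketch of the idea, not GNU <code>tar</code>&#8217;s actual code: <code>CHILD_SOURCE</code>, <code>run_filter_and_hang_up</code> and <code>filter_succeeded</code> are hypothetical names. The child restores the default <code>SIGPIPE</code> action and writes forever, the parent closes the read end early, and the exit status is then checked the way the thread describes.</p>

```python
import signal
import subprocess
import sys

# Child program: Python ignores SIGPIPE at startup, so restore the default
# action explicitly, then write to stdout until something stops us.
CHILD_SOURCE = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
while True:
    os.write(1, b"x" * 65536)
"""


def run_filter_and_hang_up():
    """Start the writer, read a little, close the pipe early, return its status."""
    child = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                             stdout=subprocess.PIPE)
    child.stdout.read(1)   # make sure the child is up and writing
    child.stdout.close()   # hang up before the child has finished
    return child.wait()


def filter_succeeded(returncode):
    """Treat a clean exit or death by SIGPIPE as success."""
    return returncode == 0 or returncode == -signal.SIGPIPE


status = run_filter_and_hang_up()
print(status, filter_succeeded(status))
```

<p>Python&#8217;s <code>subprocess</code> reports death by signal as a negative return code, which is why the check compares against <code>-signal.SIGPIPE</code>.</p>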
]]></content:encoded>
			<wfw:commentRss>http://blog.nelhage.com/2010/03/followup-to-a-very-subtle-bug/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

