<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ian Dennis Miller &#187; Technology</title>
	<atom:link href="http://iandennismiller.com/blog/category/technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://iandennismiller.com/blog</link>
	<description>Essays and Whatnot</description>
	<lastBuildDate>Sun, 25 Jul 2010 02:46:41 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Disable sharing</title>
		<link>http://iandennismiller.com/blog/2010/06/disable-sharing/</link>
		<comments>http://iandennismiller.com/blog/2010/06/disable-sharing/#comments</comments>
		<pubDate>Mon, 14 Jun 2010 18:18:42 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[observation]]></category>
		<category><![CDATA[entrepreneurship]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[localshow]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/2010/06/disable-sharing/</guid>
		<description><![CDATA[Imagine you are a member of a successful mega-band with expensive music videos and everything. You started out small, worked your way up, and now you&#8217;ve sold millions of albums. The band&#8217;s music is owned and distributed by a big Label, and the Label has reluctantly put your music videos online. It is easy to [...]]]></description>
			<content:encoded><![CDATA[<p>Imagine you are a member of a successful mega-band with expensive music videos and everything. You started out small, worked your way up, and now you&#8217;ve sold millions of albums. The band&#8217;s music is owned and distributed by a big Label, and the Label has reluctantly put your music videos online. It is easy to show that Music Labels are wary of video sharing sites (such as youtube) because the Labels often choose to <b>disable sharing</b> when they are given the option. Makes sense, right?</p>
<p>Since you&#8217;re a band member in our hypothetical mega-band, what this means to you in practical terms is that bloggers cannot put your videos in their blogs, among other things. Naturally, the music label wants to disable sharing, because they want fans to be dependent on the label to get band updates. In Internet terms, this is a de facto walled garden of content, which in the music labels&#8217; ideal world would be something completely separate from the Internet.</p>
<p>Ideally, as a band member, you&#8217;ll get a cut of everything the Label sells, so there&#8217;s a lot to say for the walled garden concept. The big problem is this: by definition, <b>sharing is disabled between the Internet and the walled garden</b>. For as long as the Label was the best way to promote your music, there has never been any benefit to sharing content. Ever since broadcast radio music was used to promote albums, and even through the whole music video era, sharing has never played into the promotion scheme.</p>
<p><b>Maybe sharing should be a part of music promotion.</b></p>
<p>I&#8217;m not talking about ripping whole albums or bittorrent filesharing, per se, although there are some people who would riff on this. They might go so far as to argue that if you give an artist some money as a result of downloading an entire album&#8217;s worth of mp3s, then that artist got some free promotion via filesharing. It&#8217;s happened before.</p>
<p>But I&#8217;m not even going to touch that. The situation I am talking about is when a music label on youtube clicks that one checkbox to &#8220;disable embedding.&#8221; In the picture below, these are some of the options youtube gives you.</p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/06/embedding.jpg" width="412" height="218" alt="embedding.jpg" /></p>
<p>It&#8217;s worth considering why someone would ever decide to disable sharing, but the inescapable observation is that many music labels have made this choice. A practical consequence is that they are missing out on potentially free promotion through the Internet. More on this point later.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2010/06/disable-sharing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Passwords, and the Apple Keychain</title>
		<link>http://iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/</link>
		<comments>http://iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/#comments</comments>
		<pubDate>Wed, 19 May 2010 15:47:18 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[os x]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/</guid>
		<description><![CDATA[Some time around 2006, I started thinking about my online passwords in a new way. Until this point, I had used a collection of perhaps a dozen gibberish passwords, which I reused on various sites depending on the sensitivity of the site. For example, my bank account would use a nearly unique password, whereas a [...]]]></description>
			<content:encoded><![CDATA[<p>Some time around 2006, I started thinking about my online passwords in a new way. Until this point, I had used a collection of perhaps a dozen gibberish passwords, which I reused on various sites depending on the sensitivity of the site. For example, my bank account would use a nearly unique password, whereas a random forum would use a very commonly reused password.</p>
<p>This worked acceptably well, but I frequently had to ask myself: &#8220;which password did I use when I signed up for this service?&#8221; In response to having to guess my own passwords, I made two decisions: I would start writing my passwords down, and I would make them all unique and randomly generated.  Four years later, I am using a totally different system, and I&#8217;ll explain all of my reasoning.</p>
<p><span id="more-183"></span></p>
<p>To facilitate my random password approach, I started using 3&#215;5 index cards and a card filer. I added A-Z tabs, and I generally filed cards according to the domain name of the service (e.g. paypal.com is filed under P). I wrote a quick perl script to make 10 random passwords at a time, and I would pick one from the list and write it down on the index card. I really liked the concept of a purely non-digital password storage system, because it would be essentially unhackable without physical access. <i>Essentially unhackable</i> &#8211; more on this later.</p>
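<p>The perl script itself isn&#8217;t reproduced here. As a rough stand-in, this is a minimal Python sketch of the same idea: print a batch of 10 random passwords and pick one. The 16-character length and the decision to drop look-alike characters (I, l, 1, O, 0) are assumptions of this sketch, not details of the original script.</p>

```python
import secrets
import string

# Characters that are easy to confuse when written by hand (an assumption
# about which look-alikes to drop; adjust to taste).
AMBIGUOUS = set("Il1O0")

# Letters and digits minus the look-alikes above.
ALPHABET = "".join(
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
)


def make_passwords(count=10, length=16):
    """Return `count` random passwords of `length` characters each."""
    return [
        "".join(secrets.choice(ALPHABET) for _ in range(length))
        for _ in range(count)
    ]


if __name__ == "__main__":
    # Print a batch to choose from, one password per line.
    for password in make_passwords():
        print(password)
```

<p>The <code>secrets</code> module draws from the operating system&#8217;s random source, which is the usual choice for passwords in Python.</p>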
<p>There were several drawbacks to the index card system. For brevity, I&#8217;ll just list them:</p>
<ul>
<li>writing some characters by hand is ambiguous. I confused capital I, lowercase L, and numeral 1 all the time. Capital O and numeral 0 are also tricky.</li>
<li>it&#8217;s possible to copy the password incorrectly</li>
<li>it is extremely difficult to create a backup copy, so catastrophic loss is a possibility</li>
<li>if someone has physical access to the index cards, they have access to your accounts</li>
<li>it&#8217;s tedious to type in a random password every time you log in</li>
<li>it doesn&#8217;t scale well after about 400 accounts</li>
</ul>
<p>The scaling problems were the real killer. For example, did I file sandbox.paypal.com under P for paypal or S for sandbox? I couldn&#8217;t remember, so I had to perform a linear search through both letters. And since a disproportionate number of words start with S, flipping through all the S cards to find one site was tedious, whereas a site that started with Y was quick to look up because there were so few. Eventually, looking up cards became such a chore that I was too lazy to log in to my accounts! Total failure.</p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/05/Keychain-Icon.png" width="64" height="64" alt="Keychain Icon.png" style="float:left;" /> The solution for me is to use <a href="http://en.wikipedia.org/wiki/Apple_Keychain">Apple Keychain</a>. If you&#8217;re a <a href="http://en.wikipedia.org/wiki/Getting_things_done">GTD adherent</a>, then you&#8217;ll understand what I mean when I say this is my trusted system for account information. How did I reconcile digital password storage with my original goal of keeping my passwords offline in order to make them unhackable? I realized that both offline passwords and the keychain can be successfully attacked with a keystroke logger. If someone went to those lengths to get a password, then it wouldn&#8217;t matter how it was originally stored; the password could be intercepted regardless.</p>
<p>Why use Apple Keychain? Based on my list of drawbacks for the index cards, here&#8217;s a list of pro-Keychain points:</p>
<ul>
<li>built-in random password generator</li>
<li>keyword search</li>
<li>simple cut-and-paste workflow makes it very easy to enter passwords without typing</li>
<li>keychain itself is password protected</li>
<li>passwords are <a href="http://en.wikipedia.org/wiki/Triple_DES">Triple DES</a> encrypted (which should be acceptable until the year 2030)</li>
<li>simple to back up keychain file</li>
<li>slick integration with many applications, including Mail.app, subversion, and Safari/Chrome.</li>
</ul>
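<p>To make the &#8220;keyword search&#8221; point concrete, here is a toy Python sketch (not how Keychain works internally) showing why a substring search sidesteps the &#8220;P or S?&#8221; filing question. The account names are made up for illustration.</p>

```python
def find_accounts(accounts, keyword):
    """Return every account name that contains `keyword`, ignoring case."""
    needle = keyword.lower()
    return [name for name in accounts if needle in name.lower()]


# Hypothetical account names, for illustration only.
accounts = ["paypal.com", "sandbox.paypal.com", "slashdot.org", "youtube.com"]

print(find_accounts(accounts, "paypal"))
# prints ['paypal.com', 'sandbox.paypal.com']
```

<p>Either spelling of the lookup finds the sandbox account, so it no longer matters which letter it would have been filed under.</p>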
<p>I&#8217;m currently at about 900 accounts (yes &#8211; this is deserving of a separate post unto itself) and the system is working great. I think this scales to meet my requirements, and probably beyond. In practical terms, a password that used to take 30 seconds to retrieve is now instant.  I probably save 5 minutes per day by switching away from index cards, and I am avoiding untold frustrations.  In all, I recommend Apple Keychain highly.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2010/05/passwords-and-the-apple-keychain/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The fastest way to download a youtube video</title>
		<link>http://iandennismiller.com/blog/2010/04/the-fastest-way-to-download-a-youtube-video/</link>
		<comments>http://iandennismiller.com/blog/2010/04/the-fastest-way-to-download-a-youtube-video/#comments</comments>
		<pubDate>Sat, 03 Apr 2010 04:05:26 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/2010/04/the-fastest-way-to-download-a-youtube-video/</guid>
		<description><![CDATA[A recent comment on BoingBoing asked if there was a way to download a video from youtube, such that it could be reposted elsewhere. One solution, suggested by Cory Doctorow, is to use pwnyoutube.com, but there exists a general method that works on all flash video (not just youtube), and happens to be faster than [...]]]></description>
			<content:encoded><![CDATA[<p>A <a href="http://www.boingboing.net/2010/04/02/digital-economy-bill.html#comments">recent comment on BoingBoing</a> asked if there was a way to download a video from youtube, such that it could be reposted elsewhere. One solution, suggested by Cory Doctorow, is to use pwnyoutube.com, but there exists a general method that works on <b>all flash video</b> (not just youtube), and happens to be faster than using pwnyoutube.com. Behold! For I shall demonstrate a painless use of <b><a href="http://en.wikipedia.org/wiki/Lsof">lsof</a></b>, the under-appreciated and extra-useful command line tool.</p>
<p><span id="more-145"></span></p>
<p>In the Internet video world, there are two kinds of creatures: streaming video (which is appropriate for live events) and buffered video (which is for recorded things, like youtube). &#8220;Buffering&#8221; means it&#8217;s actually downloading a file in the background, and if it can download a little faster than you can watch it, then everything plays smoothly. If the video pauses suddenly and restarts after a few seconds, that&#8217;s because it&#8217;s rebuffering.</p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/04/buffering.png.jpg" width="480" height="284" alt="buffering.png.jpg" /></p>
<p>Have you ever noticed how the youtube progress bar slowly fills in with a pinkish color? It&#8217;s at about 20% in the picture, above. That indicates how much of the file has been buffered, and when it reaches the end, it means the file is fully downloaded. In other words, you don&#8217;t need some special plugin or service to &#8220;download a video from youtube.&#8221; Your browser does it automatically! Even better, this happens for any website that uses flash and .flv files to deliver buffered video.</p>
<p>The key is to use lsof (which is short for &#8220;list open files&#8221;). I&#8217;m demonstrating this on OS X, but the process is basically the same with *nixes and Cygwin. If you don&#8217;t have lsof installed by default, just use your package manager to install it (e.g. <code>apt-get install lsof</code>).</p>
<p>So, the magical incantation is:</p>
<p><code>lsof |grep lash</code></p>
<p>I grep for &#8220;lash&#8221; instead of &#8220;Flash&#8221; since you never know if the F will be capitalized or not, and this is the laziest way to get the desired results. Here is an example of the output:</p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/04/lsof.png.jpg" width="750" height="176" alt="lsof.png.jpg" /></p>
<p>Notice the files FlashTmp0 and FlashTmp1? That&#8217;s where the video files are saved, so long as you keep your browser window and video tabs open. There&#8217;s no need to &#8220;download&#8221; a video that you just watched. Instead, simply copy the file straight to your Desktop:</p>
<p><code>cp /private/var/folders/.../TemporaryItems/FlashTmp1 ~/Desktop/rickroll.flv</code></p>
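<p>If you would rather not eyeball the lsof output, the same filtering can be scripted. This is a minimal Python sketch that assumes lsof&#8217;s default column layout, where the file name is the last column. The sample text and its <code>/tmp/example/...</code> paths are invented for illustration; they are not the real temporary paths from my machine.</p>

```python
# Made-up lsof output; the /tmp/example paths are placeholders.
SAMPLE = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
Safari    123 idm    25u   REG   14,2  7340032  999001 /tmp/example/TemporaryItems/FlashTmp0
Safari    123 idm    26u   REG   14,2 12582912  999002 /tmp/example/TemporaryItems/FlashTmp1
Finder    456 idm   cwd    DIR   14,2     1326       2 /
"""


def flash_temp_paths(lsof_output):
    """Return the file paths on lsof lines that mention "flash" (any case)."""
    paths = []
    for line in lsof_output.splitlines():
        if "flash" not in line.lower():
            continue
        # In lsof's default layout the file name is the last column, so
        # everything from the first "/" onward is treated as the path.
        slash = line.find("/")
        if slash != -1:
            paths.append(line[slash:].strip())
    return paths


for path in flash_temp_paths(SAMPLE):
    print(path)
```

<p>To use it for real, feed it the text that <code>lsof</code> prints instead of <code>SAMPLE</code>.</p>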
<p>Now, you can open the local file with VLC:</p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/04/vlc.png.jpg" width="480" height="394" alt="vlc.png.jpg" /></p>
<p>You might need to try multiple FlashTmp files before you find the one containing the video you want (i.e. is it FlashTmp0 or FlashTmp1) but there usually aren&#8217;t many. <b>On many non-youtube sites, this is the only way you&#8217;re going to get access to a buffered flash video</b> (since there aren&#8217;t handy pwnyoutube.com clones for everything).</p>
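<p>One way to narrow down the candidates without opening each one: FLV files begin with the three bytes <code>FLV</code>. The Python sketch below checks for that signature. The demo writes two throwaway files with borrowed names so it can run anywhere; a temp file holding some other format (MP4, say) would need a different check.</p>

```python
import os
import tempfile


def looks_like_flv(path):
    """Return True if the file at `path` starts with the FLV signature."""
    with open(path, "rb") as handle:
        return handle.read(3) == b"FLV"


def demo():
    """Create one fake FLV header and one non-video file, then test both."""
    with tempfile.TemporaryDirectory() as folder:
        video = os.path.join(folder, "FlashTmp0")
        other = os.path.join(folder, "FlashTmp1")
        with open(video, "wb") as handle:
            # Signature, version 1, audio+video flags, 9-byte header size.
            handle.write(b"FLV\x01\x05\x00\x00\x00\x09")
        with open(other, "wb") as handle:
            handle.write(b"not a video")
        return looks_like_flv(video), looks_like_flv(other)


print(demo())  # prints (True, False)
```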
<p>Once you have copied the file to your desktop, why not convert it to mp4 and edit it in iMovie?</p>
<p><code>ffmpeg -i rickroll.flv rickroll.mp4</code></p>
<p><img src="http://iandennismiller.com/blog/wp-content/uploads/2010/04/ffmpeg.png.jpg" width="480" height="240" alt="ffmpeg.png.jpg" /></p>
<p>And now you know how to access videos you just watched, as well as convert them into a format you can edit.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2010/04/the-fastest-way-to-download-a-youtube-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>curl: HTTP/1.1 100 CONTINUE and multipart/form-data POST</title>
		<link>http://iandennismiller.com/blog/2009/09/curl-http1-1-100-continue-and-multipartform-data-post/</link>
		<comments>http://iandennismiller.com/blog/2009/09/curl-http1-1-100-continue-and-multipartform-data-post/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 15:10:35 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[100-continue]]></category>
		<category><![CDATA[blank form]]></category>
		<category><![CDATA[curl]]></category>
		<category><![CDATA[expect]]></category>
		<category><![CDATA[expect 100]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[http 1.0]]></category>
		<category><![CDATA[http 1.1]]></category>
		<category><![CDATA[http/1.0]]></category>
		<category><![CDATA[http/1/1]]></category>
		<category><![CDATA[missing payload]]></category>
		<category><![CDATA[multipart]]></category>
		<category><![CDATA[multipart form]]></category>
		<category><![CDATA[multipart post]]></category>
		<category><![CDATA[multipart/form-data]]></category>
		<category><![CDATA[payload]]></category>
		<category><![CDATA[post]]></category>
		<category><![CDATA[tcpdump]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/?p=91</guid>
		<description><![CDATA[I&#8217;m working on a REST interface at the moment, and there&#8217;s nothing I need more than a quick utility to test out various functions.  Curl fills this role perfectly, but I have run into a strange problem that interferes with multipart/form-data form POSTing.  Let me explain some of the evidence I&#8217;ve collected, as well as [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m working on a REST interface at the moment, and there&#8217;s nothing I need more than a quick utility to test out various functions.  <a title="Curl" href="http://curl.haxx.se/">Curl</a> fills this role perfectly, but I have run into a strange problem that interferes with <a title="multipart form data" href="http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2">multipart/form-data</a> form POSTing.  Let me explain some of the evidence I&#8217;ve collected, as well as tell you a workaround I learned from an IRC conversation.  In the end, this comes down to the HTTP 1.1 100 CONTINUE response code, which plays a critical role in HTTP 1.1 POST.</p>
<p><span id="more-91"></span></p>
<h4>Configuration</h4>
<p>For starters, I&#8217;m testing this out with OS X 10.5:</p>
<pre>$ uname -a
Darwin osx.example.com 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17
PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386 i386 i386</pre>
<p>For all of the output in this post, I used this version of curl:</p>
<pre>$ curl --version
curl 7.19.6 (i386-apple-darwin9.7.0) libcurl/7.19.6 zlib/1.2.3
Protocols: tftp ftp telnet dict http file
Features: Largefile libz</pre>
<p>Next up, we have the web server I was testing with:</p>
<pre>$ curl -I osx.example.com:8000
HTTP/1.0 302 FOUND
Date: Fri, 18 Sep 2009 13:57:22 GMT
Server: WSGIServer/0.1 Python/2.6.2</pre>
<p>I&#8217;ve been using wireshark and tcpdump to watch the traffic.  Here&#8217;s an example invocation of tcpdump that you can work with to replicate the issue:</p>
<pre>sudo tcpdump -X -s 1500 -i lo0 tcp port 8000</pre>
<p>Obviously, you might run a testing webserver on port 80, and you might send your traffic over lo, eth0, or en0.  If you&#8217;re reading this post, you probably know what&#8217;s what, and how to modify the command accordingly.</p>
<h4>Issuing a simple multipart/form-data POST</h4>
<p>It starts when I try to issue a multipart form POST:</p>
<pre>curl -F name=somevalue http://osx.example.com:8000</pre>
<p>This means &#8220;set the field called &#8216;name&#8217; to &#8216;somevalue&#8217; and instead of URL-encoding it, post it as a multipart MIME message.&#8221;  Your browser does this any time you upload a file to a website.  In my case, my REST API lets me upload PDF files, so curl needs to use multipart instead of URL encoding for this purpose.</p>
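<p>For reference, here is roughly what that multipart MIME message looks like on the wire. This Python sketch rebuilds the body by hand; the boundary string is copied from one of the captures later in this post (curl picks a new random one on each run), and the function name is just for illustration.</p>

```python
def multipart_body(boundary, name, value):
    """Build a one-field multipart/form-data body as bytes."""
    lines = [
        "--" + boundary,
        'Content-Disposition: form-data; name="%s"' % name,
        "",
        value,
        "--" + boundary + "--",
        "",
    ]
    # Every line of a MIME body ends with CRLF.
    return "\r\n".join(lines).encode("ascii")


# 28 dashes plus 12 hex digits, as seen in the capture.
boundary = "-" * 28 + "e928ad0322b4"
body = multipart_body(boundary, "name", "somevalue")

print(body.decode("ascii"))
print(len(body))  # prints 148
```

<p>The 148 is the same number that shows up as the Content-Length in the captures below.</p>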
<p>Curl only generates one TCP packet based on this command (even though it should generate multiple) and this is what that one packet looks like:</p>
<pre>10:07:00.971856 IP osx.57777 &gt; osx.8000: P 1:<strong>260</strong>(259) ack 1 win 65535
&lt;nop,nop,timestamp 1076257225 1076257225&gt;
 0x0000:  4500 0137 03b9 4000 4006 0000 7f00 0001  E..7..@.@.......
 0x0010:  7f00 0001 e1b1 1f40 3f35 c1d8 0bb4 4ba7  .......@?5....K.
 0x0020:  8018 ffff ff2b 0000 0101 080a 4026 61c9  .....+......@&amp;a.
 0x0030:  4026 61c9 504f 5354 202f 2048 5454 502f  @&amp;a.<strong>POST./.HTTP/</strong>
 0x0040:  312e 310d 0a55 7365 722d 4167 656e 743a  <strong>1.1</strong>..User-Agent:
 0x0050:  2063 7572 6c2f 372e 3139 2e36 2028 6933  .curl/7.19.6.(i3
 0x0060:  3836 2d61 7070 6c65 2d64 6172 7769 6e39  86-apple-darwin9
 0x0070:  2e37 2e30 2920 6c69 6263 7572 6c2f 372e  .7.0).libcurl/7.
 0x0080:  3139 2e36 207a 6c69 622f 312e 322e 330d  19.6.zlib/1.2.3.
 0x0090:  0a48 6f73 743a 2031 3237 2e30 2e30 2e31  .Host:.127.0.0.1
 0x00a0:  3a38 3030 300d 0a41 6363 6570 743a 202a  :8000..Accept:.*
 0x00b0:  2f2a 0d0a 436f 6e74 656e 742d 4c65 6e67  /*..<strong>Content-Leng</strong>
 0x00c0:  7468 3a20 3134 380d 0a45 7870 6563 743a  <strong>th:.148</strong>..<strong>Expect:</strong>
 0x00d0:  2031 3030 2d63 6f6e 7469 6e75 650d 0a43  <strong>.100-continue</strong>..<strong>C</strong>
 0x00e0:  6f6e 7465 6e74 2d54 7970 653a 206d 756c  <strong>ontent-Type:.mul</strong>
 0x00f0:  7469 7061 7274 2f66 6f72 6d2d 6461 7461  <strong>tipart/form-data</strong>
 0x0100:  3b20 626f 756e 6461 7279 3d2d 2d2d 2d2d  ;.boundary=-----
 0x0110:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
 0x0120:  2d2d 2d2d 2d2d 2d37 3937 3037 6465 6438  -------79707ded8
 0x0130:  6339 640d 0a0d 0a                        c9d....</pre>
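<p>We can tell a few things from this.  First, the packet carries 259 bytes of data (tcpdump shows this as <code>1:260(259)</code>: sequence numbers 1 up to, but not including, 260), the HTTP Content-Length header indicates a forthcoming payload of 148 bytes, and curl has set the HTTP Expect header to:</p>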
<pre>Expect: 100-continue</pre>
<p>Notice that it gets all the way up to the MIME boundary, which is fine.  The server responds with an ACK of 260 (the next sequence number it expects, confirming the 259 bytes received), then sends back an HTTP reply:</p>
<pre>10:07:00.971880 IP osx.8000 &gt; osx.57777: . <strong>ack 260</strong> win 65535 &lt;nop,nop,
timestamp 1076257225 1076257225&gt;
 0x0000:  4500 0034 cda2 4000 4006 0000 7f00 0001  E..4..@.@.......
 0x0010:  7f00 0001 1f40 e1b1 0bb4 4ba7 3f35 c2db  .....@....K.?5..
 0x0020:  8010 ffff fe28 0000 0101 080a 4026 61c9  .....(......@&amp;a.
 0x0030:  4026 61c9                                @&amp;a.
10:07:00.973365 IP osx.8000 &gt; osx.57777: P 1:21(20) ack 260 win 65535
&lt;nop,nop,timestamp 1076257225 1076257225&gt;
 0x0000:  4500 0048 1bbf 4000 4006 0000 7f00 0001  E..H..@.@.......
 0x0010:  7f00 0001 1f40 e1b1 0bb4 4ba7 3f35 c2db  .....@....K.?5..
 0x0020:  8018 ffff fe3c 0000 0101 080a 4026 61c9  .....&lt;......@&amp;a.
 0x0030:  4026 61c9 4854 5450 2f31 2e30 2033 3032  @&amp;a.<strong>HTTP/1.0.302</strong>
 0x0040:  2046 4f55 4e44 0d0a                      .<strong>FOUND</strong>..</pre>
<p>Ah!  The server didn&#8217;t respond with HTTP/1.1 100 CONTINUE.  Curl waits for a 100 CONTINUE before it sends its 148 byte payload.  Here the server answered with a final status code instead, so curl never sent the payload.  This can be a problem if your server or your application doesn&#8217;t know about this HTTP 1.1 behavior.</p>
<p>Here is the same communication, as seen from curl&#8217;s perspective with the -v flag (for verbose output):</p>
<pre>$ curl -v -F name=somevalue http://127.0.0.1:8000
* About to connect() to 127.0.0.1 port 8000 (#0)
*   Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
&gt; POST / HTTP/1.1
&gt; User-Agent: curl/7.19.6 (i386-apple-darwin9.7.0) libcurl/7.19.6 zlib/1.2.3
&gt; Host: 127.0.0.1:8000
&gt; Accept: */*
&gt; Content-Length: 148
&gt; Expect: 100-continue
&gt; Content-Type: multipart/form-data; boundary=----------------------------d70cdce71857
&gt;
* HTTP 1.0, assume close after body
&lt; HTTP/1.0 302 FOUND
&lt; Date: Fri, 18 Sep 2009 15:03:20 GMT
&lt; Server: WSGIServer/0.1 Python/2.6.2
&lt; Vary: Cookie
&lt; Content-Type: text/html; charset=utf-8
&lt; Location: http://127.0.0.1:8000/crm/form/login?next=/
&lt;
* Closing connection #0</pre>
<h4>A brief note about HTTP 1.1</h4>
<p>So, the fact that the server responds with a 302 FOUND instead of a 100 CONTINUE is really a feature of HTTP 1.1 because it is at this point that the conversation stops.  The 302 response will redirect the client to another resource on the server, so why not wait until you actually hit the resource that will receive your POST before you POST your payload?  If you&#8217;re POSTing a file to a URL redirect, then you will end up uploading your file at least twice, and that&#8217;s a waste of precious upstream bandwidth.</p>
<p>The HTTP/1.1 100 CONTINUE can potentially save you bandwidth, but read the coda at the end of this post for a cautionary tale.</p>
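<p>To put numbers on the bandwidth argument, here is a back-of-the-envelope Python sketch. It assumes a simplified model in which every redirecting hop answers the headers with a final status and the client re-POSTs to the next URL; the 5 MB file size is a made-up example.</p>

```python
def bytes_uploaded(body_size, redirects, expect_continue):
    """Estimate body bytes sent for a POST that is redirected `redirects` times."""
    if expect_continue:
        # Redirecting hops answer the headers with a final status, so the
        # body is only sent to the hop that replies 100 Continue.
        return body_size
    # Without the handshake the body goes out with every attempt.
    return body_size * (redirects + 1)


pdf_size = 5000000  # hypothetical 5 MB upload

print(bytes_uploaded(pdf_size, 1, expect_continue=False))  # prints 10000000
print(bytes_uploaded(pdf_size, 1, expect_continue=True))   # prints 5000000
```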
<h4>Issuing a multipart/form-data POST without Expect</h4>
<p>Let&#8217;s try this again without the Expect header:</p>
<pre>curl -H "Expect:" -F name=somevalue http://osx.example.com:8000</pre>
<p>The command above is identical to the previous one with the exception of the -H flag.  By setting &#8220;Expect:&#8221; to have no value after the colon, curl will interpret this as deleting the Expect header.  Sure enough, when we look at the TCP packet, the Expect header is gone:</p>
<pre>10:11:59.308674 IP osx.57803 &gt; osx.8000: P 1:238(237) ack 1 win 65535
&lt;nop,nop,timestamp 1076260192 1076260192&gt;
 0x0000:  4500 0121 c4b5 4000 4006 0000 7f00 0001  E..!..@.@.......
 0x0010:  7f00 0001 e1cb 1f40 5bb2 5099 2f1c 8bad  .......@[.P./...
 0x0020:  8018 ffff ff15 0000 0101 080a 4026 6d60  ............@&amp;m`
 0x0030:  4026 6d60 504f 5354 202f 2048 5454 502f  @&amp;m`POST./.HTTP/
 0x0040:  312e 310d 0a55 7365 722d 4167 656e 743a  1.1..User-Agent:
 0x0050:  2063 7572 6c2f 372e 3139 2e36 2028 6933  .curl/7.19.6.(i3
 0x0060:  3836 2d61 7070 6c65 2d64 6172 7769 6e39  86-apple-darwin9
 0x0070:  2e37 2e30 2920 6c69 6263 7572 6c2f 372e  .7.0).libcurl/7.
 0x0080:  3139 2e36 207a 6c69 622f 312e 322e 330d  19.6.zlib/1.2.3.
 0x0090:  0a48 6f73 743a 2031 3237 2e30 2e30 2e31  .Host:.127.0.0.1
 0x00a0:  3a38 3030 300d 0a41 6363 6570 743a 202a  :8000..Accept:.*
 0x00b0:  2f2a 0d0a 436f 6e74 656e 742d 4c65 6e67  /*..Content-Leng
 0x00c0:  7468 3a20 3134 380d 0a43 6f6e 7465 6e74  th:.148..Content
 0x00d0:  2d54 7970 653a 206d 756c 7469 7061 7274  -Type:.multipart
 0x00e0:  2f66 6f72 6d2d 6461 7461 3b20 626f 756e  /form-data;.boun
 0x00f0:  6461 7279 3d2d 2d2d 2d2d 2d2d 2d2d 2d2d  dary=-----------
 0x0100:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
 0x0110:  2d65 3932 3861 6430 3332 3262 340d 0a0d  -e928ad0322b4...
 0x0120:  0a</pre>
<p>The packet now carries 237 bytes of data (the number in parentheses in <code>1:238(237)</code>), which is exactly right.  &#8220;Expect: 100-continue&#8221; is 20 bytes long, plus 2 bytes for \r\n (0d0a in hex), which accounts for the packet being 22 bytes shorter than the previous packet&#8217;s 259 bytes (<code>1:260(259)</code>).   As before, the content length is 148 bytes.  As before, the packet goes all the way up to the MIME boundary.</p>
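<p>You can check that arithmetic by rebuilding the request head from the captures. In this Python sketch the header values are copied from the dumps above; the function name is made up, and the boundary is the one from the first capture (the second capture uses a different boundary of the same length). The byte counts it prints are the data lengths tcpdump shows in parentheses.</p>

```python
def request_head(boundary, expect_continue):
    """Rebuild the request line and headers curl sent in the captures."""
    lines = [
        "POST / HTTP/1.1",
        "User-Agent: curl/7.19.6 (i386-apple-darwin9.7.0) libcurl/7.19.6 zlib/1.2.3",
        "Host: 127.0.0.1:8000",
        "Accept: */*",
        "Content-Length: 148",
    ]
    if expect_continue:
        lines.append("Expect: 100-continue")
    lines.append("Content-Type: multipart/form-data; boundary=" + boundary)
    # A blank line terminates the header section.
    lines.extend(["", ""])
    return "\r\n".join(lines).encode("ascii")


boundary = "-" * 28 + "79707ded8c9d"
with_expect = request_head(boundary, expect_continue=True)
without_expect = request_head(boundary, expect_continue=False)

print(len(with_expect), len(without_expect), len(with_expect) - len(without_expect))
# prints 259 237 22
```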
<p>As before, the server sends back a TCP ACK for the data received (ack 238 this time), but here&#8217;s the difference.  Critically, without the Expect header, curl sends the entire payload before the server responds:</p>
<pre>10:11:59.308693 IP osx.8000 &gt; osx.57803: . ack 238 win 65535 &lt;nop,nop,
timestamp 1076260192 1076260192&gt;
 0x0000:  4500 0034 05cf 4000 4006 0000 7f00 0001  E..4..@.@.......
 0x0010:  7f00 0001 1f40 e1cb 2f1c 8bad 5bb2 5186  .....@../...[.Q.
 0x0020:  8010 ffff fe28 0000 0101 080a 4026 6d60  .....(......@&amp;m`
 0x0030:  4026 6d60                                @&amp;m`
10:11:59.308751 IP osx.57803 &gt; osx.8000 P 238:<strong>386</strong>(<strong>148</strong>) ack 1 win 65535
&lt;nop,nop,timestamp 1076260192 1076260192&gt;
 0x0000:  4500 00c8 45bc 4000 4006 0000 7f00 0001  E...E.@.@.......
 0x0010:  7f00 0001 e1cb 1f40 5bb2 5186 2f1c 8bad  .......@[.Q./...
 0x0020:  8018 ffff febc 0000 0101 080a 4026 6d60  ............@&amp;m`
 0x0030:  4026 6d60 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  @&amp;m`------------
 0x0040:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
 0x0050:  2d2d 6539 3238 6164 3033 3232 6234 0d0a  --e928ad0322b4..
 0x0060:  436f 6e74 656e 742d 4469 7370 6f73 6974  Content-Disposit
 0x0070:  696f 6e3a 2066 6f72 6d2d 6461 7461 3b20  ion:.form-data;.
 0x0080:  6e61 6d65 3d22 6e61 6d65 220d 0a0d 0a73  name=<strong>"name"</strong>....<strong>s</strong>
 0x0090:  6f6d 6576 616c 7565 0d0a 2d2d 2d2d 2d2d  <strong>omevalue</strong>..------
 0x00a0:  2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d 2d2d  ----------------
 0x00b0:  2d2d 2d2d 2d2d 2d2d 6539 3238 6164 3033  --------e928ad03
 0x00c0:  3232 6234 2d2d 0d0a                      22b4--..</pre>
<p>Whoa!  Did you notice the &#8220;name&#8221; &#8230; somevalue above?  That&#8217;s our payload, and it&#8217;s finally being transmitted.  This all happens before the server responds with an HTTP status code.  When sending the response, the TCP header now ACKs everything in the POST-related packets: 237 bytes of headers plus 148 bytes of payload is 385 bytes, so the ACK number is 386.</p>
<pre>10:11:59.308768 IP osx.8000 &gt; osx.57803: . ack <strong>386</strong> win 65535 &lt;nop,nop,
timestamp 1076260192 1076260192&gt;
 0x0000:  4500 0034 d082 4000 4006 0000 7f00 0001  E..4..@.@.......
 0x0010:  7f00 0001 1f40 e1cb 2f1c 8bad 5bb2 521a  .....@../...[.R.
 0x0020:  8010 ffff fe28 0000 0101 080a 4026 6d60  .....(......@&amp;m`
 0x0030:  4026 6d60                                @&amp;m`
10:11:59.316155 IP osx.8000 &gt; osx.57803: P 1:21(20) ack 386 win 65535
&lt;nop,nop,timestamp 1076260192 1076260192&gt;
 0x0000:  4500 0048 e1c5 4000 4006 0000 7f00 0001  E..H..@.@.......
 0x0010:  7f00 0001 1f40 e1cb 2f1c 8bad 5bb2 521a  .....@../...[.R.
 0x0020:  8018 ffff fe3c 0000 0101 080a 4026 6d60  .....&lt;......@&amp;m`
 0x0030:  4026 6d60 4854 5450 2f31 2e30 2033 3032  @&amp;m`HTTP/1.0.302
 0x0040:  2046 4f55 4e44 0d0a                      .FOUND..</pre>
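<p>A quick aside on reading these dumps: in this older tcpdump format, a token like <code>238:386(148)</code> means &#8220;first sequence number, next sequence number, bytes of data&#8221;. This small Python sketch (the helper name is made up) pulls the three numbers apart for the two client packets above and shows why the server&#8217;s ACKs are 238 and 386.</p>

```python
import re


def parse_seq_range(token):
    """Split a tcpdump "first:last(nbytes)" token into three integers."""
    match = re.fullmatch(r"(\d+):(\d+)\((\d+)\)", token)
    if match is None:
        raise ValueError("not a first:last(nbytes) token: %r" % token)
    first, last, nbytes = (int(group) for group in match.groups())
    return first, last, nbytes


for token in ["1:238(237)", "238:386(148)"]:
    first, last, nbytes = parse_seq_range(token)
    # `last` is the next sequence number, which is what the peer ACKs.
    print(token, "carries", nbytes, "bytes; expected ack", last)
```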
<p>So removing the Expect header allowed curl to send an HTTP 1.1 POST, with its payload, before the server generated HTTP 1.0 302 FOUND.  The fact that the server responded with 302 FOUND means the entire POST payload was ignored on the server side, but the client DID send it!  In other words, we just wasted some bandwidth, and we are going to need to POST the data at least one more time.  By contrast, when curl was expecting an HTTP 1.1 100 CONTINUE, it never sent the payload, and <strong>curl never complained, not even with -v</strong>.</p>
<h4>Issuing a multipart/form-data POST using HTTP 1.0</h4>
<p>I&#8217;ll spare you the complete packet dump, but suffice it to say that when curl is invoked with the HTTP 1.0 flag (-0, as in &#8220;dash zero&#8221;), it works just like when the Expect header is absent.  In other words, the following example also sends the payload before waiting for the server to respond.</p>
<pre>curl -0 -F name=somevalue http://osx.example.com:8000</pre>
<p>This results in:</p>
<pre>10:33:25.942611 IP osx.57900 &gt; osx.8000 P 1:238(237) ack 1 win 65535
&lt;nop,nop,timestamp 1076272988 1076272988&gt;
 0x0000:  4500 0121 f225 4000 4006 0000 7f00 0001  E..!.%@.@.......
 0x0010:  7f00 0001 e22c 1f40 0fcb 74fc 46cf 148b  .....,.@..t.F...
 0x0020:  8018 ffff ff15 0000 0101 080a 4026 9f5c  ............@&amp;.\
 0x0030:  4026 9f5c 504f 5354 202f 2048 5454 502f  @&amp;.\<strong>POST./.HTTP/</strong>
 0x0040:  312e 300d 0a55 7365 722d 4167 656e 743a  <strong>1.0</strong>..</pre>
<p>After a little conversation with the server, the payload is transmitted before the HTTP response, just like in the previous example.</p>
<h4>Conclusion</h4>
<p>What is the takeaway message from all of this?  If you&#8217;re using curl to test your REST interface, then make sure you are aware of the behavior of HTTP 1.1 100 CONTINUE.  You might notice it because your server receives a blank POST payload.  Your HTML forms will appear to have not been filled in, even though you specified one or more -F arguments on the curl command line.</p>
<p>The solution for the versions of curl I&#8217;ve tested is to either remove the Expect header, or to tell curl to use HTTP 1.0 (since curl will default to 1.1 otherwise).  Once again, here are those examples:</p>
<pre>curl -H "Expect:" -F name=somevalue http://osx.example.com:8000
curl -0 -F name=somevalue http://osx.example.com:8000</pre>
<p>This forces curl to POST the payload without waiting for the 100 CONTINUE response, and it is suitable for servers that don&#8217;t know how to provide a 100 CONTINUE.  I hope this helps someone out there to avoid the trouble I had debugging my REST interface.</p>
<h4>Coda: fix your server!</h4>
<p>The &#8220;right&#8221; way to handle this situation is to make sure your server will send a 100 CONTINUE.  curl is smart enough, but your application might not be.</p>
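<p>To make the handshake concrete, here is a minimal Python sketch of the decision a server has to make once it has read the request headers but before it reads the body.  The function name <code>interim_response</code> and the sample requests are hypothetical, and this is not how mod_wsgi or Django implement it; it only illustrates the rule that an HTTP 1.1 request carrying Expect: 100-continue should get a 100 CONTINUE before the client sends its payload.</p>

```python
def interim_response(request_head: bytes) -> bytes:
    """Return the interim reply a server should send before reading the body.

    request_head is the raw request line plus headers, up to the blank line.
    """
    lines = request_head.decode("iso-8859-1").split("\r\n")
    request_line = lines[0]

    # Collect headers into a dict with lowercased names and values.
    headers = {}
    for line in lines[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip().lower()

    # Only an HTTP/1.1 client that asked for the handshake gets a 100 Continue.
    is_http11 = request_line.endswith("HTTP/1.1")
    if is_http11 and headers.get("expect") == "100-continue":
        return b"HTTP/1.1 100 Continue\r\n\r\n"
    return b""  # nothing to send; the client transmits the body on its own


# A "polite" request (curl's default for -F) and an "eager" one (curl -0).
polite = b"POST / HTTP/1.1\r\nHost: osx.example.com:8000\r\nExpect: 100-continue\r\n\r\n"
eager = b"POST / HTTP/1.0\r\nHost: osx.example.com:8000\r\n\r\n"

print(interim_response(polite))
print(interim_response(eager))
```

<p>A server that skips this step for the polite request leaves the client waiting, which matches the blank POST payload described in this post.</p>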
<p>In a specific instance, my <a href="http://www.djangoproject.com/">Django</a>/<a href="http://bitbucket.org/jespern/django-piston/wiki/Home">Django-Piston</a>/<a href="http://code.google.com/p/modwsgi/">mod_wsgi</a> application was having trouble with the 100 CONTINUE when the client set an Expect: 100-continue header. It turns out this was a problem with mod_wsgi 2.5 and the solution is to update to 3.0 (it&#8217;s RC4 as of this post). Until I realized this was the deeper issue, I was able to get around the problem by preemptively POSTing the entire payload.  This worked, but as I said before, it frequently resulted in duplicate uploads (like in 302 and 401 situations).</p>
<p>Before I started preemptively POSTing the whole payload, my application appeared to be receiving blank POST data, so it was responding with a 400 BAD REQUEST.   In truth, the POST was blank, so it technically was a bad request.  This is not curl&#8217;s fault, though &#8211; it was just being extremely polite, waiting for a proper handshake before sending the payload.</p>
<p>Just watch out, because curl might be <strong>too polite</strong> &#8211; it didn&#8217;t even tell me that my server was refusing the 100 CONTINUE handshake.  This masked a much deeper problem with mod_wsgi, which took several days to sort out.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2009/09/curl-http1-1-100-continue-and-multipartform-data-post/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Free Beer Speech House: discussing the meaning of the word &#8220;free&#8221;</title>
		<link>http://iandennismiller.com/blog/2009/08/free-speech-beer-house-discussing-the-meaning-of-the-word-free/</link>
		<comments>http://iandennismiller.com/blog/2009/08/free-speech-beer-house-discussing-the-meaning-of-the-word-free/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 19:15:36 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[beer]]></category>
		<category><![CDATA[original]]></category>
		<category><![CDATA[speech]]></category>
		<category><![CDATA[apa]]></category>
		<category><![CDATA[arrogant bastard]]></category>
		<category><![CDATA[free beer]]></category>
		<category><![CDATA[free house]]></category>
		<category><![CDATA[free speech]]></category>
		<category><![CDATA[gnupg]]></category>
		<category><![CDATA[gpl]]></category>
		<category><![CDATA[ipa]]></category>
		<category><![CDATA[jupiter]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[optimator]]></category>
		<category><![CDATA[oss]]></category>
		<category><![CDATA[pgp]]></category>
		<category><![CDATA[pub]]></category>
		<category><![CDATA[stone brewing company]]></category>
		<category><![CDATA[triple rock]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/?p=67</guid>
		<description><![CDATA[Freedom, glorious freedom.
Once upon a time, I took a class based on  a single question: &#8220;what is freedom?&#8221;  We meandered through US history, identifying several distinct stages in the evolution of the definition of &#8220;freedom.&#8221;  I was horrified to learn, during a discussion, that so many of my classmates wanted what I will call &#8220;freedom [...]]]></description>
			<content:encoded><![CDATA[<p>Freedom, glorious freedom.</p>
<p>Once upon a time, I took a class based on  a single question: &#8220;what is freedom?&#8221;  We meandered through US history, identifying several distinct stages in the evolution of the definition of &#8220;freedom.&#8221;  I was horrified to learn, during a discussion, that so many of my classmates wanted what I will call &#8220;freedom from information.&#8221; Ah yes &#8211; Professor Sandage had a way of bringing the ugliest truths to the surface, for all to witness.</p>
<p>On the one hand, I can understand this desire for freedom from information: telemarketing, advertising, spam, the scrolling headlines at the bottom of a newscast&#8230;  well, any unsolicited attempt at selling things you don&#8217;t care about.  On the other hand, I think we need <strong>more</strong> information instead of less, and we need effective tools to filter and manage that information so we only see what we care about.</p>
<p>The term &#8220;freedom&#8221; is muddied by historical contexts, but also through the process of etymological erosion.  With that said, I want to take a moment to discuss the expression, &#8220;free as in speech, not beer.&#8221;</p>
<p><span id="more-67"></span></p>
<h4>Free as in speech, not beer</h4>
<p>&#8220;<a href="http://www.gnu.org/philosophy/free-sw.html">Free as in speech, not beer</a>&#8221; is an expression that comes up in open source discussions all the time.  It&#8217;s a little hard to unpack, unless you really dig into the dual meaning of the word &#8220;free.&#8221;  Thanks to Wikipedia, we&#8217;re part of the way there: the word &#8220;free&#8221; is used to mean two things: <a href="http://en.wikipedia.org/wiki/Gratis_versus_Libre">Gratis versus Libre</a>.  We call both of these terms &#8220;free&#8221; nowadays, but once upon a time, there were different words because they are totally different concepts.  Gratis means &#8220;without charge&#8221; whereas Libre is more like &#8220;liberty&#8221; or &#8220;freedom.&#8221;</p>
<p>So what is free speech?  Of course, that&#8217;s the freedom to say what you want (so long as you accept the consequences for what you&#8217;ve said).  And free beer?  Well, that would mean beer that is provided at no cost.  I think the key is this: although you are free to say what you want, you could well end up in court for it (e.g. slander) and your expression won&#8217;t come free of charge.  On the flipside, you can provide beer free of charge, but not to someone who is 15 years old, so you may not freely provide beer to anyone you wish.</p>
<p>In other words, speech embodies Libre (but not necessarily Gratis) perfectly.  Likewise, beer embodies Gratis very well, at the same time that beer is so closely regulated by many governments that it is hardly &#8220;libre.&#8221;  Nevertheless, everybody likes a good party with some beer pro gratis.</p>
<h4>The Free House, and the Public House</h4>
<p>Speaking of free beer, the <a href="http://en.wikipedia.org/wiki/Free_house_%28pub%29">Free House</a> is definitely not a place to find such a zero-cost beverage.  For starters, the term Free House is mostly British, and always beer-related.  It refers to a Public House (which you may know as a &#8220;pub&#8221;) that will sell any kind of beer they can get people to buy.  Contrast this with a Tied House, which sells beer manufactured by a single brewer, and you find that the Free House will have several brands on tap.  Here, the term &#8220;Free&#8221; is more like Libre, and is used in the context of the &#8220;free market.&#8221;  &#8230;and we all know that the free market isn&#8217;t composed of things that are zero-cost.</p>
<p>When I was living in Berkeley, California there were two particularly good &#8220;Tied House&#8221; pubs that brewed and sold only their own brands of beer: <a href="http://www.jupiterbeer.com/jupiter/">Jupiter</a> and <a href="http://www.triplerock.com/">Triple Rock</a>.  I should also mention <a href="http://www.pyramidbrew.com/alehouses/berkeley">Pyramid</a>, which had a pretty cool restaurant with their own beverages on tap.  This kind of pub is fun because they&#8217;ll often have a sampler option to let you taste a small glass of everything they brew.  It&#8217;s a great way to experience the full spectrum of beers, but a word of advice: start with the lightest stuff and progress towards darker.  The one exception to this rule is for hoppy beverages (e.g. <a href="http://en.wikipedia.org/wiki/India_Pale_Ale">IPA</a> or <a href="http://en.wikipedia.org/wiki/American_Pale_Ale">APA</a>), which might be light but which may have a pronounced bitter taste.  You might want to close it off with an APA, even after drinking the stouts.</p>
<h4>Open Source Software</h4>
<p>There&#8217;s nothing that goes quite so well with open source software as a tasty hoppy beverage.  I like pairing <a href="http://en.wikipedia.org/wiki/Stone_Brewing_Company">Stone Brewing Company&#8217;s</a> <a href="http://www.arrogantbastard.com/">Arrogant Bastard</a> with <a href="http://www.gnupg.org/">GnuPG</a>, the open source implementation of <a href="http://en.wikipedia.org/wiki/Phil_Zimmermann">Phil Zimmermann</a>&#8217;s PGP (<a href="http://en.wikipedia.org/wiki/Pretty_Good_Privacy">Pretty Good Privacy</a>) software.  Another favorite of mine is the <a href="http://en.wikipedia.org/wiki/Spaten-Franziskaner-Br%C3%A4u">Spaten</a> Optimator paired with <a href="http://wordpress.org/">WordPress</a>.  More recently, I&#8217;ve taken a liking to <a href="http://www.unibroue.com/index_eng.html">Unibroue</a>, the French Canadian brewer, who offers such brews as <a href="http://www.unibroue.com/graphs_our_beers/trois_pistoles.html">Trois Pistoles</a>, which is an excellent complement to <a href="http://www.python.org/">Python</a>.  This last combination is probably the most dangerous of the group, because you might end up with excellent code, and you might end up with <a href="http://en.wikipedia.org/wiki/Monty_python">British comedy</a>.</p>
<div id="attachment_71" class="wp-caption aligncenter" style="width: 310px"><img class="size-full wp-image-71" title="300px-Flyingcircus_2" src="http://iandennismiller.com/blog/wp-content/uploads/2009/08/300px-Flyingcircus_2.jpg" alt="http://en.wikipedia.org/wiki/Monty_Python" width="300" height="225" /><p class="wp-caption-text">http://en.wikipedia.org/wiki/Monty_Python</p></div>
<p>At the end of the day, free speech and free beer have a lot to do with open source software.  You see, licenses such as the <a href="http://en.wikipedia.org/wiki/Gpl">GNU General Public License</a> actually permit developers to charge for their software, while simultaneously requiring all GPL software to be published with its source code.  In this sense, the &#8220;free beer&#8221; part means the software isn&#8217;t necessarily without cost, and the &#8220;free speech&#8221; part means you are required to publish the source code.  In other words, the Libre aspect of the GPL has an important restriction: you are <strong>not</strong> free to <strong>not</strong> publish the source code, which in turn provides the most fundamental tenet of open source software: <strong>you are free to read and distribute the source code</strong>.</p>
<p>I want to hedge my previous statement: the GPL is a famous topic of debate, so there&#8217;s plenty of room to criticize anyone who says anything &#8211; at all &#8211; about the GPL or about open source software, either according to the letter of the license, or according to the spirit of the movement.</p>
<p>Let me sum it up like this: &#8220;free&#8221; means many things to many people throughout many time-periods, but for some reason, it almost always comes down to a matter of speech and beer.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2009/08/free-speech-beer-house-discussing-the-meaning-of-the-word-free/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My experience with semantic dementia, or how I am coping with my reformatted laptop</title>
		<link>http://iandennismiller.com/blog/2009/08/my-experience-with-semantic-dementia-or-how-i-am-coping-with-my-reformatted-laptop/</link>
		<comments>http://iandennismiller.com/blog/2009/08/my-experience-with-semantic-dementia-or-how-i-am-coping-with-my-reformatted-laptop/#comments</comments>
		<pubDate>Sat, 01 Aug 2009 22:08:21 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[original]]></category>
		<category><![CDATA[10.5]]></category>
		<category><![CDATA[aphasia]]></category>
		<category><![CDATA[cognitive science]]></category>
		<category><![CDATA[frustration]]></category>
		<category><![CDATA[marooned in realtime]]></category>
		<category><![CDATA[mcclelland]]></category>
		<category><![CDATA[neural network]]></category>
		<category><![CDATA[neuroscience]]></category>
		<category><![CDATA[os x]]></category>
		<category><![CDATA[parallel distributed processing]]></category>
		<category><![CDATA[pick's disease]]></category>
		<category><![CDATA[reformat]]></category>
		<category><![CDATA[rumelhart]]></category>
		<category><![CDATA[semantic dementia]]></category>
		<category><![CDATA[tip of the fingers]]></category>
		<category><![CDATA[tip of the tongue]]></category>
		<category><![CDATA[vinge]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/?p=61</guid>
		<description><![CDATA[I just upgraded my laptop to OS X 10.5 and it&#8217;s great, but I hit one major snag along the way.  Although I thought all of the Intel Macs shipped with the new GUID partition scheme, it seems like my early-generation Macbook Pro used the old Apple partition scheme, and unless I reformatted my drive [...]]]></description>
			<content:encoded><![CDATA[<p>I just upgraded my laptop to OS X 10.5 and it&#8217;s great, but I hit one major snag along the way.  Although I thought all of the Intel Macs shipped with the new <a href="http://en.wikipedia.org/wiki/GUID_Partition_Table">GUID partition scheme</a>, it seems like my early-generation Macbook Pro used the old Apple partition scheme, and unless I reformatted my drive as GUID, I couldn&#8217;t install 10.5.  Fortunately, I spent the day backing up my old drive, so I just forged on, and once 10.5 was installed, I used the <a href="http://en.wikipedia.org/wiki/Migration_Assistant_%28Apple%29">Migration Assistant</a> to transfer my old home directory.</p>
<p>It worked&#8230;  mostly.  Partially by design, I chose to not migrate some command line tools, but now I find that every so often, I want to accomplish some task and I can&#8217;t &#8230; quite &#8230; do it, because I need to reinstall something, or perhaps reconfigure something.  I&#8217;d say 95% of the old functionality is still there, but the remaining 5% comes up often enough that it feels like something more than 5%.  The feeling is  this lurking suspicion that I can&#8217;t trust my computer to do something that I know it used to be capable of, and it reminded me of a disease called <a href="http://en.wikipedia.org/wiki/Semantic_dementia">Semantic Dementia</a>. I don&#8217;t have semantic dementia in the sense of the neurological disease, but I&#8217;d like to start this off with a story about it.</p>
<h4><span id="more-61"></span>Rumelhart and McClelland</h4>
<p>Back in 2002, I was taking a class called something like &#8220;Cognitive Neuroscience&#8221; from Carl Olson and <a href="http://en.wikipedia.org/wiki/James_McClelland">Jay McClelland</a>.  I already knew  about McClelland&#8217;s work from a previous class on neural networks, which relied heavily on <a href="http://www.amazon.com/Explorations-Parallel-Distributed-Processing-Macintosh/dp/0262631296/ref=sr_1_3/175-4392983-6164664?ie=UTF8&amp;s=books&amp;qid=1249160029&amp;sr=1-3">a textbook written by McClelland</a> and his former colleague <a href="http://en.wikipedia.org/wiki/David_Rumelhart">Rumelhart</a>.  While the neuroscience class was fascinating, the most salient memory I have of the class involves a fairly personal reflection by McClelland on the current state of Rumelhart.</p>
<p>Although Rumelhart is relatively young (aged 67, as of 2009) he suffers from a form of semantic dementia called <a href="http://en.wikipedia.org/wiki/Pick%27s_disease">Pick&#8217;s Disease</a> (or, if it&#8217;s not a form of semantic dementia, it&#8217;s related to it).  In any case, Pick&#8217;s Disease is a neuro-degenerative disorder.  The deep irony of this is that in the 1980s, Rumelhart and McClelland provided the <a href="http://scholar.google.ca/scholar?hl=en&amp;client=firefox-a&amp;rls=org.mozilla:en-US:official&amp;hs=WJ1&amp;q=author:%22Rumelhart%22+intitle:%22Parallel+distributed+processing:+Explorations+in+the+...%22+&amp;um=1&amp;ie=UTF-8&amp;oi=scholarr">first and possibly still-best description of neural networks</a>, which could only have come from a profound insight into how the brain functions.</p>
<p>I can&#8217;t say much about what Rumelhart thinks of his disease, but I know McClelland thought about it, and if I&#8217;m not mistaken, this ultimately led to a bridge between neural networks and an older model of semantic cognition called <a href="http://en.wikipedia.org/wiki/Isa_%28computer_science%29">ISA networks</a>.  As I recall it, it&#8217;s in terms of ISA networks that semantic dementia can be understood, which provides an intuitive way to describe how categories of items can become indistinct.</p>
<p>Consider: is a cat an animal?  Is a bird an animal?  Is a canary a bird?  Is a penguin a bird?  Is a penguin an animal?  It&#8217;s when we say &#8220;a penguin ISA bird,&#8221; and &#8220;a bird ISA animal&#8221; that we&#8217;re establishing categories, but lots of categories have weird and one-of-a-kind rules that lead us to say, &#8220;a penguin is a bird, even though it doesn&#8217;t fly&#8221; or &#8220;even though a cat is an animal that doesn&#8217;t fly, it is not a bird.&#8221;  Kids have to learn this through trial and error (e.g. a horse has four legs, but it isn&#8217;t a cat).</p>
<div id="attachment_62" class="wp-caption aligncenter" style="width: 330px"><img class="size-full wp-image-62" title="isa-example" src="http://iandennismiller.com/blog/wp-content/uploads/2009/08/isa-example.png" alt="from http://www.uark.edu/misc/lampinen/sm.html" width="320" height="283" /><p class="wp-caption-text">from http://www.uark.edu/misc/lampinen/sm.html</p></div>
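<p>As a rough illustration of the ISA idea, here is a small Python sketch of a category hierarchy with inherited properties and exceptions.  The names <code>ISA</code>, <code>PROPERTIES</code>, <code>is_a</code> and <code>lookup</code> are made up for this example, and the model is a toy: it is not taken from McClelland&#8217;s work or from the figure above.</p>

```python
# Each concept points to its parent category ("penguin ISA bird").
ISA = {"canary": "bird", "penguin": "bird", "bird": "animal", "cat": "animal"}

# Properties live on the most general concept they apply to;
# more specific concepts may override them with an exception.
PROPERTIES = {
    "animal": {"alive": True},
    "bird": {"flies": True},
    "penguin": {"flies": False},  # the one-of-a-kind rule
    "cat": {"flies": False},
}


def is_a(concept, category):
    """Follow ISA links upward until we reach the category or run out of links."""
    while concept is not None:
        if concept == category:
            return True
        concept = ISA.get(concept)
    return False


def lookup(concept, prop):
    """Return the nearest value for prop, so specific exceptions win over defaults."""
    while concept is not None:
        values = PROPERTIES.get(concept, {})
        if prop in values:
            return values[prop]
        concept = ISA.get(concept)
    return None


print(is_a("penguin", "animal"))    # a penguin ISA bird, and a bird ISA animal
print(is_a("cat", "bird"))          # a cat does not fly, but it is still not a bird
print(lookup("canary", "flies"))    # inherited from bird
print(lookup("penguin", "flies"))   # the exception overrides the default
```

<p>In this toy model, deleting a single link (say, the entry that makes a penguin a bird) means the inherited answers can no longer be found, which is only a loose analogy for categories becoming indistinct.</p>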
<p>Semantic dementia is a condition where concepts lose their meaning, and a possible explanation for this comes from the inability to categorize concepts anymore.  This might have to do with making new distinctions between concepts, and it might relate to the retrieval of concepts that were previously distinct.  I think of semantic dementia as being like a car crash victim who is paralyzed and unable to communicate with the world, even though they are not cognitively impaired in any other way.  I imagine semantic dementia as being the frustrating condition of &#8220;knowing&#8221; what you want to say, but being unable to find the right words to say it.  Maybe it&#8217;s like having every word on the &#8220;tip of your tongue&#8221; and trying to construct sentences in spite of it.</p>
<h4>Corrupted Greenthink</h4>
<p>Earlier this year, I read a book by Vernor Vinge called <a href="http://en.wikipedia.org/wiki/Marooned_in_Realtime">Marooned in Realtime</a> about life in a post-singularity enclave.  First off, I recommend <a href="http://www.amazon.com/Marooned-Realtime-Vernor-Vinge/dp/0765308843/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1249161177&amp;sr=1-1">Marooned in Realtime</a> highly, but the book contains a cautionary tale about relying on an external system to store your personal database.  Whereas humans once kept their thoughts in their brains, and later turned to notebooks for assistance, in the MiR universe the pinnacle of personal database technology is called Greenthink.  While Greenthink has some similarities to 2009&#8217;s Wikipedia, I think it also includes personal notes, and presents a deeply personalized interface for interacting with these information objects.</p>
<p>Because the characters in MiR departed from Earth at different times over the span of about a century, and because technology continued to improve at an exponential rate, those who departed 10 years later benefited from technology vastly superior to that of those who departed 10 years earlier.  One of the characters is so advanced as to be barely recognizable as a human, and this character has the deepest and most integrated relationship with Greenthink.</p>
<p>At one point, another character tries to use the futuristic Greenthink database, but since they were familiar with an older version of Greenthink, the interface is so foreign and personalized that they are unable to accomplish much of anything.  With some practice, they get up to speed, but the learning curve is steep.  During a fascinating future-combat scene, parts of Greenthink are corrupted, and the futuristic character struggles to retrieve information objects that were previously available.</p>
<p>Again, this reminds me of semantic dementia.  In the case of Greenthink it&#8217;s the retrieval cues that no longer lead to the right information objects; perhaps the interface doesn&#8217;t link correctly, perhaps the &#8220;menus&#8221; don&#8217;t have the right choices, or perhaps the choices are all there but the information objects aren&#8217;t there.  Whatever the case is, the character is frustrated and somewhat crippled by the inability to &#8220;think&#8221; the way they are used to, or to think at the speed they are accustomed to.</p>
<p>Although Vinge wrote Marooned in Realtime back in 1986, it is startlingly prescient.  As is the case with many series, I only found out too late that MiR is the third book in the Across Realtime trilogy, and I haven&#8217;t read the first two books yet.</p>
<h4>Thinking in real-time</h4>
<p>I can imagine some of the frustration of semantic dementia, or with <a href="http://en.wikipedia.org/wiki/Aphasia">aphasia</a> (another language disorder), in terms of the amount of time it takes to accomplish something.  I definitely don&#8217;t think this provides any insight to the disorders themselves, but just consider the &#8220;tip of the tongue&#8221; phenomenon.  There&#8217;s a word that means <em>something</em>, and you know <em>exactly what that thing is</em>, but you just can&#8217;t think of the word.  If it weren&#8217;t that you are in the middle of a conversation, there would be no problem with sitting down to ponder the concept until you remember precisely the word you were looking for.  However, there&#8217;s another person waiting for you to finish your thought, and you are so compelled to express what it is that you have to say.  This results in a feeling of frustration, because it&#8217;s so irritatingly imprecise to not have access to the word you need, to the extent that it won&#8217;t do justice to the concept you are expressing.</p>
<p>In other words, we like to think and talk in real-time, but when something delays us, we get frustrated.  Likewise, I like to <em>compute</em> in realtime, and I strongly doubt I&#8217;m alone on this one.  Although I have many stories about my frustration with an early-1990s 286 computer (which I used until about 1996), I experienced the same problem in 2006 with my shiny new laptop.  It showed up with 512MB of RAM, which was not my intention.  I waited several months to upgrade to 1GB, but during this time-period, I experienced the profound frustration of swap disk hell.  Let me explain.</p>
<p>You might think of computers as a pyramid of buckets, where the fewest buckets at the top of the pyramid are the fastest to put things into and to get things out of.  As you go down the pyramid, you find that the buckets are both more numerous and also slower.  At the top of this pyramid is the <a href="http://en.wikipedia.org/wiki/CPU_cache">storage that is physically located on the CPU</a>; this is very small and extremely fast (<a href="http://www.pctechguide.com/14Memory_L1_cache.htm">called L1 Cache</a>).  Depending on your CPU type, there might be two more levels of storage here (called L2 and L3), and just slightly slower is that type of storage called RAM.  Almost at the bottom of the pyramid is the hard drive, which is about 1000 times slower than RAM, but is correspondingly cheaper and much larger.</p>
<p>So there I was, using a totally modern machine that had way less RAM than it needed.  To run some software quickly might require lots of RAM, but if you don&#8217;t have that, modern operating systems elegantly &#8220;swap&#8221; data from RAM to the hard disk.  Recall that the hard drive is 1000 times slower than RAM, so swapping can be a very slow process.  However, this is much better than being restricted from running certain software because there isn&#8217;t enough RAM.</p>
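<p>A little arithmetic shows why even occasional swapping hurts so much.  The sketch below takes the rough &#8220;1000 times slower&#8221; figure from the text at face value (it is not a measurement) and computes the average cost of a memory access when some fraction of accesses has to go to disk; the names are made up for this example.</p>

```python
RAM_COST = 1       # one unit of time per access served from RAM
DISK_COST = 1000   # the rough 1000x figure from the text, not a measurement


def average_cost(swap_fraction):
    """Average cost per access when swap_fraction of accesses go to disk."""
    return (1 - swap_fraction) * RAM_COST + swap_fraction * DISK_COST


# Compare no swapping with 1% and 10% of accesses hitting the disk.
for fraction in (0.0, 0.01, 0.1):
    print(fraction, average_cost(fraction))
```

<p>Under these assumptions, sending just 1% of accesses to disk makes the average access roughly eleven times slower, and 10% makes it roughly a hundred times slower.</p>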
<p>My typical usage pattern required vastly more RAM than I had available, such that I had to wait as much as 30 seconds to switch from my web browser to my word processor.  If I were leisurely wasting time, I might not mind this so much, but the fact of the matter is that I was working really hard at the time, and I didn&#8217;t have time for all of those 30-second delays.  As a result, I was frustrated, which is really an understatement.  I was able to think so much faster than my computer, and it felt like I was crippled somehow because I wasn&#8217;t free to think as fast as I wanted.  I call these 30-second delays &#8220;swap disk hell.&#8221;</p>
<p>For what it&#8217;s worth, I&#8217;ve maxed out my laptop at 2GB, and I&#8217;m quite happy, because I&#8217;ve caught back up with real-time.</p>
<h4>The tip of the fingers phenomenon</h4>
<p>On Thursday, I upgraded to OS X 10.5, and as I said, I migrated about 95% of my old functionality to the new system.  It&#8217;s that lingering 5% that comes up way more often than it should.  When I try to do something (say, run a <a href="http://en.wikipedia.org/wiki/Python_%28programming_language%29">Python</a> script I rely on) and find that I can&#8217;t, it&#8217;s at once surprising (since I can&#8217;t predict when it will happen) and frustrating (since it will invariably delay me from accomplishing my task).</p>
<p>In some senses, this is like the tip of the tongue problem: I know what I want to do, and I could describe exactly what needs to happen, but I am looking for the command to actually accomplish my goal and the command just isn&#8217;t there.  It&#8217;s like a corrupted Greenthink database, and the frustration is just as bad as living in swap disk hell.  It&#8217;s like thinking and knowing everything you always used to know, and finding that it just doesn&#8217;t matter how it used to work, because it doesn&#8217;t work that way anymore.  This leads to suspicion and distrust, like I can&#8217;t totally rely on my ability to think anymore, because my thoughts don&#8217;t map onto actions the way they used to.</p>
<p>Semantic dementia can be a progressive disease, meaning it doesn&#8217;t necessarily happen all at once, and instead comes on in degrees.  Over time, I will rehabilitate my computing environment to be back at 100% &#8211; this is one sense in which my problem is totally unlike semantic dementia, because I can recover from reformatting my computer.  Really, there are many senses in which everything I&#8217;ve said is just a metaphor for semantic dementia; I don&#8217;t know what it&#8217;s like, and even what I do know about it is still quite different from what I am experiencing now.</p>
<p>Still, in everything that I do, I find that I hesitate slightly, since I am not certain I will accomplish the goal I have in mind as long as my computer is partially corrupt.  This is the &#8220;tip of the fingers phenomenon&#8221; &#8230; and the only cure is the slow and deliberate process of rehabilitation.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2009/08/my-experience-with-semantic-dementia-or-how-i-am-coping-with-my-reformatted-laptop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Wireless Security in 2009: Recommendations</title>
		<link>http://iandennismiller.com/blog/2009/07/wireless-security-in-2009-recommendations/</link>
		<comments>http://iandennismiller.com/blog/2009/07/wireless-security-in-2009-recommendations/#comments</comments>
		<pubDate>Fri, 03 Jul 2009 16:12:59 +0000</pubDate>
		<dc:creator>idm</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[802.11]]></category>
		<category><![CDATA[802.11b]]></category>
		<category><![CDATA[802.11g]]></category>
		<category><![CDATA[802.11i]]></category>
		<category><![CDATA[802.11n]]></category>
		<category><![CDATA[MAC]]></category>
		<category><![CDATA[MAC address]]></category>
		<category><![CDATA[radius]]></category>
		<category><![CDATA[recommendation]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[SSID]]></category>
		<category><![CDATA[TKIP]]></category>
		<category><![CDATA[WEP]]></category>
		<category><![CDATA[wifi]]></category>
		<category><![CDATA[wireless]]></category>
		<category><![CDATA[wireless access point]]></category>
		<category><![CDATA[wireless AP]]></category>
		<category><![CDATA[wireless network]]></category>
		<category><![CDATA[wireless router]]></category>
		<category><![CDATA[wireless security]]></category>
		<category><![CDATA[WPA]]></category>
		<category><![CDATA[WPA2]]></category>
		<category><![CDATA[WPA2-PSK]]></category>

		<guid isPermaLink="false">http://iandennismiller.com/blog/?p=48</guid>
		<description><![CDATA[Yesterday, I  grabbed an 802.11b/g/* router from Chinatown ($32 &#8211; can&#8217;t beat that) and set out to use my laptop&#8217;s wireless network card.  I hadn&#8217;t done this before because I was (justifiably) concerned about wireless security, so I wanted to make sure that a breach of the wireless network wouldn&#8217;t turn into a breach of [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, I  grabbed an 802.11b/g/* router from Chinatown ($32 &#8211; can&#8217;t beat that) and set out to use my laptop&#8217;s wireless network card.  I hadn&#8217;t done this before because I was (justifiably) concerned about wireless security, so I wanted to make sure that a breach of the wireless network wouldn&#8217;t turn into a breach of the wired LAN (which includes a printer and a few sensitive servers). This post collects some of my research and observations, and it concludes with my recommendations for how you can secure your own wireless network&#8230;  or at a minimum, it tells you how you could if you were willing to spend $32 on a new wireless access point.</p>
<p><span id="more-48"></span></p>
<h4>Network Topology: equal distrust for the Internet as for wireless</h4>
<p>A series of events brought me to the point where this became realistic, the most important of which is that I got an extra router for the LAN.  Let me briefly explain the current network topology I use, which allows me to distrust my wireless router just as much as I distrust public Internet traffic.</p>
<p>- We connect to the internet via DSL and a router, which uses NAT to provide a private address space behind the router (10.0.50.x)</p>
<p>- The &#8220;wired&#8221; router connects to the DSL+router and uses NAT to create a separate private address space (10.0.51.x)</p>
<p>- The wireless router also connects to the DSL+router, and as you might have guessed, there is yet another private address space behind this router (10.0.52.x)</p>
<p>So, if you&#8217;re using an ethernet cable, your connection cannot be routed to a machine connected via wireless, and vice versa.  Barring an attack against the wired router, the address space is simply not routable.  I&#8217;ll eventually provide a VPN into the wired network, so I can print using a wireless connection (since the printer is only connected to the wired network).</p>
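<p>The three address spaces can be written down with Python&#8217;s standard <code>ipaddress</code> module, as a quick way to check which segment a given machine sits on.  The /24 prefix length is an assumption based on the &#8220;10.0.50.x&#8221; notation, and the segment names and helper functions are made up for this sketch.  Note that the code only tests address membership; the isolation itself comes from NAT on each router.</p>

```python
import ipaddress

# The three private address spaces from the list above; the /24 prefix
# length is an assumption based on the "10.0.50.x" notation.
SEGMENTS = {
    "dsl": ipaddress.ip_network("10.0.50.0/24"),
    "wired": ipaddress.ip_network("10.0.51.0/24"),
    "wireless": ipaddress.ip_network("10.0.52.0/24"),
}


def segment_of(address):
    """Return the name of the segment an address belongs to, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return None


def same_segment(a, b):
    """True only when both addresses sit behind the same router."""
    return segment_of(a) is not None and segment_of(a) == segment_of(b)


print(segment_of("10.0.51.20"))                 # a machine on the wired LAN
print(segment_of("10.0.52.7"))                  # a laptop on wireless
print(same_segment("10.0.51.20", "10.0.52.7"))  # different routers
```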
<p>At this point, I was pretty happy about running a wireless access point, because I was really no worse off if someone attacked the wired LAN via wireless or via the public Internet.  Basically, both vectors are equally untrusted.</p>
<h4>Digging into WPA2</h4>
<p>Still, I was uneasy about actually using my wireless network, and I hoped that wireless security had advanced beyond <a href="http://en.wikipedia.org/wiki/Fluhrer,_Mantin,_and_Shamir_attack">the famous WEP debacle</a>, which made it downright trivial to attack older wireless access points.  <a href="http://en.wikipedia.org/wiki/WPA2#WPA2">The solution is to use WPA2,</a> which is a better protocol that only runs on newer hardware.  This is not without its pitfalls, and some impressive work has been undertaken to attack WPA2.  Notably, <a href="http://code.google.com/p/pyrit/">the pyrit project</a> has made great progress using 3D acceleration hardware (GPUs) to create downright feasible attacks against WPA2 with a pre-shared key (WPA2-PSK).</p>
<p>An alternative to using a pre-shared key with WPA2 is to use a <a href="http://freeradius.org/">key server technology called RADIUS</a>, but because I didn&#8217;t wish to run another server, I needed to learn more about the pyrit approach so that I could still use WPA2-PSK.</p>
<h4>The Pyrit Approach</h4>
<p>Pyrit can make use of multiple 3D accelerator cards, and can now even cluster machines for parallel processing, in order to pre-calculate values that are useful in attacking a wireless network.  In other words, it is plausible for anyone with enough friends (or perhaps a government budget) to get the raw computing power required to crunch the numbers.  Once these computed values have been saved to disk (a process that takes hours or days), they can be checked against a handshake captured from the target network in a few minutes, and the attack will have been executed.</p>
<p>The key here comes down to disk storage, instead of processor power, because we might as well assume that processor power isn&#8217;t realistically limited anymore.  From the pyrit blog itself, <a href="http://pyrit.wordpress.com/the-twilight-of-wi-fi-protected-access/#comment-103">it appears that the pre-calculated values for PSKs longer than 10 ASCII characters</a> cannot affordably be stored on current hard drives, even though it is definitely possible to perform the necessary calculations.</p>
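<p>To get a feel for the storage problem, here is a rough back-of-the-envelope sketch.  The assumptions are mine, not pyrit&#8217;s: the passphrase draws on all 95 printable ASCII characters, each pre-calculated entry is just the 32-byte key, and there is no overhead for storing the passphrase itself or any index.  Real tables would be bigger.</p>

```python
PRINTABLE_ASCII = 95   # assumption: characters 0x20 through 0x7e are all allowed
KEY_BYTES = 32         # assumption: one 256-bit key stored per passphrase


def table_bytes(length, alphabet=PRINTABLE_ASCII):
    """Bytes needed to store one key for every passphrase of exactly this length."""
    return (alphabet ** length) * KEY_BYTES


if __name__ == "__main__":
    for length in (8, 10, 12):
        terabytes = table_bytes(length) / 1e12
        print(length, "characters:", format(terabytes, ".3e"), "TB")
```

<p>Every extra character multiplies the table size by 95, which is why length matters so much more than anything else here.</p>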
<p>The pyrit attack is further thwarted by the incorporation of the wireless access point&#8217;s SSID in the WPA2 calculations, so while it is possible to pre-calculate an attack for common SSIDs (like &#8220;linksys&#8221; or &#8220;default&#8221;), it is only possible to attack a novel SSID after some reconnaissance to determine the value of the target SSID.  Most impromptu pyrit attacks will probably involve common SSIDs that ship as the default setting for wireless access points.</p>
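<p>To make the SSID point concrete: WPA2-PSK derives its 256-bit key from the passphrase with PBKDF2-HMAC-SHA1, using the SSID as the salt and 4096 iterations.  The sketch below uses Python&#8217;s standard <code>hashlib</code>; the function name, the passphrase, and the second SSID are made up for illustration.</p>

```python
import hashlib


def wpa2_key(passphrase, ssid):
    """Derive the 256-bit WPA2-PSK key: PBKDF2-HMAC-SHA1, SSID as salt, 4096 rounds."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("ascii"),
        ssid.encode("utf-8"),
        4096,
        32,
    )


if __name__ == "__main__":
    passphrase = "an example passphrase"  # hypothetical, for illustration only
    # Same passphrase, different SSIDs: the derived keys have nothing in common,
    # so a table built for "linksys" is useless against any other network name.
    print(wpa2_key(passphrase, "linksys").hex())
    print(wpa2_key(passphrase, "x7Qp2LmZ").hex())
```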
<h4>Other considerations</h4>
<p>There is also the issue of traffic over the air, where the question is whether to use TKIP or AES.  This one is easy: there is <a href="http://en.wikipedia.org/wiki/WPA2#Weakness_in_TKIP">a weakness in TKIP</a>, so don&#8217;t use it.</p>
<p>If you know ahead of time which machines will exclusively use your access point, then MAC address filtering is an extra security measure.  While MAC addresses can be spoofed, it takes extra time to do so, and it can be a hassle to brute force your way through the address space.  MAC address filtering is an option on my wireless router, so I have chosen to disallow all network access except for the few wireless devices whose MAC addresses I know.</p>
<p>So you know, it can become a hassle to keep your MAC address whitelist up to date if you keep adding new wireless devices, like if you have friends who drop by with their laptops.  It&#8217;s probably worth the 60 seconds it takes to add a new device, but YMMV.</p>
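<p>The logic behind a default-deny whitelist is simple enough to sketch in a few lines.  This is not how any particular router implements it; the addresses and function names below are invented for illustration.</p>

```python
def normalize_mac(mac):
    """Lower-case a MAC address and use colons as the separator."""
    digits = mac.replace("-", "").replace(":", "").replace(".", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("not a MAC address: " + mac)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


# Hypothetical whitelist; these addresses are made up.
ALLOWED = {normalize_mac(m) for m in ("00:1A:2B:3C:4D:5E", "00-1a-2b-3c-4d-5f")}


def is_allowed(mac):
    """Default deny: only whitelisted addresses get through."""
    try:
        return normalize_mac(mac) in ALLOWED
    except ValueError:
        return False


if __name__ == "__main__":
    print(is_allowed("00:1a:2b:3c:4d:5e"))
    print(is_allowed("de:ad:be:ef:00:01"))
```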
<h4>Recommendations</h4>
<p>After all is said and done, it looks like it&#8217;s possible to create a relatively secure wireless access point.  Here are my recommendations:</p>
<p>- Use WPA2-PSK</p>
<p>- Use a PSK that is the maximum allowable length (63 characters for an ASCII passphrase).  Generate it with a completely random method that draws on all allowable ASCII characters (mixed case, numbers, and symbols).  Your wireless access point will probably call this a &#8220;password&#8221; or something, but just know that this is the &#8220;pre-shared key&#8221; (PSK).</p>
<p>- Encrypt all traffic with AES instead of TKIP</p>
<p>- Use a randomly generated SSID to name your access point</p>
<p>- Tell your access point to NOT broadcast its SSID.  This will prevent it from showing up in the list of available access points when someone clicks on their wireless network card to scan.  This won&#8217;t deter the most determined attackers, but do this if it&#8217;s an option.</p>
<p>- Use MAC address filtering.  Disallow all by default, and whitelist the devices you want to explicitly allow.</p>
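<p>If you need a way to produce the random PSK and SSID, here is one possible sketch using Python&#8217;s standard <code>secrets</code> module.  The function names and the 16-character SSID length are my own choices, and the alphabet leaves out the space character to avoid copy-and-paste trouble.  Some access points may reject certain symbols, so treat this as a starting point.</p>

```python
import secrets
import string

# Printable ASCII without the space character (94 symbols).
PSK_ALPHABET = string.ascii_letters + string.digits + string.punctuation
# Letters and digits only, so the SSID is easy to type on any device.
SSID_ALPHABET = string.ascii_letters + string.digits


def random_psk(length=63):
    """Return a random passphrase; 63 characters is the longest WPA2 allows."""
    return "".join(secrets.choice(PSK_ALPHABET) for _ in range(length))


def random_ssid(length=16):
    """Return a random network name (SSIDs can be at most 32 bytes long)."""
    return "".join(secrets.choice(SSID_ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print("SSID:", random_ssid())
    print("PSK: ", random_psk())
```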
<p>This should be a pretty good starting point, and it works with my $32 wireless router.  There may be new attacks in the future, and hard drive space will obviously get cheaper, but I feel pretty comfortable at this precise moment.</p>
]]></content:encoded>
			<wfw:commentRss>http://iandennismiller.com/blog/2009/07/wireless-security-in-2009-recommendations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
