<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>automaticable &#187; hack</title>
	<atom:link href="http://www.automaticable.com/category/hack/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.automaticable.com</link>
	<description>adjective: of or pertaining to things that should work but go awry</description>
	<lastBuildDate>Sun, 27 Mar 2011 16:16:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Crazy One-Liners</title>
		<link>http://www.automaticable.com/2008-02-22/crazy-one-liners/</link>
		<comments>http://www.automaticable.com/2008-02-22/crazy-one-liners/#comments</comments>
		<pubDate>Sat, 23 Feb 2008 01:42:27 +0000</pubDate>
		<dc:creator>Scott Wegner</dc:creator>
				<category><![CDATA[hack]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ubuntu]]></category>
		<category><![CDATA[awk]]></category>
		<category><![CDATA[header]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[lwp-request]]></category>
		<category><![CDATA[one liners]]></category>
		<category><![CDATA[Scott Wegner]]></category>
		<category><![CDATA[script]]></category>
		<category><![CDATA[sh]]></category>
		<category><![CDATA[shell]]></category>
		<category><![CDATA[terminal]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.automaticable.com/2008-02-22/crazy-one-liners/</guid>
		<description><![CDATA[So I wrote a pretty interesting one-line command for a specific task today. Here it is&#8211; can you guess what it does? awk '1 {system("lwp-request -Sm HEAD " $0)}' \ input.txt &#124; awk '/200 OK/ {print $2}' &#62; output.txt Yeah, me neither if I were just looking at it. But let&#8217;s break it apart, piece-by-piece. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/adactio/2144285151/" target="_blank" title="Command Terminal"><img src="http://www.automaticable.com/wp-content/uploads/2008/02/terminal.thumbnail.jpg" alt="Command Terminal" class="imageframe imgalignleft" height="188" width="300" /></a>So I wrote a pretty interesting one-line command for a specific task today.  Here it is&#8211; can you guess what it does?</p>
<pre>awk '1 {system("lwp-request -Sm HEAD " $0)}' \
input.txt | awk '/200 OK/ {print $2}' &gt; output.txt</pre>
<p>Yeah, me neither if I were just looking at it.  But let&#8217;s break it apart, piece-by-piece.  You&#8217;ll notice that it&#8217;s essentially two commands, strung together through some piping and redirection (the &#8220;\&#8221; character is just there to break the command up into two lines).  It&#8217;s broken up like so:</p>
<pre>{command1} | {command2} &gt; {file}</pre>
<p>This says to execute <em>command1</em> first.  Then pipe its output into <em>command2</em> as input.  Finally, take the output of <em>command2</em> and throw it all into a file.  So we start with the first command:</p>
<pre>awk '1 {system("lwp-request -Sm HEAD " $0)}' input.txt</pre>
<p>So what the heck does awk do?  Well, it&#8217;s basically a utility to read in input text, do some filtering on it, and then execute a specific task (or tasks) based on the results.  In this case, it has the form:</p>
<pre>awk 'filter {command}' input</pre>
<p>Skipping first to <em>input</em>, we see that the text we want to process comes from a simple text file&#8211; in this case, <em>input.txt</em>.  <em>filter</em> is what decides which lines of the input actually get used.  Generally it&#8217;s in the form of a regular expression, and only the matching lines are processed.  In our case, we just use <em>1</em>, which is always true, so everything matches and we will process all lines.  Next, to the <em>command</em>:</p>
<pre>system("lwp-request -Sm HEAD " $0)</pre>
<p>In awk, the <em>system</em> function executes the command it is given in a sub-shell.  The command is a string: here, the quoted text followed by <em>$0</em>, which awk glues onto the end.  <em>$0</em> stands for the entire matching line (the first token alone would be <em>$1</em>), but since each line of our input holds nothing but a single token, the two come out the same.  So the command we really want to look at is:</p>
<pre>lwp-request -Sm HEAD {token}</pre>
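<p>If you want to poke at the awk side of this without hitting a real server, here&#8217;s a rough sketch.  The input line is made up (I added a trailing note so that the whole line and its first token differ), and <em>echo</em> stands in for the real request, so treat it as an illustration rather than what I actually ran:</p>

```shell
# Made-up input: a URL plus a trailing note, so that $0 and $1 differ.
line='http://example.com/page some trailing note'

# $0 is the whole line; $1 is only the first whitespace-separated token.
whole=$(printf '%s\n' "$line" | awk '1 {print $0}')
first=$(printf '%s\n' "$line" | awk '1 {print $1}')

# system() hands its string to a sub-shell. echo stands in for lwp-request
# here, so nothing touches the network.
ran=$(printf '%s\n' "$line" | awk '1 {system("echo would run: lwp-request -Sm HEAD " $1)}')

printf '%s\n%s\n%s\n' "$whole" "$first" "$ran"
```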
<p>The lwp-request command, as seen here, is a command-line utility to send HTTP requests to a server and observe the response.  It has one required argument, which is the URL to query.  Since we don&#8217;t see that explicitly here, that must be coming from the <em>token</em> we parsed from our input.  We also specify two other parameters.  <em>-S</em> tells the program to &#8220;print the response chain,&#8221; meaning that it will show any redirection or authorization handled automatically.  Also, we use <em>-m HEAD</em>, which sends the request with the HTTP HEAD method, so the server answers with just the headers and no body.  So far, pretty confusing, right?  Well, let&#8217;s see what a sample query looks like:</p>
<pre>$ lwp-request -Sm HEAD http://google.com
HEAD http://google.com --&gt; 301 Moved Permanently
HEAD http://www.google.com/ --&gt; 200 OK
Cache-Control: private
Connection: Close
Date: Sat, 23 Feb 2008 01:27:27 GMT
Server: gws
Content-Length: 0
Content-Type: text/html; charset=ISO-8859-1
Client-Date: Sat, 23 Feb 2008 01:27:36 GMT
Client-Peer: 64.233.167.104:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=4b507d757f70e13b:TM=1203730047: (...)</pre>
<p>Interesting, sort of.  Anyway, let&#8217;s move on.  So that little piece of code is getting executed once for every line in our input file.  Then, the output is getting &#8220;piped&#8221; into our next command:</p>
<pre>awk '/200 OK/ {print $2}'</pre>
<p>We&#8217;ve seen awk before!  This time, though, we don&#8217;t specify an input file, because the input comes directly from the previous command.  Our other parameters have changed as well.  The filter is no longer <em>1</em>, but rather <em>/200 OK/</em>.  This is a true (albeit simple) regular expression, and matches any line that contains the string &#8220;200 OK&#8221;.  Only lines with this string will be processed.  Which brings us to our command, or action: <em>print $2</em>.  <em>print</em> simply outputs what follows: in this case <em>$2</em>, which represents the second parsed token.  awk is going to consider everything that is piped in from <em>command1</em>, filter out lines it doesn&#8217;t care about, and execute the action on the filtered set.  Looking at our sample output above, the only matching line is:</p>
<pre>HEAD http://www.google.com/ --&gt; 200 OK</pre>
<p>This line will be used in the command, <em>print $2</em>.  The command simply prints the second <em>token</em> (separated by a space) on the line, so it outputs:</p>
<pre>http://www.google.com/</pre>
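<p>You can try this second half on its own, too.  Here&#8217;s a small sketch that feeds a few lines copied from the sample output above into the same awk command (the variable names are just ones I picked for the demo):</p>

```shell
# A few lines copied from the sample lwp-request output above.
chain='HEAD http://google.com --> 301 Moved Permanently
HEAD http://www.google.com/ --> 200 OK
Cache-Control: private
Server: gws'

# Only the "200 OK" line passes the filter; $2 is its second token.
final=$(printf '%s\n' "$chain" | awk '/200 OK/ {print $2}')
printf '%s\n' "$final"
```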
<p>The final piece of our code redirects <em>command2</em>&#8216;s output into a file, <em>output.txt</em>.  And that&#8217;s it!  So putting the pieces together, let&#8217;s look at what is really happening here:</p>
<ul>
<li>We read in data from an input file for parsing.  We can infer that each line contains a URL, which is needed later.</li>
<li>Each URL is passed to the <em>lwp-request</em> command, which outputs header information from the server.</li>
<li>We filter the response information down to only the bits we care about: in this case, a new URL.</li>
<li>Finally, we write each of these &#8220;new&#8221; URLs to an output file.</li>
</ul>
<p>So, that&#8217;s the whole one-liner.  A little more compactly, it&#8217;s a piece of code that takes a list of input URLs, and outputs the URL that each one finally lands on after any redirects (as long as that last response is a 200 OK).  It&#8217;s a pretty specific snippet, and has absolutely no error-checking, so it is definitely prone to bugs.  But, it worked for me the one time I needed it, and it was enough to show off.</p>
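<p>For the curious, here&#8217;s one way the same idea might look with a little more care.  This is only a sketch, not what I ran: <em>final_urls</em> and the <em>FETCH</em> override are names I made up for illustration, and the only lwp-request flags it leans on are the <em>-Sm HEAD</em> ones from above.  It quotes each URL rather than gluing it into a command string, and it complains on stderr when a URL never reaches a 200 OK:</p>

```shell
# Hypothetical helper (not the original one-liner): reads URLs on stdin and
# prints the URL from each "--> 200 OK" line of the response chain.
# FETCH is an assumed override hook for testing; it defaults to lwp-request.
# Usage: cat input.txt | final_urls > output.txt
final_urls() {
    while IFS= read -r url; do
        # Skip blank lines instead of sending an empty request.
        [ -n "$url" ] || continue
        # Quoting "$url" keeps odd characters away from the shell, unlike
        # building a command string for awk's system(). The fetcher gets
        # /dev/null as stdin so it cannot swallow the rest of the URL list.
        # Matching on "--> 200 OK" avoids header lines that merely contain
        # the text "200 OK".
        hit=$("${FETCH:-lwp-request}" -Sm HEAD "$url" < /dev/null |
            awk '/--> 200 OK/ {print $2}')
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
        else
            printf 'no 200 OK for %s\n' "$url" >&2
        fi
    done
}
```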
<p>On a side-note, this little piece of code made the difference between hours of mindless data-entry, and automated awesomeness.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.automaticable.com/2008-02-22/crazy-one-liners/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

