From here on out, I'm going to assume that you have access to a Unix environment. If you use Windows, I suggest that you install Cygwin to get a Unix environment on your machine if you want to follow along. The default Windows terminal is also pretty crappy so I recommend installing the rxvt-native terminal from within Cygwin.

Every operating system in the Unix family (including Mac OS X) ships with a set of core utilities, or coreutils. These are small, fast programs which are invoked via the command line. The coreutils are truly the holy grail of automation.

By design, each utility is very, very good at one specific thing. For instance, cut will print out one or several slices of the text you pass into it, as determined by a "delimiter character" that you specify. paste, on the other hand, will join two files together line-by-line. Together, these programs can be used, for example, to combine reports from multiple files and format the output how you want.
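Here's a minimal sketch of that cut and paste combination. The file names and their contents are made up for illustration, and everything happens in a scratch directory created with mktemp so nothing real is touched.

```shell
# Scratch directory for the hypothetical example files.
workdir=$(mktemp -d)

# Two made-up reports, one record per line.
printf 'alice,30,london\nbob,25,paris\n' > "$workdir/people.csv"
printf 'engineer\ndesigner\n' > "$workdir/jobs.txt"

# cut: keep only fields 1 and 3, using "," as the delimiter character.
cut -d ',' -f 1,3 "$workdir/people.csv" > "$workdir/names_cities.csv"

# paste: join the two files line by line, separated by ",".
combined=$(paste -d ',' "$workdir/names_cities.csv" "$workdir/jobs.txt")
echo "$combined"
```

Each program does one small job, and the combined report falls out of running them one after the other.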

This design is known as the Unix "toolbox" philosophy. The analogy is that a swiss army knife is supposed to be good at everything. But you can't build a house with a swiss army knife. Professional contractors have a toolbox full of highly specialized tools that can only do one thing, but are extremely good at that one thing. The same goes for each of the core utilities. Some software tries to be a swiss army knife and often winds up as the "jack of all trades, master of none." The core utilities are like the contractor's toolbox. By using several specialized tools in combination, you can achieve quick and powerful results.

The glue for the coreutils that allows you to use them together is called the shell. The shell is what you are interacting with when you start your "terminal" program, such as the one provided by Mac OS X located under /Applications/Utilities/Terminal.app. Bash is the most popular shell today, and is the default shell on most modern systems including recent versions of Mac OS X.

The shell interprets commands that you type into it to construct a pipeline between the different programs you are using. So, if I type:

ls

The ls program lists the files in the current directory and writes them to its output, which the shell has connected to your terminal, so they appear on the screen. But now, if I type:

ls | grep "MyFile"

the shell takes the output of ls and, instead of letting it go to the screen, connects it to the input of another program called grep. grep searches through the output of ls for the text "MyFile" and prints only the lines that contain it. Because grep is the last program in the pipeline, its output is what appears on the screen.
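To try the same pipeline without depending on what happens to be in your current directory, here's a small sketch that uses a scratch directory with made-up file names. It also adds one more stage, sort -r, just to show that a pipeline can keep growing.

```shell
# Scratch directory with three hypothetical files.
workdir=$(mktemp -d)
touch "$workdir/MyFile1.txt" "$workdir/MyFile2.txt" "$workdir/notes.txt"

# ls writes the names, grep keeps the matching lines,
# and sort -r puts them in reverse order.
matches=$(ls "$workdir" | grep "MyFile" | sort -r)
echo "$matches"
```

notes.txt never makes it past grep, so only the two MyFile entries reach sort.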

Where we really get cooking is when we start using output from one command as instructions for another command. The find utility is great for this, as is the xargs utility. find searches recursively in the folder you specify for files which match the parameters you give it. Then, using the -exec flag, you can tell find to execute any command on each one of those files. xargs does something similar, but it takes any text as input (typically, file names) and provides it as arguments to another program. Both find and xargs are designed to control other programs, and can be used for tasks such as a batch move, batch rename, batch conversion, etc.
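Here's a rough sketch of both approaches side by side, using a scratch directory and made-up file names. The -print0 and -0 flags are my addition: they are not part of the oldest find and xargs implementations, but GNU and BSD versions both support them, and they keep file names with spaces intact.

```shell
# Scratch directory tree with hypothetical files.
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/a/one.log" "$workdir/a/b/two.log" "$workdir/a/keep.txt"

# find with -exec: run a command once for each matching file.
# {} is replaced by the file's path, and \; ends the command.
find "$workdir" -name '*.log' -exec echo found {} \;

# find with xargs: find prints the paths, xargs hands them to basename.
# -n 1 tells xargs to pass one path per invocation.
found=$(find "$workdir" -name '*.log' -print0 | xargs -0 -n 1 basename | sort)
echo "$found"
```

Either way, find decides which files to act on and another program does the actual work.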

So, in keeping with our current example:

ls | grep "MyFile" | xargs open

The output of grep, the names of those files or folders containing the text "MyFile", is passed by the shell to xargs. xargs passes them to the open command, a Mac OS X utility that opens each file as if you had double-clicked it in Finder. Simple functionality like this can be more elegantly achieved using only the find command. See my tutorial on using find to learn more about this incredible tool.
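As a sketch of what the find-only version might look like, here's one way to do it, run against a scratch directory with made-up files. I've put echo in front of open so it only prints what it would do. The -maxdepth 1 flag is my addition to keep find from descending into subfolders, matching what ls does. It is supported by GNU and BSD find but may not exist everywhere.

```shell
# Scratch directory with hypothetical files, one with a space in its name.
workdir=$(mktemp -d)
touch "$workdir/MyFile one.txt" "$workdir/other.txt"

# Match names containing "MyFile" in the top-level folder only,
# and print the open command that would be run for each.
planned=$(find "$workdir" -maxdepth 1 -name '*MyFile*' -exec echo open {} \;)
echo "$planned"
```

Because find passes each path to the command as a single argument, a name with a space in it doesn't get split in two the way it can with the ls and xargs pipeline.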

Also worth mentioning are sed and awk. These programs are used to perform manipulation of text inside a pipeline.
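To give a feel for what that looks like, here's a tiny sketch on some made-up two-column data: sed rewrites text as it flows past, and awk splits each line into fields and computes with them.

```shell
# Hypothetical input: a name and a number on each line.
log=$(printf 'alice 10\nbob 20\nalice 5\n')

# sed: replace the first "alice" on every line with "ALICE".
renamed=$(printf '%s\n' "$log" | sed 's/alice/ALICE/')
echo "$renamed"

# awk: add up the second field of every line, then print the total.
total=$(printf '%s\n' "$log" | awk '{ sum += $2 } END { print sum }')
echo "$total"
```

Both programs read their input line by line, which is why they slot so easily into the middle of a pipeline.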

There is a learning curve here, but the more I learned about how to effectively use the command line and the coreutils, the more my productivity increased.

Mastery of your shell's syntax is crucial in becoming effective. Using the shell, you can construct complicated pipes which branch and merge data and execute commands to do practically anything on a massive scale.

To become a master of using the shell to automate things, I recommend that you practice doing more and more things from the shell instead of from a GUI. You need to understand what the most common utilities do, and the common arguments/flags that you can pass them in order to get their full range of functionality. If/when you move on to writing scripts, this knowledge will help you take advantage of built-in functionality instead of trying to implement it yourself. It becomes easier to practice shell skills if you run a website, since you often have to log in via ssh to a webserver in order to install software and make configuration changes on the webserver. I picked up most of my basic shell skills through running my personal website.

A word of warning: Make sure you are familiar with using a command normally before invoking it with find or xargs. This type of usage requires knowledge of how the shell and different programs handle spaces, etc. You can cause damage or data loss if you don't know what you're doing. So learn the basics first. Test out your batch commands by adding echo in front of the command, so it will print instead of execute and you can inspect it. Test out your execution on one or two files before you execute the whole batch.
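Here's a sketch of that workflow from start to finish, in a scratch directory with made-up files. The folder name archive and the -maxdepth 1 and -type f flags are my own choices for the illustration.

```shell
# Scratch directory with hypothetical files and a destination folder.
workdir=$(mktemp -d)
mkdir "$workdir/archive"
touch "$workdir/test one.txt" "$workdir/test two.txt"

# Step 1: dry run. echo prints each mv command instead of running it.
find "$workdir" -maxdepth 1 -type f -name 'test*' -exec echo mv {} "$workdir/archive" \;

# Step 2: try the real command on a single file and check the result.
mv "$workdir/test one.txt" "$workdir/archive/"
ls "$workdir/archive"

# Step 3: remove echo and run the whole batch.
find "$workdir" -maxdepth 1 -type f -name 'test*' -exec mv {} "$workdir/archive" \;

moved=$(ls "$workdir/archive" | sort)
remaining=$(find "$workdir" -maxdepth 1 -type f -name 'test*')
echo "$moved"
```

One caveat: echo prints its arguments separated by spaces, so in the dry run you can't tell a file name containing a space apart from two separate arguments. That's one more reason to do the single-file test in step 2.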

Here are some resources for learning more about the command line:

  • LinuxCommand.org - Excellent in-depth overview of the command line, suitable for total beginners
  • Bash Guide on Greg's Wiki
  • man - Take advantage of the built-in manuals for each command on your system by typing "man somecommand". Type q to exit.
  • Built-in help: Many programs have built-in help. This is usually available by invoking the program with the -h or --help flag, such as "mv --help".
  • Google - I can't stress enough how useful it is to just look something up on Google when you're stuck. Chances are, somebody else was having this exact same problem and found a solution!


Here are some examples of tasks and automation that I frequently perform using the shell and some utilities. These are provided not as a tutorial, but as examples of the kind of powerful automation you can achieve from a well-crafted command.

I've added echo statements to all the examples that will cause a change so they will print the commands they would execute instead of actually executing them. I'm worried about people randomly pasting commands into their terminal and running them. Here's a hint: DON'T DO THAT! With great power comes great responsibility. Make sure you understand what a command does before you execute it on your system so you don't get owned! Note that find (and other) syntax can vary from system to system and shell to shell so these may not all work in your environment.

Move a bunch of files matching my search term to a different location using find:

find . -name 'test*' -exec echo mv {} ../foodir \;

Batch-convert a bunch of files matching my search term using find and a converter program (imagemagick in this example):

find . -name '*.jpg' -exec echo convert {} {}.png \;
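Note that the command above names its output by tacking .png onto the full file name, so photo.jpg becomes photo.jpg.png. If you'd rather get photo.png, one possible variation is to let a small sh -c snippet strip the old extension. This is a sketch of that idea, still with echo in front so nothing is converted, and photo.jpg is a made-up file in a scratch directory.

```shell
# Scratch directory with one hypothetical image file.
workdir=$(mktemp -d)
touch "$workdir/photo.jpg"

# For each match, run a tiny shell script with the path as $1.
# ${1%.jpg} is the path with the trailing ".jpg" removed.
planned=$(find "$workdir" -name '*.jpg' \
  -exec sh -c 'echo convert "$1" "${1%.jpg}.png"' sh {} \;)
echo "$planned"
```

The extra sh after the quoted script just fills in $0, so that the file path lands in $1.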

Tidy up the output of a monitoring program like ps to only contain the information I'm interested in, using head, grep, cat, and bash process substitution syntax:

cat <(ps | head -n 1) <(ps | grep bash)
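The <( ) syntax is specific to bash and a few other shells. If you need something that works in a plain sh, one alternative (my substitution, not the command above) is to let awk keep the header line plus any matching lines in a single pass. This sketch runs it on a made-up sample of ps output so the result is predictable.

```shell
# Hypothetical ps output: a header line followed by two processes.
sample=$(printf 'PID TTY TIME CMD\n101 ttys000 0:00.01 bash\n102 ttys000 0:00.02 vim\n')

# NR == 1 matches the first line (the header); /bash/ matches any
# line containing "bash". awk prints lines matching either condition.
filtered=$(printf '%s\n' "$sample" | awk 'NR == 1 || /bash/')
echo "$filtered"
```

On a real system the equivalent would be ps | awk 'NR == 1 || /bash/', which also has the side benefit of running ps only once.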

Use diff, comm, and other programs to find the differences between two files/directories, or perform set theory operations:

There's a whole in-depth article on this!
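As a small taste, here's a sketch of a few set operations using comm and sort on two made-up files. comm expects both inputs to be sorted already, which these are.

```shell
# Scratch directory with two hypothetical sorted lists.
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/a.txt"
printf 'banana\ncherry\ndate\n' > "$workdir/b.txt"

# comm prints three columns: lines only in the first file, lines only
# in the second, and lines in both. The -1 -2 -3 flags hide columns.
both=$(comm -12 "$workdir/a.txt" "$workdir/b.txt")    # intersection
only_a=$(comm -23 "$workdir/a.txt" "$workdir/b.txt")  # in a but not b
only_b=$(comm -13 "$workdir/a.txt" "$workdir/b.txt")  # in b but not a

# sort -u merges both files and drops duplicates: the union.
union=$(sort -u "$workdir/a.txt" "$workdir/b.txt")

echo "$both"
echo "$only_a"
echo "$only_b"
echo "$union"
```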

I hope I have whetted your appetite a bit to go learn more about using the command line. With practice, you can accomplish in a few keystrokes what could take tons of clicking and dragging in a GUI.
