Perform your shell work in parallel
02 Jun 2019Introduction
In some cases, breaking your larger programming problems into smaller, parallelizable units makes sense from a time complexity problem. If the work you are trying to perform exhibits some of these parallelizable characteristics, you should only need to wait for the longest of your jobs to finish.
In today’s post, we’ll be talking about GNU Parallel.
A summary from their website:
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
Input
The input system is quite complex. Delimiting the inputs with the :::
operator, parallel
will make a catersian product out of the input values.
parallel echo ::: John Paul Sally
John
Paul
Sally
parallel echo ::: John Paul Sally ::: Smith
John Smith
Paul Smith
Sally Smith
parallel echo ::: John Paul Sally ::: Smith Jones Brown
John Smith
John Jones
John Brown
Paul Smith
Paul Jones
Paul Brown
Sally Smith
Sally Jones
Sally Brown
Linkage is possible using :::+
, should this flexibility be required.
parallel echo ::: John Paul Sally :::+ Smith Jones Brown
John Smith
Paul Jones
Sally Brown
See more about input sources in the tutorial.
curl
For some examples, I’ll use curl.
Let’s get three web pages downloaded:
- google.com
- yahoo.com
- zombo.com
Getting these down, one at a time times in at nearlly 1.2 seconds.
( curl www.google.com && curl www.yahoo.com && curl www.zombo.com; ) 0.02s user 0.05s system 5% cpu 1.195 total
Running these downloads in parallel, we take half a second off the time:
parallel curl {1} ::: www.google.com www.yahoo.com www.zombo.com 0.21s user 0.04s system 33% cpu 0.774 total
Summing up
GNU Parallel is a great utility to get multiple things done at once at the shell. Take a look at the tutorial and immediately become more productive.