Cogs and Levers A blog full of technical stuff

Perform your shell work in parallel

Introduction

In some cases, breaking a larger programming problem into smaller, parallelizable units makes sense from a time-complexity perspective. If the work you are trying to perform can be split into independent jobs, you should only need to wait for the longest of those jobs to finish.

In today’s post, we’ll be talking about GNU Parallel.

A summary from their website:

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.

Input

The input system is quite flexible. Each input source is introduced with the ::: operator, and when you supply more than one source, parallel builds the Cartesian product of the input values.

parallel echo ::: John Paul Sally
John
Paul
Sally

parallel echo ::: John Paul Sally ::: Smith
John Smith
Paul Smith
Sally Smith

parallel echo ::: John Paul Sally ::: Smith Jones Brown
John Smith
John Jones
John Brown
Paul Smith
Paul Jones
Paul Brown
Sally Smith
Sally Jones
Sally Brown

If you want to pair values by position instead of combining them, link an input source to the previous one with :::+.

parallel echo ::: John Paul Sally :::+ Smith Jones Brown
John Smith
Paul Jones
Sally Brown

See more about input sources in the tutorial.
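
To make the two combination rules concrete, here is a minimal sketch that reproduces them sequentially with plain shell loops. It doesn't call parallel at all, and the variable names are made up for illustration; it only mirrors which argument combinations get generated, not the parallel execution.

```shell
first_names="John Paul Sally"
last_names="Smith Jones Brown"

# ::: first ::: last  -> every first name combined with every last name
product=$(
  for first in $first_names; do
    for last in $last_names; do
      echo "$first $last"
    done
  done
)

# ::: first :::+ last -> values paired by position
set -- $last_names
linked=$(
  for first in $first_names; do
    echo "$first $1"
    shift
  done
)

echo "$product"
echo "$linked"
```

The first loop produces nine lines and the second produces three, matching the listings above.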

curl

For some examples, I’ll use curl.

Let’s get three web pages downloaded:

  • google.com
  • yahoo.com
  • zombo.com

Downloading these one at a time clocks in at nearly 1.2 seconds.

( curl www.google.com && curl www.yahoo.com && curl www.zombo.com; )  0.02s user 0.05s system 5% cpu 1.195 total

Running these downloads in parallel takes roughly 0.4 seconds off the time:

parallel curl {1} ::: www.google.com www.yahoo.com www.zombo.com  0.21s user 0.04s system 33% cpu 0.774 total
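
The saving comes from overlapping the waits. As a rough sketch of the same idea using only shell background jobs, the following uses sleep as a stand-in for a slow download. The fetch function is a made-up placeholder and is not part of curl or parallel.

```shell
# Hypothetical stand-in for a slow network request.
fetch() {
  sleep 1
  echo "fetched $1"
}

start=$(date +%s)

# Start every job in the background, then wait for all of them to finish.
for site in www.google.com www.yahoo.com www.zombo.com; do
  fetch "$site" &
done
wait

elapsed=$(( $(date +%s) - start ))
echo "elapsed: ${elapsed}s"
```

Three one-second jobs should finish in about one second rather than three, because the total is bounded by the longest job. The order of the "fetched" lines is not guaranteed.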

Summing up

GNU Parallel is a great utility for getting multiple things done at once from the shell. Take a look at the tutorial and immediately become more productive.

Mounting Windows filesystems in Linux

A quick note on mounting Windows filesystems on Linux.

Your platform will require cifs-utils.

sudo apt install cifs-utils

From here, you can connect and mount a remote file system.

sudo mount -t cifs //ip.address.of.windows/location/to/mount /mnt -o user=john,password=mypassword,domain=mydomain

Done. You can now access the remote filesystem.
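
Passing the password on the command line leaves it in your shell history. mount.cifs also accepts a credentials= option that points at a file instead. The sketch below writes such a file using the placeholder values from the command above and prints the mount command rather than running it. A temporary file is used here only for demonstration; in practice you would likely keep it somewhere persistent.

```shell
# Demo location only; a real credentials file would live somewhere persistent.
cred_file=$(mktemp)

# Restrict the file to its owner before writing secrets into it.
chmod 600 "$cred_file"

# Placeholder values taken from the mount command above; substitute your own.
cat > "$cred_file" <<'EOF'
username=john
password=mypassword
domain=mydomain
EOF

# Print (rather than run) the mount command that would use the file.
mount_cmd="sudo mount -t cifs //ip.address.of.windows/location/to/mount /mnt -o credentials=$cred_file"
echo "$mount_cmd"
```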

HDF5

In today’s article, we’ll talk a bit about the HDF5 format. From the HDF5 website:

An HDF5 file is a container for two kinds of objects: datasets, which are array-like collections of data, and groups, which are folder-like containers that hold datasets and other groups.


Text search with grep

GNU grep is a powerful tool for searching input files for lines that match specified patterns.

The following command performs a recursive search for a string, inside a directory of files:

grep -rnw '/path/to/search' -e 'pattern-to-find'

Breaking this command down:

  • -r makes the execution recursive
  • -n will give you the line number
  • -w matches the whole word

You can use the --exclude switch to remove file patterns from the set of files included in your search. Removing js and html files from your search might look like this:

--exclude=\*.{js,html}

The opposite of this is the --include switch, which restricts the search to files matching the given pattern.

For further details on this really useful tool, check out the man page.
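
As a small end-to-end sketch of these switches, the following builds a throwaway directory and searches it. All file names and contents are invented for the example.

```shell
# Build a throwaway directory tree to search (all names here are made up).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src"
printf 'first line\nfind the needle here\n' > "$demo_dir/src/notes.txt"
printf 'needles everywhere\n' > "$demo_dir/src/other.txt"
printf 'var needle = 1;\n' > "$demo_dir/src/app.js"

# -r recurses, -n adds line numbers, -w matches whole words only;
# --exclude skips any file whose name matches *.js.
matches=$(grep -rnw "$demo_dir" -e 'needle' --exclude='*.js')
echo "$matches"

# Clean up the demo tree.
rm -rf "$demo_dir"
```

Only line 2 of notes.txt is reported: other.txt contains "needles", which -w rejects as a partial word, and app.js is skipped by --exclude.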

Writing addons for node

Introduction

Sometimes you might find yourself needing a little more power from your Node.js application. You may need to squeeze some extra performance out of a piece of code that you simply can’t achieve using JavaScript alone. Node.js provides a rich SDK that allows application developers to write their own addons in C++.

These binary compiled modules then become directly accessible from your node.js applications.

In today’s article, I’d like to walk through the basic setup of an addon project. We’ll also add a function to the addon, and demonstrate the call from JavaScript to C++.

Setup

Before you can get developing, you’ll need to make sure you have some dependencies installed. Create a directory, and start a new node application.

mkdir my-addon
cd my-addon

npm init

You’ll need to let the package manager know that your application has a gyp file present by switching gypfile to true.

// package.json

{
  "name": "my-addon",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "gypfile": true,
  "scripts": {
    "build": "node-gyp rebuild",
    "clean": "node-gyp clean"
  },
  "author": "",
  "license": "ISC",
  "devDependencies": {
    "node-gyp": "^3.8.0"
  },
  "dependencies": {
    "node-addon-api": "^1.6.3"
  }
}

The project is going to require a gyp file called binding.gyp. It’s the responsibility of this file to generate the build environment that will compile our addon.

// binding.gyp

{
  "targets": [{
    "target_name": "myaddon",
    "cflags!": ["-fno-exceptions"],
    "cflags_cc!": ["-fno-exceptions"],
    "sources": [
      "src/main.cpp"
    ],
    "include_dirs": [
      "<!@(node -p \"require('node-addon-api').include\")"
    ],
    "libraries": [],
    "dependencies": [
      "<!(node -p \"require('node-addon-api').gyp\")"
    ],
    "defines": [ "NAPI_DISABLE_CPP_EXCEPTIONS" ]
  }]
}

With these in place, you can install your dependencies.

npm install

Your first module

The gyp file notes that the source of our addon sits at src/main.cpp. Create this file now, and we can fill it out with the following.

// src/main.cpp

#include <napi.h>

Napi::Object InitAll(Napi::Env env, Napi::Object exports) {
  return exports;
}

NODE_API_MODULE(myaddon, InitAll)

The keen reader will notice that our module does nothing. That’s ok to start with. This is an exercise in checking that the build environment is set up correctly.

Import and use your addon just like you would any other module from within the node environment.

// index.js

const myAddon = require("./build/Release/myaddon.node");
module.exports = myAddon;

Build and run

We’re ready to run.

npm run build
node index.js

Ok, great. As expected, that did nothing.

Make it do something

Let’s create a function that will return a string. We can then take that string, and print it out to the console once we’re in the node environment.

We’ll add a header file that declares our functions. We also need to tell our build environment that we’ve got another file in the project.

// binding.gyp

{
  "targets": [{
    "target_name": "myaddon",
    "cflags!": ["-fno-exceptions"],
    "cflags_cc!": ["-fno-exceptions"],
    "sources": [
      "src/funcs.h",
      "src/main.cpp"
    ],
    "include_dirs": [
      "<!@(node -p \"require('node-addon-api').include\")"
    ],
    "libraries": [],
    "dependencies": [
      "<!(node -p \"require('node-addon-api').gyp\")"
    ],
    "defines": [ "NAPI_DISABLE_CPP_EXCEPTIONS" ]
  }]
}

We declare the functions for the addon in the header.

// src/funcs.h

#pragma once

#include <napi.h>

namespace myaddon {
  // Returns a greeting string to the JavaScript caller.
  Napi::String getGreeting(const Napi::CallbackInfo &info);
}

Now for the definition of the function, as well as its registration into the module.

// src/main.cpp

#include "funcs.h"

Napi::String myaddon::getGreeting(const Napi::CallbackInfo &info) {
  Napi::Env env = info.Env();
  return Napi::String::New(env, "Good morning!");
}

Napi::Object InitAll(Napi::Env env, Napi::Object exports) {
  exports.Set("getGreeting", Napi::Function::New(env, myaddon::getGreeting));
  return exports;
}

NODE_API_MODULE(myaddon, InitAll)

The getGreeting function is actually doing the work here. It’s simply returning a greeting. The InitAll function now changes to add a Set call on the exports object. This is just registering the function to be available to us.

Greetings

So, now we can actually use the greeting. We can just console.log it out.

// index.js

const myAddon = require("./build/Release/myaddon.node");

console.log(myAddon.getGreeting());

module.exports = myAddon;

We can now run our code.

➜  my-addon node index.js
Good morning!