Cogs and Levers A blog full of technical stuff

Eventing model in Node.js

An easy way to create an extensible API in Node.js is to use the EventEmitter class. It allows you to publish interesting injection points into your module so that client applications and libraries can respond when these events are emitted.

Simple example

In the following example, I’ll create a Dog class that exposes an event called bark. When this class internally decides that it’s time to bark, it will emit this event for us.

First of all, we define our class which includes a way to start the dog barking.

var util = require('util');
var EventEmitter = require('events').EventEmitter;

var Dog = function (name) {
    var self = this;

    // initialise the EventEmitter part of this object
    EventEmitter.call(self);

    self.name = name;

    self.barkRandomly = function () {
        // WOOF WOOF!
        var delay = Math.floor(Math.random() * 1000);

        setTimeout(function () {
            self.emit('bark', self);
            self.barkRandomly();
        }, delay);

    };

    self.on('bark', function (dog) {
        console.log(dog.name + ' is barking!');
    });
};

util.inherits(Dog, EventEmitter);

The barkRandomly function waits for a random interval of time (up to a second) and then emits the bark event for us. It’s an example for demonstration purposes so that you can see how you’d emit an event at the back end of a callback.

Note that the emit call allows us to specify some information about the event. In this example, we’ll just send the dog (or self) that’s currently barking.

Using the on function at the end, we’re also able to get the class itself to subscribe to its own bark event. The emit and on functions are available to us internally because we’ve used the inherits function from the util module to extend the Dog class with the attributes of EventEmitter.

All that’s left now is to create a dog and get it to bark.

var rover = new Dog('Rover');

rover.on('bark', function (dog) {
    console.log('I just heard ' + dog.name + ' barking');
});

rover.barkRandomly();

Running this code, you’ll end up with a stream of barking notifications scrolling down your terminal.

Subscription management

Just as you can subscribe to an emitted event, you can remove a handler from the event when you are no longer interested in updates from it. To continue from the example above, if we had a handler that only cared about the first 3 barks, we could manage the subscription like so:

var rover = new Dog('Rover');

var notificationCount = 0;

var handler = function (dog) {
    console.log('I just heard ' + dog.name + ' barking');
    
    notificationCount++;

    if (notificationCount === 3) {
        rover.removeListener('bark', handler);
    }
};

rover.on('bark', handler);

The operative line here is the call to removeListener, which must be passed the same function reference that was given to on.

You can simulate an irritable neighbor who would call the cops as soon as he heard your dog bark with a call to once which will fire once only, the first time it gets a notification:

rover.once('bark', function (dog) {
    console.log('I\'VE HAD IT WITH THAT DOG, ' + dog.name + '! I\'M CALLING THE COPS!');
});

Finally, all subscribers can be removed from any given event with a call to removeAllListeners.

Persistent brightness settings in Ubuntu

A really quick tip for persistently setting your video brightness level in Ubuntu, originally picked up from here.

Set the brightness level as you would normally with your control keys, then open a terminal to grab the current brightness setting:

cat /sys/class/backlight/acpi_video0/brightness

The value that you’re given as the output here can be used in your /etc/rc.local startup script. As the last item prior to the exit 0 statement, just add this:

echo your_value_here > /sys/class/backlight/acpi_video0/brightness

Docker Development Workflow Setup

As I write code in a lot of different languages using a lot of different frameworks, it makes sense for me to virtualise my development environments so that each gets its own isolated space and they can’t infect each other. In today’s post, I’m going to walk through my development environment setup using Docker.

Setting up a Clojure environment

The example that I’ll use is the development container that I have set up for Clojure. First of all, I create a workspace for my Clojure development on my host machine, under my source directory like anything else. I’ll put this in ~/src/clojure.

In that folder, I create two files: run.sh, which just gets a disposable container up and running, and Dockerfile, which is based on the clojure:latest image from the Docker Hub repository but adds a couple of extra bits and pieces to help the development environment get started.

Dockerfile

The Dockerfile is pretty straightforward but relies on some magic. Unfortunately, this is where my solution loses its portability. DOH! I’m still unsure of how to get around this. To keep permissions and ownerships common between the host and the container (because we’ll be mounting a volume from the host), I create my developer account called michael as the next account after root. This is how it is on my host machine, so there are no conflicting user/group ids.

FROM clojure:latest

RUN apt-get update && \
    apt-get install -y sudo && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

RUN adduser --disabled-password --gecos '' michael && \
    adduser michael sudo && \
    echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

WORKDIR /home/michael
ENV HOME /home/michael
VOLUME ["/home/michael"]

USER michael

A couple of small things to note.

After installing sudo, I clean up any dead weight that apt-get may have left behind. This is just container maintenance to ensure that the image that’s generated is as small as possible.

We add the michael account with no password and directly into the sudo group. We also adjust the sudo config so that users in the sudo group can issue administrative commands without needing to supply a password.

From there, it’s all about making the home directory of michael centre stage for the container and switching to the michael user.

run.sh

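The run script is fairly straightforward. All we need it to do is mount our current directory as a volume in the container and start bash. Of course, if you need to publish ports or create other volume mounts, here is where you’d do it. Note that, as written, the command starts the stock clojure:latest image; to use the image described by the Dockerfile above, build it with docker build and pass its tag to docker run instead.

#!/bin/bash

docker run -ti --rm -v "$(pwd)":/home/michael clojure:latest /bin/bash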

Start developing

You’re done now. Everything is created in the context of your non-root user. Your tools are available to you in your container and you’re free to develop using your editor on your host.

Approximation with the Monte Carlo method

An interesting way of finding values that fall within a domain is to perform random sample analysis with the Monte Carlo method. This method of finding values relies on random values (lots of them) being measured against some form of deterministic calculation to determine if the value falls within the source function’s scope.

In today’s post, I’m going to illustrate how to use this method in some practical scenarios.

Approximating π

To approximate the value of π, we’re going to treat a square containing a quarter of a unit circle with a lot of random data. We only treat a quarter of a circle (which will contain angles 0 through 90) as we can easily mirror image a single quarter 4 times. Mathematically, you can consider the ratio of the circle’s area with respect to the square that contains it.

Circle area = πr^2
Square area = (2r)^2
            = 4r^2

Ratio       = πr^2 / 4r^2
            = π/4

For every point that we randomly sample, the following must be true in order for us to consider the point as satisfying the circle:

rx^2 + ry^2 <= radius^2

This tells us that, measured from the origin 0,0, if the x and y values that we’ve randomly selected fall within the radius, we’ll consider the point as “in”.

Once we’ve sampled enough data, we’ll take the ratio of points that are “in” to the total number of points sampled. We are still only dealing with a quarter of a circle, so we’ll multiply our result by 4 and we should get close to π if we’ve sampled enough data. Here’s how it looks in Python:

from random import random

runs = 1000000
radius = 1

# get a batch of random points in the square [0, radius) x [0, radius)
rand_points = [(random() * radius, random() * radius) for _ in range(runs)]

# keep only the points that satisfy our equation
in_points = [(x, y) for (x, y) in rand_points
             if (x * x) + (y * y) <= (radius * radius)]

# calculate the ratio of points in the circle vs. all points sampled
ratio = len(in_points) / runs

# multiply this figure by 4 to get all 4 quadrants considered
estimate = ratio * 4
print(estimate)

runs is the number of points that we’re going to sample. radius is only defined to be clear. Because both the sampled square and the quarter circle scale with radius, the ratio (and so the estimate) stays the same whatever value you choose.

Running this code a few times, I get the following results:

3.140356
3.14274
3.14064
3.142
3.140664
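
The spread in those results is expected: the error of a Monte Carlo estimate shrinks roughly in proportion to 1/sqrt(runs), so each extra digit of accuracy costs about 100 times more samples. Here’s a minimal sketch that shows the trend. The estimate_pi helper and its seed parameter are my own hypothetical names, and a fixed seed is only used so that runs are repeatable.

```python
import math
import random


def estimate_pi(runs, seed=None):
    """Estimate pi by sampling `runs` points in the unit square."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    inside = 0
    for _ in range(runs):
        x, y = rng.random(), rng.random()
        # count the point if it falls inside the quarter circle
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / runs


if __name__ == "__main__":
    # print the sample count, the estimate and its absolute error
    for runs in (1_000, 10_000, 100_000):
        estimate = estimate_pi(runs, seed=42)
        print(runs, estimate, abs(estimate - math.pi))
```

Counting hits in a loop, rather than building the full list of points first, also keeps memory use flat as runs grows.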

Area under a curve

When it comes to finding the area under a curve, nothing really beats integrating the function analytically. In some cases though, your source function doesn’t have an antiderivative that’s easy to find. In these cases, you can use a Monte Carlo simulation to work it out. For the purposes of this post though, I’ll work with x^2 so that we can check the answer.

Let’s integrate it to begin with and work out what the area is between the x-axis points 0 and 3.

f(x) = x^2
F(x) = ∫f(x) dx = x^3/3

area = F(3) - F(0)
     = 27/3 - 0
     = 9

So we’re looking for a value close to 9. It’s also important to note the function’s output at the start and end of the interval that we want the area for, as these set up the bounds of our test:

f(x) = x^2
f(0) = 0
f(3) = 9

The rectangle that we’ll be sampling from runs from (0, 0) to (3, 9). The following code looks very similar to the π case. It has been adjusted to test the area and function:

from random import random

runs = 1000000
max_x = 3
max_y = 9

# get a batch of random points inside the bounding rectangle
rand_points = [(random() * max_x, random() * max_y) for _ in range(runs)]

# keep only the points that fall on or under the curve
in_points = [(x, y) for (x, y) in rand_points if y <= (x * x)]

# calculate the ratio of points under the curve vs. all points sampled
ratio = len(in_points) / runs

# the estimate is the ratio multiplied by the area of the rectangle
estimate = ratio * (max_x * max_y)
print(estimate)

Here are some example outputs. Remember, our answer is 9; we want something close to that:

9.015219
9.008199
8.986761
8.998317
9.006282
8.995995
9.00693
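
The same idea generalises to any non-negative function that you can bound with a rectangle. Here’s a minimal sketch of that, assuming you already know an upper bound for the function on the interval. The mc_area name and its parameters are hypothetical, picked just for this illustration.

```python
import random


def mc_area(f, x_min, x_max, y_max, runs, seed=None):
    """Estimate the area under f on [x_min, x_max].

    Assumes 0 <= f(x) <= y_max everywhere on the interval.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(0.0, y_max)
        # the point counts if it sits on or under the curve
        if y <= f(x):
            hits += 1
    box_area = (x_max - x_min) * y_max
    return box_area * hits / runs


if __name__ == "__main__":
    # the x^2 case from this post: the result should land close to 9
    print(mc_area(lambda x: x * x, 0.0, 3.0, 9.0, 200_000, seed=1))
```

If y_max is set too low, part of the curve pokes out of the rectangle and the estimate comes out too small, so it pays to be generous with the bound.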

These are only a couple of the many applications that you can use these for. Good luck and happy approximating.

MZ EXE files

Executable files in MS-DOS come in a few different formats. The original 16-bit version of this file format is referred to as the DOS MZ Executable.

In today’s post, we’re going to dissect the internals of this format.

MZ

This particular format gets its name “MZ” from the first two bytes of the file, 0x4d and 0x5a. Translated to ASCII text, these two bytes form the characters “MZ”. This is the opening signature (or magic number) for a file of this format.

The header

The first chunk of an EXE file is the header information. It stores relocation information important to the execution of the file. A few important notes when reading the header:

  • All values spanning more than one byte are stored LSB first
  • A block is 512 bytes in size
  • A paragraph is 16 bytes in size

Offset     Description
0x00-0x01  The values 0x4d and 0x5a translating to the ASCII string “MZ”. This is the magic number for the file
0x02-0x03  The number of bytes used in the last block of the EXE. A zero value indicates that the whole block is used
0x04-0x05  The number of blocks that form part of the EXE
0x06-0x07  The number of relocation entries. These are stored after the header
0x08-0x09  The number of paragraphs in the header
0x0A-0x0B  The number of paragraphs required for uninitialized data
0x0C-0x0D  The number of paragraphs of additional memory to constrain this EXE to
0x0E-0x0F  Relative value for the SS register
0x10-0x11  Initial SP register value
0x12-0x13  Word checksum
0x14-0x15  Initial IP register value
0x16-0x17  Relative value for the CS register
0x18-0x19  Offset of the first relocation item
0x1A-0x1B  Overlay number

An example

Take the following “Hello, world” program written in x86 assembly language:

section .text

start:

  mov   ax, seg hello
  mov   ds, ax

  mov   dx, hello
  mov   ah, 09h
  int   21h

  mov   ah, 4ch
  xor   al, al
  int   21h

section .data

  hello db 'Hello, world!', 13, 10, '$'

I assembled this file with NASM:

$ nasm -f obj hello2.asm -o hello2.obj

I then transferred the resulting obj file from my Linux machine over to a DOS machine and ran TLINK, which was part of the Turbo Assembler product.

> tlink hello2.obj

Once we’ve assembled and linked this file to produce a 16-bit DOS executable, we can pull it apart again with objdump.

$ objdump -s -D -b binary -mi8086 HELLO2.EXE

The output of this dump is quite detailed. I’ve removed a fair bit of it for brevity:

HELLO2.EXE:     file format binary

Contents of section .data:
 0000 4d5a2200 02000100 20000000 ffff0000  MZ"..... .......
 0010 00000000 00000000 3e000000 0100fb50  ........>......P
 0020 6a720000 00000000 00000000 00000000  jr..............
 0030 00000000 00000000 00000000 00000100  ................
 0040 00000000 00000000 00000000 00000000  ................   
  . . .
  . . .

 0200 b801008e d8ba0200 b409cd21 b44c30c0  ...........!.L0.
 0210 cd214865 6c6c6f2c 20776f72 6c64210d  .!Hello, world!.
 0220 0a24                                 .$              

Disassembly of section .data:

00000000 <.data>:
   0: 4d                    dec    %bp
   1: 5a                    pop    %dx
   2: 22 00                 and    (%bx,%si),%al
   4: 02 00                 add    (%bx,%si),%al
   6: 01 00                 add    %ax,(%bx,%si)
   8: 20 00                 and    %al,(%bx,%si)
   a: 00 00                 add    %al,(%bx,%si)
   c: ff                    (bad)  
   d: ff 00                 incw   (%bx,%si)
  ...
  17: 00 3e 00 00           add    %bh,0x0
  1b: 00 01                 add    %al,(%bx,%di)
  1d: 00 fb                 add    %bh,%bl
  1f: 50                    push   %ax
  20: 6a 72                 push   $0x72
  ...
  3e: 01 00                 add    %ax,(%bx,%si)
  ...
 200: b8 01 00              mov    $0x1,%ax
 203: 8e d8                 mov    %ax,%ds
 205: ba 02 00              mov    $0x2,%dx
 208: b4 09                 mov    $0x9,%ah
 20a: cd 21                 int    $0x21
 20c: b4 4c                 mov    $0x4c,%ah
 20e: 30 c0                 xor    %al,%al
 210: cd 21                 int    $0x21
 212: 48                    dec    %ax
 213: 65                    gs
 214: 6c                    insb   (%dx),%es:(%di)
 215: 6c                    insb   (%dx),%es:(%di)
 216: 6f                    outsw  %ds:(%si),(%dx)
 217: 2c 20                 sub    $0x20,%al
 219: 77 6f                 ja     0x28a
 21b: 72 6c                 jb     0x289
 21d: 64 21 0d              and    %cx,%fs:(%di)
 220: 0a 24                 or     (%si),%ah

Focusing on the top representation, we get a direct view of the values in the header.

0000 4d5a2200 02000100 20000000 ffff0000  MZ"..... .......
0010 00000000 00000000 3e000000 0100fb50  ........>......P
0020 6a720000 00000000 00000000 00000000  jr..............
0030 00000000 00000000 00000000 00000100  ................
0040 00000000 00000000 00000000 00000000  ................

(0x00-0x01) 4d5a

The first two bytes are indeed “MZ”, or 0x4d 0x5a. So we’ve got the correct signature.

(0x02-0x03) 2200

This is the number of bytes used in the last block of the EXE. Remember, multi-byte values are stored LSB first, so this is 0x22 bytes. If you take a look at the listing above, you’ll see that the code for the executable starts at address 0x200, and that the final row of the dump starts at 0x220 and holds 2 more bytes. That puts the last byte at 0x221, so the range 0x200 to 0x221 is 0x22 bytes long.

These 0x22 bytes are all that is used of the last block, which is the one holding our code and data.

(0x04-0x05) 0200

This is the number of blocks (remember: 512-byte chunks) that make up our EXE. We have 2. Our header uses the first block; our code and data are in the second.
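
Taken together, the fields at 0x02 and 0x04 describe how many bytes of the file belong to the EXE. Here’s a small sketch of that arithmetic, using the block size and the zero-means-full-block rule from the header notes. The exe_image_size name is hypothetical.

```python
BLOCK_SIZE = 512  # bytes per block, as noted in the header section


def exe_image_size(blocks, bytes_in_last_block):
    """Return the size in bytes described by the two header fields."""
    if bytes_in_last_block == 0:
        # zero means the last block is completely used
        return blocks * BLOCK_SIZE
    # every block but the last is full
    return (blocks - 1) * BLOCK_SIZE + bytes_in_last_block


print(hex(exe_image_size(2, 0x22)))  # 0x222
```

For HELLO2.EXE this gives 0x222 bytes, which agrees with the dump: the last byte shown sits at offset 0x221.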

(0x06-0x07) 0100

We have 1 relocation item. A relocation item is just a 16-bit value for the offset followed by a 16-bit value for the segment.

(0x08-0x09) 2000

There are 0x20 paragraphs in the header.

0x20 = 32 (decimal)
paragraph size = 16 bytes

32 * 16 = 512 bytes

This checks out: there are 512 bytes in the header. The file starts at offset 0x00 and code doesn’t appear until 0x200, which is 512 in decimal.

(0x0A-0x0B) 0000

Our program didn’t define any uninitialized data, only a pre-initialized string: “Hello, world”.

(0x0C-0x0D) ffff

This is the default mode of operation for memory constraints. It says, use everything (i.e. don’t place any constraint).

(0x0E-0x0F) 0000

No translation to the stack segment (SS) will go on here. This value gets added to the segment at which the program was loaded, and that’s how SS is initialized. The program that we’ve written didn’t define a stack, so no translation is required.

(0x10-0x11) 0000

This is SP’s initial value, which is 0x0000 here.

(0x12-0x13) 0000

This is the word checksum. It’s seldom used.

(0x14-0x15) 0000

The instruction pointer will start at 0x0000.

(0x16-0x17) 0000

This value is added to the segment at which the program was loaded to give the initial CS. Zero means no adjustment.

(0x18-0x19) 3e00

This is the address of the first relocation item in the file. If we take a look back at the dump now, we can see the value sitting at that address:

0030 ________ ________ ________ ____0100  ................
0040 0000____ ________ ________ ________  ................ 

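Each relocation item is a 16-bit offset followed by a 16-bit segment, both stored LSB first, so the bytes 01 00 00 00 decode to offset 0x0001 and segment 0x0000 (0000:0001 in segment:offset notation). At load time, DOS adds the segment at which the program was loaded to the word at that location. Here that word is the operand of mov ax, seg hello, which sits one byte into the code at file offset 0x201.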
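
To double check the byte order, here’s a minimal sketch that decodes the four bytes sitting at 0x3e with the LSB-first rule and works out which file offset the entry points at. The variable names are my own, and the header size comes from the 0x20 paragraphs worked out earlier.

```python
import struct

# The four bytes found at file offset 0x3e in the dump above.
entry = bytes.fromhex("01000000")

# Offset first, then segment; "<HH" reads two little-endian 16-bit words.
offset, segment = struct.unpack("<HH", entry)

# The code starts straight after the header: 0x20 paragraphs of 16 bytes.
header_size = 0x20 * 16
file_position = header_size + segment * 16 + offset

print(f"{segment:04x}:{offset:04x} -> file offset 0x{file_position:x}")
```

This prints 0000:0001 and file offset 0x201, which is the 01 00 operand of the b8 01 00 instruction at the start of the code.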

(0x1A-0x1B) 0000

Overlay number. Zero indicates that this is the main program.
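
The field-by-field walk above can also be done in code. Here’s a minimal sketch using Python’s struct module on the first 0x1C bytes copied from the dump. The field names and the parse_mz_header function are hypothetical labels of my own, not names from any official header definition.

```python
import struct

# First 0x1C bytes of HELLO2.EXE, copied from the hex dump above.
HEADER = bytes.fromhex(
    "4d5a2200 02000100 20000000 ffff0000"
    "00000000 00000000 3e000000"
)

# One name per 16-bit word, in the order given by the offset table.
FIELDS = (
    "signature", "bytes_in_last_block", "blocks", "relocations",
    "header_paragraphs", "min_extra_paragraphs", "max_extra_paragraphs",
    "ss", "sp", "checksum", "ip", "cs", "relocation_offset", "overlay",
)


def parse_mz_header(data):
    """Decode the 14 little-endian 16-bit words of an MZ header."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    return dict(zip(FIELDS, struct.unpack("<14H", data[:28])))


header = parse_mz_header(HEADER)
for name in FIELDS:
    print(f"{name:22} 0x{header[name]:04x}")
```

Because the words are read LSB first, the signature comes back as 0x5a4d even though the bytes in the file read 4d 5a.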

The rest

Everything from here looks pretty familiar. We can see our assembly code start off and our string defined at the end.