Cogs and Levers - A blog full of technical stuff

16 bit COM files

COM files are a plain binary executable format from the MS-DOS era (and before!) that provides a very simple execution model.

The program is given a single 64KB segment into which its code, data and stack must all fit. This memory model is sometimes referred to as the “tiny” model.

In today’s post, we’re going to write a really simple program, then assemble it, disassemble it and dissect it. Here’s our program, which very helpfully prints “Hello, world!” to the console and then exits.

ORG 100h

section .text

start:
	mov		dx, msg
	mov		ah, 09h
	int		21h

	ret

section .data

	msg DB 'Hello, world!', 13, 10, '$'

Nothing of great interest here. The only thing worth a mention is the ORG directive. This tells the assembler that our code will be loaded at offset 100h within its segment, which is where DOS loads a COM file (the first 100h bytes are taken up by the Program Segment Prefix), so any addresses it calculates are offset accordingly. There’s some more information regarding 16-bit programs with nasm here.

nasm’s default output format is plain binary, so assembling is very simple:

$ nasm hello.asm -o hello.com

Running our program in DOSBox, we’re given our greeting as promised. Taking a look at the binary on disk, it’s seriously small: 24 bytes small. We won’t have much to read when we disassemble it!

Because this is a plain binary file, we need to give objdump a little help in how to present the information.

$ objdump -D -b binary -mi386 -Maddr16,data16 hello.com 

The full output dump is as follows:

hello.com:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:	ba 08 01             	mov    $0x108,%dx
   3:	b4 09                	mov    $0x9,%ah
   5:	cd 21                	int    $0x21
   7:	c3                   	ret    
   8:	48                   	dec    %ax
   9:	65                   	gs
   a:	6c                   	insb   (%dx),%es:(%di)
   b:	6c                   	insb   (%dx),%es:(%di)
   c:	6f                   	outsw  %ds:(%si),(%dx)
   d:	2c 20                	sub    $0x20,%al
   f:	77 6f                	ja     0x80
  11:	72 6c                	jb     0x7f
  13:	64 21 0d             	and    %cx,%fs:(%di)
  16:	0a 24                	or     (%si),%ah

The instructions located at offsets 0 through 7 correspond directly to the assembly source code that we’ve written. Beyond this point, the file stores the string that we’re going to print, which is why the disassembly looks a little chaotic.

Removing the gibberish assembly language, the bytes directly correspond to our string:

	"H"
   8:	48                   	
   	"e"
   9:	65                   	
   	"l"
   a:	6c                   	
   	"l"
   b:	6c                   	
   	"o"
   c:	6f                   	
   	", "
   d:	2c 20                	
   	"wo"
   f:	77 6f                	
   	"rl"
  11:	72 6c                	
  	"d!", 13
  13:	64 21 0d             	
  	10, "$"
  16:	0a 24                	

So, our string starts at offset 8, but the first line of our assembly code (the line that loads dx with the address of our string msg) has disassembled to this:

   0:	ba 08 01             	mov    $0x108,%dx

The address $0x108 overshoots our string’s offset in the file by 0x100! This is where the ORG directive comes in: because we specified it, all of our addresses have been adjusted to suit. When DOS loads our COM file, it will be loaded in at offset 0x100, and our addresses will line up perfectly.
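
As an aside, nasm also ships with its own disassembler, ndisasm, which can be told about this origin. Something like the following should produce a listing whose addresses match what DOS sees at run time:

$ ndisasm -o 0x100 hello.com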

sysstat utilities

sysstat is a collection of utilities for Linux that provides performance and activity monitoring. In today’s post, I’ll go through a brief explanation of these utilities.

iostat

iostat(1) reports CPU statistics and input/output statistics for devices, partitions and network filesystems.
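
A report like the one below comes from simply running iostat with no arguments, which prints averages accumulated since boot:

$ iostat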

Linux 3.13.0-46-generic (thor) 	21/03/15 	_x86_64_	(8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.53    0.01    0.46    0.09    0.00   97.92

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              15.87       216.84       169.79     755449     591528

iostat provides a top-level CPU report in its first section, with a breakdown of the percentage of time the CPU spends:

  • In user space (%user)
  • In user space with nice priority (%nice)
  • In kernel space (%system)
  • Waiting on I/O (%iowait)
  • Forced to wait from the hypervisor (%steal)
  • Doing nothing (%idle)

Secondly, a breakdown of disk activity is given per device. For each disk device, this report shows:

  • Transfer per second (tps)
  • Amount of data read per second (kB_read/s)
  • Amount of data written per second (kB_wrtn/s)
  • Total read (kB_read)
  • Total write (kB_wrtn)
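
If you want to watch these figures change over time, iostat also accepts an interval and a count. Something like the following should print a fresh report every five seconds, three times in total:

$ iostat 5 3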

mpstat

mpstat(1) reports individual or combined processor-related statistics.

Linux 3.13.0-46-generic (thor) 	21/03/15 	_x86_64_	(8 CPU)

15:23:58     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
15:23:58     all    1.36    0.01    0.41    0.08    0.00    0.00    0.00    0.00    0.00   98.14

mpstat goes a little deeper into how CPU time is divided up among its responsibilities. By specifying -P ALL on the command line, you can get a report per CPU.
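
For example, the per-CPU breakdown below was produced with:

$ mpstat -P ALL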

Linux 3.13.0-46-generic (thor) 	21/03/15 	_x86_64_	(8 CPU)

16:05:10     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
16:05:10     all    1.58    0.00    0.42    0.06    0.00    0.00    0.00    0.00    0.00   97.94
16:05:10       0    2.05    0.00    0.74    0.04    0.00    0.00    0.00    0.00    0.00   97.17
16:05:10       1    2.62    0.02    0.58    0.03    0.00    0.00    0.00    0.00    0.00   96.76
16:05:10       2    2.84    0.00    0.59    0.03    0.00    0.00    0.00    0.00    0.00   96.54
16:05:10       3    2.31    0.00    0.67    0.02    0.00    0.00    0.00    0.00    0.00   97.00
16:05:10       4    0.83    0.00    0.18    0.10    0.00    0.00    0.00    0.00    0.00   98.89
16:05:10       5    0.65    0.02    0.20    0.10    0.00    0.00    0.00    0.00    0.00   99.03
16:05:10       6    0.71    0.00    0.20    0.08    0.00    0.00    0.00    0.00    0.00   99.01
16:05:10       7    0.64    0.00    0.17    0.06    0.00    0.00    0.00    0.00    0.00   99.13

pidstat

pidstat(1) reports statistics for Linux tasks (processes): I/O, CPU, memory, etc.
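
A listing like the one below comes from running pidstat with no arguments, which reports CPU utilisation per task:

$ pidstat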

Linux 3.13.0-46-generic (thor) 	21/03/15 	_x86_64_	(8 CPU)

16:06:19      UID       PID    %usr %system  %guest    %CPU   CPU  Command
16:06:19        0         1    0.00    0.01    0.00    0.02     0  init
16:06:19        0         7    0.00    0.02    0.00    0.02     3  rcu_sched

. . .
. . .

pidstat will give you a utilisation breakdown for each process running on your system.

sar

sar(1) collects, reports and saves system activity information (CPU, memory, disks, interrupts, network interfaces, TTY, kernel tables, etc.).

sar requires that data collection is enabled before it can be used. The settings defined in /etc/default/sysstat control this collection process. A handful of related utilities collect and work with this data:

sadc(8) is the system activity data collector, used as a backend for sar.

sa1(8) collects and stores binary data in the system activity daily data file. It is a front end to sadc designed to be run from cron.

sa2(8) writes a summarized daily activity report. It is a front end to sar designed to be run from cron.

sadf(1) displays data collected by sar in multiple formats (CSV, XML, etc.). This is useful for loading performance data into a database, or importing it into a spreadsheet to make graphs.
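
As a quick sketch (on Debian and Ubuntu based systems; other distributions may differ), collection is switched on by setting ENABLED="true" in /etc/default/sysstat, after which the collected data can be reported on or exported:

# today's CPU utilisation figures from the collected data
$ sar -u

# the same data in a flat, database/spreadsheet friendly format
$ sadf -d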

nfs and cifs

NFS and CIFS also have monitoring utilities.

nfsiostat-sysstat(1) reports input/output statistics for network filesystems (NFS).

cifsiostat(1) reports CIFS statistics.

These certainly come in handy when you’ve got remote shares mounted on your machine.
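
Like the other sysstat tools, both accept an interval and a count. For example, something like this should report CIFS activity every five seconds:

$ cifsiostat 5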

Basic docker usage

Docker is a platform that allows you to bundle up your applications and their dependencies into a distributable container, easing the overhead of environment setup and deployment.

The Dockerfile reference in the docker documentation set goes through the important pieces of building an image.

In today’s post, I’m just going to run through some of the commands that I’ve found most useful.

Building a container

# build an image and assign it a tag
sudo docker build -t username/imagename:tag .

Controlling containers

# run a single command
sudo docker run ubuntu /bin/echo 'Hello world'

# run a container in a daemonized state
sudo docker run -d ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"

# run a container interactively
sudo docker run -t -i ubuntu /bin/bash

# connect to a running container
sudo docker attach container_id

# stop a running container
sudo docker stop container_name

# remove a container
sudo docker rm container_name

# remove an image
sudo docker rmi image_name

When running a container, -p will allow you to control port mappings and -v will allow you to control volume locations.
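
As a sketch (the image name, port numbers and paths here are placeholders), a daemonised container that exposes a port and mounts a host directory might be started like this:

# map host port 8080 to container port 80 and mount a host directory as a volume
sudo docker run -d -p 8080:80 -v /host/data:/container/data username/imagename:tag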

Getting information from docker

# list images
sudo docker images

# list running containers
sudo docker ps

# list all containers
sudo docker ps -a

# inspecting the settings of a container
sudo docker inspect container_name

# check existing port mappings
sudo docker port container_name 

# retrieve stdout from a running container
sudo docker logs container_name
sudo docker logs -f container_name

XML literals in scala

A really handy feature included in the Scala programming language is xml literals. The xml literals feature allows you to declare blocks of xml directly in your Scala code. As you’ll see below, you’re not limited to static xml blocks, and you also have the full set of higher-order functions available to navigate and process your xml data.

Definition and creation

You can create an xml literal very simply inside of your Scala code:

val people = 
	<people>
		<person firstName="John" 
				lastName="Smith" 
				age="25" 
				gender="M" />
		<person firstName="Mary" 
				lastName="Brown" 
				age="23" 
				gender="F" />
		<person firstName="Jan" 
				lastName="Green" 
				age="31" 
				gender="F" />
		<person firstName="Peter" 
				lastName="Jones" 
				age="23" 
				gender="M" />
	</people>

Scala then creates a variable of type Elem for us.

Xml literals can also be constructed or generated from variable sources:

val values = <values>{(1 to 10).map(x => <value number={x.toString} />)}</values>

Take note that the value of x needs to be converted to a string in order to be used in an xml literal.

Another form of generation can be accomplished with a for comprehension:

val names = 
	<names>
	{for (name <- List("Sam", "Peter", "Bill")) yield <name>{name}</name>}
	</names>

Working with literals

Once you have defined your xml literal, you can start to interact with the data just like any other Seq typed structure.

val peopleCount = (people \ "person").length
val menNodes = (people \ "person").filter(x => (x \ "@gender").text == "M")
val mensNames = menNodes.map(_ \ "@firstName")

println(s"Number of people: $peopleCount")
println(s"Mens names: $mensNames")

The usage of map and filter certainly provides a very familiar environment for querying your xml data.

Transform with RewriteRule

The scala.xml.transform package includes a class called RewriteRule. Using this class, you can transform (or re-write) parts of your xml document.

Taking the sample person data from the top of this post, we can write a transform to remove all of the men from the set:

import scala.xml.{Elem, Node, NodeSeq}
import scala.xml.transform.{RewriteRule, RuleTransformer}

val removeMen = new RewriteRule {
	override def transform(n: Node): NodeSeq = n match {
		case e: Elem if (e \ "@gender").text == "M" => NodeSeq.Empty
		case n => n
	}
}

We test if the gender attribute contains an “M”, and if so we empty out the node. To apply this transform to the source data, we use the RuleTransformer class.

val noMen = new RuleTransformer(removeMen).transform(people)

Another rule we could write would be to remove any person who is over the age of 30:

val removeOver30s = new RewriteRule {
	override def transform(n: Node): NodeSeq = n match {
		case e: Elem if e.label == "person" && (e \ "@age").text.toInt > 30 => NodeSeq.Empty
		case n => n
	}
}

Pretty much the same. The only extra complexity is making sure that we’re looking at a person element, and converting its age attribute to an integer so that we can perform a numeric comparison.

The RuleTransformer class also accommodates using these two transforms in conjunction with each other.

val noMenAndOver30s = new RuleTransformer(removeMen, removeOver30s).transform(people)

Working with lambdas in python

Python provides a simple way to define anonymous functions through the use of the lambda keyword. Today’s post will be a brief introduction to using lambdas in python along with some of the supported higher-order functions.

Declaration

For today’s useless example, I’m going to create a “greeter” function. This function will take in a name and give you back a greeting. This would be defined using a python function like so:

def greet(name):
	return "Hello, %s." % (name,)

Invoking this function gives you endless greetings:

>>> greet("Joe")
'Hello, Joe.'
>>> greet("Paul")
'Hello, Paul.'
>>> greet("Sally")
'Hello, Sally.'

We can transform this function into a lambda with a simple re-structure:

greeter = lambda name: "Hello, %s." % (name,)

Just to show you a more complex definition (i.e. one that uses more than one parameter), I’ve prepared a lambda that will execute the quadratic formula.

from math import sqrt

quadratic = lambda a, b, c: ((-b + sqrt((b * b) - (4 * a * c))) / (2 * a), (-b - sqrt((b * b) - (4 * a * c))) / (2 * a))

This is invoked just like any other function:

>>> quadratic(100, 45, 2)
(-0.05, -0.4)
>>> quadratic(100, 41, 2)
(-0.0565917792034417, -0.3534082207965583)
>>> quadratic(100, 41, 4)
(-0.16, -0.25)

Higher order functions

Now that we’re able to define some anonymous functions, they really come into their own when used in conjunction with higher-order functions. The primary functions here are filter, map and reduce. Note that these examples assume Python 2; under Python 3, map and filter return iterators rather than lists, and reduce has moved into the functools module.

We can filter a list of numbers to only include the even numbers.

filter(lambda x: x%2 == 0, range(1, 10))

Of course it’s the lambda x: x%2 == 0 performing the even-number test for us.

We can reduce a list of numbers to produce the sum of all of those values:

reduce(lambda x, y: x + y, range(1, 10))

Finally, we can map a function over a list of numbers to turn them into their reciprocals:

map(lambda x: 1.0/x, range(1, 10))