Cogs and Levers A blog full of technical stuff

Add info at build time to go binaries

Introduction

Sometimes it can be useful to capture information about your environment at build time, and have this information injected into the binary that you’re building. Some examples centre around versioning, where it might make sense to capture git commit hashes or build serials.

An example program

package main

import "fmt"

var gitCommit string
var buildSerial string

func main() {
  fmt.Printf("git hash: %s, build serial: %s", gitCommit, buildSerial)
}

Two package-level variables, gitCommit and buildSerial, are going to hold some version information for us. Running this program as-is yields some rather uninteresting results.

go run main.go
git hash: , build serial: 

-X switch

While building a program, you can use the -X linker switch to set the value of a package-level string variable from the build process.

We can obtain the latest commit hash using git with the following:

git rev-list -1 HEAD

We can even synthesize a build number involving the date, perhaps?

date +%s

Using the -ldflags switch, we can now specify these at the console.

go build -ldflags "-X main.gitCommit=`git rev-list -1 HEAD` -X main.buildSerial=`date +%s`" main.go

Closing up

Now that we have a binary built, its build information has been applied; the linker has set these variables to the values we supplied.

./main
git hash: f6f62d9a759a03afffb913a1d24fb64a1bc5507d, build serial: 1564573076
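One optional refinement, sketched here as an assumption rather than something the program above does: give the variables default values, so that a plain go run still prints something meaningful. The -X switch applies to string variables that are either uninitialised or initialised to a constant string, so defaults like these can still be overridden at build time. The versionString helper and the default values are hypothetical.

```go
package main

import "fmt"

// Defaults used when no -X values are supplied at link time.
// "unknown" and "dev" are placeholder values chosen for this sketch.
var (
	gitCommit   = "unknown"
	buildSerial = "dev"
)

// versionString is a hypothetical helper that formats the build information.
func versionString() string {
	return fmt.Sprintf("git hash: %s, build serial: %s", gitCommit, buildSerial)
}

func main() {
	fmt.Println(versionString())
}
```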

Remember, these switches can be buried behind a Makefile also, so you don’t need to be typing these things over and over.

GOOS=linux
GOARCH=386

.PHONY: build

GIT_COMMIT := $(shell git rev-list -1 HEAD)
BUILD_NO := $(shell date +%s)

build:
	GOOS=$(GOOS) GOARCH=$(GOARCH) go build -ldflags "-X main.gitCommit=$(GIT_COMMIT) -X main.buildSerial=$(BUILD_NO)" .

hexdump

Introduction

Sometimes you may need to investigate the contents of binary files. Simply using cat to view these details in your terminal can have all sorts of random effects due to control characters, etc. The utility hexdump allows you to look at the contents of these files in a sane way.

From the hexdump manpage:

display file contents in hexadecimal, decimal, octal, or ascii

In today’s article, we’ll walk through some example usages of this utility.

Examples

For all of these examples, we’ll be using a 256-byte file of random binary data. I generated this data on my system with the following command:

head -c 256 /dev/urandom > example

The initial view of this data now looks like this:

➜  ~ hexdump example 
0000000 5103 7055 bd22 a3bf 2f36 fc05 3a80 5d5a
0000010 0e4c cbdd 06a7 9dc3 b104 2dae 0c3e e9e6
0000020 d01a dc5a 2eaf c01d 5336 d738 231c 0358
0000030 9133 eafa cd24 1206 0f71 988e 2349 648c
0000040 1eb8 7cf4 e7b8 4e61 a5e9 aa16 063f 9370
0000050 7bab e97d c197 6662 e99d 0b97 381a 9712
0000060 7e88 ed64 2b22 74b9 3f5b c68f ce00 5c6e
0000070 7d4c 5f5f ee66 6198 b812 f54d 740a 0343
0000080 d1ce 7092 2623 91fa f7a7 cc0a 961b 10dd
0000090 ea41 b512 806f 16ee 74bf 32dd fc13 6bc9
00000a0 7126 99b5 1a7c 7282 a464 93a4 aae1 6070
00000b0 8e28 e93a 5342 c6fd 027a 6837 1131 668e
00000c0 574b 5025 4e8c 0f6a d2bd 6b7a c8ec daa0
00000d0 9ebc 3c2d d288 0514 2493 1aca ffd0 684c
00000e0 9bdc d2c8 b1f5 f862 4c5c b6c4 b722 9397
00000f0 d4f6 2bf0 74a5 a00a 8007 5fc5 cf99 0701
0000100
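The default output groups the file into two-byte words and prints each one as a 16-bit number in the host's byte order. On a little-endian machine (an assumption about the system used here, but consistent with the output), the first two bytes of the file, 03 51, are shown as 5103. A small Go sketch of that grouping, using a hypothetical helper named words:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// words groups b into little-endian 16-bit values, the way the default
// hexdump output displays them on a little-endian host.
// A trailing odd byte is ignored in this sketch.
func words(b []byte) []uint16 {
	out := make([]uint16, 0, len(b)/2)
	for i := 0; i+1 < len(b); i += 2 {
		out = append(out, binary.LittleEndian.Uint16(b[i:i+2]))
	}
	return out
}

func main() {
	// First four bytes of the example file, in file order.
	for _, w := range words([]byte{0x03, 0x51, 0x55, 0x70}) {
		fmt.Printf("%04x ", w)
	}
	fmt.Println()
}
```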

Formatting

Now things get interesting. The -e switch of the hexdump command allows us to specify a format string that controls the output to the terminal.

➜  ~ hexdump -v -e '"%07_ax  |"' -e '16/1 "%_p" "|\n"' example

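Using _a[dox] we can control how the offset down the left-hand side looks: %07_ax prints it in hexadecimal, zero-padded to a width of 7. 16/1 "%_p" applies the %_p conversion to 16 units of one byte each; _p prints each byte as a character in the default character set. The -v switch stops hexdump from collapsing runs of identical output lines into a single *. The output looks like this: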

0000000  |.QUp"...6/...:Z]|
0000010  |L..........->...|
0000020  |..Z.....6S8..#X.|
0000030  |3...$...q...I#.d|
0000040  |...|..aN....?.p.|
0000050  |.{}...bf.....8..|
0000060  |.~d."+.t[?....n\|
0000070  |L}__f..a..M..tC.|
0000080  |...p#&..........|
0000090  |A...o....t.2...k|
00000a0  |&q..|..rd.....p`|
00000b0  |(.:.BS..z.7h1..f|
00000c0  |KW%P.Nj...zk....|
00000d0  |..-<.....$....Lh|
00000e0  |......b.\L.."...|
00000f0  |...+.t....._....|

Anytime this format encounters a non-printable character, a . is put in its place.
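To make the substitution concrete, here is a rough Go sketch of what %_p does to each byte. It assumes plain ASCII (hexdump itself consults the current character set), and the printable function is a hypothetical name.

```go
package main

import "fmt"

// printable mimics the effect of hexdump's %_p conversion for ASCII data:
// bytes in the printable ASCII range are kept, everything else becomes '.'.
func printable(b []byte) string {
	out := make([]byte, len(b))
	for i, c := range b {
		if c >= 0x20 && c <= 0x7e {
			out[i] = c
		} else {
			out[i] = '.'
		}
	}
	return string(out)
}

func main() {
	// First eight bytes of the example file.
	row := []byte{0x03, 0x51, 0x55, 0x70, 0x22, 0xbd, 0xbf, 0xa3}
	fmt.Printf("|%s|\n", printable(row))
}
```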

Builtin

-v -C gives a side-by-side of hex values along with the printable characters:

➜  ~ hexdump -v -C example

This is probably the most familiar:

00000000  03 51 55 70 22 bd bf a3  36 2f 05 fc 80 3a 5a 5d  |.QUp"...6/...:Z]|
00000010  4c 0e dd cb a7 06 c3 9d  04 b1 ae 2d 3e 0c e6 e9  |L..........->...|
00000020  1a d0 5a dc af 2e 1d c0  36 53 38 d7 1c 23 58 03  |..Z.....6S8..#X.|
00000030  33 91 fa ea 24 cd 06 12  71 0f 8e 98 49 23 8c 64  |3...$...q...I#.d|
00000040  b8 1e f4 7c b8 e7 61 4e  e9 a5 16 aa 3f 06 70 93  |...|..aN....?.p.|
00000050  ab 7b 7d e9 97 c1 62 66  9d e9 97 0b 1a 38 12 97  |.{}...bf.....8..|
00000060  88 7e 64 ed 22 2b b9 74  5b 3f 8f c6 00 ce 6e 5c  |.~d."+.t[?....n\|
00000070  4c 7d 5f 5f 66 ee 98 61  12 b8 4d f5 0a 74 43 03  |L}__f..a..M..tC.|
00000080  ce d1 92 70 23 26 fa 91  a7 f7 0a cc 1b 96 dd 10  |...p#&..........|
00000090  41 ea 12 b5 6f 80 ee 16  bf 74 dd 32 13 fc c9 6b  |A...o....t.2...k|
000000a0  26 71 b5 99 7c 1a 82 72  64 a4 a4 93 e1 aa 70 60  |&q..|..rd.....p`|
000000b0  28 8e 3a e9 42 53 fd c6  7a 02 37 68 31 11 8e 66  |(.:.BS..z.7h1..f|
000000c0  4b 57 25 50 8c 4e 6a 0f  bd d2 7a 6b ec c8 a0 da  |KW%P.Nj...zk....|
000000d0  bc 9e 2d 3c 88 d2 14 05  93 24 ca 1a d0 ff 4c 68  |..-<.....$....Lh|
000000e0  dc 9b c8 d2 f5 b1 62 f8  5c 4c c4 b6 22 b7 97 93  |......b.\L.."...|
000000f0  f6 d4 f0 2b a5 74 0a a0  07 80 c5 5f 99 cf 01 07  |...+.t....._....|
00000100
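If you need the same view from inside a program, Go's standard library has hex.Dump, which is documented as matching the output of hexdump -C. The sketch below feeds it the first 16 bytes of the example file as a literal, so it runs without the file; the firstRow helper is a hypothetical name.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// firstRow returns the canonical dump of the first 16 bytes of the
// example file used in this article.
func firstRow() string {
	data := []byte{
		0x03, 0x51, 0x55, 0x70, 0x22, 0xbd, 0xbf, 0xa3,
		0x36, 0x2f, 0x05, 0xfc, 0x80, 0x3a, 0x5a, 0x5d,
	}
	return hex.Dump(data)
}

func main() {
	fmt.Print(firstRow())
}
```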

Conway's Game of Life

Conway’s Game of Life is a cellular automaton; not really a game that is played, but more a sequence that is observed. In today’s article, I’ll walk through the rules and a simple C implementation.

How to play

There are some simple rules that the game must abide by.

The universe that it is played within is an orthogonal cartesian grid of squares, each holding a cell that is either dead or alive. The lifespan of the cells is determined by the following rules:

  • Any cell that’s alive with fewer than two live neighbours dies (underpopulation)
  • Any cell that’s alive with two or three live neighbours lives on to the next generation
  • Any cell that’s alive with more than three live neighbours dies (overpopulation)
  • Any dead cell that has exactly three live neighbours becomes alive (reproduction)

And, that’s it.
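The four rules collapse into one small function of a cell's current state and its live-neighbour count. This is a Go sketch for illustration only (the implementation below is in C), and nextState is a hypothetical name.

```go
package main

import "fmt"

// nextState applies the four rules to one cell, given whether it is
// currently alive and how many of its eight neighbours are alive.
func nextState(alive bool, neighbours int) bool {
	if alive {
		// Survives with two or three neighbours; otherwise dies of
		// underpopulation (fewer than 2) or overpopulation (more than 3).
		return neighbours == 2 || neighbours == 3
	}
	// A dead cell comes alive with exactly three neighbours.
	return neighbours == 3
}

func main() {
	for n := 0; n <= 8; n++ {
		fmt.Printf("n=%d alive->%v dead->%v\n", n, nextState(true, n), nextState(false, n))
	}
}
```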

Implementation

The pitch where the game is played would be a pretty simple buffer of 1’s and 0’s. 1 would define “alive”, and 0 would define “dead”:

#define UV_WIDTH 64
#define UV_HEIGHT 32

unsigned char *universe;


universe = (unsigned char *) malloc(UV_HEIGHT * UV_WIDTH * sizeof(unsigned char));

So, a buffer of memory for a defined width and height will do the job.

The remainder of the process could be split into the following:

  • Seed the universe
  • Permute the universe
  • Render the universe

Seed

Chicken or the egg isn’t asked here. We just use srand and rand to play god for us:

void universe_seed(unsigned char *u, int width, int height) {
  int i;

  for (i = 0; i < (width * height); i ++) {
    u[i] = rand() % 9 == 0;
  }
}

For every cell, we’ll get a random number from rand. If that number is divisible by 9, we’ll mark the cell as alive.

There are far cleverer ways to seed the universe, such that the rules of the game keep the generations running forever in very clever patterns.

Permute

Making the universe kick along between frames is simply a matter of applying the rules to a buffer of states. Every cell in this buffer must be evaluated against the same generation, so we can’t mutate the original buffer as we go.

void universe_permute(unsigned char *u, int width, int height) {
  int x, y, n, l;
  unsigned char *r = (unsigned char *)malloc(width * height);

  memcpy(r, u, width * height);

  for (y = 0; y < height; y ++) {
    for (x = 0; x < width; x ++) {
      l = u[x + (y * width)];
      n = count_live_neighbours(u, width, height, x, y);

      if (l && ((n < 2) || (n > 3))) {
        /* live cell: underpopulation or overpopulation */
        r[x + (y * width)] = 0;
      } else if (!l && (n == 3)) {
        /* dead cell: reproduction */
        r[x + (y * width)] = 1;
      }
    }
  }

  memcpy(u, r, width * height);
  free(r);
}

A copy of the game buffer is made first. This copy is what we write the next generation’s states to, leaving the current buffer intact while the neighbours are counted.

Following the rules of the game:

  if (l && ((n < 2) || (n > 3))) {
    r[x + (y * width)] = 0;
  } else if (!l && (n == 3)) {
    r[x + (y * width)] = 1;
  }

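Here, we see the overpopulation, underpopulation, and reproduction rules in action. The survival rule needs no code of its own: a live cell with two or three neighbours simply keeps the value it already has in the copy.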
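To see why the second buffer matters, here is a compact Go sketch of one generation step (an illustration, not the article's C code). It writes into a fresh slice rather than a copy, and checks itself against a "blinker", a row of three cells that flips between horizontal and vertical. The step function and the 5x5 grid are choices made for this example.

```go
package main

import "fmt"

// step computes the next generation into a fresh slice so that every cell
// is evaluated against the same, unmodified current generation.
// The grid wraps around at the edges.
func step(u []uint8, width, height int) []uint8 {
	next := make([]uint8, len(u))
	for y := 0; y < height; y++ {
		for x := 0; x < width; x++ {
			n := 0
			for dy := -1; dy <= 1; dy++ {
				for dx := -1; dx <= 1; dx++ {
					if dx == 0 && dy == 0 {
						continue // skip the cell itself
					}
					nx := (x + dx + width) % width
					ny := (y + dy + height) % height
					n += int(u[nx+ny*width])
				}
			}
			alive := u[x+y*width] == 1
			if (alive && (n == 2 || n == 3)) || (!alive && n == 3) {
				next[x+y*width] = 1
			}
		}
	}
	return next
}

func main() {
	const w, h = 5, 5
	u := make([]uint8, w*h)
	// A horizontal blinker: three live cells in the middle row.
	u[1+2*w], u[2+2*w], u[3+2*w] = 1, 1, 1

	u = step(u, w, h)
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			fmt.Printf("%c", ".*"[u[x+y*w]])
		}
		fmt.Println()
	}
}
```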

The number of neighbours is counted with a twist:

int count_live_neighbours(unsigned char *u, int width, int height, int x, int y) {
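  /* wrap the bounds; adding width/height first keeps the index
     non-negative, since (0 - 1) % width is -1 in C */
  int x1 = (x - 1 + width) % width;
  int x2 = (x + 1) % width;
  int y1 = (y - 1 + height) % height;
  int y2 = (y + 1) % height;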

  return u[x1 + (y1 * width)] +
         u[x1 + (y2 * width)] +
         u[x2 + (y1 * width)] +
         u[x2 + (y2 * width)] +
         u[x  + (y1 * width)] +
         u[x  + (y2 * width)] +
         u[x1 + (y  * width)] +
         u[x2 + (y  * width)];
}

The x and y values are wrapped around the width and height. This means that if you fall off the right-hand side of the universe, you’ll magically appear back on the left-hand side; in the same way, top to bottom.

A neighbour check must look at all 8 cells that surround the cell in question. If a cell is alive, its value will be 1; this gives us a really simple hack of adding all of these values together, which tells us the number of live neighbours of this cell.
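One detail to watch when wrapping with the remainder operator: in C (and in Go), % keeps the sign of the left-hand operand, so -1 % width is -1 rather than width - 1. A small Go sketch of a wrap helper that copes with negative indices; the name wrap is hypothetical.

```go
package main

import "fmt"

// wrap maps any index onto the range [0, n), so that -1 becomes n-1
// and n becomes 0. The extra "+ n" guards against a negative remainder.
func wrap(i, n int) int {
	return ((i % n) + n) % n
}

func main() {
	const width = 64
	fmt.Println(wrap(-1, width), wrap(0, width), wrap(63, width), wrap(64, width))
}
```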

Rendering

To the terminal.

Always, to the terminal.

You can render anywhere you want. For my example implementation, I’ve used the console, drawing with the curses functions erase and mvprintw.

void universe_render(unsigned char *u, int width, int height) {
  int x, y;

  erase();

  for (y = 0; y < height; y ++) {
    for (x = 0; x < width; x ++) {
      if (u[x + (y * width)]) {
        mvprintw(y + 1, x, "*");
      } else {
        mvprintw(y + 1, x, ".");
      }
    }
  }
}

Finishing up

Here’s a grab of the console.

Alive: 81    Dead: 1967
........................*.......................................
.......................*.*.............***......................
.......................*.*...........*.***......................
........................*...........*...........................
...................................**...........................
..........***.......................**.........**...............
.....................................*.........**...............
........*.....*.................................................
........*.....*.......................................*.........
........*.....*.......................................*.........
.***..................................................*.........
..........***...................................................
................**..............................................
...............*..*.............................................
................**..............................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
..........................*.....................................
.........................***....................................
........................***.*...................................
.......................*....**..................................
......................**....*...................................
.......................*.**.*...................................
*...........................*..................................*
.*.......................***.........................**........*
*.........................*.........................*..*........
.....................................................**.........

The full code for this article can be found here.

Create tables from queries with Redshift

As a convenience to the developer, Amazon Redshift offers CREATE TABLE AS (CTAS) for those times where you need to materialise a physical table from the result of a query.

Syntax

CREATE [ [LOCAL ] { TEMPORARY | TEMP } ]
TABLE table_name
[ ( column_name [, ... ] ) ]
[ BACKUP { YES | NO } ]
[ table_attributes ]
AS query

where table_attributes are:
[ DISTSTYLE { EVEN | ALL | KEY } ]
[ DISTKEY ( distkey_identifier ) ]
[ [ { COMPOUND | INTERLEAVED } ] SORTKEY ( column_name [, ...] ) ]

Reproduced from the documentation.

As you can see, this is basically a CREATE TABLE statement, with a SELECT query at the end of it.

The new table is loaded with data defined by the query in the command. The table columns have names and data types associated with the output columns of the query. The CREATE TABLE AS (CTAS) command creates a new table and evaluates the query to load the new table.

More Window Functions

This article extends the earlier window functions article, following up on each of the window functions.

MAX

The MAX function will retrieve the maximum value that it sees within the window.

MEDIAN

The MEDIAN function will calculate the median value for the range seen, within the window.

MIN

The MIN function will retrieve the minimum value that it sees within the window.

NTH_VALUE

Where the LAG and LEAD values are relative to the row in question, NTH_VALUE will return the value at the literal offset specified, counted from the first row of the window frame.

NTILE

NTILE divides the ordered rows of the window into the requested number of groups, as equal in size as possible, and returns the group number that each row falls into.
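As a rough Go sketch of the grouping, assuming the usual SQL behaviour where any leftover rows go to the earliest groups (worth checking against the Redshift documentation); the ntile function here is a hypothetical stand-in, not Redshift code.

```go
package main

import "fmt"

// ntile returns the 1-based group for each of n ordered rows split into
// k groups. When n is not divisible by k, the first n%k groups get one
// extra row (an assumption based on standard SQL behaviour).
func ntile(n, k int) []int {
	out := make([]int, n)
	base, extra := n/k, n%k
	row := 0
	for g := 1; g <= k && row < n; g++ {
		size := base
		if g <= extra {
			size++
		}
		for i := 0; i < size; i++ {
			out[row] = g
			row++
		}
	}
	return out
}

func main() {
	// Ten rows into three groups: sizes 4, 3 and 3.
	fmt.Println(ntile(10, 3))
}
```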

PERCENT_RANK

Calculates the percent rank of each row seen by the window. The formula is defined as:

(x - 1) / (the number of rows in the window or partition - 1)

where x is the rank of the current row.
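A worked example of the formula as a Go sketch. It assumes the input is already sorted, that tied values share the lowest rank, and that a single-row window yields 0; percentRanks is a hypothetical helper.

```go
package main

import "fmt"

// percentRanks takes values already sorted ascending and returns
// (rank - 1) / (n - 1) for each one, where tied values share the
// lowest rank. A single-row input yields 0 (an assumption).
func percentRanks(sorted []float64) []float64 {
	n := len(sorted)
	out := make([]float64, n)
	rank := 1
	for i := range sorted {
		if i > 0 && sorted[i] != sorted[i-1] {
			rank = i + 1
		}
		if n > 1 {
			out[i] = float64(rank-1) / float64(n-1)
		}
	}
	return out
}

func main() {
	// Ranks are 1, 2, 2, 4, giving 0, 1/3, 1/3 and 1.
	fmt.Println(percentRanks([]float64{10, 20, 20, 40}))
}
```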

PERCENTILE_CONT

PERCENTILE_CONT returns the value at a given percentile, using linear interpolation between the ordered values either side of it.

PERCENTILE_DISC

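Returns the value with the smallest cumulative distribution value that is greater than or equal to the given percentile. Unlike PERCENTILE_CONT, the result is always a value that appears in the set.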
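The difference between PERCENTILE_CONT and PERCENTILE_DISC is easiest to see side by side. This Go sketch follows the commonly documented definitions rather than Redshift's actual implementation, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// percentileCont interpolates linearly between the two ordered values that
// straddle the requested percentile p (0 <= p <= 1). sorted must be
// ascending and non-empty.
func percentileCont(sorted []float64, p float64) float64 {
	pos := p * float64(len(sorted)-1) // zero-based fractional position
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	return sorted[lo] + (pos-float64(lo))*(sorted[hi]-sorted[lo])
}

// percentileDisc returns the first value whose cumulative distribution
// (i+1)/n is greater than or equal to p.
func percentileDisc(sorted []float64, p float64) float64 {
	n := len(sorted)
	for i, v := range sorted {
		if float64(i+1)/float64(n) >= p {
			return v
		}
	}
	return sorted[n-1]
}

func main() {
	values := []float64{10, 20, 30, 40}
	fmt.Println(percentileCont(values, 0.5)) // interpolated between 20 and 30
	fmt.Println(percentileDisc(values, 0.5)) // an actual member of the set
}
```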

RATIO_TO_REPORT

Calculates a ratio where the row’s value is the dividend, and the total amount for the window is the divisor.

In this example, we’ll use RATIO_TO_REPORT to show the share of each day’s total that each sale makes up.

SELECT sale_date, salesperson, quantity::decimal * unit_cost,
  RATIO_TO_REPORT(quantity::decimal * unit_cost) OVER (PARTITION BY sale_date)
FROM public.sales
ORDER BY sale_date, sale_id;

RATIO_TO_REPORT(quantity::decimal * unit_cost) is what gives us the value that we’re working with in terms of ratio; the PARTITION BY sale_date then gives us the window; these ratios need to be calculated for the day.

sale_date   salesperson  ?column?  ratio_to_report
2018-05-02  Bob          26        1.0
2018-05-13  Sally        60        0.2777777777777778
2018-05-13  June         156       0.7222222222222222
2018-05-14  John         96        1.0
2018-05-25  Bob          192       0.5962732919254659
2018-05-25  Sally        130       0.40372670807453415
2018-05-26  John         156       1.0
2018-05-27  John         52        0.0962962962962963
2018-05-27  June         20        0.037037037037037035
2018-05-27  June         468       0.8666666666666667
2018-06-02  Sally        26        1.0
2018-06-03  John         60        0.2777777777777778
2018-06-03  John         156       0.7222222222222222
2018-06-12  John         96        1.0
2018-06-13  Bob          192       0.5962732919254659
2018-06-13  Sally        130       0.40372670807453415
2018-06-15  John         156       1.0
2018-06-24  Bob          52        0.7222222222222222
2018-06-24  Sally        20        0.2777777777777778
2018-06-29  John         468       1.0
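The same calculation can be sketched outside the database. This Go example takes three rows from the result above and divides each amount by its day's total, which is the role PARTITION BY sale_date plays in the query. The sale type and ratioToReport function are hypothetical names.

```go
package main

import "fmt"

// sale is a hypothetical, cut-down stand-in for a row of public.sales.
type sale struct {
	date   string
	amount float64
}

// ratioToReport divides each amount by the total for its date.
func ratioToReport(sales []sale) []float64 {
	totals := make(map[string]float64)
	for _, s := range sales {
		totals[s.date] += s.amount
	}
	out := make([]float64, len(sales))
	for i, s := range sales {
		out[i] = s.amount / totals[s.date]
	}
	return out
}

func main() {
	sales := []sale{
		{"2018-05-02", 26},
		{"2018-05-13", 60},
		{"2018-05-13", 156},
	}
	for i, r := range ratioToReport(sales) {
		fmt.Println(sales[i].date, sales[i].amount, r)
	}
}
```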

ROW_NUMBER

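ROW_NUMBER is a utility function that simply gives each row an ordinal value, counting up from 1 within the window.

We number the sales within each day by applying ROW_NUMBER over a window partitioned by sale_date. Note that without an ORDER BY inside the OVER clause, the order in which rows are numbered within a day is not guaranteed.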

SELECT sale_date, salesperson, quantity::decimal * unit_cost,
  ROW_NUMBER() OVER (PARTITION BY sale_date)
FROM public.sales
ORDER BY sale_date, sale_id;

sale_date   salesperson  ?column?  row_number
2018-05-02  Bob          26        1
2018-05-13  Sally        60        1
2018-05-13  June         156       2
2018-05-14  John         96        1
2018-05-25  Bob          192       1
2018-05-25  Sally        130       2
2018-05-26  John         156       1
2018-05-27  John         52        1
2018-05-27  June         20        2
2018-05-27  June         468       3
2018-06-02  Sally        26        1
2018-06-03  John         60        1
2018-06-03  John         156       2
2018-06-12  John         96        1
2018-06-13  Bob          192       1
2018-06-13  Sally        130       2
2018-06-15  John         156       1
2018-06-24  Bob          52        1
2018-06-24  Sally        20        2
2018-06-29  John         468       1

STDDEV_SAMP and STDDEV_POP

STDDEV_SAMP and STDDEV_POP will find the sample and population standard deviation of the values seen in a window.

SUM

The SUM function will retrieve the accumulated sum of an expression over the defined window.

VAR_SAMP and VAR_POP

VAR_SAMP and VAR_POP will find the sample and population variance of the values seen in a window.
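The sample and population variants differ only in the divisor: the population forms divide by n, the sample forms by n - 1, and the standard deviation is the square root of the variance. A minimal Go sketch with a hypothetical variance helper and made-up input values:

```go
package main

import (
	"fmt"
	"math"
)

// variance returns the population variance (divide by n) when sample is
// false, and the sample variance (divide by n-1) when sample is true.
// xs needs at least two values for the sample form.
func variance(xs []float64, sample bool) float64 {
	n := float64(len(xs))
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= n

	ss := 0.0 // sum of squared deviations from the mean
	for _, x := range xs {
		d := x - mean
		ss += d * d
	}
	if sample {
		return ss / (n - 1)
	}
	return ss / n
}

func main() {
	xs := []float64{2, 4, 4, 4, 5, 5, 7, 9}
	fmt.Println("VAR_POP:", variance(xs, false), "STDDEV_POP:", math.Sqrt(variance(xs, false)))
	fmt.Println("VAR_SAMP:", variance(xs, true), "STDDEV_SAMP:", math.Sqrt(variance(xs, true)))
}
```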