Cogs and Levers A blog full of technical stuff

PostgreSQL Data Access with Haskell


PostgreSQL is a very popular relational database which has quite a few different data access libraries available for the Haskell programming language.

Today’s article aims to get you up and running, executing queries against PostgreSQL from your Haskell environment with the least amount of hassle.


The first library that we’ll go through is postgresql-simple. This library has a very basic interface, and is really simple to get up and running.

A mid-level client library for the PostgreSQL database, aimed at ease of use and high performance.


Before you get started though, you’ll need libpq installed.

pacman -S postgresql-libs

Now you’re ready to develop.

You’ll need to add a dependency on the postgresql-simple library to your application. The following code will then allow you to connect to your PostgreSQL database, and run a simple command.

Hello, Postgres!

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Database.PostgreSQL.Simple

localPG :: ConnectInfo
localPG = defaultConnectInfo
        { connectHost = ""
        , connectDatabase = "clients"
        , connectUser = "app_user"
        , connectPassword = "app_password"
        }

main :: IO ()
main = do
  conn <- connect localPG
  mapM_ print =<< (query_ conn "SELECT 1 + 1" :: IO [Only Int])

When your application successfully builds and executes, you should be met with the following output:

Only {fromOnly = 2}

Walking through this code quickly, we first enable OverloadedStrings so that we can specify our Query values as literal strings.

localPG :: ConnectInfo
localPG = defaultConnectInfo
        { connectHost = ""
        , connectDatabase = "clients"
        , connectUser = "app_user"
        , connectPassword = "app_password"
        }

In order to connect to Postgres, we use a ConnectInfo value. We start from defaultConnectInfo and override only the fields that our example needs. I’m running PostgreSQL in a docker container, so connectHost is where my docker network address goes.

conn <- connect localPG

The localPG value is now used to connect to the Postgres database. After a successful connection, conn is the handle that we use to send instructions to the database.

mapM_ print =<< (query_ conn "SELECT 1 + 1" :: IO [Only Int])

Finally, we run our query SELECT 1 + 1 using the query_ function. conn is passed to identify the connection to execute this query on.

With this basic code, we can start to build on some examples.

Retrieve a specific record

In the Hello, World example above, we were adding two static values to return another value. As examples get more complex, we need to give the library more information about the data that we’re working with. The library already knows how to convert an Int (along with other basic data types), so no extra work was needed there.

In the client database table we have a list of names and ids. We can create a function to retrieve the name of a client, given an id:

retrieveClient :: Connection -> Int -> IO [Only String]
retrieveClient conn cid = query conn "SELECT name FROM client WHERE id = ?" $ (Only cid)

The Query template passed in makes use of the ? character to specify where substitutions will be put. Note the use of query rather than query_. In this case, query also accepts the values for substitution, either as a tuple or, for a single value, wrapped in Only.

Using the FromRow type class, our code can define a much stronger API. We can actually retrieve client rows from the database and convert them into Client values.

We need FromRow first:

import Database.PostgreSQL.Simple.FromRow

The Client data type needs definition now. It’s how we’ll refer to a client within our Haskell program:

data Client = Client { id :: Int, name :: String }
  deriving (Show)

The Client data type now gets a FromRow instance, which allows postgresql-simple to use it.

instance FromRow Client where
  fromRow = Client <$> field <*> field

The fromRow definition reads one field per record field, in the order that the fields are declared. The retrieveClient function only changes to broaden its query, and change its return type!

retrieveClient :: Connection -> Int -> IO [Client]
retrieveClient conn cid = query conn "SELECT id, name FROM client WHERE id = ?" $ (Only cid)

Create a new record

When creating data, you can use the function execute. The execute function runs a statement that doesn’t return any rows; what it gives back is the number of rows affected.

execute conn "INSERT INTO client (name) VALUES (?)" (Only "Sam")

Extending our API, we can make a createClient function; but with a twist. We’ll also return the generated identifier (because of the id field).

createClient :: Connection -> String -> IO [Only Int64]
createClient conn name =
  query conn "INSERT INTO client (name) VALUES (?) RETURNING id" $ (Only name)

We need a definition for Int64. This type is wide enough to hold any value that a SERIAL (or BIGSERIAL) column in PostgreSQL can generate.

import Data.Int

We can now use createClient to set up an interface of sorts for users to enter information.

main :: IO ()
main = do
  conn <- connect localPG
  putStrLn "Name of your client? "
  clientName <- getLine
  cid <- createClient conn clientName
  putStrLn $ "New Client: " ++ (show cid)

We’ve created a data creation interface now.

Name of your client?
New Client: [Only {fromOnly = 4}]

Update an existing record

When it comes to updating data, we don’t expect much back in return aside from the number of records affected by the instruction. The execute function does exactly this. By checking the returned count, we can convert it into a success/fail style result. I’ve simply encoded this as a boolean here.

updateClient :: Connection -> Int -> String -> IO Bool
updateClient conn cid name = do
  n <- execute conn "UPDATE client SET name = ? WHERE id = ?" (name, cid)
  return $ n > 0

Destroying records

Finally, deleting records from the database will look a lot like the update.

deleteClient :: Connection -> Int -> IO Bool
deleteClient conn cid = do
  n <- execute conn "DELETE FROM client WHERE id = ?" $ (Only cid)
  return $ n > 0

Because execute returns the affected row count, we can perform the same post-execution validation again.


Those are some basic operations to get up and running using postgresql-simple. It really looks like you can take it from prototyping software all the way through to writing fully blown applications.

Really simple to use.

Add info at build time to Go binaries


Sometimes it can be useful to capture information about your environment at build time, and have this information injected into the binary that you’re building. Some examples centre around versioning, where it might make sense to capture git commit hashes or build serials.

An example program

package main

import "fmt"

var gitCommit string
var buildSerial string

func main() {
  fmt.Printf("git hash: %s, build serial: %s", gitCommit, buildSerial)
}

Two variables in this module, gitCommit and buildSerial, are going to hold some version information for us. Running this program yields some rather uninteresting results, because nothing has set them yet.

go run main.go
git hash: , build serial: 

-X switch

While building a program, you can use the -X linker switch to set the value of package-level string variables from the build process.

We can obtain the latest commit hash using git with the following:

git rev-list -1 HEAD

We can even synthesize a build number involving the date, perhaps?

date +%s

Using the -ldflags switch, we can now specify these at the console.

go build -ldflags "-X main.gitCommit=`git rev-list -1 HEAD` -X main.buildSerial=`date +%s`" main.go

Closing up

Now that we have a binary built, its build information has been applied. The linker set both variables, so running the binary prints the values:

git hash: f6f62d9a759a03afffb913a1d24fb64a1bc5507d, build serial: 1564573076

Remember, these switches can be buried behind a Makefile also, so you don’t need to be typing these things over and over.


.PHONY: build

GIT_COMMIT := $(shell git rev-list -1 HEAD)
BUILD_NO := $(shell date +%s)

build:
	GOOS=$(GOOS) GOARCH=$(GOARCH) go build -ldflags "-X main.gitCommit=$(GIT_COMMIT) -X main.buildSerial=$(BUILD_NO)" .



Sometimes you may need to investigate the contents of binary files. Simply using cat to view these details in your terminal can have all sorts of random effects due to control characters, etc. The utility hexdump allows you to look at the contents of these files in a sane way.

From the hexdump manpage:

display file contents in hexadecimal, decimal, octal, or ascii

In today’s article, we’ll walk through some example usages of this utility.


For all of these examples, we’ll be using a 256 byte file of random binary. I generated this data on my system with the following command:

head -c 256 /dev/urandom > example

The initial view of this data now looks like this:

➜  ~ hexdump example 
0000000 5103 7055 bd22 a3bf 2f36 fc05 3a80 5d5a
0000010 0e4c cbdd 06a7 9dc3 b104 2dae 0c3e e9e6
0000020 d01a dc5a 2eaf c01d 5336 d738 231c 0358
0000030 9133 eafa cd24 1206 0f71 988e 2349 648c
0000040 1eb8 7cf4 e7b8 4e61 a5e9 aa16 063f 9370
0000050 7bab e97d c197 6662 e99d 0b97 381a 9712
0000060 7e88 ed64 2b22 74b9 3f5b c68f ce00 5c6e
0000070 7d4c 5f5f ee66 6198 b812 f54d 740a 0343
0000080 d1ce 7092 2623 91fa f7a7 cc0a 961b 10dd
0000090 ea41 b512 806f 16ee 74bf 32dd fc13 6bc9
00000a0 7126 99b5 1a7c 7282 a464 93a4 aae1 6070
00000b0 8e28 e93a 5342 c6fd 027a 6837 1131 668e
00000c0 574b 5025 4e8c 0f6a d2bd 6b7a c8ec daa0
00000d0 9ebc 3c2d d288 0514 2493 1aca ffd0 684c
00000e0 9bdc d2c8 b1f5 f862 4c5c b6c4 b722 9397
00000f0 d4f6 2bf0 74a5 a00a 8007 5fc5 cf99 0701
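
One thing worth noticing in this default view: the file is grouped into two-byte units, and each unit is printed as a single hexadecimal number in the host's byte order. On a little-endian machine (an assumption here, although it matches this dump) the first two bytes of the file, 03 and 51, are therefore shown as 5103. A minimal sketch in C of that combination, where word_from_le and format_word are hypothetical helper names and not part of hexdump:

```c
#include <stdio.h>

/* Hypothetical helper: combine two consecutive file bytes into the 16-bit
   value that a little-endian host would read from them. */
static unsigned int word_from_le(unsigned char first, unsigned char second) {
  return (unsigned int)first | ((unsigned int)second << 8);
}

/* Format the word as four hex digits, like one column of the default view.
   out must have room for at least 5 characters. */
static void format_word(unsigned char first, unsigned char second, char *out) {
  sprintf(out, "%04x", word_from_le(first, second));
}
```

Calling format_word(0x03, 0x51, buf) leaves the text 5103 in buf, which is the first column of the dump above.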


Now things get interesting. The -e switch of the hexdump command allows us to specify a format string that controls the output to the terminal.

➜  ~ hexdump -v -e '"%07_ax  |"' -e '16/1 "%_p" "|\n"' example

Using _a[dox] we can control how the offset down the left-hand side looks. %07_ax prints the offset in hexadecimal, zero-padded to a width of 7. 16/1 "%_p" applies %_p to 16 units of 1 byte each, which prints each byte using the default character set. The output looks like this:

0000000  |.QUp"...6/...:Z]|
0000010  |L..........->...|
0000020  |..Z.....6S8..#X.|
0000030  |3...$...q...I#.d|
0000040  |...|..aN....?.p.|
0000050  |.{}...bf.....8..|
0000060  |.~d."+.t[?....n\|
0000070  |L}__f..a..M..tC.|
0000080  |...p#&..........|
0000090  |A...o....t.2...k|
00000a0  |&q..|..rd.....p`|
00000b0  |(.:.BS..z.7h1..f|
00000c0  |KW%P.Nj...zk....|
00000d0  |..-<.....$....Lh|
00000e0  |......b.\L.."...|
00000f0  |...+.t....._....|

Anytime this format encounters a non-printable character, a . is put in its place.


-v -C gives a side-by-side of hex values along with the printable characters:

➜  ~ hexdump -v -C example

This is probably the most familiar format:

00000000  03 51 55 70 22 bd bf a3  36 2f 05 fc 80 3a 5a 5d  |.QUp"...6/...:Z]|
00000010  4c 0e dd cb a7 06 c3 9d  04 b1 ae 2d 3e 0c e6 e9  |L..........->...|
00000020  1a d0 5a dc af 2e 1d c0  36 53 38 d7 1c 23 58 03  |..Z.....6S8..#X.|
00000030  33 91 fa ea 24 cd 06 12  71 0f 8e 98 49 23 8c 64  |3...$...q...I#.d|
00000040  b8 1e f4 7c b8 e7 61 4e  e9 a5 16 aa 3f 06 70 93  |...|..aN....?.p.|
00000050  ab 7b 7d e9 97 c1 62 66  9d e9 97 0b 1a 38 12 97  |.{}...bf.....8..|
00000060  88 7e 64 ed 22 2b b9 74  5b 3f 8f c6 00 ce 6e 5c  |.~d."+.t[?....n\|
00000070  4c 7d 5f 5f 66 ee 98 61  12 b8 4d f5 0a 74 43 03  |L}__f..a..M..tC.|
00000080  ce d1 92 70 23 26 fa 91  a7 f7 0a cc 1b 96 dd 10  |...p#&..........|
00000090  41 ea 12 b5 6f 80 ee 16  bf 74 dd 32 13 fc c9 6b  |A...o....t.2...k|
000000a0  26 71 b5 99 7c 1a 82 72  64 a4 a4 93 e1 aa 70 60  |&q..|..rd.....p`|
000000b0  28 8e 3a e9 42 53 fd c6  7a 02 37 68 31 11 8e 66  |(.:.BS..z.7h1..f|
000000c0  4b 57 25 50 8c 4e 6a 0f  bd d2 7a 6b ec c8 a0 da  |KW%P.Nj...zk....|
000000d0  bc 9e 2d 3c 88 d2 14 05  93 24 ca 1a d0 ff 4c 68  |..-<.....$....Lh|
000000e0  dc 9b c8 d2 f5 b1 62 f8  5c 4c c4 b6 22 b7 97 93  |......b.\L.."...|
000000f0  f6 d4 f0 2b a5 74 0a a0  07 80 c5 5f 99 cf 01 07  |...+.t....._....|
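
To make the layout of a row concrete, here is a minimal sketch in C that builds one row in the same offset, hex, and printable-character arrangement. It is an illustration only, not how hexdump itself is implemented, and format_row is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: format up to 16 bytes as one row in the
   offset / hex / |printable| layout shown above.
   out should have room for at least 80 characters. */
static void format_row(unsigned long offset, const unsigned char *bytes,
                       size_t count, char *out) {
  size_t i;
  char *p = out;

  p += sprintf(p, "%08lx ", offset);

  for (i = 0; i < 16; i++) {
    if (i % 8 == 0) {
      p += sprintf(p, " ");            /* extra gap before each group of 8 */
    }
    if (i < count) {
      p += sprintf(p, "%02x ", bytes[i]);
    } else {
      p += sprintf(p, "   ");          /* pad a short final row */
    }
  }

  p += sprintf(p, " |");
  for (i = 0; i < count && i < 16; i++) {
    /* printable ASCII is kept, everything else becomes '.' */
    unsigned char c = bytes[i];
    *p++ = (c >= 0x20 && c <= 0x7e) ? (char)c : '.';
  }
  *p++ = '|';
  *p = '\0';
}
```

Feeding it the first 16 bytes of the example file with an offset of 0 reproduces the first row of the listing above.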

Conway's Game of Life

The Game of Life is a cellular simulation; not really a game that is played, but more a sequence that is observed. In today’s article, I’ll walk through the rules and a simple C implementation.

How to play

There are some simple rules that the game must abide by.

The universe that it is played within is an orthogonal cartesian grid of squares that define if a cell is dead or if it’s alive. The lifespan of the cells is determined by the following rules:

  • Any cell that’s alive with fewer than two live neighbours dies (underpopulation)
  • Any cell that’s alive with two or three live neighbours lives on to the next generation
  • Any cell that’s alive with more than three live neighbours dies (overpopulation)
  • Any dead cell that has three live neighbours becomes alive (reproduction)

And, that’s it.
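
The four rules collapse into a single decision per cell. As a minimal sketch (next_state is a hypothetical helper name, separate from the implementation that follows), the next state depends only on whether the cell is alive now and on how many live neighbours it has:

```c
/* Hypothetical helper: next state of one cell, given its current state
   (1 alive, 0 dead) and its number of live neighbours (0 to 8). */
static int next_state(int alive, int neighbours) {
  if (alive) {
    /* survives with two or three neighbours; otherwise dies of
       underpopulation or overpopulation */
    return (neighbours == 2 || neighbours == 3) ? 1 : 0;
  }

  /* a dead cell comes alive only with exactly three neighbours */
  return (neighbours == 3) ? 1 : 0;
}
```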


The pitch where the game is played would be a pretty simple buffer of 1’s and 0’s. 1 would define “alive”, and 0 would define “dead”:

#define UV_WIDTH 64
#define UV_HEIGHT 32

unsigned char *universe;

universe = (unsigned char *) malloc(UV_HEIGHT * UV_WIDTH * sizeof(unsigned char));

So, a buffer of memory for a defined width and height will do the job.

The remainder of the process could be split into the following:

  • Seed the universe
  • Permute the universe
  • Render the universe


Chicken or the egg isn’t asked here. We just use srand and rand to play god for us:

void universe_seed(unsigned char *u, int width, int height) {
  int i;

  for (i = 0; i < (width * height); i ++) {
    u[i] = rand() % 9 == 0;
  }
}

For every cell, we’ll get a random number from rand. If that number is divisible by 9, we’ll mark the cell as alive.

There are much more clever ways to seed the universe, in such a way that the rules of the game keep the generations running for ever with very clever patterns.
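
One well-known example of such a pattern is the glider, five live cells that move diagonally across the grid as the generations pass. The sketch below stamps one into a universe that has already been zeroed; universe_seed_glider is a hypothetical name and is not part of the original code.

```c
/* Hypothetical helper: stamp a glider into a zeroed universe with its
   top-left corner at (ox, oy). Assumes ox and oy are non-negative;
   cells that run past the right or bottom edge wrap around. */
static void universe_seed_glider(unsigned char *u, int width, int height,
                                 int ox, int oy) {
  /* (x, y) offsets of the five live cells of a glider */
  static const int cells[5][2] = { {1, 0}, {2, 1}, {0, 2}, {1, 2}, {2, 2} };
  int i;

  for (i = 0; i < 5; i++) {
    int x = (ox + cells[i][0]) % width;
    int y = (oy + cells[i][1]) % height;
    u[x + (y * width)] = 1;
  }
}
```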


Actually making the universe kick along between frames, is simply applying the rules to a buffer of states. This buffer of states needs to be considered all in the same move; so we can’t mutate the original buffer sequentially.

void universe_permute(unsigned char *u, int width, int height) {
  int x, y, n, l;
  unsigned char *r = (unsigned char *)malloc(width * height);

  memcpy(r, u, width * height);

  for (y = 0; y < height; y ++) {
    for (x = 0; x < width; x ++) {
      l = u[x + (y * width)];
      n = count_live_neighbours(u, width, height, x, y);

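      if (l && ((n < 2) || (n > 3))) {
        /* underpopulation or overpopulation */
        r[x + (y * width)] = 0;
      } else if (!l && (n == 3)) {
        /* reproduction */
        r[x + (y * width)] = 1;
      }
    }
  }

  /* publish the next generation and release the scratch buffer */
  memcpy(u, r, width * height);
  free(r);
}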

A copy of the game buffer is made, first up. This is what we’ll actually write the next buffer states to; leaving the current buffer intact.

Following the rules of the game:

  if (l && ((n < 2) || (n > 3))) {
    r[x + (y * width)] = 0;
  } else if (!l && (n == 3)) {
    r[x + (y * width)] = 1;
  }

Here, we see the overpopulation, underpopulation, and reproduction rules in action.

The number of live neighbours is counted with a twist:

int count_live_neighbours(unsigned char *u, int width, int height, int x, int y) {
  /* wrap around the edges; adding width/height first keeps the
     left operand of % non-negative when x or y is 0 */
  int x1 = (x - 1 + width) % width;
  int x2 = (x + 1) % width;
  int y1 = (y - 1 + height) % height;
  int y2 = (y + 1) % height;

  return u[x1 + (y1 * width)] +
         u[x1 + (y2 * width)] +
         u[x2 + (y1 * width)] +
         u[x2 + (y2 * width)] +
         u[x  + (y1 * width)] +
         u[x  + (y2 * width)] +
         u[x1 + (y  * width)] +
         u[x2 + (y  * width)];
}

The x and y values are wrapped to the width and height values. This means that if you fall off the right-hand side of the universe, you’ll magically appear back on the left-hand side. The same applies from top to bottom.

A neighbour check must look at all 8 cells that surround the cell in question. If a cell is alive, its value will be 1; this gives us a really simple hack of adding all of these values together. The sum tells us the number of live neighbours of this cell.
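
One detail to watch when wrapping coordinates in C: since C99 the % operator truncates toward zero, so -1 % width evaluates to -1 rather than width - 1. A small helper can make the wrap safe for negative inputs as well. This is a sketch, and wrap is a hypothetical name:

```c
/* Hypothetical helper: wrap v into the range 0 .. limit - 1,
   including for negative v. Assumes limit > 0. */
static int wrap(int v, int limit) {
  return ((v % limit) + limit) % limit;
}
```

With a width of 64, wrap(-1, 64) gives 63 (the right-hand edge) and wrap(64, 64) gives 0 (the left-hand edge).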


To the terminal.

Always, to the terminal.

You can render anywhere you want. For my example implementation, I’ve used the console.

void universe_render(unsigned char *u, int width, int height) {
  int x, y;


  for (y = 0; y < height; y ++) {
    for (x = 0; x < width; x ++) {
      if (u[x + (y * width)]) {
        mvprintw(y + 1, x, "*");
      } else {
        mvprintw(y + 1, x, ".");
      }
    }
  }
}

Finishing up

Here’s a grab of the console.

Alive: 81    Dead: 1967
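
The two numbers in that status line add up to 2048, which is the 64 × 32 cells of the universe. A minimal sketch of how such a count could be produced, where universe_count_alive is a hypothetical name and not taken from the original code:

```c
/* Hypothetical helper: count the live cells in the universe.
   The number of dead cells is then (width * height) minus this value. */
static int universe_count_alive(const unsigned char *u, int width, int height) {
  int i, alive = 0;

  for (i = 0; i < (width * height); i++) {
    alive += u[i] ? 1 : 0;
  }

  return alive;
}
```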

The full code for this article can be found here.

Create tables from queries with Redshift

As a convenience to the developer, AWS Redshift offers CTAS for those times where you need to materialise a physical table from the result of a query.


CREATE [ [ LOCAL ] { TEMPORARY | TEMP } ]
TABLE table_name
[ ( column_name [, ... ] ) ]
[ BACKUP { YES | NO } ]
[ table_attributes ]
AS query

where table_attributes are:
[ DISTKEY ( distkey_identifier ) ]
[ [ { COMPOUND | INTERLEAVED } ] SORTKEY ( column_name [, ...] ) ]

Reproduced from the documentation.

As you can see, this is basically a CREATE TABLE statement, with a SELECT query at the end of it.

The new table is loaded with data defined by the query in the command. The table columns have names and data types associated with the output columns of the query. The CREATE TABLE AS (CTAS) command creates a new table and evaluates the query to load the new table.