Cogs and Levers A blog full of technical stuff

Getting started with Go

Go is a general purpose programming language that aims to resolve some of the shortcomings observed in other languages. Some key features of Go are that it’s statically typed, and that it has a major focus on making scalability, multiprocessing and networking easy.

In today’s post, I’ll go through some of the steps that I’ve taken to prepare a development environment that you can be immediately productive in.

Code organisation

To take a lot of the think work out of things, as well as present a consistent view from machine-to-machine, there are some strict rules around code organisation. A full run-down on the workspace can be found here; but for the purposes of today’s article we’ll look at locating a folder at ~/Source/go.

Docker for development

To avoid cluttering my host system, I make extensive use of Docker containers. Docker containers allow me to run multiple versions of the same software concurrently, and also make all of my environments disposable. Whilst the instructions below are centred on the go command, all of them will be executed in the context of a golang container. The following command sets up a container for the duration of one command’s execution:

docker run -ti --rm -v ~/Source/go:/go golang

-ti runs the container interactively allocating a TTY; --rm cleans the container up after the command has finished executing; we mount our go source folder inside the container at the pre-configured /go directory.

I found it beneficial to make an alias in zsh wrapping this up for me.

Hello, world

Getting that first application up and running is pretty painless. We need to create a directory for our project, write the program, then build and run it.

# Create the project folder
mkdir -p src/github.com/user/hello

# Get editing the program
cd src/github.com/user/hello
vim hello.go

As you’d expect, we create our program:

package main

import "fmt"

func main() {
  fmt.Printf("Hello, world\n")
}

Now we can build the program.

go install github.com/user/hello

We’re done

You’ll have a binary waiting for you to execute now.

bin/hello
Hello, world

Blockchain Basics

A blockchain is a linked list of record items that are chained together with hashes. To make it a little more concrete, each subsequent block in a chain includes its predecessor’s hash as part of the information that goes into its own hash.

This forms a strong chain of records that is very difficult to change: altering one record changes its hash, so every record that follows it would have to be re-processed as well.

Each record in the chain typically stores:

  • A timestamp
  • The actual data for the block
  • A reference to the predecessor block

In today’s post, I’ll try to continue this explanation using an implementation written in C++.

A simple implementation

It’ll be a pretty easy build. We’ll need a block class, which really does all of the work for us. We’ll need a way to hash a block that gives us a re-usable string. Finally, we’ll put the whole implementation together using a vector.

The block

We need a timestamp, the actual data and the hash of the predecessor.

class block {
public:
  block(const long int ts, const std::string &data, const std::string &prev_hash)
    : _ts(ts), _data(data), _prev_hash(prev_hash) { }

public:
  long ts() const   { return _ts; }
  const std::string& data() const { return _data; }
  const std::string& prev_hash() const { return _prev_hash; }

private:
  long _ts;
  std::string _data;
  std::string _prev_hash;
};

In this class, _ts assumes the role of the timestamp; _data holds an arbitrary string of our data and _prev_hash will be the hex string of the hash from the previous record.

The block needs a way of hashing all of its details to produce a new hash. We’ll do this by concatenating all of the data within the block, and running it through a SHA256 hasher. I found a really simple implementation here.

std::string hash(void) const {
  std::stringstream ss;

  ss << _ts 
     << _data 
     << _prev_hash; 

  std::string src = ss.str();
  std::vector<unsigned char> hash(32);
  
  picosha2::hash256(
    src.begin(), 
    src.end(), 
    hash.begin(), 
    hash.end()
  );

  return picosha2::bytes_to_hex_string(
    hash.begin(), hash.end()
  );
}

_ts, _data and _prev_hash get concatenated and hashed.

Now we need a way to seed a chain, as well as build subsequent blocks. Seeding a list is nothing more than just generating a single block that contains no previous reference:

static block create_seed(void) {
  auto temp_ts = std::chrono::system_clock::now().time_since_epoch();

  return block(
    temp_ts.count(),
    "Seed block",
    ""
  );
}

Really simple. The empty string stands in for "no predecessor". It could be swapped out for something like nullptr should we want to add some more branches to the hasher and change the internal type of _prev_hash. This will do for our purposes though.

static block create_next(const block &b, const std::string &data) {
  auto temp_ts = std::chrono::system_clock::now().time_since_epoch();

  return block(
    temp_ts.count(),
    data, 
    b.hash()
  );    
}

The next blocks need to be generated from another block; in this case b. We use its hash to populate the _prev_hash field of the new block.

This is the key part of the design though. Because the previous block’s hash becomes part of the concatenated string that gets hashed for the new block, we form a strong dependency on it. This dependency is what chains the records together and makes them very difficult to change.

Finally, we can test out our implementation. I’ve created a function called make_data which just generates a JSON string, ready for the _data field to store. It simply holds 3 random numbers; but you could imagine that this might be important data for your business process.

int main(int argc, char *argv[]) {

  std::vector<block> chain = { 
    block::create_seed()
  };

  for (int i = 0; i < 5; ++i) {
    // get the last block in the chain (by reference, to avoid a copy)
    const block &last = chain.back();

    // create the next block
    chain.push_back(block::create_next(last, make_data()));
  }

  print_chain(chain);

  return 0;
}

Running this code, we can see that the chain is printed to screen:

index: 0
ts: 1502801929223372929
data: Seed block
this: b468ae4c1a5a0b416162a59ebcdd75922ab011d0cc434c8c408b6507459abd5b
prev: 
-------------------------------------
index: 1
ts: 1502801929223494692
data: { "a": 1804289383,"b": 846930886,"c": 1681692777 }
this: 25d892a8de27890ee057923e784124c9c07161ff340d3d11e5f76b5a865e03af
prev: b468ae4c1a5a0b416162a59ebcdd75922ab011d0cc434c8c408b6507459abd5b
-------------------------------------
index: 2
ts: 1502801929223598810
data: { "a": 1714636915,"b": 1957747793,"c": 424238335 }
this: 8a7d5fe462e71663cccda99f80cce99b25199b82fbefc11c6a3f6c2cc4e985f3
prev: 25d892a8de27890ee057923e784124c9c07161ff340d3d11e5f76b5a865e03af
-------------------------------------
index: 3
ts: 1502801929223720644
data: { "a": 719885386,"b": 1649760492,"c": 596516649 }
this: 0da2de773551a15f6bca003196a02b30312c895dab6835c9ac3434f852eeaa60
prev: 8a7d5fe462e71663cccda99f80cce99b25199b82fbefc11c6a3f6c2cc4e985f3
-------------------------------------
index: 4
ts: 1502801929223837467
data: { "a": 1189641421,"b": 1025202362,"c": 1350490027 }
this: a3709be80f80c24ac6ebc526a9dcec5e2212c03260a2f82f5e9943f762becb6e
prev: 0da2de773551a15f6bca003196a02b30312c895dab6835c9ac3434f852eeaa60
-------------------------------------
index: 5
ts: 1502801929223952445
data: { "a": 783368690,"b": 1102520059,"c": 2044897763 }
this: c1de653540556064b3d01fba21d0a80a07071b19969d3e635ad66eb3db2e6272
prev: a3709be80f80c24ac6ebc526a9dcec5e2212c03260a2f82f5e9943f762becb6e
-------------------------------------

Note that index isn’t a member of the class; it just counts while we’re iterating over the vector. The real membership here is established through _prev_hash, as discussed above.

Where to?

Now that the storage mechanism is understood, we can apply proof-of-work paradigms to attribute a sense of value to our records. More information on how this has been applied can be read up in the following:

The full source code for this article can be found here.

SSH tunneling

SSH Tunneling is a technique that allows you to provide access to a network or service, through another access point. This is particularly useful when you aren’t afforded immediate access to the network or service that you’re trying to reach. By marshalling this traffic through a network protocol that you can immediately access (and allowing that secondary point to forward your network requests on), you can achieve the access that you require.

From the wikipedia article:

In computer networks, a tunneling protocol allows a network user to access or provide a network service that the underlying network does not support or provide directly.

In today’s article, I’ll demonstrate a basic setup for tunneling to different services.

The basic format of the command that you’ll use looks like this:

ssh -f remote-user@remote-host -L local-port:remote-host:remote-port -N

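-f tells the ssh command to drop into the background after its invocation. -L forwards connections made to local-port on your machine through the SSH server and on to remote-port on the host named in the middle of the mapping. That host is resolved from the SSH server’s side, so it can be the server itself or another machine that the server can reach. -N tells OpenSSH not to execute a remote command.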

Examples of the local-port:remote-host:remote-port might look as follows.

A firewall that you’re implicitly connected through (as a result of being at Starbucks, or at a hotel) isn’t allowing you to connect to your home email server (on port 25). You set up a tunnel using 5000:my-email-host-at-home.com:25 and connect your email client to localhost on port 5000. Traffic between your machine and the SSH server is now encrypted inside the tunnel, and the server forwards it on to port 25.

Your company doesn’t allow IRC traffic through its firewall. You use 9000:irc.server.com:6667 to get around these restrictions, sending your chat data encrypted through the firewall.

Type Families language pragma

The Type Families language pragma gives the developer the ability to associate one type with another. This allows us to write the same function for different types.

In this example, the type class Falsey is somewhat of a loose boolean test on everyday values, not just Bool values. By allowing the developer to specify a type declaration within the class and each instance, we establish the association between the types:

{-# LANGUAGE TypeFamilies #-}

class Falsey a where
  type Value a
  isFalsey :: a -> Bool

instance Falsey [a] where
  type Value [a] = a
  isFalsey [] = True
  isFalsey _  = False

instance Falsey Bool where
  type Value Bool = Bool
  isFalsey x  = not x

main :: IO ()
main = do
  print $ isFalsey []
  print $ isFalsey [1, 2, 3]
  print $ isFalsey True
  print $ isFalsey False

OverloadedStrings Language Pragma

The OverloadedStrings language pragma can be enabled either by passing the -XOverloadedStrings switch to GHC or you can just add the following to the top of your Haskell source:

{-# LANGUAGE OverloadedStrings #-}

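The OverloadedStrings language pragma changes the way that string literals are typed. Rather than always being [Char], a literal can take on any type that has an IsString instance, which includes more efficient representations such as Text and ByteString. [Char] is a rather cumbersome type to use when dealing with something as primitive as a string.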

Prelude> :t "Hello, world"
"Hello, world" :: [Char]
Prelude> :set -XOverloadedStrings
Prelude> :t "Hello, world"
"Hello, world" :: Data.String.IsString t => t

The literal string "Hello, world" now identifies as a call to the fromString function out of the IsString type class. You can define instances like so:

import GHC.Exts ( IsString(..) )

data Colour = Red | Green | Blue | Other String deriving Show

instance IsString Colour where
  fromString "Red" = Red
  fromString "Green" = Green
  fromString "Blue" = Blue 
  fromString xs = Other xs

Now we just annotate our string literals with our type, and fromString is invoked for us:

Prelude GHC.Exts> "Red" :: Colour
Red
Prelude GHC.Exts> "Yellow" :: Colour
Other "Yellow"