Cogs and Levers A blog full of technical stuff

Clojure threading macros

A threading macro in Clojure is a utility for expressing nested function calls in a linear, pipeline-like fashion.

Simple transformations

Meet mick.

user=> (def mick {:name "Mick" :age 25})
#'user/mick

He’s our subject for today.

If we wanted to give mick an :occupation, we could simply do this using assoc; like so:

user=> (assoc mick :occupation "Painter")
{:name "Mick", :age 25, :occupation "Painter"}

At the same time, we also want to take note of his earnings for the year:

user=> (assoc mick :occupation "Painter" :ytd 0)
{:name "Mick", :age 25, :occupation "Painter", :ytd 0}

Keep in mind that this isn’t actually changing mick at all. It’s just associating new pairs with him, and returning a new map.

mick got paid $100 the other week, so we increment his :ytd by 100. We do this by performing the transformation after we’ve given him the attribute.

user=> (update (assoc mick :occupation "Painter" :ytd 0) :ytd + 100)
{:name "Mick", :age 25, :occupation "Painter", :ytd 100}

He earned another $32 as well, in another job.

user=> (update (update (assoc mick :occupation "Painter" :ytd 0) :ytd + 100) :ytd + 32)
{:name "Mick", :age 25, :occupation "Painter", :ytd 132}

He also got a dog.

user=>  (assoc (update (update (assoc mick :occupation "Painter" :ytd 0) :ytd + 100) :ytd + 32) :pets [:dog])
{:name "Mick", :age 25, :occupation "Painter", :ytd 132, :pets [:dog]}

So, the nesting gets out of control. Quickly.

Thread first macro

We’ll use -> (the thread-first macro) to perform all of these actions in one form (just as we’ve done above), but in a much more readable manner.

user=> (-> mick
  #_=>   (assoc :occupation "Painter" :ytd 0)
  #_=>   (update :ytd + 100)
  #_=>   (update :ytd + 32)
  #_=>   (assoc :pets [:dog]))
{:name "Mick", :age 25, :occupation "Painter", :ytd 132, :pets [:dog]}  

So, it’s the same result; but with a much cleaner, easier-to-read interface.

Thread last macro

We saw above that the -> threading macro works well for bare values being passed to forms. When the problem changes to the value not being supplied in the initial position, we use thread last ->>. The value that we’re threading appears as the last item in each of the transformations, rather than the mick example where they were the first.

user=> (filter #(> % 12) (map #(* % 5) [1 2 3 4 5]))
(15 20 25)

We multiply each element of the vector [1 2 3 4 5] by 5, and then keep only those items that are greater than 12.

Again, nesting quickly takes over here; but we can express this with ->> (the ,,, commas below are just whitespace to Clojure, marking where the threaded value is inserted):

user=> (->> [1 2 3 4 5]
  #_=>   (map #(* % 5) ,,,)
  #_=>   (filter #(> % 12) ,,,))
(15 20 25)

Again, this is a much more readable form.

as

If the insertion point of the threaded value varies, we can use as-> to alias the value.

user=> (as-> "Mick" n
  #_=>   (clojure.string/upper-case n)
  #_=>   (clojure.string/reverse n)
  #_=>   (.substring n 1))
"CIM"

Take the name “Mick”

  • Convert it to upper case
  • Reverse it
  • Substring, skipping the first character

It’s the substring call that’s interesting here: it’s the only call that takes the string in the initial position. upper-case and reverse take it as their only (and therefore last) argument.

some

The two macros some-> and some->> work like their -> and ->> counterparts, except that they short-circuit: as soon as any step evaluates to nil, the remaining forms are skipped and nil is returned. This makes them handy for safely chaining calls (Java interop methods included) that might return nil.

user=> (some-> {:a 1} :a inc)
2
user=> (some-> {:a 1} :b inc)
nil

cond

cond-> and cond->> evaluate a series of test/expression pairs, threading the value into the front (cond->) or back (cond->>) of each expression whose test evaluates to true. Tests that evaluate to false are skipped, and the value passes through unchanged.

The following example has been taken from here.

(defn describe-number [n]
  (cond-> []
    (odd? n) (conj "odd")
    (even? n) (conj "even")
    (zero? n) (conj "zero")
    (pos? n) (conj "positive")))

So you can describe a number as you go:

user=> (describe-number 1)
["odd" "positive"]
user=> (describe-number 5)
["odd" "positive"]
user=> (describe-number 4)
["even" "positive"]
user=> (describe-number -4)
["even"]

CPUID

CPUID is an opcode present in the x86 architecture that provides applications with information about the processor.

In today’s article, I’ll show you how to invoke this opcode and extract the information that it holds.

The Opcode

The CPUID opcode is actually rather simple. Using EAX we can control CPUID to output different pieces of information. The following table outlines all of the information available to us.

EAX                  Description
0                    Vendor ID string; maximum CPUID value supported
1                    Processor type, family, model, and stepping
2                    Cache information
3                    Serial number
4                    Cache configuration
5                    Monitor information
80000000h            Extended vendor ID
80000001h            Extended processor type, family, model, and stepping
80000002h-80000004h  Extended processor name

As you can see, there’s quite a bit of information available to us.

If you take a look at /proc/cpuinfo, you’ll see similar information:

➜  ~ cat /proc/cpuinfo 
processor : 0
vendor_id : GenuineIntel
cpu family  : 6
model   : 142
model name  : Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
stepping  : 9
. . . 

Processor name

We’ll put together an example that will read out the processor name, and print it to screen.

When CPUID is invoked with 0 in EAX, the vendor string is split across EBX, EDX and ECX. We need to piece this information together into a printable string.

To start, we need a buffer to store the vendor id. We know that the id will come back in 3 chunks of 4-bytes each; so we’ll reserve 12 bytes in total.

section .bss
    vendor_id:   resb 12 

The program starts and we execute cpuid. After that, we stuff the three register values into the vendor_id buffer that’s been pre-allocated.

section .text
    global _start

_start:
    mov   rax, 0
    cpuid

    mov   rdi, vendor_id
    mov   [rdi], ebx
    mov   [rdi + 4], edx
    mov   [rdi + 8], ecx

Print it out to the screen using the Linux write system call (here via the legacy int 0x80 interface):

    mov   rax, 4
    mov   rbx, 1
    mov   rcx, vendor_id
    mov   rdx, 12
    int   0x80

. . and exit cleanly:

    mov   rax, 1
    mov   rbx, 0
    int   0x80

Testing

Assembling and executing this code is pretty easy.

$ nasm -f elf64 -g cpuid.asm
$ ld -s -o cpuid cpuid.o    
$ ./cpuid 
GenuineIntel

From here

There are many more CPUID leaves that will allow you to view data about your processor. Going through the documentation, you’ll create yourself a full cpuinfo replica in no time.

Create a REST API with Go

Let’s create a REST API using Go. In our example, we’ll walk through what’s required to make an API for a todo-style application.

Starting off

First up, we’re going to create a project. I’ve called mine “todo”.

mkdir -p $GOPATH/src/github.com/tuttlem/todo

This gives us a project folder. Start by editing your main.go file. We’ll pop the whole application into this single file, as it’ll be simple enough.

package main

import (
  "fmt"
)

func main() {
  fmt.Println("Todo application")
}

The Server

We can now turn our console application into a server application pretty easily with the net/http package. Once we import this, we’ll use the ListenAndServe function to stand a server up. While we’re at it, we’ll create a NotImplementedHandler so we can assertively tell our calling clients that we haven’t done anything just yet.

package main

import (
  "net/http"
)

func main() {

  // start the server listening, and always sending back
  // the "NotImplemented" message
  http.ListenAndServe(":3000", NotImplementedHandler)

}

var NotImplementedHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  w.Header().Set("Content-Type", "application/json")
  w.WriteHeader(http.StatusNotImplemented)
})

Testing this service will be a little pointless, but we can see our 501’s being thrown:

➜  ~ curl --verbose http://localhost:3000/something
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 3000 (#0)
> GET /something HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 501 Not Implemented
< Content-Type: application/json
< Date: Wed, 27 Sep 2017 13:26:33 GMT
< Content-Length: 0
< 
* Connection #0 to host localhost left intact

Routing

Routing will allow us to direct a user’s request to the correct piece of functionality. Routing also helps us extract input parameters for requests. Using mux from gorilla we can quickly setup the list, create, update and delete endpoints we need to accomplish our TODO application.

import (
  // . . . 
  "github.com/gorilla/mux"
  // . . . 
)

func main() {

  r := mux.NewRouter()

  r.Handle("/todos", NotImplementedHandler).Methods("GET")
  r.Handle("/todos", NotImplementedHandler).Methods("POST")
  r.Handle("/todos/{id}", NotImplementedHandler).Methods("PUT")
  r.Handle("/todos/{id}", NotImplementedHandler).Methods("DELETE")

  // start the server listening, and always sending back
  // the "NotImplemented" message
  http.ListenAndServe(":3000", r)

}

What’s nice about this is that our actual routes are what will emit the 501. Anything that completely misses the router will result in a much more accurate 404. Perfect.

Handlers

We can give the server some handlers now. A handler takes the common shape of:

func handler(w http.ResponseWriter, r *http.Request) {
}

The w parameter, of type http.ResponseWriter, is what we’ll use to send a payload back to the client; r represents the request, and it’s what we’ll use as the input to the process. This is all looking very “server’s output as a function of its input” to me.

var ListTodoHandler = NotImplementedHandler
var CreateTodoHandler = NotImplementedHandler
var UpdateTodoHandler = NotImplementedHandler
var DeleteTodoHandler = NotImplementedHandler

Which means that our router (whilst still unimplemented) starts to make a little more sense.

r.Handle("/todos", ListTodoHandler).Methods("GET")
r.Handle("/todos", CreateTodoHandler).Methods("POST")
r.Handle("/todos/{id}", UpdateTodoHandler).Methods("PUT")
r.Handle("/todos/{id}", DeleteTodoHandler).Methods("DELETE")

Modelling data

We need to start modelling this data so that we can prepare an API to work with it. The following type declaration creates a structure that will define our todo item:

type Todo struct {
  Id              int    `json:"id"`
  Description     string `json:"description"`
  Complete        bool   `json:"complete"`
}

Note the json tags at the end of each of the members in the structure. These control how each member is represented as an encoded JSON value; more idiomatic JSON uses lower-cased member names.

The “database” that our API will manage is a slice.

var Todos []Todo
var Id int

// . . . inside "main"
// Initialize the todo "database"
Id = 1
Todos = []Todo{ Todo{Id: Id, Description: "Buy Cola"} }

Implementation

To “list” out todo items, we simply return the encoded slice.

var ListTodoHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  json.NewEncoder(w).Encode(Todos)
})

Creating an item is a bit more complex due to value marshalling.

var CreateTodoHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  decoder := json.NewDecoder(r.Body)
  var newTodo Todo

  err := decoder.Decode(&newTodo)

  if err != nil {
    w.WriteHeader(http.StatusInternalServerError)
    return
  } 

  defer r.Body.Close()

  Id++
  newTodo.Id = Id

  Todos = append(Todos, newTodo)

  w.WriteHeader(http.StatusCreated)
  json.NewEncoder(w).Encode(Id)
})

In order to implement a delete function, we need a Filter implementation that knows about Todo objects.

func Filter(vs []Todo, f func(Todo) bool) []Todo {
  vsf := make([]Todo, 0)
  for _, v := range vs {
    if f(v) {
      vsf = append(vsf, v)
    }
  }
  return vsf
}

We then import strconv because we’ll need Atoi to convert the incoming string id to an int. Remember, the Id attribute of our Todo object is an int.

var DeleteTodoHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  params := mux.Vars(r)
  id, _ := strconv.Atoi(params["id"])

  Todos = Filter(Todos, func(t Todo) bool { 
    return t.Id != id
  })

  w.WriteHeader(http.StatusNoContent)
})

Finally, an update. We’ll do the same thing as a DELETE, but we’ll swap the posted object in.

var UpdateTodoHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  params := mux.Vars(r)
  id, _ := strconv.Atoi(params["id"])

  Todos = Filter(Todos, func(t Todo) bool { 
    return t.Id != id
  })

  decoder := json.NewDecoder(r.Body)
  var newTodo Todo

  err := decoder.Decode(&newTodo)

  if err != nil {
    w.WriteHeader(http.StatusInternalServerError)
    return
  } 

  defer r.Body.Close()

  newTodo.Id = id

  Todos = append(Todos, newTodo)

  w.WriteHeader(http.StatusNoContent)
})

The UpdateTodoHandler is effectively a mix of the delete and create actions.

Up and running

You’re just about done. The todo API is doing what we’ve asked it to do. The only thing left now is to get some logging going. We’ll do that with some middleware, again from gorilla.

import (
  // . . .

  "os"

  "github.com/gorilla/handlers"
 
  // . . . 

)

// . . down in main() now

  http.ListenAndServe(":3000", 
    handlers.LoggingHandler(os.Stdout, r))

This now gives us a log of the requests hitting our server.

That’s all

That’s all for now. The full source is available as a gist.

Getting started with Go

Go is a general-purpose programming language aimed at resolving some of the shortcomings observed in other languages. Key features of Go are that it’s statically typed, and that it has a major focus on making scalability, multiprocessing and networking easy.

In today’s post, I’ll go through some of the steps that I’ve taken to prepare a development environment that you can be immediately productive in.

Code organisation

To take a lot of the thinking out of things, as well as to present a consistent view from machine to machine, there are some strict rules around code organisation. A full run-down on the workspace can be found here; but for the purposes of today’s article we’ll locate our folder at ~/Source/go.

Docker for development

To avoid cluttering my host system, I make extensive use of Docker containers. Docker containers allow me to run multiple versions of the same software concurrently, and also make all of my environments disposable. Whilst the instructions below centre around the go command, all of these will be executed in the context of a golang container. The following command sets up a container for the duration of one command’s execution:

docker run -ti --rm -v ~/Source/go:/go golang

-ti runs the container interactively, allocating a TTY; --rm cleans the container up after the command has finished executing; -v mounts our Go source folder inside the container at the pre-configured /go directory.

I found it beneficial to make a zsh alias wrapping this up for me, e.g. alias go-env='docker run -ti --rm -v ~/Source/go:/go golang' (the alias name is up to you).

Hello, world

Getting that first application up and running is pretty painless. We need to create a directory for our project, build and run.

# Create the project folder
mkdir -p src/github.com/user/hello

# Get editing the program
cd src/github.com/user/hello
vim hello.go

As you’d expect, we create our program:

package main

import "fmt"

func main() {
  fmt.Printf("Hello, world\n")
}

Now we can build the program.

go install github.com/user/hello

We’re done

You’ll have a binary waiting for you to execute now.

bin/hello
Hello, world

Blockchain Basics

A blockchain is a linked list of record items that are chained together with hashes. To make it a little more concrete: each subsequent block in the chain includes its predecessor’s hash as a piece of the information that makes up its own hash.

This forms a strong chain of records that is very difficult to change without re-processing every subsequent record.

Each record in the chain typically stores:

  • A timestamp
  • The actual data for the block
  • A reference to the predecessor block

In today’s post, I’ll try to continue this explanation using an implementation written in C++.

A simple implementation

It’ll be a pretty easy build. We’ll need a block class, which really does all of the work for us, and a way to hash a block that gives us a re-usable string. Finally, we’ll store the whole chain in a vector.

The block

We need a timestamp, the actual data and the hash of the predecessor.

class block {
public:
  block(const long int ts, const std::string &data, const std::string &prev_hash)
    : _ts(ts), _data(data), _prev_hash(prev_hash) { }

public:
  const long ts() const   { return _ts; }
  const std::string& data() const { return _data; }
  const std::string& prev_hash() const { return _prev_hash; }

private:
  long _ts;
  std::string _data;
  std::string _prev_hash;
};

In this class, _ts assumes the role of the timestamp; _data holds an arbitrary string of our data and _prev_hash will be the hex string of the hash from the previous record.

The block needs a way of hashing all of its details to produce a new hash. We’ll do this by concatenating all of the data within the block, and running it through a SHA256 hasher. I found a really simple implementation here.

std::string hash(void) const {
  std::stringstream ss;

  ss << _ts 
     << _data 
     << _prev_hash; 

  std::string src = ss.str();
  std::vector<unsigned char> hash(32);
  
  picosha2::hash256(
    src.begin(), 
    src.end(), 
    hash.begin(), 
    hash.end()
  );

  return picosha2::bytes_to_hex_string(
    hash.begin(), hash.end()
  );
}

_ts, _data and _prev_hash get concatenated and hashed.

Now we need a way to seed a chain, as well as build subsequent blocks. Seeding a list is nothing more than just generating a single block that contains no previous reference:

static block create_seed(void) {
  auto temp_ts = std::chrono::system_clock::now().time_since_epoch();

  return block(
    temp_ts.count(),
    "Seed block",
    ""
  );
}

Really simple. The empty string could be swapped out for a null-like sentinel should we want to add some more branches to the hasher and change the internal type of _prev_hash. This will do for our purposes though.

static block create_next(const block &b, const std::string &data) {
  auto temp_ts = std::chrono::system_clock::now().time_since_epoch();

  return block(
    temp_ts.count(),
    data, 
    b.hash()
  );    
}

The next blocks need to be generated from another block; in this case b. We use its hash to populate the _prev_hash field of the new block.

This is the key part of the design. With the previous block’s hash making it into the concatenated string that gets hashed into this new block, we form a strong dependency on it. This dependency is what chains the records together and makes them very difficult to change.

Finally, we can test out our implementation. I’ve created a function called make_data which just generates a JSON string, ready for the _data field to manage. It simply holds 3 random numbers; but you could imagine this being critical data for your business process.

int main(int argc, char *argv[]) {

  std::vector<block> chain = { 
    block::create_seed()
  };

  for (int i = 0; i < 5; i ++) {
    // get the last block in the chain
    auto last = chain[chain.size() - 1];

    // create the next block
    chain.push_back(block::create_next(last, make_data()));
  }

  print_chain(chain);

  return 0;
}

Running this code, we can see that the chains are printed to screen:

index: 0
ts: 1502801929223372929
data: Seed block
this: b468ae4c1a5a0b416162a59ebcdd75922ab011d0cc434c8c408b6507459abd5b
prev: 
-------------------------------------
index: 1
ts: 1502801929223494692
data: { "a": 1804289383,"b": 846930886,"c": 1681692777 }
this: 25d892a8de27890ee057923e784124c9c07161ff340d3d11e5f76b5a865e03af
prev: b468ae4c1a5a0b416162a59ebcdd75922ab011d0cc434c8c408b6507459abd5b
-------------------------------------
index: 2
ts: 1502801929223598810
data: { "a": 1714636915,"b": 1957747793,"c": 424238335 }
this: 8a7d5fe462e71663cccda99f80cce99b25199b82fbefc11c6a3f6c2cc4e985f3
prev: 25d892a8de27890ee057923e784124c9c07161ff340d3d11e5f76b5a865e03af
-------------------------------------
index: 3
ts: 1502801929223720644
data: { "a": 719885386,"b": 1649760492,"c": 596516649 }
this: 0da2de773551a15f6bca003196a02b30312c895dab6835c9ac3434f852eeaa60
prev: 8a7d5fe462e71663cccda99f80cce99b25199b82fbefc11c6a3f6c2cc4e985f3
-------------------------------------
index: 4
ts: 1502801929223837467
data: { "a": 1189641421,"b": 1025202362,"c": 1350490027 }
this: a3709be80f80c24ac6ebc526a9dcec5e2212c03260a2f82f5e9943f762becb6e
prev: 0da2de773551a15f6bca003196a02b30312c895dab6835c9ac3434f852eeaa60
-------------------------------------
index: 5
ts: 1502801929223952445
data: { "a": 783368690,"b": 1102520059,"c": 2044897763 }
this: c1de653540556064b3d01fba21d0a80a07071b19969d3e635ad66eb3db2e6272
prev: a3709be80f80c24ac6ebc526a9dcec5e2212c03260a2f82f5e9943f762becb6e
-------------------------------------

Note that index isn’t a member of the class; it’s just a counter incremented while iterating over the vector. The real linkage here is established through _prev_hash, as discussed above.

Where to?

Now that the storage mechanism is understood, we can apply proof-of-work paradigms to attribute a sense of value to our records.

The full source code for this article can be found here.