Cogs and Levers A blog full of technical stuff

Either type in Haskell

Introduction

I've been doing some further work in the world of Haskell and have come across the Either type from the base library on a few occasions. Today I'll post about how to work with this type, as you'll run into it a bit and it's quite handy.

Background

Just as its English counterpart describes, Either can represent one value or another. A common scenario is a function's return value, where you may get either the successful result or an error value. The Either data type is defined as follows.

data Either a b = Left a | Right b

You construct an Either by calling either the Left or Right constructor. So, Either is just a wrapper around a value that can be either one or the other.

Working with it

In GHCi I have created two instances of Either: "lefty" is constructed using Left, "righty" with Right.

> let lefty = Left 10
> let righty = Right "John"
> :t lefty
lefty :: Either Integer b
> :t righty
righty :: Either a [Char]

That all seems pretty straightforward. Calling Left or Right gives us back a value of a partially-determined type: Haskell can only fill in the type of the side we've actually used, leaving the other side polymorphic. Having a look at the values that we've created, we're reminded that they are Left or Right values as such.

> lefty
Left 10
> righty
Right "John"

Convenient, but if we're going to have a chance of using these values for anything real, we'll need to extract or unbox the value from the Either construct. The Data.Either module has two functions, lefts and rights, which take a list of Eithers and extract the wrapped values into a plain list. This makes sense. Take a look at how these functions interact with lefty and righty.

> lefts [lefty]
[10]
> rights [lefty]
[]
> lefts [righty]
[]
> rights [righty]
["John"]

They've now been taken out of the Either construct and are sitting in a list, ready to be processed. In the next example, we use pattern matching to detect when we're trying to divide by zero. Even though my preference is to always say that it's infinity, computers just like to complain about it.

safeDiv :: Float -> Float -> Either String Float
safeDiv _ 0 = Left "Division by zero"
safeDiv x y = Right (x / y)

The type used here, Either String Float, says that this value will be either a String or a Float. You can see the zero-division case offering a String on the Left; otherwise we supply the quotient on the Right.
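To actually consume a value like this, the Prelude's either function takes one handler per side and collapses both cases into a single result. Here's a minimal sketch using safeDiv from above (describe is just a name I've made up for the example):

```haskell
-- safeDiv as defined above
safeDiv :: Float -> Float -> Either String Float
safeDiv _ 0 = Left "Division by zero"
safeDiv x y = Right (x / y)

-- either takes one function per constructor: the first handles
-- Left values, the second handles Right values
describe :: Float -> Float -> String
describe x y = either ("error: " ++) (\q -> "result: " ++ show q) (safeDiv x y)

main :: IO ()
main = do
  putStrLn (describe 10 2)  -- result: 5.0
  putStrLn (describe 10 0)  -- error: Division by zero
```

This way the caller is forced to deal with both outcomes up front, which is the whole point of returning an Either rather than throwing.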

There ya have it!

Write an LLVM Backend Tutorial for Cpu0

One day, when I get a chance, I will get through this entire article and actually give it a crack, but until they invent the 48-hour day I'm stuck with just making a bookmark post. I've scratched the surface of this article and it's comprehensive. Really very interesting if it's your sort of thing.

https://jonathan2251.github.io/lbd/index.html

Recursive Data Structures in Haskell

Introduction

As I’ve been following along in Learn You a Haskell for Great Good, I’ve been coming across some really interesting material around the type system. In today’s post, I want to walk through the Tree data type that they define in this book for the section on recursive data structures.

What is it?

A recursive data structure is a type that includes itself amongst its fields. Pretty wild idea, but really useful for situations like Child -> Parent, Tree -> Leaf, etc. In today's example we're going to be using the Tree structure, which looks like this.

data Tree a = EmptyTree | Node a (Tree a) (Tree a)
  deriving (Show)

This data type reads as: a Tree can either be an EmptyTree, or it can be a Node holding a value and two sub-trees. You can start to see that this would be a best fit as a binary search tree. By writing a Functor instance, we can map a function over every node of the Tree recursively through fmap.

instance Functor Tree where
  fmap f EmptyTree = EmptyTree
  fmap f (Node x left right) = Node (f x) (fmap f left) (fmap f right)

Using the type

Getting data into the tree is done cleverly (or so I think) with some basic pattern matching. I think it’s done cleverly because it reads so well/humanly (perhaps I’m easily pleased).

singleton :: a -> Tree a
singleton x = Node x EmptyTree EmptyTree

treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = singleton x
treeInsert x (Node a left right)
   | x == a = Node x left right
   | x < a  = Node a (treeInsert x left) right
   | x > a  = Node a left (treeInsert x right)

Inserting into an EmptyTree results in a single node with an EmptyTree on either side. Inserting a key that already exists overwrites the existing data. Inserting a lesser key (lesser according to Ord) puts the node on the left, a greater one on the right. Actually building a tree can be as simple as the following.

fmap (*1) (foldr treeInsert EmptyTree [5,7,3])

This results in a tree that looks like this.

*Main> fmap (*1) (foldr treeInsert EmptyTree [5,7,3])
Node 3 EmptyTree (Node 7 (Node 5 EmptyTree EmptyTree) EmptyTree)

3 is at the trunk with no left-hand side. 3's right-hand side has a 7, with 5 for its left-hand side and no right-hand sides (for 5 or 7). Reading the dump out from ghci is going to be better for your understanding than trying to read my English describing the tree's structure. The important part that I wanted to illustrate really was the use of fmap here. We map the function (*1) so that none of the values in our source list change. We apply treeInsert in a fold-right starting with EmptyTree across the source list [5,7,3].
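The same recursive shape works for lookups, too. Here's a sketch of a membership test in the same style (treeElem is the name the book uses for this; the main function just exercises it against the tree we built above):

```haskell
data Tree a = EmptyTree | Node a (Tree a) (Tree a)
  deriving (Show)

singleton :: a -> Tree a
singleton x = Node x EmptyTree EmptyTree

treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = singleton x
treeInsert x (Node a left right)
   | x == a = Node x left right
   | x < a  = Node a (treeInsert x left) right
   | x > a  = Node a left (treeInsert x right)

-- walk the tree, choosing a side at each node just like treeInsert does
treeElem :: (Ord a) => a -> Tree a -> Bool
treeElem _ EmptyTree = False
treeElem x (Node a left right)
   | x == a    = True
   | x < a     = treeElem x left
   | otherwise = treeElem x right

main :: IO ()
main = do
  let t = foldr treeInsert EmptyTree [5, 7, 3]
  print (treeElem 7 t)  -- True
  print (treeElem 4 t)  -- False
```

Because insertion and lookup make the same left/right decision at each node, a lookup only ever visits one path down the tree rather than every node.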

If you ask me, that’s pretty cool.

Picking your Smart Pointers

Introduction

I wanted to do a quick write-up on the Boost library's Smart Pointers and when they should and shouldn't be used. I'd hope to use this post in the future (if the knowledge doesn't stick in my head) as a rough guide when finding the right tool for the job. For the uninitiated, I strongly advise that you go through the smart pointer documentation on the Boost website, as it's a real eye-opener as to just how hands-off you can be with dynamic memory allocation in C++ these days.

Why use Smart Pointers?

Smart pointers will help you manage the lifetime of your objects. They will force you to think about the ownership of these pointers and who's currently "in charge". They will allow you to think in terms of observation of objects, so that you don't disturb the ownership of a resource, and they just generally make your code cleaner and easier to maintain and read.

How can I start using Smart Pointers?

Get Boost! That’s going to be the best way. Dive right in, take a look at samples, set things up, blow them up - be a scientist about it! Anyway, enough of this! On to the pointers.

scoped_ptr & scoped_array

scoped_ptr is all about ensuring that the pointer you're working with is the sole owner of the resource it points to. There's no facility within this pointer type to transfer ownership of the inner resource elsewhere. With all of this in mind, scoped_ptr ensures that the owned resource is destroyed properly once the pointer drops out of scope. scoped_ptr is very lightweight; it's by no means going to harm the performance or size of your application. scoped_array performs the same service as scoped_ptr, it's just that scoped_array works on array types (as the name suggests).

shared_ptr & shared_array

shared_ptr is all about reference counting. It internally manages a reference count and governs the managed resource's lifespan based on it. The clear advantage it has over its scoped_ptr and scoped_array counterparts is its ability to be shared between multiple owners, so that those owners can maintain their interest in the object by "hanging around". The true power of this class of pointer shows when you don't know when to delete the underlying resource: as long as someone is referencing you, you'll stay alive. shared_array performs the same service as shared_ptr, it's just that shared_array works on array types (deja vu, anyone?)

intrusive_ptr

The intrusive_ptr is another reference-counting pointer, only it allows you to provide your own mechanism for performing the reference counting. This means that if you have an existing codebase that already does this work for you, all you need to do is hook it up to an intrusive_ptr. Because the count lives inside the object itself, intrusive_ptr also makes it safe to construct a smart pointer straight from the this pointer.

weak_ptr

A weak_ptr just performs observation on a resource managed by a shared_ptr, without getting its hands into ownership. It's used purely in an observing capacity.

Conclusion

That's it for a brief smart pointer analysis. Hopefully these tidbits will help you decide which pointer fits your problem best.

A brief MapReduce tutorial for MongoDB

Introduction

This is just a short tutorial on using the MapReduce operation from within the MongoDB environment.

What is it and how does it work?

MapReduce is a database operation intended for performing higher-level (or more complex) aggregation tasks across huge data stores. You supply it with two functions, a map function and a reduce function, and it supplies you with the aggregated result. As defined in the MongoDB documentation, the operation takes the following form:

db.collection.mapReduce(
    <mapfunction>,
    <reducefunction>,
    {
        out: <collection>,
        query: <document>,
        sort: <document>,
        limit: <number>,
        finalize: <function>,
        scope: <document>,
        jsMode: <boolean>,
        verbose: <boolean>
    }
)

You can immediately see the first two parameters to this operation: the map function and the reduce function. The remaining parameter is an options document that, among other things, controls how the result is delivered back to you.

The Map Function

The map function is responsible for defining the data that you want to work with. For instance, if you're interested in accumulating the sum of all products sold by category, you'll want to emit both the amount sold per line item and the product's category.

var mapFunction = function() {
	emit(this.category, this.price);
};

Note the use of the emit function. Its use tells the MapReduce operation that the key for this process will be category and the value will be price. The value part of the emit function can also take the form of a JavaScript object itself, should you need to perform multiple aggregations across the same data set.

var mapFunction = function() {
    emit(this.category, { price: this.price, count: 1 });
};

This particular map function gives us a count of 1 per emitted result, so we'll be able not only to sum the price but also to find out how many items made up that sum.

The Reduce Function

The reduce function is responsible for performing the aggregation on the data emitted by the map function. It's passed a key and the list of values emitted for that key as its parameters. A reduce function that gets the sum of all prices and the number of items making up that sum would look like this.

var reduceFunction = function(key, values) {
    // the output keeps the same shape as the emitted values ({ price, count })
    // because MongoDB may re-reduce partial results, feeding this output
    // back into the reduce function as one of the values
    var outValue = { price: 0, count: 0 };

    // aggregate all of the values for this key
    for (var i = 0; i < values.length; i++) {
        outValue.price += values[i].price;
        outValue.count += values[i].count;
    }

    return outValue;
};
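Because map and reduce are just plain JavaScript, you can sanity-check the reduce logic outside MongoDB by grouping a few sample values yourself, the way the engine would per key. The sample data and the "electronics" key here are made up for illustration; note the reduce output mirrors the emitted shape, since MongoDB may feed a reduce result back into reduce:

```javascript
// made-up values, shaped like what the map function emits for one key
var emitted = [
    { price: 10, count: 1 },
    { price: 5,  count: 1 },
    { price: 25, count: 1 }
];

var reduceFunction = function (key, values) {
    var outValue = { price: 0, count: 0 };
    for (var i = 0; i < values.length; i++) {
        outValue.price += values[i].price;
        outValue.count += values[i].count;
    }
    return outValue;
};

var result = reduceFunction("electronics", emitted);
console.log(result);  // { price: 40, count: 3 }
```

If your reduce function passes this kind of check, it should behave the same when MongoDB runs it for real.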

All of the "magic" happens via JavaScript. Pretty easy, really.

Getting a result

Putting your map and reduce function together for the operation ends up looking like this (for a simple scenario).

var res = db.sales.mapReduce(
    mapFunction, 
    reduceFunction, 
    { 
    	out: "result"
    }
);

This will get the process underway against the database. Once it's finished doing the hard yards, you can start basing your aggregation reports off of the "res" variable we built just above by doing the following.

db[res.result].find();

This is only a "scratch-the-surface", "know enough to be dangerous" type of write-up. MapReduce is a complex and powerful tool, and you should read the official documentation if you want to achieve ninja-like status.