In this post I would like to present some basic concepts in folding. This really will be over in a flash, so don’t blink - it’s easy.
tl;dr
The Haskell functions foldl and foldr allow you to “fold” functions between values.
Diagram it for me!
If you read up on these functions in the documentation, you’ll see mention of “reducing values in a list” and such. All this really means, is that you’re going to iterate through a list apply a function at every stop and finish up with a “reduced” answer. Here, take a look at this. I have an array spanning 1 through 5.
[1, 2, 3, 4, 5]
I want to add all of the values together, so I use foldl to move through the list applying the + operator.
foldl(+)0[1,2,3,4,5]
This command in english says, apply the + operator between each element in the list (moving left to right) with an initial value of 0. Or:
0 + 1 + 2 + 3 + 4 + 5 = 15
Bring it back the other way!
Folding right is interesting. No so much for the example that we have above, as addition moving left or right is at identity with each other. I’ve prepared a more interesting example for moving to the right. Take a look at the following and the results:
foldl(-)10[10,20,30]-50foldr(-)10[10,20,30]10
Wow, that’s quite the difference. When folding right, the reduction occurs in reverse (values right to left) and then it’s applied to the initial value.
So, there we are folding to the right! That’s (very basic) folding for you anyway. An interesting follow-up to this article is the function foldl'. By nature foldl that we’ve just discussed is lazy, meaning it will build the list of computations to execute using the source array and only execute those computations once the array is depleted (internally). It’s been shown that this model of execution can cause stack overflow errors for larger lists because of this deferred execution model. foldl' solves this by not deferring execution. So, the overall result will be the same it’s just that foldl' won’t be lazy in getting the answer back to you.
I think it’s important to follow up my previous post on anonymous functions with a post on currying. One of the more difficult concepts to think about (only because Haskell does a great job of separating you from this) is that every function only has 1 argument.
Take this basic greeting example which expects the name of someone who is doing the greeting and the name of someone who is being greeted:
letsayHellotoWhofromWho=fromWho++" says Hello to "++toWho
This is a pretty trivial example. Give it two names and the computer will appear to play nice between these two people:
*Main> sayHello "John" "Peter"
"Peter says Hello to John"
We like John so much, that we’re going to make a new function using this existing one.
letsayHelloToJohn=sayHello"John"
So now, we can get anyone to play nice with John:
*Main> sayHelloToJohn "Peter"
"Peter says Hello to John"
*Main> sayHelloToJohn "Joe"
"Joe says Hello to John"
*Main> sayHelloToJohn "Jane"
"Jane says Hello to John"
Great! We’ve just made a partially applied function. When you don’t specify enough parameters to a function, you’re actually returned a function (or, partially applied function) that you can continue to use. Breaking down how this works, when Jane is saying hello to John she is actually doing so by doing this:
(sayHello"John")"Jane"
This should at least explain my outlandish claims above of functions only having one argument, anyway. You’ve just witnessed function currying in motion. These principles are also directly applicable on infix functions as well, they just need a little extra help to be told so. Take this for example:
double::(Floatinga)=>a->adouble=(*2)
Ignoring the function definition, you can see that all you need to do for infix functions is to surround then with parenthesis. You need to supply the value that makes it a partial application of course!
Function composition is a no-brainer concept for Haskell that makes a lot of sense. It’s truly aligned with its mathematical equivelant where you have two given functions:
f(x) = x * x
g(x) = x + 2
The composition part comes in when you nest these functions, so that you end up with:
Mathematically, this is cake. We just nest the second function inside the first. If we look at it with respect to Haskell we end up with this:
-- | g(x) = x + 2letgx=x+2-- | f(x) = x * xletfx=x*x-- | f(g(x))letfg=f.g
So, the . operator is doing the composition here for us. Below I’ve got the mathematical representation on the left, Haskell on the right.
f(g(x)) = (f . g) x
The real power here is that you can use function composition with any two functions, just as long as the return type of the second function is the same as the argument taken by the first function.
It’s no doubt that when you’re first entering the wild world of Haskell, that it’s syntax is a little alien to look at, at first glance. Once of the major building blocks or keys to unlocking power in Haskell are anonymous functions. The idea of higher-order functions in languages these days is becoming more of a standard rather than a feature and Haskell is no exception. There’s plenty around on the web to read about anonymous functions, higher-order functions and of course [functional programming](http://en.wikipedia.org/wiki/Functional_programming.
An example
A fairly unintuitive example, but shows how you can assign an anonymous function to a handleable variable is shown:
lety=(\x->x)
Breaking this down, the key part is (\x -> x). Immediately, you can see that anonymous functions take the form of (\param1 .. paramn -> body). The real power here is unlocked when you use an anonymous function in conjunction with other functions. The best example I think is the use of the map function. The map function is defined as follows:
map::(a->b)->[a]->[b]mapfxs=[fx|x<-xs]
This definition says, give me a function f and I’ll take all of the a’s and turn them into b’s. Obviously, the anonymous function comes in for the f component. For simplicity, a can be an array of integers.
map(\x->x*x)[1,2..100]
So the above example maps the function (x*x) across an array from 1 to 100. Easy!
I’ve been looking for a simple way to quickly supply some of my applications with easy-access configuration information. Most of the times these are games and smaller utility applications. Enterprise scale applications deserve configuration systems of their own gargantuan proportions, but that’s another story. Thinking back quite a few years, the windows crew had a pretty simple way to supply this sort of information to their running programs, they used the INI file format.
How’s it look?
It’s a pretty basic, plain-text format. Typically you’d see something along the lines of:
You get the idea. The wikipedia link above will give you a more thorough run-down should you need it.
How to make it usable?
The aim is to basically feed an INI file into this process and have some very easy to use C++ objects out the other side. The STL contains some very suitable container style objects that will measure up, so I want something like this:
The map class is a convenient way to manage key-value pairs. Perfect. Looking at the code, the types are very long-winded. All of the namespace declaration actually mixed with the types makes for a very long line. This can be cleaned up by “using” the std namespace, but the whole type definition itself could be cleaned up by just using a typedef. I’ll be writing the types long-hand for the duration of this tutorial, just so we’re clear.
Doing it smart
Thanks to the rigid format of the file, we’ve got a very solid standard set in place as to what we’ll expect when we crack the file open. We can centralise all of our file processing around two regular expressions.
\[(.*?)\]
This first regular expression is our test for the section parts. It tests that the line being interpreted is wrapped in square brackets. When we run this through the regular expression system, it’ll allow us to extract just the name. Nice.
(\w+)=([^\+]+(?!\+{3}))
This second expression will do our key value pair testing. When we use it in extraction with a regular expression, the first match will be the key, the second - the value. Double nice. So, we can fire these regular expressions up using the boost regex library. I’m led to believe that parts of the boost library will be appearing in the newest C++ standard, regular expressions being one of them.
Putting it all together
The block of code in the end is quite simple.
std::stringline,current;// regular expressions to process ini filesboost::regexsection_test("\\[(.*?)\\]");boost::regexvalue_test("(\\w+)=([^\\+]+(?!\\+{3}))");// the result config objectstd::map<std::string,std::map<std::string,std::string>>config;// assuming we've opened the file ok into a// filestream object called "mapfile"while(getline(mapfile,line)){boost::trim(line);if(line.length()>0){boost::smatchmatch;if(boost::regex_search(line,match,section_test)){// any key-value pairs from here to be attributed // to this new namecurrent=match[1];}elseif(boost::regex_search(line,match,value_test)){// set this as a key value pair on the current nameconfig[current][match[1]]=match[2];}}}
So, with a couple of tests to assure us that the values are ok we’ve got a pretty crude implementation here. Erroneous lines are ignored rather than responded to in an exception case. If no initial section is supplied before some key value pairs, those pairs will go into an item section with an empty string.
Anyway, if the user is half-sane about how they treat their INI files, you’ll be just fine.