Yoda speaks Visual Haskell

This isn't a Haskell tutorial, but it is a Haskell introduction. I don't want to try to teach Haskell, but I do want to break the ice and get you interested enough in Haskell so that you'll explore the language more fully.

I'm not going to go into detail on some of the most basic operations (list construction and syntax for example) you'd expect to see in a tutorial. You're smart enough to work it out by example, but clearly a read of a proper Haskell tutorial will help before you get your hands too dirty.

I'm going to assume that you already know how your way around Visual Studio and Windows™. All of the most important points are of interest to anybody thinking about learning Haskell, but if you're using some other environment I can't help you do basic things like compile the program.

I was researching an article for kirit.com on “zeugma” and came across a Yoda quote on Wikipedia:

Fear leads to anger. Anger leads to hate. Hate leads to suffering.

This is an example of hypozeuxis which means the repetition of the same verb for several subjects¹ [1Normally this would be factored out, but it would lose impact and it's hard to preserve the full meaning without making it sound clumsy. “Fear leads to anger, through hate to suffering” just sounds stupid, not gnostic.]. It immediately struck me that this would be a perfect thing to try for my first foray into Haskell. I also figured it would be interesting for people who hadn't tried Haskell before.

So what we're going to do is to start with a list like this:

[ "fear", "anger", "hate", "suffering" ]

And end up with a string like this:

"Fear leads to anger. Anger leads to hate. Hate leads to suffering."

In order to get there I'm going to cover some of the more interesting and important points in functional programming. I'll show a little bit about how algorithms are created and written, how lists are manipulated and the basics of how pattern matching works. I'll also cover something called partial application (closely related to currying) and two of the most important functional algorithms, map and fold.

I've chosen these topics as they're the ones that are easy to start with, easy to understand given the problem that we're working on and because they're all fundamental building blocks of functional programming and functional idioms.

The first thing you'll need of course is Visual Haskell which can be downloaded and installed via the Visual Haskell home page. You need to download the one for your compiler, .NET Studio 2003 uses the 7.1 download and .NET Studio 2005 uses the 8.0 download.

Using Visual Haskell

Once you've got it installed you should have a Haskell group to choose from when creating a new project to add to a Visual Studio solution. We're going to start off with a command line Haskell program. Visual Haskell will make the project files you need and you will get a single file that contains a simple “Hello World” program.

Tip 1 — Visual Haskell gets really, really, really upset if you have spaces in any of the directories between the disk's root and the Haskell project.

This is the code that we've been given by Visual Haskell to start with:

module Main where

main = do
	putStrLn "Hello, world!"

Tip 2 — The Haskell compiler really, really, really upsets Windows. It looks to me like the compiler is self modifying, so if it crashes it is because Windows has stepped in and stopped it. You may need to log out and back in again if this happens in order to see the dialogue box asking you to make an exception for it, or you can find it via control panel ➜ system properties ➜ advanced ➜ performance ➜ settings ➜ data execution prevention (phew). The file you need to make an exception for is “C:\Program Files\Visual Haskell\bin\ghc.exe”.

Ignore everything except the string. In Haskell the string that we see is really just syntactic sugar for a list of characters. To start let's change this code to the following (I'm going to rely on your intellect to be able to work out which bits we're changing and which bits we're leaving alone):

main = do
	putStrLn "fear leads to anger"

Tip 3 — There seems to be no simple way to run the program and actually see what it did. The executable is tucked down in some sub-directory where it is hard to find, but if you want to see what output you get then find it you must² [2Geoff Pullum has a good explanation of the fun Yoda has with verbs and why they often end up at the end of a sentence.]. You will have to run the program from the command prompt because the command prompt to see program output use.

So far so good, we have a great starting point.

Starting to code

Haskell has an operator which concatenates two lists (i.e. adds one list to the end of another) which is ++. We can break this first part of the problem down into splitting out the parts of the string that we need:

main = do
	putStrLn ( "fear" ++ " leads to " ++ "anger" )

You'll immediately notice that as you edit the code you get little red wiggly lines wherever there is an error. Frankly the errors don't make sense to me, but if you started off editing to this:

main = do
	putStrLn "fear" ++ " leads to " ++ "anger"

You would have got a more or less incomprehensible error message which will appear as a tool-tip if you hover over the expression, and will be confirmed by the compiler if you try to compile the code.

I think that a little thought will convince you that it's a precedence problem, hence the extra brackets. You can see this as well from the placement of the wiggly line. It only sits under the putStrLn “fear” part of the code. This gives a visual indication that the compiler sees this:

main = do
	( putStrLn "fear" ) ++ " leads to " ++ "anger"

And that it doesn't like what it sees³ [3The error message is actually telling us that the return type of the expression putStrLn “fear” cannot be prepended to the other two lists, but I think you can be forgiven for missing that right now.].

Our first function

We now know how to make a simple expression so we should be able to turn this into a function, after all we'll want to appply “leads to” to three different pairs of words. We'll call the function interstitial.

A function looks a bit like an assignment in other languages, but really it's better to think of it like a mathematical formula:

interstitial left right = left ++ " leads to " ++ right

Writing a function doesn't get much easier than that.

Tip 4 — Haskellers tend to use very short identifier names. I'm going to use longer ones as I think they look less cryptic. As you get better at Haskell and more used to the syntax you'll find yourself using shorter names.

Using the function is pretty obvious too:

main = do
	putStrLn ( interstitial "fear" "anger" )

Note that the brackets here aren't anything to do with surrounding the arguments to apply to the function like in most other languages. The brackets are here just to ensure that the expression is handled in the right order.

Running this we get:

fear leads to anger

Not a bad start. We've played a bit with lists and seen that they can be combined and we've also written our first function.

Lists and tuples

The next thing we're going to do is to stop using putStrLn and to start to use print. This is simply because that putStrLn will only write out a list of characters (a string) whereas print will print out any old thing to the best of its ability. Try the change and you will get this as your output:

"fear leads to anger"

Back to our starting point:

[ "fear", "anger", "hate", "suffering" ]

We can see that we're going to have to do some duplication here, but what form are these duplicates going to take?

So far we've seen lists and, important as they are, they're not the only useful way to structure data. If we want to store the distances that I've walked each time I've got up and paced around whilst thinking about this introduction we might get something like this:

[2, 3, 3, 3, 3, 4, 10, 3]

The fridge is three meters away⁴ [4Which is where I keep my caffeine fix.]. The list stores an unspecified number of similar data points.

If I want to specify a location though I need an ordered collection. I'm writing this at (13.912205, 100.582629) — click to see in Google Earth. Co-ordinates like this are always given in the same order: latitude then longitude and there are always two of them.

This is called a tuple and of course the type of each datum doesn't have to be the same and tuples can also nest. If we include my height above sea level then the co-ordinate might look like this: (13.912205, 100.582629, ( 5, metre ) ).

Tuples are also very important in Haskell and they have a special syntax which is the same as the one we would normally use — they're simply comma separated values between brackets.

Our basic sentence structure that we want to end up at (if you factor out the “leads to” which interstitial will add for us) looks a bit like this transformation:

[ "fear", "anger", "hate", "suffering" ]
[ ( "fear", "anger" ), ( "anger", "hate" ), ( "hate", "suffering" ) ]

Each tuple represents a sentence and the tuples together represent the three sentences in Yoda's hypozeuxisism⁵ [5I probably just made that word up, but it does have a rather cool repeated … .]. The big question is “can we write a function that gives us the latter from the former?”

A more complex function

We'll call our new function inner_duplicate. Before we can tackle that we need to know one other bit of syntax and think through properly what we have so far.

We've already seen the list concatenation operator ++. This takes two lists and returns a single list, but we have to look carefully at our initial data.

[ "fear", "anger", "hate", "suffering" ]

This isn't actually a list of characters. Because each string is a list of characters this is actually a list of a list of characters. We can see that there is a problem if we try this:

"fear" ++ [ "anger", "hate", "suffering" ]

The reason this doesn't work is that we're trying to append a list of characters to a list of list of characters and this doesn't work. If we want to do that we need another operator that carries out this specific thing. That operator is the colon, :, (often called cons) and it takes an item and a list of items and returns a copy of the list with the new item at the front. Try this:

main = do
	print ( "fear" : [ "anger", "hate", "suffering" ] )

We can also add more than one item at a time:

main = do
	print ( "fear" : "anger" : [ "hate", "suffering" ] )

Although this doesn't work:

main = do
	print ( "fear" : "anger" : "hate" : "suffering" )

Remember that the operator adds to the front of a list so it needs a list to operate on. We can give it an empty list like this though:

main = do
	print ( "fear" : "anger" : "hate" : "suffering" : [] )

All of this is a rather round-about way of getting to pattern matching which is really a lot like what we've done to make lists but in reverse.

Pattern matching — the opposite of list creation

What we want to do is to take the fist term with the second and then repeat this with the second and the rest of the list until we're down to the last two when we just pair them up. It's easier than it sounds, but a diagram might help:

"fear" : "anger" : "hate" : "suffering" : []
( "fear", "anger" ) : "anger" : "hate" : "suffering" : []
( "fear", "anger" ) : ( "anger", "hate" ) : "hate" : "suffering" : []
( "fear", "anger" ) : ( "anger", "hate" ) : ( "hate", "suffering" ) : []

Replace the words with some symbols and you might describe the algorithm in this way:

inner_duplicate first : second : the_rest = ( first, second) : ( inner_duplicate second : the_rest )

This handles all but the last step which can be expressed like this:

inner_duplicate first : second : [] = ( first, second ) : []

Actually this is more or less exactly the way to write it in Haskell. The only thing that we need to do is to add some brackets so that the pattern matching is done first:

inner_duplicate ( first : second : [] ) = [ ( first, second ) ]
inner_duplicate ( first : second : the_rest ) = ( first, second) : ( inner_duplicate ( second : the_rest ) )

If this looks like magic it's only because it should. Try it out:

inner_duplicate ( first : second : [] ) = [ ( first, second ) ]
inner_duplicate ( first : second : the_rest ) = ( first, second) : ( inner_duplicate ( second : the_rest ) )

main = do
	print ( inner_duplicate ( "fear" : "anger" : "hate" : "suffering" : [] ) )

[("fear","anger"),("anger","hate"),("hate","suffering")]

Interstitial injection

We need to quickly revisit our interstitial function because at the moment it does this:

interstitial “fear” “anger” ➜ “fear leads to anger”

And what we need now is this:

interstitial (“fear”,“anger”) ➜ “fear leads to anger”

When we look at it like this re-writing the function is surprisingly easy:

interstitial ( left,  right ) = left ++ " leads to " ++ right

Now that we can take the tuples in our list how can we execute this? The obvious thing to do is to write a function that does it. We have a list of binary tuples and we are going to get a list of strings instead.

Here's one simple way:

do_interstitial [] = []
do_interstitial ( head : the_rest ) = interstitial head : do_interstitial the_rest

main = do
	print ( do_interstitial ( inner_duplicate ( "fear" : "anger" : "hate" : "suffering" : [] ) ) )

Again we have a simple recursive function. Empty lists are always just empty and for any other list we just apply our function to the head before carrying on with the rest.

I've picked out the expression interstitial head in bold above because it's the only bit of the definition that actually uses the interstitial function itself. The rest of it has exactly the same form no matter which function we want to apply to a list. It is also a pretty common thing to do. In fact its so common that it would be a real pain to have to do all of this typing every time we want to do something to each element in a list. There has to be an easier way, and of course there is. We can use map.

Enter the map

map is a very clever little function. What you do is to give it a function and a list and it applies the function to each element in the list and returns us a new list with the results.

This explanation sounds very complicated, but the idea is actually quite simple as we can see by writing our version of it. We'll call ours do_function and we'll just re-write our version of do_interstitial for it.

First off, the base case. Remember we have to take a function as our first argument and the list as the next so we will get this:

do_function function [] = []

Just the same except for the extra parameter we're ignoring. The next part looks just like the second line of do_interstitial except now we're using the function that's passed in (and we have to pass it on in the recursive call).

do_function function ( head : the_rest ) = function head : do_function function the_rest

That doesn't look too bad so far. Here's us calling it:

main = do
	print
		( do_function
			interstitial
			( inner_duplicate ( "fear" : "anger" : "hate" : "suffering" : [] ) ) )

I've written the two arguments to do_function on separate lines otherwise we're likely to get confused by all the brackets.

So if we think we understand how do_function works we should be able to replace it with the standard map function. All we do is delete our version and use the normal one:

module Main where

interstitial ( left,  right ) = left ++ " leads to " ++ right

inner_duplicate ( first : second : [] ) = [ ( first, second ) ]
inner_duplicate ( first : second : the_rest ) = ( first, second) : ( inner_duplicate ( second : the_rest ) )

main = do
	print
		( map
			interstitial
			( inner_duplicate ( "fear" : "anger" : "hate" : "suffering" : [] ) ) )

Yes, this is the entire program at this point! And the output we get looks like this:

["fear leads to anger","anger leads to hate","hate leads to suffering"]

Making Curry

This section actually contains a bit of terminology abuse, but it seems so common in Haskell that I'm going to do it anyway. What we are doing here is partial application rather than currying. We are taking a function and applying an argument and getting back a function that takes the rest of the arguments.

Currying doesn't take any values. It just takes a function and returns a function. The difference is in how the arguments are applied to the two functions in order to get the result out.

A binary function that was curried would be converted from a function that takes two arguments at the same time into a function that takes one argument and returns a function that takes one argument and returns the result. In both cases two arguments are still needed, but before you curry the function you must supply them both together, but after currying you supply them one at a time getting back a function as an intermediate value. It's another of those things that's simpler than it sounds.

In Haskell it is common to define all functions in curried form which I guess is one of the causes of the lax terminology.

The one thing that bothers me (apart from that we haven't quite finished as we don't have a single string yet) is that the text “leads to” is buried in that interstitial function. Wouldn't it be better to pull it out somehow?

Haskell does provide a mechanism to do this and it is called Currying. The technical definition of Currying is quite complex, but for our purposes we don't need that. Let's start by pulling the string out anyway so we have this:

interstitial inner ( left,  right ) = left ++ inner ++ right

The problem we have is how do we now get the string in there? The way we're using map takes a function that does this:

interstitial (“fear”,“anger”) ➜ “fear leads to anger”

But now we're doing this:

interstitial “leads to” (“fear”,“anger”) ➜ “fear leads to anger”

What we need to do is to wrap the interstitial up with the first argument to give us this sort of thing:

( interstitial “leads to” ) (“fear”,“anger”) ➜ “fear leads to anger”

This is actually exactly what partial application is. We can apply any number of arguments to a function and we get back either an answer if we've given it all of the arguments⁶ [6Actually Haskell won't bother to work out what the answer actually is until we use the value we get from the function. Suitably enough, this feature of a programming language is called lazy.], or a function that takes the rest of them. Again it's harder to explain than to use:

main = do
	print
		( map
			( interstitial " leads to " )
			( inner_duplicate ( "fear" : "anger" : "hate" : "suffering" : [] ) ) )

All we do is to call interstitial with the first argument and then pass the function we get back to map which will repeatedly apply the function to the tuples in the list until we get all of our strings.

Hypozeuxis

The code that we've got so far is certainly pretty cool, but we've still got the actual program logic mixed up with the text we're working with. We should be able to separate them by writing a function that takes the “leads to” and the word list and applies the logic to it.

Here it is⁷ [7If my grammar understanding is about right then the parameter that I call verb is really a verb phrase.]:

hypozeuxis verb words = map ( interstitial verb ) ( inner_duplicate words )

main = do
	print
		( hypozeuxis " leads to "  ( "fear" : "anger" : "hate" : "suffering" : [] ) )

It should now be clearer what the core of the logic is.

interstitial places a verb phrase between two other words.
inner_duplicate duplicates the words in a list of words except for the first and last word. It gives us back pairs of words.
map then runs interstitial on our list produced from inner_duplicate.

Finally hypozeuxis ties this logic up in a neat and easy to digest package.

It isn't quite finished though because we're getting a list of sentences, not a single string representing the entire phrase and we're also not getting capital letters at the start of the sentences.

Sentences and paragraphs

This is what we have now:

["fear leads to anger","anger leads to hate","hate leads to suffering"]

Here is what we'd like to see:

“Fear leads to anger. Anger leads to hate. Hate leads to suffering.”

Hopefully we can see how to convert one of our strings into the right sentence form. We just need to pull out the first letter, do a case conversion on it and then put a full stop at the end.

The one thing we don't know how to do is the case conversion, but Haskell has a built in function, toUpper. In order to use it though we have to pull in the right external module. We do that by adding an import after the module Main where like this:

module Main where
import Data.Char

If you think about using the pattern matching we had earlier you can probably work out pretty quickly how to get our sentence function working:

sentence ( first : the_rest ) = ( ( toUpper first ) : the_rest ) ++ "."

The pattern on the left separates out the first letter of the string and the expression on the right stitches it all back together again. This sort of thing should be starting to feel familiar by now. Personally I always feel like I haven't actually told Haskell what to do yet, but the program works anyway.

Paragraphs

The final basic building block that we need to learn about is folding. Like map, fold is a basic functional idiom that abstracts a particular type of looping construct. Whereas map is used to transform a list into a different list by applying a function to each element, a fold is used to transform a list into a value through the repeated application of a binary function.

The complication with fold is that there are all sorts of binary functions, some are left associative, some right. There are also differences in how the result is “primed”, do we start with the values in the list, or do we start with a value we supply?

This gives us four general folds that we can choose between:

foldl—Takes a function and initial value then applies the function to the previous results and each element in turn. This is left associative.
foldl1—Takes a function and applies it to the first two elements in the list then successively applies the function to the result and each other value. Also left associative. If the list only has a single element then that element is returned and the function is not used.
foldr—Right associative version of foldl. That is it starts with the last element in the list and moves backwards.
foldr1—Right associative version of foldl1. This means it starts with the last two values and moves backwards.

Before we can choose between these we should probably look at the function we're going to use. Here is a very simple function:

inner_space left right = left ++ " " ++ right

A bit of thought will tell us that this can be used either left or right associatively and give the same answer. We also know that a paragraph of just one sentence is that sentence so either of foldl1 or foldr1 should work.

main = do
	print
		( foldl1 inner_space
			( map sentence ( hypozeuxis " leads to "  [ "fear", "anger", "hate", "suffering" ] ) ) )

If we run this we get the string we wanted.

Now what?

Here is the complete listing that we have ended up with:

module Main where
import Data.Char

inner_space left right = left ++ " " ++ right

interstitial inner ( left,  right ) = left ++ inner ++ right

inner_duplicate ( first : second : [] ) = [ ( first, second ) ]
inner_duplicate ( first : second : the_rest ) = ( first, second) : ( inner_duplicate ( second : the_rest ) )

sentence ( first : the_rest ) = ( ( toUpper first ) : the_rest ) ++ "."

hypozeuxis verb words = foldl1 inner_space ( map sentence ( map ( interstitial verb ) ( inner_duplicate words ) ) )

main = do
	print
		( hypozeuxis " leads to "  [ "fear", "anger", "hate", "suffering" ] )

A few things to notice:

There is no if statement or looping constructs. There's nothing that looks like standard program flow constructs at all.

More Haskell

The first most obvious thing is that much of the program can be re-worked to use more of the Haskell libraries and maybe a bit more advanced syntax. Here are some ideas:

Use inner_space and a foldl1 to implement interstitial so that the spaces around the verb phrase are added for us.
The only difference between the two versions of interstitial we wrote is actually that one is curried and one isn't. Can you work out how to convert between them? This is hard if you've not come across currying before. If you can do this then you should be able to work out how to unify inner_space and interstitial

Further learning

The first place to go from here is to look at Simon Peyton Jones' A Taste of Haskell lectures from OSCON '07. There's a part 1 and a part 2 and slides PDF.

There is a Haskell book in the process of being written at Wikibooks. There you can also find links to YAHT—Yet Another Haskell Tutorial. On Haskell.org is a much more complete list of books and teaching materials.

Finally, there is also a #haskell channel on Freenode. People there are very helpful and answered my stupid questions very quickly. It can be a bit like drinking from a fire hose. Don't expect to understand everything that you're being told, especially at first… Don't be afraid to ask again if you didn't understand the answer the first time (but say which bits you didn't understand).

Practical use

An obvious question now becomes “What can I do with it?”

One thing that Haskell is very good for is prototyping, especially for prototyping and testing algorithms. It is an especially good way of making sure that you actually understand how something works even if you have to do a final implementation in another language.

Visual Haskell makes it very easy to make command line executables which are useful, but not always easy to integrate into larger systems (at least on Windows™). Visual Haskell can be used to create DLLs and COM servers⁸ [8Philip Wadler, in his paper Why no one uses functional languages talks about COM for Haskell.], but it is not especially straight forwards and there is no direct IDE support.

Even if you never write a single line of production code in Haskell it will still be worth learning more of it. It will profoundly change the way that you look at programming and how you approach problems. This is one of the best practical uses of the language because you get the benefit every time you even think about software development.