Learning Haskell - Miscellaneous Enlightenments


The following are some of the so-called 'Aha!' moments I have experienced while learning Haskell. I am sharing them here in the hope of sparing someone the hopeless frustration that precedes them.

About the idea of purely functional programming

For a long time I didn't understand why 'functional programming' was considered better than "regular" imperative programming. So I continued to make programs in the "regular" imperative fashion. And one day it hit me.

I saw the true nature of what I was doing. I saw how an imperative program was really about arranging a sequence of side effects, the side effect most of the time being the mutation of a variable. I saw how an invisible web of dependencies between statements, spanning not only spatial but also temporal dimensions, grows inside my imperative functions with each statement I add. I saw how breaking even one strand of this invisible web can silently break a function's behavior and hence the whole program. And I was enlightened...

Enough with the Zen talk. The point is that an imperative programming language, like Python or C, allows the programmer to create variables, to refer to those variables later, and to change their values at runtime.

This is very powerful, but with that power (yes, you guessed it) comes a great responsibility: the responsibility of tracking the states of variables while writing and reading the program. Each statement that you add to a program depends on a state (the state of all variables in scope) that was created by the statements surrounding it.

Purely functional programming takes this difficult responsibility away from the programmer without taking away the associated power. It does this by providing a different set of tools. With this new set of tools, the programmer can be just as powerful, or even more so. It takes away variable state changes and loops, and gives us continuation passing, folds, zips, filters and maps. The enlightenment here is simple: whatever you can express with state changes and loops in an imperative language can be expressed with this new vocabulary in a functional style.
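To make that vocabulary swap concrete, here is a small sketch (the function names are made up for illustration): where imperative code would loop over a list mutating an accumulator, a fold threads the accumulator through a function, and filter/map replace the loop-with-if.

```haskell
-- Summing a list without a mutable accumulator: the fold threads
-- the running total through (+) for us.
sumAll :: [Int] -> Int
sumAll = foldl (+) 0

-- Keeping the even elements and doubling them: filter and map
-- replace the imperative loop-with-if.
doubleEvens :: [Int] -> [Int]
doubleEvens = map (* 2) . filter even
```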

About learning Haskell

People say that Haskell is not complex, just different. But I think that is a useless statement. When the thing you are dealing with is vastly different from what you are used to, it can appear complex no matter how simple it actually is.

I would say that there are parts of Haskell that are different but straightforward, and parts that are different and so not-so-straightforward that they will appear hopelessly complex when you are new to them. But bit by bit, topics that you once considered beyond your grasp will become approachable. When that happens, it is like unlocking a new level of a video game; new wonders await. This is why learning Haskell is so worth the effort. There is enough depth and breadth to keep you interested for a long time, while it remains a very good general-purpose language with an excellent community behind it. And now it is even gaining popularity!

About the terminology you might encounter while learning Haskell

Following is an excerpt from the script of the movie 'The Good Dinosaur'.

Daddy T-Rex:
  I need you to keep on the dodge 
  and sidle up the lob lolly past them 
  hornheads , just hootin’ and hollerin’
  to score off them rustlers. We’ll cut dirt 
  and get the bulge on ‘em.
ARLO:
  What?
Son T-Rex:
  He just wants you to get on that rock and scream.

Point is, don't be fazed by the unfamiliar terminology. Most of the time the whole thing means something a lot simpler than it appears to be.

About Haskell functions

Haskell functions do not have a statement to return a value to the calling code. In hindsight, this is pretty obvious: Haskell programs do not have statements at all. Instead, Haskell functions are expressions that evaluate to a value, and this value is implicitly the "return" value of that function. Despite this, you will see people say things like "this function returns x". By that they just mean that the function evaluates to x.
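As a tiny illustrative sketch (the function name is made up): the whole body below is a single if-then-else expression, and whatever that expression evaluates to is the function's "return" value.

```haskell
-- The body is one expression; there is no return statement.
absolute :: Int -> Int
absolute x = if x < 0 then negate x else x
```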

Let expressions

If there was one thing that could have single-handedly eased my mind as an imperative programmer coming to functional programming, it is the 'let' expression. As soon as I found out that the body of a Haskell function is limited to a single expression, I thought, "there is only so much you can do with an expression; how can one do anything useful with it?". My problem was that I was thinking of expressions as they are in imperative languages. The enlightenment here is that expressions in Haskell can be really elaborate, and Haskell's 'let' expression allows you to define any number of intermediate expressions or functions that are required by your final expression. This brings us very close to an imperative style of programming, even though the execution is completely different, as we will see below.


  sumOfDoubleAndTriple :: Int -> Int
  sumOfDoubleAndTriple x = let
    double = 2 * x
    triple = 3 * x 
  in double + triple

In the above function, we used a let expression to define two intermediate results, 'double' and 'triple', before adding them and yielding their sum as the value of the function.

Note that these are not variable definitions. These bindings cannot change, and you won't be allowed to redefine a symbol within the same let expression. Also, the scope of the bindings is limited to the expression after the 'in' and to the other definitions in the same let block. Even though bindings cannot change, bindings at a syntactically deeper level can shadow bindings coming from the levels above.

One important thing here is that the bindings in a let expression are not like assignment statements in an imperative language. They are not 'executed' from top to bottom. Instead, one can think of the execution as starting from the expression after the 'in', with the required values being looked up among the bindings and evaluated as needed.
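To illustrate (with a made-up example): the binding order below is deliberately "wrong" by imperative standards, with 'area' defined before 'side', yet it compiles and works, because bindings are looked up on demand rather than executed top to bottom.

```haskell
-- 'area' refers to 'side', which is only bound on the next line.
-- This is fine: let bindings are not sequential assignments.
squareArea :: Int -> Int
squareArea x = let
    area = side * side
    side = x
  in area
```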

Typeclasses

There is something very simple about Haskell typeclasses that took me a while to completely grasp. It is that Haskell must be able to figure out the matching instance at the place from which a call to a typeclass function is made. If it cannot, it is an error.

Without keeping this simple thing in mind, you will not be able to understand many advanced type system features, for example the FunctionalDependencies extension. It also helps in understanding a lot of the errors that the typechecker ends up throwing at you.

Return type Polymorphism

If you ask me, this was the biggest enlightenment, and the one that snapped a lot of things into place: the simple fact that it is possible for Haskell functions to return different types of values depending on the type that is required at the call site. In other words, Haskell functions can be polymorphic in their return type. The simplest example I can think of is the 'read' function, of type Read a => String -> a. The call to this function in (1::Int) + (read "2") will return an Int, and in (1::Float) + (read "2") it will return a Float.
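Here is that same example as a small self-contained sketch (the names asInt and asFloat are made up): the identical call read "2" produces different types purely because of the type required at each call site.

```haskell
-- 'read' picks its result type from the surrounding context.
asInt :: Int
asInt = 1 + read "2"     -- read "2" :: Int here

asFloat :: Float
asFloat = 1 + read "2"   -- read "2" :: Float here
```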

About IO

When I was starting with Haskell, I remember trying to take a value wrapped in IO out of it, purely. After a while, I realized that there is no way to take a value out of IO purely; that is, you cannot have a function IO a -> a. It is not because IO is a Monad and Monads are special-cased magic, but simply because the constructor of IO is not exported from its module. This feels so obvious now, but it wasn't once.
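So instead of extracting the value, you apply pure functions inside IO (with fmap) or bind with >>= and stay in IO. A minimal sketch, with a made-up function shoutLine:

```haskell
import Data.Char (toUpper)

-- There is no unwrap :: IO a -> a. But we can lift a pure function
-- (map toUpper) over an IO action and stay inside IO:
shoutLine :: IO String -> IO String
shoutLine = fmap (map toUpper)
```

Usage would look like `shoutLine getLine >>= putStrLn`: the value never leaves IO, but pure code still gets to transform it.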

Wrapper confusion

When I was still new to Haskell, I somehow ended up with the intuition that types of the form Xyz a have tiny values of a wrapped inside them. And one day I came across a function whose type looked like (b -> a) -> SomeType a -> SomeType b.

And I was like, "W.T.F!? Can GHC reverse engineer functions and make them work in reverse?" How else can you turn a SomeType a into a SomeType b when all you have is a function that converts from b to a?

Well, SomeType was defined as something like data SomeType a = SomeType (a -> Int), so the function can easily be defined as something like this:


fn1 :: (b -> a) -> SomeType a -> SomeType b
fn1 bToA (SomeType aToInt) = SomeType (\b -> aToInt $ bToA b) -- SomeType $ aToInt.bToA
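For illustration, here is the same thing as a self-contained sketch with a hypothetical use: SomeType wraps a function *consuming* an a, so a String "measurer" can be turned into an Int one by pre-composing show (stringMeasurer, intMeasurer and runMeasure are made-up names).

```haskell
data SomeType a = SomeType (a -> Int)

fn1 :: (b -> a) -> SomeType a -> SomeType b
fn1 bToA (SomeType aToInt) = SomeType (aToInt . bToA)

-- measures a String by its length
stringMeasurer :: SomeType String
stringMeasurer = SomeType length

-- measures an Int by first rendering it with 'show'
intMeasurer :: SomeType Int
intMeasurer = fn1 show stringMeasurer

runMeasure :: SomeType a -> a -> Int
runMeasure (SomeType f) = f
```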

The point is, types of the form Xyz a need not be 'wrappers' or sandwiches or anything of the sort. A type tells you nothing about the structure of the data without its definition.

And if you have flawed ideas at a more fundamental level, they will limit your ability to wrap your head around advanced concepts.

The 'do' notation

A do block such as,


do
  a <- expression1
  expression2
  expression3

DOES NOT desugar to


  expression1 >>= expression2 >>= expression3

or to..


  expression1 >>= (\a -> expression2) >>= (\_ -> expression3)

but something equivalent to


  expression1 >>= (\a -> expression2 >>= (\_ -> expression3))

Even though I was aware of this, I have often caught myself holding the preceding two wrong intuitions from time to time. So I now remember it as desugaring to 'first expression in the block >>= rest of the block wrapped in a lambda'.
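A concrete sketch in the Maybe monad (the names are made up): the two definitions below are equivalent, the second being roughly what the do block desugars to — the first expression, bound to the rest of the block wrapped in a lambda.

```haskell
sumTwo :: Maybe Int
sumTwo = do
  a <- Just 1
  b <- Just 2
  Just (a + b)

-- the same thing, desugared by hand
sumTwoDesugared :: Maybe Int
sumTwoDesugared = Just 1 >>= (\a -> Just 2 >>= (\b -> Just (a + b)))
```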

If you recall the signature of >>= from the Monad class, it is (>>=) :: m a -> (a -> m b) -> m b. So the arguments to >>= match the desugared parts as follows.


   expression1 >>= (\a -> expression2 >>= (\_ -> expression3))
-- |-- m a --| >>= |------------- (a -> m b) ----------------|

Another persistent wrong intuition I had a hard time getting rid of is that the lambda on the RHS of >>= gets the Monad's context as its argument.

But it does not. Instead, it gets whatever value came out of the Monad on the LHS of >>=, after it was extracted by the code in the Monad's implementation. It is possible to set up a monadic value in such a way as to make the >>= implementation in the monad's instance do something specific.

For example, the ask function (which is not really a function, because it does not take any arguments) is just a Reader value, set up in such a way that the >>= implementation of the Reader monad will end up returning the reader's environment, thus making it available to the rest of the chain.
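Here is a minimal, simplified Reader sketch showing this (not the real implementation from the transformers library): ask is just a Reader whose wrapped function hands back the environment itself, and >>= feeds the environment to both sides of the chain.

```haskell
newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure a = Reader (const a)
  Reader f <*> Reader g = Reader (\r -> f r (g r))

instance Monad (Reader r) where
  -- run the LHS with the environment, feed the result to f,
  -- then run the resulting Reader with the same environment
  Reader g >>= f = Reader (\r -> runReader (f (g r)) r)

-- ask is just a value: the wrapped function returns the environment itself
ask :: Reader r r
ask = Reader id

greet :: Reader String String
greet = do
  name <- ask
  pure ("Hello " ++ name)
```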

Laziness

For the longest time I was not able to make sense of how laziness, thunks and their evaluation really work in Haskell. So here is the basic thing, without further ceremony. When an argument is strict, it gets evaluated before it is passed into the function or expression that might ultimately use it. When it is lazy, it gets passed in as an unevaluated thunk. That is all it means!

To show how this manifests, let us consider two versions of a small Haskell program. One with strictness and one without.


module Main where
-- 
sumOfNNumbers :: Int -> Int -> Int
sumOfNNumbers a 0 = a
sumOfNNumbers a x = sumOfNNumbers (a+x) (x -1)
-- 
main :: IO ()
main = do
  let r = sumOfNNumbers 0 10000000
  putStrLn $ show r

When I run this program, its memory usage is as follows.

# stack ghc app/Main.hs && ./app/Main +RTS -s
50000005000000
   1,212,745,200 bytes allocated in the heap
   2,092,393,120 bytes copied during GC
     495,266,056 bytes maximum residency (10 sample(s))
       6,964,984 bytes maximum slop
             960 MB total memory in use (0 MB lost due to fragmentation)

You can see that this uses a whole lot of memory. Let us see how sumOfNNumbers 0 5 gets expanded:

sumOfNNumbers 0 5 = sumOfNNumbers (0+5) 4
sumOfNNumbers (0+5) 4 = sumOfNNumbers ((0+5)+4) 3
sumOfNNumbers ((0+5)+4) 3 = sumOfNNumbers (((0+5)+4)+3) 2
sumOfNNumbers (((0+5)+4)+3) 2 = sumOfNNumbers ((((0+5)+4)+3)+2) 1
sumOfNNumbers ((((0+5)+4)+3)+2) 1 = sumOfNNumbers (((((0+5)+4)+3)+2)+1) 0
sumOfNNumbers (((((0+5)+4)+3)+2)+1) 0 = (((((0+5)+4)+3)+2)+1)

We see that as we go deeper, the expression that is the first argument gets bigger and bigger. It stays an expression (called a thunk) and does not get reduced to a single value. This thunk grows in memory with each recursive call.

Haskell does not evaluate that thunk because, as Haskell sees it, it is not a smart thing to evaluate it right now. What if the function/expression never actually uses the value?

Also note that this thunk can keep growing only because it stays hidden behind the sumOfNNumbers call. Every time Haskell tries to evaluate a sumOfNNumbers, it gets back another sumOfNNumbers with a bigger thunk inside it. Only in the final recursive call does Haskell get an expression free of the sumOfNNumbers wrapper.

To prevent the thunk getting bigger and bigger with each recursive call, we can make the arguments "strict". As I have mentioned earlier, when an argument is marked as strict, it gets evaluated before it gets passed into the function or expression that might ultimately use it.

You can make arguments or bindings strict by using bang patterns (enabled by the BangPatterns extension):


  sumOfNNumbers :: Int -> Int -> Int
  sumOfNNumbers !a 0 = a
  sumOfNNumbers !a x = sumOfNNumbers (a+x) (x -1)

This will also work.


  sumOfNNumbers :: Int -> Int -> Int
  sumOfNNumbers a 0 = a
  sumOfNNumbers a x = let
      !b = a in sumOfNNumbers (b+x) (x -1)

After this change, the memory usage is as follows.


{-# LANGUAGE BangPatterns #-}
module Main where
-- 
sumOfNNumbers :: Int -> Int -> Int
sumOfNNumbers !a 0 = a
sumOfNNumbers !a x = sumOfNNumbers (a+x) (x - 1)
-- 
main :: IO ()
main = do
  let r = sumOfNNumbers 0 10000000
  putStrLn $ show r

# stack ghc app/Main.hs && ./app/Main +RTS -s
[1 of 1] Compiling Main             ( app/Main.hs, app/Main.o )
Linking app/Main ...
50000005000000
     880,051,696 bytes allocated in the heap
          54,424 bytes copied during GC
          44,504 bytes maximum residency (2 sample(s))
          29,224 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

From 960 MB to 2 MB!

We can also see strictness annotations at work in the following GHCi session.

# :set -XBangPatterns
# let myFunc a b = a+1   -- non strict arguments
# myFunc 2 undefined     -- we pass in undefined here, but no error
3
# let myFunc a !b = a+1    -- strict second argument
# myFunc 2 undefined     -- passing undefined results in error
Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at :71:7 in interactive:Ghci11

The function myFunc has two arguments, but we use only the first one in its body. Since the arguments are not strict, we were able to call the function with 'undefined' for the second argument without any error, because the second argument, undefined, was never evaluated inside the function.

In the second definition, we have marked the second argument as strict, hence the error when we tried to call it with undefined: undefined was evaluated before it was passed into the function, so it didn't matter whether we used it inside the function or not.

Note that even with strictness annotations, an expression will only get evaluated when evaluation has been triggered for the expression that depends on it. So if the dependent expression remains a thunk, your strict arguments will remain unevaluated inside that thunk.

The story of Haskell's laziness goes a bit deeper still. For example, even when it evaluates something, it only evaluates it just enough and no further. It's laziness all the way down!


Exceptions

There is a lot to learn about exceptions in Haskell and the various ways they can be thrown and caught. But there is one basic thing about them: you can throw an exception from pure code, but to catch it, you must be in IO.

We have seen how laziness makes Haskell defer the evaluation of expressions until they are absolutely required. This means that if an exception is thrown from an unevaluated thunk, that thunk can pass through all the catch blocks you have wrapped it in, and explode in your face when it is ultimately evaluated at a higher level.

To prevent this, you should use the 'evaluate' function to force the evaluation of a pure value if you want to catch any exceptions thrown in the process. Seriously, you should read the documentation for the evaluate function.
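A small sketch of the difference (forcedCatch is a made-up name): without evaluate, the error-throwing thunk would slip through try untouched; with evaluate, it is forced inside the IO action, so the exception is actually caught.

```haskell
import Control.Exception (evaluate, try, ErrorCall)

-- Forces the pure value inside IO, so 'try' can catch the ErrorCall.
-- Compare with try (return x), which would hand the unevaluated
-- thunk back wrapped in a Right, exception and all.
forcedCatch :: Int -> IO (Either ErrorCall Int)
forcedCatch x = try (evaluate x)
```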

Haskell Extensions

One thing that might be unique to Haskell is the availability of various language extensions. Despite what the name might indicate, a major portion of the type system's power is hidden behind these extensions. Actually learning to use them in the real world is a bit like what master Shifu says about full splits in the movie 'Kung Fu Panda':

It takes years to develop one's flexibility, and years longer to apply it in combat!

Haskell extensions are not so bad. Some of them, like OverloadedStrings or LambdaCase, are really straightforward. On the other hand, I had some difficulty wrapping my head around extensions like GADTs, TypeFamilies, DataKinds etc., but YMMV. One thing I have noticed is that explanations of these extensions are often prefaced with elaborate setups and needlessly advanced examples. "Hey, you want to learn about the Xyz extension? Let me show you with a simple example where we will be creating a small compiler for FORTRAN!" Of course that is hyperbole, but you get the point. Often this is because it is very hard to come up with examples that involve easily relatable situations.

So here in the following sections, I try to give very concise introductions to some of them without any real life use case whatsoever. The only promise I can give about them is that they will be, well... concise ;-)

GADTs

This extension allows us to have data definitions where it is possible to explicitly associate constructors with a concrete type. Look at the definition of the Maybe type.


data Maybe a = Just a | Nothing

Here there is an implicit association between the type of a in Just a and the type a in Maybe a. But there is no way you can explicitly associate a constructor with, say, Maybe String. Say you want to add a third constructor NothingString that will explicitly return a Maybe String:


data Maybe a = Just a | Nothing | NothingString

This will not work, because NothingString will still return the polymorphic type Maybe a. The GADTs extension makes this possible, with a slightly different syntax:


{-# Language GADTs #-}
data Maybe a where
  Just :: a -> Maybe a
  Nothing :: Maybe a
  NothingString :: Maybe String

Here, by being able to provide explicit type signatures for the constructors, we were able to make the NothingString constructor explicitly return Maybe String. In the following you can see two more constructors that might make it clearer what is possible using this extension.


{-# Language GADTs #-}
data Maybe a where
  Just :: a -> Maybe a
  Nothing :: Maybe a
  NothingString :: Maybe String
  JustString :: String -> Maybe String
  JustNonSense :: Int -> Maybe String

Querying the types from GHCi:


#:t Just 'c'
Just 'c' :: Maybe Char
#:t Nothing
Nothing :: Maybe a
#:t NothingString
NothingString :: Maybe String
#:t JustString "something"
JustString "something" :: Maybe String
#:t JustNonSense 45
JustNonSense 45 :: Maybe String
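One thing this buys you, as a hedged sketch (the type is renamed Maybe' here to avoid clashing with the Prelude's Maybe, and describe is a made-up function): pattern matching on a constructor like JustString lets GHC refine the type variable to String inside that branch.

```haskell
{-# LANGUAGE GADTs #-}

data Maybe' a where
  Just'         :: a -> Maybe' a
  Nothing'      :: Maybe' a
  NothingString :: Maybe' String
  JustString    :: String -> Maybe' String

describe :: Maybe' a -> String
describe (Just' _)      = "some value"
describe Nothing'       = "nothing"
describe NothingString  = "explicitly no string"
describe (JustString s) = s   -- in this branch GHC knows a ~ String
```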

RankNTypes

You need RankNTypes if you want to write functions that accept polymorphic functions as arguments.

  • Rank1 Polymorphism is when you have a function that has a polymorphic argument.
  • Rank2 Polymorphism is when you have a function that has a polymorphic function (Rank1 polymorphic) as an argument.
  • Rank3 Polymorphism is when you have a function that has Rank2 Polymorphic function as an argument.
  • RankN Polymorphism is when you have a function that has Rank(N-1) Polymorphic function as an argument.

One subtlety here: if you have a function with the signature Int -> (a -> a) -> Int, the second argument does NOT demand a polymorphic function. The only polymorphic function here is the whole function itself, Int -> (a -> a) -> Int. Its second argument can accept functions such as String -> String, Int -> Int, Float -> Float etc., but none of those functions is itself polymorphic.

Here is a function type whose second argument is a polymorphic function: Int -> (forall a. a -> a) -> Int. To enable these kinds of functions, you need the RankNTypes extension.
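A minimal sketch (the name applyBoth is made up): the argument f must itself be polymorphic, because the body applies it at two different types, which no monomorphic function could satisfy.

```haskell
{-# LANGUAGE RankNTypes #-}

-- f is used at type Int and at type String in the same body,
-- so only a genuinely polymorphic function (like id) fits.
applyBoth :: (forall a. a -> a) -> (Int, String)
applyBoth f = (f 1, f "one")
```

Calling `applyBoth id` works; trying to pass something like `(+1)` is rejected, because it only works at one type.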


FunctionalDependencies

Imagine this typeclass


{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
class Convert a b where
  convert :: a -> b
instance Convert Char String where
  convert = show
instance Convert Int String where
  convert = show

This will work fine, because if there is a call convert 'c' that expects a value of type String in return, the compiler will be able to resolve the instance to Convert Char String and use the convert function inside that instance in place of the original call.

Now imagine that we want to add one more function to this typeclass, as follows:


{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
class Convert a b where
  convert :: a -> b
  convertToString :: a -> String
instance Convert Char String where
  convert x = show x
  convertToString x = show x
instance Convert Int String where
  convert x = show x
  convertToString x = show x

Now we have a problem. In the signature of the convertToString function, the type b does not appear anywhere. So if there is a call convertToString i where i is an Int, Haskell won't be able to figure out which of the instances to pick the convertToString function from.

Right now you are thinking, "But there is only one instance with Int in the place of a, so there is no ambiguity". But Haskell still won't allow this, because it has an "open world" assumption: there is nothing preventing someone from adding an instance Convert Int Float in the future, creating an ambiguity at that time. Hence the error now.

The FunctionalDependencies extension provides a way to declare, in a typeclass declaration class Convert a b, that there will be only one b for each a. In other words, it is a way to declare that a implies what type b is. The syntax is as follows:


{-# LANGUAGE FunctionalDependencies #-}
class Convert a b | a -> b where
  convert :: a -> b
  convertToString :: a -> String

After this we won't be able to declare the two instances as before; that would be a compile-time error, because since a implies b, there cannot be two instances with the same a. Knowing a means knowing b, so Haskell will let us have class methods that make no reference to b.
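Putting it together, a hedged self-contained sketch (MultiParamTypeClasses and FlexibleInstances are also enabled so the instance compiles; 'example' is a made-up name): with the a -> b dependency declared, a call to convertToString resolves even though b never appears in its signature.

```haskell
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}

class Convert a b | a -> b where
  convert :: a -> b
  convertToString :: a -> String

instance Convert Int String where
  convert = show
  convertToString = show

-- unambiguous despite 'b' being absent from convertToString's type:
-- the dependency says Int determines b, and only one instance can exist.
example :: String
example = convertToString (42 :: Int)
```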

IRC

If you are learning Haskell on your own, please go and ask your questions in the #haskell IRC channel. I don't remember a time when I came back from there empty-handed.

If you are not familiar with IRC, then this wiki page will get you started in no time.

Seriously. Use it.

Random things I have found to be very useful

  1. Use the functions in the Debug.Trace module to print debugging output from anywhere, even from pure code. The only catch is that it prints only when it gets evaluated. But on the bright side, that gives you an idea of when an expression is actually getting evaluated.
  2. Use 'undefined' to leave the implementations of functions out and get your program to type check, then convert them into proper implementations one by one.
  3. Use typed holes to examine what type the compiler expects at a location.
  4. Use type wildcards to examine what the inferred type of an expression is.
  5. When running GHCi, use the "+RTS -M2G -RTS" options so that it does not eat up all your memory; here we limit it to 2GB. If you are using Stack, the command is 'stack ghci --ghci-options "+RTS -M2G -RTS"'
© 2018-08-01 Sandeep.C.R <sandeepcr2@gmail.com>