SE-EDU
  • AB-1
  • AB-2
  • AB-3
  • AB-4
  • Collate
  • Book
  • Home
  • Contribute
  • About
  • GitHub
  • Learning Resources for Software Engineering Students ยป

    Introduction to Haskell

    Author: Thenaesh Elango

    Table of Contents

    Overview

    Haskell is a purely functional programming language with strong, static, inferred typing.

    While Haskell has its roots in academia, its emphasis on purity and side-effect-free computation makes it a valuable asset in software engineering contexts. Programs written in Haskell tend to be easy to test, refactor and debug, with the compiler usually catching all bugs before the program can even be compiled and run. Consequently, Haskell codebases are extraordinarily stable.

    Here's an example of a Haskell program that reads a string of numbers, prints the sum of the numbers and repeats the process until the string "quit" is entered. This shall serve as our Hello World.

    -- the entry point of the program
    main :: IO ()
    main = do
        str <- getLine
        if str == "quit"
            then return ()
            else do
                let sumOfNumbers = sumAllNumbersInString str
                putStrLn $ show sumOfNumbers
                main
    
    sumAllNumbersInString :: String -> Int
    sumAllNumbersInString str = sumAll $ extractInts $ tokenize str
    
    -- sums up a list of integers using a higher-order function (the left fold)
    sumAll :: [Int] -> Int
    sumAll = foldl (+) 0
    
    -- convert each string of digits in a list to an actual integer
    extractInts :: [String] -> [Int]
    extractInts = fmap read
    
    -- split string by spaces using a built-in function
    tokenize :: String -> [String]
    tokenize = words
    

    Haskell is widely used in a whole range of industries, including banks, financial companies, technology companies and engineering companies use Haskell in a variety of systems. A comprehensive list may be found here.

    Getting Started

    Installation

    This tutorial, in general, assumes a system-wide installation of the Haskell Platform. This is primarily for simplicity. It is perfectly acceptable to write small programs or code not intended for production in this manner.

    When using Haskell in an actual project, however, it is strongly-recommended to use Stack. Not doing so may cause dependency management to become a nightmare.

    System-Wide Installation

    For new users, Haskell may be quickly and easily installed by downloading the Haskell Platform for their respective operating systems. The Haskell Platform contains many common and important Haskell libraries, in addition to the Glasgow Haskell Compiler (GHC).

    At the time of writing, the Haskell Platform has binaries available for all common operating systems, and many uncommon ones as well.

    Stack Installation

    For Haskell projects of significant size, it may be necessary to control the exact versions of the compiler and libraries used. For such use cases, the system-wide installation method above may prove unwieldy and inadequate. In cases like these, it may be preferable to have an entire Haskell environment just for that project, together with a curated set of libraries.

    In such a scenario, Stack may come in handy. Stack is a package manager of sorts for Haskell, similar to NPM. Installation instructions may be found in the Stack Documentation, and is fairly standard.

    A new Stack project may be created and set up with the following:

    # create the project skeleton
    stack new ${PROJECT_NAME}
    
    # go into the project source directory
    cd ${PROJECT_NAME}
    
    # install GHC for the project
    stack setup
    
    # compile the project
    stack build
    
    # run the project executable
    stack exec ${PROJECT_NAME}-exe
    

    The command stack new is used to create a new project, which contains a skeleton already set up. This skeleton includes a ${PROJECT_NAME}.cabal file, which contains nearly the entire configuration for the project (compiler/library versions, modules to be exposed, build targets, etc), and is best thought of as a sort of package.json or Gemfile for Stack.

    The command stack setup downloads and installs GHC. Stack installs GHC versions into an isolated location in a user's home directory, and does not add them to the system path. The version used for any particular project depends on the setting in the project's ${PROJECT_NAME}.cabal file.

    The commands stack build and stack exec are used to build and run the project. The executable name for a project named Project is Project-exe. This name is configurable in Project.cabal.

    The rest of the Stack documentation may be found in the official guide.

    Usage

    The command ghci may be used to invoke the GHC interpreter. This launches an REPL where Haskell code can be entered and evaluated interactively. This is a very useful tool when first learning Haskell, and also when debugging code that fails to compile.

    The command ghc may be used to compile Haskell code down to machine code. The invocation of ghc is very similar to that of gcc.

    When using Stack, simply prefix the commands with stack.

    Whirlwind Tour

    Types

    Haskell is statically typed, meaning that every variable binds to a value of a specified type. Haskell is also strongly-typed, meaning that every value has a well-defined type.

    Basic Types

    We specify types explicitly by postfixing the variable names with the type.

        a :: Int
        a = 5
    
        -- unbounded integer type, similar to Java BigInt
        b :: Integer
        b = product [1..1000] -- this is the factorial of 1000
    
        pi :: Double
        pi = 3.141592654
    

    Haskell has very powerful type inference engine, so it is possible to omit the type definitions in most cases.

        a = 5
    

    Functions & Currying

    Functions, which are just values, have types too. It is considered good practice in Haskell to specify types for toplevel functions, as a form of documentation, even though the compiler is likely able to infer types.

        -- input:      x of type Double
        -- output: x * x of type Double
        square :: Double -> Double
        square x = x * x
    
        -- computes the hypotenuse of a right triangle given the other two sides
        hypotenuse :: Double -> Double -> Double
        hypotenuse adj opp = sqrt (square adj + square opp)
    

    If the above syntax is confusing and the comments insufficient, the reader may wish to consult the detailed introduction to Haskell syntax here.

    The type definition for square is rather obvious. But the type definition of hypotenuse is a little strange. One would expect (Double, Double) -> Double) instead of Double -> Double -> Double. The reason is that functions in Haskell are curried, so a two-parameter function can be called with a single argument, with a one-parameter function (that takes in the remaining argument and produces the value) being returned. (Double, Double) -> Double is actually a function that takes in a single 2-tuple parameter, which is different from a function that takes in two parameters.

    The -> binds to the right, so Double -> Double -> Double may be written as Double -> (Double -> Double) (a function that takes a double and returns a function that takes a double and returns a double). Calling hypotenuse 3 4 is also the same as calling (hypotenuse 3) 4, as function application binds to the left.

    We may go even further with currying, by fixing some parameters in the function:

        -- hypotenuse of a right triangle whose adjacent side is restricted to 3
        hypotenuseWithAdjacent3 :: Double -> Double
        hypotenuseWithAdjacent3 = hypotenuse 3
    

    This clearly illustrates how currying can be used to reuse and partially specialise code as needed. This idiom comes in handy very often in Haskell, as will be seen later.

    It may be of interest to note that all functions in Haskell take in at most one parameter. The illusion of multi-parameter functions is created by currying and left-binding function calls.

    Algebraic Data Types & Pattern Matching

    It is possible to create custom data types, either from nothing or from existing types.

        data TrafficSignal = Red | Amber | Green
    
        -- define some values of type TrafficSignal, all type-inferred
        redLight = Red
        amberLight = Amber
        greenLight = Green
    

    The TrafficSignal type is an example of creating data types from nothing. We call Red, Amber and Green the value constructors and TrafficSignal itself the type constructor. In this case, a TrafficSignal has 3 possible values, Red, Amber or Green. Both type and value constructors must start with a capital letter.

    We make use of types in functions by pattern matching on the value constructors. It is necessary to pattern match on all the value constructors; omitting any will cause the compiler to complain of non-exhaustive pattern matches.

        makeTrafficDecision :: TrafficSignal -> String
        -- leaving any of these out will cause the compiler to complain
        makeTrafficDecision Red = "Stop"
        makeTrafficDecision Amber = "Carefully Proceed"
        makeTrafficDecision Green = "Go"
    

    It is also possible to create data types that encapsulate/contain other data types. The value constructors in this case take parameters instead of being bare. Pattern matching is done by "expanding" the value constructor.

        data HttpRequest = Get String | Post String
    
        handleRequest :: HttpRequest -> String
        -- the ++ denotes string concatenation in this context
        handleRequest (Get string) = "Get request performed on " ++ string
        -- we use _ to denote that we don't care about the actual value
        handleRequest (Post _) = "Post request not supported"
    

    There is also an additional way to declare data types. Suppose we had a C++ class like so:

        // we are omitting trivial details like constructors
        class Box {
            double length;
            double breadth;
            double height;
            double density;
        public:
            double getVolume() {
                return this->length * this->breadth * this->height;
            }
            double getMass() {
                return this->density * this->getVolume();
            }
        };
    

    We could certainly represent a Box as an algebraic data type as follows:

        -- NOTE: a value constructor can have the same name as the type constructor
        data Box = Box Double Double Double Double
    

    But we are missing key information here. Which Double stands for which attribute? In situations like these, we can use Haskell's record syntax:

        -- define box as a record type
        data Box = Box { length :: Double, breadth :: Double, height :: Double, density :: Double }
    
        getVolume :: Box -> Double
        getVolume (Box { length = l, breadth = b, height = h }) = l * b * h
    
        getMass :: Box -> Double
        getMass box = getVolume box * density box
    
        silverBox = Box { length = 5, breadth = 10, height = 15, density = 10.5 }
        goldBox = Box 5 10 15 19.3 -- we can still use normal construction by parameter order
    

    There are a few things to note here, other than the syntax itself. When pattern matching on a record type, we may omit any parameters we do not need (we do not even need to specify _). We may also extract values from the record type by treating the record parameter names as functions from the record type to the parameter type. For instance, density in the above example is actually a Box -> Double function. Doing density silverBox will give the value 10.5.

    Type Parameters

    Consider the division operator on Double. We may be tempted to define it with the type Double -> Double -> Double, but the result may be undefined when dividing by zero. Here's a first stab at a solution to remedy this:

        -- represents a value that may or may not exist
        data MaybeDouble = Undefined | Defined Double
    
        divide :: Double -> Double -> MaybeDouble
        divide _ 0 = Undefined
        divide x y = Defined (x / y)
    

    This ensures that division by zero returns a clearly-defined result instead of something weird.

    Now suppose we want to send a HTTP request and retrieve the response data. This response data may not exist as the server may refuse to return the data. We can try to solve the problem in the following manner:

        data MaybeResponse = NoResponse | GotResponse HttpResponse
    
        makeRequest :: HttpRequest -> MaybeResponse
        -- implementation details irrelevant
    

    We have MaybeDouble and MaybeResponse, both of which have a common pattern: they represent possible failure of computation. Naturally, we may wish to abstract this out. But all the means of abstraction available to us thus far cannot be used, as we wish to abstract on types rather than values.

    For this purpose, Haskell supports type parameters, much like how C++ has templates and Java has generics.

    We define the following abstraction of failing computations:

        data Maybe t = Nothing | Just t
    
        divide :: Double -> Double -> Maybe Double
        divide _ 0 = Nothing
        divide x y = Just (x / y)
    
        makeRequest :: HttpRequest -> Maybe Response
        -- implementation details irrelevant
    

    Note that we introduce an additional parameter t on the left side of the definition. This is known as a type parameter, and must always be lowercase. This parameter can then be used in the value constructors as a placeholder for any type that should be there.

    The use of type parameters in this way is similar to the use of generics in Java. We may think of Maybe t as Optional<T>, if that helps to understand the role of t.

    Note that there can be more than one type parameter. An example is Either, which represents the result of a computation that returns values of different types on success or failure:

        data Either a b = Left a | Right b
    
        divide :: Double -> Double -> Either String Double
        divide _ 0 = Left "Attempt to divide by zero!"
        divide x y = Right (x / y)
    

    Type constructors can be curried just like regular functions or value constructors. Therefore, Either String Double is a concrete type, while Either String is a type constructor that takes in the remaining type.

    Maybe and Either are both defined in the Haskell prelude library.

    Inductive Data Types

    We can define a data type in terms of itself. Consider, for instance, a tree. A tree can be thought of as either an empty tree, or a node with a left subtree and right subtree attached. We encode it like so:

        data Tree t = EmptyTree | Node t (Tree t) (Tree t)
    

    Another classic inductive data type is the singly-linked list. The list is either an empty list or the first element together with the rest of the list. While not canonical, this is a very common representation of lists in the functional programming world:

        data List t = EmptyList | Element t (List t)
    

    This representation of lists is actually exactly how traditional lists are defined in Haskell, just with different names and notation as will be seen later.

    We are now poised to enter the world of actual functional programming in Haskell.

    Further Reading

    General Functional Programming

    Functions

    A function may be defined in one of several ways. We illustrate the various syntaxes for defining a function below, with more details here if needed:

        sumOfSquares :: Double -> Double -> Double
        -- standard definition
        sumOfSquares x y  = (x * x) + (y * y)
         -- lambda function
        sumOfSquares = \x y -> (x * x) + (y * y)
    
        fizzBuzz :: Int -> Either String Int
        -- the horrible, disgusting, but still perfectly correct way
        fizzBuzz x = case x `mod` 15 == 0 of
                        True -> Left "fizzbuzz"
                        False -> case x `mod` 3 == 0 of
                            True -> Left "fizz"
                            False -> case x `mod` 5 == 0 of
                                True -> Left "buzz"
                                False -> Right x
        -- far more elegant way using guard patterns
        fizzBuzz x
            | x `mod` 15 = Left "fizzbuzz"
            | x `mod` 3 = Left "fizz"
            | x `mod` 5 = Left "buzz"
            | otherwise = Right x
    

    Recursion

    Recursion is one of the fundamental themes of functional programming. It is the ability of a function to call itself.

        -- Time: O(n)
        -- Space: O(n)
        factorial :: Integer -> Integer
        factorial 0 = 1
        factorial n = n * factorial (n - 1)
    
        -- Time: O(2^n)
        -- Space: O(n), may vary due to lazy evaluation
        fibonacci :: Integer -> Integer
        fibonacci 0 = 0
        fibonacci 1 = 1
        fibonacci n = fibonacci (n - 2) + fibonacci (n - 1)
    

    While Haskell has no primitive loop structures, looping can be simulated by recursion. While attempting this in languages in C may cause a stack overflow, Haskell avoids this via tail-call optimisation, which can be applied to recursive calls that meet certain requirements.

        -- Time: O(n)
        -- Space: O(1)
        factorial :: Integer -> Integer
        factorial = factorial' 1 where
            factorial' p 0 = p
            factorial' p n = factorial' (p * n) (n - 1)
    
        -- Time: O(n)
        -- Space: O(1)
        fibonacci :: Integer -> Integer
        fibonacci 0 = 0
        fibonacci 1 = 1
        fibonacci n = fibonacci' 0 1 n where
            fibonacci a _ 0 = a
            fibonacci' a b n = fibonacci' b (a + b) (n - 1)
    

    We can safely omit the types in the inner function definitions due to type inference. Also note how we freely use currying in the factorial definition.

    Lists

    As described earlier, a list is an inductive data type, defined as either the empty list or an element concatenated with the rest of the list. The actual list data type is

        data [] t = [] | (:) t ([] t)
    

    where : is an infix value constructor.

    IMPORTANT: Infix Functions

    Any function (a value constructor is really just a function) that takes in two parameters whose name consists of nothing but symbols is infix by default.

    An infix function like + may be used in prefix form by enclosing in parentheses. For instance, 1 + 1 is the same as (+) 1 1.
    In the type definition, the prefix form must be used i.e. (+) :: Int -> Int -> Int.
    In the function definition, either is acceptable.

    Find out more.

    We will use this concept freely from now on.

    Here are several ways to define a list xs :: [Int] containing 2,4,6,8 in that order:

        -- the crazy way, using prefix notation directly from the list definition
        xs = (:) 2 ((:) 4 ((:) 6 ((:) 8 [])))
    
        -- using infix syntax for (:), still annoying to write
        xs = 2:(4:(6:(8:[])))
    
        -- taking advantage of binding rules for (:), noiseless and easier to understand at a glance
        xs = 2:4:6:8:[]
    
        -- using varying amounts of list syntactic sugar provided by the compiler
        xs = 2:4:6:[8]
        xs = 2:4:[6,8]
        xs = 2:[4,6,8]
        xs = [2,4,6,8]
    

    The last representation is most commonly used, while the second last is often used when pattern matching on lists. The rest are almost never seen in practice. However, it is hoped that this pedantic exercise helps the reader understand the true nature of lists: an ordinary inductive data type with some compiler syntactic sugar tacked on.

    List Processing - Fold

    List processing is a very important part of elementary functional programming. This is due to the fact that lists can store large amounts of data, and it is very easy to define powerful abstractions to slice and dice that data in ways typically unknown in imperative programming.

    One common idiom is to loop over a list and aggregate their values.

    It is possible to run over a list and sum their values recursively like so:

        sumList :: [Int] -> Int
        sumList [] = 0
        sumList (x:xs) = x + sumList xs
    

    Note the infix pattern match (x:xs) as opposed to ((:) x xs). What if we wish to take the product of the elements instead of a sum? Then we would write:

        prodList :: [Int] -> Int
        prodList [] = 1
        prodList (x:xs) = x * prodList xs
    

    It is clear that some abstraction is in order here. The functions are almost identical except for the aggregating function used and the initial value (0 for sum, 1 for product). We can write a generalised aggregating function:

        -- 1st parameter is the aggregating function (e.g. (+) or (*))
        -- 2nd parameter is the initial value
        -- 3rd parameter is the list to aggregate
        aggregate :: (Int -> Int -> Int) -> Int -> [Int] -> Int
        aggregate _ initial [] = initial
        aggregate op initial (x:xs) = op x (aggregate op initial xs)
    

    This is better, but perhaps we could generalise this even further beyond Int. We then arrive at the following, by simply changing the type signature:

        -- 1st parameter is the aggregating function (e.g. (+) or (*))
        -- 2nd parameter is the initial value
        -- 3rd parameter is the list to aggregate
        aggregate :: (a -> b -> b) -> b -> [a] -> b
        aggregate _ initial [] = initial
        aggregate op initial (x:xs) = op x (aggregate op initial xs)
    

    This function is known as foldl in the Haskell prelude library, and there is also a variant called foldr that does the aggregation from the right instead.

    List Processing - Map & Filter

    One may wish to take in a list, transform every element in the list, and output the resulting list. This is known as a map, and may be defined as:

        map :: (a -> b) -> [a] -> [b]
        map _ [] = []
        map f (x:xs) = (f x):(map f xs)
    

    The type definition itself contains a wealth of information. The map function takes in a "transformer", the list to be transformed, and return the transformed list. An example of its usage would be:

        -- xs is [1,4,9,16]
        xs = map (\x -> x * x) [1,2,3,4]
    

    One may also wish to remove certain elements, that fail some predicate, from a given list. This is known as a filter:

        filter :: (t -> Bool) -> [t] -> [t]
        filter _ [] = []
        filter predicate (x:xs)
            | predicate x == True = x:xs
            | otherwise = xs
    

    This example uses guard patterns. An example of using filter would be:

        -- xs is [2,4]
        xs = filter (\x -> x `mod` 2 == 0) [1,2,3,4]
    

    It is left as an exercise for the reader to implement map and filter in terms of foldl (or aggregate as defined above, which is the same).

    Programming with Other Inductive Data Types

    Recursion is a natural fit with inductive data types other than lists. One example would be finding an element in a binary tree:

        find :: Tree Int -> Int -> Bool
        find EmptyTree _ = False
        find (Node x left right) target
            | x == target = True
            | x < target = find left target
            | x > target = find right target
    

    The above runs in O(log n) as long as the tree is balanced.

    Further Reading

    Typeclasses

    Typeclasses are essentially contracts/constraints imposed on types. They are similar to how Java interfaces are constraints imposed on Java classes. When used properly, they are an extremely powerful tool in helping to structure code.

    Defining and Instantiating Typeclasses

        -- "class" here has nothing to do with OOP
        class Eq t where
            (==) :: t -> t -> Bool
            (!==) :: t -> t -> Bool
            -- this ensures that we don't have to define (!==) separately
            a != b = not (a == b)
    

    We have just defined a typeclass called Eq. As its name probably suggests, this typeclass is used when we wish to define the meaning of equality on types. We then instantiate the typeclass with the TrafficSignal type, like so:

        instance Eq TrafficSignal where
            -- note that these are infix function DEFINITIONS
            -- we can define infix operators directly in infix notation
            Red == Red
            Amber == Amber
            Green == Green
    

    We have thus defined (==) completely for TrafficSignal. Note that (!=) now comes for free, since we have defined it in terms of (==) in the typeclass itself.

    Here's another example:

        instance Eq (List t) where
            EmptyList == EmptyList
            (Element x xs) == (Element y ys) = (x == y) && (xs == ys)
    

    Here, we define the equality of a list in terms of its underlying elements. This seems reasonable. However, running this program will give an error. This is because we are attempting to compare the underlying elements (of type t) using (==), which is not guaranteed to be defined on t.

    The solution, in this case, is to enforce a typeclass constraint prerequisite on t by writing:

        instance (Eq t) => Eq (List t) where
            -- as before
    

    Here is an example of Eq being defined on Trees:

        instance (Eq t) => Eq (Tree t) where
            EmptyTree == EmptyTree
            (Node x left right) == (Node x' left' right') = (x == x') && (left == left') && (right == right')
    
        tree1 = Node 1 (Node 2 EmptyTree) (Node 3 EmptyTree)
        tree2 = Node 1 (Node 2 EmptyTree) EmptyTree
        tree3 = Node 1 (Node 2 EmptyTree) EmptyTree
    
        -- some experiments
        tree1 == tree2 -- False
        tree2 == tree3 -- True
        tree3 != tree1 -- True
    

    We present another common typeclass called Ord, which defines order for a type:

        -- anything that instantiates Ord must also instantiate Eq
        -- this makes the typeclass definitions simpler as (==) is already provided and can be used
        class (Eq t) => Ord t where
            -- the only one we actually need to implement when instantiating
            (<) :: t -> t -> Bool
    
            -- we predefine these and can then get them all for free
            (>) :: t -> t -> Bool
            a > b = not ((a < b) || (a == b))
    
            (<=) :: t -> t -> Bool
            a <= b == (a < b) || (a == b)
    
            (>=) :: t -> t -> Bool
            a >= b = not (a < b)
    

    Adding Typeclass Constraints to Functions

    Consider the following function to check if the elements in the following list are all in ascending order:

        isAscending :: [t] -> Bool
        isAscending [] = True -- handle 0-element lists
        isAscending (x:[]) = True -- handle 1-element lists
        isAscending (x:y:xs) = (x < y) && isAscending (y:xs) -- recursive case
    

    This function seems reasonable, except for one minor detail: we (and the compiler) are not sure if t can be compared using (<)!. To remedy this, we need to explicitly state that t instantiates Ord, thereby allowing the use of (<). We do this by adding the constraint in the function type definition:

        isAscending :: (Ord t) => [t] -> Bool
    

    We can now try out the isAscending function:

        isAscending [1,2,4,3] -- False
        isAscending [1,2,3,4,5] -- True
        isAscending [] -- True
    

    Instantiating Typeclasses with Parameterized Type Constructors

    Up to this point, we have been instantiating typeclasses with concrete types, such as TrafficSignal and Tree t. It is also possible to instantiate typeclasses with parameterized type constructors like Tree and List.

    Consider the following typeclass Container that is instantiated by types that have some notion of constituent elements and size. For instance, a List has a length and contains elements of some type. A Tree has nodes and a size (number of nodes). The length is independent of type of element contained within.

        class Container s where
            -- t is an arbitrary unconstrained type
            size :: s t -> Int
    

    We can instantiate Container with Tree and List. These are parameterized type constructors, not concrete types. We can even instantiate with Maybe.

        instance Container Tree where
            size EmptyTree = 0
            size (Node _ left right) = 1 + size left + size right
    
        instance Container List where
            size EmptyList = 0
            size (Element _ restOfList) = 1 + size restOfList
    
        instance Container Maybe where
            size Nothing = 0
            size (Just _) = 1
    

    As an exercise, the reader may wish to redefine the size of a Tree to mean "height of tree" rather than "number of nodes". It is necessary to instantiate Container with Tree differently to achieve this. The function max :: (Ord a) => a -> a -> a may come in handy (Int is an instance of Ord).

    Further Reading

    Common Haskell Idioms

    Functors

    Consider the map function previously defined. The type of map is (a -> b) -> [a] -> [b], which means that it operates only on lists. We may imagine extending maps to Trees and Maybes in the following manner:

        map :: (a -> b) -> Tree a -> Tree b
        map :: (a -> b) -> Maybe a -> Maybe b
    

    The two type definitions above look very similar and suggest a generalization: types that can be mapped over. We call such types functors, and can represent their behaviour with a typeclass.

        class Functor f where
            -- f is a type constructor that takes in one type parameter
            fmap :: (a -> b) -> f a -> f b
    
    
        instance Functor Tree where
            fmap _ EmptyTree = EmptyTree
            fmap f (Node x left right) = Node (f x) (fmap f left) (fmap f right)
    
        instance Functor Maybe where
            fmap _ Nothing = Nothing
            fmap f (Just x) = Just (f x)
    

    We can then map over values of any functor:

        sq x = x * x
    
        fmap sq (Just 5) -- returns Just 25
        fmap sq Nothing -- returns Nothing
    
        fmap sq (Node 1 (Node 2 EmptyTree) (Node 3 (Node 4 EmptyTree) EmptyTree))
        -- returns (Node 1 (Node 4 EmptyTree) (Node 9 (Node 16 EmptyTree) EmptyTree))
    

    More information about functors, including the functor laws, may be found here.

    Applicative Functors

    An applicative functor is a functor that allows for a more advanced type of mapping. We shall jump straight into the (abridged) typeclass definition and the example of Maybe as an applicative functor:

        class (Functor f) => Applicative f where
            pure :: a -> f a
            (<*>) :: f (a -> b) -> f a -> f b -- generalized map function
    
        instance Applicative Maybe where
            pure x = Just x
            Nothing <*> _ = Nothing
            _ <*> Nothing = Nothing
            Just f <*> Just x = Just (f x)
    

    Applicative functors have the concept of lifting, embodied in pure, where a value is taken and placed in the context of a functor. For instance, in the context of Maybe, pure 5 returns the value Just 5.

    Applicative functors allow a more general form of mapping, where it is possible to use an N-parameter function to map over N functors. To understand the value of this, consider the following code:

        euclideanDistance :: Double -> Double -> Double -> Double
        euclideanDistance x y z = sqrt ((x * x) + (y * y) + (z * z))
    
        (pure euclideanDistance) <*> Just 1 <*> Just 2 <*> Just 3
        -- returns Just 3.7416573867739413
    

    The above code can be written with just fmap in the ordinary Functor class, but will involve incredible contortions.

    More information about applicative functors can be found here. There is a lot of additional functionality available in the Applicative typeclass. We have barely scratched the surface.

    Monads

    No Haskell tutorial will be complete without an introduction to the fabled monad. Monads have been described with various analogies, as well as with notorious phrases from category theory like "a monad is a monoid in the category of endofunctors".

    None of these are useful for the software engineer, so we dispense with them and opt for just showing the code:

        class (Applicative m) => Monad m where
            -- this function, called "bind", is at the heart of the monad
            (>>=) :: m a -> (a -> m b) -> m b
    
            -- we could actually just use pure, but return is here for historical reasons
            return :: a -> m a
            return = pure
    

    A monad is essentially an applicative functor that allows for operations to be chained together with a value carried in the background. To consider this, let us consider the familiar case of Maybe, which is a monad.

        instance Monad Maybe where
            Nothing >>= _ = Nothing
            (Just x) >>= f = Just (f x)
    
        -- maybeSomeValue is Just 50
        maybeSomeValue = Just 5 >>= (\x -> Just (x * x) >>= (\x -> Just (x + x)))
    

    Essentially, the bind function allows for values carried inside the monad (which is ultimately just a functor) to be extracted and passed into another computation. This explanation may seem obtuse, but consider the same code, with some extracted whitespace added and the return function used:

        maybeSomeValue = Just 5 >>= (\x ->
                         Just (x * x) >>= (\x ->
                         return (x + x)))
    

    If the reader squints hard enough, this looks like an imperative program! It looks like the following is being done:

        maybeSomeValue = do
            x <- Just 5
            x <- Just (x * x)
            return (x + x)
    

    The result of the imperative-looking code is exactly the same as that of the original computation, if traced through. Using monads to provide an imperative interface in a functional program is such a common pattern that the do notation was conceived as syntactic sugar to make writing such a pattern easier. That means that the imperative-looking code is actually valid Haskell!

    In addition to Maybe, there are several other monads. A major example is the IO monad, which allows external state to be encapsulated in the monad an interfaced with in a manner familiar to imperative programmers.

    Monads are a big topic, and additional resources are available:

    • Monads
    • Monad Laws that every monad should obey
    • IO Monad
    • State Monad, allows state to be carried in a monadic context, allowing imperative-style computation
    • ST Monad, allows mutable state to be carried in a monadic context, useful for implementing inherently destructive algorithms
    • Arrays, allows constant-time access to elements like a C array, with mutable variants in the IO and ST monads provided

    Guides

    • Learn You a Haskell is a good beginner text for learning Haskell. It does not have much real-world examples, but does quite a good job in explaining difficult theoretical concepts (e.g. functors, applicative functors and monads) well. It is recommended to read the material in 4 chunks:
      • 1-6: basic functional programming
      • 7: modules (very important for large codebases)
      • 8-10: details of the type-system
      • 11-14: monads
    • Real World Haskell is a rather old text which is possibly outdated. Still, it shows plenty of examples of how Haskell may be used in actual real-life scenarios (databases, web programming, etc).
    • Haskell Wikibook is a comprehensive Haskell reference. This resource really shines when it comes to aggregating and covering advanced Haskell topics that typically appear elsewhere in a very ad-hoc fashion.
    • Typeclassopedia is a reference for the major typeclasses contained in the Haskell hierarchical libraries. Use it to determine which typeclasses are related to which (e.g. every monad is an applicative functor, which is in turn a functor).