Part 17

Lists and Recursion

Here’s a new operator, :

Prelude> 1:[]
[1]
Prelude> 1:[2,3]
[1,2,3]
Prelude> tail (1:[2,3])
[2,3]
Prelude> head (1:[2,3])
1
Prelude> :t (:)
(:) :: a -> [a] -> [a]

The : operator builds a list out of a head and a tail. In other words, x : xs is the same as [x] ++ xs. Why do we need an operator for this?

Actually, : is the constructor for lists: it returns a new linked list node. The other list constructor is [], the empty list. All lists are built using : and []. The familiar [x,y,z] syntax is actually just a nicer way to write x:y:z:[], or even more explicitly, x:(y:(z:[])). In fact (++) is defined in terms of : and recursion in the standard library.

Here’s a picture of how [1,2,3] is structured in memory:

Building a List

Using : we can define recursive functions that build lists. For example here’s a function that builds lists like [3,2,1]:

descend 0 = []
descend n = n : descend (n-1)

descend 4 ==> [4,3,2,1]

Here’s a function that builds a list by iterating a function n times:

iterate f 0 x = [x]
iterate f n x = x : iterate f (n-1) (f x)

iterate (*2) 4 3 ==> [3,6,12,24,48]

let xs = "terve"
in iterate tail (length xs) xs
    ==> ["terve","erve","rve","ve","e",""]

Here’s a more complicated example: splitting a string into pieces at a given character:

split :: Char -> String -> [String]
split c [] = []
split c xs = start : split c (drop 1 rest)
    where start = takeWhile (/=c) xs
        rest = dropWhile (/=c) xs

split 'x' "fooxxbarxquux"   ==>   ["foo","","bar","quu"]

Pattern Matching for Lists

Last lecture, it was said that constructors are things that can be pattern matched on. Above, it was divulged that the constructors for the list type are : and []. We can put one and one together and guess that we can pattern match on : and []. This is true! Here’s how you can define your own versions of head and tail using pattern matching:

myhead :: [Int] -> Int
myhead [] = -1
myhead (first:rest) = first

mytail :: [Int] -> [Int]
mytail [] = []
mytail (first:rest) = rest

You can nest patterns. That is, you can pattern match more than one element from the start of a list. In this example, we use the pattern (a : b : _) which is the same as (a : (b : _)):

sumFirstTwo :: [Integer] -> Integer
-- this equation gets used for lists of length at least two
sumFirstTwo (a : b : _) = a+b
-- this equation gets used for all other lists (i.e. lists of length 0 or 1)
sumFirstTwo _       = 0

sumFirstTwo [1]      ==> 0
sumFirstTwo [1,2]    ==> 3
sumFirstTwo [1,2,4]  ==> 3

Here’s an example that uses many different list patterns:

describeList :: [Int] -> String
describeList []         = "an empty list"
describeList (x:[])     = "a list with one element"
describeList (x:y:[])   = "a list with two elements"
describeList (x:y:z:xs) = "a list with at least three elements"

describeList [1,3]        ==> "a list with two elements"
describeList [1,2,3,4,5]  ==> "a list with at least three elements"

List patterns that end with :[] can be typed out as list literals. That is, just like [1,2,3] is the same value as 1:2:3:[], the pattern [x,y] is the same as the pattern x:y:[]. Let’s rewrite that previous example.

describeList :: [Int] -> String
describeList []         = "an empty list"
describeList [x]        = "a list with exactly one element"
describeList [x,y]      = "a list with exactly two elements"
describeList (x:y:z:xs) = "a list with at least three elements"

Another way we can nest patterns is pattern matching on the head while pattern matching on a list. For example this function checks if a list starts with 0:

startsWithZero :: [Integer] -> Bool
startsWithZero (0:xs) = True
startsWithZero (x:xs) = False
startsWithZero []     = False

Consuming a List

Using pattern matching and recursion, we can recursively process a whole list. Here’s how you sum all the numbers in a list:

sumNumbers :: [Int] -> Int
sumNumbers [] = 0
sumNumbers (x:xs) = x + sumNumbers xs

Here’s how you compute the largest number in a list, this time using a helper function.

myMaximum :: [Int] -> Int
myMaximum [] = 0       -- actually this should be some sort of error...
myMaximum (x:xs) = go x xs
    where go biggest [] = biggest
        go biggest (x:xs) = go (max biggest x) xs

Note!, “go” is just a cute name for the helper function here. It’s not special syntax.

It’s often convenient to use nested patterns while consuming a list. Here’s an example that counts how many Nothing values occur in a list of Maybes:

countNothings :: [Maybe a] -> Int
countNothings [] = 0
countNothings (Nothing : xs) = 1 + countNothings xs
countNothings (Just _  : xs) = countNothings xs

countNothings [Nothing,Just 1,Nothing]  ==>  2

Building and Consuming a List

Now that we can build and consume lists, let’s do both of them at the same time. This function doubles all elements in a list.

doubleList :: [Int] -> [Int]
doubleList [] = []
doubleList (x:xs) = 2*x : doubleList xs

It evaluates like this:

doubleList [1,2,3]
=== doubleList (1:(2:(3:[])))
==> 2*1 : doubleList (2:(3:[]))
==> 2*1 : (2*2 : doubleList (3:[]))
==> 2*1 : (2*2 : (2*3 : doubleList []))
==> 2*1 : (2*2 : (2*3 : []))
=== [2*1, 2*2, 2*3]
==> [2,4,6]

Once you know pattern matching for lists, it’s straightforward to define map and filter. Actually, let’s just look at the GHC standard library implementations. Here’s map:

map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

and here’s filter:

filter :: (a -> Bool) -> [a] -> [a]
filter _pred []    = []
filter pred (x:xs)
    | pred x         = x : filter pred xs
    | otherwise      = filter pred xs

(Note! Naming the argument _pred is a way to tell the reader of the code that this argument is unused. It could have been just _ as well.)

Tail Recursion and Lists

When a recursive function evaluates to a new call to that same function with different arguments, it is called tail-recursive. (The recursive call is said to be in tail position.) This is the type of recursion that corresponds to an imperative loop. We’ve already seen many examples of tail-recursive functions, but we haven’t really contrasted the two ways for writing the same function. This is sumNumbers from earlier in this lecture:

-- Not tail recursive!
sumNumbers :: [Int] -> Int
sumNumbers [] = 0
sumNumbers (x:xs) = x + sumNumbers xs

In the second equation the function + is at the top level, i.e. in tail position. The recursive call to sumNumbers is an argument of +. This is sumNumbers written using a tail recursive helper function:

-- Tail recursive version
sumNumbers :: [Int] -> Int
sumNumbers xs = go 0 xs
    where go sum [] = sum
        go sum (x:xs) = go (sum+x) xs

Note the second equation of go: it has the recursive call to go at the top level, i.e. in tail position. The + is now in an argument to go.

For a function like sumNumbers that produces a single value (a number), it doesn’t really matter which form of recursion you choose. The non-tail-recursive function is easier to read, while the tail-recursive one can be easier to come up with. You can try writing a function both ways. The tail-recursive form might be more efficient, but that depends on many details. We’ll talk more about Haskell performance in part 2 of this course.

However, when you’re returning a list there is a big difference between these two forms. Consider the function doubleList from earlier. Here it is again, implemented first directly, and then via a tail-recursive helper function.

-- Not tail recursive!
doubleList :: [Int] -> [Int]
doubleList [] = []
doubleList (x:xs) = 2*x : doubleList xs

-- Tail recursive version
doubleList :: [Int] -> [Int]
doubleList xs = go [] xs
    where go result [] = result
            go result (x:xs) = go (result++[2*x]) xs

Here the direct version is much more efficient. The (:) operator works in constant time, whereas the (++) operator needs to walk the whole list, needing linear time. Thus the direct version uses linear time (O(n)) with respect to the length of the list, while the tail-recursive version is quadratic (O(n²))!

One might be tempted to fix this by using (:) in the tail-recursive version, but then the list would get generated in the reverse order. This could be fixed with an application of reverse, but that would make the resulting function quite complicated.

There is another reason to prefer the direct version: laziness. We’ll get back to laziness in part 2 of the course, but for now it’s enough for you to know that the direct way of generating a list is simpler, more efficient and more idiomatic. You should try to practice it in the exercises. Check out the standard library implementations of map and filter above, even they produce the list directly without tail recursion!

Exercises

All exercises can be found in Set3a and Set3b. Please pay attention in the title of the exercise in which file the exercises of this section can be found.

Exercises from 3a:

None for now!

Exercises from 3b:

Make sure to read the instructions for 3b carefully. All exercises should me made with just using the (:) operator, list literal syntax [a,b,c], if-thenelse, guards, ordering functions and case.

You have reached the end of this section! Continue to the next section:

You can check your current points from the blue blob in the bottom-right corner of the page.