Module 12 Lab
-
Go to a directory where you can save files. In an editor, open a file named
lab.hs
in one window. In your terminal, change to the same directory and startghci
. -
In your lab file, write a function called
listOf4
that takes four parameters and combines them into a list, so that, for example,listOf4 3 1 6 7
is[3,1,6,7]
. Load this file into ghci and make sure it works, then use:t
to determine its type. Unsurprisingly, it should belistOf4 :: a -> a -> a -> a -> [a]
. We have talked before about how in the type signature, the last type is the function result and all the rest are the parameter types. But why all the arrows? Actually, the arrows indicate functions, and they are right associative. Thus the type oflistOf4
is reallya -> (a -> (a -> (a -> [a])))
, and this says thatlistOf4
takes a parameter of typea
and returns a function. This result function takes a parameter of typea
and returns a function. This result function takes a parameter of typea
and returns a function. This result function takes a parameter of typea
and returns a list of values of type a. This idea of thinking of a function of multiple parameters as really a function of a single parameter returning another function of a single parameter, which itself returns a function of a single parameter, and so on, is called currying, named after Haskell Curry (yes, that Haskell). (Curry did not actually come up with currying, but it is named after him anyway.) -
Functions are represented internally in Haskell as curried functions. So, for example,
listOf4 8
returns a function. This function tacks on an 8 at the start of a list of three elements that it makes into a list. So we could define the functionlistOf4With8 = listOf4 8
; similarly, we can make the functionlistOf4With88 = listOf4 8 8
. And so on. Try them and see. -
So what? This has a consequence that is useful for defining functions. If we have a function of several values and need a function just like it but with fixed parameters, we can make one out of it by currying. For example, we might define a double function as
double n = 2*n
. But with currying, we can just define this asdouble = (2*)
. This is called a point-free definition. Try it. -
Functions are first-class entities in Haskell. This means that they can be returned as results (which is what currying does). This also means that they can be parameters. Suppose we want to do something to every element of a list. For example, suppose we want to double every element of a list. We could make the following definition.
doubleList [] = [] doubleList (n:ns) = (double n):(doubleList ns)
This works fine, as you will see if you try it.
-
But suppose we want to triple every element of a list, or do something else to every element of a list? Then we have to write another function every time. But what if defined a function like the following?
applyToList :: (a -> a) -> [a] -> [a] applyToList _ [] = [] applyToList f (n:ns) = (f n):(applyToList f ns)
Notice the type signature:
applyToList
takes a function as its first argument, followed by a list, and returns a list. The definition makes clear thatapplyToList
forms a new list by applying the argument function f to every element of the argument list. Now we can writeapplyToList double list
,applyToList triple list
, and so forth. -
We don’t need to write our own functions to do this. And since we have currying, we don’t even have to write the double and triple functions. We can write things like the following.
applyToList (2*) [1..10] applyToList (3*) [1..10]
Try these and see. Write code to generate the remainder of the numbers from 1 to 20 when divided by three.
-
It should not be a surprise that Haskell already has a function like
applyToList
. It is calledmap
. Use:t
to find the type ofmap
. Note that it has a slightly different type thanapplyToList
. This is because you can map a function that produces values of a type different from its argument type. For example, trymap odd [1..10]
. -
Use
map
to generate a list of the lengths of a list of strings. -
Haskell has other very useful functions that have functions as arguments (sometimes called higher-order functions). Two of the most useful are
foldl
andfoldr
. These functions produce a result by applying a binary function to successive elements of a list. For example, suppose you want to add up the elements of a list. You need an accumulator that starts at 0, and then you want to add each element of the list to the accumulating value and return it as the result. Bothfoldl
andfoldr
take three arguments: a binary function, a starting accumulator value, and list, and return the result of applying the function to successive elements of the list. The only difference is thatfoldl
goes left to right in the list, andfoldr
goes right to left (which matters in some cases). Use a fold function to sum the first 100 numbers. Use one of them to compute100!
(remember to enclose operators in parentheses to refer to them as functions). -
The functions
foldl1
andfoldr1
don’t have a starting accumulator value. Instead they use the first (or last) value in the list as the starting accumulator value. Usefoldl1
orfoldr1
to find the maximum value in a list of numbers (hint: see themax
function). Use a fold to&&
together all the values in a list of Booleans. -
Suppose we want to find the longest string in a list of strings. This is like finding the maximum of a list of numbers except that type of the value is different from the type of the compared value (that is, we are comparing string lengths, not strings). We have to write our own function for this comparison. Write a function
maxString ::[Char] -> [Char] -> [Char]
that return the longest of two strings. Then use a fold function to find the longest string in a list of strings. -
Now suppose that we want to write a function
findMaxString :: [[Char]] -> [Char]
that finds the largest string in a list. We can write this function easily using a fold function and themaxString
function from above. We can make themaxStringfunction
local tofindMaxString
using either awhere
or alet
clause. Alet
clause can be used anywhere an expression can be used. It introduces temporary name bindings. For example, we could writefindMaxString
using alet
as follows.findMaxString :: [[Char]] -> [Char] findMaxString = let maxString s t | (length s) < (length t) = t | otherwise = s in foldl maxString ""
In this construction,
maxString
is defined first and then used in the expression following thein
. Type this in and try it. -
Alternatively, consider the following definition.
findMaxString' :: [[Char]] -> [Char] findMaxString' = foldl maxString "" where maxString s t | (length s) < (length t) = t | otherwise = s
Here the
where
introduces a definition that applies in the context immediately preceding it. Type this in and try it. Although there are some subtle differences between these two constructs, they can mostly be used as alternatives to one another. -
Note the indentation in the previous examples. We have not mentioned it before, but indentation is important in Haskell because, like Python, Haskell relies on indentation to determine when expressions end. The “golden rule” of Haskell indentation is that code that is part of an expression must be indented further in than the start of the expression. The second rule of indentation is that all grouped expression must be exactly aligned. To make matters even more difficult, Haskell includes non-whitespace as part of the indentation. To illustrate, the following is ok.
let x = 4 y = 7
Here the
x = 4
is indented from thelet
and they = 7
is exactly aligned with it. But the following is not ok.let x = 4 y = 7
The second expression is not indented from the
let
. These are also not ok.let x = 4 y = 7 let x = 4 y = 7
Here the grouped expressions are not exactly aligned. If you only use spaces for indentation, and you align things nicely, you should be ok. (But you must indent
then
andelse
from theif
in anif
expression, which is unlike imperative languages conventions.) Some Haskell programmers think it is also a good idea to put things likelet
,where
,in
, and so on alone on a line. However, if you get a weird compiler message for code that looks perfectly ok, it could be an indentation problem. (Actually, you can use curly braces and semicolons to make everything work, but this is frowned on in the Haskell community.) -
Write a function
myReverse
that uses a fold to reverse a list. Write the argument function as a local function using alet
orwhere
. -
Yet another higher order function is one that selects or filters a list based on a test function. The built-in
filter
function has typefilter :: (a -> Bool) -> [a] -> [a]
. The first argument is a test function (or predicate) applied to every element of a list. If this test function returnsTrue
, then the element is part of the result list; otherwise it is tossed out. For example,filter odd [1..10]
returns a list of odd numbers between 1 and 10. -
Write a function
partitionLess :: Ord a => a -> [a] -> [a]
that takes a valuev
and a listt
and returns a list of elements oft
that are less than or equal tov
. Now write a functionpartitionMore :: Ord a => a -> [a] -> [a]
that returns the elements oft
greater thanv
. -
Now write
quicksort
. Now make the partition functions local. Now writequicksort
without any helper functions. -
We have discussed various types, and sometimes they can be quite complicated. Haskell provides a way to abbreviate types for ease of typing and documentation purposes. The statement
type T = <def>
can be used to define a new type. For example,type String = [Char]
defines aString
to be a list ofChar
. This definition is already built into Haskell, so we could have been doing this all along. -
Type definitions can also have parameters. For example,
type Pair a = (a,a)
definesPair
to be a two-tuple of something (because we used a type variable, the contained type is not constrained).Pair Int
would then be an abbreviation for(Int,Int)
,Pair String
an abbreviation for(String,String)
, etc. -
We can get values “out of” a composite data type using pattern matching. For example, the pattern
(p, q)
would match an instance ofPair
, and we can usep
andq
to refer to the values “inside” thePair
. Note that we have already been using patterns ([]
and(x:xs)
) to write list-based functions. -
Write a function
pairToList :: Pair a -> [a]
to convert a pair of two items to a list containing those two items. -
Write a function
pairUp :: Pair [a] -> [Pair a]
to convert a pair of lists into a list of pairs. For example,pairUp ([1,2,3],[4,5,6])
should return[(1,4),(2,5),(3,6)]
. -
Function application has high precedence and it is left associative. This can lead to lots of parentheses, such as
odd (mod 8 (div 9 3))
. The$
operator is right associative, has lower precedence than function application, but means the same as function application (sof $ x
means the same asf x
). Since we have currying, we can use it to eliminate some parentheses:odd $ mod 8 $ div 9 3
. -
We can also define new types in Haskell using algebraic data types. For instance, we could define a type that is either a single number or a range using the following declaration.
data Thing = ANum Int | ARange Int Int deriving (Show)
Here,
ANum
andARange
are called value constructors and are essentially functions that can be used to create new values of typeThing
(What is the type ofANum
? What aboutARange
?). For example,(ANum 5)
is a single-numberThing
with value5
. Write a functiontoThing :: Int -> Thing
that takes an integern
and returns a new single-numberThing
containingn
(use theANum
constructor!). -
You can use the
case
structure to pattern-match against algebraic data types. Here is an example of a function that returns the stored integer if it is given a single-numberThing
and it returns zero if it is given a rangedThing
.getNum :: Thing -> Int getNum box = case box of ANum x -> x ARange _ _ -> 0
Note the underscores indicate anonymous variables; we do not need to give them a name because we do not use them in the definition of
getNum
. -
Write a function
middle :: Thing -> Int
which takes aThing
and returns the middle number, which for a single number is just that number but for a range is the median of the range. -
Data types in Haskell can be recursive, which means that they can contain nested values. Here is an example.
data Tree = Empty | Node Int Tree Tree deriving (Show, Eq)
In other words, a
Tree
is either empty or it is a binary tree node that contains an integer and two more (sub)Tree
s. Here thederiving (Show, Eq)
indicates that we would like default definitions provided so that we can callshow
(for string output) and use the equality operator. -
Now we can define a function that generates a binary search tree from a list as follows.
toBST :: [Int] -> Tree toBST [] = Empty toBST (x:xs) = Node x (toBST $ filter (<=x) xs) (toBST $ filter (>x) xs)
Run
toBST [1,2,3,4]
and observe how its output is different fromtoBST [4,3,2,1]
. -
Here is an alternate version of
toBST
that uses a fold:toBST :: [Int] -> Tree toBST = foldl insertBST Empty where insertBST Empty x = Node x Empty Empty insertBST (Node t l r) x | (x <= t) = (Node t (insertBST l x) r) | otherwise = (Node t l (insertBST r x))
This is a somewhat complicated function but you should attempt to understand what it is doing. The helper function takes care of inserting a new element into a binary search tree, and once that is defined the overall build operation can be characterized by folding that operation over the list of elements, begining with an empty tree as the initial accumulator. The top-level routine is a point-free definition; the list parameter is not mentioned explicitly.
-
Write a function
find :: Int -> Tree -> Bool
that takes an integer and a binary search tree and returnsTrue
orFalse
indicating whether or not the integer is present in the tree. This routine should run in O(log n) time if the tree is balanced, and should be considerably simpler thantoBST
. -
Write a function
sorted :: Tree -> [Int]
that takes a binary search tree and returns a sorted list of integers in the tree. -
Write a function
countNodes :: Tree -> Int
that takes a binary search tree and returns the number of integer nodes in the tree. -
Write a function
countNonLeaves :: Tree -> Int
that takes a binary search tree and returns the number of non-leaf nodes in the tree.
This lab was originally written by Dr. Chris Fox and expanded by Dr. Mike Lam.