Module 12 Lab

Go to a directory where you can save files. In an editor, open a file named lab.hs in one window. In your terminal, change to the same directory and start ghci.
In your lab file, write a function called listOf4 that takes four parameters and combines them into a list, so that, for example, listOf4 3 1 6 7 is [3,1,6,7]. Load this file into ghci and make sure it works, then use :t to determine its type. Unsurprisingly, it should be listOf4 :: a -> a -> a -> a -> [a]. We have talked before about how in the type signature, the last type is the function result and all the rest are the parameter types. But why all the arrows? Actually, the arrows indicate functions, and they are right associative. Thus the type of listOf4 is really a -> (a -> (a -> (a -> [a]))), and this says that listOf4 takes a parameter of type a and returns a function. This result function takes a parameter of type a and returns a function. This result function takes a parameter of type a and returns a function. This result function takes a parameter of type a and returns a list of values of type a. This idea of thinking of a function of multiple parameters as really a function of a single parameter returning another function of a single parameter, which itself returns a function of a single parameter, and so on, is called currying, named after Haskell Curry (yes, that Haskell). (Curry did not actually come up with currying, but it is named after him anyway.)
Functions are represented internally in Haskell as curried functions. So, for example, listOf4 8 returns a function. This function tacks on an 8 at the start of a list of three elements that it makes into a list. So we could define the function listOf4With8 = listOf4 8; similarly, we can make the function listOf4With88 = listOf4 8 8. And so on. Try them and see.
So what? This has a consequence that is useful for defining functions. If we have a function of several values and need a function just like it but with fixed parameters, we can make one out of it by currying. For example, we might define a double function as double n = 2*n. But with currying, we can just define this as double = (2*). This is called a point-free definition. Try it.
Functions are first-class entities in Haskell. This means that they can be returned as results (which is what currying does). This also means that they can be parameters. Suppose we want to do something to every element of a list. For example, suppose we want to double every element of a list. We could make the following definition.
```
doubleList [] = []
doubleList (n:ns) = (double n):(doubleList ns)
```
This works fine, as you will see if you try it.
But suppose we want to triple every element of a list, or do something else to every element of a list? Then we have to write another function every time. But what if defined a function like the following?
```
applyToList :: (a -> a) -> [a] -> [a]
applyToList _ [] = []
applyToList f (n:ns) = (f n):(applyToList f ns)
```
Notice the type signature: applyToList takes a function as its first argument, followed by a list, and returns a list. The definition makes clear that applyToList forms a new list by applying the argument function f to every element of the argument list. Now we can write applyToList double list, applyToList triple list, and so forth.
We don’t need to write our own functions to do this. And since we have currying, we don’t even have to write the double and triple functions. We can write things like the following.
```
applyToList (2*) [1..10]
applyToList (3*) [1..10]
```
Try these and see. Write code to generate the remainder of the numbers from 1 to 20 when divided by three.
It should not be a surprise that Haskell already has a function like applyToList. It is called map. Use :t to find the type of map. Note that it has a slightly different type than applyToList. This is because you can map a function that produces values of a type different from its argument type. For example, try map odd [1..10].
Use map to generate a list of the lengths of a list of strings.
Haskell has other very useful functions that have functions as arguments (sometimes called higher-order functions). Two of the most useful are foldl and foldr. These functions produce a result by applying a binary function to successive elements of a list. For example, suppose you want to add up the elements of a list. You need an accumulator that starts at 0, and then you want to add each element of the list to the accumulating value and return it as the result. Both foldl and foldr take three arguments: a binary function, a starting accumulator value, and list, and return the result of applying the function to successive elements of the list. The only difference is that foldl goes left to right in the list, and foldr goes right to left (which matters in some cases). Use a fold function to sum the first 100 numbers. Use one of them to compute 100! (remember to enclose operators in parentheses to refer to them as functions).
The functions foldl1 and foldr1 don’t have a starting accumulator value. Instead they use the first (or last) value in the list as the starting accumulator value. Use foldl1 or foldr1 to find the maximum value in a list of numbers (hint: see the max function). Use a fold to && together all the values in a list of Booleans.
Suppose we want to find the longest string in a list of strings. This is like finding the maximum of a list of numbers except that type of the value is different from the type of the compared value (that is, we are comparing string lengths, not strings). We have to write our own function for this comparison. Write a function maxString ::[Char] -> [Char] -> [Char] that return the longest of two strings. Then use a fold function to find the longest string in a list of strings.
Now suppose that we want to write a function findMaxString :: [[Char]] -> [Char] that finds the largest string in a list. We can write this function easily using a fold function and the maxString function from above. We can make the maxStringfunction local to findMaxString using either a where or a let clause. A let clause can be used anywhere an expression can be used. It introduces temporary name bindings. For example, we could write findMaxString using a let as follows.
```
findMaxString :: [[Char]] -> [Char]
findMaxString =
  let
    maxString s t
      | (length s) < (length t) = t
      | otherwise = s
    in foldl maxString ""
```
In this construction, maxString is defined first and then used in the expression following the in. Type this in and try it.
Alternatively, consider the following definition.
```
findMaxString' :: [[Char]] -> [Char]
findMaxString' = foldl maxString ""
  where
    maxString s t
      | (length s) < (length t) = t
      | otherwise = s
```
Here the where introduces a definition that applies in the context immediately preceding it. Type this in and try it. Although there are some subtle differences between these two constructs, they can mostly be used as alternatives to one another.
Note the indentation in the previous examples. We have not mentioned it before, but indentation is important in Haskell because, like Python, Haskell relies on indentation to determine when expressions end. The “golden rule” of Haskell indentation is that code that is part of an expression must be indented further in than the start of the expression. The second rule of indentation is that all grouped expression must be exactly aligned. To make matters even more difficult, Haskell includes non-whitespace as part of the indentation. To illustrate, the following is ok.
```
let x = 4
    y = 7
```
Here the x = 4 is indented from the let and the y = 7 is exactly aligned with it. But the following is not ok.
```
let x = 4
y = 7
```
The second expression is not indented from the let. These are also not ok.
```
let x = 4
  y = 7

let x = 4
      y = 7
```
Here the grouped expressions are not exactly aligned. If you only use spaces for indentation, and you align things nicely, you should be ok. (But you must indent then and else from the if in an if expression, which is unlike imperative languages conventions.) Some Haskell programmers think it is also a good idea to put things like let, where, in, and so on alone on a line. However, if you get a weird compiler message for code that looks perfectly ok, it could be an indentation problem. (Actually, you can use curly braces and semicolons to make everything work, but this is frowned on in the Haskell community.)
Write a function myReverse that uses a fold to reverse a list. Write the argument function as a local function using a let or where.
Yet another higher order function is one that selects or filters a list based on a test function. The built-in filter function has type filter :: (a -> Bool) -> [a] -> [a]. The first argument is a test function (or predicate) applied to every element of a list. If this test function returns True, then the element is part of the result list; otherwise it is tossed out. For example, filter odd [1..10] returns a list of odd numbers between 1 and 10.
Write a function partitionLess :: Ord a => a -> [a] -> [a] that takes a value v and a list t and returns a list of elements of t that are less than or equal to v. Now write a function partitionMore :: Ord a => a -> [a] -> [a] that returns the elements of t greater than v.
Now write quicksort. Now make the partition functions local. Now write quicksort without any helper functions.
We have discussed various types, and sometimes they can be quite complicated. Haskell provides a way to abbreviate types for ease of typing and documentation purposes. The statement type T = <def> can be used to define a new type. For example, type String = [Char] defines a String to be a list of Char. This definition is already built into Haskell, so we could have been doing this all along.
Type definitions can also have parameters. For example, type Pair a = (a,a) defines Pair to be a two-tuple of something (because we used a type variable, the contained type is not constrained). Pair Int would then be an abbreviation for (Int,Int), Pair String an abbreviation for (String,String), etc.
We can get values “out of” a composite data type using pattern matching. For example, the pattern (p, q) would match an instance of Pair, and we can use p and q to refer to the values “inside” the Pair. Note that we have already been using patterns ([] and (x:xs)) to write list-based functions.
Write a function pairToList :: Pair a -> [a] to convert a pair of two items to a list containing those two items.
Write a function pairUp :: Pair [a] -> [Pair a] to convert a pair of lists into a list of pairs. For example, pairUp ([1,2,3],[4,5,6]) should return [(1,4),(2,5),(3,6)].
Function application has high precedence and it is left associative. This can lead to lots of parentheses, such as odd (mod 8 (div 9 3)). The $ operator is right associative, has lower precedence than function application, but means the same as function application (so f $ x means the same as f x). Since we have currying, we can use it to eliminate some parentheses: odd $ mod 8 $ div 9 3.
We can also define new types in Haskell using algebraic data types. For instance, we could define a type that is either a single number or a range using the following declaration.
```
data Thing
  = ANum Int
  | ARange Int Int
  deriving (Show)
```
Here, ANum and ARange are called value constructors and are essentially functions that can be used to create new values of type Thing (What is the type of ANum? What about ARange?). For example, (ANum 5) is a single-number Thing with value 5. Write a function toThing :: Int -> Thing that takes an integer n and returns a new single-number Thing containing n (use the ANum constructor!).
You can use the case structure to pattern-match against algebraic data types. Here is an example of a function that returns the stored integer if it is given a single-number Thing and it returns zero if it is given a ranged Thing.
```
getNum :: Thing -> Int
getNum box = case box of
  ANum x     -> x
  ARange _ _ -> 0
```
Note the underscores indicate anonymous variables; we do not need to give them a name because we do not use them in the definition of getNum.
Write a function middle :: Thing -> Int which takes a Thing and returns the middle number, which for a single number is just that number but for a range is the median of the range.
Data types in Haskell can be recursive, which means that they can contain nested values. Here is an example.
```
data Tree
  = Empty
  | Node Int Tree Tree
  deriving (Show, Eq)
```
In other words, a Tree is either empty or it is a binary tree node that contains an integer and two more (sub) Trees. Here the deriving (Show, Eq) indicates that we would like default definitions provided so that we can call show (for string output) and use the equality operator.

Now we can define a function that generates a binary search tree from a list as follows.

toBST :: [Int] -> Tree
toBST []     = Empty
toBST (x:xs) = Node x (toBST $ filter (<=x) xs)
                      (toBST $ filter (>x)  xs)

Run toBST [1,2,3,4] and observe how its output is different from toBST [4,3,2,1].

Here is an alternate version of toBST that uses a fold:
```
toBST :: [Int] -> Tree
toBST = foldl insertBST Empty
  where
    insertBST Empty x = Node x Empty Empty
    insertBST (Node t l r) x
      | (x <= t)  = (Node t (insertBST l x) r)
      | otherwise = (Node t l (insertBST r x))
```
This is a somewhat complicated function but you should attempt to understand what it is doing. The helper function takes care of inserting a new element into a binary search tree, and once that is defined the overall build operation can be characterized by folding that operation over the list of elements, begining with an empty tree as the initial accumulator. The top-level routine is a point-free definition; the list parameter is not mentioned explicitly.
Write a function find :: Int -> Tree -> Bool that takes an integer and a binary search tree and returns True or False indicating whether or not the integer is present in the tree. This routine should run in O(log n) time if the tree is balanced, and should be considerably simpler than toBST.
Write a function sorted :: Tree -> [Int] that takes a binary search tree and returns a sorted list of integers in the tree.
Write a function countNodes :: Tree -> Int that takes a binary search tree and returns the number of integer nodes in the tree.
Write a function countNonLeaves :: Tree -> Int that takes a binary search tree and returns the number of non-leaf nodes in the tree.

This lab was originally written by Dr. Chris Fox and expanded by Dr. Mike Lam.

JMU CS 430 Spring 2021

Programming Languages

Module 12 Lab