Haskell/Next steps

From testwiki
Jump to navigation Jump to search

Template:Haskell minitoc

Haskell files

Up to now, we've made heavy use of the GHC interpreter. The interpreter is indeed a useful tool for trying things out quickly and for debugging your code. But we're getting to the point where typing everything directly into the interpreter isn't very practical. So now, we'll be writing our first Haskell source files.

Open up a file Varfun.hs in your favourite text editor (the hs stands for Haskell) and paste the following definition in. Haskell uses indentations and spaces to decide where functions (and other things) begin and end, so make sure there are no leading spaces and that indentations are correct, otherwise GHC will report parse errors.

area r = pi * r^2

(In case you're wondering, pi is actually predefined in Haskell, no need to include it here). Now change into the directory where you saved your file, open up ghci, and use :load (or :l for short):

Prelude> :load Varfun.hs
Compiling Main             ( Varfun.hs, interpreted )
Ok, modules loaded: Main.
*Main> 

If ghci gives an error, "Could not find module 'Varfun.hs'", then use :cd to change the current directory to the directory containing Varfun.hs:

Prelude> :cd c:\myDirectory
Prelude> :load Varfun.hs
Compiling Main             ( Varfun.hs, interpreted )
Ok, modules loaded: Main.
*Main> 

Now you can execute the bindings found in that file:

*Main> area 5
78.53981633974483

If you make changes to the file, just use :reload (:r for short) to reload the file.

Template:Body note

You'll note that there are a couple of differences between how we do things when we type them directly into ghci, and how we do them when we load them from files. The differences may seem awfully arbitrary for now, but they're actually quite sensible consequences of the scope, which, rest assured, we will explain later.

No let

For starters, you no longer say something like

let x = 3
let y = 2
let area r = pi * r ^ 2

Instead, you say things like

x = 3
y = 2
area r = pi * r ^ 2

The keyword let is actually something you use a lot in Haskell, but not exactly in this context. We'll see further on in this chapter when we discuss the use of let bindings.

You can't define the same thing twice

Previously, the interpreter cheerfully allowed us to write something like this

Prelude> let r = 5
Prelude> r
5
Prelude> let r = 2
Prelude> r
2

On the other hand, writing something like this in a source file does not work

-- this does not work
r = 5
r = 2

As we mentioned above, variables do not change, and this is even more the case when you're working in a source file. This has one very nice implication. It means that:

Order does not matter

The order in which you declare things does not matter. For example, the following fragments of code do exactly the same thing:

 y = x * 2
 x = 3
 x = 3
 y = x * 2

This is a unique feature of Haskell and other functional programming languages. The fact that variables never change means that we can opt to write things in any order that we want (but this is also why you can't declare something more than once... it would be ambiguous otherwise).

Template:Exercises

More about functions

Working with actual source code files instead of typing things into the interpreter makes things convenient to define much more substantial functions than those we've seen up to now. Let's flex some Haskell muscle here and examine the kinds of things we can do with our functions.

Conditional expressions

if / then / else

Haskell supports standard conditional expressions. For instance, we could define a function that returns 1 if its argument is less than 0; 0 if its argument is 0; and 1 if its argument is greater than 0 (this is called the signum function). Actually, such a function already exists, but let's define one of our own, what we'll call mySignum.

mySignum x =
    if x < 0
      then -1
      else if x > 0
        then 1
        else 0

You can experiment with this as:

Template:HaskellExample Note that the parenthesis around "-1" in the last example are required; if missing, the system will think you are trying to subtract the value "1" from the value "mySignum," which is ill-typed.

The if/then/else construct in Haskell is very similar to that of most other programming languages; however, you must have both a then and an else clause. It evaluates the condition (in this case x<0 and, if this evaluates to True, it evaluates the then condition; if the condition evaluated to False, it evaluates the else condition).

You can test this program by editing the file and loading it back into your interpreter. Instead of typing :l Varfun.hs again, you can simply type :reload or just :r to reload the current file. This is usually much faster.

case

Haskell, like many other languages, also supports case constructions. These are used when there are multiple values that you want to check against (case expressions are actually quite a bit more powerful than this -- see the [[../Pattern matching/]] chapter for all of the details).

Suppose we wanted to define a function that had a value of 1 if its argument were 0; a value of 5 if its argument were 1; a value of 2 if its argument were 2; and a value of 1 in all other instances. Writing this function using if statements would be long and very unreadable; so we write it using a case statement as follows (we call this function f):

f x =
    case x of
      0 -> 1
      1 -> 5
      2 -> 2
      _ -> -1

In this program, we're defining f to take an argument x and then inspecting the value of x. If it matches 0, the value of f is 1. If it matches 1, the value of f is 5. If it matches 2, then the value of f is 2; and if it hasn't matched anything by that point, the value of f is 1 (the underscore can be thought of as a "wildcard" -- it will match anything).

The indentation here is important. Haskell uses a system called "layout" to structure its code (the programming language Python uses a similar system). The layout system allows you to write code without the explicit semicolons and braces that other languages like C and Java require.

Template:Warning

Indentation

The general rule for layout is that an open-brace is inserted after the keywords where, let, do and of, and the column position at which the next command appears is remembered. From then on, a semicolon is inserted before every new line that is indented the same amount. If a following line is indented less, a close-brace is inserted. This may sound complicated, but if you follow the general rule of indenting after each of those keywords, you'll never have to remember it (see the [[../Indentation/]] chapter for a more complete discussion of layout).

Some people prefer not to use layout and write the braces and semicolons explicitly. This is perfectly acceptable. In this style, the above function might look like:

f x = case x of
        { 0 -> 1 ; 1 -> 5 ; 2 -> 2 ; _ -> -1 }

Of course, if you write the braces and semicolons explicitly, you're free to structure the code as you wish. The following is also equally valid:

f x =
    case x of { 0 -> 1 ;
      1 -> 5 ; 2 -> 2
   ; _ -> -1 }

However, structuring your code like this only serves to make it unreadable (in this case).

Defining one function for different parameters

Functions can also be defined piece-wise, meaning that you can write one version of your function for certain parameters and then another version for other parameters. For instance, the above function f could also be written as:

f 0 = 1
f 1 = 5
f 2 = 2
f _ = -1

Here, the order is important. If we had put the last line first, it would have matched every argument, and f would return -1, regardless of its argument (most compilers will warn you about this, though, saying something about overlapping patterns). If we had not included this last line, f would produce an error if anything other than 0, 1 or 2 were applied to it (most compilers will warn you about this, too, saying something about incomplete patterns). This style of piece-wise definition is very popular and will be used quite frequently throughout this tutorial. These two definitions of f are actually equivalent -- this piece-wise version is translated into the case expression.

Function composition

More complicated functions can be built from simpler functions using function composition. Function composition is simply taking the result of the application of one function and using that as an argument for another. We've already seen this back in the [[../Getting set up/]] chapter, when we wrote 5*4+3. In this, we were evaluating 5*4 and then applying +3 to the result. We can do the same thing with our square and f functions:

square x = x^2

Template:HaskellExample The result of each of these function applications is fairly straightforward. The parentheses around the inner function are necessary; otherwise, in the first line, the interpreter would think that you were trying to get the value of square f, which has no meaning. Function application like this is fairly standard in most programming languages. There is another, more mathematical way to express function composition: the (.) enclosed period function. This (.) function is modeled after the () operator in mathematics.

Template:Body note

The (.) function (called the function composition function), takes two functions and makes them into one. For instance, if we write (square . f), this means that it creates a new function that takes an argument, applies f to that argument and then applies square to the result. Conversely, (f . square) means that a new function is created that takes an argument, applies square to that argument and then applies f to the result. We can see this by testing it as before:

Template:HaskellExample Here, we must enclose the function composition in parentheses; otherwise, the Haskell compiler will think we're trying to compose square with the value f 1 in the first line, which makes no sense since f 1 isn't even a function.

It would probably be wise to take a little time-out to look at some of the functions that are defined in the Prelude. Undoubtedly, at some point, you will accidentally rewrite some already-existing function (I've done it more times than I can count), but if we can keep this to a minimum, that would save a lot of time.

Let Bindings

Often we wish to provide local declarations for use in our functions. For instance, if you remember back to your grade school mathematics courses, the following equation is used to find the roots (zeros) of a polynomial of the form ax2+bx+c=0: x=(b±b24ac)/2a. We could write the following function to compute the two values of x:

roots a b c =
    ((-b + sqrt(b*b - 4*a*c)) / (2*a),
     (-b - sqrt(b*b - 4*a*c)) / (2*a))

Notice that our definition here has a bit of redundancy. It is not quite as nice as the mathematical definition because we have needlessly repeated the code for sqrt(b*b - 4*a*c). To remedy this problem, Haskell allows for local bindings. That is, we can create values inside of a function that only that function can see. For instance, we could create a local binding for sqrt(b*b-4*a*c) and call it, say, disc and then use that in both places where sqrt(b*b - 4*a*c) occurred. We can do this using a let/in declaration:

roots a b c =
    let disc = sqrt (b*b - 4*a*c)
    in  ((-b + disc) / (2*a),
         (-b - disc) / (2*a))

In fact, you can provide multiple declarations inside a let. Just make sure they're indented the same amount, or you will have layout problems:

roots a b c =
    let disc = sqrt (b*b - 4*a*c)
        twice_a = 2*a
    in  ((-b + disc) / twice_a,
         (-b - disc) / twice_a)


Template:Haskell navigation

Template:Auto category