Tuesday, August 14, 2012

Haskell Chapter 1.2.2

Your first assignment pt1: Create a Wizard


    When you install a piece of software for the first time, it often asks you a few basic questions so that it can set itself up correctly. As you probably already know, that sequence of questions is called a wizard.
 
     Wizards are an excellent example of what software really does. A wizard asks you for information, takes that information and processes it. Then, the wizard spits out a result. All the while, the inner workings of the wizard are invisible to the user. Anyone can use a wizard because a wizard uses natural language; in our case English. This allows the programmer to create algorithms of arbitrary complexity which are still accessible to anyone with basic communication skills and perhaps some arithmetic. This is what all software should ideally be: an intuitive user interface that gathers data, invisibly processes it with unfettered complexity, and returns the result in terms that the user can understand without knowing how the program works.

    This is not so easy to do in Haskell; a weakness if ever there was one. Still, Haskell can do so many other things, so well and with so little effort, that we just have to bite the bullet here! 

    The perfect analogy for the wizard is the algebraic function. If you don't remember this, or never learned it, you need to stop here and seek out a good introductory algebra textbook. A few years from now I may have one written for you (eh, just go check one out from the library)!

    OK, now that you're back, let's continue with the algebraic function. A function is the first program anyone learns how to write. Long before learning to code, we learn to write functions, and a function does about half the work of the wizard we are attempting to write. In this section, we will create a function along with a set of instructions and save them as a '.hs' file. We call this file a 'script' because it contains not only functions, but also the general order in which the functions are to be performed. Whereas a module can be a very short '.hs' file consisting of only one line, a script is a collection of functions and instructions spanning multiple lines of code. Here's the real-world scenario which we want to model as a mathematical function:
  1. Start by finding a classmate who is either an only child or a firstborn child.
  2. Ask the user (your classmate) how old she is. For demonstration purposes, we'll say that she says 20 and you write that down in your notes. 
  3. Let x = your classmate's age. In our example, x = 20.
  4. Next, you tell her that you will guess her mom's age within +/- 5 years. 
  5. In America, the average new mother is about 25 years old. So, we can extrapolate that your classmate's mother was probably close to that age when she gave birth. On your notepad, add 25 to the classmate's age (x).
  6. Finally, perform the calculation (x + 25) and speak the result to your classmate: "Your mother is most likely 45, plus or minus 5 years."
     There's a pretty good chance you'll be right. Maybe not. The point is, you just made a function which (if you remember algebra class) can be written like this:

function(x) = x + 25

The anatomy of a function can be divided into five parts:

  • First we have the name of the function: 'function' -- a rather boring name.
  • Second we have the input of the function: '(x)'
  • Next we have the declaration of equality, which signals the end of the arguments section: '='
  • Then we have the definition, which explains how the function works: 'x + 25' -- in a larger function this can run to many lines.
  • Lastly we have the return value. It is not usually listed in the function itself, but rather it is inferred from the definition. In non-conditional functions, the return value is usually the result of the last line of the function.
    In Haskell we don't use parentheses (parens) around function input; in this case 'x'. Haskell treats whatever follows a function call as the input (also called the argument). We also don't need to actually use the word 'function'. This is a relief. It allows us to give the function a name which helps us separate one function from the countless others your programs may have. Using Haskell notation, we'll give it a nice descriptive title: 'momsAge'. In Haskell-ese we write the above function like this:

momsAge x = x + 25

Not much of a difference there, so we'll move on.
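
If you want to try the function right away, here is a minimal sketch of a complete file. The type signature, the choice of Int, and the file name are my assumptions; Haskell can infer the types without them. The comments map each piece back to the five parts listed above.

```haskell
-- momsAge.hs (hypothetical file name)
momsAge :: Int -> Int   -- optional type signature: takes an Int, returns an Int
momsAge x = x + 25
-- name: momsAge | input: x | '=' ends the arguments | definition: x + 25
-- return value: whatever x + 25 evaluates to

main :: IO ()
main = print (momsAge 20)   -- prints 45
```

Save it and run it with 'runghc momsAge.hs', or load it in GHCi and type 'momsAge 20'.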

     OK, but wait, where's all the stuff where I asked her how old she was and took the notes and then spoke the answer to her? That's all IO. You will no doubt remember that none of your math classes covered anything like that. As it turns out, simple stuff like asking a classmate a question is really hard to describe in mathematical terms. Haskell's way of doing it is to use an abstract tool called a monad, which we won't talk about for a while. All that matters is that monads are what allow IO in Haskell and, more to the point, IO is what lets a Haskell program do anything you can actually see. By learning Haskell IO from the start, you are learning a little about monads from the get-go; for now you don't need to know much more than that, other than this: monads are nothing to be afraid of.

     Forget about monads for now. Haskell IO is really composed of a mix of tools and structure. The tools consist of a set of built-in operations which Haskell calls 'IO actions' or just 'actions' for short. In most other languages these actions would be just another primitive operation like plus, minus or divide. But in Haskell, they are quarantined from the rest of the language under the rubric: 'actions'.

    Common actions are: 'print', 'getLine', and 'putStrLn'. Of course, there are many others. Then there is the IO structure, which is really what makes IO in Haskell so confusing for the uninitiated.
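
As a rough sketch of what these three actions do, the comments below give the types the Prelude assigns them, and a tiny main exercises the two output actions. getLine is only described here, not called, so the sketch runs without any typing.

```haskell
-- putStrLn :: String -> IO ()        writes a String followed by a newline
-- print    :: Show a => a -> IO ()   writes any showable value
-- getLine  :: IO String              reads one line of input as a String

main :: IO ()
main = do
  putStrLn "Hello World"   -- prints Hello World
  print (20 + 25)          -- prints 45
  print "Hello World"      -- prints "Hello World" with the quotes, because print uses show
```

The last two lines show the practical difference: putStrLn is for text you want displayed as-is, while print is for values such as numbers.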

    The Haskell IO structure is dominated by rules on top of rules. In good dynamic languages, IO is so easy that the pupil learns it instinctively with only a few syntactic definitions from the author. With Haskell, you need to make sure that your code is laid out according to a merciless set of rules. The rules are extensive and they don't come naturally. There are aspects that don't really make sense at first. As I gain insight, I will add posts on the subject of Haskell IO in order to create a methodical approach. The idea is that once you understand the rules, you should be able to synthesize your own code with impunity. 'Showing you how' just does not work with Haskell IO. Now, if I haven't scared you away yet, here's the core set of rules:

Haskell IO Rules of the Road


1. All IO scripts need a 'main' function, and in practice it is almost always written as "main = do". It acts as a spine around which the other functions of the script are attached. Think of main in C/C++. Haskell requires 'main' as the entry point of a program; the 'do' is what lets it run several actions in sequence. In this way, main becomes the entry point for the program but also the 'director' of the other functions. In other words: always start your IO script with 'main = do'.

2. Remember to give each separate step in your function its own carefully indented line, with every step starting in the same column. The exception here is 'do'. Put 'do' on the same line as 'main', which itself is not indented.

3. Create separate functions for standard input (outside of 'main'), then bind their results to a variable within main using '<-'. As a habit, avoid putting standard input commands directly within "main = do". Haskell does allow input actions such as 'getLine' directly in main, just as it allows output commands like ' putStrLn "Hello World" ', but keeping each question in its own function keeps main short and readable. You then bind the result of the separate input function within main for use by later functions. This means that you will have multiple separate functions for even the simplest of IO scripts; weaving in and out of 'main' continuously.

4. Use '<-' only for binding the results of IO functions. Any other (non-IO) binding can (and should) be performed with 'let x = n' syntax. This is also true when storing the output of a (non-IO) function in a local variable, like 'let a = momsAge 9'. Be careful: a function can be given a second name in exactly the same way, as in 'let a = momsAge', so leaving off the argument does not fail on that line; 'a' is now a function rather than a number, and the error shows up later. You will find that a large portion of your compiler errors come from missing arguments!

5. Separate non-IO functions from all 'do' functions. Any mathematical operation or algorithm should be kept apart from IO functions like 'main'. You can still call those functions from within 'main'. This makes your programs easier to read and maintain, and it also makes compile-time type errors much easier to track down!

6. Use 'function = do' (do notation) for all functions that have IO actions in them. Get used to the fact that you will see many, many instances of 'do' in large programs!

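7. Every 'do' block must end with an expression, normally an action; even if that action is only a print statement or other onscreen command. It cannot end with a '<-' binding or a 'let'. The result of that last action becomes the result of the whole block. This is especially true of 'main'.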

8. Use 'print $ function x' to print the output value of a function (which itself has no IO actions) onscreen. You'll learn what the $ sign means later. For now, just think of it as the second part of this command with 'print' being the first.

9. Always be aware of the type of data you pass to an operator. Haskell is statically typed, unlike dynamic languages such as Python or JavaScript. This takes a lot of getting used to. Most of the mistakes newbies make when writing their first IO programs come from type conflicts. If Haskell catches you trying to use an Integer operation (like multiplication) on a piece of data that was input as a String, it will refuse to compile the program! This happens all the time with numbers (the same digits can be held as an Integer or as a String, and the two are not interchangeable). The getLine function only returns Strings, regardless of the input you give it; which leads us to rule #10.

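10. The read command is your best friend! If you bring in a value (a number) with getLine, and you need to perform an arithmetical operation on it, use read to produce a numeric version of it (an Int, for example) and perform the operation on that. Haskell must be able to tell which numeric type you want, either from how the result is used or from an annotation such as ':: Int'. Remember that read does not change the original: the variable you got from getLine is still a String, so give the converted number its own name if you need it again.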
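
Rules 9 and 10 can be sketched without any keyboard input. In the sketch below, 'typed' is a made-up stand-in for a String that getLine would have returned, and Int is my choice of numeric type. The String that came in is never altered; read hands back a separate Int value.

```haskell
momsAge :: Int -> Int
momsAge x = x + 25

main :: IO ()
main = do
  let typed = "20"               -- stand-in for a String returned by getLine
  -- print (momsAge typed)       -- would not compile: typed is a String, not an Int
  let age = read typed :: Int    -- read builds an Int from the String
  print (momsAge age)            -- prints 45
  putStrLn typed                 -- prints 20; typed is still a String
```

One caution: read fails at run time if the text is not a valid number, so 'read "twenty" :: Int' compiles but stops the program with an error when it runs.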

     In part 2, we'll look at the actual Haskell code used to automate the momsAge function from beginning to end. In every section where one of the 10 IO rules is implemented, I will mark the applied rule and how it interacts with the structure of the overall module.
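
As a rough preview of how the ten rules might fit together in one script, here is a minimal sketch. It is not the finished code from part 2; the names askAge and wizard.hs are placeholders, and it assumes the user types a whole number.

```haskell
-- wizard.hs (hypothetical file name): a preview sketch only.

-- Rule 5: the arithmetic lives in its own non-IO function.
momsAge :: Int -> Int
momsAge x = x + 25

-- Rules 3 and 6: a separate 'do' function handles the input.
askAge :: IO String
askAge = do
  putStrLn "How old are you?"
  getLine                         -- Rule 7: the block ends with an action

-- Rules 1 and 2: 'main = do' is the spine, one step per line.
main :: IO ()
main = do
  answer <- askAge                -- Rule 4: '<-' binds the result of an IO function
  let age = read answer :: Int    -- Rule 4 ('let' for non-IO values), rules 9 and 10 (read)
  putStrLn "Your mother is most likely this age, plus or minus 5 years:"
  print $ momsAge age             -- Rule 8, and the final action of the block (rule 7)
```

Typing 20 at the prompt should print 45 as the last line.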

