Chapter 1: Introduction
When personal computers were first introduced, most of them came equipped with a simple programming language, usually a variant of BASIC. Interacting with the computer was closely integrated with this language, and thus every computer-user, whether he wanted to or not, would get a taste of it. Now that computers have become plentiful and cheap, typical users don't get much further than clicking things with a mouse. For most people, this works very well. But for those of us with a natural inclination towards technological tinkering, the removal of programming from every-day computer use presents something of a barrier.
Fortunately, as an effect of developments in the World Wide Web, it so happens that every computer equipped with a modern web-browser also has an environment for programming JavaScript. In today's spirit of not bothering the user with technical details, it is kept well hidden, but a web-page can make it accessible, and use it as a platform for learning to program.
That is what this (hyper-)book tries to do.
I do not enlighten those who are not eager to learn, nor arouse those who are not anxious to give an explanation themselves. If I have presented one corner of the square and they cannot come back to me with the other three, I should not go over the points again.
― Confucius
Besides explaining JavaScript, this book tries to be an introduction to the basic principles of programming. Programming, it turns out, is hard. The fundamental rules are, most of the time, simple and clear. But programs, while built on top of these basic rules, tend to become complex enough to introduce their own rules, their own complexity. Because of this, programming is rarely simple or predictable. As Donald Knuth, who is something of a founding father of the field, says, it is an art.
To get something out of this book, more than just passive reading is required. Try to stay sharp, make an effort to solve the exercises, and only continue on when you are reasonably sure you understand the material that came before.
The computer programmer is a creator of universes for which he alone is responsible. Universes of virtually unlimited complexity can be created in the form of computer programs.
― Joseph Weizenbaum, Computer Power and Human Reason
A program is many things. It is a piece of text typed by a programmer, it is the directing force that makes the computer do what it does, it is data in the computer's memory, yet it controls the actions performed on this same memory. Analogies that try to compare programs to objects we are familiar with tend to fall short, but a superficially fitting one is that of a machine. The gears of a mechanical watch fit together ingeniously, and if the watchmaker was any good, it will accurately show the time for many years. The elements of a program fit together in a similar way, and if the programmer knows what he is doing, the program will run without crashing.
A computer is a machine built to act as a host for these immaterial machines. Computers themselves can only do stupidly straightforward things. The reason they are so useful is that they do these things at an incredibly high speed. A program can, by ingeniously combining many of these simple actions, do very complicated things.
To some of us, writing computer programs is a fascinating game. A program is a building of thought. It is costless to build, weightless, growing easily under our typing hands. If we get carried away, its size and complexity will grow out of control, confusing even the one who created it. This is the main problem of programming. It is why so much of today's software tends to crash, fail, screw up.
When a program works, it is beautiful. The art of programming is the skill of controlling complexity. The great program is subdued, made simple in its complexity.
Today, many programmers believe that this complexity is best managed by using only a small set of well-understood techniques in their programs. They have composed strict rules about the form programs should have, and the more zealous among them will denounce those who break these rules as bad programmers.
What hostility to the richness of programming! To try to reduce it to something straightforward and predictable, to place a taboo on all the weird and beautiful programs. The landscape of programming techniques is enormous, fascinating in its diversity, still largely unexplored. It is certainly littered with traps and snares, luring the inexperienced programmer into all kinds of horrible mistakes, but that only means you should proceed with caution, keep your wits about you. As you learn, there will always be new challenges, new territory to explore. The programmer who refuses to keep exploring will surely stagnate, forget his joy, lose the will to program (and become a manager).
As far as I am concerned, the definite criterion for a program is whether it is correct. Efficiency, clarity, and size are also important, but how to balance these against each other is always a matter of judgement, a judgement that each programmer must make for himself. Rules of thumb are useful, but one should never be afraid to break them.
In the beginning, at the birth of computing, there were no programming languages. Programs looked something like this:
00110001 00000000 00000000
00110001 00000001 00000001
00110011 00000001 00000010
01010001 00001011 00000010
00100010 00000010 00001000
01000011 00000001 00000000
01000001 00000001 00000001
00010000 00000010 00000000
01100010 00000000 00000000
That is a program to add the numbers from one to ten together, and print out the result (1 + 2 + ... + 10 = 55). It could run on a very simple kind of computer. To program early computers, it was necessary to set large arrays of switches in the right position, or punch holes in strips of cardboard and feed them to the computer. You can imagine how this was a tedious, error-prone procedure. Even the writing of simple programs required much cleverness and discipline; complex ones were nearly inconceivable.
Of course, manually entering these arcane patterns of bits (which is what the 1s and 0s above are generally called) did give the programmer a profound sense of being a mighty wizard. And that has to be worth something, in terms of job satisfaction.
Each line of the program contains a single instruction. It could be written in English like this:
1. Store the number 0 in memory location 0
2. Store the number 1 in memory location 1
3. Store the value of memory location 1 in memory location 2
4. Subtract the number 11 from the value in memory location 2
5. If the value in memory location 2 is the number 0, continue with instruction 9
6. Add the value of memory location 1 to memory location 0
7. Add the number 1 to the value of memory location 1
8. Continue with instruction 3
9. Output the value of memory location 0
While that is more readable than the binary soup, it is still rather unpleasant. It might help to use names instead of numbers for the instructions and memory locations:
  Set 'total' to 0
  Set 'count' to 1
[loop]
  Set 'compare' to 'count'
  Subtract 11 from 'compare'
  If 'compare' is zero, continue at [end]
  Add 'count' to 'total'
  Add 1 to 'count'
  Continue at [loop]
[end]
  Output 'total'
At this point it is not too hard to see how the program works. Can
you? The first two lines give two memory locations their starting
values: total will be used to build up the result of the program,
and count keeps track of the number that we are currently looking
at. The lines using compare are probably the weirdest ones. What the
program wants to do is see if count is equal to 11, in order to
decide whether it can stop yet. Because the machine is so primitive,
it can only test whether a number is zero, and make a decision (jump)
based on that. So it uses the memory location labelled compare to
compute the value of count - 11, and makes a decision based on that
value. The next two lines add the value of count to the result, and
increment count by one every time the program has decided that it is
not 11 yet.
Here is the same program in JavaScript:
var total = 0, count = 1;
while (count <= 10) {
  total += count;
  count += 1;
}
print(total);
This gives us a few more improvements. Most importantly, there is no
need to specify the way we want the program to jump back and forth
anymore. The magic word while takes care of that. It continues
executing the lines below it as long as the condition it was given
holds: count <= 10, which means 'count is less than or equal
to 10'. Apparently, there is no need anymore to create a temporary
value and compare that to zero. This was a stupid little detail, and
the power of programming languages is that they take care of stupid
little details for us.
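To make the contrast concrete, the jump-based version can be imitated in JavaScript by keeping track of the current instruction by hand. This is only a sketch: the instruction variable, the switch dispatch, and the function name runPrimitiveMachine are illustrative scaffolding that does not appear in the original program, and console.log is assumed as a stand-in for the print used in this book's console.

```javascript
// Imitates the nine-instruction program, including the 'compare to zero' trick.
// The instruction numbers match the English listing (1 to 9).
function runPrimitiveMachine() {
  var total = 0;        // instruction 1
  var count = 1;        // instruction 2
  var compare = 0;
  var instruction = 3;  // hypothetical 'program counter'
  while (true) {
    switch (instruction) {
      case 3: compare = count; instruction = 4; break;
      case 4: compare -= 11; instruction = 5; break;
      // The only test this machine has: is a value zero?
      case 5: instruction = (compare === 0) ? 9 : 6; break;
      case 6: total += count; instruction = 7; break;
      case 7: count += 1; instruction = 8; break;
      case 8: instruction = 3; break;   // jump back to [loop]
      case 9: return total;             // [end]
    }
  }
}

console.log(runPrimitiveMachine()); // 55
```

All the bookkeeping around instruction is exactly what while does for us behind the scenes.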
Finally, here is what the program could look like if we happened to
have the convenient operations range and sum available, which
respectively create a collection of numbers within a range and compute
the sum of a collection of numbers:
print(sum(range(1, 10)));
The moral of this story, then, is that the same program can be
expressed in long and short, unreadable and readable ways. The first
version of the program was extremely obscure, while this last one is
almost English: print the sum of the range of numbers from 1
to 10. (We will see in later chapters how to build things like sum
and range.)
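As a preview, here is one possible way such building blocks could be written. This is a minimal sketch under assumptions, not the definitions this book arrives at later: it assumes range includes both of its end points, it uses an array as the collection, and it uses console.log in place of the console's print.

```javascript
// Hypothetical definitions, for illustration only.

// Produce the whole numbers from start up to and including end.
function range(start, end) {
  var numbers = [];
  for (var n = start; n <= end; n++) {
    numbers.push(n);
  }
  return numbers;
}

// Add up all the numbers in a collection.
function sum(numbers) {
  var total = 0;
  for (var i = 0; i < numbers.length; i++) {
    total += numbers[i];
  }
  return total;
}

console.log(sum(range(1, 10))); // 55
```

The loop has not disappeared; it has moved inside the building blocks, where it only has to be written once.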
A good programming language helps the programmer by providing a more
abstract way to express himself. It hides uninteresting details,
provides convenient building blocks (such as the while construct),
and, most of the time, allows the programmer to add building blocks
himself (such as the sum and range operations).
JavaScript is the language that is, at the moment, mostly being used to do all kinds of clever and horrible things with pages on the World Wide Web. Some people claim that the next version of JavaScript will become an important language for other tasks too. I am unsure whether that will happen, but if you are interested in programming, JavaScript is definitely a useful language to learn. Even if you do not end up doing much web-programming, the mind-bending programs I will show you in this book will always stay with you, haunt you, and influence the programs you write in other languages.
There are those who will say terrible things about JavaScript. Many of these things are true. When I was first required to write something in JavaScript, I quickly came to despise the language. It would accept almost anything I typed, but interpret it in a way that was completely different from what I meant. This had a lot to do with the fact that I did not have a clue what I was doing, but there is also a real issue here: JavaScript is ridiculously liberal in what it allows. The idea behind this design was that it would make programming in JavaScript easier for beginners. In actuality, it mostly makes finding problems in your programs harder, because the system will not point them out to you.
However, the flexibility of the language is also an advantage. It leaves space for a lot of techniques that are impossible in more rigid languages, and it can be used to overcome some of JavaScript's shortcomings. After learning it properly, and working with it for a while, I have really learned to like this language.
Contrary to what the name suggests, JavaScript has very little to do with the programming language named Java. The similar name was inspired by marketing considerations, rather than good judgement. In 1995, when JavaScript was introduced by Netscape, the Java language was being heavily marketed and gaining in popularity. Apparently, someone thought it a good idea to try and ride along on this marketing. Now we are stuck with the name.
Related to JavaScript is a thing called ECMAScript. When browsers other than Netscape started to support JavaScript, or something that looked like it, a document was written to describe precisely how the language should work. The language described in this document is called ECMAScript, after the organisation that standardised it.
ECMAScript describes a general-purpose programming language, and does not say anything about the integration of this language in an Internet browser. JavaScript is ECMAScript plus extra tools for dealing with Internet pages and browser windows.
A few other pieces of software use the language described in the ECMAScript document. Most importantly, the ActionScript language used by Flash is based on ECMAScript (though it does not precisely follow the standard). Flash is a system for adding things that move and make lots of noise to web-pages. Knowing JavaScript won't hurt if you ever find yourself learning to build Flash movies.
At the time I am writing this, people are working on a thing called ECMAScript 4. This is a new version of the ECMAScript language, which adds a number of new features. You should not worry too much about this new version making the things you learn from this book obsolete. For one thing, ECMAScript 4 will mostly be an extension of the language we have now, so almost everything written in this book will still hold. On top of that, it will most likely take quite a while before all major browsers add these new features to their JavaScript systems, and until they do, ECMAScript 4 won't be very practical for web-programming.
Most chapters in this book contain quite a lot of code[1]. In my experience, reading and writing code are an important part of learning to program. Try to not just glance over these examples, but read them attentively and understand them. This can be slow and confusing at first, but you will quickly get the hang of it. The same goes for the exercises. Don't assume you understand them until you've actually written a working solution.
Because of the way the web works, it is always possible to look at the JavaScript programs that people put in their web-pages. This can be a good way to learn how some things are done. Because most web programmers are not 'professional' programmers, or consider JavaScript programming so uninteresting that they never properly learned it, a lot of the code you can find like this is of a very bad quality. When learning from ugly or incorrect code, the ugliness and confusion will propagate into your own code, so be careful who you learn from.
To allow you to try out programs, both the examples and the code you write yourself, this book makes use of something called a console. If you are using a modern graphical browser (Internet Explorer version 6 or higher, Firefox 1.5 or higher, Opera 9 or higher, Safari 3 or higher), the pages in this book will show a bar at the bottom of your screen. You can open the console by clicking on the little arrow on the far right of this bar.
The console contains three important elements. There is the output
window, which is used to show error messages and things that programs
print out. Below that, there is a line where you can type in a piece
of JavaScript. Try typing in a number, and pressing the enter key to
run what you typed. If the text you typed produced something
meaningful, it will be shown in the output window. Now try typing
wrong!, and press enter again. The output window will show an error
message. You can use the arrow-up and arrow-down keys to go back to
previous commands that you typed.
For bigger pieces of code, those that span multiple lines and which you want to keep around for a while, the field on the right can be used. The 'Run' button is used to execute programs written in this field. It is possible to have multiple programs open at the same time. Use the 'New' and 'Load' buttons to add a new program (empty or from a file on the web). When there is more than one open program, the menu next to the 'Run' button can be used to choose which one is being shown. The 'Close' button, as you might expect, closes a program.
Example programs in this book always have a small button with an arrow in their top-right corner, which can be used to run them. The example we saw earlier looked like this:
var total = 0, count = 1;
while (count <= 10) {
  total += count;
  count += 1;
}
print(total);
Run it by clicking the arrow. There is also another button, which is
used to load the program into the console. Do not hesitate to modify
it and try out the result. The worst that could happen is that you
create an endless loop. An endless loop is what you get when the
condition of the while never becomes false, for example if you
choose to add 0 instead of 1 to the count variable. Now the
program will run forever.
Fortunately, browsers keep an eye on the programs running inside them. Whenever one of them is taking suspiciously long to finish, they will ask you if you want to cut it off.
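If you would rather see the endless loop without waiting for the browser to step in, the mistake can be tried with a safety cap. The sketch below is an illustration under assumptions: the steps and maxSteps variables are hypothetical additions that are not part of the original program, and console.log stands in for print.

```javascript
var total = 0, count = 1;
var steps = 0, maxSteps = 1000; // hypothetical guard against running forever

while (count <= 10) {
  total += count;
  count += 0;        // the mistake: count never changes, so count <= 10 stays true
  steps += 1;
  if (steps >= maxSteps) {
    break;           // give up instead of looping endlessly
  }
}

console.log(steps); // 1000
console.log(count); // 1
```

Without the guard, the condition would simply be checked again and again, always with the same outcome.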
In some later chapters, we will build example programs that consist of many blocks of code. Often, you have to run every one of them for the program to work. As you may have noticed, the arrow in a block of code turns purple after the block has been run. When reading a chapter, try to run every block of code you come across, especially those that 'define' something new (you will see what that means in the next chapter).
It is, of course, possible that you can not read a chapter in one sitting. This means you will have to start halfway when you continue reading, but if you don't run all the code starting from the top of the chapter, some things might not work. By holding the shift key while pressing the 'run' arrow on a block of code, all blocks before that one will be run as well, so when you start in the middle of a chapter, hold shift the first time you run a piece of code, and everything should work as expected.
Finally, the little face in the top-left corner of your screen can be used to send me, the author, a message. If you have a comment, or you find a passage ridiculously confusing, or you just spot a spelling error, tell me about it. Sending a message can be done without leaving the page, so it won't interrupt your reading.
[1] 'Code' is the substance that programs are made of. Every piece of a program, whether it is a single line or the whole thing, can be referred to as 'code'.
Chapter 2: Basic JavaScript: values, variables, and control flow
Inside the computer's world, there is only data. That which is not data, does not exist. Although all data is in essence just a sequence of bits[1], and is thus fundamentally alike, every piece of data plays its own role. In JavaScript's system, most of this data is neatly separated into things called values. Every value has a type, which determines the kind of role it can play. There are six basic types of values: numbers, strings, booleans, objects, functions, and undefined values.
To create a value, one must merely invoke its name. This is very convenient. You don't have to gather building material for your values, or pay for them, you just call for one and woosh, you have it. They are not created from thin air, of course. Every value has to be stored somewhere, and if you want to use a gigantic amount of them at the same time you might run out of computer memory. Fortunately, this is only a problem if you need them all simultaneously. As soon as you no longer use a value, it will dissipate, leaving behind only a few bits. These bits are recycled to make the next generation of values.
Values of the type number are, as you might have deduced, numeric values. They are written as numbers usually are:
144
Enter that in the console, and the same thing is printed in the output window. The text you typed in gave rise to a number value, and the console took this number and wrote it out to the screen again. In a case like this, that was a rather pointless exercise, but soon we will be producing values in less straightforward ways, and it can be useful to 'try them out' on the console to see what they produce.
This is what 144 looks like in bits[2]:
0100000001100010000000000000000000000000000000000000000000000000
The number above has 64 bits. Numbers in JavaScript always do. This has one important repercussion: There is a limited amount of different numbers that can be expressed. With three decimal digits, only the numbers 0 to 999 can be written, which is 10^3 = 1000 different numbers. With 64 binary digits, 2^64 different numbers can be written. This is a lot, more than 10^19 (a one with nineteen zeroes).
Not all whole numbers below 10^19 fit in a JavaScript number though. For one, there are also negative numbers, so one of the bits has to be used to store the sign of the number. A bigger issue is that non-whole numbers must also be represented. To do this, 11 bits are used to store the position of the decimal dot within the number.
That leaves 52 bits[3]. Any whole number less than 2^52, which is over 10^15, will safely fit in a JavaScript number. In most cases, the numbers we are using stay well below that, so we do not have to concern ourselves with bits at all. Which is good. I have nothing in particular against bits, but you do need a terrible lot of them to get anything done. When at all possible, it is more pleasant to deal with bigger things.
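The division into one sign bit, 11 position bits, and 52 remaining bits can be inspected directly. The following is a sketch under assumptions: numberToBits is a hypothetical helper that is not part of this book's console, it relies on the standard DataView object (which older browsers lack), and console.log stands in for print.

```javascript
// Hypothetical helper: the 64 bits of a number as a string of 0s and 1s,
// most significant bit first.
function numberToBits(value) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value); // stored most-significant byte first by default
  var bits = "";
  for (var i = 0; i < 8; i++) {
    var byteBits = view.getUint8(i).toString(2);
    while (byteBits.length < 8) {
      byteBits = "0" + byteBits; // pad every byte to eight digits
    }
    bits += byteBits;
  }
  return bits;
}

var bits = numberToBits(144);
console.log(bits.slice(0, 1));  // the sign bit: 0
console.log(bits.slice(1, 12)); // the 11 position bits: 10000000110
console.log(bits.slice(12));    // the remaining 52 bits
```

Joined together, the three pieces give back the 64-digit pattern shown for 144 above.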
Fractional numbers are written by using a dot.
9.81
For very big or very small numbers, one can also use 'scientific'
notation by adding an e, followed by the exponent of the number:
2.998e8
Which is 2.998 * 10^8 = 299800000.
Calculations with whole numbers (also called integers) that fit in 52 bits are guaranteed to always be precise. Unfortunately, calculations with fractional numbers are generally not. Just as π (pi) cannot be precisely expressed by a finite number of decimal digits, many numbers lose some precision when only 64 bits are available to store them. This is a shame, but it only causes practical problems in very specific situations. The important thing is to be aware of it, and treat fractional digital numbers as approximations, not as precise values.
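Both halves of that claim are easy to check. The sketch below assumes console.log in place of print, and nearlyEqual is a hypothetical helper with an arbitrarily chosen tolerance, shown only as one common way of comparing approximations.

```javascript
// Whole numbers of this size are still exact...
console.log(Math.pow(2, 52) + 1 - Math.pow(2, 52)); // 1

// ...but far enough up, adding 1 no longer makes a difference.
var big = Math.pow(2, 53);
console.log(big + 1 === big);   // true

// Fractional numbers are approximations.
console.log(0.1 + 0.2);         // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3); // false

// Hypothetical helper: treat two numbers as equal when they are very close.
function nearlyEqual(a, b) {
  return Math.abs(a - b) < 1e-9; // the tolerance is an arbitrary choice
}
console.log(nearlyEqual(0.1 + 0.2, 0.3)); // true
```

Comparing with a small tolerance, rather than with ==, is the usual way to treat fractional results as the approximations they are.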
The main thing to do with numbers is arithmetic. Arithmetic operations such as addition or multiplication take two number values and produce a new number from them. Here is what they look like in JavaScript:
100 + 4 * 11
The + and * symbols are called operators. The first stands for
addition, and the second for multiplication. Putting an operator
between two values will apply it to those values, and
produce a new value.
Does the example mean 'add 4 and 100, and multiply the result by 11', or is the multiplication done before the adding? As you might have guessed, the multiplication happens first. But, as in mathematics, this can be changed by wrapping the addition in parentheses:
(100 + 4) * 11
For subtraction, there is the - operator, and division can be done
with /. When operators appear together without parentheses, the
order in which they are applied is determined by the precedence of
the operators. The first example shows that multiplication has a
higher precedence than addition. Division and multiplication always
come before subtraction and addition. When multiple operators with
the same precedence appear next to each other (1 - 1 + 1), they are
applied left-to-right.
Try to figure out what value this produces, and then run it to see if you were correct...
115 * 4 - 4 + 88 / 2
These rules of precedence are not something you should worry about. When in doubt, just add parentheses.
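A few small cases show how precedence and parentheses interact. This is a sketch for the console, with console.log assumed as a stand-in for print; the specific numbers are chosen only for illustration.

```javascript
// Multiplication binds tighter than addition.
console.log(100 + 4 * 11);   // 144
console.log((100 + 4) * 11); // 1144

// Division and multiplication have equal precedence: applied left to right.
console.log(8 / 2 * 4);      // 16
console.log(8 / (2 * 4));    // 1

// The same holds for subtraction and addition.
console.log(10 - 3 + 2);     // 9
console.log(10 - (3 + 2));   // 5
```

In every pair, the parentheses are what changes the outcome, which is why adding them is a safe habit when in doubt.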
There is one more arithmetic operator which is probably less familiar
to you. The % symbol is used to represent the modulo operation.
X modulo Y is the remainder of dividing X by Y. For example,
314 % 100 is 14, 10 % 3 is 1, and 144 % 12 is 0. Modulo has
the same precedence as multiplication and division.
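The remainder operator is easiest to get a feel for by trying it. In the sketch below, console.log is assumed in place of print, and isEven is a hypothetical helper added only to show a typical use.

```javascript
console.log(314 % 100); // 14
console.log(10 % 3);    // 1
console.log(144 % 12);  // 0

// % is applied before + and -, and left to right alongside * and /.
console.log(1 + 7 % 4);   // 4, because 7 % 4 is computed first
console.log(2 * 7 % 4);   // 2, because 2 * 7 is computed first
console.log(2 * (7 % 4)); // 6

// The result takes the sign of the number on the left.
console.log(-7 % 3);      // -1

// Hypothetical helper: a number is even when dividing by 2 leaves nothing.
function isEven(n) {
  return n % 2 === 0;
}
console.log(isEven(144)); // true
```

Testing whether one number divides another evenly, as isEven does, is probably the most common use of this operator.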
The next data type is the string. Its use is not as evident from its name as with numbers, but it also fulfils a very basic role. Strings are used to represent text; the name supposedly derives from the fact that it strings together a bunch of characters. Strings are written by enclosing their content in quotes:
"Patch my boat with chewing gum."
Almost anything can be put between double quotes, and JavaScript will make a string value out of it. But a few characters are tricky. You can imagine how putting quotes between quotes might be hard. Newlines, the things you get when you press enter, can also not be put between quotes; the string has to stay on a single line.
To be able to have such characters in a string, the following trick is
used: Whenever a backslash ('\') is found inside quoted text, it
indicates that the character after it has a special meaning. A quote
that is preceded by a backslash will not end the string, but be part
of it. When an 'n' character occurs after a backslash, it is
interpreted as a newline. Similarly, a 't' after a backslash means a
tab character[4].
"This is the first line\nAnd this is the second"
There are of course situations where you want a backslash in a string to be just a backslash, not a special code. If two backslashes follow each other, they will collapse right into each other, and only one will be left in the resulting string value:
"A newline character is written like \"\\n\"."
Strings can not be divided, multiplied, or subtracted. The +
operator can be used on them. It does not add, but it concatenates,
it glues two strings together.
"con" + "cat" + "e" + "nate"
There are more ways of manipulating strings, but these are discussed later.
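In the meantime, the effect of the backslash codes can be checked by counting characters. This sketch uses the standard length property of strings, which has not been introduced yet, and assumes console.log in place of print.

```javascript
// A backslash code is written as two characters but counts as one.
console.log("a\nb".length); // 3: an a, a newline, and a b
console.log("\t".length);   // 1: a single tab character

// Two backslashes collapse into one real backslash.
console.log("\\n".length);  // 2: a backslash and an n
console.log("\\n");         // \n

// An escaped quote becomes part of the string instead of ending it.
console.log("Say \"hi\"");  // Say "hi"

// + glues strings together.
var word = "con" + "cat" + "e" + "nate";
console.log(word);          // concatenate
```

What is written between the quotes and what ends up in the string value are two different things, and the backslash is where they diverge.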
Not all operators are symbols, some are written as words. For example,
the typeof operator, which produces a string value naming the type
of the value you give it.
typeof 4.5
The other operators we saw all operated on two values, typeof takes
only one. Operators that use two values are called binary operators,
while those that take one are called unary operators. The
minus operator can be used both as a binary and a unary
operator:
- (10 - 2)
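Applying typeof to a handful of values shows the type names it produces. The values below are chosen only for illustration, and console.log is assumed as a stand-in for print.

```javascript
console.log(typeof 4.5);       // number
console.log(typeof "x");       // string
console.log(typeof true);      // boolean
console.log(typeof undefined); // undefined
console.log(typeof Math.max);  // function
console.log(typeof Math);      // object

// The result of typeof is itself a string value.
console.log(typeof typeof 4.5); // string

// Minus as a unary operator (outside) and a binary operator (inside).
console.log(- (10 - 2));        // -8
```

The six names printed first correspond to the six basic types listed at the start of this chapter.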
Then there are values of the boolean type. There are only two of
these: true and false. Here is one way to produce a true
value:
3 > 2
And false can be produced like this:
3 < 2
I hope you have seen the > and < signs before. They mean,
respectively, 'is greater than' and 'is less than'. They are binary
operators, and the result of applying them is a boolean value that
indicates whether they hold in this case.
Strings can be compared in the same way:
"Aardvark" < "Zoroaster"
The way strings are ordered is more or less alphabetic. More or
less... Uppercase letters are always 'less' than lowercase ones, so
"Z" < "a" is true, and non-alphabetic characters ('!', '@',
etc) are also included in the ordering. The actual way in which the
comparison is done is based on the Unicode standard. This standard
assigns a number to virtually every character one would ever need,
including characters from Greek, Arabic, Japanese, Tamil, and so on.
Having such numbers is practical for storing strings inside a computer
― you can represent them as a list of numbers. When comparing
strings, JavaScript just compares the numbers of the characters inside
the string, from left to right.
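The numbers behind the characters can be looked up with the standard charCodeAt method, which has not been introduced yet. This is a sketch for experimenting, with console.log assumed in place of print.

```javascript
console.log("Z".charCodeAt(0)); // 90
console.log("a".charCodeAt(0)); // 97
console.log("Z" < "a");         // true, because 90 is less than 97

console.log("Aardvark" < "Zoroaster"); // true
console.log("apple" < "Banana");       // false: lowercase a (97) comes after uppercase B (66)

// Non-alphabetic characters take part in the ordering too.
console.log("!".charCodeAt(0)); // 33
console.log("!" < "A");         // true
```

The "apple" case is the one that surprises people: alphabetically it comes first, but the comparison only looks at the numbers.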
Other similar operators are >= ('is greater than or equal to'),
<= ('is less than or equal to'), == ('is equal to'), and !=
('is not equal to').
"Itchy" != "Scratchy"
There are also some useful operations that can be applied to boolean values themselves. JavaScript supports three logical operators: and, or, and not. These can be used to 'reason' about booleans.
The && operator represents logical and. It is a binary operator,
and its result is only true if both of the values given to it are
true.
true && false
|| is the logical or, it is true if either of the values given
to it is true:
true || false
Not is written as an exclamation mark, !, it is a unary operator
that flips the value given to it, !true is false, and !false is
true.
((4 >= 6) || ("grass" != "green")) && !(((12 * 2) == 144) && true)
Is this true? For readability, there are a lot of unnecessary parentheses in there. This simple version means the same thing:
(4 >= 6 || "grass" != "green") && !(12 * 2 == 144 && true)
Yes, it is true. You can reduce it step by step like this:
(false || true) && !(false && true)
true && !false
true
I hope you noticed that "grass" != "green" is true. Grass may be
green, but it is not equal to green.
It is not always obvious when parentheses are needed. In practice, one
can usually get by with knowing that of the operators we have seen so
far, || has the lowest precedence, then comes &&, then the
comparison operators (>, ==, etcetera), and then the rest. This
has been chosen in such a way that, in simple cases, as few
parentheses as possible are necessary.
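The ordering described above can be seen in a few expressions where the parentheses make a difference. The examples are illustrative, and console.log is assumed as a stand-in for print.

```javascript
// && is applied before ||.
console.log(true || false && false);   // true: read as true || (false && false)
console.log((true || false) && false); // false

// Arithmetic first, then comparisons, then &&.
console.log(1 + 1 == 2 && 10 * 10 > 50); // true: (2 == 2) && (100 > 50)

// ! belongs to 'the rest': it is applied before && and ||.
console.log(!true || true);   // true: read as (!true) || true
console.log(!(true || true)); // false
```

The third example is the simple case the rule was designed for: no parentheses are needed to say what one would naturally mean.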
All the examples so far have used the language like you would use a
pocket calculator. Make some values and apply operators to them to get
new values. Creating values like this is an essential part of every
JavaScript program, but it is only a part. A piece of code that
produces a value is called an expression. Every value that is
written directly (such as 22 or "psychoanalysis") is an
expression. An expression between parentheses is also an expression.
And a binary operator applied to two expressions, or a unary operator
applied to one, is also an expression.
There are a few more ways of building expressions, which will be revealed when the time is ripe.
There exists a unit that is bigger than an expression. It is called a
statement. A program is built as a list of statements. Most
statements end with a semicolon (;). The simplest kind of
statement is an expression with a semicolon after it. This is a
program:
1;
!false;
It is a useless program. An expression can be content to just produce
a value, but a statement only amounts to something if it somehow
changes the world. It could print something to the screen ― that
counts as changing the world ― or it could change the internal state
of the program in a way that will affect the statements that come
after it. These changes are called 'side effects'. The statements in
the example above just produce the values 1 and true, and then
immediately throw them into the bit bucket[5]. This leaves no
impression on the world at all, and is not a side effect.
How does a program keep an internal state? How does it remember things? We have seen how to produce new values from old values, but this does not change the old values, and the new value has to be immediately used or it will dissipate again. To catch and hold values, JavaScript provides a thing called a variable.
var caught = 5 * 5;
A variable always has a name, and it can point at a value, holding on
to it. The statement above creates a variable called caught and uses
it to grab hold of the number that is produced by multiplying 5 by
5.
After running the above program, you can type the word caught into
the console, and it will retrieve the value 25 for you. The name of
a variable is used to fetch its value. caught + 1 also works. A
variable name can be used as an expression, and thus can be part of
bigger expressions.
The word var is used to create a new variable. After var, the
name of the variable follows. Variable names can be almost every word,
but they may not include spaces. Digits can be part of variable names,
catch22 is a valid name, but the name must not start with one. The
characters '$' and '_' can be used in names as if they were
letters, so $_$ is a correct variable name.
If you want the new variable to immediately capture a value, which is
often the case, the = operator can be used to give it the value of
some expression.
When a variable points at a value, that does not mean it is tied to
that value forever. At any time, the = operator can be used on
existing variables to yank them away from their current value and make
them point to a new one.
caught = 4 * 4;
You should imagine variables as tentacles, rather than boxes. They do not contain values, they grasp them ― two variables can refer to the same value. Only the values that the program still has a hold on can be accessed by it. When you need to remember something, you grow a tentacle to hold on to it, or re-attach one of your existing tentacles to a new value: To remember the amount of dollars that Luigi still owes you, you could do...
var luigiDebt = 140;
Then, every time Luigi pays something back, this amount can be decremented by giving the variable a new number:
luigiDebt = luigiDebt - 35;
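The tentacle picture can be tried out directly. In the sketch below, originalDebt is a hypothetical second variable added for illustration, and console.log is used in place of the book's print so the code runs anywhere.

```javascript
var luigiDebt = 140;
var originalDebt = luigiDebt;  // a second tentacle grasping the value 140

luigiDebt = luigiDebt - 35;    // re-attach luigiDebt to a new value

console.log(luigiDebt);        // 105
console.log(originalDebt);     // 140
```

Moving one tentacle does not disturb the other: the value 140 itself was never changed, only the thing luigiDebt points at.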
The collection of variables and their values that exist at a given time is called the environment. When a program starts up, this environment is not empty. It always contains a number of standard variables. When your browser loads a page, it creates a new environment and attaches these standard values to it. The variables created and modified by programs on that page survive until the browser goes to a new page.
A lot of the values provided by the standard environment have the type
'function'. A function is a piece of program wrapped in a value.
Generally, this piece of program does something useful, which can be
evoked using the function value that contains it. In a browser
environment, the variable alert holds a function that shows a
little dialog window with a message. It is used like this:
alert("Also, your hair is on fire.");
Executing the code in a function is called invoking or
applying it. The notation for doing this uses parentheses. Every
expression that produces a function value can be invoked by putting
parentheses after it. The string value between the parentheses is
given to the function, which uses it as the text to show in the dialog
window. Values given to functions are called parameters or
arguments. alert needs only one of them, but other functions might
need a different number.
Showing a dialog window is a side effect. A lot of functions are
useful because of the side effects they produce. It is also possible
for a function to produce a value, in which case it does not need to
have a side effect to be useful. For example, there is a function
Math.max, which takes any number of arguments and gives back the biggest
of them:
alert(Math.max(2, 4));
When a function produces a value, it is said to return it. Because things that produce values are always expressions in JavaScript, function calls can be used as a part of bigger expressions:
alert(Math.min(2, 4) + 100);
Chapter 3 discusses writing your own functions.
As the previous examples show, alert can be useful for showing the
result of some expression. Clicking away all those little windows can
get on one's nerves though, so from now on we will prefer to use a
similar function, called print, which does not pop up a window,
but just writes a value to the output area of the console. print is
not a standard JavaScript function, browsers do not provide it for
you, but it is made available by this book, so you can use it on these
pages.
print("N");
A similar function, also provided on these pages, is show. While
print will display its argument as flat text, show tries to
display it the way it would look in a program, which can give more
information about the type of the value. For example, string values
keep their quotes when given to show:
show("N");
The standard environment provided by browsers contains a few more
functions for popping up windows. You can ask the user an OK/Cancel
question using confirm. This returns a boolean, true if the user
presses 'OK', and false if he presses 'Cancel'.
show(confirm("Shall we, then?"));
prompt can be used to ask an 'open' question. The first argument
is the question, the second one is the text that the user starts with.
A line of text can be typed into the window, and the function will
return this as a string.
show(prompt("Tell us everything you know.", "..."));
It is possible to give almost every variable in the environment a new
value. This can be useful, but also dangerous. If you give print the
value 8, you won't be able to print things anymore. Fortunately,
there is a big 'Reset' button on the console, which will reset the
environment to its original state.
One-line programs are not very interesting. When you put more than one statement into a program, the statements are, predictably, executed one at a time, from top to bottom.
var theNumber = Number(prompt("Pick a number", ""));
print("Your number is the square root of " +
      (theNumber * theNumber));
The function Number converts a value to a number, which is needed
in this case because the result of prompt is a string value. There
are similar functions called String and Boolean which convert
values to those types.
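To see why the conversion matters, here is a small sketch. The variable typed is a hypothetical stand-in for whatever prompt would return, and console.log replaces the book's print.

```javascript
var typed = "42";                    // what prompt gives back is a string

var glued = typed + 1;               // + on a string concatenates
var added = Number(typed) + 1;       // after conversion, + adds
var backToString = String(42) + 1;   // the number became a string again

console.log(glued);                  // 421
console.log(added);                  // 43
console.log(backToString);           // 421
console.log(Boolean("hello"));       // true
console.log(Boolean(""));            // false
```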
Consider a program that prints out all even numbers from 0 to 12. One way to write this is:
print(0); print(2); print(4); print(6); print(8); print(10); print(12);
That works, but the idea of writing a program is to make something less work, not more. If we needed all even numbers below 1000, the above would be unworkable. What we need is a way to automatically repeat some code.
var currentNumber = 0;
while (currentNumber <= 12) {
  print(currentNumber);
  currentNumber = currentNumber + 2;
}
You may have seen while in the introduction chapter. A statement
starting with the word while creates a loop. A loop is a
disturbance in the sequence of statements: it may cause the program to
repeat some statements multiple times. In this case, the word while
is followed by an expression in parentheses (the parentheses are
compulsory here), which is used to determine whether the loop will
loop or finish. As long as the boolean value produced by this
expression is true, the code in the loop is repeated. As soon as it
is false, the program goes to the bottom of the loop and continues as
normal.
The variable currentNumber demonstrates the way a variable can track
the progress of a program. Every time the loop repeats, it is
incremented by 2, and at the beginning of every repetition, it is
compared with the number 12 to decide whether to keep on looping.
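The same loop shape scales to the 'all even numbers below 1000' case mentioned earlier. The sketch below counts them instead of printing each one; howMany is a hypothetical extra variable, and console.log stands in for print.

```javascript
var currentNumber = 0;
var howMany = 0;   // hypothetical counter, not part of the original example

while (currentNumber < 1000) {
  howMany = howMany + 1;
  currentNumber = currentNumber + 2;
}

console.log(howMany);         // 500
console.log(currentNumber);   // 1000
```

After the loop, currentNumber holds the first value that failed the check, which is why it ends up at 1000 rather than 998.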
The third part of a while statement is another statement. This is
the body of the loop, the action or actions that must take place
multiple times. If we did not have to print the numbers, the program
could have been:
var currentNumber = 0;
while (currentNumber <= 12)
  currentNumber = currentNumber + 2;
Here, currentNumber = currentNumber + 2; is the statement that forms
the body of the loop. We must also print the number, though, so the
loop statement must consist of more than one statement. Braces
({ and }) are used to group statements into blocks. To the world
outside the block, a block counts as a single statement. In the
example, this is used to include in the loop both the call to print
and the statement that updates currentNumber.
Use the techniques shown so far to write a program that calculates and
shows the value of 2¹⁰ (2 to the 10th power). You are, obviously, not
allowed to use a cheap trick like just writing 2 * 2 * ....
If you are having trouble with this, try to see it in terms of the
even-numbers example. The program must perform an action a certain
amount of times. A counter variable with a while loop can be used
for that. Instead of printing the counter, the program must multiply
something by 2. This something should be another variable, in which
the result value is built up.
Don't worry if you don't quite see how this would work yet. Even if you perfectly understand all the techniques this chapter covers, it can be hard to apply them to a specific problem. Reading and writing code will help develop a feeling for this, so study the solution, and try the next exercise.
var result = 1;
var counter = 0;
while (counter < 10) {
  result = result * 2;
  counter = counter + 1;
}
show(result);
The counter could also start at 1 and check for <= 10, but, for
reasons that will become apparent later on, it is a good idea to get
used to counting from 0.
Obviously, your own solutions aren't required to be precisely the same as mine. They should work. And if they are very different, make sure you also understand my solution.
With some slight modifications, the solution to the previous exercise can be made to draw a triangle. And when I say 'draw a triangle' I mean 'print out some text that almost looks like a triangle when you squint'.
Print out ten lines. On the first line there is one '#' character. On the second there are two. And so on.
How does one get a string with X '#' characters in it? One way is to build it every time it is needed with an 'inner loop' ― a loop inside a loop. A simpler way is to reuse the string that the previous iteration of the loop used, and add one character to it.
var line = "";
var counter = 0;
while (counter < 10) {
  line = line + "#";
  print(line);
  counter = counter + 1;
}
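For comparison, the 'inner loop' approach might look roughly like the sketch below. It is only one possible way to write it: the names row, column and output are made up for this example, and the lines are gathered into one string and shown with console.log instead of being printed one at a time.

```javascript
var output = "";
var row = 1;
while (row <= 10) {
  // Build this row's string from scratch, one '#' per column.
  var line = "";
  var column = 0;
  while (column < row) {
    line = line + "#";
    column = column + 1;
  }
  output = output + line + "\n";
  row = row + 1;
}
console.log(output);
```

This does more work than the single-loop solution, since every row is rebuilt from an empty string, but it shows how one loop can sit inside the body of another.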
You will have noticed the spaces I put in front of some statements. These are not required: The computer will accept the program just fine without them. In fact, even the line breaks in programs are optional. You could write them as a single long line if you felt like it. The role of the indentation inside blocks is to make the structure of the code clearer to a reader. Because new blocks can be opened inside other blocks, it can become hard to see where one block ends and another begins in a complex piece of code. When lines are indented, the visual shape of a program corresponds to the shape of the blocks inside it. I like to use two spaces for every open block, but tastes differ.
On browsers other than Opera, the field in the console where you can type programs will help you by automatically adding these spaces. This may seem annoying at first, but when you write a lot of code it becomes a huge time-saver. Pressing the tab key will re-indent the line your cursor is currently on.
In some cases, JavaScript allows you to omit the semicolon at the end of a statement. In other cases, it has to be there or strange things will happen. The rules for when it can be safely omitted are complex and weird. In this book, every statement that needs a semicolon will always be terminated by one, and I strongly urge you to do the same with your own statements.
The uses of while we have seen so far all show the same pattern.
First, a 'counter' variable is created. This variable tracks the
progress of the loop. The while itself contains a check, usually to
see whether the counter has reached some boundary yet. Then, at the
end of the loop body, the counter is updated.
A lot of loops fall into this pattern. For this reason, JavaScript, and similar languages, also provide a slightly shorter and more comprehensive form:
for (var number = 0; number <= 12; number = number + 2)
  show(number);
This program is exactly equivalent to the earlier even-number-printing
example. The only change is that all the statements that are related
to the 'state' of the loop are now on one line. The parentheses after
the for should contain two semicolons. The part before the first
semicolon initialises the loop, usually by defining a variable. The
second part is the expression that checks whether the loop must
still continue. The final part updates the state of the loop. In
most cases this is shorter and clearer than a while construction.
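To make the correspondence concrete, the sketch below runs both forms and compares what they produce. The variables fromWhile and fromFor are hypothetical, introduced only to collect the numbers into a string, and console.log replaces the book's print.

```javascript
// The while version: the loop state is handled in three separate places.
var fromWhile = "";
var currentNumber = 0;
while (currentNumber <= 12) {
  fromWhile = fromWhile + currentNumber + " ";
  currentNumber = currentNumber + 2;
}

// The for version: initialisation, check and update share one line.
var fromFor = "";
for (var number = 0; number <= 12; number = number + 2)
  fromFor = fromFor + number + " ";

console.log(fromWhile);               // 0 2 4 6 8 10 12
console.log(fromWhile == fromFor);    // true
```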
I have been using some rather odd capitalisation in some variable
names. Because you can not have spaces in these names ― the computer
would read them as two separate variables ― your choices for a name
that is made of several words are more or less limited to the
following: fuzzylittleturtle, fuzzy_little_turtle,
FuzzyLittleTurtle, or fuzzyLittleTurtle. The first one is hard to
read. Personally, I like the one with the underscores, though it is a
little painful to type. However, the standard JavaScript functions,
and most JavaScript programmers, follow the last one. It is not hard
to get used to little things like that, so I will just follow the
crowd and capitalise the first letter of every word after the first.
In a few cases, such as the Number function, the first letter of a
variable is also capitalised. This was done to mark this function as a
constructor. What a constructor is will become clear in chapter 8. For
now, the important thing is not to be bothered by this apparent lack
of consistency.
Note that names that have a special meaning, such as var, while,
and for may not be used as variable names. These are called
keywords. There are also a number of words which
are 'reserved for use' in future versions of JavaScript. These are
also officially not allowed to be used as variable names, though some
browsers do allow them. The full list is rather long:
abstract boolean break byte case catch char class const continue debugger default delete do double else enum export extends false final finally float for function goto if implements import in instanceof int interface long native new null package private protected public return short static super switch synchronized this throw throws transient true try typeof var void volatile while with
Don't worry about memorising these for now, but remember that this
might be the problem when something does not work as expected. In my
experience, char (to store a one-character string) and class are
the most common names to accidentally use.
Rewrite the solutions of the previous two exercises to use for
instead of while.
var result = 1;
for (var counter = 0; counter < 10; counter = counter + 1)
  result = result * 2;
show(result);
Note that even if no block is opened with a '{', the statement in
the loop is still indented two spaces to make it clear that it
'belongs' to the line above it.
var line = "";
for (var counter = 0; counter < 10; counter = counter + 1) {
  line = line + "#";
  print(line);
}
A program often needs to 'update' a
variable with a value that is based on its previous value. For example
counter = counter + 1. JavaScript provides a shortcut for this:
counter += 1. This also works for many other operators, for example
result *= 2 to double the value of result, or counter -= 1 to
count downwards.
For counter += 1 and counter -= 1 there are even
shorter versions: counter++ and counter--.
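A short sketch of these shortcuts in action, using console.log in place of the book's print:

```javascript
var counter = 0;
counter += 1;            // same as counter = counter + 1
console.log(counter);    // 1
counter++;               // same again, even shorter
console.log(counter);    // 2
counter -= 1;
counter--;
console.log(counter);    // 0

var result = 1;
result *= 2;             // same as result = result * 2
result *= 2;
console.log(result);     // 4
```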
Loops are said to affect the control flow of a program. They change the order in which statements are executed. In many cases, another kind of flow is useful: skipping statements.
We want to show all numbers between 0 and 20 which are divisible both by 3 and by 4.
for (var counter = 0; counter < 20; counter++) {
  if (counter % 3 == 0 && counter % 4 == 0)
    show(counter);
}
The keyword if is not too different from the keyword while: It
checks the condition it is given (between parentheses), and executes
the statement after it based on this condition. But it does this only
once, so that the statement is executed zero or one time.
The trick with the modulo (%) operator is an easy way to test
whether a number is divisible by another number. If it is, the
remainder of their division, which is what modulo gives you, is zero.
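A few remainders, worked out as a sketch (the variable names are only for illustration, and console.log replaces the book's show):

```javascript
var twelveByThree = 12 % 3;   // 12 is 4 * 3 with nothing left over
var twelveByFive = 12 % 5;    // 12 is 2 * 5 with 2 left over
var sevenByFour = 7 % 4;      // 7 is 1 * 4 with 3 left over

console.log(twelveByThree);   // 0
console.log(twelveByFive);    // 2
console.log(sevenByFour);     // 3

// The divisibility test used in the loop above, applied to 12:
var divisibleByBoth = (12 % 3 == 0 && 12 % 4 == 0);
console.log(divisibleByBoth); // true
```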
If we wanted to print all of the numbers between 0 and 20, but put parentheses around the ones that are not divisible by 4, we could do it like this:
for (var counter = 0; counter < 20; counter++) {
  if (counter % 4 == 0)
    print(counter);
  if (counter % 4 != 0)
    print("(" + counter + ")");
}
But now the program has to determine whether counter is divisible by
4 two times. The same effect can be gotten by appending an else
part after an if statement. The else statement is executed only
when the if's condition is false.
for (var counter = 0; counter < 20; counter++) {
  if (counter % 4 == 0)
    print(counter);
  else
    print("(" + counter + ")");
}
To stretch this trivial example a bit further, we now want to print these same numbers, but add two stars after them when they are greater than 15, one star when they are greater than 10 (but not greater than 15), and no stars otherwise.
for (var counter = 0; counter < 20; counter++) {
  if (counter > 15)
    print(counter + "**");
  else if (counter > 10)
    print(counter + "*");
  else
    print(counter);
}
This demonstrates that you can chain if statements together. In this
case, the program first looks if counter is greater than 15. If it
is, the two stars are printed and the other tests are skipped. If it
is not, we continue to check if counter is greater than 10. Only
if counter is also not greater than 10 does it arrive at the last
print statement.
Write a program to ask yourself, using prompt, what the value of 2 +
2 is. If the answer is "4", use alert to say something praising. If
it is "3" or "5", say "Almost!". In other cases, say something mean.
var answer = prompt("You! What is the value of 2 + 2?", "");
if (answer == "4")
  alert("You must be a genius or something.");
else if (answer == "3" || answer == "5")
  alert("Almost!");
else
  alert("You're an embarrassment.");
When a loop does not always have to go all the way through to its end,
the break keyword can be useful. It immediately jumps out of the
current loop, continuing after it. This program finds the first number
that is 20 or greater and divisible by 7:
for (var current = 20; ; current++) {
  if (current % 7 == 0)
    break;
}
print(current);
This for construct does not have a part that checks for the end of
the loop. This means that it is dependent on the break statement
inside it to ever stop. The same program could also have been written
as simply...
for (var current = 20; current % 7 != 0; current++)
  ;
print(current);
In this case, the body of the loop is empty. A lone semicolon can be
used to produce an empty statement. Here, the only effect of the loop
is to increment the variable current to its desired value. But I
needed an example that uses break, so pay attention to the first
version too.
Add a while and optionally a break to your solution for the
previous exercise, so that it keeps repeating the question until a
correct answer is given.
Note that while (true) ... can be used to create a loop that does
not end on its own account. This is a bit silly, you ask the program
to loop as long as true is true, but it is a useful trick.
while (true) {
  var answer = prompt("You! What is the value of 2 + 2?", "");
  if (answer == "4") {
    alert("You must be a genius or something.");
    break;
  }
  else if (answer == "3" || answer == "5") {
    alert("Almost!");
  }
  else {
    alert("You're an embarrassment.");
  }
}
Because the first if's body now has two statements, I added braces
around all the bodies. This is a matter of taste. Having an
if/else chain where some of the bodies are blocks and others are
single statements looks a bit lopsided to me, but you can make up your
own mind about that.
Another solution, arguably nicer, but without break:
var value;
while (value != "4") {
  value = prompt("You! What is the value of 2 + 2?", "");
  if (value == "4")
    alert("You must be a genius or something.");
  else if (value == "3" || value == "5")
    alert("Almost!");
  else
    alert("You're an embarrassment.");
}
In the second solution to the previous exercise there is a statement
var value;. This creates a variable named value, but does not
give it a value. What happens when you take the value of this
variable?
var mysteryVariable;
show(mysteryVariable);
In terms of tentacles, this variable ends in thin air, it has nothing
to grasp. When you ask for the value of an empty place, you get a
special value named undefined. Functions which do not return an
interesting value, such as print and alert, also return an
undefined value.
show(alert("I am a side effect."));
There is also a similar value, null, whose meaning is 'this value
is defined, but it does not have a value'. The difference in meaning
between undefined and null is mostly academic, and usually not
very interesting. In practical programs, it is often necessary to
check whether something 'has a value'. In these cases, the expression
something == undefined may be used, because, even though they are
not exactly the same value, null == undefined will produce true.
Which brings us to another tricky subject...
show(false == 0);
show("" == 0);
show("5" == 5);
All these give the value true. When comparing
values that have different types, JavaScript uses a complicated and
confusing set of rules. I am not going to try to explain them
precisely, but in most cases it just tries to convert one of the
values to the type of the other value. However, when null or
undefined occur, it only produces true if both sides are null or
undefined.
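The special treatment of null and undefined can be checked with a few comparisons. This is a small sketch with made-up variable names, using console.log where the book would use show.

```javascript
var nullVsUndefined = (null == undefined);
var nullVsZero = (null == 0);
var undefinedVsEmpty = (undefined == "");
var nullVsFalse = (null == false);

console.log(nullVsUndefined);    // true
console.log(nullVsZero);         // false
console.log(undefinedVsEmpty);   // false
console.log(nullVsFalse);        // false
```

So although false, 0 and "" all compare as equal to each other, none of them compares as equal to null or undefined.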
What if you want to test whether a variable refers to the value
false? The rules for converting strings and numbers to boolean
values state that 0 and the empty string count as false, while all
the other values count as true. Because of this, the expression
variable == false is also true when variable refers to 0 or
"". For cases like this, where you do not want any automatic type
conversions to happen, there are two extra operators: === and
!==. The first tests whether a value is precisely equal to the
other, and the second tests whether it is not precisely equal.
show(null === undefined);
show(false === 0);
show("" === 0);
show("5" === 5);
All these are false.
Values given as the condition in an if, while, or for statement
do not have to be booleans. They will be automatically converted to
booleans before they are checked. This means that the number 0, the
special number NaN (introduced below), the empty string "", null,
undefined, and of course false, will all count as false.
The fact that all other values are converted to true in this case
makes it possible to leave out explicit comparisons in many
situations. If a variable is known to contain either a string or
null, one could check for this very simply...
var maybeNull = null;
// ... mystery code that might put a string into maybeNull ...
if (maybeNull)
  print("maybeNull has a value");
Except in the case where the mystery code gives maybeNull the value
"". An empty string is false, so nothing is printed. Depending on
what you are trying to do, this might be wrong. It is often a good
idea to add an explicit === null or === false in cases like this
to prevent subtle mistakes. The same occurs with number values that
might be 0.
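The difference between the two kinds of check shows up as soon as the value is an empty string. In this sketch, looseCheck and strictCheck are hypothetical variables that record which branch was taken, and console.log replaces print.

```javascript
var maybeNull = "";   // suppose the mystery code produced an empty string

var looseCheck = "no value";
if (maybeNull)
  looseCheck = "has a value";

var strictCheck = "no value";
if (maybeNull !== null)
  strictCheck = "has a value";

console.log(looseCheck);    // no value
console.log(strictCheck);   // has a value
```

Which of the two answers is the right one depends on whether an empty string should count as 'a value' in your program.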
The line that talks about 'mystery code' in the previous example might have looked a bit suspicious to you. It is often useful to include extra text in a program. The most common use for this is adding some explanations in human language to a program.
// The variable counter, which is about to be defined, is going
// to start with a value of 0, which is zero.
var counter = 0;
// Now, we are going to loop, hold on to your hat.
while (counter < 100 /* counter is less than one hundred */)
/* Every time we loop, we INCREMENT the value of counter,
   Seriously, we just add one to it. */
  counter++;
// And then, we are done.
This kind of text is called a comment. The rules are like this:
'/*' starts a comment that goes on until a '*/' is found. '//'
starts another kind of comment, which goes on until the end of the
line.
As you can see, even the simplest programs can be made to look big, ugly, and complicated by simply adding a lot of comments to them.
There are some other situations that cause automatic type conversions to happen. If you add a non-string value to a string, the value is automatically converted to a string before it is concatenated. If you multiply a number and a string, JavaScript tries to make a number out of the string.
show("Apollo" + 5);
show(null + "ify");
show("5" * 5);
show("strawberry" * 5);
The last statement prints NaN, which is a special value. It stands
for 'not a number', and is of type number (which might sound a little
contradictory). In this case, it refers to the fact that a strawberry
is not a number. All arithmetic operations on the value NaN result
in NaN, which is why multiplying it by 5, as in the example, still
gives a NaN value. Also, and this can be disorienting at times, NaN
== NaN equals false. Checking whether a value is NaN can be done
with the isNaN function.
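These properties of NaN can be checked in a few lines. The variable strawberries is a made-up name for this sketch, and console.log is used instead of the book's show.

```javascript
var strawberries = "strawberry" * 5;

console.log(strawberries);                   // NaN
console.log(typeof strawberries);            // number
console.log(strawberries + 1);               // NaN
console.log(strawberries == strawberries);   // false
console.log(isNaN(strawberries));            // true
```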
These automatic conversions can be very convenient, but they are also
rather weird and error prone. Even though + and * are both
arithmetic operators, they behave completely differently in the example.
In my own code, I use + on non-strings a lot, but make it a point
not to use * and the other numeric operators on string values.
Converting a number to a string is always possible and
straightforward, but converting a string to a number may not even work
(as in the last line of the example). We can use Number to
explicitly convert the string to a number, making it clear that we
might run the risk of getting a NaN value.
show(Number("5") * 5);
When we discussed the boolean operators && and || earlier, I
claimed they produced boolean values. This turns out to be a bit of an
oversimplification. If you apply them to boolean values, they will
indeed return booleans. But they can also be applied to other kinds of
values, in which case they will return one of their arguments.
What || really does is this: It looks at the value to the left of
it first. If converting this value to a boolean would produce true,
it returns this left value, and otherwise it returns the one on its
right. Check for yourself that this does the correct thing when the
arguments are booleans. Why does it work like that? It turns out this
is very practical. Consider this example:
var input = prompt("What is your name?", "Kilgore Trout");
print("Well hello " + (input || "dear"));
If the user presses 'Cancel' or closes the prompt dialog in some
other way without giving a name, the variable input will hold the
value null or "". Both of these would give false when converted
to a boolean. The expression input || "dear" can in this case be
read as 'the value of the variable input, or else the string
"dear"'. It is an easy way to provide a 'fallback' value.
The && operator works similarly, but the other way around. When
the value to its left is something that would give false when
converted to a boolean, it returns that value, and otherwise it
returns the value on its right.
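Both operators can be seen returning one of their arguments in the sketch below. The variable names are invented for the example, and console.log stands in for show.

```javascript
var fallbackUsed = null || "dear";             // left side is false-ish
var fallbackSkipped = "Kilgore" || "dear";     // left side is true-ish
var andStopsEarly = 0 && "never looked at";    // left side is false-ish
var andGoesOn = "Kilgore" && "dear";           // left side is true-ish

console.log(fallbackUsed);      // dear
console.log(fallbackSkipped);   // Kilgore
console.log(andStopsEarly);     // 0
console.log(andGoesOn);         // dear
```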
Another property of these two operators is that the expression to
their right is only evaluated when necessary. In the case of true ||
X, no matter what X is, the result will be true, so X is never
evaluated, and if it has side effects they never happen. The same goes
for false && X.
false || alert("I'm happening!");
true || alert("Not me.");
- Bits are any kinds of two-valued things, usually described as 0s and 1s. Inside the computer, they take forms like a high or low electrical charge, a strong or weak signal, a shiny or dull spot on the surface of a CD.
- If you were expecting something like 10010000 here ― good call, but read on. JavaScript's numbers are not stored as integers.
- Actually, 53, because of a trick that can be used to get one bit for free. Look up the 'IEEE 754' format if you are curious about the details.
- When you type string values at the console, you'll notice that they will come back with the quotes and backslashes the way you typed them. To get special characters to show properly, you can do print("a\nb") ― why this works, we will see in a moment.
- The bit bucket is supposedly the place where old bits are kept. On some systems, the programmer has to manually empty it now and then. Fortunately, JavaScript comes with a fully-automatic bit-recycling system.
Chapter 3: Functions
A program often needs to do the same thing in different places.
Repeating all the necessary statements every time is tedious and
error-prone. It would be better to put them in one place, and have the
program take a detour through there whenever necessary. This is what
functions were invented for: They are canned code that a program can
go through whenever it wants. Putting a string on the screen requires
quite a few statements, but when we have a print function we can
just write print("Aleph") and be done with it.
To view functions merely as canned chunks of code doesn't do them justice though. When needed, they can play the role of pure functions, algorithms, indirections, abstractions, decisions, modules, continuations, data structures, and more. Being able to effectively use functions is a necessary skill for any kind of serious programming. This chapter provides an introduction to the subject; chapter 6 discusses the subtleties of functions in more depth.
Pure functions, for a start, are the things that were called functions in the mathematics classes that I hope you have been subjected to at some point in your life. Taking the cosine or the absolute value of a number is a pure function of one argument. Addition is a pure function of two arguments.
The defining properties of pure functions are that they always return the same value when given the same arguments, and never have side effects. They take some arguments, return a value based on these arguments, and do not monkey around with anything else.
In JavaScript, addition is an operator, but it could be wrapped in a function like this (and as pointless as this looks, we will come across situations where it is actually useful):
function add(a, b) {
  return a + b;
}

show(add(2, 2));
add is the name of the function. a and b are the names of the
two arguments. return a + b; is the body of the function.
The keyword function is always used when creating a new function.
When it is followed by a variable name, the resulting function will be
stored under this name. After the name comes a list of argument
names, and then finally the body of the function. Unlike those
around the body of while loops or if statements, the braces around
a function body are obligatory[1].
The keyword return, followed by an expression, is used to
determine the value the function returns. When control comes across a
return statement, it immediately jumps out of the current function
and gives the returned value to the code that called the function. A
return statement without an expression after it will cause the
function to return undefined.
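Here is a small sketch of both behaviours. The functions describeSign and returnsNothing are hypothetical examples, and console.log is used in place of the book's show.

```javascript
// The first return that is reached ends the function at once, so the
// second return only runs for numbers that are not negative.
function describeSign(number) {
  if (number < 0)
    return "negative";
  return "zero or positive";
}

// A return without an expression gives back undefined.
function returnsNothing() {
  return;
}

console.log(describeSign(-3));    // negative
console.log(describeSign(12));    // zero or positive
console.log(returnsNothing());    // undefined
```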
A body can, of course, have more than one statement in it. Here is a function for computing powers (with positive, integer exponents):
function power(base, exponent) {
  var result = 1;
  for (var count = 0; count < exponent; count++)
    result *= base;
  return result;
}

show(power(2, 10));
If you solved exercise 2.2, this technique for computing a power should look familiar.
Creating a variable (result) and updating it are side effects.
Didn't I just say pure functions had no side effects?
A variable created inside a function exists only inside the function.
This is fortunate, or a programmer would have to come up with a
different name for every variable he needs throughout a program.
Because result only exists inside power, the changes to it only
last until the function returns, and from the perspective of code that
calls it there are no side effects.
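One way to convince yourself of this is to look for result after the function has run. The sketch below assumes that nothing else in the program defines a variable called result or count; it uses typeof because that operator can be applied to a name that does not exist without causing an error, and console.log replaces show.

```javascript
function power(base, exponent) {
  var result = 1;
  for (var count = 0; count < exponent; count++)
    result *= base;
  return result;
}

var answer = power(2, 10);
console.log(answer);          // 1024

// Outside power, its local variables were never created.
var resultOutside = typeof result;
var countOutside = typeof count;
console.log(resultOutside);   // undefined
console.log(countOutside);    // undefined
```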
Write a function called absolute, which returns the absolute value
of the number it is given as its argument. The absolute value of a
negative number is the positive version of that same number, and the
absolute value of a positive number (or zero) is that number itself.
function absolute(number) {
  if (number < 0)
    return -number;
  else
    return number;
}

show(absolute(-144));
Pure functions have two very nice properties. They are easy to think about, and they are easy to re-use.
If a function is pure, a call to it can be seen as a thing in itself. When you are not sure that it is working correctly, you can test it by calling it directly from the console, which is simple because it does not depend on any context[2]. It is easy to make these tests automatic ― to write a program that tests a specific function. Non-pure functions might return different values based on all kinds of factors, and have side effects that might be hard to test and think about.
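As a rough idea of what such an automatic test could look like, here is a sketch for the absolute function from the exercise above. The helper testAbsolute is hypothetical, just one simple way to do it, and console.log replaces the book's show.

```javascript
function absolute(number) {
  if (number < 0)
    return -number;
  else
    return number;
}

// Hypothetical test helper: gives true only when every check passes.
function testAbsolute() {
  return absolute(-144) == 144 &&
         absolute(0) == 0 &&
         absolute(7) == 7;
}

console.log(testAbsolute());   // true
```

Because absolute depends on nothing but its argument, the test needs no setup at all: it just calls the function and compares the results.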
Because pure functions are self-sufficient, they are likely to be
useful and relevant in a wider range of situations than non-pure ones.
Take show, for example. This function's usefulness depends on the
presence of a special place on the screen for printing output. If that
place is not there, the function is useless. We can imagine a related
function, let's call it format, that takes a value as an argument
and returns a string that represents this value. This function is
useful in more situations than show.
Of course, format does not solve the same problem as show, and no
pure function is going to be able to solve that problem, because it
requires a side effect. In many cases, non-pure functions are
precisely what you need. In other cases, a problem can be solved with
a pure function but the non-pure variant is much more convenient or
efficient.
Thus, when something can easily be expressed as a pure function, write it that way. But never feel dirty for writing non-pure functions.
Functions with side effects do not have to contain a return
statement. If no return statement is encountered, the function
returns undefined.
function yell(message) {
  alert(message + "!!");
}
yell("Yow");
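To see the undefined return value for yourself, the result of such a call can be stored and inspected. This variant is only a sketch: it uses console.log instead of alert so that it also runs outside a browser.

```javascript
// A function with a side effect (printing) and no return statement.
function yell(message) {
  console.log(message + "!!");
}

// The call still produces a value, and that value is undefined.
var result = yell("Yow");
console.log(result === undefined);
```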
The names of the arguments of a function are available as variables inside it. They will refer to the values of the arguments the function is being called with, and like normal variables created inside a function, they do not exist outside it. Aside from the top-level environment, there are smaller, local environments created by function calls. When looking up a variable inside a function, the local environment is checked first, and only if the variable does not exist there is it looked up in the top-level environment. This makes it possible for variables inside a function to 'shadow' top-level variables that have the same name.
function alertIsPrint(value) {
  var alert = print;
  alert(value);
}
alertIsPrint("Troglodites");
The variables in this local environment are only visible to the code inside the function. If this function calls another function, the newly called function does not see the variables inside the first function:
var variable = "top-level";
function printVariable() {
  print("inside printVariable, the variable holds '" + variable + "'.");
}
function test() {
  var variable = "local";
  print("inside test, the variable holds '" + variable + "'.");
  printVariable();
}
test();
However, and this is a subtle but extremely useful phenomenon, when a function is defined inside another function, its local environment will be based on the local environment that surrounds it instead of the top-level environment.
var variable = "top-level";
function parentFunction() {
  var variable = "local";
  function childFunction() {
    print(variable);
  }
  childFunction();
}
parentFunction();
What this comes down to is that which variables are visible inside a function is determined by the place of that function in the program text. All variables that were defined 'above' a function's definition are visible, which means both those in function bodies that enclose it, and those at the top-level of the program. This approach to variable visibility is called lexical scoping.
People who have experience with other programming languages might expect that a block of code (between braces) also produces a new local environment. Not in JavaScript. Functions are the only things that create a new scope. You are allowed to use free-standing blocks like this...
var something = 1;
{
  var something = 2;
  print("Inside: " + something);
}
print("Outside: " + something);
... but the something inside the block refers to the same variable
as the one outside the block. In fact, although blocks like this are
allowed, they are utterly pointless. Most people agree that this is a
bit of a design blunder by the designers of JavaScript. ECMAScript 4,
which was expected to address it, was never completed, but ECMAScript
2015 later added let and const, which define variables that stay
inside blocks.
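For comparison, here is a sketch of the same experiment with a function in place of the bare block. The function name inner is made up for the example, and console.log stands in for print. Because a function call does create a local environment, the outer variable is left alone this time.

```javascript
var something = 1;

// A function body, unlike a bare block, gets its own local environment.
function inner() {
  var something = 2;
  return "Inside: " + something;
}

console.log(inner());
console.log("Outside: " + something);
```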
Here is a case that might surprise you:
var variable = "top-level";
function parentFunction() {
  var variable = "local";
  function childFunction() {
    print(variable);
  }
  return childFunction;
}
var child = parentFunction();
child();
parentFunction returns its internal function, and the code at the
bottom calls this function. Even though parentFunction has finished
executing at this point, the local environment where variable has
the value "local" still exists, and childFunction still uses it.
This phenomenon is called closure.
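A small sketch can make it visible that the surviving environment is a real, living thing. The names makeCounter and next are invented for this example. Each call to makeCounter creates a fresh local environment, and the function it returns keeps reading and changing the count variable stored there.

```javascript
function makeCounter() {
  var count = 0; // lives in the local environment of this particular call
  function next() {
    count = count + 1;
    return count;
  }
  return next;
}

var first = makeCounter();
var second = makeCounter();

var a = first();  // uses the environment of the first call
var b = first();  // same environment, so it continues counting
var c = second(); // a separate environment, starting from zero again
console.log(a, b, c);
```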
Apart from making it very easy to quickly see in which part of a program a variable will be available by looking at the shape of the program text, lexical scoping also allows us to 'synthesise' functions. By using some of the variables from an enclosing function, an inner function can be made to do different things. Imagine we need a few different but similar functions, one that adds 2 to its argument, one that adds 5, and so on.
function makeAddFunction(amount) {
  function add(number) {
    return number + amount;
  }
  return add;
}
var addTwo = makeAddFunction(2);
var addFive = makeAddFunction(5);
show(addTwo(1) + addFive(1));
On top of the fact that different functions can contain variables of
the same name without getting tangled up, these scoping rules also
allow functions to call themselves without running into problems. A
function that calls itself is called recursive. Recursion
allows for some interesting definitions. Look at this implementation
of power:
function power(base, exponent) {
  if (exponent == 0)
    return 1;
  else
    return base * power(base, exponent - 1);
}
This is rather close to the way mathematicians define exponentiation,
and to me it looks a lot nicer than the earlier version. It sort of
loops, but there is no while, for, or even a local side effect to
be seen. By calling itself, the function produces the same effect.
There is one important problem though: In most browsers, this second version is about ten times slower than the first one. In JavaScript, running through a simple loop is a lot cheaper than calling a function multiple times.
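To make the comparison concrete, the sketch below puts the two styles side by side. The loop-based body is an assumption about what the earlier version looks like, and both functions are renamed here so they can coexist. The sketch only checks that they agree. It does not measure speed, which differs between browsers.

```javascript
// Loop-based version; assumed to resemble the earlier definition.
function powerLoop(base, exponent) {
  var result = 1;
  for (var count = 0; count < exponent; count++)
    result *= base;
  return result;
}

// Recursive version from the text, renamed for the comparison.
function powerRecursive(base, exponent) {
  if (exponent == 0)
    return 1;
  else
    return base * powerRecursive(base, exponent - 1);
}

console.log(powerLoop(2, 10), powerRecursive(2, 10));
```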
The dilemma of speed versus elegance is an interesting one. It not only occurs when deciding for or against recursion. In many situations, an elegant, intuitive, and often short solution can be replaced by a more convoluted but faster solution.
In the case of the power function above the un-elegant version is
still sufficiently simple and easy to read. It doesn't make very much
sense to replace it with the recursive version. Often, though, the
concepts a program is dealing with get so complex that giving up some
efficiency in order to make the program more straightforward becomes
an attractive choice.
The basic rule, which has been repeated by many programmers and with which I wholeheartedly agree, is to not worry about efficiency until your program is provably too slow. When it is, find out which parts are too slow, and start exchanging elegance for efficiency in those parts.
Of course, the above rule doesn't mean one should start ignoring
performance altogether. In many cases, like the power function, not
much simplicity is gained by the 'elegant' approach. In other cases,
an experienced programmer can see right away that a simple approach is
never going to be fast enough.
The reason I am making a big deal out of this is that surprisingly many programmers focus fanatically on efficiency, even in the smallest details. The result is bigger, more complicated, and often less correct programs, which take longer to write than their more straightforward equivalents and often run only marginally faster.
But I was talking about recursion. A concept closely related to recursion is a thing called the stack. When a function is called, control is given to the body of that function. When that body returns, the code that called the function is resumed. While the body is running, the computer must remember the context from which the function was called, so that it knows where to continue afterwards. The place where this context is stored is called the stack.
The fact that it is called 'stack' has to do with the fact that, as we saw, a function body can again call a function. Every time a function is called, another context has to be stored. One can visualise this as a stack of contexts. Every time a function is called, the current context is thrown on top of the stack. When a function returns, the context on top is taken off the stack and resumed.
This stack requires space in the computer's memory to be stored. When the stack grows too big, the computer will give up with a message like "out of stack space" or "too much recursion". This is something that has to be kept in mind when writing recursive functions.
function chicken() {
  return egg();
}
function egg() {
  return chicken();
}
print(chicken() + " came first.");
In addition to demonstrating a very interesting way of writing a broken program, this example shows that a function does not have to call itself directly to be recursive. If it calls another function which (directly or indirectly) calls the first function again, it is still recursive.
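The following sketch shows the stack running out in a more controlled way. The function depth is made up for the example, the try and catch statements it uses have not been discussed in this book yet, and the depth of one million calls is simply assumed to be more than the stack can hold. The wording of the error message differs between browsers.

```javascript
// Counts down by calling itself; every call adds a context to the stack.
function depth(n) {
  if (n == 0)
    return 0;
  else
    return 1 + depth(n - 1);
}

var shallow = depth(100); // a hundred contexts fit comfortably

var overflowed = false;
try {
  depth(1000000); // assumed to be deeper than the stack allows
} catch (error) {
  overflowed = true;
  console.log("Gave up: " + error.message);
}
```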
Recursion is not always just a less-efficient alternative to looping. Some problems are much easier to solve with recursion than with loops. Most often these are problems that require exploring or processing several 'branches', each of which might branch out again into more branches.
Consider this puzzle: By starting from the number 1 and repeatedly either adding 5 or multiplying by 3, an infinite set of new numbers can be produced. How would you write a function that, given a number, tries to find a sequence of additions and multiplications that produces that number?
For example, the number 13 could be reached by first multiplying 1 by 3, and then adding 5 twice. The number 15 can not be reached at all.
Here is the solution:
function findSequence(goal) {
  function find(start, history) {
    if (start == goal)
      return history;
    else if (start > goal)
      return null;
    else
      return find(start + 5, "(" + history + " + 5)") ||
             find(start * 3, "(" + history + " * 3)");
  }
  return find(1, "1");
}
print(findSequence(24));
Note that it doesn't necessarily find the shortest sequence of operations; it is satisfied when it finds any sequence at all.
The inner find function, by calling itself in two different ways,
explores both the possibility of adding 5 to the current number and of
multiplying it by 3. When it finds the number, it returns the
history string, which is used to record all the operations that were
performed to get to this number. It also checks whether the current
number is bigger than goal, because if it is, we should stop
exploring this branch: it is not going to give us our number.
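As a quick check, the function can be tried on the two numbers from the example above. The sketch repeats the definition so that it runs on its own, and uses console.log in place of print.

```javascript
function findSequence(goal) {
  function find(start, history) {
    if (start == goal)
      return history;
    else if (start > goal)
      return null;
    else
      return find(start + 5, "(" + history + " + 5)") ||
             find(start * 3, "(" + history + " * 3)");
  }
  return find(1, "1");
}

console.log(findSequence(13)); // (((1 * 3) + 5) + 5)
console.log(findSequence(15)); // null, because every branch overshoots
```

For 13, the branch that starts by adding 5 is tried first and fails, after which the branch that starts by multiplying by 3 succeeds.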
The use of the || operator in the example can be read as 'return the
solution found by adding 5 to start, and if that fails, return the
solution found by multiplying start by 3'. It could also have been
written in a more wordy way like this:
else {
  var found = find(start + 5, "(" + history + " + 5)");
  if (found == null)
    found = find(start * 3, "(" + history + " * 3)");
  return found;
}
Even though function definitions occur as statements between the rest of the program, they are not part of the same time-line:
print("The future says: ", future());
function future() {
  return "We STILL have no flying cars.";
}
What is happening is that the computer looks up all function definitions, and stores the associated functions, before it starts executing the rest of the program. The same happens with functions that are defined inside other functions. When the outer function is called, the first thing that happens is that all inner functions are added to the new environment.
There is another way to define function values, which more closely
resembles the way other values are created. When the function
keyword is used in a place where an expression is expected, it is
treated as an expression producing a function value. Functions created
in this way do not have to be given a name (though it is allowed to
give them one).
var add = function(a, b) {
  return a + b;
};
show(add(5, 5));
Note the semicolon after the definition of add. Normal function
definitions do not need these, but this statement has the same general
structure as var add = 22;, and thus requires the semicolon.
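One consequence worth seeing in action: a function stored in a variable this way follows the normal time-line of the program, unlike a function definition. The sketch below uses made-up names (declared, stored) and a try and catch statement, which this book has not covered yet, to show that calling such a variable before its assignment has run does not work.

```javascript
// A function definition can be called before the line that defines it.
var early = declared();

function declared() {
  return "definition";
}

// A function value stored with var is only there once the assignment has run.
var tooEarly;
try {
  tooEarly = stored();
} catch (error) {
  tooEarly = "not a function yet";
}

var stored = function() {
  return "expression";
};

console.log(early + " / " + tooEarly + " / " + stored());
```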
This kind of function value is called an anonymous function, because
it does not have a name. Sometimes it is useless to give a function a
name, like in the makeAddFunction example we saw earlier:
function makeAddFunction(amount) {
  return function (number) {
    return number + amount;
  };
}
Since the function named add in the first version of
makeAddFunction was referred to only once, the name does not serve
any purpose and we might as well directly return the function value.
Write a function greaterThan, which takes one argument, a number,
and returns a function that represents a test. When this returned
function is called with a single number as argument, it returns a
boolean: true if the given number is greater than the number that
was used to create the test function, and false otherwise.
function greaterThan(x) {
  return function(y) {
    return y > x;
  };
}
var greaterThanTen = greaterThan(10);
show(greaterThanTen(9));
Try the following:
alert("Hello", "Good Evening", "How do you do?", "Goodbye");
The function alert officially only accepts one argument. Yet when
you call it like this, the computer does not complain at all, but just
ignores the other arguments.
show();
You can, apparently, even get away with passing too few arguments.
When an argument is not passed, its value inside the function is
undefined.
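Both situations can be observed with a function of our own. In this sketch, describe is a made-up function that takes two arguments, and console.log is used for output.

```javascript
function describe(first, second) {
  return "first: " + first + ", second: " + second;
}

console.log(describe("a", "b", "c")); // the extra "c" is silently ignored
console.log(describe("a"));           // second is undefined inside the function
```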
In the next chapter, we will see a way in which a function body can
get at the exact list of arguments that were passed to it. This can be
useful, as it makes it possible to have a function accept any number
of arguments. print makes use of this:
print("R", 2, "D", 2);
Of course, the downside of this is that it is also possible to
accidentally pass the wrong number of arguments to functions that
expect a fixed amount of them, like alert, and never be told about
it.
1. Technically, this wouldn't have been necessary, but I suppose the designers of JavaScript felt it would clarify things if function bodies always had braces.
2. Technically, a pure function can not use the value of any external variables. These values might change, and this could make the function return a different value for the same arguments. In practice, the programmer may consider some variables 'constant' (they are not expected to change) and consider functions that use only constant variables pure. Variables that contain a function value are often good examples of constant variables.
Chapter 4: Data structures: Objects and Arrays
This chapter will be devoted to solving a few simple problems. In the process, we will discuss two new types of values, arrays and objects, and look at some techniques related to them.
Consider the following situation: Your crazy aunt Emily, who is rumoured to have over fifty cats living with her (you never managed to count them), regularly sends you e-mails to keep you up to date on her exploits. They usually look like this:
Dear nephew,
Your mother told me you have taken up skydiving. Is this true? You watch yourself, young man! Remember what happened to my husband? And that was only from the second floor!
Anyway, things are very exciting here. I have spent all week trying to get the attention of Mr. Drake, the nice gentleman who moved in next door, but I think he is afraid of cats. Or allergic to them? I am going to try putting Fat Igor on his shoulder next time I see him, very curious what will happen.
Also, the scam I told you about is going better than expected. I have already gotten back five 'payments', and only one complaint. It is starting to make me feel a bit bad though. And you are right that it is probably illegal in some way.
(... etc ...)
Much love, Aunt Emily
died 27/04/2006: Black Leclère
born 05/04/2006 (mother Lady Penelope): Red Lion, Doctor Hobbles the 3rd, Little Iroquois
To humour the old dear, you would like to keep track of the genealogy of her cats, so you can add things like "P.S. I hope Doctor Hobbles the 2nd enjoyed his birthday this Saturday!", or "How is old Lady Penelope doing? She's five years old now, isn't she?", preferably without accidentally asking about dead cats. You are in the possession of a large quantity of old e-mails from your aunt, and fortunately she is very consistent in always putting information about the cats' births and deaths at the end of her mails in precisely the same format.
You are hardly inclined to go through all those mails by hand. Fortunately, we were just in need of an example problem, so we will try to work out a program that does the work for us. For a start, we write a program that gives us a list of cats that are still alive after the last e-mail.
Before you ask, at the start of the correspondence, aunt Emily had only a single cat: Spot. (She was still rather conventional in those days.)

It usually pays to have some kind of clue what one's program is going to do before starting to type. Here's a plan:
- Start with a set of cat names that has only "Spot" in it.
- Go over every e-mail in our archive, in chronological order.
- Look for paragraphs that start with "born" or "died".
- Add the names from paragraphs that start with "born" to our set of names.
- Remove the names from paragraphs that start with "died" from our set.
Where taking the names from a paragraph goes like this:
- Find the colon in the paragraph.
- Take the part after this colon.
- Split this part into separate names by looking for commas.
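The three steps for taking the names from a paragraph could be sketched as follows. The function name catNames is an assumption made for this illustration, as is the choice of the standard string methods indexOf, slice, and split. The result is an array, one of the new types of values this chapter is about.

```javascript
// Hypothetical helper following the three steps listed above.
function catNames(paragraph) {
  var colon = paragraph.indexOf(":");          // find the colon
  var afterColon = paragraph.slice(colon + 2); // skip the colon and the space after it
  return afterColon.split(", ");               // split on the commas
}

var names = catNames("born 05/04/2006 (mother Lady Penelope): " +
                     "Red Lion, Doctor Hobbles the 3rd, Little Iroquois");
console.log(names);
```

The sketch relies on every colon and comma being followed by exactly one space, which is how the example e-mail is written.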
It may require some suspension of disbelief to accept that aunt Emily always used this exact format, and that she never forgot or misspelled a name, but that is just how your aunt is.
First, let me tell you about properties. A lot of JavaScript values
have other values associated with them. These associations are called
properties. Every string has a property called length, which refers
to a number, the number of characters in that string.
Properties can be accessed in two ways:
var text = "purple haze";
show(text["length"]);
show(text.length);
The second way is a shorthand for the first, and it only works when the name of the property would be a valid variable name ― when it doesn't have any spaces or symbols in it and does not start with a digit character.
The value null and the value undefined do not have any properties,
and trying to read a property from either of them produces an error.
(Numbers and booleans behave differently: reading a property they do
not have simply gives undefined.) Try the following code, if only to
get an idea about the kind of error message your browser produces in
such a case (which, for some browsers, can be rather cryptic).
var nothing = null;
show(nothing.length);
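If you would rather see the failure without stopping the program, the attempt can be wrapped in a try and catch statement, a construct this book has not introduced yet. The sketch below makes no assumption about the exact error message, which varies between browsers.

```javascript
var nothing = null;
var message;

try {
  // Reading a property from null raises an error.
  message = "length is " + nothing.length;
} catch (error) {
  message = "reading a property of null failed";
}

console.log(message);
```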
The properties of a string value can not be changed. There are quite a
few more than just length, as we will see, but you are not allowed
to add or remove any.
This is different with values of the type object. Their main role is to hold other values. They have, you could say, their own set of tentacles in the form of properties. You are free to modify these, remove them, or add new ones.
An object can be written like this:
var cat = {colour: "grey", name: "Spot", size: 46};
cat.size = 47;
show(cat.size);
delete cat.size;
show(cat.size);
show(cat);
Like variables, each property attached to an object is labelled by a
string. The first statement creates an object in which the property
"colour" holds the string "grey", the property "name" is attached
to the string "Spot", and the property "size" refers to the number
46. The second statement gives the property named size a new
value, which is done in the same way as modifying a variable.
The keyword delete cuts off properties. Trying to read a
non-existent property gives the value undefined.
If a property that does not yet exist is set with the = operator,
it is added to the object.
var empty = {};
empty.notReally = 1000;
show(empty.notReally);
Properties whose names are not valid variable names have to be quoted when creating the object, and approached using brackets:
var thing = {"gabba gabba": "hey", "5": 10};
show(thing["5"]);
show(thing[2 + 3]);
delete thing["gabba gabba"];
As you can see, the part between the brackets can be any expression. It is converted to a string to determine the property name it refers to. One can even use variables to name properties:
var propertyName = "length";
var text = "mainline";
show(text[propertyName]);
The operator in can be used to test whether an object has a
certain property. It produces a boolean.
var chineseBox = {};
chineseBox.content = chineseBox;
show("content" in chineseBox);
show("content" in chineseBox.content);
When object values are shown on the console, they can be clicked to inspect their properties. This changes the output window to an 'inspect' window. The little 'x' at the top-right can be used to return to the output window, and the left-arrow can be used to go back to the properties of the previously inspected object.
show(chineseBox);
The solution for the cat problem talks about a 'set' of names. A set is a collection of values in which no value may occur more than once. If names are strings, can you think of a way to use an object to represent a set of names?
Show how a name can be added to this set, how one can be removed, and how you can check whether a name occurs in it.
This can be done by storing the content of the set as the properties
of an object. Adding a name is done by setting a property by that name
to a value, any value. Removing a name is done by deleting this
property. The in operator can be used to determine whether a certain
name is part of the set (see footnote 1).
var set = {"Spot": true};
// Add "White Fang" to the set
set["White Fang"] = true;
// Remove "Spot"
delete set["Spot"];
// See if "Asoka" is in the set
show("Asoka" in set);
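The same three operations can be wrapped in small functions, which makes code that uses the set a little easier to read. The names addName, removeName, and hasName are invented for this sketch, and console.log is used for output.

```javascript
// Hypothetical helpers wrapping the three set operations.
function addName(set, name) {
  set[name] = true;
}
function removeName(set, name) {
  delete set[name];
}
function hasName(set, name) {
  return name in set;
}

var cats = {"Spot": true};
addName(cats, "White Fang");
removeName(cats, "Spot");
console.log(hasName(cats, "White Fang"), hasName(cats, "Spot"), hasName(cats, "Asoka"));
```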
Object values, apparently, can change. The types of values discussed in chapter 2 are all immutable: it is impossible to change an existing value of those types. You can combine them and derive new values from them, but when you take a specific string value, the text inside it can not change. With objects, on the other hand, the content of a value can be modified by changing its properties.
When we have two numbers, 120 and 120, they can for all practical
purposes be considered the precise same number. With objects, there is
a difference between having two references to the same object and
having two different objects that contain the same properties.
Consider the following code:
var