Chapter 6: Functional Programming

As programs get bigger, they also become more complex and harder to understand. We all think ourselves pretty clever, of course, but we are mere human beings, and even a moderate amount of chaos tends to baffle us. And then it all goes downhill. Working on something you do not really understand is a bit like cutting random wires on those time-activated bombs they always have in movies. If you are lucky, you might get the right one ― especially if you are the hero of the movie and strike a suitably dramatic pose ― but there is always the possibility of blowing everything up.

Admittedly, in most cases, breaking a program does not cause any large explosions. But when a program, by someone's ignorant tinkering, has degenerated into a ramshackle mass of errors, reshaping it into something sensible is a terrible labour ― sometimes you might just as well start over.

Thus, the programmer is always looking for ways to keep the complexity of his programs as low as possible. An important way to do this is to try and make code more abstract. When writing a program, it is easy to get sidetracked into small details at every point. You come across some little issue, and you deal with it, and then proceed to the next little problem, and so on. This makes the code read like a grandmother's tale.

Yes, dear, to make pea soup you will need split peas, the dry kind. And you have to soak them at least for a night, or you will have to cook them for hours and hours. I remember one time, when my dull son tried to make pea soup. Would you believe he hadn't soaked the peas? We almost broke our teeth, all of us. Anyway, when you have soaked the peas, and you'll want about a cup of them per person, and pay attention because they will expand a bit while they are soaking, so if you aren't careful they will spill out of whatever you use to hold them, so also use plenty water to soak in, but as I said, about a cup of them, when they are dry, and after they are soaked you cook them in four cups of water per cup of dry peas. Let it simmer for two hours, which means you cover it and keep it barely cooking, and then add some diced onions, sliced celery stalk, and maybe a carrot or two and some ham. Let it all cook for a few minutes more, and it is ready to eat.

Another way to describe this recipe:

Per person: one cup dried split peas, half a chopped onion, half a carrot, a celery stalk, and optionally ham.

Soak peas overnight, simmer them for two hours in four cups of water (per person), add vegetables and ham, and cook for ten more minutes.

This is shorter, but if you don't know how to soak peas you'll surely screw up and put them in too little water. But how to soak peas can be looked up, and that is the trick. If you assume a certain basic knowledge in the audience, you can talk in a language that deals with bigger concepts, and express things in a much shorter and clearer way. This, more or less, is what abstraction is.

How is this far-fetched recipe story relevant to programming? Well, obviously, the recipe is the program. Furthermore, the basic knowledge that the cook is supposed to have corresponds to the functions and other constructs that are available to the programmer. If you remember the introduction of this book, things like while make it easier to build loops, and in chapter 4 we wrote some simple functions in order to make other functions shorter and more straightforward. Such tools, some of them made available by the language itself, others built by the programmer, are used to reduce the amount of uninteresting details in the rest of the program, and thus make that program easier to work with.


Functional programming, which is the subject of this chapter, produces abstraction through clever ways of combining functions. A programmer armed with a repertoire of fundamental functions and, more importantly, the knowledge on how to use them, is much more effective than one who starts from scratch. Unfortunately, a standard JavaScript environment comes with deplorably few essential functions, so we have to write them ourselves or, which is often preferable, make use of somebody else's code (more on that in chapter 9).

There are other popular approaches to abstraction, most notably object-oriented programming, the subject of chapter 8.


One ugly detail that, if you have any good taste at all, must be starting to bother you is the endlessly repeated for loop going over an array: for (var i = 0; i < something.length; i++) .... Can this be abstracted?

The problem is that, whereas most functions just take some values, combine them, and return something, such a loop contains a piece of code that it must execute. It is easy to write a function that goes over an array and prints out every element:

function printArray(array) {
  for (var i = 0; i < array.length; i++)
    print(array[i]);
}

But what if we want to do something else than print? Since 'doing something' can be represented as a function, and functions are also values, we can pass our action as a function value:

function forEach(array, action) {
  for (var i = 0; i < array.length; i++)
    action(array[i]);
}

forEach(["Wampeter", "Foma", "Granfalloon"], print);

And by making use of an anonymous function, something just like a for loop can be written with less useless details:

function sum(numbers) {
  var total = 0;
  forEach(numbers, function (number) {
    total += number;
  });
  return total;
}
show(sum([1, 10, 100]));

Note that the variable total is visible inside the anonymous function because of the lexical scoping rules. Also note that this version is hardly shorter than the for loop and requires a rather clunky }); at its end ― the brace closes the body of the anonymous function, the parenthesis closes the function call to forEach, and the semicolon is needed because this call is a statement.

You do get a variable bound to the current element in the array, number, so there is no need to use numbers[i] anymore, and when this array is created by evaluating some expression, there is no need to store it in a variable, because it can be passed to forEach directly.

The cat-code in chapter 4 contains a piece like this:

var paragraphs = mailArchive[mail].split("\n");
for (var i = 0; i < paragraphs.length; i++)
  handleParagraph(paragraphs[i]);

This can now be written as...

forEach(mailArchive[mail].split("\n"), handleParagraph);

On the whole, using more abstract (or 'higher level') constructs results in more information and less noise: The code in sum reads 'for each number in numbers add that number to the total', instead of... 'there is this variable that starts at zero, and it counts upward to the length of the array called numbers, and for every value of this variable we look up the corresponding element in the array and add this to the total'.


What forEach does is take an algorithm, in this case 'going over an array', and abstract it. The 'gaps' in the algorithm, in this case, what to do for each of these elements, are filled by functions which are passed to the algorithm function.

Functions that operate on other functions are called higher-order functions. By operating on functions, they can talk about actions on a whole new level. The makeAddFunction function from chapter 3 is also a higher-order function. Instead of taking a function value as an argument, it produces a new function.

Higher-order functions can be used to generalise many algorithms that regular functions can not easily describe. When you have a repertoire of these functions at your disposal, it can help you think about your code in a clearer way: Instead of a messy set of variables and loops, you can decompose algorithms into a combination of a few fundamental algorithms, which are invoked by name, and do not have to be typed out again and again.

Being able to write what we want to do instead of how we do it means we are working at a higher level of abstraction. In practice, this means shorter, clearer, and more pleasant code.


Another useful type of higher-order function modifies the function value it is given:

function negate(func) {
  return function(x) {
    return !func(x);
  };
}
var isNotNaN = negate(isNaN);
show(isNotNaN(NaN));

The function returned by negate feeds the argument it is given to the original function func, and then negates the result. But what if the function you want to negate takes more than one argument? You can get access to any arguments passed to a function with the arguments array, but how do you call a function when you do not know how many arguments you have?

Functions have a method called apply, which is used for situations like this. It takes two arguments. The role of the first argument will be discussed in chapter 8, for now we just use null there. The second argument is an array containing the arguments that the function must be applied to.

show(Math.min.apply(null, [5, 6]));

function negate(func) {
  return function() {
    return !func.apply(null, arguments);
  };
}

Unfortunately, on the Internet Explorer browser a lot of built-in functions, such as alert, are not really functions... or something. They report their type as "object" when given to the typeof operator, and they do not have an apply method. Your own functions do not suffer from this, they are always real functions.


Let us look at a few more basic algorithms related to arrays. The sum function is really a variant of an algorithm which is usually called reduce or fold:

function reduce(combine, base, array) {
  forEach(array, function (element) {
    base = combine(base, element);
  });
  return base;
}

function add(a, b) {
  return a + b;
}

function sum(numbers) {
  return reduce(add, 0, numbers);
}

reduce combines an array into a single value by repeatedly using a function that combines an element of the array with a base value. This is exactly what sum did, so it can be made shorter by using reduce... except that addition is an operator and not a function in JavaScript, so we first had to put it into a function.

The reason reduce takes the function as its first argument instead of its last, as in forEach, is partly that this is tradition ― other languages do it like that ― and partly that this allows us to use a particular trick, which will be discussed at the end of this chapter. It does mean that, when calling reduce, writing the reducing function as an anonymous function looks a bit weirder, because now the other arguments follow after the function, and the resemblance to a normal for block is lost entirely.


Ex. 6.1

Write a function countZeroes, which takes an array of numbers as its argument and returns the amount of zeroes that occur in it. Use reduce.

Then, write the higher-order function count, which takes an array and a test function as arguments, and returns the amount of elements in the array for which the test function returned true. Re-implement countZeroes using this function.

function countZeroes(array) {
  function counter(total, element) {
    return total + (element === 0 ? 1 : 0);
  }
  return reduce(counter, 0, array);
}

The weird part, with the question mark and the colon, uses a new operator. In chapter 2 we have seen unary and binary operators. This one is ternary ― it acts on three values. Its effect resembles that of if/else, except that, where if conditionally executes statements, this one conditionally chooses expressions. The first part, before the question mark, is the condition. If this condition is true, the expression after the question mark is chosen, 1 in this case. If it is false, the part after the colon, 0 in this case, is chosen.

Use of this operator can make some pieces of code much shorter. When the expressions inside it get very big, or you have to make more decisions inside the conditional parts, just using plain if and else is usually more readable.

Here is the solution that uses a count function, with a function that produces equality-testers included to make the final countZeroes function even shorter:

function count(test, array) {
  return reduce(function(total, element) {
    return total + (test(element) ? 1 : 0);
  }, 0, array);
}

function equals(x) {
  return function(element) {return x === element;};
}

function countZeroes(array) {
  return count(equals(0), array);
}

One other generally useful 'fundamental algorithm' related to arrays is called map. It goes over an array, applying a function to every element, just like forEach. But instead of discarding the values returned by function, it builds up a new array from these values.

function map(func, array) {
  var result = [];
  forEach(array, function (element) {
    result.push(func(element));
  });
  return result;
}

show(map(Math.round, [0.01, 2, 9.89, Math.PI]));

Note that the first argument is called func, not function, this is because function is a keyword and thus not a valid variable name.


There once was, living in the deep mountain forests of Transylvania, a recluse. Most of the time, he just wandered around his mountain, talking to trees and laughing with birds. But now and then, when the pouring rain trapped him in his little hut, and the howling wind made him feel unbearably small, the recluse felt an urge to write something, wanted to pour some thoughts out onto paper, where they could maybe grow bigger than he himself was.

After failing miserably at poetry, fiction, and philosophy, the recluse finally decided to write a technical book. In his youth, he had done some computer programming, and he figured that if he could just write a good book about that, fame and recognition would surely follow.

So he wrote. At first he used fragments of tree bark, but that turned out not to be very practical. He went down to the nearest village and bought himself a laptop computer. After a few chapters, he realised he wanted to put the book in HTML format, in order to put it on his web-page...


Are you familiar with HTML? It is the method used to add mark-up to pages on the web, and we will be using it a few times in this book, so it would be nice if you know how it works, at least generally. If you are a good student, you could go search the web for a good introduction to HTML now, and come back here when you have read it. Most of you probably are lousy students, so I will just give a short explanation and hope it is enough.

HTML stands for 'HyperText Mark-up Language'. An HTML document is all text. Because it must be able to express the structure of this text, information about which text is a heading, which text is purple, and so on, a few characters have a special meaning, somewhat like backslashes in JavaScript strings. The 'less than' and 'greater than' characters are used to create 'tags'. A tag gives extra information about the text in the document. It can stand on its own, for example to mark the place where a picture should appear in the page, or it can contain text and other tags, for example when it marks the start and end of a paragraph.

Some tags are compulsory, a whole HTML document must always be contained in between html tags. Here is an example of an HTML document:

<html>
  <head>
    <title>A quote</title>
  </head>
  <body>
    <h1>A quote</h1>
    <blockquote>
      <p>The connection between the language in which we
      think/program and the problems and solutions we can imagine
      is very close.  For this reason restricting language
      features with the intent of eliminating programmer errors is
      at best dangerous.</p>
      <p>-- Bjarne Stroustrup</p>
    </blockquote>
    <p>Mr. Stroustrup is the inventor of the C++ programming
    language, but quite an insightful person nevertheless.</p>
    <p>Also, here is a picture of an ostrich:</p>
    <img src="img/ostrich.png"/>
  </body>
</html>

Elements that contain text or other tags are first opened with <tagname>, and afterwards finished with </tagname>. The html element always contains two children: head and body. The first contains information about the document, the second contains the actual document.

Most tag names are cryptic abbreviations. h1 stands for 'heading 1', the biggest kind of heading. There are also h2 to h6 for successively smaller headings. p means 'paragraph', and img stands for 'image'. The img element does not contain any text or other tags, but it does have some extra information, src="img/ostrich.png", which is called an 'attribute'. In this case, it contains information about the image file that should be shown here.

Because < and > have a special meaning in HTML documents, they can not be written directly in the text of the document. If you want to say '5 < 10' in an HTML document, you have to write '5 &lt; 10', where 'lt' stands for 'less than'. '&gt;' is used for '>', and because these codes also give the ampersand character a special meaning, a plain '&' is written as '&amp;'.

Now, those are only the bare basics of HTML, but they should be enough to make it through this chapter, and later chapters that deal with HTML documents, without getting entirely confused.


The JavaScript console has a function viewHTML that can be used to look at HTML documents. I stored the example document above in the variable stroustrupQuote, so you can view it by executing the following code:

viewHTML(stroustrupQuote);

If you have some kind of pop-up blocker installed or integrated in your browser, it will probably interfere with viewHTML, which tries to show the HTML document in a new window or tab. Try to configure the blocker to allow pop-ups from this site.


So, picking up the story again, the recluse wanted to have his book in HTML format. At first he just wrote all the tags directly into his manuscript, but typing all those less-than and greater-than signs made his fingers hurt, and he constantly forgot to write &amp; when he needed an &. This gave him a headache. Next, he tried to write the book in Microsoft Word, and then save it as HTML. But the HTML that came out of that was fifteen times bigger and more complicated than it had to be. And besides, Microsoft Word gave him a headache.

The solution that he eventually came up with was this: He would write the book as plain text, following some simple rules about the way paragraphs were separated and the way headings looked. Then, he would write a program to convert this text into precisely the HTML that he wanted.

The rules are this:

  1. Paragraphs are separated by blank lines.
  2. A paragraph that starts with a '%' symbol is a header. The more '%' symbols, the smaller the header.
  3. Inside paragraphs, pieces of text can be emphasised by putting them between asterisks.
  4. Footnotes are written between braces.

After he had struggled painfully with his book for six months, the recluse had still only finished a few paragraphs. At this point, his hut was struck by lightning, killing him, and forever putting his writing ambitions to rest. From the charred remains of his laptop, I could recover the following file:

% The Book of Programming

%% The Two Aspects

Below the surface of the machine, the program moves. Without effort,
it expands and contracts. In great harmony, electrons scatter and
regroup. The forms on the monitor are but ripples on the water. The
essence stays invisibly below.

When the creators built the machine, they put in the processor and the
memory. From these arise the two aspects of the program.

The aspect of the processor is the active substance. It is called
Control. The aspect of the memory is the passive substance. It is
called Data.

Data is made of merely bits, yet it takes complex forms. Control
consists only of simple instructions, yet it performs difficult
tasks. From the small and trivial, the large and complex arise.

The program source is Data. Control arises from it. The Control
proceeds to create new Data. The one is born from the other, the
other is useless without the one. This is the harmonious cycle of
Data and Control.

Of themselves, Data and Control are without structure. The programmers
of old moulded their programs out of this raw substance. Over time,
the amorphous Data has crystallised into data types, and the chaotic
Control was restricted into control structures and functions.

%% Short Sayings

When a student asked Fu-Tzu about the nature of the cycle of Data and
Control, Fu-Tzu replied 'Think of a compiler, compiling itself.'

A student asked 'The programmers of old used only simple machines and
no programming languages, yet they made beautiful programs. Why do we
use complicated machines and programming languages?'. Fu-Tzu replied
'The builders of old used only sticks and clay, yet they made
beautiful huts.'

A hermit spent ten years writing a program. 'My program can compute
the motion of the stars on a 286-computer running MS DOS', he proudly
announced. 'Nobody owns a 286-computer or uses MS DOS anymore.',
Fu-Tzu responded.

Fu-Tzu had written a small program that was full of global state and
dubious shortcuts. Reading it, a student asked 'You warned us against
these techniques, yet I find them in your program. How can this be?'
Fu-Tzu said 'There is no need to fetch a water hose when the house is
not on fire.'{This is not to be read as an encouragement of sloppy
programming, but rather as a warning against neurotic adherence to
rules of thumb.}

%% Wisdom

A student was complaining about digital numbers. 'When I take the root
of two and then square it again, the result is already inaccurate!'.
Overhearing him, Fu-Tzu laughed. 'Here is a sheet of paper. Write down
the precise value of the square root of two for me.'

Fu-Tzu said 'When you cut against the grain of the wood, much strength
is needed. When you program against the grain of a problem, much code
is needed.'

Tzu-li and Tzu-ssu were boasting about the size of their latest
programs. 'Two-hundred thousand lines', said Tzu-li, 'not counting
comments!'. 'Psah', said Tzu-ssu, 'mine is almost a *million* lines
already.' Fu-Tzu said 'My best program has five hundred lines.'
Hearing this, Tzu-li and Tzu-ssu were enlightened.

A student had been sitting motionless behind his computer for hours,
frowning darkly. He was trying to write a beautiful solution to a
difficult problem, but could not find the right approach. Fu-Tzu hit
him on the back of his head and shouted '*Type something!*' The student
started writing an ugly solution. After he had finished, he suddenly
understood the beautiful solution.

%% Progression

A beginning programmer writes his programs like an ant builds her
hill, one piece at a time, without thought for the bigger structure.
His programs will be like loose sand. They may stand for a while, but
growing too big they fall apart{Referring to the danger of internal
inconsistency and duplicated structure in unorganised code.}.

Realising this problem, the programmer will start to spend a lot of
time thinking about structure. His programs will be rigidly
structured, like rock sculptures. They are solid, but when they must
change, violence must be done to them{Referring to the fact that
structure tends to put restrictions on the evolution of a program.}.

The master programmer knows when to apply structure and when to leave
things in their simple form. His programs are like clay, solid yet
malleable.

%% Language

When a programming language is created, it is given syntax and
semantics. The syntax describes the form of the program, the semantics
describe the function. When the syntax is beautiful and the semantics
are clear, the program will be like a stately tree. When the syntax is
clumsy and the semantics confusing, the program will be like a bramble
bush.

Tzu-ssu was asked to write a program in the language called Java,
which takes a very primitive approach to functions. Every morning, as
he sat down in front of his computer, he started complaining. All day
he cursed, blaming the language for all that went wrong. Fu-Tzu
listened for a while, and then reproached him, saying 'Every language
has its own way. Follow its form, do not try to program as if you
were using another language.'

To honour the memory of our good recluse, I would like to finish his HTML-generating program for him. A good approach to this problem goes like this:

  1. Split the file into paragraphs by cutting it at every empty line.
  2. Remove the '%' characters from header paragraphs and mark them as headers.
  3. Process the text of the paragraphs themselves, splitting them into normal parts, emphasised parts, and footnotes.
  4. Move all the footnotes to the bottom of the document, leaving numbers1 in their place.
  5. Wrap each piece into the correct HTML tags.
  6. Combine everything into a single HTML document.

This approach does not allow footnotes inside emphasised text, or vice versa. This is kind of arbitrary, but helps keep the example code simple. If, at the end of the chapter, you feel like an extra challenge, you can try to revise the program to support 'nested' mark-up.

The whole manuscript, as a string value, is available on this page by calling recluseFile function.


Step 1 of the algorithm is trivial. A blank line is what you get when you have two newlines in a row, and if you remember the split method that strings have, which we saw in chapter 4, you will realise that this will do the trick:

var paragraphs = recluseFile().split("\n\n");
print("Found ", paragraphs.length, " paragraphs.");

Ex. 6.2

Write a function processParagraph that, when given a paragraph string as its argument, checks whether this paragraph is a header. If it is, it strips off the '%' characters and counts their number. Then, it returns an object with two properties, content, which contains the text inside the paragraph, and type, which contains the tag that this paragraph must be wrapped in, "p" for regular paragraphs, "h1" for headers with one '%', and "hX" for headers with X '%' characters.

Remember that strings have a charAt method that can be used to look at a specific character inside them.

function processParagraph(paragraph) {
  var header = 0;
  while (paragraph.charAt(0) == "%") {
    paragraph = paragraph.slice(1);
    header++;
  }

  return {type: (header == 0 ? "p" : "h" + header),
          content: paragraph};
}

show(processParagraph(paragraphs[0]));

This is where we can try out the map function we saw earlier.

var paragraphs = map(processParagraph,
                     recluseFile().split("\n\n"));

And bang, we have an array of nicely categorised paragraph objects. We are getting ahead of ourselves though, we forgot step 3 of the algorithm:

Process the text of the paragraphs themselves, splitting them into normal parts, emphasised parts, and footnotes.

Which can be decomposed into:

  1. If the paragraph starts with an asterisk, take off the emphasised part and store it.
  2. If the paragraph starts with an opening brace, take off the footnote and store it.
  3. Otherwise, take off the part until the first emphasised part or footnote, or until the end of the string, and store it as normal text.
  4. If there is anything left in the paragraph, start at 1 again.

Ex. 6.3

Build a function splitParagraph which, given a paragraph string, returns an array of paragraph fragments. Think of a good way to represent the fragments.

The method indexOf, which searches for a character or sub-string in a string and returns its position, or -1 if not found, will probably be useful in some way here.

This is a tricky algorithm, and there are many not-quite-correct or way-too-long ways to describe it. If you run into problems, just think about it for a minute. Try to write inner functions that perform the smaller actions that make up the algorithm.

Here is one possible solution:

function splitParagraph(text) {
  function indexOrEnd(character) {
    var index = text.indexOf(character);
    return index == -1 ? text.length : index;
  }

  function takeNormal() {
    var end = reduce(Math.min, text.length,
                     map(indexOrEnd, ["*", "{"]));
    var part = text.slice(0, end);
    text = text.slice(end);
    return part;
  }

  function takeUpTo(character) {
    var end = text.indexOf(character, 1);
    if (end == -1)
      throw new Error("Missing closing '" + character + "'");
    var part = text.slice(1, end);
    text = text.slice(end + 1);
    return part;
  }

  var fragments = [];

  while (text != "") {
    if (text.charAt(0) == "*")
      fragments.push({type: "emphasised",
                      content: takeUpTo("*")});
    else if (text.charAt(0) == "{")
      fragments.push({type: "footnote",
                      content: takeUpTo("}")});
    else
      fragments.push({type: "normal",
                      content: takeNormal()});
  }
  return fragments;
}

Note the over-eager use of map and reduce in the takeNormal function. This is a chapter about functional programming, so program functionally we will! Can you see how this works? The map produces an array of positions where the given characters were found, or the end of the string if they were not found, and the reduce takes the minimum of them, which is the next point in the string that we have to look at.

If you'd write that out without mapping and reducing you'd get something like this:

var nextAsterisk = text.indexOf("*");
var nextBrace = text.indexOf("{");
var end = text.length;
if (nextAsterisk != -1)
  end = nextAsterisk;
if (nextBrace != -1 && nextBrace < end)
  end = nextBrace;

Which is even more hideous. Most of the time, when a decision has to be made based on a series of things, even if there are only two of them, writing it as array operations is nicer than handling every value in a separate if statement. (Fortunately, chapter 10 describes an easier way to ask for the first occurrence of 'this or that character' in a string.)

If you wrote a splitParagraph that stored fragments in a different way than the solution above, you might want to adjust it, because the functions in the rest of the chapter assume that fragments are objects with type and content properties.


We can now wire processParagraph to also split the text inside the paragraphs, my version can be modified like this:

function processParagraph(paragraph) {
  var header = 0;
  while (paragraph.charAt(0) == "%") {
    paragraph = paragraph.slice(1);
    header++;
  }

  return {type: (header == 0 ? "p" : "h" + header),
          content: splitParagraph(paragraph)};
}

Mapping that over the array of paragraphs gives us an array of paragraph objects, which in turn contain arrays of fragment objects. The next thing to do is to take out the footnotes, and put references to them in their place. Something like this:

function extractFootnotes(paragraphs) {
  var footnotes = [];
  var currentNote = 0;

  function replaceFootnote(fragment) {
    if (fragment.type == "footnote") {
      currentNote++;
      footnotes.push(fragment);
      fragment.number = currentNote;
      return {type: "reference", number: currentNote};
    }
    else {
      return fragment;
    }
  }

  forEach(paragraphs, function(paragraph) {
    paragraph.content = map(replaceFootnote,
                            paragraph.content);
  });

  return footnotes;
}     

The replaceFootnote function is called on every fragment. When it gets a fragment that should stay where it is, it just returns it, but when it gets a footnote, it stores this footnote in the footnotes array, and returns a reference to it instead. In the process, every footnote and reference is also numbered.


That gives us enough tools to extract the information we need from the file. All that is left now is generating the correct HTML.

A lot of people think that concatenating strings is a great way to produce HTML. When they need a link to, for example, a site where you can play the game of Go, they will do:

var url = "http://www.gokgs.com/";
var text = "Play Go!";
var linkText = "<a href=\"" + url + "\">" + text + "</a>";
print(linkText);

(Where a is the tag used to create links in HTML documents.) ... Not only is this clumsy, but when the string text happens to include an angular bracket or an ampersand, it is also wrong. Weird things will happen on your website, and you will look embarrassingly amateurish. We wouldn't want that to happen. A few simple HTML-generating functions are easy to write. So let us write them.


The secret to successful HTML generation is to treat your HTML document as a data structure instead of a flat piece of text. JavaScript's objects provide a very easy way to model this:

var linkObject = {name: "a",
                  attributes: {href: "http://www.gokgs.com/"},
                  content: ["Play Go!"]};

Each HTML element contains a name property, giving the name of the tag it represents. When it has attributes, it also contains an attributes property, which contains an object in which the attributes are stored. When it has content, there is a content property, containing an array of other elements contained in this element. Strings play the role of pieces of text in our HTML document, so the array ["Play Go!"] means that this link has only one element inside it, which is a simple piece of text.

Typing in these objects directly is clumsy, but we don't have to do that. We provide a shortcut function to do this for us:

function tag(name, content, attributes) {
  return {name: name, attributes: attributes, content: content};
}

Note that, since we allow the attributes and content of an element to be undefined if they are not applicable, the second and third argument to this function can be left off when they are not needed.

tag is still rather primitive, so we write shortcuts for common types of elements, such as links, or the outer structure of a simple document:

function link(target, text) {
  return tag("a", [text], {href: target});
}

function htmlDoc(title, bodyContent) {
  return tag("html", [tag("head", [tag("title", [title])]),
                      tag("body", bodyContent)]);
}

Ex. 6.4

Looking back at the example HTML document if necessary, write an image function which, when given the location of an image file, will create an img HTML element.

function image(src) {
  return tag("img", [], {src: src});
}

When we have created a document, it will have to be reduced to a string. But building this string from the data structures we have been producing is very straightforward. The important thing is to remember to transform the special characters in the text of our document...

function escapeHTML(text) {
  var replacements = [[/&/g, "&amp;"], [/"/g, "&quot;"],
                      [/</g, "&lt;"], [/>/g, "&gt;"]];
  forEach(replacements, function(replace) {
    text = text.replace(replace[0], replace[1]);
  });
  return text;
}

The replace method of strings creates a new string in which all occurrences of the pattern in the first argument are replaced by the second argument, so "Borobudur".replace(/r/g, "k") gives "Bokobuduk". Don't worry about the pattern syntax here ― we'll get to that in chapter 10. The escapeHTML function puts the different replacements that have to be made into an array, so that it can loop over them and apply them to the argument one by one.

Double quotes are also replaced, because we will also be using this function for the text inside the attributes of HTML tags. Those will be surrounded by double quotes, and thus must not have any double quotes inside of them.

Calling replace four times means the computer has to go over the whole string four times to check and replace its content. This is not very efficient. If we cared enough, we could write a more complex version of this function, something that resembles the splitParagraph function we saw earlier, to go over it only once. For now, we are too lazy for this. Again, chapter 10 shows a much better way to do this.


To turn an HTML element object into a string, we can use a recursive function like this:

function renderHTML(element) {
  var pieces = [];

  function renderAttributes(attributes) {
    var result = [];
    if (attributes) {
      for (var name in attributes) 
        result.push(" " + name + "=\"" +
                    escapeHTML(attributes[name]) + "\"");
    }
    return result.join("");
  }

  function render(element) {
    // Text node
    if (typeof element == "string") {
      pieces.push(escapeHTML(element));
    }
    // Empty tag
    else if (!element.content || element.content.length == 0) {
      pieces.push("<" + element.name +
                  renderAttributes(element.attributes) + "/>");
    }
    // Tag with content
    else {
      pieces.push("<" + element.name +
                  renderAttributes(element.attributes) + ">");
      forEach(element.content, render);
      pieces.push("</" + element.name + ">");
    }
  }

  render(element);
  return pieces.join("");
}

Note the in loop that extracts the properties from a JavaScript object in order to make HTML tag attributes out of them. Also note that in two places, arrays are being used to accumulate strings, which are then joined into a single result string. Why didn't I just start with an empty string and then add the content to it with the += operator?

It turns out that creating new strings, especially big strings, is quite a lot of work. Remember that JavaScript string values never change. If you concatenate something to them, a new string is created, the old ones stay intact. If we build up a big string by concatenating lots of little strings, new strings have to be created at every step, only to be thrown away when the next piece is concatenated to them. If, on the other hand, we store all the little strings in an array and then join them, only one big string has to be created.


So, let us try out this HTML generating system...

print(renderHTML(link("http://www.nedroid.com", "Drawings!")));

That seems to work.

var body = [tag("h1", ["The Test"]),
            tag("p", ["Here is a paragraph, and an image..."]),
            image("img/sheep.png")];
var doc = htmlDoc("The Test", body);
viewHTML(renderHTML(doc));

Now, I should probably warn you that this approach is not perfect. What it actually renders is XML, which is similar to HTML, but more structured. In simple cases, such as the above, this does not cause any problems. However, there are some things, which are correct XML, but not proper HTML, and these might confuse a browser that is trying to show the documents we create. For example, if you have an empty script tag (used to put JavaScript into a page) in your document, browsers will not realise that it is empty and think that everything after it is JavaScript. (In this case, the problem can be fixed by putting a single space inside of the tag, so that it is no longer empty, and gets a proper closing tag.)


Ex. 6.5

Write a function renderFragment, and use that to implement another function renderParagraph, which takes a paragraph object (with the footnotes already filtered out), and produces the correct HTML element (which might be a paragraph or a header, depending on the type property of the paragraph object).

This function might come in useful for rendering the footnote references:

function footnote(number) {
  return tag("sup", [link("#footnote" + number,
                          String(number))]);
}

A sup tag will show its content as 'superscript', which means it will be smaller and a little higher than other text. The target of the link will be something like "#footnote1". Links that contain a '#' character refer to 'anchors' within a page, and in this case we will use them to make it so that clicking on the footnote link will take the reader to the bottom of the page, where the footnotes live.

The tag to render emphasised fragments with is em, and normal text can be rendered without any extra tags.

function renderParagraph(paragraph) {
  return tag(paragraph.type, map(renderFragment,
                                 paragraph.content));
}

function renderFragment(fragment) {
  if (fragment.type == "reference")
    return footnote(fragment.number);
  else if (fragment.type == "emphasised")
    return tag("em", [fragment.content]);
  else if (fragment.type == "normal")
    return fragment.content;
}

We are almost finished. The only thing that we do not have a rendering function for yet are the footnotes. To make the "#footnote1" links work, an anchor must be included with every footnote. In HTML, an anchor is specified with an a element, which is also used for links. In this case, it needs a name attribute, instead of an href.

function renderFootnote(footnote) {
  var number = "[" + footnote.number + "] ";
  var anchor = tag("a", [number], {name: "footnote" + footnote.number});
  return tag("p", [tag("small", [anchor, footnote.content])]);
}

Here, then, is the function which, when given a file in the correct format and a document title, returns an HTML document:

function renderFile(file, title) {
  var paragraphs = map(processParagraph, file.split("\n\n"));
  var footnotes = map(renderFootnote,
                      extractFootnotes(paragraphs));
  var body = map(renderParagraph, paragraphs).concat(footnotes);
  return renderHTML(htmlDoc(title, body));
}

viewHTML(renderFile(recluseFile(), "The Book of Programming"));

The concat method of an array can be used to concatenate another array to it, similar to what the + operator does with strings.


In the chapters after this one, elementary higher-order functions like map and reduce will always be available and will be used by code examples. Now and then, a new useful tool is added to this. In chapter 9, we develop a more structured approach to this set of 'basic' functions.


When using higher-order functions, it is often annoying that operators are not functions in JavaScript. We have needed add or equals functions at several points. Rewriting these every time, you will agree, is a pain. From now on, we will assume the existence of an object called op, which contains these functions:

var op = {
  "+": function(a, b){return a + b;},
  "==": function(a, b){return a == b;},
  "===": function(a, b){return a === b;},
  "!": function(a){return !a;}
  /* and so on */
};

So we can write reduce(op["+"], 0, [1, 2, 3, 4, 5]) to sum an array. But what if we need something like equals or makeAddFunction, in which one of the arguments already has a value? In that case we are back to writing a new function again.

For cases like that, something called 'partial application' is useful. You want to create a new function that already knows some of its arguments, and treats any additional arguments it is passed as coming after these fixed arguments. This can be done by making creative use of the apply method of a function:

function asArray(quasiArray, start) {
  var result = [];
  for (var i = (start || 0); i < quasiArray.length; i++)
    result.push(quasiArray[i]);
  return result;
}

function partial(func) {
  var fixedArgs = asArray(arguments, 1);
  return function(){
    return func.apply(null, fixedArgs.concat(asArray(arguments)));
  };
}

We want to allow binding multiple arguments at the same time, so the asArray function is necessary to make normal arrays out of the arguments objects. It copies their content into a real array, so that the concat method can be used on it. It also takes an optional second argument, which can be used to leave out some arguments at the start.

Also note that it is necessary to store the arguments of the outer function (partial) into a variable with another name, because otherwise the inner function can not see them ― it has its own arguments variable, which shadows the one of the outer function.

Now equals(10) could be written as partial(op["=="], 10), without the need for a specialized equals function. And you can do things like this:

show(map(partial(op["+"], 1), [0, 2, 4, 6, 8, 10]));

The reason map takes its function argument before its array argument is that it is often useful to partially apply map by giving it a function. This 'lifts' the function from operating on a single value to operating on an array of values. For example, if you have an array of arrays of numbers, and you want to square them all, you do this:

function square(x) {return x * x;}

show(map(partial(map, square), [[10, 100], [12, 16], [0, 1]]));

One last trick that can be useful when you want to combine functions is function composition. At the start of this chapter I showed a function negate, which applies the boolean not operator to the result of calling a function:

function negate(func) {
  return function() {
    return !func.apply(null, arguments);
  };
}

This is a special case of a general pattern: call function A, and then apply function B to the result. Composition is a common concept in mathematics. It can be caught in a higher-order function like this:

function compose(func1, func2) {
  return function() {
    return func1(func2.apply(null, arguments));
  };
}

var isUndefined = partial(op["==="], undefined);
var isDefined = compose(op["!"], isUndefined);
show(isDefined(Math.PI));
show(isDefined(Math.PIE));

Here we are defining new functions without using the function keyword at all. This can be useful when you need to create a simple function to give to, for example, map or reduce. However, when a function becomes more complex than these examples, it is usually shorter (not to mention more efficient) to just write it out with function.

  1. Like this...